
tprojection

class tprojection.core.Tprojection(df, target, feature, target_type='', feature_type='', target_modality='', nb_buckets=0, n_estimators=1, continuous_threshold=0.05)

This class studies the relationship between the target and a single feature. It displays a chart type adapted to the types of the input variables (categorical or continuous).

Parameters:
  • df (pandas DataFrame) –
  • target (string) –
  • feature (string) –
  • target_type (string) – can take the values “categorical” or “continuous”
  • feature_type (string) – can take the values “categorical” or “continuous”
  • target_modality (string) – will be used for multiclass problems (not implemented yet)
  • nb_buckets (int, default 0) – if > 0, encode the feature on nb_buckets dummy modalities when its cardinality is too high
  • n_estimators (int, default 1) – if > 1, use bootstrapping to evaluate the estimator variance (only relevant for a categorical target and a categorical feature)
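
The bootstrapping idea behind n_estimators can be sketched as follows. This is a minimal illustration using only the standard library, not the library's implementation: the helper name bootstrap_target_rate, its row format, and the choice of the per-modality target rate as the estimator are all assumptions made for the example.

```python
import random
from statistics import mean, stdev


def bootstrap_target_rate(rows, n_estimators=50, seed=0):
    """Estimate, per feature modality, the mean and spread of the target rate.

    rows: list of (modality, target) pairs with target in {0, 1}.
    Hypothetical helper, not part of tprojection.
    """
    rng = random.Random(seed)
    estimates = {}  # modality -> list of bootstrap target rates
    for _ in range(n_estimators):
        # Resample the rows with replacement, keeping the original size.
        sample = rng.choices(rows, k=len(rows))
        by_modality = {}
        for modality, target in sample:
            by_modality.setdefault(modality, []).append(target)
        for modality, targets in by_modality.items():
            estimates.setdefault(modality, []).append(mean(targets))
    # Summarise each modality by the mean and standard deviation of its rates.
    return {
        m: (mean(r), stdev(r) if len(r) > 1 else 0.0)
        for m, r in estimates.items()
    }


# Toy data: modality "a" has a target rate of 2/3, modality "b" of 1/3.
rows = [("a", 1), ("a", 0), ("a", 1), ("b", 0), ("b", 0), ("b", 1)] * 20
summary = bootstrap_target_rate(rows, n_estimators=100, seed=42)
```

Each entry of summary holds a (mean, standard deviation) pair. The standard deviation indicates how much the estimated rate for that modality varies from one resample to the next.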

tprojection.utils.get_encoding(df, target, feature, nb_buckets)

Encode the feature modalities on at most nb_buckets buckets.

Parameters:
  • df (pandas DataFrame) –
  • target (str) –
  • feature (str) –
  • nb_buckets (int) –
Returns:
Return type: dict
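
One way to cap the number of modalities is sketched below. The grouping rule is an assumption: the sketch keeps the most frequent modalities and merges the rest into a shared "other" bucket. It ignores the target, which the real get_encoding receives and may use. The helper name encode_top_modalities is hypothetical.

```python
from collections import Counter


def encode_top_modalities(values, nb_buckets):
    """Map each modality to itself or to a shared "other" bucket.

    Hypothetical helper: keeps the (nb_buckets - 1) most frequent
    modalities and groups the rest, so at most nb_buckets remain.
    """
    counts = Counter(values)
    if len(counts) <= nb_buckets:
        # Cardinality is already low enough: identity encoding.
        return {m: m for m in counts}
    kept = {m for m, _ in counts.most_common(nb_buckets - 1)}
    return {m: (m if m in kept else "other") for m in counts}


cabins = ["C", "C", "C", "B", "B", "E", "D", "A"]
encoding = encode_top_modalities(cabins, nb_buckets=3)
encoded = [encoding[v] for v in cabins]
```

Here the dictionary maps the two most frequent modalities to themselves and the three rare ones to "other", so the encoded feature has three distinct values.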

tprojection.utils.is_continuous(s, thresh)

Return True if the series is continuous.

Parameters:
  • s (pandas Series) –
  • thresh (float) –
Returns:
Return type: bool
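
The documentation does not state the rule applied to thresh. A plausible heuristic, shown here purely as an assumption, is to treat numeric data as continuous when the share of distinct values exceeds the threshold. The helper name looks_continuous is hypothetical and works on plain Python sequences. With a pandas Series, the same ratio could be computed as s.nunique() / len(s).

```python
def looks_continuous(values, thresh=0.05):
    """Return True when distinct values make up more than thresh of the data.

    Hypothetical helper; the real is_continuous may use another rule.
    """
    values = [v for v in values if v is not None]
    if not values:
        return False
    # Non-numeric data (and booleans) are treated as categorical.
    if not all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    ):
        return False
    return len(set(values)) / len(values) > thresh


ages = [22.0, 38.0, 26.0, 35.0, 54.0, 2.0, 27.0, 14.0]
classes = [1, 2, 3] * 40

ages_continuous = looks_continuous(ages)        # 8 distinct values out of 8
classes_continuous = looks_continuous(classes)  # 3 distinct values out of 120
```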

tprojection.datasets.load_data(dataset)

Load a test dataset. The only available option is: titanic

Parameters:
  • dataset (str) – required dataset
Returns:
Return type: pandas DataFrame