PredictorConverter
A subclass of the Converter abstract class that transforms a trained AutoGluon TabularPredictor object into a leaderboard and dictionaries that can be used to build the Rashomon Set.
predictor : TabularPredictor
Trained TabularPredictor AutoGluon object.
test_data : TabularDataset
Test data for analysis in a TabularDataset format.
df_name : str
The name of the dataset to be used while saving converted results.
feature_imp_needed : bool, default = True
Whether to obtain feature importances from the trained models. Setting
this to True can lengthen the runtime of the .convert() method.
Attributes :
predictor : TabularPredictor
predictor parameter
test_data : TabularDataset
test_data parameter
df_name : str
df_name parameter
feature_imp_needed : bool
feature_imp_needed parameter
metrics : list
list of all evaluation metrics extracted from AutoGluon - multiclass or binary based on the predictor's problem_type attribute.
leaderboard : pd.DataFrame
leaderboard created with the create_leaderboard() method consisting only of the selected evaluation metrics.
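The metrics attribute depends on the predictor's problem_type. The branching can be sketched roughly as below. The helper name select_metrics and both metric lists are hypothetical: the names are metric identifiers that AutoGluon accepts, but the exact selection made by PredictorConverter is not stated here, so treat the lists as placeholders.

```python
# Hypothetical metric lists: valid AutoGluon metric names, but the actual
# selection used by PredictorConverter is an assumption.
BINARY_METRICS = ["accuracy", "balanced_accuracy", "f1", "precision", "recall", "roc_auc"]
MULTICLASS_METRICS = ["accuracy", "balanced_accuracy", "f1_macro", "f1_weighted"]


def select_metrics(problem_type: str) -> list:
    """Return the metric names matching a classification problem type."""
    if problem_type == "binary":
        return list(BINARY_METRICS)
    if problem_type == "multiclass":
        return list(MULTICLASS_METRICS)
    # Other problem types (e.g. regression) are outside this sketch.
    raise ValueError(f"Unsupported problem_type: {problem_type!r}")
```

Raising on an unknown problem type is a design choice of this sketch, not documented behaviour of the class.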
create_leaderboard()
Creates a dataframe with all trained models and their evaluation metric values obtained from the predictor.
Returns :
leaderboard : pd.DataFrame
create_predictions_dict()
Creates a dictionary with model names as keys and their class prediction vectors as values.
Returns :
predictions_dict : dict[str, pd.Series]
create_proba_predictions_dict()
Creates a dictionary with model names as keys and their predicted class probabilities as values.
Returns :
proba_predictions_dict : dict[str, pd.DataFrame(n_observations, n_classes)]
create_feature_importance_dict()
Creates a dictionary where the keys are model names and the values are lists of features sorted in descending order of importance, so that the most important feature appears first in each list.
Returns :
feature_importance_dict : dict[str, list]
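The ordering described above can be illustrated with plain dictionaries. This is a minimal sketch, not the class's implementation: rank_features is a hypothetical helper, and the model names and importance scores are made up. In practice the scores would come from the trained models.

```python
def rank_features(importance_scores: dict) -> list:
    """Return feature names sorted from most to least important."""
    return sorted(importance_scores, key=importance_scores.get, reverse=True)


# Made-up importance scores for two hypothetical models.
raw_importances = {
    "model_a": {"age": 0.10, "income": 0.35, "tenure": 0.02},
    "model_b": {"age": 0.20, "income": 0.05, "tenure": 0.15},
}

# Same shape as the documented return value: dict[str, list].
feature_importance_dict = {
    name: rank_features(scores) for name, scores in raw_importances.items()
}

print(feature_importance_dict["model_a"])  # ['income', 'age', 'tenure']
```

Only the ranking is kept in each list; the numeric scores themselves are dropped.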
extract_target_column()
Extracts the target column from the test dataset using the .label attribute of the TabularPredictor object.
Returns :
y_true : pd.DataFrame
save_results(leaderboard, predictions_dict, proba_predictions_dict, feature_importance_dict, y_true, saving_path)
Saves the leaderboard, all dictionaries and the extracted target column to disk in .csv and .pickle formats.
Parameters :
leaderboard : pd.DataFrame
created leaderboard to be saved as csv
predictions_dict : dict
created predictions dict to be saved as pickle
proba_predictions_dict : dict
created proba predictions dict to be saved as pickle
feature_importance_dict : dict
created feature importance dict to be saved as pickle
y_true : pd.DataFrame
extracted target column to be saved as a csv
saving_path : Path
path to a directory where results should be saved. If not specified, a new directory is created using the default name of timestamp + df_name.
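The default-directory behaviour and the pickle output can be sketched as follows. This is an illustration under stated assumptions: resolve_saving_path and save_dict are hypothetical helpers, and the timestamp format, the underscore separator, the base_dir argument and the file name are guesses, since the text only says that timestamp + df_name is used.

```python
import pickle
from datetime import datetime
from pathlib import Path
from typing import Optional


def resolve_saving_path(
    df_name: str,
    saving_path: Optional[Path] = None,
    base_dir: Path = Path("."),
) -> Path:
    """Return the output directory, creating it if it does not exist."""
    if saving_path is None:
        # Assumed naming scheme: <timestamp>_<df_name>
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        saving_path = base_dir / f"{timestamp}_{df_name}"
    saving_path = Path(saving_path)
    saving_path.mkdir(parents=True, exist_ok=True)
    return saving_path


def save_dict(obj: dict, directory: Path, file_name: str) -> Path:
    """Pickle a dictionary into the given directory and return the file path."""
    target = directory / file_name
    with open(target, "wb") as handle:
        pickle.dump(obj, handle)
    return target
```

A saved dictionary can later be restored with pickle.load on the returned file path.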
convert(saving_path)
Final method that creates leaderboard, predictions_dict, proba_predictions_dict and feature_importance_dict, extracts y_true, and saves the results using the save_results() method. If the feature_imp_needed parameter is False, feature_importance_dict is not created and NaN is returned in its place.
Parameters :
saving_path : Path
path to a directory where results should be saved. If not specified, a new directory is created using the default name of timestamp + df_name.
Returns :
leaderboard : pd.DataFrame
leaderboard created using the create_leaderboard() method
predictions_dict : dict[str, pd.Series]
predictions dict created using the create_predictions_dict() method
proba_predictions_dict : dict[str, pd.DataFrame]
proba predictions dict created using the create_proba_predictions_dict() method
feature_importance_dict : dict[str, list]
feature importance dict created using the create_feature_importance_dict() method, or NaN if feature_imp_needed is False
y_true : pd.DataFrame
target column extracted using the extract_target_column() method
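Because feature_importance_dict comes back as NaN rather than a dictionary when feature_imp_needed is False, downstream code may want to check the value before iterating over it. A minimal sketch, assuming the NaN is a float such as numpy.nan; has_feature_importance is a hypothetical helper and not part of the class.

```python
import math


def has_feature_importance(feature_importance_dict) -> bool:
    """Return True only when a real dictionary was produced."""
    if isinstance(feature_importance_dict, float) and math.isnan(feature_importance_dict):
        # Feature importances were skipped (feature_imp_needed=False).
        return False
    return isinstance(feature_importance_dict, dict)
```

A caller would run this check on the fourth element returned by the conversion before reading any model's feature list.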