BuildRashomonIntersectionAutogluon

class arsa_ml.pipelines.pipelines_user_input.BuildRashomonIntersectionAutogluon(predictor, test_data, df_name, metrics, custom_weights, weighted_sum_method, delta, feature_imp_needed, converter_results_directory)


This is a subclass of the Builder abstract class. It provides a pipeline for creating and exploring the Rashomon Intersection from user-provided models trained with the AutoGluon framework.

Example usage of this pipeline can be found at demo_notebooks/AutoGluon_pipeline.ipynb.


Parameters


predictor : TabularPredictor
  Trained AutoGluon TabularPredictor object containing all models and training results.

test_data : TabularDataset
  Test dataset used for evaluation. It must be converted to a TabularDataset object.
  Note 1: Stratify the data when performing the train-test split, so that the test data contains no classes that were absent from the training set.
  Note 2 (binary classification only): Convert the binary target column labels to 0 for the negative class and 1 for the positive class. Otherwise, some evaluation metrics may not be calculated correctly.
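
The two notes above can be checked before the pipeline is constructed. The following is a minimal sketch in plain Python; the helper names `encode_binary_labels` and `unseen_test_classes` and the sample labels are hypothetical and are not part of arsa_ml or AutoGluon.

```python
def encode_binary_labels(labels, positive_label):
    """Map the positive class to 1 and every other label to 0."""
    return [1 if label == positive_label else 0 for label in labels]


def unseen_test_classes(train_labels, test_labels):
    """Return classes that occur in the test labels but not in the training labels."""
    return set(test_labels) - set(train_labels)


# Hypothetical string-labelled binary target.
train_target = ["yes", "no", "no", "yes", "no"]
test_target = ["no", "yes"]

train_encoded = encode_binary_labels(train_target, positive_label="yes")
test_encoded = encode_binary_labels(test_target, positive_label="yes")

print(train_encoded)  # [1, 0, 0, 1, 0]
print(unseen_test_classes(train_encoded, test_encoded))  # set()
```

The same mapping should be applied to the training data before the predictor is fitted, so that the predictor and the test data agree on the label encoding. A non-empty result from `unseen_test_classes` indicates that the split was not stratified as Note 1 requires.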

df_name : str
  Name of the dataset, used when saving the output generated by PredictorConverter.

metrics : list
  List of two metrics to be used in the intersection calculation.

weighted_sum_method : str
  Specifies the method for selecting the base model for the Rashomon Intersection. Options are:

  • None or 'entropy' (default): selects the base model using an entropy-based method.
  • 'critic': selects the base model using a critic-based method.
  • 'custom_weights': selects the base model using weights provided by the user.

custom_weights : list
  Specifies the weights for base model selection when weighted_sum_method is set to 'custom_weights'. The weights must be provided as a 2-element list.
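
To illustrate what a weighted sum with user-provided weights could look like, here is a minimal sketch. It rests on several assumptions that this page does not confirm: that the two weights correspond to the two entries of metrics in order, that both metrics are "higher is better" and on comparable scales, and that the base model is the one with the highest weighted sum. The function name `select_base_model` and the sample scores are hypothetical.

```python
def select_base_model(scores, weights):
    """Return the model name with the highest weighted sum of its two metric scores.

    scores  -- dict mapping model name to a (metric_1, metric_2) tuple,
               assumed here to be "higher is better" and on comparable scales
    weights -- 2-element list, assumed to hold one weight per metric
    """
    if len(weights) != 2:
        raise ValueError("weights must be a 2-element list")
    weighted = {
        name: weights[0] * m1 + weights[1] * m2
        for name, (m1, m2) in scores.items()
    }
    return max(weighted, key=weighted.get)


# Hypothetical leaderboard: (accuracy, f1) per model.
scores = {
    "model_a": (0.90, 0.70),
    "model_b": (0.85, 0.80),
    "model_c": (0.80, 0.82),
}

print(select_base_model(scores, [0.5, 0.5]))  # model_b
print(select_base_model(scores, [0.9, 0.1]))  # model_a
```

The example shows why the weights matter: equal weights favour the balanced model, while a weight vector that emphasises the first metric selects a different base model.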

delta : float
  Delta parameter for probabilistic ambiguity and discrepancy (used only for the binary task type).
  If not specified, the default value of 0.1 is used.

feature_imp_needed : bool
  Boolean value specifying whether feature importance computation is required.
  Defaults to True; set to False to skip feature importance computation for faster execution.

converter_results_directory : Path
  Path to the directory where the converter outputs will be saved.
  If None, a default directory current_working_directory/df_name_timestamp will be automatically created, where timestamp corresponds to the creation time of the converted outputs.
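
The default directory naming described above can be sketched as follows. The timestamp format is an assumption (this page does not state it), and `default_results_directory` is a hypothetical helper, not a function of arsa_ml. The sketch only builds the path and does not create the directory.

```python
from datetime import datetime
from pathlib import Path


def default_results_directory(df_name, now=None):
    """Build <cwd>/<df_name>_<timestamp>; the timestamp format is an assumption."""
    now = now or datetime.now()
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return Path.cwd() / f"{df_name}_{timestamp}"


# A fixed time is passed here only to make the printed name reproducible.
path = default_results_directory("credit", now=datetime(2024, 1, 31, 12, 0, 0))
print(path.name)  # credit_20240131_120000
```

Because the default location depends on the current working directory, passing an explicit Path as converter_results_directory may be preferable when the pipeline is run from notebooks or scripts started in different locations.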


BuildRashomonIntersectionAutogluon Pipeline – Key Steps


  1. Initialization (__init__) – Convert the AutoGluon TabularPredictor object and test data to the format required for Rashomon Intersection analysis.
  2. Preview Rashomon (preview_rashomon()) – Visualize leaderboard and Rashomon Intersection sizes to guide epsilon selection.
  3. Set Epsilon (set_epsilon()) – Set epsilon parameter value to be used when creating Rashomon Intersection.
  4. Build Pipeline (build()) – Create the RashomonIntersection and IntersectionVisualizer objects and launch the Streamlit dashboard.
  5. Interactive Analysis – Explore plots via IntersectionVisualizer object or the dashboard.
  6. Close Dashboard (dashboard_close()) – Stop Streamlit processes.
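
The key steps above can be sketched as a single function. The class path, constructor parameters, and method names are taken from this page; everything else is an assumption made for illustration: the dataset name, the metric names 'accuracy' and 'f1', passing custom_weights=None when 'custom_weights' is not the selected method, and unpacking the return value of build() as a pair. The wrapper `run_rashomon_pipeline` itself is hypothetical.

```python
def run_rashomon_pipeline(predictor, test_data, epsilon):
    """Walk through steps 1-4 for an already trained AutoGluon predictor."""
    # Imported inside the function so this sketch can be loaded
    # even where arsa_ml is not installed.
    from arsa_ml.pipelines.pipelines_user_input import (
        BuildRashomonIntersectionAutogluon,
    )

    # Step 1: initialization converts the predictor and the test data.
    builder = BuildRashomonIntersectionAutogluon(
        predictor=predictor,
        test_data=test_data,
        df_name="my_dataset",          # hypothetical dataset name
        metrics=["accuracy", "f1"],    # assumed metric names
        custom_weights=None,           # assumed acceptable for 'entropy'
        weighted_sum_method="entropy",
        delta=0.1,
        feature_imp_needed=True,
        converter_results_directory=None,
    )

    # Step 2: inspect the leaderboard and intersection sizes.
    builder.preview_rashomon()
    # Step 3: fix the epsilon threshold.
    builder.set_epsilon(epsilon)
    # Step 4: build the objects and start the dashboard
    # (the pair-style return value is an assumption).
    rashomon_set, visualizer = builder.build(launch_dashboard=True)
    return builder, rashomon_set, visualizer
```

In interactive use, steps 2 and 3 would normally be run separately, so that epsilon is chosen after looking at the preview rather than passed in up front. After the analysis (step 5), the caller is expected to invoke `builder.dashboard_close()` (step 6).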


Pipeline Initialization


  During initialization, the pipeline converts the user's input models into the internal format defined by the PredictorConverter class.
  All processed results are stored in the specified directory or in a default directory (see converter_results_directory parameter).

Methods


preview_rashomon()

  Displays the leaderboard and a plot of the Rashomon Intersection size for each possible epsilon value.
  Call it before set_epsilon() to guide the selection of an appropriate epsilon threshold.

visualize_rashomon_set_volume()

  Visualizes how the Rashomon Intersection size depends on the epsilon value.
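
To give an intuition for why the intersection size changes with epsilon, here is a minimal sketch under an assumed membership rule that this page does not state: a model belongs to the intersection if, on each of the two metrics, its score is at most epsilon below the base model's score. The function `intersection_size` and the sample scores are hypothetical, and the library's actual definition may differ.

```python
def intersection_size(scores, base_model, epsilon):
    """Count models within epsilon of the base model on every metric.

    scores -- dict mapping model name to a tuple of "higher is better" scores
    """
    base = scores[base_model]
    return sum(
        all(b - s <= epsilon for s, b in zip(model_scores, base))
        for model_scores in scores.values()
    )


# Hypothetical scores on two metrics; model_a is taken as the base model.
scores = {
    "model_a": (0.90, 0.80),
    "model_b": (0.88, 0.79),
    "model_c": (0.86, 0.70),
    "model_d": (0.70, 0.78),
}

for epsilon in [0.0, 0.03, 0.15, 0.25]:
    print(epsilon, intersection_size(scores, "model_a", epsilon))
# 0.0 1
# 0.03 2
# 0.15 3
# 0.25 4
```

Under this rule the size is non-decreasing in epsilon, which is why a plot of size against epsilon helps in picking a threshold that keeps enough models without admitting clearly worse ones.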

set_epsilon(epsilon)

  Sets the epsilon parameter value to be used when constructing the Rashomon Intersection object.
  The epsilon value must be set before calling the build() method.

Parameters :
epsilon : float

build(launch_dashboard)

  Builds the Rashomon Intersection pipeline from AutoGluon output.
  Creates the Rashomon Intersection object and the Intersection Visualizer object from the user's input and, if launch_dashboard is True, launches a Streamlit dashboard in a subprocess for interactive visualization.

  This method performs the following steps:

  • Validates that the epsilon threshold has been set using the set_epsilon() method.

  • Creates the RashomonIntersection object based on the leaderboard, predictions, probability predictions, feature importances, metrics, base model selection method, and epsilon threshold.
    All inputs required to build the Rashomon Intersection object are extracted from the user's data, which is processed and converted during the pipeline initialization.

  • Initializes the IntersectionVisualizer object for interactive analysis of the Rashomon Intersection.
    Individual plots can later be generated directly from the IntersectionVisualizer object.

  • If launch_dashboard is set to True (default), generates the plots appropriate to the task type (binary or multiclass) and stores them temporarily, together with their descriptions, for the Streamlit dashboard. Any previously started Streamlit processes are closed first to avoid conflicts.

  • If launch_dashboard is set to True (default), launches the Streamlit dashboard in a subprocess on the local machine (localhost), allowing interactive exploration of the Rashomon Intersection properties without blocking the main workflow.
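
The non-blocking launch described in the last step can be sketched with the standard library. This is a generic illustration of running Streamlit in a child process, not the code used by build(); the helper names, the port, and the application script path are hypothetical.

```python
import subprocess
import sys


def streamlit_command(app_path, port=8501):
    """Build the command line that runs a Streamlit app with the current interpreter."""
    return [
        sys.executable, "-m", "streamlit", "run", str(app_path),
        "--server.port", str(port),
        "--server.headless", "true",
    ]


def launch_dashboard(app_path, port=8501):
    """Start Streamlit in a child process and return immediately."""
    return subprocess.Popen(streamlit_command(app_path, port))


def stop_dashboard(process, timeout=5):
    """Ask the child process to exit, and kill it if it does not exit in time."""
    process.terminate()
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
```

Because `subprocess.Popen` returns as soon as the child process starts, the calling script or notebook stays responsive while the dashboard is served on localhost. Keeping the returned process handle is what makes a later clean shutdown possible.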


Returns :
rashomon_set : RashomonIntersection
visualizer : IntersectionVisualizer

dashboard_close()

  Method for stopping all Streamlit processes and closing the dashboard.
  Note: Always call this method after finishing the analysis to ensure the dashboard is properly closed.
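
One way to make sure dashboard_close() is always called, even when the analysis raises an exception, is to wrap the pipeline object in a context manager. The sketch below is a suggested usage pattern, not a feature of the library; `dashboard_session` and `FakeBuilder` are hypothetical names, and `FakeBuilder` only stands in for the real pipeline object so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def dashboard_session(builder):
    """Yield the builder and call dashboard_close() on exit, even after an error."""
    try:
        yield builder
    finally:
        builder.dashboard_close()


class FakeBuilder:
    """Stand-in for the pipeline object, used only to demonstrate the pattern."""

    def __init__(self):
        self.closed = False

    def dashboard_close(self):
        self.closed = True


fake = FakeBuilder()
try:
    with dashboard_session(fake) as builder:
        raise RuntimeError("analysis failed")
except RuntimeError:
    pass

print(fake.closed)  # True
```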