Visualizer

class arsa_ml.visualizers.Visualizer(rashomon_set, y_true)

This class is responsible for creating visualizations and descriptions of the key properties and metrics related to the Rashomon Set.

The main library used for the visualizations is Plotly.

Parameters

rashomon_set : RashomonSet
created RashomonSet object for analysis.
y_true : pd.DataFrame
true class labels for every observation in the test dataset. (Returned by converters)

Attributes

rashomon_set : RashomonSet
rashomon_set parameter value
y_true : pd.DataFrame
y_true parameter value
binary_methods : list[str]
list of method names that create plots for binary task type analysis.
multiclass_methods : list[str]
list of method names that create plots for multiclass task type analysis.
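
Every method documented below returns a two-element result (a figure or table, followed by a text value), so the stored method names can be dispatched by name. The following is a minimal sketch, not part of arsa_ml: the helper call_plot_method and the _StubVisualizer stand-in are hypothetical, the contents of binary_methods shown here are made up, and it is assumed that methods taking parameters (such as delta) accept them as keyword arguments.

```python
def call_plot_method(visualizer, method_name, **kwargs):
    """Call a Visualizer method by name and unpack its (plot, text) result."""
    method = getattr(visualizer, method_name)
    plot, text = method(**kwargs)
    return plot, text


class _StubVisualizer:
    """Hypothetical stand-in so the sketch runs without arsa_ml installed."""

    # Illustrative content only; the real attribute may list other methods.
    binary_methods = [
        "set_size_indicator",
        "lolipop_ambiguity_discrepancy_proba_version",
    ]

    def set_size_indicator(self):
        return "gauge figure", "3 of 10 models are in the Rashomon Set"

    def lolipop_ambiguity_discrepancy_proba_version(self, delta):
        return "lollipop figure", f"metrics computed for delta={delta}"


stub = _StubVisualizer()

# Assumed per-method arguments; methods without an entry are called with none.
method_kwargs = {"lolipop_ambiguity_discrepancy_proba_version": {"delta": 0.1}}

results = {
    name: call_plot_method(stub, name, **method_kwargs.get(name, {}))
    for name in stub.binary_methods
}
```

With a real Visualizer instance, the first element of each result would be a go.Figure that can be displayed with its show() method.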

Methods

base_model_return()

Returns the base_model name of the given RashomonSet object and an empty plot.

Returns :
empty_plot : go.Figure
base_model : str

base_model_score_return()

Returns the value of base_metric achieved by the base_model from the given RashomonSet object and an empty plot.

Returns :
empty_plot : go.Figure
base_score : str

base_metric_return()

Returns the base_metric name used in the given RashomonSet object and an empty plot.

Returns :
empty_plot : go.Figure
base_metric : str

number_of_classes_return()

Returns the number of classes obtained in the given RashomonSet object and an empty plot.

Returns :
empty_plot : go.Figure
number_of_classes : str

set_size_indicator()

Creates the gauge plot representing the number of models from the whole leaderboard that are included in the analyzed RashomonSet. Returns the plot and its description.

Returns :
gauge_plot : go.Figure
plot_descr : str

rashomon_ratio_indicator()

Creates the Indicator plot for the rashomon_ratio metric obtained from the Rashomon Set. Returns the plot with its description.

Returns :
indicator_plot : go.Figure
plot_descr : str

pattern_ratio_indicator()

Creates the Indicator plot for the pattern_rashomon_ratio metric obtained from the Rashomon Set. Returns the plot with its description.

Returns :
indicator_plot : go.Figure
plot_descr : str

lolipop_ambiguity_discrepancy()

Creates the lollipop chart for the ambiguity and discrepancy metrics obtained from the Rashomon Set. The x-axis represents the metric (i.e. ambiguity or discrepancy) and the y-axis presents its value. Returns the plot with its description.

Returns :
lolipop_plot : go.Figure
plot_descr : str

lolipop_ambiguity_discrepancy_proba_version(delta)

Creates the lollipop chart for the probabilistic_ambiguity and probabilistic_discrepancy metrics obtained from the Rashomon Set. The x-axis represents the metric (i.e. ambiguity or discrepancy) and the y-axis presents its value for the given delta. Returns the plot with its description.

Note : Method available only for binary classification task type.

Parameters :
delta : float
delta parameter indicates the minimum difference between two risk probabilities for the predictions to be considered conflicting.

Returns :
lolipop_plot : go.Figure
plot_descr : str
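
To illustrate the role of delta, the sketch below computes probabilistic ambiguity and discrepancy from plain lists of risk probabilities. It is not the library's implementation: the function names are hypothetical, the definitions follow the common formulation (ambiguity as the share of samples on which at least one model conflicts with the base model, discrepancy as the largest share of conflicts caused by a single model), and treating a difference exactly equal to delta as a conflict is an assumption.

```python
def conflict_flags(base_probs, model_probs, delta):
    """Flag samples whose risk probabilities differ by at least delta."""
    return [abs(p - b) >= delta for b, p in zip(base_probs, model_probs)]


def probabilistic_ambiguity(base_probs, other_models_probs, delta):
    """Share of samples on which at least one model conflicts with the base model."""
    flagged = [False] * len(base_probs)
    for model_probs in other_models_probs:
        for i, conflict in enumerate(conflict_flags(base_probs, model_probs, delta)):
            flagged[i] = flagged[i] or conflict
    return sum(flagged) / len(base_probs)


def probabilistic_discrepancy(base_probs, other_models_probs, delta):
    """Largest share of conflicting samples produced by any single model."""
    return max(
        sum(conflict_flags(base_probs, model_probs, delta)) / len(base_probs)
        for model_probs in other_models_probs
    )


# Made-up risk probabilities for four samples.
base = [0.10, 0.50, 0.90, 0.30]
others = [
    [0.15, 0.70, 0.90, 0.30],  # conflicts with the base model on sample 1
    [0.10, 0.50, 0.60, 0.55],  # conflicts with the base model on samples 2 and 3
]

ambiguity = probabilistic_ambiguity(base, others, delta=0.15)
discrepancy = probabilistic_discrepancy(base, others, delta=0.15)
```

A larger delta makes the conflict criterion stricter, so both values can only stay the same or decrease as delta grows.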

ambiguity_vs_epsilon()

Creates the Line chart of the possible ambiguity values with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding ambiguities. The actual ambiguity value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Returns :
line_plot : go.Figure
plot_descr : str

proba_ambiguity_vs_epsilon(delta)

Creates the Line chart of the possible probabilistic_ambiguity values for the given delta with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding ambiguities. The actual ambiguity value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Note : Method available only for binary classification task type.

Parameters :
delta : float
delta parameter indicates the minimum difference between two risk probabilities for the predictions to be considered conflicting.

Returns :
line_plot : go.Figure
plot_descr : str

discrepancy_vs_epsilon()

Creates the Line chart of the possible discrepancy values with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding discrepancies. The actual discrepancy value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Returns :
line_plot : go.Figure
plot_descr : str

proba_discrepancy_vs_epsilon(delta)

Creates the Line chart of the possible probabilistic_discrepancy values for the given delta with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding discrepancies. The actual discrepancy value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Note : Method available only for binary classification task type.

Parameters :
delta : float
delta parameter indicates the minimum difference between two risk probabilities for the predictions to be considered conflicting.

Returns :
line_plot : go.Figure
plot_descr : str

rashomon_ratio_vs_epsilon()

Creates the Line chart of the possible rashomon_ratio values with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding rashomon_ratios. The actual rashomon_ratio value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Returns :
line_plot : go.Figure
plot_descr : str
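
The relationship between epsilon and the ratio can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's computation: the function name is hypothetical, the base model is taken to be the best-scoring model, higher metric values are assumed to be better, and the ratio is assumed to be the share of leaderboard models whose score lies within epsilon of the base score.

```python
def rashomon_ratio(scores, base_score, epsilon):
    """Fraction of leaderboard models scoring within epsilon of the base model."""
    in_set = [s for s in scores if base_score - s <= epsilon]
    return len(in_set) / len(scores)


# Made-up leaderboard of four models (higher score is better).
leaderboard_scores = [0.90, 0.89, 0.87, 0.80]
base_score = max(leaderboard_scores)

# Sweep over candidate epsilon values, as a line chart over epsilon would.
epsilons = [0.0, 0.02, 0.05, 0.2]
ratios = [rashomon_ratio(leaderboard_scores, base_score, eps) for eps in epsilons]
```

Under these assumptions the ratio never decreases as epsilon grows, because a larger tolerance can only admit more models into the set.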

pattern_rashomon_ratio_vs_epsilon()

Creates the Line chart of the possible pattern_rashomon_ratio values with respect to different epsilons. The x-axis represents the epsilon values, while the y-axis shows the corresponding pattern_rashomon_ratios. The actual pattern_rashomon_ratio value for the given Rashomon Set is highlighted with a different color. Returns the plot with its description.

Returns :
line_plot : go.Figure
plot_descr : str

proba_probabilities_for_sample(sample_index)

Visualizes the predicted probabilities for the chosen sample across all models present in the Rashomon Set. The sizes of the segments indicate how confident each model is in predicting each class. If the number of possible classes is greater than 10, the barplot is switched to a heatmap for better clarity. Returns the plot with its description.

Parameters :
sample_index : int
index of the sample for which predictions are to be plotted

Returns :
probabilities_plot : go.Figure
plot_descr : str

rashomon_capacity_for_sample(sample_index)

Visualizes the rashomon_capacity metric for the given sample index. Creates a 1D plot whose x-axis spans the range of possible Rashomon Capacity values, from 1 to the number of classes in the classification task (marked as black dots). The highlighted dot represents the Rashomon Capacity value for the selected sample. Returns the plot with its description.

Parameters :
sample_index : int
index of the sample for which the Rashomon Capacity is to be plotted

Returns :
scatter_plot : go.Figure
plot_descr : str

rashomon_capacity_distribution()

Visualizes the distribution of rashomon_capacity values across all samples in the dataset. The histogram visualizes the frequency of different capacity values. Returns the plot with its description.

Returns :
histogram_plot : go.Figure
plot_descr : str

rashomon_capacity_distribution_by_class()

Creates a visualization of the distribution of the rashomon capacity metric across all samples, grouped by their true class labels. Returns the plot with its description.

Returns :
box_plot : go.Figure
plot_descr : str

rashomon_capacity_distribution_threshold(threshold)

Creates a visualization of the distribution of the rashomon_capacity_threshold metric calculated with the given threshold. Returns the plot with its description.

Note : Method available only for binary classification task type.

Parameters :
threshold : float, by default 0.5
decision threshold for binary classification. If the predicted probability of the positive class (1) is greater than this threshold, the sample is assigned the label 1 (positive); otherwise, it is assigned 0 (negative). This threshold is applied to all observations.

Returns :
histogram : go.Figure
plot_descr : str
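
The thresholding rule described for the threshold parameter can be written down directly. The helper name below is hypothetical and the probabilities are made up; only the rule itself (strictly greater than the threshold yields label 1) comes from the description above.

```python
def labels_from_probabilities(positive_probs, threshold=0.5):
    """Assign 1 when the positive-class probability exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in positive_probs]


# Made-up positive-class probabilities for four observations.
probs = [0.2, 0.5, 0.51, 0.9]

labels_default = labels_from_probabilities(probs)
labels_strict = labels_from_probabilities(probs, threshold=0.8)
```

Note that a probability exactly equal to the threshold is assigned the negative label, since the comparison is strict.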

rashomon_capacity_distribution_labels()

Visualizes the distribution of rashomon_capacity_labels values across all samples in the dataset. The histogram visualizes the frequency of different capacity values. Returns the plot with its description.

Returns :
histogram_plot : go.Figure
plot_descr : str

percent_agreement_barplot()

Visualizes the distribution of percent_agreement values for all models in the Rashomon Set. The x-axis represents the different models, while the y-axis shows each model's percent_agreement with the base model. The horizontal line indicates the mean percent_agreement across all models from the Rashomon Set, excluding the base_model. Returns the plot with its description.

Returns :
bar_plot : go.Figure
plot_descr : str
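
As a rough illustration of the quantities on this plot, the sketch below computes each model's agreement with the base model and the mean across the non-base models. The function and model names are hypothetical, and expressing the result on a 0 to 100 scale is an assumption; the library may report a fraction instead.

```python
def percent_agreement(base_preds, model_preds):
    """Percentage of observations on which a model predicts the same label as the base model."""
    matches = sum(b == m for b, m in zip(base_preds, model_preds))
    return 100.0 * matches / len(base_preds)


# Made-up predicted labels for four observations.
base_preds = [1, 0, 1, 1]
other_models = {
    "model_a": [1, 0, 1, 0],
    "model_b": [1, 0, 1, 1],
}

agreements = {
    name: percent_agreement(base_preds, preds) for name, preds in other_models.items()
}

# Mean over the non-base models only, matching the horizontal line on the plot.
mean_agreement = sum(agreements.values()) / len(agreements)
```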

cohens_kappa_diverging_barplot()

Visualizes the distribution of cohens_kappa for all models in the Rashomon Set with respect to the base model. The y-axis represents the different models, while the x-axis shows the corresponding Cohen's Kappa metric value with the base model. Returns the plot with its description.

Returns :
fig : go.Figure
plot_descr : str

cohens_kappa_heatmap()

Generates a heatmap of Cohen’s Kappa values for every pair of models in the Rashomon Set, illustrating the level of agreement between their predictions. Returns the plot with its description.

Returns :
fig : go.Figure
plot_descr : str
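
The pairwise matrix behind such a heatmap can be sketched with the standard Cohen's Kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The code is a standalone illustration with hypothetical names, not the library's implementation; returning 1.0 when both models always predict the same single label is a choice made here to avoid division by zero.

```python
from collections import Counter


def cohens_kappa(preds_a, preds_b):
    """Cohen's Kappa between two sequences of predicted labels."""
    n = len(preds_a)
    observed = sum(a == b for a, b in zip(preds_a, preds_b)) / n
    counts_a, counts_b = Counter(preds_a), Counter(preds_b)
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both models predict one identical label everywhere
    return (observed - expected) / (1.0 - expected)


def kappa_matrix(predictions):
    """Return model names and the square matrix of pairwise kappa values."""
    names = list(predictions)
    matrix = [
        [cohens_kappa(predictions[row], predictions[col]) for col in names]
        for row in names
    ]
    return names, matrix


# Made-up predicted labels for four observations.
predictions = {
    "base": [0, 0, 1, 1],
    "model_a": [0, 0, 1, 1],
    "model_b": [0, 1, 0, 1],
}
names, matrix = kappa_matrix(predictions)
```

The matrix is symmetric with ones on the diagonal, so a heatmap of it carries the same information above and below the diagonal.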

generate_rashomon_set_table()

Creates a table of all models from the Rashomon Set with their base_metric values, sorted in descending order, with the name of the base model highlighted. Returns the table together with a brief text description of the main properties of the Rashomon Set.

Returns :
table : pd.DataFrame
set_descr : str

vprs_widths_plot()

Visualizes the distribution of VPR widths for all observations in the test dataset, grouped by their true class label, in the form of box plots with visible points. The x-axis represents the class label (0 or 1), while the y-axis shows the corresponding VPR widths. Returns the plot with its description.

Note : Method available only for binary classification task type.

Returns :
box_plot : go.Figure
plot_descr : str

vpr_width_histogram()

Creates the histogram of VPRs widths for all observations in the test dataset. The number of bins is determined using the Freedman–Diaconis method. The x-axis represents the ranges of VPR widths, while the y-axis indicates the number of observations falling within each range. Returns the plot with its description.

Note : Method available only for binary classification task type.

Returns :
histogram_plot : go.Figure
plot_descr : str
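
The Freedman–Diaconis rule sets the bin width to 2 * IQR / n^(1/3) and derives the number of bins from the data range. The sketch below applies it to VPR widths computed as the maximum minus the minimum risk prediction per observation. The function names are hypothetical, the quartile method (inclusive) is an assumption that may differ from what the library uses, and falling back to a single bin when the IQR is zero is a choice made here.

```python
import math
import statistics


def vpr_widths(risk_predictions_per_model):
    """Width of the prediction range (max - min across models) for each observation."""
    return [max(column) - min(column) for column in zip(*risk_predictions_per_model)]


def freedman_diaconis_bins(values):
    """Number of histogram bins suggested by the Freedman-Diaconis rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    spread = max(values) - min(values)
    if iqr == 0 or spread == 0:
        return 1  # degenerate data: fall back to a single bin
    bin_width = 2 * iqr / len(values) ** (1 / 3)
    return max(1, math.ceil(spread / bin_width))


# Made-up risk predictions: one row per model, one column per observation.
risk_predictions = [
    [0.10, 0.40, 0.80, 0.55, 0.30],
    [0.20, 0.35, 0.95, 0.50, 0.70],
    [0.15, 0.60, 0.90, 0.45, 0.50],
]

widths = vpr_widths(risk_predictions)
n_bins = freedman_diaconis_bins(widths)
```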

feature_importance_table()

Generates a table summarizing feature importance information. It provides information about the ranking and occurrence of the most important features across the base model and all models in the Rashomon set. If feature importances are not available for a certain model, the table column is empty.

Table columns:
Rank – the position of the feature (1 being the most important).
Base Model Top 3 – the three most important features for the base model, if feature importance is available.
Most Frequent in Rashomon Set – the features that most frequently appear as the 1st, 2nd, or 3rd most important across all models in the Rashomon set.

Returns :
table : go.Figure
table_descr : str

feature_importance_heatmap()

Generates a heatmap visualizing the top three most important features for each model in the Rashomon set. The x-axis shows feature names and the y-axis lists models from the Rashomon set.

Feature ranks are defined as follows:
Rank = 0 – feature is not present in the top 3 most important features.
Rank = 1 – feature is the most important feature selected by the model.
Rank = 2 – feature is the second most important feature selected by the model.
Rank = 3 – feature is the third most important feature selected by the model.

Cell colors represent the feature’s rank within each model. No color means the feature is not in the model's top 3. The most important feature for a model (Rank 1) is shown in red, the second most important (Rank 2) in orange, and the third most important (Rank 3) in yellow.

Note: If any model present in the Rashomon Set does not have feature importance available, then its corresponding row of the heatmap is empty. In other words, an empty row does not indicate that the model does not consider any of the features important.

Returns :
heatmap_plot : go.Figure
plot_descr : str
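
The rank encoding described above can be sketched as a small function that turns per-model feature importances into heatmap rows. The names and data are hypothetical, the input format (a mapping from feature name to importance, or None when importances are unavailable) is an assumption, and ties are broken by the input order of the features.

```python
def top3_rank_rows(importances_by_model, feature_names):
    """Build one row of ranks (0 to 3) per model; rows are empty (None) without importances."""
    rows = {}
    for model, importances in importances_by_model.items():
        if importances is None:
            # No feature importance available: leave the whole row empty.
            rows[model] = [None] * len(feature_names)
            continue
        top3 = sorted(importances, key=importances.get, reverse=True)[:3]
        rank_of = {feature: rank for rank, feature in enumerate(top3, start=1)}
        # Rank 0 marks features outside the model's top 3.
        rows[model] = [rank_of.get(feature, 0) for feature in feature_names]
    return rows


# Made-up feature importances for three models.
features = ["age", "income", "tenure", "region"]
importances_by_model = {
    "model_a": {"age": 0.5, "income": 0.3, "tenure": 0.15, "region": 0.05},
    "model_b": {"age": 0.1, "income": 0.2, "tenure": 0.3, "region": 0.4},
    "model_c": None,
}

rows = top3_rank_rows(importances_by_model, features)
```

Keeping None separate from rank 0 preserves the distinction made in the note above: an empty row means importances are missing, not that no feature matters.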

agreement_rates_density()

Creates the Violin plot of the agreement_rate values for all observations in the test dataset. Returns the plot with its description.

Returns :
violin_plot : go.Figure
plot_descr : str

vprs_vs_base_model_plot()

Creates a violin plot showing how the base model's risk predictions compare to the range of predictions produced by all models in the Rashomon Set. For each observation, the Viable Prediction Range (VPR) is defined as [min risk prediction, max risk prediction] across all models from the Rashomon Set. The plot shows where the base model's prediction falls within this range for each observation. For the positive class, the difference is calculated as the max risk prediction minus the base model's risk prediction; for the negative class, it is the base model's risk prediction minus the min risk prediction. Returns the plot with its description.

Note : Method available only for binary classification task type.

Returns :
violin_plot : go.Figure
plot_descr : str
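
The per-observation quantities described above can be sketched as follows. The function name and data are hypothetical, and the sketch assumes the base model's predictions are included among the Rashomon Set predictions, so the base prediction always lies inside the VPR.

```python
def vpr_and_base_gap(risk_predictions_per_model, base_predictions, true_labels):
    """Return ((min, max), gap) per observation, using the class-dependent gap rule."""
    results = []
    per_observation = zip(*risk_predictions_per_model)
    for model_preds, base, label in zip(per_observation, base_predictions, true_labels):
        low, high = min(model_preds), max(model_preds)
        # Positive class: distance to the top of the range.
        # Negative class: distance to the bottom of the range.
        gap = high - base if label == 1 else base - low
        results.append(((low, high), gap))
    return results


# Made-up risk predictions: one row per model, one column per observation.
# The last row is the base model.
risk_predictions = [
    [0.2, 0.7],
    [0.4, 0.9],
    [0.3, 0.8],
]
base_predictions = risk_predictions[-1]
true_labels = [0, 1]

gaps = vpr_and_base_gap(risk_predictions, base_predictions, true_labels)
```

Under the stated assumption, a gap of zero means the base model already gives the most extreme prediction in the direction of the true class.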