Base Feature Importance Plot Data Preparer

Base data preparer for feature importance plot types.

Provides shared data preparation logic for violin plots, density plots, and other feature importance visualizations.

class mdxplain.plots.helper.base_feature_importance_plot_data_preparer.BaseFeatureImportancePlotDataPreparer

Base class for feature importance plot data preparation.

Provides common data preparation logic for violin plots, density plots, and other visualizations based on feature importance analysis. Handles feature extraction, contact transformation, and metadata collection.

Subclasses (ViolinDataPreparer, DensityDataPreparer) can inherit all functionality or override specific methods for customization.
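The inherit-or-override pattern might look like the sketch below. `StandInBasePreparer` is a hypothetical stand-in so the snippet runs without mdxplain; its body and toy return values are assumptions, not the library's implementation. Only the method name, its parameters, and the four-element return shape are taken from this page.

```python
from typing import Any, Dict, List, Optional, Tuple

PlotData = Dict[str, Dict[str, Dict[str, List[float]]]]
Result = Tuple[PlotData, Dict[str, Any], Dict[str, str], Optional[float]]


class StandInBasePreparer:
    """Hypothetical stand-in for the real base class (assumption)."""

    @staticmethod
    def prepare_from_manual_selection(
        pipeline_data: Any,
        feature_selector_name: str,
        data_selectors: List[str],
        contact_transformation: bool = True,
    ) -> Result:
        # Toy values only; the real method derives them from pipeline_data.
        feat_type = "distances" if contact_transformation else "contacts"
        plot_data = {feat_type: {"feat_a": {s: [3.0, 1.0, 2.0] for s in data_selectors}}}
        metadata = {feat_type: {"feat_a": {"type_metadata": {}, "features": {}}}}
        colors = {s: "#000000" for s in data_selectors}
        cutoff = 4.5 if contact_transformation else None
        return plot_data, metadata, colors, cutoff


class InheritingPreparer(StandInBasePreparer):
    """Inherits all functionality unchanged."""


class OverridingPreparer(StandInBasePreparer):
    """Overrides one method and delegates the shared work to the base class."""

    @staticmethod
    def prepare_from_manual_selection(
        pipeline_data, feature_selector_name, data_selectors, contact_transformation=True
    ):
        plot_data, metadata, colors, cutoff = StandInBasePreparer.prepare_from_manual_selection(
            pipeline_data, feature_selector_name, data_selectors, contact_transformation
        )
        # Customization step (illustrative): sort each selector's values.
        for features in plot_data.values():
            for per_selector in features.values():
                for name in per_selector:
                    per_selector[name] = sorted(per_selector[name])
        return plot_data, metadata, colors, cutoff
```

Because the methods are static, the override calls the base class by name rather than through `super()`.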

Examples

>>> # Feature Importance mode
>>> data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_feature_importance(
...     pipeline_data, "tree_analysis", n_top=10
... )
>>> # Manual Selection mode
>>> data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_manual_selection(
...     pipeline_data, "my_selector", ["cluster_0", "cluster_1"]
... )

static prepare_from_feature_importance(pipeline_data: PipelineData, feature_importance_name: str, n_top: int, contact_transformation: bool = True) -> Tuple[Dict[str, Dict[str, Dict[str, np.ndarray]]], Dict[str, Dict[str, Dict[str, Any]]], Dict[str, str], float | None]

Complete preparation for Feature Importance mode.

Coordinates all steps: validates feature importance, extracts top features, optionally converts contacts to distances, prepares plot data, collects feature metadata, and creates DataSelector color mapping.

Parameters

pipeline_data : PipelineData

Pipeline data container

feature_importance_name : str

Name of feature importance analysis

n_top : int

Number of top features per comparison

contact_transformation : bool, default=True

If True, converts contact features to distances. If False, keeps contacts as binary (0/1) values.

Returns

plot_data : Dict[str, Dict[str, Dict[str, np.ndarray]]]

Three-level nested structure: feat_type -> feat_name -> data_selector_name -> values

feature_metadata_map : Dict[str, Dict[str, Dict[str, Any]]]

Feature metadata: feat_type -> feat_name -> {type_metadata, features}

data_selector_colors : Dict[str, str]

Mapping of data_selector_name -> color_hex

contact_cutoff : Optional[float]

Contact cutoff value if converted from contacts, None otherwise
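To make the nesting concrete, the sketch below walks a hand-built `plot_data` dictionary of the documented shape. The feature names, selector names, and values are invented for illustration, and plain lists stand in for `np.ndarray` so the snippet has no dependencies.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Toy data with the documented nesting:
# feat_type -> feat_name -> data_selector_name -> values
plot_data: Dict[str, Dict[str, Dict[str, List[float]]]] = {
    "distances": {
        "res_10-res_42": {"cluster_0": [3.0, 4.0, 5.0], "cluster_1": [8.0, 9.0, 10.0]},
    },
    "torsions": {
        "res_10_phi": {"cluster_0": [-60.0, -70.0], "cluster_1": [120.0, 130.0]},
    },
}


def summarize(data: Dict[str, Dict[str, Dict[str, List[float]]]]) -> List[Tuple[str, str, str, float]]:
    """Return one (feat_type, feat_name, selector, mean) row per leaf array."""
    rows = []
    for feat_type, features in data.items():
        for feat_name, per_selector in features.items():
            for selector_name, values in per_selector.items():
                rows.append((feat_type, feat_name, selector_name, mean(values)))
    return rows


for row in summarize(plot_data):
    print(row)
```

Each leaf holds the values of one feature for one DataSelector, which is the unit a violin or density plot draws.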

Raises

ValueError

If the feature importance analysis is not found

Examples

>>> # With contact transformation (default)
>>> plot_data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_feature_importance(
...     pipeline_data, "tree_analysis", n_top=10
... )
>>> print(list(plot_data.keys()))  # e.g. ['distances', 'torsions']
>>> print(cutoff)  # e.g. 4.5
>>> # Without contact transformation (binary contacts)
>>> plot_data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_feature_importance(
...     pipeline_data, "tree_analysis", n_top=10, contact_transformation=False
... )
>>> print(list(plot_data.keys()))  # e.g. ['contacts', 'torsions']
>>> print(cutoff)  # None

Notes

Returns the same structure as prepare_from_manual_selection(), so downstream processing is identical in both modes.

When contact_transformation=False, contacts remain binary (0/1) features, which are visualized with Gaussian smoothing.
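One way binary contacts could be turned into a smooth curve is a Gaussian kernel density estimate. The sketch below is an assumption about what the Gaussian smoothing might involve, not the library's implementation; the function name, the `bandwidth` value, and the sample data are illustrative.

```python
import math
from typing import List, Sequence


def gaussian_kde(samples: Sequence[float], grid: Sequence[float], bandwidth: float = 0.1) -> List[float]:
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Binary contact values for one feature in one DataSelector (toy data).
contacts = [0, 0, 1, 1, 1, 1, 0, 1]

# Evaluation grid from -0.5 to 1.5 in steps of 0.01.
grid = [i / 100 for i in range(-50, 151)]
density = gaussian_kde(contacts, grid)
```

The result has two bumps, one at 0 and one at 1, whose heights reflect how often the contact is broken or formed.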

static prepare_from_manual_selection(pipeline_data: PipelineData, feature_selector_name: str, data_selectors: List[str], contact_transformation: bool = True) -> Tuple[Dict[str, Dict[str, Dict[str, np.ndarray]]], Dict[str, Dict[str, Dict[str, Any]]], Dict[str, str], float | None]

Complete preparation for Manual Selection mode.

Coordinates all steps: optionally converts contacts to distances, gets all features from selector, builds plot data, collects feature metadata, and creates DataSelector color mapping.

Parameters

pipeline_data : PipelineData

Pipeline data container

feature_selector_name : str

Name of feature selector

data_selectors : List[str]

DataSelector names to plot

contact_transformation : bool, default=True

If True, converts contact features to distances. If False, keeps contacts as binary (0/1) values.
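The relationship between the two representations can be sketched as follows, assuming a contact is defined as a distance at or below the cutoff (a common convention; this page does not state the exact rule). The cutoff of 4.5 matches the value shown in the examples on this page; the distance values and the helper name are invented.

```python
from typing import List


def distances_to_contacts(distances: List[float], cutoff: float) -> List[int]:
    """Binarize distances: 1 where distance <= cutoff, else 0 (assumed convention)."""
    return [1 if d <= cutoff else 0 for d in distances]


distances = [3.8, 4.4, 4.6, 7.2, 4.5]
contacts = distances_to_contacts(distances, cutoff=4.5)
print(contacts)  # [1, 1, 0, 0, 1]
```

With contact_transformation=True a plot would show the continuous distances; with False it would show the binary values.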

Returns

plot_data : Dict[str, Dict[str, Dict[str, np.ndarray]]]

Three-level nested structure: feat_type -> feat_name -> data_selector_name -> values

feature_metadata_map : Dict[str, Dict[str, Dict[str, Any]]]

Feature metadata: feat_type -> feat_name -> {type_metadata, features}

data_selector_colors : Dict[str, str]

Mapping of data_selector_name -> color_hex

contact_cutoff : Optional[float]

Contact cutoff value if converted from contacts, None otherwise

Raises

ValueError

If the feature selector or any of the data selectors is not found

Examples

>>> # With contact transformation (default)
>>> plot_data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_manual_selection(
...     pipeline_data, "my_selector", ["cluster_0", "cluster_1"]
... )
>>> print(list(plot_data.keys()))  # e.g. ['distances', 'torsions']
>>> print(cutoff)  # e.g. 4.5
>>> # Without contact transformation (binary contacts)
>>> plot_data, metadata, colors, cutoff = BaseFeatureImportancePlotDataPreparer.prepare_from_manual_selection(
...     pipeline_data, "my_selector", ["cluster_0", "cluster_1"],
...     contact_transformation=False
... )
>>> print(list(plot_data.keys()))  # e.g. ['contacts', 'torsions']
>>> print(cutoff)  # None

Notes

Returns the same structure as prepare_from_feature_importance(), so downstream processing is identical in both modes.
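Because both modes return the same four-element tuple, downstream code can be written once. The sketch below builds legend entries and an optional annotation from such a tuple; `legend_and_note` is a hypothetical helper, the fallback color is an arbitrary choice, and the tuple is hand-built rather than produced by a real call.

```python
from typing import Any, Dict, List, Optional, Tuple

Result = Tuple[
    Dict[str, Dict[str, Dict[str, List[float]]]],  # plot_data
    Dict[str, Dict[str, Dict[str, Any]]],          # feature_metadata_map
    Dict[str, str],                                # data_selector_colors
    Optional[float],                               # contact_cutoff
]


def legend_and_note(result: Result) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    """Build (selector, color) legend entries and a cutoff note from either mode's output."""
    plot_data, _metadata, colors, cutoff = result
    # Collect every DataSelector name that appears anywhere in plot_data.
    selectors = sorted(
        {name for feats in plot_data.values() for per_sel in feats.values() for name in per_sel}
    )
    legend = [(name, colors.get(name, "#808080")) for name in selectors]
    note = None if cutoff is None else f"contact cutoff = {cutoff}"
    return legend, note


# Hand-built stand-in for the tuple returned by either preparation method.
toy_result: Result = (
    {"distances": {"feat_a": {"cluster_1": [5.0], "cluster_0": [4.0]}}},
    {"distances": {"feat_a": {"type_metadata": {}, "features": {}}}},
    {"cluster_0": "#1f77b4", "cluster_1": "#ff7f0e"},
    4.5,
)
legend, note = legend_and_note(toy_result)
```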

When contact_transformation=False, contacts remain binary (0/1) features, which are visualized with Gaussian smoothing.