Visualization Data Helper


Visualization data helper for structure visualization operations.

This module provides helper utilities for preparing visualization data, including PDB info preparation, feature color assignment, and feature extraction from selectors. Used by both StructureVizFeatureImportanceService and StructureVizFeatureService.

class mdxplain.structure_visualization.helper.visualization_data_helper.VisualizationDataHelper

Helper class for structure visualization data operations.

Provides static methods for preparing PDB info dictionaries with colors, assigning feature colors, and extracting features from selectors. Used by both Feature Importance and Feature-based services.

Examples

>>> pdb_info = VisualizationDataHelper.prepare_pdb_info_from_viz_data(
...     viz_data
... )
>>> colors = VisualizationDataHelper.assign_feature_colors(features, 5)

static prepare_pdb_info_from_comp_data(viz_data: StructureVisualizationData, comp_data: Any) -> Dict[str, Dict[str, str]]

Prepare PDB info from visualization and comparison data.

Creates dictionary mapping sub-comparison names to absolute PDB paths and assigned colors. Used by Feature Importance Service.

Parameters

viz_data : StructureVisualizationData

Visualization data containing PDB paths

comp_data : ComparisonData

Comparison data with sub-comparisons

Returns

Dict[str, Dict[str, str]]

Dictionary with structure info:

  • Keys: sub-comparison identifiers

  • Values: {"path": absolute_pdb_path, "color": hex_color}

Examples

>>> pdb_info = VisualizationDataHelper.prepare_pdb_info_from_comp_data(
...     viz_data, comp_data
... )
>>> pdb_info["cluster_0_vs_rest"]
{'path': '/abs/path/to/c0.pdb', 'color': '#bf4242'}

Notes

  • Paths normalized via PathUtils.prepare_file_path()

  • Colors generated using ColorUtils.generate_distinct_colors()

  • Color assignment follows sub-comparison order
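
As a rough illustration of the returned structure, the sketch below builds a dictionary of the same shape without importing mdxplain. The helpers distinct_hex_colors and build_pdb_info are hypothetical stand-ins, and the HSV-based palette is an assumption; the actual results of ColorUtils.generate_distinct_colors() and PathUtils.prepare_file_path() may differ.

```python
import colorsys
import os
from typing import Dict, List


def distinct_hex_colors(n: int) -> List[str]:
    """Hypothetical stand-in for ColorUtils.generate_distinct_colors()."""
    colors = []
    for i in range(n):
        # Spread hues evenly around the color wheel (an assumption).
        r, g, b = colorsys.hsv_to_rgb(i / max(n, 1), 0.65, 0.75)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255))
        )
    return colors


def build_pdb_info(pdb_paths: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Map each identifier to an absolute path and a color, in insertion order."""
    colors = distinct_hex_colors(len(pdb_paths))
    return {
        name: {"path": os.path.abspath(path), "color": color}
        for (name, path), color in zip(pdb_paths.items(), colors)
    }


# Illustrative identifiers and relative paths, not taken from a real pipeline.
pdb_info = build_pdb_info(
    {
        "cluster_0_vs_rest": "structures/c0.pdb",
        "cluster_1_vs_rest": "structures/c1.pdb",
    }
)
for name, info in pdb_info.items():
    print(name, info["path"], info["color"])
```

Because the colors are zipped against the dictionary in insertion order, reordering the sub-comparisons changes which structure receives which color.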

static prepare_pdb_info_from_viz_data(viz_data: StructureVisualizationData) -> Dict[str, Dict[str, str]]

Prepare PDB info from visualization data only.

Creates dictionary with structure info from viz_data.get_all_pdbs(). Used by Feature Service where no comparison data is available.

Parameters

viz_data : StructureVisualizationData

Visualization data with PDB paths

Returns

Dict[str, Dict[str, str]]

Dictionary with structure info:

  • Keys: structure identifiers

  • Values: {"path": absolute_pdb_path, "color": hex_color}

Examples

>>> viz_data.add_pdb("cluster_0", "/path/to/c0.pdb")
>>> viz_data.add_pdb("cluster_1", "/path/to/c1.pdb")
>>> pdb_info = VisualizationDataHelper.prepare_pdb_info_from_viz_data(
...     viz_data
... )
>>> len(pdb_info)
2

Notes

  • Paths normalized via PathUtils.prepare_file_path()

  • Colors generated using ColorUtils.generate_distinct_colors()

  • Color assignment follows dictionary iteration order

static assign_feature_colors(top_features: List[Dict[str, Any]], n_features: int, offset: int = 0) -> Dict[str, str]

Assign distinct colors to top features.

Creates color mapping for features using visually distinct colors for highlighting in visualizations. Uses ColorUtils to generate perceptually distinct colors.

Parameters

top_features : List[Dict[str, Any]]

List of top feature dictionaries with a 'feature_name' key

n_features : int

Number of features to assign colors to

offset : int, default=0

Color offset for continuous coloring across feature groups. Used to assign different colors to global vs local features.

Returns

Dict[str, str]

Mapping from feature name to HEX color string

Examples

>>> # Global features get colors 0-2
>>> global_features = [{"feature_name": "ALA_5_CA-GLU_10_CA"}]
>>> global_colors = VisualizationDataHelper.assign_feature_colors(
...     global_features, 3, offset=0
... )
>>> # Local features get colors 3-5
>>> local_features = [{"feature_name": "GLY_3_phi"}]
>>> local_colors = VisualizationDataHelper.assign_feature_colors(
...     local_features, 3, offset=3
... )

Notes

  • Uses ColorUtils.generate_distinct_colors() for color generation

  • If feature_name is missing, "feature_{i}" is used as a fallback

  • Only first n_features are assigned colors

  • Offset enables continuous coloring: with N global features, global features take color indices 0 to N-1 and local features continue from index N (offset=N)
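
The offset behaviour described in these notes can be sketched as follows. This is a minimal illustration, not the library's implementation: PALETTE is a hypothetical fixed palette standing in for ColorUtils.generate_distinct_colors(), and wrapping around with a modulo when the palette is exhausted is an assumption.

```python
from typing import Any, Dict, List

# Hypothetical palette; the real colors come from ColorUtils.
PALETTE = ["#bf4242", "#bfbf42", "#42bf42", "#42bfbf", "#4242bf", "#bf42bf"]


def assign_colors_sketch(
    top_features: List[Dict[str, Any]], n_features: int, offset: int = 0
) -> Dict[str, str]:
    """Assign palette colors to the first n_features, shifted by offset."""
    colors: Dict[str, str] = {}
    for i, feature in enumerate(top_features[:n_features]):
        # Fall back to "feature_{i}" when the dict has no feature_name.
        name = feature.get("feature_name", f"feature_{i}")
        # Wrapping with modulo is an assumption of this sketch.
        colors[name] = PALETTE[(offset + i) % len(PALETTE)]
    return colors


global_colors = assign_colors_sketch(
    [{"feature_name": "ALA_5_CA-GLU_10_CA"}], 3, offset=0
)
local_colors = assign_colors_sketch(
    [{"feature_name": "GLY_3_phi"}, {"importance": 0.2}], 3, offset=3
)
print(global_colors)  # {'ALA_5_CA-GLU_10_CA': '#bf4242'}
print(local_colors)   # {'GLY_3_phi': '#42bfbf', 'feature_1': '#4242bf'}
```

Passing the number of global features as the offset keeps the two groups on disjoint palette slots, so a local feature never shares a color with a global one as long as the palette is large enough.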

static extract_local_features_and_colors(pipeline_data: PipelineData, fi_data: Any, comp_data: Any, n_top_local: int, n_top_global: int, feature_own_color: bool = True) -> tuple

Extract local top features and colors per cluster.

Extracts cluster-specific top features from feature importance data and assigns colors with optional offset for continuous coloring.

Parameters

pipeline_data : PipelineData

Pipeline data object

fi_data : FeatureImportanceData

Feature importance data

comp_data : ComparisonData

Comparison data with sub-comparisons

n_top_local : int

Number of local features per cluster

n_top_global : int

Number of global features (for color offset)

feature_own_color : bool, default=True

Not used for color assignment (kept for API compatibility). Colors always use offset to avoid conflicts with global features.

Returns

tuple

(top_features_local, feature_colors_local) where:

  • top_features_local: Dict[str, List[Dict]] - features per cluster

  • feature_colors_local: Dict[str, Dict[str, str]] - colors per cluster

Examples

>>> local_feats, local_colors = VisualizationDataHelper.extract_local_features_and_colors(
...     pipeline_data, fi_data, comp_data, n_top_local=3,
...     n_top_global=3, feature_own_color=False
... )
>>> local_feats["cluster_0"]
[{'feature_name': 'ALA_5-GLU_10', ...}]

Notes

  • Returns empty dicts if n_top_local is 0

  • Color offset enables continuous coloring: global features take color indices 0 to N-1 and local features start at index N, where N is n_top_global
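
A minimal sketch of the per-cluster extraction, under stated assumptions: ranked_by_cluster is a hypothetical plain dictionary standing in for what the method reads from fi_data and comp_data, PALETTE is a hypothetical palette, and restarting every cluster at the same offset (n_top_global) is a guess that the source does not confirm.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical palette; the real colors come from ColorUtils.
PALETTE = ["#bf4242", "#bfbf42", "#42bf42", "#42bfbf", "#4242bf", "#bf42bf"]

FeatureList = List[Dict[str, Any]]


def local_features_sketch(
    ranked_by_cluster: Dict[str, FeatureList], n_top_local: int, n_top_global: int
) -> Tuple[Dict[str, FeatureList], Dict[str, Dict[str, str]]]:
    """Return (top_features_local, feature_colors_local) per cluster."""
    if n_top_local == 0:
        return {}, {}  # documented behaviour: empty dicts
    top_local: Dict[str, FeatureList] = {}
    colors_local: Dict[str, Dict[str, str]] = {}
    for cluster, ranked in ranked_by_cluster.items():
        top = ranked[:n_top_local]
        top_local[cluster] = top
        # Local colors start after the slots reserved for global features.
        colors_local[cluster] = {
            f["feature_name"]: PALETTE[(n_top_global + i) % len(PALETTE)]
            for i, f in enumerate(top)
        }
    return top_local, colors_local


# Illustrative ranking, not real feature importance output.
ranked = {
    "cluster_0": [{"feature_name": "ALA_5-GLU_10"}, {"feature_name": "GLY_3_phi"}],
    "cluster_1": [{"feature_name": "LYS_7-ASP_12"}],
}
local_feats, local_colors = local_features_sketch(ranked, n_top_local=1, n_top_global=3)
print(local_feats["cluster_0"])   # [{'feature_name': 'ALA_5-GLU_10'}]
print(local_colors["cluster_0"])  # {'ALA_5-GLU_10': '#42bfbf'}
```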

static extract_features_from_selector(pipeline_data: PipelineData, selector_name: str) -> List[Dict[str, Any]]

Extract all features from a feature selector.

Uses the public API method get_selected_metadata() to retrieve already selected and formatted feature metadata, the same approach as the Feature Importance Service.

Parameters

pipeline_data : PipelineData

Pipeline data object

selector_name : str

Feature selector name

Returns

List[Dict[str, Any]]

List of feature dicts with metadata:

  • feature_index: int

  • feature_name: str

  • feature_type: str

  • residue_seqids: List[int]

  • residue_indices: List[int]

Raises

KeyError

If the selector is not found in selected_feature_data

Examples

>>> features = VisualizationDataHelper.extract_features_from_selector(
...     pipeline_data, "important_distances"
... )
>>> features[0]["feature_name"]
'ALA_5_CA-GLU_10_CA'

Notes

This method uses the same public API as the Feature Importance Service, pipeline_data.get_selected_metadata(), which ensures consistency and avoids index translation issues.
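
To show how the returned list might be consumed, the sketch below collects the residues touched by a set of features. The feature list is hand-written with the documented keys; all values, including the feature_type strings and the index numbers, are illustrative only, and residues_to_highlight is a hypothetical helper rather than part of mdxplain.

```python
from typing import Any, Dict, List

# Hand-written stand-in for the list returned by extract_features_from_selector().
features: List[Dict[str, Any]] = [
    {
        "feature_index": 0,
        "feature_name": "ALA_5_CA-GLU_10_CA",
        "feature_type": "distances",  # illustrative value
        "residue_seqids": [5, 10],
        "residue_indices": [4, 9],  # illustrative values
    },
    {
        "feature_index": 1,
        "feature_name": "GLY_3_phi",
        "feature_type": "torsions",  # illustrative value
        "residue_seqids": [3],
        "residue_indices": [2],  # illustrative values
    },
]


def residues_to_highlight(feature_dicts: List[Dict[str, Any]]) -> List[int]:
    """Collect the unique residue sequence IDs across all features, sorted."""
    seqids = set()
    for feature in feature_dicts:
        seqids.update(feature["residue_seqids"])
    return sorted(seqids)


print(residues_to_highlight(features))  # [3, 5, 10]
```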