Feature Meta Data Utils
Central utilities for feature metadata operations.
Provides generic methods for extracting feature names and types from pipeline metadata, usable across all modules (plots, feature_importance, etc.).
- class mdxplain.utils.feature_metadata_utils.FeatureMetadataUtils
Central utility class for feature metadata operations.
Provides static methods for extracting human-readable feature names and types from pipeline metadata. These utilities are shared across multiple modules to ensure consistency.
Examples
>>> # Get feature name from metadata
>>> metadata = pipeline_data.get_selected_metadata("my_selector")
>>> name = FeatureMetadataUtils.get_feature_name(metadata, 42)
>>> print(name)  # "ALA_5_CA-GLU_10_CA" or "ALA_5_phi"

>>> # Get feature type
>>> ftype = FeatureMetadataUtils.get_feature_type(metadata, 42)
>>> print(ftype)  # "distances" or "torsions"
- static get_feature_name(feature_metadata: List[Any] | None, feature_idx: int, use_for_plotting: bool = False) -> str
Get human-readable name for a feature index.
Extracts the feature name from metadata for the given index, handling both pair features (distances) and non-pair features (torsions). Provides fallback when metadata is unavailable.
Parameters
- feature_metadata : list or None
Feature metadata list from pipeline (e.g., from get_selected_metadata())
- feature_idx : int
Index of the feature to get name for
- use_for_plotting : bool, default=False
If True, apply Matplotlib-friendly formatting for residue names with consensus labels. For patterns like R131x3.50, the x separator is replaced by superscript formatting, resulting in R131$^{3.50}$.
Returns
- str
Human-readable feature name
Examples
>>> # Pair feature (distances)
>>> name = FeatureMetadataUtils.get_feature_name(metadata, 10)
>>> print(name)  # "ALA_5_CA-GLU_10_CA"

>>> # Non-pair feature (torsions)
>>> name = FeatureMetadataUtils.get_feature_name(metadata, 20)
>>> print(name)  # "ALA_5_phi"

>>> # Fallback when metadata unavailable
>>> name = FeatureMetadataUtils.get_feature_name(None, 42)
>>> print(name)  # "feature_42"
Notes
Pair features (2 partners): Joined with “-” separator
Non-pair features (1 element): Single name
Uses numpy array iteration for partner extraction
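The naming rules above (join pair partners with "-", fall back to feature_<idx>, optionally turn a consensus suffix into a superscript) can be illustrated with a small standalone sketch. This is not the library's implementation: sketch_feature_name, format_consensus, and the regular expression are hypothetical, and the partner names are passed in directly instead of being read from pipeline metadata.

```python
import re
from typing import Optional, Sequence

# Hypothetical pattern: residue label, a literal "x", then a dotted consensus number.
_CONSENSUS = re.compile(r"([A-Za-z]+\d+)x(\d+\.\d+)")


def format_consensus(label: str) -> str:
    """Rewrite 'R131x3.50' as 'R131$^{3.50}$' (Matplotlib mathtext superscript)."""
    return _CONSENSUS.sub(r"\1$^{\2}$", label)


def sketch_feature_name(
    partners: Optional[Sequence[str]],
    feature_idx: int,
    use_for_plotting: bool = False,
) -> str:
    """Join partner names with '-', or fall back to 'feature_<idx>'."""
    if not partners:
        # Assumed fallback, mirroring the documented "feature_42" example.
        return f"feature_{feature_idx}"
    names = [format_consensus(p) if use_for_plotting else p for p in partners]
    return "-".join(names)
```

Labels without a consensus suffix, such as ALA_5_CA, pass through format_consensus unchanged, so the same call works for mixed inputs.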
- static get_feature_type(feature_metadata: List[Any] | None, feature_idx: int) -> str
Get feature type for a feature index.
Extracts the feature type from metadata for the given index, providing fallback when metadata is unavailable.
Parameters
- feature_metadata : list or None
Feature metadata list from pipeline
- feature_idx : int
Index of the feature to get type for
Returns
- str
Feature type name (e.g., “distances”, “torsions”, “sasa”)
Examples
>>> # Get feature type
>>> ftype = FeatureMetadataUtils.get_feature_type(metadata, 42)
>>> print(ftype)  # "distances"

>>> # Fallback when metadata unavailable
>>> ftype = FeatureMetadataUtils.get_feature_type(None, 42)
>>> print(ftype)  # "unknown"
Notes
Returns “unknown” when metadata is unavailable or feature index is out of bounds.
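The fallback behaviour can be sketched as a guarded lookup. The entry layout here is an assumption made for illustration only: each metadata entry is taken to be a dict with a "type" key, which the documentation does not confirm, and sketch_feature_type is a hypothetical name.

```python
from typing import Any, Dict, List, Optional


def sketch_feature_type(
    feature_metadata: Optional[List[Dict[str, Any]]],
    feature_idx: int,
) -> str:
    """Return the entry's type, or 'unknown' if metadata is missing or idx is out of range."""
    if feature_metadata is None:
        return "unknown"
    if not 0 <= feature_idx < len(feature_metadata):
        return "unknown"
    # Assumed key name "type"; the real metadata layout may differ.
    return feature_metadata[feature_idx].get("type", "unknown")


toy_metadata = [{"type": "distances"}, {"type": "torsions"}]
```

Returning a sentinel string instead of raising keeps plotting and reporting code free of per-call error handling, at the cost of silently masking a bad index.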
- static get_feature_residues(feature_metadata: List[Any] | None, feature_idx: int) -> List[Dict[str, Any]]
Get residue information directly from feature metadata.
Extracts structured residue information for a feature without any string parsing. Returns list of residue dictionaries with complete residue information (index, seqid, name, etc.).
Parameters
- feature_metadata : list or None
Feature metadata list from pipeline
- feature_idx : int
Index of the feature to get residues for
Returns
- List[Dict[str, Any]]
List of residue dictionaries. Each dict contains:
index: int - Residue index in topology
seqid: int - Residue sequence ID
name: str - Residue name (e.g., “THR”, “GLU”)
aaa_code: str - Three-letter code
a_code: str - One-letter code
consensus: str or None - Consensus label if available
Returns empty list if metadata unavailable.
Examples
>>> # Get residues for distance feature (2 residues)
>>> residues = FeatureMetadataUtils.get_feature_residues(metadata, 10)
>>> len(residues)
2
>>> residues[0]["seqid"]
24
>>> residues[0]["name"]
'THR'

>>> # Get residues for torsion feature (1 residue)
>>> residues = FeatureMetadataUtils.get_feature_residues(metadata, 20)
>>> len(residues)
1

>>> # Returns empty list when metadata unavailable
>>> residues = FeatureMetadataUtils.get_feature_residues(None, 42)
>>> len(residues)
0
Notes
NO string parsing - reads structured metadata directly
Typesafe - returns complete residue dictionaries
Works for both pair features (distances) and single features (torsions)
Replaces FeatureResidueParser.parse_residues_from_name()
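Because the result is structured, downstream code can build labels from dictionary keys instead of parsing name strings. The sketch below consumes a list shaped like the documented return value. The helper names and the sample residue values are illustrative and not part of the library.

```python
from typing import Any, Dict, List


def residue_label(residue: Dict[str, Any]) -> str:
    """Build a short label such as 'T24' from one residue dictionary."""
    label = f"{residue['a_code']}{residue['seqid']}"
    consensus = residue.get("consensus")
    # Append the consensus label only when one is present.
    return f"{label} ({consensus})" if consensus else label


def feature_label(residues: List[Dict[str, Any]]) -> str:
    """Join residue labels with '-'; an empty list maps to 'unknown'."""
    if not residues:
        return "unknown"
    return "-".join(residue_label(r) for r in residues)


# Hand-written sample mirroring the documented keys (values are illustrative).
sample = [
    {"index": 23, "seqid": 24, "name": "THR", "aaa_code": "THR",
     "a_code": "T", "consensus": None},
    {"index": 49, "seqid": 50, "name": "GLU", "aaa_code": "GLU",
     "a_code": "E", "consensus": "3.50"},
]
```

The same helper handles pair features (two residues) and single features (one residue), since it only iterates over whatever list it receives.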
- static create_feature_map(metadata_array: ndarray, use_for_plotting: bool = False) -> dict
Create feature index to name mapping from metadata array.
Extracts all feature names from metadata array and creates a dictionary mapping feature indices to their names.
Parameters
- metadata_array : np.ndarray
Feature metadata array from pipeline
- use_for_plotting : bool, default=False
Whether to apply plot-specific formatting when building feature names, such as consensus superscripts.
Returns
- Dict[int, str]
Mapping of feature_index -> feature_name
Examples
>>> metadata = pipeline_data.get_selected_metadata("my_selector")
>>> feature_map = FeatureMetadataUtils.create_feature_map(metadata)
>>> print(feature_map[42])  # "ALA_5_CA-GLU_10_CA"
Notes
Uses get_feature_name() internally for consistent name extraction across all feature types (pairs and non-pairs).
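The delegation described in this note amounts to one name lookup per index. A minimal sketch, assuming the naming function is passed in as a callable: sketch_feature_map, toy_name, and the toy metadata layout are all hypothetical stand-ins, not the library's code.

```python
from typing import Any, Callable, Dict, Sequence


def sketch_feature_map(
    metadata: Sequence[Any],
    name_of: Callable[[Sequence[Any], int], str],
) -> Dict[int, str]:
    """Map every feature index to the name produced by name_of."""
    return {idx: name_of(metadata, idx) for idx in range(len(metadata))}


def toy_name(metadata: Sequence[Any], idx: int) -> str:
    """Stand-in naming function: join the partner names of one toy entry."""
    return "-".join(metadata[idx])


# Toy layout: each entry is simply a list of partner names.
toy_metadata = [["ALA_5_CA", "GLU_10_CA"], ["ALA_5_phi"]]
feature_map = sketch_feature_map(toy_metadata, toy_name)
```

Routing every lookup through a single naming function is what keeps pair and non-pair features consistent: any change to the naming rule shows up in the whole map at once.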
- static get_top_level_metadata(feature_type: str, feature_metadata: List[Any] | None) -> Dict[str, Any]
Get top-level metadata for a feature type.
Searches feature metadata list for the first entry matching the specified feature type and returns its type-level metadata. This metadata contains type-wide configuration (e.g., unit labels, visualization settings) rather than feature-specific information.
Parameters
- feature_type : str
Name of the feature type to search for (e.g., “distances”, “torsions”)
- feature_metadata : list or None
Feature metadata list from pipeline
Returns
- Dict[str, Any]
Type-level metadata dictionary. Returns empty dict if:
feature_metadata is None
No entry with matching type found
Matching entry has no type_metadata
Examples
>>> # Get visualization metadata for distances
>>> metadata = pipeline_data.get_selected_metadata("my_selector")
>>> type_meta = FeatureMetadataUtils.get_top_level_metadata("distances", metadata)
>>> print(type_meta.get("unit_label"))  # "Å"
>>> print(type_meta.get("allow_hide_prefix"))  # False

>>> # Get metadata for torsions
>>> type_meta = FeatureMetadataUtils.get_top_level_metadata("torsions", metadata)
>>> print(type_meta.get("unit_label"))  # "°"
>>> print(type_meta.get("allow_hide_prefix"))  # True

>>> # Returns empty dict when type not found
>>> type_meta = FeatureMetadataUtils.get_top_level_metadata("nonexistent", metadata)
>>> type_meta
{}
Notes
Returns metadata from FIRST matching entry only
Type-level metadata shared across all features of that type
Used for visualization settings, unit labels, display options
Complements feature-specific metadata in individual entries
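The first-match search and its three empty-dict cases can be sketched as follows. The type_metadata key is named in the return description above, but the "type" key and the dict-per-entry layout are assumptions made for this illustration, and sketch_top_level_metadata is a hypothetical name.

```python
from typing import Any, Dict, List, Optional


def sketch_top_level_metadata(
    feature_type: str,
    feature_metadata: Optional[List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Return type_metadata of the first entry whose type matches, else {}."""
    if feature_metadata is None:
        return {}
    for entry in feature_metadata:
        if entry.get("type") == feature_type:
            # Only the first match is consulted; a missing or empty
            # type_metadata also yields {}.
            return entry.get("type_metadata") or {}
    return {}


toy_metadata = [
    {"type": "distances", "type_metadata": {"unit_label": "Å"}},
    {"type": "distances", "type_metadata": {"unit_label": "nm"}},
    {"type": "torsions"},
]
```

Because callers always receive a dict, they can chain .get("unit_label") without first checking whether the type exists.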