DSSP Data
DSSP feature type implementation for molecular dynamics analysis. It computes secondary structure assignments using the DSSP algorithm and supports multiple encoding formats.
- class mdxplain.feature.feature_type.dssp.dssp.DSSP(simplified: bool = False, encoding: str = 'integer')
DSSP feature type for computing secondary structure assignments.
Computes secondary structure using the DSSP (Dictionary of Secondary Structure in Proteins) algorithm. Supports both simplified classification (H/E/C for helix/sheet/coil) and full classification with all 8 DSSP classes. Multiple encoding formats are available for different analysis needs.
This is a base feature type with no dependencies that provides structural classification information for protein analysis.
Uses MDTraj for the DSSP calculations under the hood.
Examples
>>> # Simplified DSSP with one-hot encoding
>>> dssp = DSSP(simplified=True, encoding='onehot')
>>> pipeline.feature.add_feature(dssp)

>>> # Full DSSP classification with integer encoding
>>> dssp = DSSP(simplified=False, encoding='integer')
>>> pipeline.feature.add_feature(dssp)

>>> # Character encoding for visualization
>>> dssp = DSSP(simplified=True, encoding='char')
>>> pipeline.feature.add_feature(dssp)
- __init__(simplified: bool = False, encoding: str = 'integer') → None
Initialize DSSP feature type with classification and encoding parameters.
Parameters
- simplified : bool, default=False
Secondary structure classification level:
True: Simplified 3-class (H=helix, E=sheet, C=coil/other)
False: Full 8-class DSSP (H, B, E, G, I, T, S, C)
- encoding : str, default='integer'
Output encoding format:
'onehot': One-hot encoded binary vectors
'integer': Integer class indices (0, 1, 2, …) [default]
'char': Character codes ('H', 'E', 'C', etc.)
Returns
None
Examples
>>> # Default: full classification with integer encoding
>>> dssp = DSSP()

>>> # Simplified for basic analysis
>>> dssp = DSSP(simplified=True)

>>> # Integer encoding for machine learning
>>> dssp = DSSP(simplified=False, encoding='integer')

>>> # Character codes for visualization
>>> dssp = DSSP(simplified=True, encoding='char')
Raises
- ValueError
If encoding is not one of ‘onehot’, ‘integer’, or ‘char’
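How the two classification levels relate to the character and integer encodings can be sketched in plain NumPy. The reduction follows MDTraj's documented simplified scheme (H, G, I become helix; B, E become strand; the remaining codes become coil). The class orderings used for the integer indices, the use of 'C' as the full-scheme coil code, and the helper names `simplify` and `to_integer` are assumptions for illustration and may not match mdxplain's internals.

```python
import numpy as np

# Hypothetical class orderings; the index order mdxplain uses may differ.
FULL_CLASSES = ["H", "B", "E", "G", "I", "T", "S", "C"]
SIMPLE_CLASSES = ["H", "E", "C"]

# Reduction from 8 classes to 3, following MDTraj's simplified scheme.
FULL_TO_SIMPLE = {
    "H": "H", "G": "H", "I": "H",  # helices
    "B": "E", "E": "E",            # strands and bridges
    "T": "C", "S": "C", "C": "C",  # turns, bends, coil
}


def simplify(chars):
    """Map an array of full DSSP codes to simplified H/E/C codes."""
    chars = np.asarray(chars)
    flat = [FULL_TO_SIMPLE[c] for c in chars.ravel()]
    return np.array(flat).reshape(chars.shape)


def to_integer(chars, classes):
    """Replace each character code by its index in `classes`."""
    chars = np.asarray(chars)
    lookup = {code: idx for idx, code in enumerate(classes)}
    flat = [lookup[c] for c in chars.ravel()]
    return np.array(flat, dtype=np.int64).reshape(chars.shape)


# Toy data: 2 frames x 4 residues of full DSSP codes.
full = np.array([["H", "G", "T", "E"], ["I", "B", "S", "C"]])
simple = simplify(full)
simple_int = to_integer(simple, SIMPLE_CLASSES)
full_int = to_integer(full, FULL_CLASSES)
```

Both encodings keep the `(n_frames, n_residues)` shape; only the alphabet and the dtype change.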
- init_calculator(use_memmap: bool = False, cache_path: str = './cache', chunk_size: int = 2000) → None
Initialize the DSSP calculator with specified configuration.
Parameters
- use_memmap : bool, default=False
Whether to use memory mapping for large datasets
- cache_path : str, default='./cache'
Directory path for storing cache files when using memory mapping
- chunk_size : int, default=2000
Number of frames to process per chunk for memory-efficient processing
Returns
None
Examples
>>> # Basic initialization
>>> dssp.init_calculator()

>>> # With memory mapping for large datasets
>>> dssp.init_calculator(use_memmap=True, cache_path='./cache/')

>>> # With custom chunk size
>>> dssp.init_calculator(chunk_size=1000)
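The general idea behind `use_memmap`, `cache_path`, and `chunk_size` can be sketched with `numpy.memmap`: results are written chunk by chunk into a disk-backed array, so only one chunk needs to sit in memory at a time. This is an illustration of the technique, not mdxplain's calculator; the function name `process_in_chunks`, the cache file name, and the placeholder per-chunk computation are hypothetical.

```python
import os
import tempfile

import numpy as np


def process_in_chunks(frames, cache_path, chunk_size=2000):
    """Write per-frame results into a disk-backed array, one chunk at a time."""
    os.makedirs(cache_path, exist_ok=True)
    out_file = os.path.join(cache_path, "dssp_sketch.dat")  # hypothetical file name
    out = np.memmap(out_file, dtype=np.int8, mode="w+", shape=frames.shape)
    for start in range(0, frames.shape[0], chunk_size):
        stop = min(start + chunk_size, frames.shape[0])
        # Placeholder for the real per-chunk computation: here we just copy.
        out[start:stop] = frames[start:stop]
    out.flush()
    return out


# Toy input: 5 frames x 4 residues of integer class labels.
frames = np.arange(20, dtype=np.int8).reshape(5, 4) % 3
with tempfile.TemporaryDirectory() as tmp:
    result = process_in_chunks(frames, tmp, chunk_size=2)
    same = bool(np.array_equal(result, frames))
    del result  # release the memory map before the directory is removed
```

A smaller chunk size lowers peak memory use at the cost of more loop iterations, which is the trade-off the `chunk_size` parameter exposes.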
- compute(input_data: Trajectory, feature_metadata: Dict[str, Any]) → Tuple[ndarray, Dict[str, Any]]
Compute DSSP secondary structure assignments from trajectory.
Parameters
- input_data : mdtraj.Trajectory
MD trajectory to compute DSSP from
- feature_metadata : dict
Residue metadata (passed through for residue-level analysis)
Returns
- tuple[numpy.ndarray, dict]
Tuple containing (dssp_array, feature_metadata) where dssp_array format depends on encoding:
‘onehot’: (n_frames, n_residues * n_classes)
‘integer’: (n_frames, n_residues)
‘char’: (n_frames, n_residues) with string dtype
Examples
>>> # Compute simplified DSSP with one-hot encoding
>>> dssp = DSSP(simplified=True, encoding='onehot')
>>> dssp.init_calculator()
>>> data, metadata = dssp.compute(trajectory, res_metadata)
>>> print(f"DSSP array shape: {data.shape}")  # (n_frames, n_residues * 3)

>>> # Compute full DSSP with character encoding
>>> dssp = DSSP(simplified=False, encoding='char')
>>> dssp.init_calculator()
>>> data, metadata = dssp.compute(trajectory, res_metadata)
>>> print(f"Secondary structure codes: {data[0]}")  # ['H', 'H', 'E', 'C', ...]
Raises
- ValueError
If calculator is not initialized
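The one-hot output shape `(n_frames, n_residues * n_classes)` can be illustrated by expanding an integer-encoded array. The sketch assumes that each residue's class columns are stored contiguously (residue-major order); the column layout mdxplain actually uses is not stated here, and the helper name `one_hot_flat` is hypothetical.

```python
import numpy as np


def one_hot_flat(labels, n_classes):
    """Expand (n_frames, n_residues) integer labels to flattened one-hot rows."""
    labels = np.asarray(labels)
    n_frames, n_residues = labels.shape
    # Rows of the identity matrix are one-hot vectors; fancy indexing
    # yields an array of shape (n_frames, n_residues, n_classes).
    one_hot = np.eye(n_classes, dtype=np.int8)[labels]
    return one_hot.reshape(n_frames, n_residues * n_classes)


# Toy data: 2 frames x 3 residues, 3 classes (e.g. H=0, E=1, C=2).
labels = np.array([[0, 2, 1], [1, 1, 0]])
flat = one_hot_flat(labels, n_classes=3)

# Under the same layout assumption, the integer labels can be recovered.
recovered = flat.reshape(2, 3, 3).argmax(axis=-1)
```

Under this layout, columns `3 * i` to `3 * i + 2` belong to residue `i`, which is what makes the reshape back to three dimensions valid.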
- get_dependencies() → List[str]
Get list of feature type dependencies for DSSP calculations.
Parameters
None
Returns
- List[str]
Empty list as DSSP is a base feature with no dependencies
Examples
>>> dssp = DSSP()
>>> print(dssp.get_dependencies())
[]