Decomposition Manager
DecompositionManager for managing decomposition data objects.
Manager for creating and managing decomposition results from feature matrices. Used to add, reset, and manage decomposition data in trajectory data objects.
- class mdxplain.decomposition.manager.decomposition_manager.DecompositionManager(use_memmap: bool = False, chunk_size: int = 2000, cache_dir: str = './cache')
Manager for decomposition data objects.
Manages the creation and storage of decomposition results from feature matrices. Works with PipelineData objects to perform dimensionality reduction using various decomposition methods (PCA, KernelPCA, etc.).
Examples
>>> # Create manager and add PCA decomposition
>>> from mdxplain.decomposition import decomposition_type
>>> manager = DecompositionManager()
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection",
...     decomposition_type.PCA(n_components=10)
... )

>>> # Manager with memory mapping for large datasets
>>> manager = DecompositionManager(use_memmap=True, chunk_size=1000)
>>> manager.add_decomposition(
...     pipeline_data, "contact_selection",
...     decomposition_type.KernelPCA(n_components=20, kernel='rbf')
... )
- __init__(use_memmap: bool = False, chunk_size: int = 2000, cache_dir: str = './cache') -> None
Initialize decomposition manager.
Parameters
- use_memmap : bool, default=False
Whether to use memory mapping for decomposition data
- chunk_size : int, default=2000
Processing chunk size for incremental computation
- cache_dir : str, default='./cache'
Cache directory path for decomposition data
Returns
- None
Initializes DecompositionManager instance with specified configuration
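To illustrate what chunk_size controls, the sketch below splits a range of frames into consecutive row ranges, the usual shape of incremental processing over a large or memory-mapped matrix. This is only an assumption about the general pattern, not mdxplain's internal code, and iter_chunks is a hypothetical helper introduced for illustration.

```python
from typing import Iterator, Tuple


def iter_chunks(n_frames: int, chunk_size: int = 2000) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges that cover n_frames in order.

    Hypothetical helper: mdxplain's own chunking logic may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_frames, chunk_size):
        # The last chunk is shorter when n_frames is not a multiple of chunk_size.
        yield start, min(start + chunk_size, n_frames)


# 5000 frames with chunk_size=2000 give two full chunks and one partial chunk.
chunks = list(iter_chunks(5000, chunk_size=2000))
```

A smaller chunk_size lowers peak memory use at the cost of more iterations, which is presumably why the memory-mapping examples pair use_memmap=True with a reduced chunk size.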
Examples
>>> # Basic manager
>>> manager = DecompositionManager()

>>> # Manager with memory mapping
>>> manager = DecompositionManager(
...     use_memmap=True,
...     chunk_size=1000,
...     cache_dir="./cache/decomposition"
... )
- add_decomposition(pipeline_data: PipelineData, selection_name: str, decomposition_type: DecompositionTypeBase, decomposition_name: str | None = None, data_selector_name: str | None = None, force: bool = False) -> None
Add and compute a decomposition for selected feature data.
This method creates a DecompositionData instance for the specified decomposition type, retrieves the selected feature matrix, performs the decomposition computation, and stores the result in the PipelineData object.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.decomposition.add("selection", decomposition_type.PCA())  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.add_decomposition(pipeline_data, "selection", decomposition_type.PCA())  # pipeline_data required
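The difference between pipeline mode and standalone mode can be pictured with a small proxy that binds the data container as the first argument of every manager method. The sketch below is a toy model of that idea under stated assumptions: InjectingProxy and ToyManager are hypothetical names, a plain dict stands in for PipelineData, and none of this is mdxplain's actual injection code.

```python
from functools import partial


class InjectingProxy:
    """Hypothetical stand-in for an auto-injecting proxy (not mdxplain's code)."""

    def __init__(self, manager, pipeline_data):
        self._manager = manager
        self._pipeline_data = pipeline_data

    def __getattr__(self, name):
        # Only reached for attributes that are not found on the proxy itself.
        attr = getattr(self._manager, name)
        if callable(attr):
            # Bind pipeline_data as the first positional argument.
            return partial(attr, self._pipeline_data)
        return attr


class ToyManager:
    """Toy manager whose methods take pipeline_data first, as documented above."""

    def print_info(self, pipeline_data):
        return f"{len(pipeline_data)} decomposition(s)"


pipeline_data = {"pca": object()}  # stand-in for a PipelineData container
manager = ToyManager()

# Standalone mode: the caller passes pipeline_data explicitly.
standalone = manager.print_info(pipeline_data)

# Pipeline mode: the proxy supplies pipeline_data, so the caller omits it.
via_proxy = InjectingProxy(manager, pipeline_data).print_info()
```

In this toy model, passing pipeline_data again through the proxy hands the method one argument too many and raises TypeError, which mirrors the reason the warning asks callers to leave the parameter out in pipeline mode.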
Parameters
- pipeline_data : PipelineData
Pipeline data object containing feature selections
- selection_name : str
Name of the feature selection to decompose
- decomposition_type : DecompositionTypeBase instance
Decomposition type instance with parameters (e.g., PCA(n_components=10))
- decomposition_name : str, optional
Name under which to save the decomposition. If None (default), the name is "selection_name_{str(decomposition_type)}"
- data_selector_name : str, optional
Name of the DataSelector used to filter frames before decomposition. If None, all frames from the selection are used.
- force : bool, default=False
Whether to force recomputation if the decomposition already exists
Returns
- None
Adds computed decomposition to trajectory data
Raises
- ValueError
If the decomposition already exists and force is False, if the required selection is missing, or if the decomposition computation fails
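The interplay of decomposition_name, force, and the ValueError can be sketched with a plain dictionary as the result store. This is a minimal model under stated assumptions: add_result and store are hypothetical, the default name is read as the selection name joined to str(decomposition_type) by an underscore, and a plain string stands in for the decomposition type because its real string form is not given here.

```python
def add_result(store, selection_name, decomposition_type,
               decomposition_name=None, force=False):
    """Toy registry mirroring the documented naming and force rules.

    Hypothetical sketch: `store` is a plain dict, not mdxplain's PipelineData.
    """
    if decomposition_name is None:
        # Assumed default: selection name joined with str(decomposition_type).
        decomposition_name = f"{selection_name}_{decomposition_type}"
    if decomposition_name in store and not force:
        raise ValueError(
            f"Decomposition '{decomposition_name}' already exists; use force=True"
        )
    # A real implementation would compute and store the decomposition here.
    store[decomposition_name] = {
        "selection": selection_name,
        "type": str(decomposition_type),
    }
    return decomposition_name


store = {}
name = add_result(store, "feature_selection", "pca")

# A second call with the same name only succeeds when force=True.
try:
    add_result(store, "feature_selection", "pca")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

forced_name = add_result(store, "feature_selection", "pca", force=True)
```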
Examples
>>> # Add PCA decomposition
>>> from mdxplain.decomposition import decomposition_type
>>> manager = DecompositionManager()
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection",
...     decomposition_type.PCA(n_components=10)
... )

>>> # Add KernelPCA with custom parameters
>>> manager.add_decomposition(
...     pipeline_data, "any_selection",
...     decomposition_type.KernelPCA(n_components=15, gamma=0.1)
... )

>>> # Add ContactKernelPCA for contact features
>>> manager.add_decomposition(
...     pipeline_data, "contact_selection",
...     decomposition_type.ContactKernelPCA(n_components=20)
... )

>>> # Force recomputation of existing decomposition
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection",
...     decomposition_type.PCA(n_components=20), force=True
... )
- reset_decompositions(pipeline_data: PipelineData) -> None
Reset all computed decompositions and clear decomposition data.
This method removes all computed decompositions and their associated data, requiring decompositions to be recalculated from scratch.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.decomposition.reset_decompositions()  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.reset_decompositions(pipeline_data)  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data object whose decomposition data is cleared
Returns
- None
Clears all decomposition data from pipeline_data.decomposition_data
Examples
>>> manager = DecompositionManager()
>>> manager.reset_decompositions(pipeline_data)
- save(pipeline_data: PipelineData, save_path: str) -> None
Save all decomposition data to a single file.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.decomposition.save('decomposition.npy')  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.save(pipeline_data, 'decomposition.npy')  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container with decomposition data
- save_path : str
Path of the single file in which all decomposition data is saved
Returns
- None
Saves all decomposition data to the specified file
Examples
>>> manager.save(pipeline_data, 'decomposition.npy')
- load(pipeline_data: PipelineData, load_path: str) -> None
Load all decomposition data from a single file.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.decomposition.load('decomposition.npy')  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.load(pipeline_data, 'decomposition.npy')  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container to load decomposition data into
- load_path : str
Path to saved decomposition data file
Returns
- None
Loads all decomposition data from the specified file
Examples
>>> manager.load(pipeline_data, 'decomposition.npy')
- print_info(pipeline_data: PipelineData) -> None
Print decomposition data information.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.decomposition.print_info()  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.print_info(pipeline_data)  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container with decomposition data
Returns
- None
Prints decomposition data information to console
Examples
>>> manager.print_info(pipeline_data)
- property add: DecompositionAddService
Service for adding decomposition algorithms with simplified syntax.
Provides an intuitive interface for adding decomposition algorithms without requiring explicit decomposition type instantiation or imports.
Returns
- DecompositionAddService
Service instance for adding decomposition algorithms with combined parameters
Examples
>>> # Add different decomposition algorithms
>>> pipeline.decomposition.add.pca("my_features", n_components=10)
>>> pipeline.decomposition.add.kernel_pca("contact_features", kernel='rbf', n_components=20)
>>> pipeline.decomposition.add.contact_kernel_pca("contact_features", n_components=15)
>>> pipeline.decomposition.add.diffusion_maps("distance_features", n_components=12)
Notes
Pipeline data is automatically injected by AutoInjectProxy. All decomposition type parameters are combined with manager.add parameters.
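One way to read "combined parameters" is that a single call carries both the manager-level arguments (selection name, decomposition name, data selector, force) and the keyword arguments that configure the decomposition type. The sketch below models that split with toy classes. ToyPCA, ToyAddService, and record are hypothetical names, and the real DecompositionAddService may separate its parameters differently.

```python
class ToyPCA:
    """Stand-in for a decomposition type that only records its parameters."""

    def __init__(self, **params):
        self.params = params


class ToyAddService:
    """Hypothetical add-service: one call carries type and manager parameters."""

    def __init__(self, add_decomposition):
        # Callable with the manager-style signature minus pipeline_data,
        # which is assumed to have been injected already.
        self._add_decomposition = add_decomposition

    def pca(self, selection_name, decomposition_name=None,
            data_selector_name=None, force=False, **type_params):
        # Remaining keyword arguments configure the decomposition type itself.
        return self._add_decomposition(
            selection_name,
            ToyPCA(**type_params),
            decomposition_name=decomposition_name,
            data_selector_name=data_selector_name,
            force=force,
        )


calls = []


def record(selection_name, decomposition_type, **manager_params):
    """Record what the service forwards instead of computing anything."""
    calls.append((selection_name, decomposition_type.params, manager_params))


service = ToyAddService(record)
service.pca("my_features", n_components=10, force=True)
```

Here n_components ends up on the type instance while force is forwarded as a manager parameter, so the caller never has to import or instantiate the decomposition type directly.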