Decomposition Manager


DecompositionManager for managing decomposition data objects.

Manager for creating and managing decomposition results from feature matrices. Used to add, reset, and manage decomposition data in trajectory data objects.

class mdxplain.decomposition.manager.decomposition_manager.DecompositionManager(use_memmap: bool = False, chunk_size: int = 2000, cache_dir: str = './cache')

Manager for decomposition data objects.

Manages the creation and storage of decomposition results from feature matrices. Works with PipelineData objects to perform dimensionality reduction using various decomposition methods (PCA, KernelPCA, etc.).

Examples

>>> # Create manager and add PCA decomposition
>>> from mdxplain.decomposition import decomposition_type
>>> from mdxplain.decomposition.manager.decomposition_manager import DecompositionManager
>>> manager = DecompositionManager()
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection",
...     decomposition_type.PCA(n_components=10)
... )
>>> # Manager with memory mapping for large datasets
>>> manager = DecompositionManager(use_memmap=True, chunk_size=1000)
>>> manager.add_decomposition(
...     pipeline_data, "contact_selection",
...     decomposition_type.KernelPCA(n_components=20, kernel='rbf')
... )
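
To make concrete what a decomposition of a feature matrix produces, here is a minimal NumPy sketch of PCA via SVD. It assumes a matrix of shape (n_frames, n_features) filled with synthetic data; `pca_project` is a hypothetical helper written for this illustration, not part of mdxplain, and the library's own PCA type may work differently.

```python
import numpy as np


def pca_project(features: np.ndarray, n_components: int):
    """Project a (n_frames, n_features) matrix onto its top principal components.

    Hypothetical helper for illustration; not mdxplain's implementation.
    """
    # Center each feature column so the components describe variance, not the mean.
    centered = features - features.mean(axis=0)
    # Thin SVD: rows of vt are the principal axes, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of the total variance captured by each kept component.
    variance = singular_values**2
    explained_ratio = variance[:n_components] / variance.sum()
    return centered @ components.T, explained_ratio


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))  # 200 frames, 6 features (synthetic data)
reduced, ratio = pca_project(features, n_components=2)
print(reduced.shape)  # (200, 2)
```

The reduced matrix keeps one row per frame, which is why a decomposition can be stored alongside the selection it was computed from.
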
__init__(use_memmap: bool = False, chunk_size: int = 2000, cache_dir: str = './cache') -> None

Initialize decomposition manager.

Parameters

use_memmap : bool, default=False

Whether to use memory mapping for decomposition data

chunk_size : int, default=2000

Chunk size used for incremental computation

cache_dir : str, default='./cache'

Directory in which cached decomposition data is stored

Returns

None

Initializes DecompositionManager instance with specified configuration

Examples

>>> # Basic manager
>>> manager = DecompositionManager()
>>> # Manager with memory mapping
>>> manager = DecompositionManager(
...     use_memmap=True,
...     chunk_size=1000,
...     cache_dir="./cache/decomposition"
... )
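
The interplay of use_memmap, chunk_size and cache_dir can be pictured with plain NumPy. This is a sketch of the general technique (a disk-backed array processed in fixed-size row chunks), not the manager's internal code; the file name, the temporary directory standing in for cache_dir, and the column-mean computation are all assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np


def chunked_column_mean(data: np.ndarray, chunk_size: int) -> np.ndarray:
    """Compute per-column means while reading only chunk_size rows at a time."""
    total = np.zeros(data.shape[1], dtype=np.float64)
    for start in range(0, data.shape[0], chunk_size):
        # Slicing a memmap reads just this block of rows from disk.
        total += data[start:start + chunk_size].sum(axis=0)
    return total / data.shape[0]


# A temporary directory stands in for cache_dir (illustrative only).
with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, "features.dat")  # hypothetical file name
    data = np.memmap(path, dtype=np.float64, mode="w+", shape=(5000, 4))
    data[:] = np.arange(5000 * 4, dtype=np.float64).reshape(5000, 4)
    data.flush()

    means = chunked_column_mean(data, chunk_size=2000)
    del data  # release the mapping before the directory is removed

print(means)
```

A smaller chunk size lowers peak memory use at the cost of more read operations; the result is the same either way.
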
add_decomposition(pipeline_data: PipelineData, selection_name: str, decomposition_type: DecompositionTypeBase, decomposition_name: str | None = None, data_selector_name: str | None = None, force: bool = False) -> None

Add and compute a decomposition for selected feature data.

This method creates a DecompositionData instance for the specified decomposition type, retrieves the selected feature matrix, performs the decomposition computation, and stores the result in pipeline_data.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.decomposition.add("selection", decomposition_type.PCA())  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.add_decomposition(pipeline_data, "selection", decomposition_type.PCA())  # pipeline_data required
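
The difference between the two modes comes down to who supplies the first argument. The sketch below shows one way such injection can work; `InjectingProxy`, `ToyManager` and the dict standing in for pipeline data are hypothetical names for illustration, and mdxplain's own proxy may be implemented differently.

```python
import functools


class ToyManager:
    """Stand-in for a manager whose methods take pipeline_data first."""

    def add_decomposition(self, pipeline_data, selection_name, decomposition_type):
        pipeline_data[selection_name] = decomposition_type


class InjectingProxy:
    """Wrap a manager and pass a fixed pipeline_data to every method call."""

    def __init__(self, manager, pipeline_data):
        self._manager = manager
        self._pipeline_data = pipeline_data

    def __getattr__(self, name):
        attribute = getattr(self._manager, name)
        if not callable(attribute):
            return attribute
        # Bind pipeline_data as the first positional argument.
        return functools.partial(attribute, self._pipeline_data)


pipeline_data = {}  # plain dict standing in for a PipelineData object
manager = ToyManager()

# Standalone mode: the caller passes pipeline_data explicitly.
manager.add_decomposition(pipeline_data, "selection_a", "PCA")

# Pipeline mode: the proxy injects it, so the caller leaves it out.
proxied = InjectingProxy(manager, pipeline_data)
proxied.add_decomposition("selection_b", "PCA")

print(sorted(pipeline_data))  # ['selection_a', 'selection_b']
```

In this sketch, passing pipeline_data again through the proxy shifts every argument by one position and raises a TypeError, which is the kind of mistake the warning is meant to prevent.
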

Parameters

pipeline_data : PipelineData

Pipeline data container holding the feature selections

selection_name : str

Name of the feature selection to decompose

decomposition_type : DecompositionTypeBase instance

Decomposition type instance with parameters (e.g., PCA(n_components=10))

decomposition_name : str, optional

Name under which the decomposition is stored. If None (default), the name is "selection_name_{str(decomposition_type)}".

data_selector_name : str, optional

Name of the DataSelector used to filter frames before decomposition. If None, all frames from the selection are used.

force : bool, default=False

Whether to force recomputation if the decomposition already exists

Returns

None

Adds the computed decomposition to pipeline_data

Raises

ValueError

If the decomposition already exists and force is False, if the required selection is missing, or if the decomposition computation fails

Examples

>>> # Add PCA decomposition
>>> from mdxplain.decomposition import decomposition_type
>>> from mdxplain.decomposition.manager.decomposition_manager import DecompositionManager
>>> manager = DecompositionManager()
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection", decomposition_type.PCA(n_components=10)
... )
>>> # Add KernelPCA with custom parameters
>>> manager.add_decomposition(
...     pipeline_data, "any_selection", decomposition_type.KernelPCA(n_components=15, gamma=0.1)
... )
>>> # Add ContactKernelPCA for contact features
>>> manager.add_decomposition(
...     pipeline_data, "contact_selection", decomposition_type.ContactKernelPCA(n_components=20)
... )
>>> # Force recomputation of existing decomposition
>>> manager.add_decomposition(
...     pipeline_data, "feature_selection", decomposition_type.PCA(n_components=20), force=True
... )
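
The naming and force rules of add_decomposition can be sketched with a plain dictionary. Everything here is hypothetical: the `store_decomposition` helper, the dict used as storage, and the assumed default-name pattern (selection name, underscore, type label). The sketch only mirrors the documented behaviour and is not the library's code.

```python
def store_decomposition(store, selection_name, type_label, result,
                        decomposition_name=None, force=False):
    """Store a result under a name, refusing to overwrite unless force is set."""
    if decomposition_name is None:
        # Assumed default pattern: "<selection_name>_<type label>".
        decomposition_name = f"{selection_name}_{type_label}"
    if decomposition_name in store and not force:
        raise ValueError(
            f"Decomposition '{decomposition_name}' already exists; "
            "pass force=True to recompute."
        )
    store[decomposition_name] = result
    return decomposition_name


store = {}
name = store_decomposition(store, "feature_selection", "pca", result=[1.0, 2.0])
print(name)  # feature_selection_pca

duplicate_rejected = False
try:
    store_decomposition(store, "feature_selection", "pca", result=[3.0])
except ValueError as error:
    duplicate_rejected = True
    print(error)

# force=True replaces the stored result instead of raising.
store_decomposition(store, "feature_selection", "pca", result=[3.0], force=True)
```
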
reset_decompositions(pipeline_data: PipelineData) -> None

Reset all computed decompositions and clear decomposition data.

This method removes all computed decompositions and their associated data, requiring decompositions to be recalculated from scratch.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.decomposition.reset_decompositions()  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.reset_decompositions(pipeline_data)  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container whose decomposition data is cleared

Returns

None

Clears all decomposition data from pipeline_data.decomposition_data

Examples

>>> manager = DecompositionManager()
>>> manager.reset_decompositions(pipeline_data)
save(pipeline_data: PipelineData, save_path: str) -> None

Save all decomposition data to a single file.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.decomposition.save('decomposition.npy')  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.save(pipeline_data, 'decomposition.npy')  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container with decomposition data

save_path : str

Path of the file in which all decomposition data is saved

Returns

None

Saves all decomposition data to the specified file

Examples

>>> manager.save(pipeline_data, 'decomposition.npy')
load(pipeline_data: PipelineData, load_path: str) -> None

Load all decomposition data from a single file.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.decomposition.load('decomposition.npy')  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.load(pipeline_data, 'decomposition.npy')  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container to load decomposition data into

load_path : str

Path to the saved decomposition data file

Returns

None

Loads all decomposition data from the specified file

Examples

>>> manager.load(pipeline_data, 'decomposition.npy')
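
For intuition about keeping several named decompositions in one file, the following sketch round-trips a dictionary of arrays through a single `.npy` file with NumPy. The on-disk format that save and load actually use is not described here, so treat the pickled-dictionary layout and the result names as assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

# Two named results standing in for stored decompositions (synthetic data).
decompositions = {
    "feature_selection_pca": np.arange(6, dtype=np.float64).reshape(3, 2),
    "contact_selection_kernel_pca": np.ones((3, 4)),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, "decomposition.npy")
    # A dict is stored as a pickled 0-d object array inside the .npy file.
    np.save(save_path, decompositions, allow_pickle=True)
    # Reading object arrays back requires allow_pickle=True; .item() unwraps the dict.
    restored = np.load(save_path, allow_pickle=True).item()

print(sorted(restored))
```

Because this layout relies on pickle, such files should only be loaded from trusted sources.
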
print_info(pipeline_data: PipelineData) -> None

Print decomposition data information.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.decomposition.print_info()  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = DecompositionManager()
>>> manager.print_info(pipeline_data)  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container with decomposition data

Returns

None

Prints decomposition data information to console

Examples

>>> manager.print_info(pipeline_data)
property add: DecompositionAddService

Service for adding decomposition algorithms with simplified syntax.

Provides an intuitive interface for adding decomposition algorithms without requiring explicit decomposition type instantiation or imports.

Returns

DecompositionAddService

Service instance for adding decomposition algorithms with combined parameters

Examples

>>> # Add different decomposition algorithms
>>> pipeline.decomposition.add.pca("my_features", n_components=10)
>>> pipeline.decomposition.add.kernel_pca("contact_features", kernel='rbf', n_components=20)
>>> pipeline.decomposition.add.contact_kernel_pca("contact_features", n_components=15)
>>> pipeline.decomposition.add.diffusion_maps("distance_features", n_components=12)

Notes

Pipeline data is automatically injected by AutoInjectProxy. All decomposition type parameters are combined with manager.add parameters.
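
The idea of combining decomposition type parameters with manager parameters in one call can be illustrated by splitting keyword arguments according to a constructor's signature. `ToyPCA` and `split_kwargs` are hypothetical names; this is a sketch of one possible approach, not a description of how DecompositionAddService is written.

```python
import inspect


class ToyPCA:
    """Stand-in for a decomposition type with its own parameters."""

    def __init__(self, n_components=2):
        self.n_components = n_components


def split_kwargs(type_class, kwargs):
    """Separate kwargs accepted by type_class.__init__ from all the others."""
    accepted = set(inspect.signature(type_class.__init__).parameters) - {"self"}
    type_kwargs = {k: v for k, v in kwargs.items() if k in accepted}
    manager_kwargs = {k: v for k, v in kwargs.items() if k not in accepted}
    return type_kwargs, manager_kwargs


# One flat call carries both kinds of parameters.
type_kwargs, manager_kwargs = split_kwargs(
    ToyPCA, {"n_components": 10, "decomposition_name": "my_pca", "force": True}
)
decomposition = ToyPCA(**type_kwargs)
print(type_kwargs)     # {'n_components': 10}
print(manager_kwargs)  # {'decomposition_name': 'my_pca', 'force': True}
```
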