Decomposition Types Base


Abstract base class defining the interface for all decomposition types.

Defines the interface that all decomposition types (PCA, KernelPCA, etc.) must implement for consistency across different dimensionality reduction methods.

class mdxplain.decomposition.decomposition_type.interfaces.decomposition_type_base.DecompositionTypeBase

Abstract base class for all decomposition types.

Defines the interface that all decomposition types (PCA, KernelPCA, etc.) must implement. Each decomposition type encapsulates computation logic for a specific type of dimensionality reduction analysis.

Examples

>>> class MyDecomposition(DecompositionTypeBase):
...     @classmethod
...     def get_type_name(cls) -> str:
...         return 'my_decomposition'
...     def init_calculator(self, **kwargs):
...         self.calculator = MyCalculator(**kwargs)  # MyCalculator: user-defined placeholder
...     def compute(self, data, **kwargs):
...         return self.calculator.compute(data, **kwargs)
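A fuller, runnable sketch of the same pattern follows. It does not import mdxplain: `DecompositionTypeBaseSketch` is a hypothetical stand-in that only mirrors the signatures documented on this page, and `CenteringCalculator` and `Centering` are invented for illustration (mean-centering is used as a trivial stand-in "decomposition").

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple

import numpy as np


class DecompositionTypeBaseSketch(ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self) -> None:
        # Empty calculator, filled in later by init_calculator()
        self.calculator = None

    @classmethod
    @abstractmethod
    def get_type_name(cls) -> str: ...

    @abstractmethod
    def init_calculator(self, use_memmap: bool = False,
                        cache_path: str = "./cache",
                        chunk_size: int = 2000) -> None: ...

    @abstractmethod
    def compute(self, data: np.ndarray) -> Tuple[np.ndarray, Dict[str, Any]]: ...


class CenteringCalculator:
    """Hypothetical calculator: subtracts the per-feature mean."""

    def compute(self, data: np.ndarray) -> Tuple[np.ndarray, Dict[str, Any]]:
        mean = data.mean(axis=0)
        return data - mean, {"mean": mean}


class Centering(DecompositionTypeBaseSketch):
    @classmethod
    def get_type_name(cls) -> str:
        return "centering"

    def init_calculator(self, use_memmap: bool = False,
                        cache_path: str = "./cache",
                        chunk_size: int = 2000) -> None:
        # The memmap options are accepted but unused in this toy example
        self.calculator = CenteringCalculator()

    def compute(self, data: np.ndarray) -> Tuple[np.ndarray, Dict[str, Any]]:
        return self.calculator.compute(data)


decomp = Centering()
decomp.init_calculator()
transformed, metadata = decomp.compute(np.arange(6, dtype=float).reshape(3, 2))
```

Because all three methods are abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement every method can be created.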
__init__() -> None

Initialize the decomposition type.

Sets up the decomposition type instance with an empty calculator that will be initialized later through init_calculator().

Parameters

None

Returns

None

Examples

>>> # Create decomposition type instance
>>> decomp = MyDecomposition()
>>> print(f"Type: {decomp.get_type_name()}")
abstractmethod classmethod get_type_name() -> str

Return unique string identifier for this decomposition type.

Used as the key for storing decomposition results in TrajectoryData dictionaries and for type identification.

Parameters

cls : type

The decomposition type class

Returns

str

Unique string identifier (e.g., ‘pca’, ‘kernel_pca’)

Examples

>>> print(PCA.get_type_name())
pca
>>> print(KernelPCA.get_type_name())
kernel_pca
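Since the type name serves as a dictionary key, two types returning the same name would overwrite each other's results. The sketch below illustrates that point with a plain dictionary. `PCAType`, `KernelPCAType`, and `register` are hypothetical names, and the dictionary is not the actual TrajectoryData layout.

```python
from typing import Dict


class PCAType:
    """Hypothetical stand-in for a decomposition type."""

    @classmethod
    def get_type_name(cls) -> str:
        return "pca"


class KernelPCAType:
    """Hypothetical stand-in for a second decomposition type."""

    @classmethod
    def get_type_name(cls) -> str:
        return "kernel_pca"


def register(registry: Dict[str, type], decomposition_type: type) -> None:
    """Store a type under its name, refusing to overwrite an existing key."""
    name = decomposition_type.get_type_name()
    if name in registry:
        raise ValueError(f"Duplicate decomposition type name: {name!r}")
    registry[name] = decomposition_type


registry: Dict[str, type] = {}
register(registry, PCAType)
register(registry, KernelPCAType)
sorted_keys = sorted(registry)

# Registering the same name twice is rejected
try:
    register(registry, PCAType)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```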
abstractmethod init_calculator(use_memmap: bool = False, cache_path: str = './cache', chunk_size: int = 2000) -> None

Initialize the calculator instance for this decomposition type.

Parameters

use_memmap : bool, default=False

Whether to use memory mapping for efficient handling of large datasets

cache_path : str, default='./cache'

Directory path for cache files

chunk_size : int, default=2000

Number of samples to process per chunk for incremental computation

Returns

None

Sets self.calculator to an initialized calculator instance.

Examples

>>> # Basic initialization
>>> pca = PCA()
>>> pca.init_calculator()
>>> # With memory mapping for large datasets
>>> pca.init_calculator(
...     use_memmap=True,
...     cache_path='./cache/',
...     chunk_size=1000
... )
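To give a sense of what `use_memmap`, `cache_path`, and `chunk_size` are for, the sketch below processes a disk-backed array in fixed-size chunks using `numpy.memmap`. It is not taken from mdxplain: `chunked_feature_mean` and the file name `data.dat` are hypothetical, and the library's calculators may organize their caching differently.

```python
import os
import tempfile

import numpy as np


def chunked_feature_mean(data: np.ndarray, chunk_size: int = 2000) -> np.ndarray:
    """Accumulate per-feature sums chunk by chunk, then divide by n_samples."""
    total = np.zeros(data.shape[1], dtype=np.float64)
    for start in range(0, data.shape[0], chunk_size):
        # Only this slice of the memory-mapped file is read into RAM
        total += data[start:start + chunk_size].sum(axis=0)
    return total / data.shape[0]


with tempfile.TemporaryDirectory() as cache_path:
    path = os.path.join(cache_path, "data.dat")  # hypothetical cache file
    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(5000, 4))
    mm[:] = np.arange(5000 * 4, dtype=np.float64).reshape(5000, 4)
    mm.flush()
    mean = chunked_feature_mean(mm, chunk_size=1000)
    del mm  # release the file handle before the directory is removed
```

A smaller `chunk_size` lowers peak memory use at the cost of more read operations.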
abstractmethod compute(data: ndarray) -> Tuple[ndarray, Dict[str, Any]]

Compute decomposition using the initialized calculator.

Parameters

data : numpy.ndarray

Input data matrix to decompose, shape (n_samples, n_features)

Returns

Tuple[numpy.ndarray, Dict[str, Any]]

Tuple containing:

  • transformed_data: Decomposed data matrix, shape (n_samples, n_components)

  • metadata: Dictionary with transformation information including hyperparameters, explained variance, components, etc.

Examples

>>> # Compute PCA decomposition
>>> pca = PCA()
>>> pca.init_calculator()
>>> import numpy as np
>>> data = np.random.rand(100, 50)
>>> transformed, metadata = pca.compute(data, n_components=10)
>>> print(f"Transformed shape: {transformed.shape}")

Raises

ValueError

If the calculator is not initialized or the input data is invalid
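The two failure modes can be sketched as a small pre-flight check. `validate_compute_inputs` is a hypothetical helper, and the exact conditions mdxplain treats as "invalid" input are not listed here, so the shape checks below are assumptions.

```python
import numpy as np


def validate_compute_inputs(calculator, data) -> None:
    """Hypothetical checks a compute() implementation might run first."""
    if calculator is None:
        raise ValueError("Calculator not initialized; call init_calculator() first.")
    if not isinstance(data, np.ndarray) or data.ndim != 2:
        raise ValueError("data must be a 2D numpy array of shape (n_samples, n_features).")
    if data.shape[0] == 0 or data.shape[1] == 0:
        raise ValueError("data must contain at least one sample and one feature.")


def raises_value_error(calculator, data) -> bool:
    """Return True if validation rejects the given calculator/data pair."""
    try:
        validate_compute_inputs(calculator, data)
    except ValueError:
        return True
    return False


# object() stands in for any initialized calculator
not_initialized = raises_value_error(None, np.zeros((10, 3)))
wrong_shape = raises_value_error(object(), np.zeros(10))
valid_rejected = raises_value_error(object(), np.zeros((10, 3)))
```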

get_required_feature_type() -> str | None

Return required feature type for this decomposition method.

Some decomposition methods require specific feature types (e.g., DiffusionMaps requires ‘coordinates’). If a specific feature type is required, the DecompositionManager will validate that the FeatureSelector contains only features of this type.

Parameters

None

Returns

Optional[str]

Required feature type name, or None if no specific type required

Examples

>>> # Most decompositions work with any feature type
>>> pca = PCA()
>>> print(pca.get_required_feature_type())
None
>>> # DiffusionMaps requires coordinate features
>>> diffmaps = DiffusionMaps()
>>> print(diffmaps.get_required_feature_type())
coordinates
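The validation described above might look roughly like the following. This is a sketch rather than the DecompositionManager's actual code: `check_feature_types` is a hypothetical function, and the selected feature types are represented as plain strings instead of a FeatureSelector.

```python
from typing import Iterable, Optional


def check_feature_types(required: Optional[str], selected: Iterable[str]) -> None:
    """Raise ValueError if any selected feature type differs from the required one."""
    if required is None:
        return  # no restriction: any mix of feature types is accepted
    mismatched = sorted({t for t in selected if t != required})
    if mismatched:
        raise ValueError(
            f"Decomposition requires feature type {required!r}, got {mismatched}"
        )


# No required type: mixed feature types pass
check_feature_types(None, ["distances", "coordinates"])
# Required type matches every selected feature: passes
check_feature_types("coordinates", ["coordinates"])

# A mismatching feature type is rejected
try:
    check_feature_types("coordinates", ["coordinates", "distances"])
    message = ""
except ValueError as exc:
    message = str(exc)
```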