Decomposition Types Base
Abstract base class defining the interface for all decomposition types.
Defines the interface that all decomposition types (PCA, KernelPCA, etc.) must implement for consistency across different dimensionality reduction methods.
- class mdxplain.decomposition.decomposition_type.interfaces.decomposition_type_base.DecompositionTypeBase
Abstract base class for all decomposition types.
Defines the interface that all decomposition types (PCA, KernelPCA, etc.) must implement. Each decomposition type encapsulates computation logic for a specific type of dimensionality reduction analysis.
Examples
>>> class MyDecomposition(DecompositionTypeBase):
...     @classmethod
...     def get_type_name(cls) -> str:
...         return 'my_decomposition'
...     def init_calculator(self, **kwargs):
...         self.calculator = MyCalculator(**kwargs)
...     def compute(self, data, **kwargs):
...         return self.calculator.compute(data, **kwargs)
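The interface described above can be approximated with Python's `abc` module. The following is a minimal sketch, not the mdxplain source: `DecompositionTypeSketch` and `ColumnCentering` are hypothetical names, the data type hint is simplified to `Any` to keep the sketch dependency-free, and the toy calculator only subtracts column means.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Tuple


class DecompositionTypeSketch(ABC):
    """Hypothetical stand-in mirroring the documented interface."""

    def __init__(self) -> None:
        # The calculator stays empty until init_calculator() is called.
        self.calculator = None

    @classmethod
    @abstractmethod
    def get_type_name(cls) -> str:
        """Return a unique string identifier."""

    @abstractmethod
    def init_calculator(self, use_memmap: bool = False,
                        cache_path: str = "./cache",
                        chunk_size: int = 2000) -> None:
        """Create the calculator and store it on self.calculator."""

    @abstractmethod
    def compute(self, data: Any) -> Tuple[Any, Dict[str, Any]]:
        """Return (transformed_data, metadata)."""

    def get_required_feature_type(self) -> Optional[str]:
        # Default assumed here: no specific feature type is required.
        return None


class ColumnCentering(DecompositionTypeSketch):
    """Toy subclass (assumption): 'decomposes' by centering columns."""

    @classmethod
    def get_type_name(cls) -> str:
        return "column_centering"

    def init_calculator(self, use_memmap=False, cache_path="./cache",
                        chunk_size=2000) -> None:
        def center(rows):
            # Subtract each column's mean from every row.
            n = len(rows)
            means = [sum(col) / n for col in zip(*rows)]
            return [[v - m for v, m in zip(row, means)] for row in rows]

        self.calculator = center

    def compute(self, data):
        if self.calculator is None:
            raise ValueError("Calculator not initialized; "
                             "call init_calculator() first.")
        return self.calculator(data), {"type": self.get_type_name()}


decomp = ColumnCentering()
decomp.init_calculator()
centered, metadata = decomp.compute([[1.0, 2.0], [3.0, 4.0]])
```

Because the three core methods are marked abstract, Python refuses to instantiate the base class or any subclass that leaves one of them unimplemented, which is how an ABC enforces this kind of interface.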
- __init__() -> None
Initialize the decomposition type.
Sets up the decomposition type instance with an empty calculator that will be initialized later through init_calculator().
Parameters
None
Returns
None
Examples
>>> # Create decomposition type instance
>>> decomp = MyDecomposition()
>>> print(f"Type: {decomp.get_type_name()}")
- abstractmethod classmethod get_type_name() -> str
Return unique string identifier for this decomposition type.
Used as the key for storing decomposition results in TrajectoryData dictionaries and for type identification.
Parameters
- cls : type
The decomposition type class
Returns
- str
Unique string identifier (e.g., ‘pca’, ‘kernel_pca’)
Examples
>>> print(PCA.get_type_name())
pca
>>> print(KernelPCA.get_type_name())
kernel_pca
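To illustrate why the identifier must be unique, here is a small sketch of using it as a dictionary key. `FakePCA`, `FakeKernelPCA`, and `build_registry` are hypothetical stand-ins, and the duplicate check is an assumption for illustration; how mdxplain actually stores results in TrajectoryData is not shown in this section.

```python
from typing import Any, Dict


class FakePCA:
    """Hypothetical stand-in; only the type name matters here."""

    @classmethod
    def get_type_name(cls) -> str:
        return "pca"


class FakeKernelPCA:
    """Hypothetical stand-in; only the type name matters here."""

    @classmethod
    def get_type_name(cls) -> str:
        return "kernel_pca"


def build_registry(*types: type) -> Dict[str, type]:
    """Map each type name to its class, rejecting duplicate names."""
    registry: Dict[str, type] = {}
    for decomposition_type in types:
        name = decomposition_type.get_type_name()
        if name in registry:
            raise ValueError(f"Duplicate decomposition type name: {name!r}")
        registry[name] = decomposition_type
    return registry


registry = build_registry(FakePCA, FakeKernelPCA)

# Results keyed by type name; the value is a placeholder.
results: Dict[str, Any] = {}
results[FakePCA.get_type_name()] = "placeholder PCA result"
```

If two types returned the same name, one result would silently overwrite the other in such a dictionary, which is why the sketch rejects duplicates up front.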
- abstractmethod init_calculator(use_memmap: bool = False, cache_path: str = './cache', chunk_size: int = 2000) -> None
Initialize the calculator instance for this decomposition type.
Parameters
- use_memmap : bool, default=False
Whether to use memory mapping for efficient handling of large datasets
- cache_path : str, default='./cache'
Directory path for cache files
- chunk_size : int, default=2000
Number of samples to process per chunk for incremental computation
Returns
- None
Sets self.calculator to an initialized calculator instance
Examples
>>> # Basic initialization
>>> pca = PCA()
>>> pca.init_calculator()
>>> # With memory mapping for large datasets
>>> pca.init_calculator(
...     use_memmap=True,
...     cache_path='./cache/',
...     chunk_size=1000
... )
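The idea behind `use_memmap`, `cache_path`, and `chunk_size` can be sketched with `numpy.memmap`: the data lives in a file under the cache directory and is processed a few rows at a time. This is an illustration under stated assumptions, not mdxplain's calculator code; `chunked_column_mean` and the file name `data.dat` are hypothetical, and a temporary directory stands in for the cache path.

```python
import os
import tempfile

import numpy as np


def chunked_column_mean(data, chunk_size=2000):
    """Compute column means while reading only chunk_size rows at a time."""
    n_samples, n_features = data.shape
    total = np.zeros(n_features, dtype=np.float64)
    for start in range(0, n_samples, chunk_size):
        # Slicing a memmap loads only this block of rows from disk.
        chunk = np.asarray(data[start:start + chunk_size])
        total += chunk.sum(axis=0)
    return total / n_samples


with tempfile.TemporaryDirectory() as cache_path:
    # Hypothetical cache file backing the data matrix.
    path = os.path.join(cache_path, "data.dat")
    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(5000, 8))
    mm[:] = np.random.default_rng(0).random((5000, 8))
    mm.flush()

    chunked = chunked_column_mean(mm, chunk_size=1000)
    direct = np.asarray(mm).mean(axis=0)

    # Release the file handle before the directory is removed.
    del mm
```

A smaller `chunk_size` lowers peak memory use at the cost of more read operations; the result matches the in-memory computation up to floating-point rounding.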
- abstractmethod compute(data: ndarray) -> Tuple[ndarray, Dict[str, Any]]
Compute decomposition using the initialized calculator.
Parameters
- data : numpy.ndarray
Input data matrix to decompose, shape (n_samples, n_features)
Returns
- Tuple[numpy.ndarray, Dict]
Tuple containing:
- transformed_data: Decomposed data matrix, shape (n_samples, n_components)
- metadata: Dictionary with transformation information such as hyperparameters, explained variance, and components
Examples
>>> # Compute PCA decomposition
>>> import numpy as np
>>> pca = PCA()
>>> pca.init_calculator()
>>> data = np.random.rand(100, 50)
>>> transformed, metadata = pca.compute(data, n_components=10)
>>> print(f"Transformed shape: {transformed.shape}")
Raises
- ValueError
If calculator is not initialized or input data is invalid
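The contract above (a `(transformed_data, metadata)` tuple, and `ValueError` for an uninitialized calculator or invalid input) can be sketched with a small SVD-based PCA. `SvdPcaSketch` is a hypothetical class, the metadata keys are illustrative assumptions, and passing `n_components` to `compute` follows the usage example rather than the documented base signature; mdxplain's own PCA may be implemented differently.

```python
import numpy as np


class SvdPcaSketch:
    """Hypothetical PCA-like type built on numpy.linalg.svd."""

    def __init__(self):
        self.calculator = None

    def init_calculator(self):
        # The "calculator" is simply NumPy's SVD routine in this sketch.
        self.calculator = np.linalg.svd

    def compute(self, data, n_components=2):
        if self.calculator is None:
            raise ValueError("Calculator not initialized; "
                             "call init_calculator() first.")
        data = np.asarray(data, dtype=float)
        if data.ndim != 2:
            raise ValueError("data must have shape (n_samples, n_features)")
        if not 1 <= n_components <= min(data.shape):
            raise ValueError("n_components is out of range for this data")

        # Center the columns, then factor: centered = U @ diag(S) @ Vt.
        centered = data - data.mean(axis=0)
        u, s, vt = self.calculator(centered, full_matrices=False)

        # Projection onto the leading principal axes.
        transformed = u[:, :n_components] * s[:n_components]
        variance = s ** 2
        metadata = {
            "hyperparameters": {"n_components": n_components},
            "components": vt[:n_components],
            "explained_variance_ratio":
                (variance / variance.sum())[:n_components],
        }
        return transformed, metadata


pca = SvdPcaSketch()
pca.init_calculator()
data = np.random.default_rng(0).random((100, 50))
transformed, metadata = pca.compute(data, n_components=10)
print(f"Transformed shape: {transformed.shape}")
```

Checking the calculator before touching the data keeps the failure mode predictable: a caller who forgets `init_calculator()` gets a `ValueError` rather than an attribute error from deep inside the computation.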
- get_required_feature_type() -> str | None
Return required feature type for this decomposition method.
Some decomposition methods require specific feature types (e.g., DiffusionMaps requires ‘coordinates’). If a specific feature type is required, the DecompositionManager will validate that the FeatureSelector contains only features of this type.
Parameters
None
Returns
- Optional[str]
Required feature type name, or None if no specific type required
Examples
>>> # Most decompositions work with any feature type
>>> pca = PCA()
>>> print(pca.get_required_feature_type())
None
>>> # DiffusionMaps requires coordinate features
>>> diffmaps = DiffusionMaps()
>>> print(diffmaps.get_required_feature_type())
coordinates
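The validation step described for this method might look roughly like the sketch below. `validate_feature_types` is a hypothetical helper, and the feature type name `'distances'` is an assumed example; the actual checks performed by the DecompositionManager are not shown in this section.

```python
from typing import Iterable, Optional


def validate_feature_types(required: Optional[str],
                           selected: Iterable[str]) -> None:
    """Raise ValueError if any selected feature type differs from required.

    A required value of None means any feature type is accepted.
    """
    if required is None:
        return
    mismatched = sorted({t for t in selected if t != required})
    if mismatched:
        raise ValueError(
            f"Decomposition requires {required!r} features, "
            f"but the selection also contains: {mismatched}"
        )


# No requirement: any mix of feature types passes.
validate_feature_types(None, ["coordinates", "distances"])

# Requirement met: only coordinate features are selected.
validate_feature_types("coordinates", ["coordinates", "coordinates"])
```

Returning `None` from `get_required_feature_type()` lets most decomposition types skip the check entirely, so only methods with a real constraint need to override it.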