Decomposition Type Helper
Helper classes for decomposition type calculations.
Automatic Parameter Helper
Automatic parameter calculation helper for decomposition methods.
- class mdxplain.decomposition.decomposition_type.helper.automatic_parameter_helper.AutomaticParameterHelper
Helper for automatic parameter calculation in decomposition methods.
Provides methods for automatic calculation of n_components via elbow detection, gamma parameters for kernel methods, and variance computation for large memory-mapped datasets.
Examples
>>> import numpy as np
>>> from mdxplain.decomposition.decomposition_type.helper.automatic_parameter_helper import AutomaticParameterHelper

>>> # Elbow detection for PCA
>>> helper = AutomaticParameterHelper()
>>> variance_ratios = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
>>> n = helper.find_elbow(variance_ratios, max_components=5)

>>> # Gamma calculation for KernelPCA
>>> data = np.random.rand(1000, 100)
>>> gamma = helper.calculate_gamma_scale(data)

>>> # Variance for large datasets
>>> large_data = np.memmap('data.dat', dtype='float64', shape=(10000, 500))
>>> var = helper.compute_variance_chunked(large_data, chunk_size=1000)
- static find_elbow(values: ndarray, sensitivity: float = 1.0, max_components: int = None, offset: int | float = 0) -> int
Find the elbow point of a decreasing curve using the Kneedle algorithm (kneed package).
Parameters
- values : numpy.ndarray
Decreasing values (variance ratios or eigenvalues)
- sensitivity : float, default=1.0
Kneed sensitivity parameter (S)
- max_components : int, optional
Maximum number of components, used when issuing a warning
- offset : int or float, default=0
Adjustment to the elbow position:
  - int: Direct addition/subtraction of components (e.g., -2 selects 2 fewer components, +3 selects 3 more)
  - float: Percentage-based adjustment (e.g., -0.5 selects 50% fewer, 0.5 selects 50% more)
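The two offset modes described above can be sketched as follows. This is only an illustration: `apply_offset` is a hypothetical name, and the rounding of fractional results and the clamping to the range 1..max_components are assumptions, not documented behaviour of `find_elbow`.

```python
def apply_offset(elbow: int, offset, max_components: int) -> int:
    """Hypothetical helper: shift a 1-indexed elbow by an int or float offset."""
    if isinstance(offset, float):
        # Assumption: a float offset is a relative change, so -0.5 keeps
        # half of the elbow value and 0.5 adds half of it on top.
        adjusted = round(elbow * (1.0 + offset))
    else:
        # An int offset adds or removes whole components.
        adjusted = elbow + offset
    # Assumption: the result is clamped to a valid component count.
    return max(1, min(adjusted, max_components))


print(apply_offset(4, -2, 10))    # 2 fewer components -> 2
print(apply_offset(4, -0.5, 10))  # 50% fewer components -> 2
print(apply_offset(4, 0.5, 10))   # 50% more components -> 6
```

Distinguishing the modes by type keeps a single parameter, but it also means `offset=1` (one more component) and `offset=1.0` (100% more) behave differently under this reading.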
Returns
- int
Optimal number of components (1-indexed)
Examples
>>> helper = AutomaticParameterHelper()
>>> variances = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
>>> n = helper.find_elbow(variances)
>>> print(f"Optimal: {n}")

>>> # Select 2 fewer components than elbow
>>> n = helper.find_elbow(variances, offset=-2)

>>> # Select 50% fewer components than elbow
>>> n = helper.find_elbow(variances, offset=-0.5)
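To give a feel for what "elbow" means on a decreasing curve, here is a NumPy-only sketch. It is not the Kneedle algorithm that `find_elbow` is documented to use. It swaps in a simpler heuristic (the point lying furthest below the chord between the first and last value), and the name `elbow_by_chord_distance` is hypothetical.

```python
import numpy as np


def elbow_by_chord_distance(values) -> int:
    """Return a 1-indexed elbow estimate for a monotonically decreasing curve.

    Stand-in heuristic, not Kneedle: picks the point with the largest gap
    below the straight line joining the first and last value.
    """
    y = np.asarray(values, dtype=float)
    n = y.size
    if n == 0:
        raise ValueError("values must not be empty")
    if n < 3:
        return n  # too few points to have an interior elbow
    span = y.max() - y.min()
    if span == 0:
        return 1  # flat curve: no elbow, keep a single component
    # Normalise both axes to [0, 1] so the chord runs from (0, 1) to (1, 0).
    x_norm = np.arange(n, dtype=float) / (n - 1)
    y_norm = (y - y.min()) / span
    # Vertical gap between the chord (y = 1 - x) and the curve.
    gap = 1.0 - x_norm - y_norm
    return int(np.argmax(gap)) + 1  # convert 0-based index to 1-indexed count


variances = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
print(elbow_by_chord_distance(variances))  # 3
```

The value printed here belongs to this heuristic only. Kneedle, with its sensitivity parameter, may select a different point on the same data.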
- static calculate_gamma_scale(data: ndarray, use_memmap: bool = False, chunk_size: int = 1000) -> float
Calculate gamma parameter using scale method.
Parameters
- data : numpy.ndarray
Input data matrix
- use_memmap : bool, default=False
Whether to use chunked variance calculation
- chunk_size : int, default=1000
Chunk size for memmap variance calculation
Returns
- float
Gamma parameter value
Examples
>>> helper = AutomaticParameterHelper()
>>> data = np.random.rand(1000, 50)
>>> gamma = helper.calculate_gamma_scale(data)
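The formula behind the "scale" method is not stated here. A minimal sketch, assuming it follows the convention scikit-learn documents for `gamma='scale'`, that is 1 / (n_features * variance of the data), looks like this. The function name `gamma_scale_sketch` and the zero-variance fallback are assumptions, and the helper may use the mean per-feature variance rather than the global variance.

```python
import numpy as np


def gamma_scale_sketch(data: np.ndarray) -> float:
    """Hypothetical 'scale' gamma: 1 / (n_features * variance of all entries)."""
    n_features = data.shape[1]
    variance = float(np.var(data))  # assumption: global variance over all entries
    if variance == 0.0:
        return 1.0  # assumption: fall back to 1.0 for constant data
    return 1.0 / (n_features * variance)


data = np.array([[0.0, 2.0], [2.0, 0.0]])
print(gamma_scale_sketch(data))  # variance is 1.0, two features -> 0.5
```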
- static calculate_gamma_auto(n_features: int) -> float
Calculate gamma parameter using auto method.
Parameters
- n_features : int
Number of features in the dataset
Returns
- float
Gamma parameter value
Examples
>>> helper = AutomaticParameterHelper()
>>> gamma = helper.calculate_gamma_auto(100)
>>> print(gamma)
0.01
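The example output above (0.01 for 100 features) is consistent with gamma = 1 / n_features. A sketch under that assumption, with a hypothetical function name and an input check that the real method may not perform:

```python
def gamma_auto_sketch(n_features: int) -> float:
    """Hypothetical 'auto' gamma: the reciprocal of the feature count."""
    if n_features <= 0:
        # Assumption: reject non-positive counts instead of dividing by zero.
        raise ValueError("n_features must be a positive integer")
    return 1.0 / n_features


print(gamma_auto_sketch(100))  # 0.01
```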
- static compute_variance_chunked(data: ndarray, chunk_size: int = 1000) -> float
Compute mean variance using chunk-wise processing.
Parameters
- data : numpy.ndarray
Input data matrix
- chunk_size : int, default=1000
Number of samples to process per chunk
Returns
- float
Mean variance across all features
Examples
>>> helper = AutomaticParameterHelper()
>>> data = np.random.rand(10000, 100)
>>> var = helper.compute_variance_chunked(data, chunk_size=1000)
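One way to compute a mean per-feature variance without loading the whole array is to merge per-chunk statistics. The sketch below is an assumption about how such a routine could work, not the helper's actual implementation. It uses the pairwise mean/M2 update (Chan et al.) and population variance (ddof=0), and the name `mean_variance_chunked` is hypothetical.

```python
import numpy as np


def mean_variance_chunked(data, chunk_size: int = 1000) -> float:
    """Hypothetical sketch: mean of per-feature variances, read chunk by chunk."""
    n_samples, n_features = data.shape
    if n_samples == 0:
        raise ValueError("data must contain at least one sample")
    n_seen = 0
    mean = np.zeros(n_features)  # running per-feature mean
    m2 = np.zeros(n_features)    # running sum of squared deviations
    for start in range(0, n_samples, chunk_size):
        # Only this slice is materialised, which keeps memmaps cheap to read.
        chunk = np.asarray(data[start:start + chunk_size], dtype=np.float64)
        n_chunk = chunk.shape[0]
        chunk_mean = chunk.mean(axis=0)
        chunk_m2 = ((chunk - chunk_mean) ** 2).sum(axis=0)
        # Merge the chunk statistics into the running statistics.
        delta = chunk_mean - mean
        n_new = n_seen + n_chunk
        mean = mean + delta * (n_chunk / n_new)
        m2 = m2 + chunk_m2 + delta ** 2 * (n_seen * n_chunk / n_new)
        n_seen = n_new
    # Population variance per feature, averaged over features.
    return float((m2 / n_seen).mean())


rng = np.random.default_rng(0)
data = rng.random((2500, 7))
print(np.isclose(mean_variance_chunked(data, chunk_size=1000),
                 data.var(axis=0).mean()))  # True
```

Merging means and squared deviations avoids the cancellation errors of the naive sum-of-squares formula, which matters when features have large means relative to their spread.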