Centroid Helper
Centroid calculation helper for trajectory analysis.
This module provides utilities for computing centroids (mean frames) and finding frames closest to centroids. Used for representative frame selection in clustering and other analyses.
- class mdxplain.pipeline.helper.centroid_helper.CentroidHelper
Helper class for centroid calculations.
Provides a method to find the frame closest to the centroid (mean), with optional memory-efficient chunked processing for large datasets.
Examples
>>> # Find centroid frame with fast numpy
>>> best_idx = CentroidHelper.find_centroid(
...     selected_data, use_memmap=False
... )

>>> # Find centroid frame with memmap-safe chunked processing
>>> best_idx = CentroidHelper.find_centroid(
...     selected_data, use_memmap=True, chunk_size=1000
... )
- static find_centroid(selected_data: ndarray, use_memmap: bool = False, chunk_size: int = 1000) -> int
Find frame closest to centroid (mean).
Computes the centroid (mean) of all frames and finds the frame that minimizes Euclidean distance to it. Uses fast numpy operations for small datasets or chunked processing for large memmap datasets.
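The fast path described above can be sketched in plain NumPy. This is a minimal illustration, not the mdxplain implementation: the function name `closest_to_centroid` is hypothetical, and it assumes a non-empty 2D array that fits in memory.

```python
import numpy as np


def closest_to_centroid(data: np.ndarray) -> int:
    """Hypothetical sketch: index of the row nearest to the mean row."""
    # Centroid = mean over frames, one value per feature.
    centroid = data.mean(axis=0)
    # Euclidean distance of every frame to the centroid.
    distances = np.linalg.norm(data - centroid, axis=1)
    # argmin returns the first index on ties.
    return int(np.argmin(distances))


frames = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
idx = closest_to_centroid(frames)  # centroid is (5/3, 5/3), so idx == 1
```

The difference `data - centroid` allocates a temporary array of the same shape as the input, which is why this form is only suitable when the data fits comfortably in memory.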
Parameters
- selected_data : np.ndarray
Data array with shape (n_frames, n_features)
- use_memmap : bool, default=False
Whether to use chunked processing, intended for memory-mapped (memmap) arrays
- chunk_size : int, default=1000
Number of frames to process per chunk when use_memmap=True
Returns
- int
Local index (row position within selected_data) of the frame closest to the centroid
Examples
>>> # Fast mode for standard numpy arrays
>>> import numpy as np
>>> data = np.random.rand(1000, 100)
>>> idx = CentroidHelper.find_centroid(data, use_memmap=False)
>>> print(idx)  # Index between 0 and 999
>>> # Memmap mode for large datasets
>>> idx = CentroidHelper.find_centroid(
...     large_memmap_data, use_memmap=True, chunk_size=500
... )
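One plausible way to realise the chunked mode is a two-pass scan: the first pass accumulates the mean, the second tracks the smallest distance seen so far. The sketch below is an assumption about how such processing could work, not the library's actual code. The name `closest_to_centroid_chunked` is hypothetical, and the input is assumed to be a non-empty 2D array (or `np.memmap`) that supports row slicing.

```python
import numpy as np


def closest_to_centroid_chunked(data, chunk_size: int = 1000) -> int:
    """Hypothetical sketch: centroid-nearest row, reading chunk_size rows at a time."""
    n_frames, n_features = data.shape

    # Pass 1: accumulate per-feature sums chunk by chunk to get the mean.
    total = np.zeros(n_features, dtype=np.float64)
    for start in range(0, n_frames, chunk_size):
        total += data[start:start + chunk_size].sum(axis=0, dtype=np.float64)
    centroid = total / n_frames

    # Pass 2: keep only the best distance and its global index.
    best_idx, best_dist = -1, np.inf
    for start in range(0, n_frames, chunk_size):
        chunk = np.asarray(data[start:start + chunk_size], dtype=np.float64)
        dists = np.linalg.norm(chunk - centroid, axis=1)
        local = int(np.argmin(dists))
        if dists[local] < best_dist:  # strict '<' keeps the earliest frame on ties
            best_dist = float(dists[local])
            best_idx = start + local
    return best_idx


rng = np.random.default_rng(0)
demo = rng.random((250, 8))
chunked_idx = closest_to_centroid_chunked(demo, chunk_size=64)
```

With this layout, peak memory scales with `chunk_size * n_features` rather than with the full array, at the cost of reading the data twice.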