Residue Aggregation Helper


JIT-compiled residue aggregation function.

This module provides a residue aggregation routine compiled with Numba’s JIT compiler, intended for large datasets.

Dependencies:

numba: Required for JIT compilation. Install with: pip install numba

mdxplain.analysis.structure.helper.residue_aggregation_helper.aggregate_residues_jit(atom_values: ndarray, group_indices: ndarray, group_boundaries: ndarray, aggregator_code: int) → ndarray

JIT-compiled residue aggregation for maximum performance.

This function aggregates per-atom values to per-residue values using optimized Numba compilation. The input data structure is flattened to enable efficient JIT processing.

Parameters

atom_values : np.ndarray

Per-atom values to aggregate (e.g., RMSF values). Shape: (n_atoms,)

group_indices : np.ndarray

Flattened array containing all atom indices grouped by residue. Example: [0,1,2,3,4,5] for groups [[0,1,2], [3,4], [5]]

group_boundaries : np.ndarray

Start position of each residue group within group_indices, followed by one end marker, so the array has n_residues + 1 entries. Example: [0,3,5,6] means group 0 occupies positions 0-2, group 1 positions 3-4, and group 2 position 5.

aggregator_code : int

Aggregation method code:

0: mean (arithmetic mean)
1: median (50th percentile)
2: rms (root mean square)
3: rms_median (root median square)

Returns

np.ndarray: Aggregated values per residue. Shape: (n_residues,)
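
The four aggregator codes can be cross-checked against a slow reference written with the standard library only. This is a minimal sketch, not the library's implementation: the name aggregate_residues_reference is hypothetical, rms_median is assumed to mean the square root of the median of the squared values, and empty groups are not handled (the library's behavior for them is not documented here).

```python
import math
import statistics


def aggregate_residues_reference(atom_values, group_indices, group_boundaries, aggregator_code):
    """Plain-Python reference for the aggregator codes (hypothetical helper)."""
    result = []
    # Each consecutive pair of boundaries delimits one residue's slice of group_indices.
    for start, end in zip(group_boundaries[:-1], group_boundaries[1:]):
        values = [atom_values[i] for i in group_indices[start:end]]
        if aggregator_code == 0:    # mean
            result.append(statistics.fmean(values))
        elif aggregator_code == 1:  # median
            result.append(statistics.median(values))
        elif aggregator_code == 2:  # rms
            result.append(math.sqrt(statistics.fmean(v * v for v in values)))
        elif aggregator_code == 3:  # rms_median, assumed to be sqrt(median(x^2))
            result.append(math.sqrt(statistics.median(v * v for v in values)))
        else:
            raise ValueError(f"unknown aggregator_code: {aggregator_code}")
    return result


atom_values = [1.0, 2.0, 3.0, 4.0, 8.0, 2.0]
group_indices = [0, 1, 2, 3, 4, 5]   # from [[0, 1, 2], [3, 4], [5]]
group_boundaries = [0, 3, 5, 6]      # three start positions plus the end marker
print(aggregate_residues_reference(atom_values, group_indices, group_boundaries, 0))
# [2.0, 6.0, 2.0]
```

Comparing this output with aggregate_residues_jit on the same inputs is one way to confirm that the boundaries array has been built correctly.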

Examples

>>> import numpy as np
>>> from mdxplain.analysis.structure.helper.residue_aggregation_helper import aggregate_residues_jit
>>> atom_values = np.array([0.5, 0.3, 0.7, 0.4, 0.8, 0.2])
>>> group_indices = np.array([0, 1, 2, 3, 4, 5])  # From [[0,1,2], [3,4], [5]]
>>> group_boundaries = np.array([0, 3, 5, 6])  # Start positions plus end marker
>>> result = aggregate_residues_jit(atom_values, group_indices, group_boundaries, 0)
>>> print(result)  # [0.5, 0.6, 0.2] (means of [0.5,0.3,0.7], [0.4,0.8], [0.2])

Notes

This function is designed for internal use by the residue aggregation system. The flattened data structure lets Numba compile the aggregation loop, which gives a roughly 10x speedup over Python loops over nested lists.
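
Callers who start from nested per-residue lists need to convert them into the flattened layout first. The sketch below shows one way to do that with the standard library. The helper name flatten_groups is hypothetical and is not part of mdxplain.

```python
from itertools import accumulate


def flatten_groups(groups):
    """Flatten nested per-residue atom index lists (hypothetical helper)."""
    # Concatenate all atom indices in residue order.
    group_indices = [atom for group in groups for atom in group]
    # A running total of group sizes, with a leading 0, yields every
    # start position followed by the end marker.
    group_boundaries = [0] + list(accumulate(len(group) for group in groups))
    return group_indices, group_boundaries


group_indices, group_boundaries = flatten_groups([[0, 1, 2], [3, 4], [5]])
print(group_indices)     # [0, 1, 2, 3, 4, 5]
print(group_boundaries)  # [0, 3, 5, 6]
```

Both results are plain Python lists. Since the signature expects ndarray arguments, they would presumably be converted with np.asarray before being passed to aggregate_residues_jit.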