# mdxplain Orchestration

This overview lists the methods, algorithms, and backends used in mdxplain, organized by category.

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Trajectory | Trajectory handling | mdtraj + mdxplain | mdxplain: in memmap mode, uses DaskTrajectory, with Dask/Zarr managing data streaming and storage. mdtraj: reads topology and frames and performs geometric operations (superpose, smooth, align, distances) on each loaded chunk. In normal mode, mdtraj trajectories are used. |

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Nomenclature/Labeling | Consensus labeling (GPCR/CGN/KLIFS) | mdciao | Consensus labels come from mdciao; mdxplain maps them to features/selectors and integrates them into its internal metadata system. |
| Nomenclature/Labeling | Standard labeling and metadata generation | mdxplain | mdxplain implements its own metadata and AA labeling using the topology from mdtraj. |

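| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Feat. Sel. DSL | Feature Selector/Internal DSL | mdxplain + mdtraj | The selection DSL as a whole is implemented by mdxplain. For atom selection from the topology, the mdtraj selection DSL is used. |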

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Data Handling | Archiving and large data handling | Pickle | Archiving and saving are implemented by mdxplain using Pickle. |
| Data Handling | Memmap arrays and low-level arithmetic | NumPy | Intermediate data is saved as NumPy memmap arrays. Unless stated otherwise, NumPy is also used for basic arithmetic such as mean, median, and variance. |
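
The memmap and Pickle rows above can be illustrated with a small sketch. It is not mdxplain's code: the file names, array sizes, the `chunked_mean` helper, and the choice to pickle only a small metadata dictionary are all assumptions made for illustration. It shows a NumPy memmap written to disk, reduced chunk by chunk so that only one slice is in memory at a time, and a Pickle round trip.

```python
import os
import pickle
import tempfile

import numpy as np


def chunked_mean(path, shape, dtype=np.float32, chunk=1000):
    """Column-wise mean of a memmap array, reading one block of rows at a time."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    total = np.zeros(shape[1], dtype=np.float64)
    for start in range(0, shape[0], chunk):
        # Only this slice is pulled from disk into memory.
        total += data[start:start + chunk].sum(axis=0, dtype=np.float64)
    return total / shape[0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.dat")  # hypothetical file name
    shape = (5000, 8)  # (n_frames, n_features), illustrative sizes

    # Write intermediate data to disk instead of holding it in RAM.
    arr = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
    arr[:] = np.arange(shape[0] * shape[1], dtype=np.float32).reshape(shape)
    arr.flush()
    del arr  # close the writable mapping

    mean = chunked_mean(path, shape)

    # Pickle round trip for a small metadata dictionary (assumed layout).
    meta_path = os.path.join(tmp, "meta.pkl")
    with open(meta_path, "wb") as fh:
        pickle.dump({"shape": shape, "dtype": "float32"}, fh)
    with open(meta_path, "rb") as fh:
        meta = pickle.load(fh)
```

Accumulating in `float64` keeps the chunked sum accurate even though the stored data is `float32`.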

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Geometric Features | Coordinates | mdxplain | Feature type implemented entirely by mdxplain; the trajectory object provides coordinates and topology. |
| Geometric Features | Distances | mdtraj | mdxplain orchestrates and uses mdtraj.compute_contacts as the computation backend. |
| Geometric Features | Contacts | mdxplain | Feature type implemented entirely by mdxplain; uses the distance outputs (mdtraj kernel). |
| Geometric Features | Torsions | mdtraj | mdxplain orchestrates and uses mdtraj.compute_phi/psi/omega/chi as the backend. |
| Geometric Features | DSSP | mdtraj | mdxplain orchestrates and uses mdtraj.compute_dssp as the backend. |
| Geometric Features | SASA | mdtraj | mdxplain orchestrates and uses mdtraj.shrake_rupley as the backend. |
| Geometric Features | Reduction Logic | mdxplain + SciPy | mdxplain implements feature reduction and uses NumPy/SciPy for metrics where possible, extending them when needed. |
| Geometric Features | RMSD | mdxplain | The trajectory object provides topology and coordinates; mdxplain handles computation plus data/metadata, including window and reference metrics (frame-to-frame or to-reference) and variants such as RMSD with mean or median for flexible systems. |
| Geometric Features | RMSF | mdxplain + Numba-JIT | The trajectory object provides topology and coordinates; mdxplain handles computation and metadata, including window/reference metrics and RMSF variants (mean/median). It also performs residue-level aggregation (mean, median, RMS, RMdS) with Numba-JIT acceleration. |
| Geometric Features | MAD | mdxplain | mdxplain offers MAD values on a per-frame and a per-atom/per-residue basis. This is implemented by mdxplain directly inside the RMSD and RMSF calculators. |
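
To make the RMSF and MAD rows more concrete, here is a minimal NumPy sketch of the quantities they name. It is an illustration under assumptions, not mdxplain's implementation: the function names are hypothetical, the coordinates are assumed to be aligned already, no Numba acceleration is used, and "RMdS" is read here as root median square.

```python
import numpy as np


def rmsf(coords, center="mean"):
    """Per-atom fluctuation around a mean or median reference structure.

    coords has shape (n_frames, n_atoms, 3) and is assumed to be aligned.
    """
    if center == "mean":
        ref = np.mean(coords, axis=0)
    else:
        ref = np.median(coords, axis=0)
    sq_dev = np.sum((coords - ref) ** 2, axis=2)  # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))


def mad(values, axis=0):
    """Median absolute deviation from the median along one axis."""
    med = np.median(values, axis=axis, keepdims=True)
    return np.median(np.abs(values - med), axis=axis)


def aggregate_per_residue(per_atom, residue_ids, how="mean"):
    """Collapse per-atom values to one value per residue."""
    reducers = {
        "mean": np.mean,
        "median": np.median,
        "rms": lambda v: np.sqrt(np.mean(v ** 2)),
        "rmds": lambda v: np.sqrt(np.median(v ** 2)),  # assumed meaning of RMdS
    }
    reduce = reducers[how]
    return np.array([reduce(per_atom[residue_ids == r])
                     for r in np.unique(residue_ids)])


# Toy trajectory: atom 0 is static, atom 1 oscillates along x.
coords = np.zeros((3, 2, 3))
coords[:, 1, 0] = [-1.0, 0.0, 1.0]

per_atom = rmsf(coords, center="mean")
per_atom_median = rmsf(coords, center="median")
per_residue = aggregate_per_residue(per_atom, np.array([0, 0]), how="mean")
robust_spread = mad(np.array([1.0, 2.0, 3.0, 4.0, 100.0]))
```

The last line shows why a median-based measure suits flexible systems: the outlier `100.0` does not inflate the MAD.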

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Dim Reduction | PCA/IncrementalPCA | scikit-learn | PCA and incremental PCA wrap scikit-learn; mdxplain orchestrates. |
| Dim Reduction | Kernel PCA | scikit-learn + SciPy + mdxplain | Standard KPCA uses scikit-learn; iterative KPCA and metadata/data handling are implemented by mdxplain. Nystroem KPCA combines scikit-learn's RBF Nystroem with IncrementalPCA, with mdxplain adding component/epsilon approximations. |
| Dim Reduction | Diffusion Maps | mdxplain | Full algorithm, including iterative and Nystroem modes, implemented by mdxplain. |
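
Since the diffusion-map row names an algorithm without showing it, the following sketch outlines the textbook dense variant in NumPy. It is not mdxplain's code and does not cover the iterative or Nystroem modes; the function name `diffusion_map`, the Gaussian kernel form, and the toy data are assumptions.

```python
import numpy as np


def diffusion_map(X, epsilon, n_components=2):
    """Basic diffusion map: Gaussian kernel, Markov normalization, eigenvectors."""
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / epsilon)

    # P = D^-1 K is row-stochastic. Its symmetric conjugate
    # A = D^-1/2 K D^-1/2 has the same eigenvalues and is safer to diagonalize.
    d = K.sum(axis=1)
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Right eigenvectors of P; the first one is constant and is dropped.
    psi = vecs / np.sqrt(d)[:, None]
    keep = slice(1, n_components + 1)
    return vals[keep], psi[:, keep] * vals[keep]


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
eigenvalues, embedding = diffusion_map(X, epsilon=2.0, n_components=2)
```

With two well-separated groups of points, the first diffusion coordinate takes opposite signs on the two groups, which is the property that makes the embedding useful as input for clustering.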

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Clustering | DBSCAN | scikit-learn | mdxplain orchestrates DBSCAN: standard mode uses scikit-learn's DBSCAN, precomputed mode uses scikit-learn's NearestNeighbors, and kNN mode combines scikit-learn's DBSCAN on a subsample with scikit-learn's kNN for assignment. |
| Clustering | HDBSCAN | hdbscan + scikit-learn | mdxplain orchestrates HDBSCAN workflows: standard mode uses the hdbscan implementation, approximate-prediction mode uses random sampling plus hdbscan's own prediction, and kNN mode combines hdbscan on a subsample with scikit-learn's kNN for assignment. |
| Clustering | DPA | dpa + scikit-learn | mdxplain orchestrates DPA workflows: standard mode uses the dpa library's DPA implementation, and kNN mode runs DPA on a subsample and assigns the remaining points via scikit-learn's kNN. |
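
The kNN mode shared by all three clustering rows follows one pattern: cluster a subsample, then label every other point from its nearest sampled neighbors. The sketch below shows only that pattern. To stay dependency-free it swaps in a brute-force NumPy neighbor search for scikit-learn's kNN and a simple threshold for the actual clusterer; `knn_assign` and all parameters are hypothetical names, not mdxplain's API.

```python
import numpy as np


def knn_assign(sample_X, sample_labels, rest_X, k=5):
    """Give each remaining point the majority label of its k nearest sampled points."""
    # Brute-force squared distances, shape (n_rest, n_sample).
    d2 = ((rest_X[:, None, :] - sample_X[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    neighbor_labels = sample_labels[nearest]  # (n_rest, k)
    out = np.empty(len(rest_X), dtype=sample_labels.dtype)
    for i, row in enumerate(neighbor_labels):
        values, counts = np.unique(row, return_counts=True)
        out[i] = values[np.argmax(counts)]
    return out


rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.2, (200, 2)), rng.normal(5.0, 0.2, (200, 2))])

# Step 1: cluster only a random subsample. A real workflow would run DBSCAN,
# HDBSCAN or DPA here; a threshold stands in so the sketch needs only NumPy.
sample_idx = rng.choice(len(X), size=40, replace=False)
rest_idx = np.setdiff1d(np.arange(len(X)), sample_idx)
sample_labels = (X[sample_idx, 0] > 2.5).astype(int)

# Step 2: propagate the subsample labels to all remaining points.
labels = np.empty(len(X), dtype=int)
labels[sample_idx] = sample_labels
labels[rest_idx] = knn_assign(X[sample_idx], sample_labels, X[rest_idx], k=5)
```

The expensive clustering step then scales with the subsample size, while assignment costs one neighbor query per remaining point. How noise labels from the clusterer are treated during assignment is a design decision this sketch leaves open.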

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Feature Importance | Decision Tree | scikit-learn | mdxplain orchestrates decision-tree workflows: it uses scikit-learn's DecisionTreeClassifier, adds stratified sampling for large datasets, and provides comparison modes (one-vs-rest, multiclass, binary, pairwise). |
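
Two of the ideas in the decision-tree row, stratified sampling and comparison modes, come down to index and label bookkeeping before any classifier is trained. The sketch below shows one plausible reading in NumPy; the helper names and the exact semantics of each mode are assumptions, not mdxplain's API.

```python
import numpy as np


def stratified_indices(labels, n_samples, rng):
    """Draw about n_samples indices while keeping each class's share of the data."""
    labels = np.asarray(labels)
    picked = []
    for cls in np.unique(labels):
        members = np.flatnonzero(labels == cls)
        n_cls = max(1, round(n_samples * len(members) / len(labels)))
        n_cls = min(n_cls, len(members))
        picked.append(rng.choice(members, size=n_cls, replace=False))
    return np.sort(np.concatenate(picked))


def one_vs_rest(labels, target):
    """Binary target: 1 for frames in the target cluster, 0 for all others."""
    return (np.asarray(labels) == target).astype(int)


def pairwise_subset(labels, a, b):
    """Indices of frames in cluster a or b, for a two-cluster comparison."""
    return np.flatnonzero(np.isin(labels, [a, b]))


rng = np.random.default_rng(0)
cluster_labels = np.repeat([0, 1, 2], [600, 300, 100])

idx = stratified_indices(cluster_labels, n_samples=100, rng=rng)
y_ovr = one_vs_rest(cluster_labels[idx], target=2)
pair_idx = pairwise_subset(cluster_labels, 0, 2)
```

The resulting `idx`, `y_ovr`, or `pair_idx` would then select the rows and targets passed to a classifier such as scikit-learn's `DecisionTreeClassifier`.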

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Feature Statistics | Feature-type analysis | mdxplain | mdxplain implements feature-based analysis and uses NumPy/SciPy for metrics where possible, extending them when needed. |

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| Plots | DecisionTree-Plotter | mdxplain + matplotlib | Layout, styling, and export by mdxplain, using the tree structure of the scikit-learn model. The plotting library is matplotlib. |
| Plots | DensityPlotter | mdxplain + matplotlib + SciPy | mdxplain implementation using matplotlib. Uses SciPy KDEs to display smoothed histograms. |
| Plots | ViolinPlotter | mdxplain + matplotlib | mdxplain implementation using matplotlib. |
| Plots | TimeSeries-Plotter | mdxplain + matplotlib | mdxplain implementation using matplotlib. Time series can be smoothed using a Savitzky-Golay (savgol) filter from SciPy. |
| Plots | Landscape-Plotter | mdxplain + matplotlib + SciPy | mdxplain implementation using matplotlib. Uses NumPy for histograms and SciPy for optional KDE smoothing. |
| Plots | Cluster-Membership-Plotter | mdxplain + matplotlib | mdxplain implementation using matplotlib. |

| Category | Method/Algorithm | Backend | Explanation |
| --- | --- | --- | --- |
| 3D Viz | StructureViz Feature Service | mdxplain + mdtraj + PyMOL | mdxplain implements representative finding (centroid or decision-tree-based) and uses mdtraj to generate PDBs. mdxplain embeds feature-importance values as B-factors and generates a PyMOL script; PyMOL can optionally be used to run this script for visualization. |
| 3D Viz | NGLView | mdxplain + mdtraj + nglview | mdxplain implements representative finding (centroid or decision-tree-based) and uses mdtraj to generate PDBs. mdxplain embeds feature-importance values as B-factors and provides an extended nglview widget with checkboxes, selections, and legends for PyMOL-like 3D comparison in Jupyter. |
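
Embedding importance values as B-factors works because viewers such as PyMOL and nglview can color a structure by the temperature-factor field, which in the PDB format occupies columns 61 to 66 of each ATOM/HETATM record. The sketch below rewrites that field in plain PDB text using only the standard library. It is an illustration, not mdxplain's implementation (which generates its PDBs through mdtraj); `set_bfactors`, `atom_line`, and the per-residue mapping are hypothetical.

```python
def set_bfactors(pdb_text, residue_values, default=0.0):
    """Return PDB text with the B-factor field replaced per residue.

    residue_values maps a residue sequence number (PDB columns 23-26)
    to the value written into columns 61-66.
    """
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            res_seq = int(line[22:26])
            value = residue_values.get(res_seq, default)
            line = line.ljust(66)  # pad short records up to the B-factor field
            line = f"{line[:60]}{value:6.2f}{line[66:]}"
        out.append(line)
    return "\n".join(out) + "\n"


def atom_line(serial, name, res_name, res_seq, x, y, z):
    """Build a fixed-column ATOM record for chain A (demo data only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} A{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


original = "\n".join([
    atom_line(1, " CA ", "ALA", 1, 11.104, 6.134, -6.504),
    atom_line(2, " CA ", "GLY", 2, 12.560, 6.010, -6.120),
]) + "\n"

# Hypothetical importance score for residue 1; residue 2 falls back to the default.
colored = set_bfactors(original, {1: 0.75})
colored_lines = colored.splitlines()
```

Because the field is only six characters wide with two decimals, importance values are best scaled to a compact range before writing them.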