Structural Analysis
Structural metrics provide quantitative measures of protein dynamics and conformational changes. mdxplain offers multiple RMSD/RMSF variants optimized for different analysis scenarios.
RMSD Metrics - Which Variant to Use
.rmsd.mean - Standard RMSD (Root Mean Square Deviation)
What: Classic RMSD using arithmetic mean
When to use:
Standard conformational analysis and literature comparison
Systems without highly flexible regions
Fastest computation
Avoid when: System has highly flexible regions that would dominate the metric
.rmsd.median - Robust RMSD (Root Median Square Deviation)
What: Square root of median of squared deviations (instead of mean)
When to use:
Systems with occasional outlier atoms (flexible loops + rigid core)
Multi-domain proteins with independent movement
Combined folded/unstructured regions
Why: Robust against outlier atoms affecting the metric
.rmsd.mad - MAD RMSD (Median Absolute Deviation)
What: Most robust RMSD variant based on MAD
When to use:
Extremely flexible systems
Proteins with intrinsically disordered regions
Maximum outlier resistance needed
Why: Statistically most robust, least affected by rare large-amplitude moves
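The difference between the three variants can be illustrated on a handful of per-atom displacements. The following is a minimal sketch in plain Python, not mdxplain's implementation: the function names are hypothetical, and the MAD definition used here (median of absolute deviations from the median) is one common convention that is assumed, since the text above does not spell out the formula.

```python
import math
import statistics


def rmsd_mean(displacements):
    """Classic RMSD: square root of the mean of squared displacements."""
    return math.sqrt(statistics.fmean([d * d for d in displacements]))


def rmsd_median(displacements):
    """Robust variant: square root of the median of squared displacements."""
    return math.sqrt(statistics.median([d * d for d in displacements]))


def mad(displacements):
    """Assumed MAD definition: median absolute deviation from the median."""
    center = statistics.median(displacements)
    return statistics.median([abs(d - center) for d in displacements])


# Four rigid-core atoms plus one atom in a flexible loop (illustrative values).
core_with_loop = [0.5, 0.6, 0.4, 0.5, 6.0]

mean_value = rmsd_mean(core_with_loop)      # pulled up strongly by the loop atom
median_value = rmsd_median(core_with_loop)  # reflects the typical core atom
mad_value = mad(core_with_loop)             # spread of the typical atoms only
```

In this toy case the single loop atom dominates the mean-based value, while the median-based value stays at the level of the core atoms.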
RMSD Modes - What to Measure
.to_reference(reference_traj, reference_frame, atom_selection="all")
What: RMSD to a fixed reference frame
Why use it:
Track structural drift from starting structure
Measure equilibration (distance from crystal structure)
Monitor return to specific conformation
Stability relative to known structure (X-ray, cryo-EM)
Example:
rmsd.mean.to_reference(0, 0) → drift from initial frame
.frame_to_frame(lag=1, atom_selection="all")
What: RMSD between consecutive or lag-separated frames
Why use it:
Quantify local structural fluctuations
Identify smooth vs. jerky dynamics
Detect conformational transition events
lag=1: frame-to-frame noise, lag=10+: larger conformational shifts
Example:
rmsd.mean.frame_to_frame(lag=10) → local dynamics
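The two modes differ only in which frames are paired. The sketch below shows that pairing on a toy trajectory in plain Python. It is an illustration under assumptions, not mdxplain code: the helper names are hypothetical, no structural superposition is performed, and the frame-to-frame result is assumed to have `len(frames) - lag` entries.

```python
import math


def rmsd(frame_a, frame_b):
    """Mean-based RMSD between two frames given as lists of (x, y, z) tuples."""
    squared = [math.dist(a, b) ** 2 for a, b in zip(frame_a, frame_b)]
    return math.sqrt(sum(squared) / len(squared))


def rmsd_to_reference(frames, reference_frame=0):
    """Pair every frame with one fixed reference frame."""
    reference = frames[reference_frame]
    return [rmsd(frame, reference) for frame in frames]


def rmsd_frame_to_frame(frames, lag=1):
    """Pair every frame with the frame `lag` steps earlier."""
    return [rmsd(frames[i + lag], frames[i]) for i in range(len(frames) - lag)]


# Toy trajectory: two atoms drifting along x by 0.1 per frame.
frames = [[(0.1 * t, 0.0, 0.0), (1.0 + 0.1 * t, 0.0, 0.0)] for t in range(6)]

drift = rmsd_to_reference(frames)            # grows steadily with time
local = rmsd_frame_to_frame(frames, lag=2)   # stays constant for uniform drift
```

For a uniformly drifting structure the reference-based series keeps growing while the lagged series stays flat, which is why the two modes answer different questions.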
.window_frame_to_start()
What: Sliding window RMSD to window start
Why use it:
Assess equilibration within time windows
Local stability analysis
Detect gradual structural drift
.window_frame_to_frame()
What: Sliding window frame-to-frame RMSD
Why use it:
Track local fluctuations over time
Identify periods of high/low dynamics
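The text above does not give signatures for the window modes, so the following is only one plausible reading, sketched in plain Python. The parameter names `window_size` and `stride`, and the choice to average the RMSD values inside each window, are assumptions made for illustration.

```python
import math


def rmsd(frame_a, frame_b):
    """Mean-based RMSD between two frames given as lists of (x, y, z) tuples."""
    squared = [math.dist(a, b) ** 2 for a, b in zip(frame_a, frame_b)]
    return math.sqrt(sum(squared) / len(squared))


def window_frame_to_start(frames, window_size, stride):
    """Per window: average RMSD of each frame to the window's first frame."""
    result = []
    for start in range(0, len(frames) - window_size + 1, stride):
        window = frames[start:start + window_size]
        values = [rmsd(frame, window[0]) for frame in window[1:]]
        result.append(sum(values) / len(values))
    return result


def window_frame_to_frame(frames, window_size, stride):
    """Per window: average RMSD between consecutive frames in the window."""
    result = []
    for start in range(0, len(frames) - window_size + 1, stride):
        window = frames[start:start + window_size]
        values = [rmsd(b, a) for a, b in zip(window, window[1:])]
        result.append(sum(values) / len(values))
    return result


# One atom: stationary for four frames, then moving 0.5 per frame.
xs = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0]
frames = [[(x, 0.0, 0.0)] for x in xs]

to_start = window_frame_to_start(frames, window_size=4, stride=4)
step_size = window_frame_to_frame(frames, window_size=4, stride=4)
```

The first window is quiet in both measures; in the second window the to-start value reflects accumulated drift, while the frame-to-frame value reflects the step size.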
RMSF Metrics - Which Variant
.rmsf.mean - Standard RMSF (Root Mean Square Fluctuation)
What: Classic per-residue flexibility metric
When to use:
Standard flexibility analysis
Comparison with experimental B-factors
Identify rigid vs. flexible regions
.rmsf.median - Robust RMSF (Root Median Square Fluctuation)
What: Square root of median of squared fluctuations (instead of mean)
When to use: Systems with occasional large-amplitude rare events
Why: Robust against rare outlier movements
.rmsf.mad - MAD RMSF (Median Absolute Deviation)
What: Most robust fluctuation measure
When to use: Very flexible systems requiring maximum robustness
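The mean and median variants can be compared on a single atom that makes one rare excursion. This is a minimal plain-Python sketch with hypothetical function names, assuming fluctuations are measured about the time-averaged position. The MAD variant is left out because its exact formula is not given here.

```python
import math
import statistics


def squared_fluctuations(positions):
    """Squared distance of each frame's position from the time-averaged position."""
    n = len(positions)
    mean_position = tuple(sum(p[k] for p in positions) / n for k in range(3))
    return [math.dist(p, mean_position) ** 2 for p in positions]


def rmsf_mean(positions):
    """Classic RMSF: square root of the mean squared fluctuation."""
    return math.sqrt(statistics.fmean(squared_fluctuations(positions)))


def rmsf_median(positions):
    """Robust RMSF: square root of the median squared fluctuation."""
    return math.sqrt(statistics.median(squared_fluctuations(positions)))


# Nine frames close to x = 0 and one rare large-amplitude excursion.
xs = [0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 0.0, 0.1, -0.1, 5.0]
positions = [(x, 0.0, 0.0) for x in xs]

classic = rmsf_mean(positions)    # inflated by the single excursion
robust = rmsf_median(positions)   # much less affected by it
```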
RMSF Modes - Resolution Level
.per_atom
What: Atom-level fluctuations
Why use it:
Detailed side-chain dynamics
Specific functional atoms (active site, binding residues)
High-resolution flexibility mapping
Example: Side-chain rotamer dynamics
.per_residue
What: Residue-aggregated fluctuations
Why use it:
Protein-wide flexibility overview
Domain mobility comparison
Experimental B-factor comparison
Flexibility profiles
Example: Identify flexible loops vs. rigid helices
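For the B-factor comparison, RMSF values are commonly converted with the isotropic relation B = (8π²/3)·RMSF². The sketch below assumes RMSF in Å, which gives B in Å²; the function names are hypothetical, and the units returned by mdxplain should be checked before applying the conversion.

```python
import math


def rmsf_to_bfactor(rmsf_angstrom):
    """Isotropic B-factor (Å²) from RMSF (Å): B = 8 * pi^2 / 3 * RMSF^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


def bfactor_to_rmsf(bfactor):
    """Inverse conversion: RMSF (Å) from an isotropic B-factor (Å²)."""
    return math.sqrt(3.0 * bfactor / (8.0 * math.pi ** 2))


b_for_one_angstrom = rmsf_to_bfactor(1.0)
rmsf_back = bfactor_to_rmsf(b_for_one_angstrom)
```

Experimental B-factors also absorb static disorder and refinement effects, so agreement in the shape of the profile is usually a more meaningful check than agreement in absolute values.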
Per-Residue Aggregation Methods
When using .per_residue, atom-level RMSF values within each residue must be
aggregated to a single per-residue value. Four aggregation strategies are available:
.with_mean_aggregation (Default, Simple Average)
What: Arithmetic mean of atom-level RMSF values:
mean(RMSF_atoms)
When to use:
Standard flexibility profiles
All atoms in residue have similar flexibility
No extreme outlier atoms
Sensitivity: Affected by outlier atoms (very flexible side-chain tips)
Example: Backbone CA atoms or residues with uniform flexibility
.with_median_aggregation (Robust to Outliers)
What: Median of atom-level RMSF values:
median(RMSF_atoms)
When to use:
Long flexible side chains (e.g., LYS, ARG) with rigid backbone
Mixed flexibility within residue (some atoms rigid, others flexible)
Want typical flexibility, not influenced by extreme atoms
Benefit: Terminal side-chain atoms don’t dominate the residue score
Example: Surface residues with flexible tips but stable core
.with_rms_aggregation (Emphasize Larger Values)
What: Root-mean-square of RMSF values:
sqrt(mean(RMSF_atoms²))
When to use:
Want to emphasize larger fluctuations more strongly
Quadratic weighting is desired (larger RMSF = disproportionately higher weight)
Effect: Residue score dominated by most flexible atoms
Example: Identifying highly dynamic regions where any flexible atom matters
.with_rms_median_aggregation (Emphasize + Robust)
What: Root-median-square of RMSF values:
sqrt(median(RMSF_atoms²))
When to use:
Want quadratic weighting but robust against extreme outliers
Very flexible side chains with occasional large-amplitude jumps
Effect: Emphasizes typical flexibility, ignores rare extremes
Example: Disordered regions with occasional extreme conformations
Aggregation Selection Guide
| Your Goal | Recommended Method | Why |
|---|---|---|
| Standard flexibility profile | .with_mean_aggregation | Simple average, standard representation |
| Residues with very flexible tips | .with_median_aggregation | Ignores outlier atoms at side-chain ends |
| Emphasize most flexible atoms | .with_rms_aggregation | Larger fluctuations get more weight (quadratic) |
| Flexibility with outlier protection | .with_rms_median_aggregation | Emphasizes flexibility but ignores extremes |
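The four aggregation formulas can be compared directly on the atom-level RMSF values of a single residue. The following plain-Python sketch implements the formulas as stated in this section. The function names and the example values are illustrative only and are not part of mdxplain.

```python
import math
import statistics


def mean_aggregation(rmsf_atoms):
    """mean(RMSF_atoms)"""
    return statistics.fmean(rmsf_atoms)


def median_aggregation(rmsf_atoms):
    """median(RMSF_atoms)"""
    return statistics.median(rmsf_atoms)


def rms_aggregation(rmsf_atoms):
    """sqrt(mean(RMSF_atoms^2))"""
    return math.sqrt(statistics.fmean([v * v for v in rmsf_atoms]))


def rms_median_aggregation(rmsf_atoms):
    """sqrt(median(RMSF_atoms^2))"""
    return math.sqrt(statistics.median([v * v for v in rmsf_atoms]))


# Residue with a rigid backbone and one very mobile side-chain tip.
lysine_like = [0.4, 0.5, 0.5, 0.6, 3.0]

scores = {
    "mean": mean_aggregation(lysine_like),
    "median": median_aggregation(lysine_like),
    "rms": rms_aggregation(lysine_like),
    "rms_median": rms_median_aggregation(lysine_like),
}
```

Here the RMS score is the highest because the mobile tip is weighted quadratically, the mean sits in between, and both median-based scores stay at the level of the rigid atoms.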
Practical Examples
Equilibration Monitoring
Question: Has the simulation equilibrated? How far does the structure drift from the starting point?
Why: mean.to_reference = standard metric for comparing to crystal structure
Why: backbone = ignores side-chain noise, focuses on secondary structure stability
Use case: Quality control, detect slow conformational drift, assess convergence
rmsd_crystal = pipeline.analysis.structure.rmsd.mean.to_reference(
    reference_traj=0, reference_frame=0, atom_selection="backbone"
)
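One simple way to judge convergence from such a series is to compare block averages. The sketch below assumes the RMSD values have already been extracted as a plain sequence of floats, since the return type of the call above is not specified here. The function names, the number of blocks, and the tolerance are arbitrary illustrative choices.

```python
import statistics


def block_means(series, n_blocks):
    """Split the series into equal-sized blocks and average each block."""
    size = len(series) // n_blocks
    return [statistics.fmean(series[i * size:(i + 1) * size]) for i in range(n_blocks)]


def has_plateaued(series, n_blocks=4, tolerance=0.1):
    """Treat the series as converged if the last two block means agree."""
    means = block_means(series, n_blocks)
    return abs(means[-1] - means[-2]) < tolerance


# Synthetic RMSD series: one still drifting, one that levels off.
still_drifting = [0.1 * t for t in range(40)]
levelled_off = [0.1 * t for t in range(10)] + [1.0] * 30

drifting_ok = has_plateaued(still_drifting)   # False: block means keep rising
levelled_ok = has_plateaued(levelled_off)     # True: last two blocks agree
```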
Flexible Multi-Domain Protein
Question: What is the overall conformational change in a protein with mobile loops?
Why: median.to_reference = robust against flexible loop outliers affecting global RMSD
Why: “protein” selection = all protein atoms, including flexible regions
Use case: Multi-domain proteins, antibodies, disordered regions that would dominate mean
rmsd_robust = pipeline.analysis.structure.rmsd.median.to_reference(
    reference_traj=0, reference_frame=0, atom_selection="protein"
)
Conformational Transition Detection
Question: Are there sudden conformational changes or smooth dynamics?
Why: frame_to_frame(lag=5) = local structural changes every 5 frames
Why: mean metric = sufficient for well-behaved dynamics
Why: CA atoms = coarse-grained, computationally efficient
Use case: Identify transition events, measure local stability, detect jerky vs smooth motion
rmsd_dynamics = pipeline.analysis.structure.rmsd.mean.frame_to_frame(
    lag=5, atom_selection="name CA"
)
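Transition events show up as spikes in the lagged RMSD series. A minimal sketch of flagging them with a simple threshold follows. It assumes the values are available as a plain sequence of floats; the function name and the median-plus-n-sigma rule are illustrative choices, not something mdxplain prescribes.

```python
import statistics


def transition_frames(lagged_rmsd, n_sigma=3.0):
    """Return indices whose value exceeds median + n_sigma * standard deviation."""
    center = statistics.median(lagged_rmsd)
    spread = statistics.pstdev(lagged_rmsd)
    threshold = center + n_sigma * spread
    return [i for i, value in enumerate(lagged_rmsd) if value > threshold]


# Smooth dynamics with a single sudden jump at index 12 (synthetic values).
series = [0.2] * 20
series[12] = 2.0

events = transition_frames(series)
```

A fixed multiple of the standard deviation is a crude criterion; for noisy trajectories the threshold usually needs tuning against visual inspection of the series.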
Standard Flexibility Profile
Question: Which regions are rigid vs flexible?
Why: rmsf.mean.per_residue = standard per-residue flexibility metric
Why: with_mean_aggregation (default) = balanced residue flexibility
Why: CA atoms = one value per residue, standard representation
Use case: identify flexible loops/hinges
rmsf_profile = pipeline.analysis.structure.rmsf.mean.per_residue.to_mean_reference(
    atom_selection="name CA"
)
Binding Site Side-Chain Flexibility
Question: How do side chains move in the binding pocket?
Why: per_atom = high-resolution, individual atom fluctuations
Why: mean metric = standard for well-defined binding site
Why: resid 120-140 = specific binding site region
Use case: Ligand binding analysis, induced fit, side-chain conformational sampling
rmsf_detailed = pipeline.analysis.structure.rmsf.mean.per_atom.to_mean_reference(
    atom_selection="resid 120-140"  # binding site
)
Multi-Domain Protein Flexibility
Question: Which domains are rigid vs mobile, with robustness against outliers?
Why: rmsf.median = robust against occasional large-amplitude motions
Why: per_residue.to_median_reference = consistent robust statistics at all levels
Why: with_median_aggregation (implicit) = prevents outlier atoms from dominating
Use case: Domain motion analysis, linker flexibility, proteins with mobile tails
rmsf_robust = pipeline.analysis.structure.rmsf.median.per_residue.to_median_reference(
    atom_selection="protein"
)
Side-Chain Rotamer Switching (Advanced Aggregation)
Question: Which residues show side-chain flexibility when the most mobile atoms are emphasized?
Why: with_rms_aggregation = combines backbone and side-chain atom fluctuations with quadratic weighting
Why: Emphasizes larger fluctuations, suitable for rotamer analysis
Use case: Functional side-chain dynamics, allosteric communication pathways
rmsf_rotamers = pipeline.analysis.structure.rmsf.mean.per_residue.with_rms_aggregation(
    atom_selection="protein"
)