Dask Trajectory Archive Helper
DaskMDTrajectory Archive Helper - save/load trajectory archives.
Handles creation and extraction of self-contained .dask_traj archives
that bundle the pickle metadata with the underlying Zarr cache directory.
The resulting archive is fully portable: no external files are required
after loading.
Archive layout (inside the compressed tar):
traj.pkl – pickle of the DaskMDTrajectory object (metadata only)
zarr/ – the Zarr cache directory with all coordinate data
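The layout above can be illustrated with a minimal sketch that uses only the standard library. It is not the helper's implementation: gzip (`"w:gz"`) stands in for zstd so that no third-party package is needed, and `build_demo_archive`, `list_members`, the metadata dictionary, and the `coords.bin` placeholder are hypothetical names chosen for this example.

```python
import io
import pickle
import tarfile
import tempfile
from pathlib import Path


def build_demo_archive(archive_path: Path, metadata: dict, zarr_dir: Path) -> None:
    """Bundle a metadata pickle and a cache directory into one compressed tar."""
    payload = pickle.dumps(metadata)
    # "w:gz" is a stdlib stand-in; the documented archive is zstd-compressed.
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store the pickle under the fixed member name "traj.pkl".
        info = tarfile.TarInfo(name="traj.pkl")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        # Add the whole cache directory under the fixed name "zarr".
        tar.add(zarr_dir, arcname="zarr")


def list_members(archive_path: Path) -> list[str]:
    """Return the sorted member names of the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    cache = tmp_path / "cache"
    cache.mkdir()
    # Placeholder file standing in for real Zarr chunk data.
    (cache / "coords.bin").write_bytes(b"\x00" * 16)

    archive = tmp_path / "run1.dask_traj"
    build_demo_archive(archive, {"n_frames": 501}, cache)
    members = list_members(archive)
    print(members)
```

Using a fixed `arcname` keeps the member names independent of where the cache directory lived on the machine that wrote the archive, which is what makes such a bundle relocatable.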
- class mdxplain.trajectory.helper.dask_trajectory_helper.dask_trajectory_archive_helper.DaskMDTrajectoryArchiveHelper
Save and load DaskMDTrajectory objects as portable .dask_traj archives.
An archive is a zstd-compressed tar that contains:
- traj.pkl – pickle of the trajectory object (metadata only, no arrays)
- zarr/ – the Zarr cache directory with all coordinate data
The archive can be moved or shared freely. On load the Zarr cache is extracted next to the archive file and the trajectory handles are reconnected automatically. Subsequent loads reuse the already-extracted cache without re-extracting.
Methods
- save(trajectory, filepath)
Write a .dask_traj archive for trajectory.
- load(filepath)
Read a .dask_traj archive and return a ready-to-use trajectory.
- static save(trajectory: DaskMDTrajectory, filepath: str) → None
Save a DaskMDTrajectory to a portable self-contained archive.
Serialises the trajectory metadata as a pickle and bundles it with the underlying Zarr cache into a single zstd-compressed tar archive. The resulting .dask_traj file can be transferred to another machine or directory without carrying the Zarr cache separately.
Parameters
- trajectory : DaskMDTrajectory
Trajectory instance to archive.
- filepath : str
Destination path. The .dask_traj extension is appended automatically when not already present.
Returns
None
Examples
>>> DaskMDTrajectoryArchiveHelper.save(traj, "output/run1")
Trajectory saved: output/run1.dask_traj
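The automatic extension handling described for filepath can be sketched as follows. `with_archive_suffix` is a hypothetical helper written for illustration; the library may implement the check differently, for example case-insensitively.

```python
ARCHIVE_SUFFIX = ".dask_traj"


def with_archive_suffix(filepath: str) -> str:
    """Return filepath with ".dask_traj" appended unless it already ends with it."""
    if filepath.endswith(ARCHIVE_SUFFIX):
        return filepath
    # Append rather than replace, so names such as "run.v2" keep their dots.
    return filepath + ARCHIVE_SUFFIX


print(with_archive_suffix("output/run1"))
print(with_archive_suffix("output/run1.dask_traj"))
```

Both calls yield the same destination path, which matches the example above where `"output/run1"` is saved as `output/run1.dask_traj`.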
- static load(filepath: str) → DaskMDTrajectory
Load a DaskMDTrajectory from a .dask_traj archive.
On the first call the archive is extracted to a sibling directory (<name>_extracted/). Subsequent calls reuse the extracted directory without re-extracting, so repeated loads are fast.
After extraction the trajectory handles (Dask arrays, Zarr store) are reconnected to the extracted Zarr cache.
Parameters
- filepath : str
Path to a .dask_traj archive created by save().
Returns
- DaskMDTrajectory
Fully initialised trajectory with coordinate access ready.
Raises
- FileNotFoundError
If filepath does not point to an existing archive.
Examples
>>> traj = DaskMDTrajectoryArchiveHelper.load("output/run1.dask_traj")
>>> print(traj.n_frames)
501
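The extract-once behaviour and the FileNotFoundError case described for load() can be sketched with the standard library alone. This is an illustration under stated assumptions, not the helper's code: `ensure_extracted` is a hypothetical function, `<name>` is assumed to be the archive file name without its extension, and the demo archive is gzip-compressed instead of zstd-compressed so that `tarfile` can read it without extra packages.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def ensure_extracted(archive_path: Path) -> tuple[Path, bool]:
    """Return (extraction directory, whether extraction ran on this call)."""
    if not archive_path.is_file():
        raise FileNotFoundError(f"Archive not found: {archive_path}")
    # Assumption: <name> is the archive file name without its extension.
    target = archive_path.with_name(archive_path.stem + "_extracted")
    if target.is_dir():
        # A previous call already extracted the archive; reuse it.
        return target, False
    target.mkdir()
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(target)
    return target, True


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "run1.dask_traj"
    payload = b"demo"  # placeholder bytes, not a real pickle
    with tarfile.open(archive, "w:gz") as tar:
        info = tarfile.TarInfo(name="traj.pkl")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    first_dir, first_ran = ensure_extracted(archive)    # extracts
    second_dir, second_ran = ensure_extracted(archive)  # reuses the directory
    print(first_dir.name, first_ran, second_ran)
```

A directory-existence check is the simplest form of caching. It does not detect a partially extracted or outdated directory, so a more defensive variant might extract to a temporary name and rename it once extraction completes.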