Dask Trajectory Archive Helper


DaskMDTrajectory Archive Helper - save/load trajectory archives.

Handles creation and extraction of self-contained .dask_traj archives that bundle the pickle metadata with the underlying Zarr cache directory. The resulting archive is fully portable: no external files are required after loading.

Archive layout (inside the compressed tar):

traj.pkl   – pickle of the DaskMDTrajectory object (metadata only)
zarr/      – the Zarr cache directory with all coordinate data

class mdxplain.trajectory.helper.dask_trajectory_helper.dask_trajectory_archive_helper.DaskMDTrajectoryArchiveHelper

Save and load DaskMDTrajectory objects as portable .dask_traj archives.

An archive is a zstd-compressed tar that contains:

  • traj.pkl – pickle of the trajectory object (metadata only, no arrays)

  • zarr/ – the Zarr cache directory with all coordinate data

The archive can be moved or shared freely. On load, the Zarr cache is extracted into a directory next to the archive file, and the trajectory handles are reconnected to it automatically. Subsequent loads reuse the already-extracted cache instead of extracting it again.
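The layout described above can be illustrated with a minimal sketch built on the standard library only. It is not the helper's actual implementation: the function names (write_archive, read_metadata, list_members) are hypothetical, a plain dict stands in for the pickled DaskMDTrajectory, and gzip is used as a stand-in for zstd so that the example runs without third-party packages.

```python
import io
import pickle
import tarfile
import tempfile
from pathlib import Path


def write_archive(metadata: dict, zarr_dir: Path, archive_path: Path) -> None:
    """Bundle pickled metadata and a cache directory into one compressed tar."""
    payload = pickle.dumps(metadata)
    # "w:gz" is a stdlib stand-in; the documented archive is zstd-compressed.
    with tarfile.open(archive_path, "w:gz") as tar:
        info = tarfile.TarInfo(name="traj.pkl")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        # Directories are added recursively, so every chunk file is included.
        tar.add(zarr_dir, arcname="zarr")


def read_metadata(archive_path: Path) -> dict:
    """Read only traj.pkl from the archive, leaving the cache untouched."""
    with tarfile.open(archive_path, "r:gz") as tar:
        member = tar.extractfile("traj.pkl")
        return pickle.load(member)


def list_members(archive_path: Path) -> list[str]:
    """Return the sorted member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache.zarr"  # hypothetical cache directory
        cache.mkdir()
        (cache / "chunk0").write_bytes(b"\x00" * 16)
        archive = root / "run1.dask_traj"
        write_archive({"n_frames": 501}, cache, archive)
        print(list_members(archive))  # ['traj.pkl', 'zarr', 'zarr/chunk0']
```

Keeping the pickle free of arrays means the metadata can be read without touching the coordinate data, which is what read_metadata demonstrates here.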

Methods

save(trajectory, filepath)

Write a .dask_traj archive for trajectory.

load(filepath)

Read a .dask_traj archive and return a ready-to-use trajectory.

static save(trajectory: DaskMDTrajectory, filepath: str) → None

Save a DaskMDTrajectory to a portable self-contained archive.

Serialises the trajectory metadata as a pickle and bundles it with the underlying Zarr cache into a single zstd-compressed tar archive. The resulting .dask_traj file can be transferred to another machine or directory without carrying the Zarr cache separately.

Parameters

trajectory : DaskMDTrajectory

Trajectory instance to archive.

filepath : str

Destination path. The .dask_traj extension is appended automatically when not already present.
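The extension rule can be pictured with a small sketch. The function name with_archive_extension is hypothetical, and the helper's real normalisation may differ in details such as case handling; the sketch only assumes that the suffix is appended rather than substituted, which matches the save() example where "output/run1" becomes "output/run1.dask_traj".

```python
def with_archive_extension(filepath: str, ext: str = ".dask_traj") -> str:
    """Append ext unless filepath already ends with it."""
    return filepath if filepath.endswith(ext) else filepath + ext


print(with_archive_extension("output/run1"))            # output/run1.dask_traj
print(with_archive_extension("output/run1.dask_traj"))  # output/run1.dask_traj
```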

Returns

None

Examples

>>> DaskMDTrajectoryArchiveHelper.save(traj, "output/run1")
Trajectory saved: output/run1.dask_traj

static load(filepath: str) → DaskMDTrajectory

Load a DaskMDTrajectory from a .dask_traj archive.

On the first call the archive is extracted to a sibling directory (<name>_extracted/). Subsequent calls reuse the extracted directory without re-extracting, so repeated loads are fast.

After extraction the trajectory handles (Dask arrays, Zarr store) are reconnected to the extracted Zarr cache.
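The extract-once behaviour can be sketched as follows. This is an illustration under stated assumptions, not the helper's code: extraction_dir and ensure_extracted are hypothetical names, <name> is assumed to be the archive file name without its extension, and the sketch opens the tar with the standard library's transparent compression detection, which does not cover zstd on older Python versions.

```python
import tarfile
import tempfile
from pathlib import Path


def extraction_dir(archive_path: Path) -> Path:
    """Sibling directory for the extracted content, e.g. run1_extracted/."""
    # Assumption: "<name>" is the archive file name without its extension.
    return archive_path.with_name(archive_path.stem + "_extracted")


def ensure_extracted(archive_path: Path) -> tuple[Path, bool]:
    """Extract the archive once; return (directory, True if freshly extracted)."""
    if not archive_path.is_file():
        raise FileNotFoundError(f"Archive not found: {archive_path}")
    target = extraction_dir(archive_path)
    if target.is_dir():
        # A previous load already extracted the cache, so reuse it.
        return target, False
    target.mkdir(parents=True)
    # Use the "data" extraction filter where this Python version provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(target, **kwargs)
    return target, True
```

A second call with the same path returns the existing directory with the flag set to False, which is why repeated loads are cheap. The sketch does not detect a stale or partially extracted directory; whether the helper does is not stated here.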

Parameters

filepath : str

Path to a .dask_traj archive created by save().

Returns

DaskMDTrajectory

Fully initialised trajectory with coordinate access ready.

Raises

FileNotFoundError

If filepath does not point to an existing archive.

Examples

>>> traj = DaskMDTrajectoryArchiveHelper.load("output/run1.dask_traj")
>>> print(traj.n_frames)
501