Memmap Utils

Centralized helpers for memmap path handling and lifecycle operations.

class mdxplain.utils.memmap_utils.MemmapUtils

Utility methods for memmap path prep, creation, and cleanup.

static prepare_memmap_path(path: str | PathLike, mode: str) → str

Normalize memmap file path and create parent directory for write modes.

Parameters

path : str or os.PathLike

Target memmap file path.

mode : str

Memmap mode (e.g. "w+", "r+", "r").

Returns

str

Normalized absolute memmap path.
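
The behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not the mdxplain implementation: the function name `prepare_path_sketch` is hypothetical, and treating only modes that start with "w" as write modes is an assumption made for illustration.

```python
import os
import tempfile
from pathlib import Path


def prepare_path_sketch(path, mode):
    """Hypothetical stand-in for the path preparation step."""
    # Accept str or os.PathLike and return an absolute string path.
    normalized = os.path.abspath(os.fspath(path))
    # Assumption: only modes starting with "w" create a new file,
    # so only those need the parent directory to exist beforehand.
    if mode.startswith("w"):
        os.makedirs(os.path.dirname(normalized), exist_ok=True)
    return normalized


root = tempfile.mkdtemp()
write_path = prepare_path_sketch(Path(root) / "cache" / "features.dat", "w+")
read_path = prepare_path_sketch(Path(root) / "missing" / "features.dat", "r")
```

In this sketch the "w+" call creates the `cache` directory, while the "r" call only normalizes the path and leaves the filesystem untouched.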

static is_memmap_view(array: Any) → bool

Check whether an array is backed by a numpy memmap (including views).

Parameters

array : Any

Array or view to check.

Returns

bool

True if the array is a memmap or view on a memmap.
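
One plausible way to detect memmap backing is to follow the NumPy `.base` chain until a `numpy.memmap` instance appears. The sketch below illustrates that idea under the assumption that this is roughly what the check does; `is_memmap_backed` is a hypothetical name, not part of mdxplain.

```python
import os
import tempfile

import numpy as np


def is_memmap_backed(array):
    """Hypothetical check: walk the .base chain looking for a np.memmap."""
    current = array
    while current is not None:
        if isinstance(current, np.memmap):
            return True
        # Objects without a .base attribute (lists, mmap objects) end the walk.
        current = getattr(current, "base", None)
    return False


path = os.path.join(tempfile.mkdtemp(), "demo.dat")
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(8, 4))
plain_view = np.asarray(mm)[2:6]          # plain ndarray view on the memmap
in_memory = np.zeros((8, 4), dtype=np.float32)

print(is_memmap_backed(mm), is_memmap_backed(plain_view), is_memmap_backed(in_memory))
```

The `plain_view` case is the interesting one: its type is a plain `ndarray`, so an `isinstance` test on the array itself would miss the memmap that owns its memory.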

static evict_from_os_cache(array: Any) → None

Request the OS to release cached pages for a memmap backing an array.

Parameters

array : Any

Array, memmap, or view potentially backed by a memmap.

Returns

None

Applies MADV_DONTNEED when supported; otherwise flushes writeable mappings.

Notes

The array is flushed automatically before eviction, so callers do not need to call .flush() separately.
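
The flush-then-advise sequence can be sketched with Python's `mmap.madvise` (Python 3.8+, on platforms that provide it). This is an illustration only: `evict_sketch` is a hypothetical name, it relies on NumPy's private `_mmap` attribute, and skipping copy-on-write mappings (mode "c") is a precaution added here because `MADV_DONTNEED` would discard their unsaved private changes.

```python
import mmap
import os
import tempfile

import numpy as np


def evict_sketch(array):
    """Hypothetical sketch; returns True only if madvise was applied."""
    current = array
    while current is not None and not isinstance(current, np.memmap):
        current = getattr(current, "base", None)
    if current is None:
        return False  # not memmap-backed, nothing to do
    raw = getattr(current, "_mmap", None)  # private NumPy attribute (assumption)
    if raw is None or getattr(current, "mode", None) == "c":
        return False
    if current.flags.writeable:
        current.flush()  # write dirty pages back to the file first
    advice = getattr(mmap, "MADV_DONTNEED", None)
    if advice is None or not hasattr(raw, "madvise"):
        return False  # platform without madvise support: flush only
    raw.madvise(advice)
    return True


path = os.path.join(tempfile.mkdtemp(), "evict.dat")
mm = np.memmap(path, dtype=np.int64, mode="w+", shape=(1024, 16))
mm[:] = 7
advised = evict_sketch(mm)
still_readable = int(mm[100, 3])  # pages are reloaded on demand
```

The array stays valid after the call: for a shared file mapping, dropped pages are simply read back from the file on the next access.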

static evict_memory_range(array: Any, start_row: int, end_row: int) → None

Request the OS to release cached pages for a specific row range.

Parameters

array : Any

Array, memmap, or view potentially backed by a memmap.

start_row : int

Starting row index (inclusive).

end_row : int

Ending row index (exclusive).

Returns

None

Applies MADV_DONTNEED to the row range when supported; otherwise flushes writeable mappings.

Notes

Call this inside loops that copy large datasets chunk by chunk; otherwise the pages the OS caches for reads and writes accumulate in RAM. The underlying memmap is flushed automatically before eviction.
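
Evicting a row range requires translating rows into a byte range that `madvise` accepts, and the start offset must be page-aligned. The sketch below shows only that arithmetic. The helper name `row_range_to_page_span` and its parameters are hypothetical, and rounding inward (so that only pages lying fully inside the range are released) is a conservative choice made for this example, not necessarily what mdxplain does.

```python
def row_range_to_page_span(start_row, end_row, row_nbytes, data_offset=0, page_size=4096):
    """Map rows [start_row, end_row) to a page-aligned (start, length) byte span.

    data_offset is where the array data begins inside the mapping.
    Returns None when the range does not cover a whole page.
    """
    if end_row <= start_row:
        return None
    first_byte = data_offset + start_row * row_nbytes
    last_byte = data_offset + end_row * row_nbytes        # exclusive
    aligned_start = -(-first_byte // page_size) * page_size  # round up
    aligned_end = (last_byte // page_size) * page_size       # round down
    if aligned_end <= aligned_start:
        return None
    return aligned_start, aligned_end - aligned_start


# Rows of 1024 bytes each: rows 1..8 cover bytes 1024..9216.
span = row_range_to_page_span(1, 9, row_nbytes=1024, page_size=4096)
tiny = row_range_to_page_span(0, 3, row_nbytes=1024, page_size=4096)
print(span, tiny)  # (4096, 4096) None
```

A resulting span could then be passed as the `start` and `length` arguments of `mmap.madvise`. In real use, `mmap.PAGESIZE` would replace the fixed page size used here.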

static close_memmap_view(array: Any) → None

Close underlying memmap handle(s) for an array or array view.

Parameters

array : Any

Array, memmap, or view potentially backed by a memmap.

Returns

None

Flushes, evicts from cache, and closes any discovered memmap handles.

static close_memmaps_for_path(path: str | PathLike) → None

Close all tracked memmaps whose filename matches the given path.

Parameters

path : str or os.PathLike

Target memmap file path.

Returns

None

static close_memmaps_under_path(path: str | PathLike) → None

Close tracked memmaps whose filename is inside the given directory path.

Parameters

path : str or os.PathLike

Directory path root for memmap cleanup.

Returns

None
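
The two path-based cleanup methods differ only in how a tracked filename is matched: exact file versus anywhere below a directory. How mdxplain tracks memmaps and compares paths is not documented here, so the following is a sketch of one reasonable matching rule with hypothetical helper names.

```python
import os


def _normalize(path):
    # Absolute, case-normalized form so equivalent spellings compare equal.
    return os.path.normcase(os.path.abspath(os.fspath(path)))


def matches_path(filename, path):
    """Exact-file match, as in the 'for path' variant."""
    return _normalize(filename) == _normalize(path)


def is_under(filename, root):
    """Directory containment, as in the 'under path' variant."""
    filename, root = _normalize(filename), _normalize(root)
    try:
        return os.path.commonpath([filename, root]) == root
    except ValueError:  # e.g. different drives on Windows
        return False


tracked = ["/data/run1/a.dat", "/data/run1/sub/b.dat", "/data/run10/c.dat"]
exact = [f for f in tracked if matches_path(f, "/data/run1/../run1/a.dat")]
under = [f for f in tracked if is_under(f, "/data/run1/")]
```

Comparing with `os.path.commonpath` instead of a plain string prefix keeps `/data/run10/c.dat` out of the `/data/run1` match.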

static create_memmap(path: str | PathLike, dtype: dtype | str | type, mode: str, shape: Tuple[int, ...], *, close_existing: bool | None = None, access_pattern: str | None = 'random') → ndarray

Create a memmap with standardized path prep and optional stale-handle cleanup.

Parameters

path : str or os.PathLike

Target memmap file path.

dtype : numpy dtype, str, or type

Data type for the array.

mode : str

Memmap mode (e.g. "w+", "r+", "r").

shape : tuple[int, ...]

Desired output shape.

close_existing : bool, optional

Whether to close currently tracked memmaps for this path before create. If None, defaults to True only for write-truncate modes ("w+", "w+b").

access_pattern : str, optional

Access hint for ResourceUtils.tune_memmap (e.g. "random", "sequential"). If None, no hint is applied.

Returns

np.ndarray

Memmap-backed array when possible, otherwise regular ndarray for empty shapes.
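
The overall flow (resolve the `close_existing` default, prepare the path, fall back to a regular array for empty shapes) can be sketched with plain NumPy. The names `resolve_close_existing` and `create_memmap_sketch` are hypothetical, the stale-handle cleanup and the access-pattern hint are left out, and treating modes that start with "w" as the ones needing a parent directory is an assumption.

```python
import math
import os
import tempfile

import numpy as np


def resolve_close_existing(close_existing, mode):
    """Default described above: True only for write-truncate modes."""
    if close_existing is None:
        return mode in ("w+", "w+b")
    return close_existing


def create_memmap_sketch(path, dtype, mode, shape):
    """Hypothetical sketch; omits stale-handle cleanup and access hints."""
    # A zero-byte file cannot be memory-mapped, so return a plain array.
    if math.prod(shape) == 0:
        return np.empty(shape, dtype=dtype)
    path = os.path.abspath(os.fspath(path))
    if mode.startswith("w"):  # assumption: only these modes create the file
        os.makedirs(os.path.dirname(path), exist_ok=True)
    return np.memmap(path, dtype=dtype, mode=mode, shape=shape)


root = tempfile.mkdtemp()
filled = create_memmap_sketch(os.path.join(root, "nested", "x.dat"), "float32", "w+", (4, 3))
empty = create_memmap_sketch(os.path.join(root, "nested", "empty.dat"), "float32", "w+", (0, 3))
```

Here `filled` is a `numpy.memmap` of shape (4, 3), while `empty` is an ordinary in-memory array of shape (0, 3) and no file is created for it.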