.. rubric:: Performance and System Stability
   :class: rubric-h1

Large-scale analyses in mdxplain can stress a machine in ways that are unusual
for everyday scientific software. This page explains **what we observed**,
**why this can happen**, and **what mdxplain does to reduce the risk of system
freezes**. The goal is not to scare users, but to make heavy jobs
**predictably slow instead of unpredictably unstable**.

This explanation is written for computational scientists and
bioinformaticians, not for operating-system specialists. Technical terms are
explained when they first appear.

--------------------------------------------------------------------------

What we observed
----------------

When running very large jobs (many samples, very large intermediate matrices
or graphs, large memory-mapped files, and multiple full passes over the data),
we repeatedly observed the following behavior:

- The system can become **completely unresponsive** during intensive
  read/write phases. The graphical interface stops updating, terminals do not
  respond, and SSH connections may freeze.
- Disk activity appears **fully saturated**: other programs (desktop, logging,
  remote login) compete for the same storage and become extremely slow or stop
  responding.
- CPU cores are busy, but the machine does not feel "just slow"; it feels
  **frozen**, as if input events are no longer processed.
- In rare but reproducible cases (for example, when running kernel PCA or
  diffusion maps on very large simulations), the system was so blocked that
  **even a long press of the power button did not turn the machine off**.
  Recovery was only possible by cutting power completely.

These effects were not random. They appeared consistently for large
simulations (more than 200k frames and more than 5k atoms) during heavy
computations such as kernel PCA or diffusion maps, indicating real
**stability limits** rather than individual hardware failures.


--------------------------------------------------------------------------

Why this happens
----------------

To understand these freezes, it helps to know how large numerical workloads
interact with the operating system.

Memory-mapped files and the page cache
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

mdxplain uses *memory-mapped files* (often called *memmaps*) for large
intermediate arrays. A memmap lets the program treat data on disk as if it
were in memory, while the operating system transparently loads and writes
parts of the file as needed.

The operating system keeps recently used data in a **page cache** (RAM used as
a buffer for disk I/O). When the program writes to a memmap, modified pages
become *dirty pages*: they must eventually be written back to disk.

If a program writes large amounts of data very quickly, many dirty pages can
accumulate. When internal limits are reached, the operating system forces a
large **writeback** operation (flushing data to disk). During this phase, many
processes can be blocked, and the system may feel frozen.

--------------------------------------------------------------------------

The pressure of approximations (e.g., Nyström)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Even when using memory-efficient methods such as the **Nyström
approximation**, system stability is not guaranteed. While Nyström avoids the
O(N²) memory explosion, the construction phase introduces distinct sources of
pressure:

1. **Landmark selection:** Selecting landmarks requires full linear scans of
   the original trajectory. For very large datasets, these scans can evict
   large parts of the page cache, forcing system services and other
   applications out of memory.
2. **Address translation overhead:** Accessing large matrices (even N × m) in
   partially irregular patterns increases pressure on the CPU's memory
   translation machinery (for example the Translation Lookaside Buffer, TLB).
   A high rate of page faults shifts CPU time from numerical work to memory
   management in the kernel.

--------------------------------------------------------------------------

Disk I/O saturation
~~~~~~~~~~~~~~~~~~~

Large analyses often read and write hundreds of gigabytes. If the storage
device (SSD or HDD) is fully busy, all programs that need disk access are
delayed. This includes system services, logging, desktop components, and
remote login.

When disk queues are full, the system can appear unresponsive even if CPU and
memory are still available.

--------------------------------------------------------------------------

Thrashing
~~~~~~~~~

In extreme cases, the combination of large working sets, repeated full passes
over the data, and concurrent parallel workers can push the system into a
state known as *thrashing*. In this state, the operating system spends most of
its time moving memory pages between RAM and disk instead of making forward
progress, which can make the entire machine appear frozen.

--------------------------------------------------------------------------

Too much parallelism (oversubscription)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many numerical libraries (NumPy, SciPy, BLAS, LAPACK) use multiple CPU threads
internally. At the same time, mdxplain or libraries it depends on (such as
scikit-learn) may run multiple processes in parallel.

If both levels are active, the system can end up with **far more active
threads than CPU cores**. This is called *oversubscription*. Oversubscription
causes heavy context switching, cache thrashing, and additional I/O pressure,
making freezes more likely.

Some libraries, such as scikit-learn, already use ``threadpoolctl`` internally
to avoid oversubscription within their own code. However, oversubscription can
still occur when multiple layers of parallelism interact (for example,
process-based parallelism combined with threaded numerical libraries, or a mix
of different libraries with independent thread pools).
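
The arithmetic behind oversubscription can be made concrete with a small sketch. It uses only the Python standard library and is not mdxplain's implementation; the helper names ``threads_per_worker`` and ``thread_limit_env`` are hypothetical and exist only for this illustration. The environment variables shown are the ones commonly read by OpenMP, OpenBLAS, and MKL runtimes, and they generally need to be set before the numerical library is loaded.

```python
import os


def threads_per_worker(n_workers, n_cores=None):
    """Return a per-process thread cap so that workers * threads <= cores.

    Hypothetical helper for illustration; not part of mdxplain.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if n_cores is None:
        # os.cpu_count() may return None on unusual platforms.
        n_cores = os.cpu_count() or 1
    return max(1, n_cores // n_workers)


def thread_limit_env(n_threads):
    """Build environment variables that common BLAS/OpenMP runtimes read.

    Hypothetical helper; the variables take effect only if they are set
    before the numerical library initializes its thread pool.
    """
    value = str(n_threads)
    return {
        "OMP_NUM_THREADS": value,
        "OPENBLAS_NUM_THREADS": value,
        "MKL_NUM_THREADS": value,
    }


# Example: 8 cores shared by 4 worker processes.
n_cores, n_workers = 8, 4
unbounded = n_workers * n_cores  # each worker starts one thread per core
cap = threads_per_worker(n_workers, n_cores)
bounded = n_workers * cap
print(unbounded, cap, bounded)  # prints: 32 2 8
```

In this example, four workers that each start one thread per core produce 32 runnable threads on 8 cores, while a cap of two threads per worker keeps the total at 8.
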
mdxplain therefore applies additional global limits to ensure that the total
number of active threads remains bounded.

--------------------------------------------------------------------------

Access pattern confusion
~~~~~~~~~~~~~~~~~~~~~~~~

Some phases of an analysis read data sequentially (linear scans), while others
access data in a more random pattern (graphs, neighborhoods, iterative
solvers). The operating system attempts to infer the access pattern
automatically, but it can make suboptimal decisions.

If this happens, unnecessary disk reads and cache evictions can significantly
slow the system.

--------------------------------------------------------------------------

Why even the power button may not work
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On many modern systems, the power button is not a purely mechanical hardware
switch. Instead, it generates an event that is handled through firmware and
the operating system, often in user space (for example by ``systemd-logind``
on Linux).

Under extreme conditions (such as sustained disk I/O saturation, memory
reclaim, or thrashing), handling of such events can be delayed. This can make
the machine appear completely unresponsive.

In very rare situations, prolonged resource starvation may contribute to
*livelock-like* behavior, where progress is prevented despite ongoing
activity. In such cases, even firmware-mediated mechanisms involved in
power-button handling may not react immediately, requiring a full power cut to
recover.

--------------------------------------------------------------------------

How mdxplain reduces the risk
-----------------------------

mdxplain cannot make extremely large jobs cheap, but it can make them **much
more predictable and safer to run**. The following measures are applied
automatically or when stability mode is enabled.


Chunk-level flushing of memmaps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Large memmap writes are flushed to disk **in small chunks** (for example via
``msync``) instead of all at once. This reduces the buildup of dirty pages and
prevents sudden global writeback stalls.

--------------------------------------------------------------------------

Explicit access-pattern hints
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before large linear scans, mdxplain informs the operating system (via
``madvise``) that access will be sequential. Afterward, this hint is reset for
random-access phases. This improves caching behavior during large streaming
passes.

--------------------------------------------------------------------------

Controlled parallelism
~~~~~~~~~~~~~~~~~~~~~~

mdxplain limits the number of threads used by numerical libraries to avoid
oversubscription. When process-based parallelism is active, internal threading
may be reduced to one thread per process.

--------------------------------------------------------------------------

CPU and I/O fairness
~~~~~~~~~~~~~~~~~~~~

When supported by the platform, mdxplain lowers its CPU and disk scheduling
priority (for example using ``nice`` and ``ionice``). This helps ensure that
the operating system, desktop, and interactive processes remain responsive
even under heavy load.

--------------------------------------------------------------------------

Platform safety and HPC environments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All resource controls are **best-effort**. If a platform does not support a
specific control, mdxplain logs a warning and continues safely.

On HPC systems, mdxplain respects scheduler limits (CPU sets, cgroups) and
never uses resources outside the job allocation.

--------------------------------------------------------------------------

Summary
-------

Large-scale analyses can push a system into extreme resource-pressure states.
Without care, this can lead to complete system freezes.
mdxplain actively manages memory, I/O, and parallelism to reduce these risks,
prioritizing system stability over raw throughput for large jobs.

--------------------------------------------------------------------------

References
----------

- Linux virtual memory and writeback behavior:
  https://docs.kernel.org/admin-guide/sysctl/vm.html
- Memory pressure, page reclaim, and thrashing (background):
  https://gist.github.com/JPvRiel/bcc5b20aac0c9cce6eefa6b88c125e03
- ``madvise`` system call and access pattern hints:
  https://man7.org/linux/man-pages/man2/madvise.2.html
- Thread pool control and oversubscription:
  https://github.com/joblib/threadpoolctl
- Parallelism in scientific Python:
  https://scikit-learn.org/stable/computing/parallelism.html
- systemd power button handling:
  https://www.freedesktop.org/software/systemd/man/logind.conf.html
- Livelock and resource starvation:
  https://en.wikipedia.org/wiki/Livelock