PyMM: Heterogeneous Memory Programming for Python Data Science

Daniel Waddington; Moshik Hershcovitch; Clem Dickey

doi:10.1145/3477113.3487266

PLOS 2021

Conference paper

25 Oct 2021

Abstract

Persistent memory (PMEM) is a promising technology, but leveraging it with legacy applications is non-trivial, primarily because these applications assume that all memory is volatile and have no notion of crash consistency or state recovery. As new types of persistent and intelligent memory emerge, propelled by the CXL standard, the problem of integration and adoption remains. In this paper we present PyMM, a framework for heterogeneous memory management in Python. It provides a means to abstract over different memory types and their underlying traits (e.g., persistence, near/far). PyMM focuses on ease of use and subclasses existing, heavily used types such as NumPy ndarray and PyTorch tensors. By doing so, PyMM allows adoption of new memory types with only minor modifications to the application.
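To illustrate the sub-classing idea in general terms, the following is a minimal sketch and not PyMM's actual API. It assumes that an ordinary memory-mapped file can stand in for persistent memory. The class name `FileBackedArray`, its `persist` method, and the file name `weights.bin` are hypothetical names introduced only for this example. Because the class is an ndarray subclass, existing NumPy code such as indexing and in-place arithmetic runs on it unchanged.

```python
import os
import tempfile

import numpy as np


class FileBackedArray(np.memmap):
    """Hypothetical ndarray subclass whose data lives in a file-backed mapping.

    Illustration only: a memory-mapped file stands in for persistent memory.
    This is not the PyMM interface.
    """

    def __new__(cls, path, shape, dtype=np.float64):
        # Reopen existing data in place, or create a zero-filled backing file.
        mode = "r+" if os.path.exists(path) else "w+"
        return super().__new__(cls, path, dtype=dtype, mode=mode, shape=shape)

    def persist(self):
        # Write modified pages back to the backing file.
        self.flush()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "weights.bin")

    a = FileBackedArray(path, shape=(4,))
    a[:] = [1.0, 2.0, 3.0, 4.0]  # ordinary ndarray indexing
    a *= 2                       # ordinary in-place NumPy arithmetic
    a.persist()
    del a                        # drop the in-process handle

    # Reopening the same file recovers the stored values.
    b = FileBackedArray(path, shape=(4,))
    return isinstance(b, np.ndarray), b.tolist()


if __name__ == "__main__":
    print(demo())
```

The only application-side change in this sketch is the constructor call; everything after it is plain NumPy. A real persistent-memory framework would also have to address crash consistency, which a simple flush does not guarantee.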

Poster