A Case for Using Cache Line Deltas for High-Frequency VM Snapshotting
Active-standby schemes for Virtual Machine (VM) high availability require periodic synchronization of memory and CPU state. The most common approach is to use page tables and software to identify “dirty” memory pages at the source and copy them to the target via a network or interconnect. However, this approach incurs significant page-table traversal and data-copying overhead, resulting in considerable VM downtime. A principal contributor to this overhead is data copy-amplification: because the processor’s virtual memory system tracks changes at page granularity (4KiB or larger), far more data is copied than has actually changed. With the emergence of CXL-enabled memory devices, it is now possible to track memory changes at a finer granularity (e.g., 64-byte cache lines instead of 4KiB pages). Moreover, CXL enables new functions to be pushed down into custom memory controllers that can directly intercept and manipulate memory transactions. This paper examines the potential advantages of moving to cache line-based memory change detection and transfer. We focus on exploring continuous synchronization of VM guest memory spaces for the purpose of achieving high availability; for this use case, the maximum outage time, resulting from snapshot and synchronization latency, must be kept to a minimum. Our analysis examines memory access patterns from 30 different benchmarks and derives a quantitative understanding of the potential gains CXL-based technology could offer. The results show that more than 35% of the benchmarks exhibit an amplification factor greater than 10 and would therefore benefit significantly from the proposed finer granularity. Furthermore, we show that combining fine-grained tracking with compression further reduces the data transferred between machines, cutting copy volume by half.
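As an illustration of the copy-amplification metric described above, the following sketch (a hypothetical example of ours, not the paper's measurement tooling) compares the bytes transferred under page-granularity versus cache-line-granularity dirty tracking for the same set of written addresses:

```python
# Illustrative sketch (hypothetical, not the paper's artifact):
# compare how much data page-level vs. cache-line-level dirty
# tracking would transfer for the same set of written addresses.

PAGE = 4096  # 4 KiB page
LINE = 64    # 64-byte cache line

def transfer_volumes(dirty_addrs):
    """Return (page_bytes, line_bytes) copied under each granularity."""
    pages = {a // PAGE for a in dirty_addrs}  # distinct dirty pages
    lines = {a // LINE for a in dirty_addrs}  # distinct dirty cache lines
    return len(pages) * PAGE, len(lines) * LINE

# Example: one small write in each of 10 distinct pages.
writes = [i * PAGE + 128 for i in range(10)]
page_bytes, line_bytes = transfer_volumes(writes)
print(page_bytes, line_bytes, page_bytes / line_bytes)  # 40960 640 64.0
```

A sparse write pattern like this one yields an amplification factor of 64 (one dirty line per dirty page); dense patterns that touch most lines of a page drive the factor toward 1, which is why the benefit reported in the paper varies across benchmarks.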