Publication
HPCA 2009
Conference paper
Adaptive spill-receive for robust high-performance caching in CMPs
Abstract
In a Chip Multi-Processor (CMP) with private caches, the last-level cache is statically partitioned between all the cores. This prevents such CMPs from sharing cache capacity in response to the requirements of individual cores. Capacity sharing can be provided in private caches by spilling a line evicted from one cache to another cache. However, naively allowing all caches to spill evicted lines to other caches has limited performance benefit, as such spilling does not take into account which cores benefit from extra capacity and which cores can provide extra capacity. This paper proposes Dynamic Spill-Receive (DSR) for efficient capacity sharing. In a DSR architecture, each cache uses Set Dueling to learn whether it should act as a "spiller cache" or "receiver cache" for best overall performance. We evaluate DSR for a quad-core system with 1MB private caches using 495 multi-programmed workloads. DSR improves average throughput by 18% (weighted speedup by 13% and the harmonic-mean fairness metric by 36%) compared to no spilling. DSR requires a total storage overhead of less than two bytes per core, does not require any changes to the existing cache structure, and is scalable to a large number of cores (16 in our evaluation). Furthermore, we propose a simple extension of DSR that provides Quality of Service (QoS) by guaranteeing that the worst-case performance of each application remains similar to that with no spilling, while still providing an average throughput improvement of 17.5%. © 2008 IEEE.
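The abstract describes the Set Dueling mechanism only at a high level. As a rough illustration of the general idea, the sketch below shows how a per-cache saturating counter might arbitrate between "spill" and "receive" behavior using a few dedicated sampler sets. The set counts, counter width, names, and counter polarity are illustrative assumptions, not the paper's exact parameters.

```python
# Hypothetical sketch of Set-Dueling-based Dynamic Spill-Receive (DSR).
# All parameter values and names below are illustrative assumptions,
# not figures taken from the paper.

NUM_SETS = 1024          # sets in one private last-level cache (assumed)
NUM_SAMPLER_SETS = 32    # dedicated sets per policy (assumed)
PSEL_MAX = 1023          # 10-bit saturating counter (assumed width)


class DSRController:
    """Decides whether this core's cache acts as a spiller or a receiver."""

    def __init__(self):
        # Start at the midpoint: no initial bias toward either policy.
        self.psel = PSEL_MAX // 2
        # Dedicate a few sets to always-spill and a few to always-receive;
        # the remaining "follower" sets adopt the winning policy.
        self.spill_sets = set(range(0, NUM_SAMPLER_SETS))
        self.receive_sets = set(range(NUM_SAMPLER_SETS, 2 * NUM_SAMPLER_SETS))

    def on_miss(self, set_index: int) -> None:
        # Misses in the sampler sets steer the counter toward the
        # policy that misses less often.
        if set_index in self.spill_sets:
            self.psel = min(self.psel + 1, PSEL_MAX)
        elif set_index in self.receive_sets:
            self.psel = max(self.psel - 1, 0)

    def should_spill(self, set_index: int) -> bool:
        # Sampler sets always follow their fixed policy; follower sets
        # use whichever policy the counter currently favors.
        if set_index in self.spill_sets:
            return True
        if set_index in self.receive_sets:
            return False
        return self.psel < PSEL_MAX // 2


if __name__ == "__main__":
    ctrl = DSRController()
    ctrl.on_miss(5)                 # a miss in an always-spill sampler set
    print(ctrl.should_spill(500))   # current policy for a follower set
```

The single saturating counter per cache is consistent with the abstract's claim of a storage overhead below two bytes per core, since no per-line or per-set state is added beyond the small number of dedicated sampler sets.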