EuroSys 2008

MemorIES: A programmable, real-time hardware emulation tool for multiprocessor server design

View publication


Modern system design often requires multiple levels of simulation for design validation and performance debugging. However, while machines have gotten faster, and simulators have become more detailed, simulation speeds have not tracked machine speeds. As a result, it is difficult to simulate realistic problem sizes and hardware configurations for a target machine. Instead, researchers have focussed on developing scaling methodologies and running smaller problem sizes and configurations that attempt to represent the behavior of the real problem. Given the increasing size of problems today, it is unclear whether such an approach yields accurate results. Moreover, although commercial workloads are prevalent and important in today's marketplace, many simulation tools are unable to adequately profile such applications, let alone for realistic sizes. In this paper we present a hardware-based emulation tool that can be used to aid memory system designers. Our focus is on the memory system because the ever-widening gap between processor and memory speeds means that optimizing the memory subsystem is critical for performance. We present the design of the Memory Instrumentation and Emulation System (MemoriES). MemoriES is a programmable tool designed using FPGAs and SDRAMs. It plugs into an SMP bus to perform on-line emulation of several cache configurations, structures and protocols while the system is running real-life workloads in real-time, without any slowdown in application execution speed. We demonstrate its usefulness in several case studies, and find several important results. First, using traces to perform system evaluation can lead to incorrect results (off by 100% or more in some cases) if the trace size is not sufficiently large. Second, MemoriES is able to detect performance problems by profiling miss behavior over the entire course of a run, rather than relying on a small interval of time. Finally, we observe that previous studies of SPLASH2 applications using scaled application sizes can result in optimistic miss rates relative to real sizes on real machines, providing potentially misleading data when used for design evaluation.