Andreea Anghel, Laura Mihaela Vasilescu, et al.
CF 2015
Simulators help computer architects optimize system designs. The limited performance of simulators even of moderate size and detail makes the approach infeasible for design-space exploration of future exascale systems. Analytic models, in contrast, offer very fast turn-around times. In this paper we propose an analytic multi-core processor-performance model that takes as inputs a) a parametric microarchitecture-independent characterization of the target workload, and b) a hardware configuration of the core and the memory hierarchy. The processor-performance model considers instruction-level parallelism (ILP) per type, models single instruction, multiple data (SIMD) features, and considers cache and memory-bandwidth contention between cores. We validate our model by comparing its performance estimates with measurements from hardware performance counters on Intel Xeon and ARM Cortex-A15 systems. We estimate multi-core contention with a maximum error of 11.4 percent. The average single-thread error increases from 25 percent for a state-of-the-art simulator to 59 percent for our model, but the correlation is still 0.8, a high relative accuracy, while we achieve a speedup of several orders of magnitude. With a much higher capacity than simulators and more reliable insights than back-of-the-envelope calculations it makes automated design-space exploration of exascale systems possible, which we show using a real-world case study from radio astronomy.
Andreea Anghel, Laura Mihaela Vasilescu, et al.
CF 2015
Leandro Fiorin, Erik Vermij, et al.
Int. J. Parallel Program
Giovanni Mariani, Andreea Anghel, et al.
Parallel Computing
Leandro Fiorin, Rik Jongerius, et al.
IEEE TPDS