Global wiring on a wire routing machine
Ravi Nair, Se June Hong, et al.
DAC 1982
This paper presents a framework for analyzing the performance of multithreaded programs using a model called a constraint graph. We review previous constraint graph definitions for sequentially consistent systems and extend these definitions for use in analyzing other memory consistency models. Using this framework, we present two constraint graph analysis case studies using several commercial and scientific workloads running on a full-system simulator. The first case study illustrates how a constraint graph can be used to determine the necessary conditions for implementing a memory consistency model, rather than conservative sufficient conditions. Using this method, we classify coherence misses as either required or unnecessary. We determine that, on average, over 30% of all load instructions that suffer cache misses due to coherence activity are stalled unnecessarily, because the original copy of the cache line could have been used without violating the memory consistency model. Based on this observation, we present a novel delayed consistency implementation that uses stale cache lines when possible. The second case study demonstrates the effects of memory consistency constraints on the fundamental limits of instruction-level parallelism (ILP), compared to previous estimates that did not include multiprocessor constraints. Using this method, we determine the differences in exploitable ILP across memory consistency models for processors that do not perform consistency-related speculation.