Hiroshi Sasaki, Alper Buyuktosunoglu, et al.
IEEE Computer Architecture Letters
We consider the floating-point microarchitecture support in RISC superscalar processors. We briefly review the fundamental performance trade-offs in the design of such microarchitectures. We propose a simple yet effective bounds model to deduce the "best-case" loop performance limits for these processors. We compare these bounds to simulated and real performance measurements. From this study, we identify several loop tuning opportunities. In particular, we illustrate the use of this analysis in suggesting loop unrolling and scheduling heuristics. We report our experimental results in the context of a set of application-based loop test cases, which are designed to stress various resource limits in the core (infinite cache) microarchitecture.
Raphael Viguier, Chung Ching Lin, et al.
ICCD 2015
Mateja Putic, Alper Buyuktosunoglu, et al.
DAC 2018
Alper Buyuktosunoglu, Tejas Karkhanis, et al.
ISCA 2003