Performance and Energy-Aware Characterization of the Sparse Matrix-Vector Multiplication on Multithreaded Architectures
Abstract
The end of Dennard scaling (i.e., the ability to shrink the feature size of integrated circuits while maintaining a constant power density) has placed energy on a par with performance as a primary design principle, all the way from the hardware to the application software. Along this line, optimizing the performance-energy balance of the 7/13 "dwarfs", introduced by UC Berkeley in 2006, represents a challenge with a potentially tremendous impact on a vast number of scientific applications. However, without careful modeling and a subsequent understanding of the performance-energy interaction, the optimization of these kernels is doomed to fail. In this paper we investigate the performance-power-energy characterization of the sparse matrix-vector product (SpMV), a challenging kernel due to its indirect and irregular memory access pattern, which constitutes the key ingredient of the sparse linear algebra dwarf. In the first part of our analysis we identify a reduced set of critical features (based on statistics about the sparse matrix structure) that impact the performance, power, and energy consumption of a baseline implementation of SpMV. We then generate a small synthetic sparse benchmark collection (the training set) that we use to build (i) a general classification of sparse matrices and (ii) a model to accurately predict the performance and energy consumption of any SpMV. Both tools are based on the features (parameters) that emerged from the first part of our study, and they are validated using the entire University of Florida Matrix Collection, run on two high-end multithreaded architectures.
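To make the indirect access pattern of SpMV concrete, the following is a minimal sketch of a sequential kernel together with a few structural statistics of the kind the abstract alludes to. It is illustrative only: the abstract does not name the storage format or the exact features, so the compressed sparse row (CSR) layout, the function names `spmv_csr` and `row_length_stats`, and the chosen statistics (nonzeros per row) are assumptions, not the baseline implementation or the feature set used in the paper.

```python
from statistics import mean, pstdev


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in CSR format (assumed layout).

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read through col_idx: this indirection is the source
            # of the irregular memory access pattern.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def row_length_stats(row_ptr):
    """Hypothetical structural features: statistics of nonzeros per row."""
    lengths = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return {
        "nnz": row_ptr[-1],
        "mean_nnz_per_row": mean(lengths),
        "std_nnz_per_row": pstdev(lengths),
        "max_nnz_per_row": max(lengths),
    }


# Example: the 3x3 matrix
#   [[4, 0, 1],
#    [0, 2, 0],
#    [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
stats = row_length_stats(row_ptr)
```

In a multithreaded setting the outer loop over rows is the natural unit of parallelism, which is one reason why the distribution of nonzeros per row is a plausible candidate feature for load balance and memory behavior.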