Analyzing the energy-efficiency of dense linear algebra kernels by power-profiling a hybrid CPU/FPGA system
Abstract
FPGA accelerators have been shown to outperform pure CPU systems for highly parallel applications, and they are considered a power-efficient alternative to software-programmable processors. However, when FPGA accelerator cards are used in a server environment, multiple sources of power consumption must be taken into account in order to rate the system's energy efficiency. In this paper, we study the energy efficiency of a hybrid CPU/FPGA system for a dense linear algebra kernel. We present an FPGA GEMM accelerator architecture that can be tailored to various data types. Its performance and energy consumption are compared against tuned, multi-threaded GEMM functions running on the host CPU. We measure the power consumption with internal current/voltage sensors and break down the power draw by system component in order to classify the energy consumed by the processor cores, the memory, the I/O bus system, and the FPGA card. Our experimental results show that the FPGA-accelerated DGEMM is less energy-efficient than a multi-threaded software implementation with respect to the full system's power consumption, but is the most efficient choice when only the dynamic parts of the power are factored in. © 2014 IEEE.
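The distinction between full-system and dynamic-only energy efficiency can be illustrated with a minimal Python sketch. It is not the measurement code used in the paper. The function names, the evenly spaced sampling, the rectangle-rule integration, and the single idle-power baseline are all assumptions made for illustration, and every power value is an invented placeholder rather than a measured result. The only fixed fact used is the conventional GEMM operation count of 2·m·n·k.

```python
def gemm_flops(m, n, k):
    """Conventional operation count for C = A*B with A (m x k) and B (k x n)."""
    return 2 * m * n * k


def energy_joules(samples_w, dt_s):
    """Integrate evenly spaced power samples (watts) with the rectangle rule."""
    return sum(samples_w) * dt_s


def efficiency_gflop_per_joule(flops, samples_w, dt_s, idle_w=0.0):
    """GFLOP per joule; a nonzero idle_w subtracts the static (idle) share.

    idle_w = 0.0 gives the full-system figure, while idle_w set to the idle
    power of the system gives the dynamic-only figure.
    """
    runtime_s = dt_s * len(samples_w)
    energy = energy_joules(samples_w, dt_s) - idle_w * runtime_s
    if energy <= 0:
        raise ValueError("idle power exceeds the measured power")
    return flops / energy / 1e9


# Hypothetical placeholder traces, NOT data from the paper.
flops = gemm_flops(1000, 1000, 1000)
cpu_trace = [200.0] * 4    # watts, sampled every 0.5 s (2 s run)
fpga_trace = [130.0] * 8   # watts, sampled every 0.5 s (4 s run)
idle_w = 100.0             # assumed idle power of the whole system

for name, trace in (("cpu", cpu_trace), ("fpga", fpga_trace)):
    full = efficiency_gflop_per_joule(flops, trace, 0.5)
    dynamic = efficiency_gflop_per_joule(flops, trace, 0.5, idle_w)
    print(name, full, dynamic)
```

With these placeholder numbers the slower, lower-power run comes out behind on the full-system figure but ahead once the idle share is subtracted. This shows only that the two metrics can rank configurations differently, and it says nothing about the magnitudes measured in the paper.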