Reducing instruction fetch energy with backwards branch control information and buffering
Abstract
Many emerging applications, e.g. in the embedded and DSP space, are often characterized by their loopy nature where a substantial part of the execution time is spent within a few program phases. Loop buffering techniques have been proposed for capturing and processing these loops in small buffers to reduce the processor's instruction fetch energy. However, these schemes are limited to straight-line or innermost loops and fail to adequately handle complex loops. In this paper, we propose a dynamic loop buffering mechanism that uses backwards branch control information to identify, capture and process complex loop structures. The DLB controller has been fully implemented in VHDL, synthesized and timed with the IBM Booledozer and Einstimer Synthesis tools, and analyzed for power with the Sequence PowerTheater tool. Our experiments show that the DLB approach, on average, results in a factor of 3 reduction in energy consumption compared to a traditional instruction memory design at an area overhead of about 9%.