Resilient low voltage accelerators for high energy efficiency
Low voltage architecture and design are key enablers of high throughput per watt in heterogeneous, accelerator-rich manycore designs. However, such low voltage operation poses significant challenges because of difficulties in achieving reliable functionality of on-chip memories, particularly SRAMs at these design points. In this paper, we present a technique of low-voltage neural network acceleration, where the embedded SRAM architecture is equipped with a novel applicationaware supply voltage boosting capability. This technique mitigates low-voltage induced failures, while enabling Very low voltage (VLV)1 operation during most of the application run, resulting in substantial improvement in net energy efficiency. We present a framework to evaluate the impact of low-voltage SRAM errors on machine learning applications and characterize trade-offs between output inference accuracy and energy efficiency in our application-programmable supply boosted SRAM architecture. Using the proposed technique we push the limits on the minimum operable voltage (Vmin) for the desired output quality. As a proof of concept, we demonstrate these techniques on Dante, a Deep Neural Network (DNN) accelerator chip taped out in state-of-the art 14nm technology.