Publication
MRS Fall Meeting 2022
Talk

Optimized Weight Programming for Analog Memory-based Deep Neural Networks

Abstract

Analog memory-based Deep Neural Networks (DNNs) provide energy-efficiency and per-area throughput gains relative to state-of-the-art digital counterparts such as graphics processing units (GPUs). Recent advances focus largely on hardware-aware algorithmic training and on improvements in circuits, architectures, and memory device characteristics. Optimal translation of software-trained weights into analog hardware weights, given the plethora of complex memory non-idealities, represents an equally important goal in realizing the full potential of this technology. We report a generalized computational framework that automates the process of crafting complex weight programming strategies for analog memory-based DNNs in order to minimize accuracy degradation during inference, particularly over time. This framework is agnostic to DNN structure and is shown to generalize well across Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Transformer networks. Being a highly flexible numerical heuristic, our approach can accommodate arbitrary device-level complexity and is thus broadly applicable to a variety of analog memories and their continually evolving device characteristics. Notably, this computational technique can optimize inference accuracy without the need to run inference simulations or evaluate large training, validation, or test datasets. Lastly, by quantifying the limit of achievable inference accuracy given imperfections in analog memory, weight programming optimization represents a unique and foundational tool for enabling analog memory-based DNN accelerators to reach their full inference potential.