Publication
IBM J. Res. Dev
Paper
Deep learning acceleration based on in-memory computing
Abstract
Performing computations on conventional von Neumann computing systems results in a significant amount of data being moved back and forth between the physically separated memory and processing units. This costs time and energy, and constitutes an inherent performance bottleneck. In-memory computing is a novel non-von Neumann approach, where certain computational tasks are performed in the memory itself. This is enabled by the physical attributes and state dynamics of memory devices, in particular, resistance-based nonvolatile memory technology. Several computational tasks such as logical operations, arithmetic operations, and even certain machine learning tasks can be implemented in such a computational memory unit. In this article, we first introduce the general notion of in-memory computing and then focus on mixed-precision deep learning training with in-memory computing. We demonstrate the efficacy of this approach by training a multilayer perceptron network on the MNIST dataset to high accuracy. Moreover, we show how the precision of in-memory computing can be further improved through architectural and device-level innovations. Finally, we present system aspects, such as the high-level system architecture, including core-to-core interconnect technologies, and high-level ideas and concepts of the software stack.
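The abstract does not spell out how mixed-precision training works, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that each weight is stored on a memory device that can only be changed in coarse steps of size `epsilon`, and that a separate high-precision accumulator `chi` collects the small gradient updates until they add up to at least one device step. The function name `mixed_precision_update` and all parameter names (`chi`, `epsilon`, `noise_std`) are hypothetical.

```python
import random


def mixed_precision_update(weights, chi, grads, lr, epsilon,
                           noise_std=0.0, rng=None):
    """Apply one gradient step to coarse-grained 'device' weights.

    weights   : list of floats standing in for device-stored weights
    chi       : list of floats, high-precision accumulator (assumed)
    grads     : list of floats, gradient for each weight
    lr        : learning rate
    epsilon   : assumed size of one programmable device step
    noise_std : assumed std. dev. of the error per programming pulse
    """
    rng = rng or random.Random(0)
    for i, g in enumerate(grads):
        # Accumulate the desired update in high precision.
        chi[i] -= lr * g
        # Number of whole device steps the accumulator can pay for.
        pulses = int(abs(chi[i]) / epsilon)
        if pulses:
            sign = 1.0 if chi[i] > 0 else -1.0
            for _ in range(pulses):
                # Each pulse moves the device by roughly one step.
                weights[i] += sign * epsilon + rng.gauss(0.0, noise_std)
            # Keep only the remainder that was not transferred.
            chi[i] -= sign * pulses * epsilon
    return weights, chi


if __name__ == "__main__":
    w, acc = [0.0], [0.0]
    for _ in range(4):
        mixed_precision_update(w, acc, grads=[-0.3], lr=1.0, epsilon=0.25)
        print(w, acc)
```

With `noise_std=0.0`, the sum of the device weight and the accumulator always equals the total update requested so far, which is the point of keeping the remainder in high precision: small updates are deferred rather than lost to the coarse device granularity.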