Mingu Kang, Sungmin Lim, et al.
IEEE JESTCS
This paper presents a robust deep in-memory machine learning classifier with a stochastic gradient descent (SGD)-based on-chip trainer using a standard 16-kB 6T SRAM array. The deep in-memory architecture (DIMA) enhances both energy efficiency and throughput over conventional digital architectures by reading multiple bits per bit line (BL) per read cycle and by employing mixed-signal processing in the periphery of the bit-cell array. Although these techniques improve energy efficiency and latency, DIMA's analog nature makes it sensitive to process, voltage, and temperature (PVT) variations, especially under reduced BL swings. On-chip training enables DIMA to adapt to chip-specific variations in PVT as well as data statistics, thereby further enhancing its energy efficiency. The 65-nm CMOS prototype IC demonstrates this improvement by realizing an on-chip trainable support vector machine. By learning chip-specific weights, on-chip training enables robust operation under reduced BL swing, leading to a 2.4 times reduction in energy relative to an off-chip trained DIMA. During inference, the prototype consumes 42 pJ/decision at 32 M decisions/s, corresponding to 3.12 TOPS/W (1 OP = one 8-b × 8-b MAC), a 21 times reduction in energy and a 100 times reduction in energy-delay product compared with a conventional digital architecture. The energy overhead of training is less than 26% per decision for SGD batch sizes of 128 and higher.
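The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes that the TOPS/W figure is taken at the same 42 pJ/decision operating point, that energy-delay product means energy per decision times delay per decision, and that the 21 times and 100 times ratios refer to the same digital baseline. None of the derived values are stated in the abstract itself.

```python
# Figures quoted in the abstract.
ENERGY_PER_DECISION_J = 42e-12    # 42 pJ/decision during inference
DECISIONS_PER_SECOND = 32e6       # 32 M decisions/s
TOPS_PER_WATT = 3.12              # 1 OP = one 8-b x 8-b MAC
ENERGY_GAIN_VS_DIGITAL = 21.0     # energy reduction vs. digital baseline
EDP_GAIN_VS_DIGITAL = 100.0       # energy-delay product reduction
TRAINING_OVERHEAD_BOUND = 0.26    # < 26% per decision, batch size >= 128


def derived_figures():
    """Return quantities implied by the quoted figures (assumptions above)."""
    # Average inference power: energy per decision times decision rate.
    power_w = ENERGY_PER_DECISION_J * DECISIONS_PER_SECOND

    # TOPS/W equals tera-operations per joule, so multiplying by the
    # energy per decision gives the implied MAC count per decision.
    macs_per_decision = TOPS_PER_WATT * 1e12 * ENERGY_PER_DECISION_J

    # Implied energy of the digital baseline for one decision.
    digital_energy_j = ENERGY_PER_DECISION_J * ENERGY_GAIN_VS_DIGITAL

    # If EDP = energy * delay, the delay ratio is the EDP ratio divided
    # by the energy ratio.
    delay_gain = EDP_GAIN_VS_DIGITAL / ENERGY_GAIN_VS_DIGITAL

    # Upper bound on the extra energy per decision spent on training.
    training_energy_bound_j = ENERGY_PER_DECISION_J * TRAINING_OVERHEAD_BOUND

    return {
        "power_w": power_w,
        "macs_per_decision": macs_per_decision,
        "digital_energy_j": digital_energy_j,
        "delay_gain": delay_gain,
        "training_energy_bound_j": training_energy_bound_j,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions the quoted numbers imply roughly 1.3 mW of inference power, about 131 MACs per decision, and a delay improvement of a little under 5 times over the digital baseline.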