Irem Boybat-Kara
IEDM 2023
Transformer-based Large Language Models (LLMs) demand large weight capacity, efficient computation, and high-throughput access to large amounts of dynamic memory. These challenges present significant opportunities for algorithmic and hardware innovation, including Analog AI accelerators. In this paper, we describe recent progress on Phase Change Memory-based hardware and architectural designs that address these challenges for LLM inference.