IEDM 2020: Advances in memory, analog AI and interconnects point to the future of hybrid cloud and AI

At this year’s IEDM, IBM researchers will describe advances in key hardware infrastructure components, including memory, analog AI, and interconnects.

The symbiotic connection between AI and hybrid cloud might not be obvious at first glance, but a closer look reveals important connections between the two—connections that will only strengthen as the technologies mature.

You need AI, for example, to effectively automate the application modernization, business process mapping, and workload migration that go into creating successful hybrid cloud environments. Hybrid clouds, meanwhile, give businesses the ability to harness the enormous amounts of computing power needed to train increasingly complex AI models, which need a lot of data to generate useful predictions or observations. Hybrid cloud and AI share another common denominator: their growth and maturation depend largely on improvements in the systems on which they operate.

This is exactly what we at IBM are working on.

At this year’s IEEE International Electron Devices Meeting (IEDM 2020), held virtually December 12-18, IBM researchers will describe a number of breakthroughs aimed at advancing the key hardware infrastructure components needed to meet those demands, including Spin-Transfer Torque Magnetic Random-Access Memory (STT-MRAM), analog AI hardware, and advanced interconnect scaling.

One team will reveal the first 14 nm node STT-MRAM, which addresses the memory-compute bottlenecks that hinder the performance of hybrid cloud systems. Another will discuss advances in analog AI based on phase-change memory and the “analog advantage” in AI training. A third area of research addressed at the conference is interconnect scaling, which enables powerful computing and a roadmap of the performance improvements needed for AI in hybrid cloud environments. Let’s look at each in more detail.

Breaking memory bottlenecks with advanced MRAM

Data transfer bottlenecks have long been a problem for large workloads and create a challenge for running AI workloads in hybrid cloud environments. STT-MRAM uses electron spin to store data in magnetic domains, combining the high speed of Static RAM (SRAM) and the high density of DRAM—both of which rely on electrical charges for storage—to offer a more dependable storage solution.

IBM Research’s 14 nm node embedded MRAM, debuting at IEDM, is the most advanced MRAM demonstrated to date. It features circuit design and process technology that could soon enable system designers to replace SRAM with twice the amount of MRAM in last-level CPU cache. The use of last-level cache—also called system cache—reduces the amount of reading and writing to memory, which in turn reduces system latency and power consumption while increasing bandwidth. This new 14 nm node MRAM, as described in the paper, “A 14 nm Embedded STT-MRAM CMOS Technology,” will help solve hybrid clouds’ data bottlenecks and allow for a much more efficient, higher-performing system.

In a second STT-MRAM paper, “Demonstration of Narrow Switching Distributions in STT-MRAM Arrays for LLC Applications at 1x nm Node,” IBM demonstrates advanced magnetic materials with high-speed switching of 3 ns and tight distributions of the switching current. Optimizing switching speed characteristics is another key step toward the use of MRAM as last-level cache. By speeding up the exchange between memory and compute, this enhanced design promises to deliver a much more efficient, higher-performing system.

Together, these advances point to MRAM’s steady march toward the superior density and increased speed needed to replace SRAM in CPU caches. That would be a whole new application for MRAM, which is typically used today as either a replacement for NAND flash memory or a stand-alone storage chip, and it would significantly increase data retrieval performance.

Analog AI takes aim at more energy-efficient neural networks

Existing hardware platforms have difficulty processing the large amounts of data needed to take advantage of Deep Neural Networks (DNNs) to train AI models: either the systems are too slow or they require too much power to be practical in most data centers. Analog in-memory computing—which combines compute and memory into a single device—will play a significant role in the development of AI hardware that can, with greater energy efficiency, train increasingly complex models. IBM Research is presenting research at IEDM that addresses the synaptic weight mapping problem for inference and demonstrates the analog advantage in training.

The accurate mapping of synaptic weights onto analog non-volatile memory devices for deep learning inference is a considerable challenge to developing analog AI cores. Synaptic weight indicates the strength of a connection between two nodes in a neural network. In the paper, “Precision of Synaptic Weights Programmed in Phase-Change Memory Devices for Deep Learning Inference,” IBM researchers discuss how analog resistance-based memory devices such as PCM in in-memory computing applications could address the mapping challenge. Their work addresses how to accurately map the synaptic weights analytically and through array-level experiments. The paper also analyzes the impact of inaccuracy associated with synaptic weight storage on a range of networks for some common AI applications: image classification and language modeling.
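
The mapping problem described above can be illustrated with a minimal sketch. It is not the method from the paper: it assumes a common textbook scheme in which each signed weight is stored as the difference between two device conductances, and it models programming inaccuracy as simple Gaussian noise. The names `G_MAX`, `weights_to_conductances`, `program_with_noise`, and `conductances_to_weights`, as well as all numeric values, are hypothetical and chosen only for illustration.

```python
import random

G_MAX = 25.0  # assumed maximum device conductance in microsiemens (illustrative)


def weights_to_conductances(weights, g_max=G_MAX):
    """Map each signed weight to a (g_plus, g_minus) conductance pair.

    The weight is represented by g_plus - g_minus, scaled so that the
    largest-magnitude weight uses the full conductance range.
    """
    w_max = max(abs(w) for w in weights) or 1.0
    scale = g_max / w_max
    pairs = []
    for w in weights:
        g = abs(w) * scale
        pairs.append((g, 0.0) if w >= 0 else (0.0, g))
    return pairs, scale


def program_with_noise(pairs, sigma, rng):
    """Add Gaussian programming error to every device, clipping at zero."""
    return [
        (max(0.0, gp + rng.gauss(0.0, sigma)), max(0.0, gm + rng.gauss(0.0, sigma)))
        for gp, gm in pairs
    ]


def conductances_to_weights(pairs, scale):
    """Read back the effective weights from the conductance pairs."""
    return [(gp - gm) / scale for gp, gm in pairs]


if __name__ == "__main__":
    target = [0.5, -1.0, 0.25]
    pairs, scale = weights_to_conductances(target)
    noisy = program_with_noise(pairs, sigma=0.5, rng=random.Random(0))
    stored = conductances_to_weights(noisy, scale)
    errors = [abs(s - t) for s, t in zip(stored, target)]
    print("largest weight error:", max(errors))
```

With `sigma` set to zero the read-back weights match the targets exactly. Raising it shows how device-level programming error turns into weight error, which is the kind of inaccuracy whose impact on network accuracy the paper analyzes.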

A second analog AI paper, “Unassisted True Analog Neural Network Training Chip,” details the first analog neural network training chip—a resistive processing unit, or RPU—to demonstrate the elusive “analog advantage” in AI training. Analog advantage occurs when analog neural network training is faster than a comparable digital system in real time. The researchers achieved this speedup by performing all Multiply and Accumulate (MAC) functions in analog cross-point arrays and updating all weights in parallel.
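
The two operations named above, multiply-and-accumulate in a cross-point array and a parallel weight update, can be sketched in a few lines. This is a digital simulation of the idea under simplifying assumptions, not the chip's implementation: weights are treated as ideal signed values, and the function names `crossbar_forward` and `parallel_update` are hypothetical. In an analog array the sums and the updates happen physically in all cells at once; the loops here only reproduce the arithmetic.

```python
def crossbar_forward(G, v):
    """Column outputs of a cross-point array.

    Each cell contributes G[i][j] * v[i] (Ohm's law), and the contributions
    along a column add up (Kirchhoff's current law), so column j yields
    sum_i G[i][j] * v[i]: one multiply-and-accumulate per column.
    """
    rows, cols = len(G), len(G[0])
    return [sum(G[i][j] * v[i] for i in range(rows)) for j in range(cols)]


def parallel_update(G, x, delta, lr):
    """Rank-1 (outer-product) update applied to every cell.

    Each cell changes by -lr * x[i] * delta[j], the standard gradient step
    for a fully connected layer with input x and output error delta.
    """
    return [
        [G[i][j] - lr * x[i] * delta[j] for j in range(len(delta))]
        for i in range(len(x))
    ]


if __name__ == "__main__":
    G = [[1.0, 2.0], [3.0, 4.0]]
    print(crossbar_forward(G, [1.0, 1.0]))
    print(parallel_update(G, x=[1.0, 0.0], delta=[1.0, 2.0], lr=0.5))
```

A digital processor has to walk through every cell for both operations, so the work grows with the number of weights. Doing both in place and in parallel is what makes a training speedup over digital hardware possible in principle.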

Navigating advanced interconnect scaling challenges

Innovations in chip interconnect technology will play a more subtle, though no less important, role in improving hybrid cloud performance. Interconnects that fail to scale with AI workloads can constrain overall system performance, regardless of how powerful AI accelerators, memory, and other components are.

One IBM Research paper at IEDM 2020, “Interconnect Scaling Challenges, and Opportunities to Enable System-Level Performance Beyond 30 nm Pitch,” details the interconnect advances needed to increase system performance, especially in hybrid cloud environments. Focus must shift from traditional scaling and device performance toward providing the high interconnectivity needed to support heterogeneous systems. As this work progresses, researchers will keep looking for ways to offset the signal delays caused by shrinking the dimensions of interconnect wires below 30 nm pitch.

Another interconnect technology paper, “Topological Semimetals for Scaled Back-End-Of-Line Interconnect Beyond Cu,” explores the need for new interconnect materials to counter the resistance bottleneck inherent in shrinking interconnect wiring dimensions. The researchers analyze two semimetals, cobalt silicide (CoSi) and tantalum arsenide (TaAs), as potential solutions, illustrating the physics behind the enhanced current conduction that occurs with downward scaling.
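
To see why shrinking wires creates a resistance bottleneck, consider the basic relation for a rectangular wire, R = ρL / (w·h). The sketch below evaluates it for two cross-sections. The dimensions are illustrative and not taken from the paper, the function name `wire_resistance` is hypothetical, and the bulk copper resistivity is used throughout. Real nanoscale copper wires have a higher effective resistivity because of surface and grain-boundary scattering, so this simple model understates the problem.

```python
RHO_CU_BULK = 1.68e-8  # bulk copper resistivity in ohm-metres, near room temperature


def wire_resistance(rho, length, width, height):
    """Resistance in ohms of a rectangular wire: R = rho * L / (w * h)."""
    return rho * length / (width * height)


if __name__ == "__main__":
    length = 1e-6  # 1 micrometre of wire (illustrative)
    wide = wire_resistance(RHO_CU_BULK, length, width=30e-9, height=60e-9)
    narrow = wire_resistance(RHO_CU_BULK, length, width=15e-9, height=30e-9)
    print("resistance ratio after halving width and height:", narrow / wide)
```

Halving both the width and the height quadruples the resistance for the same length, even before any size-dependent increase in resistivity is counted. That geometric penalty is the motivation for materials whose conduction improves, rather than degrades, as dimensions shrink.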

At IEDM 2020, IBM researchers are exploring the important roles that memory, analog AI and interconnects play in building the flexible, composable systems crucial to the evolution of both hybrid cloud and AI. These technologies require robust, reliable hardware capability to perform optimally for businesses. Through its work presented at IEDM, IBM Research is proving such an advanced system infrastructure is well within our grasp.