11 Jan 2023
News
6 minute read

How the future of computing might look like nothing before

A Q&A with IBM researcher and recently elected IEEE Fellow Abu Sebastian on in-memory computing, and how his career has been building toward redefining the way computer systems are architected.

IBM Researcher Abu Sebastian.

Our modern world would be unimaginable without computers. Everything we do, from checking the weather and browsing social media to manufacturing goods and running businesses, relies on a technological foundation that started to emerge many decades ago. Today’s digital computers are still built according to design principles proposed by the mathematician John von Neumann in the 1940s. In von Neumann’s architecture, data processing happens in a central processing unit (CPU), while data and programs are confined to a physically separate piece of hardware called the memory.

The von Neumann architecture proved advantageous over alternative models at the time, but with the rapid proliferation of artificial intelligence, its shortcomings are becoming increasingly apparent. Shuttling data back and forth between memory and CPU comes at a high price in both energy consumption and speed when performing AI computations. This is particularly relevant when training deep learning models, which requires frequent updates to model parameters.

Given this AI boom, IBM researchers are looking for alternative hardware architectures that better fit the requirements of AI workloads. One template is the human brain, which itself is a massively complex, but efficient, computing device. Requiring only about 20 watts of average power consumption, our brains excel when it comes to solving complex tasks at a minimal energy cost.

One of the most promising brain-mimicking approaches is in-memory computing (IMC). The idea behind it is simple but powerful: Instead of having distinct compartments for memory and processing, the operations are performed in the memory itself. This removes the need to constantly move data around and gives faster access to stored data. One way to implement IMC is to build a crossbar array of wires, with each crossing point holding a unit of memory storage. The choice of the right material is crucial. A popular candidate for these memory elements is phase-change material, whose electrical resistance can be changed by heating and cooling it, switching the material between a highly conductive crystalline atomic structure and an insulating amorphous one.
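
To make that concrete, here is a minimal numerical sketch of how a crossbar performs a matrix-vector multiplication in a single step: each stored conductance acts as a weight, the inputs are encoded as voltages, and the currents that accumulate along each output wire are the result. All values below are illustrative, not device data.

```python
import numpy as np

# Illustrative sketch of an analog crossbar matrix-vector multiply.
# Conductances G (in siemens) are stored at the crossing points: a
# crystalline (low-resistance) phase-change device means high
# conductance, an amorphous (high-resistance) one means low conductance.
rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))  # 4 output wires x 3 input wires

# The input vector is encoded as voltages applied to the input wires.
V = np.array([0.2, -0.1, 0.3])  # volts

# Ohm's law gives each device current (I = G * V), and Kirchhoff's
# current law sums them along each output wire, so the entire
# matrix-vector product emerges at once, inside the memory array.
I = G @ V  # amperes
print(I)
```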

IBM researcher Abu Sebastian was recently named a fellow of the Institute of Electrical and Electronics Engineers (IEEE) for his significant contributions to the field of in-memory computing. He sat down with us to give his thoughts on his career, the current state of in-memory computing, and where he sees the technology heading.

You were just named an IEEE Fellow for your contributions to the field of in-memory computing. Can you explain how it differs from the way our traditional digital computers work?

Our current computing paradigm is struggling with two converging trends: The slowdown of conventional semiconductor scaling laws and the explosive growth of AI, which is being fueled by compute-intensive algorithms such as deep learning. Together, these bring about two big challenges: rapidly growing energy consumption and stagnating latency.

There’s a sense of urgency around solving those issues, and in-memory computing could provide a path forward. When it comes to energy efficiency, a natural place to look for answers is the human brain, a remarkable information processing engine that consumes as little power as a light bulb. There are two features of the way the brain processes information that are particularly attractive in this context: First, the neurons and synaptic weights in the brain are stationary, and data flows through the physically connected neural network. Second, both synaptic weighting and activation propagation are performed with limited arithmetic precision in an analog domain. These two properties form the foundation for in-memory computing, where we exploit the physical attributes of memory devices and their organization to perform approximate computing, often in the analog domain.
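
As a rough illustration of what computing with limited precision in the analog domain can look like, the sketch below keeps the weights stationary, quantizes them to a small number of coarse levels, and perturbs every read-out with noise. The quantization depth and noise magnitude are assumptions chosen for illustration, not measured device characteristics.

```python
import numpy as np

# Hypothetical illustration of approximate, analog-domain computing:
# stationary "synaptic" weights with limited precision and noisy read-out.
rng = np.random.default_rng(1)
W = rng.standard_normal((256, 256))  # weights stay put in the array
x = rng.standard_normal(256)         # activations flow through it

# Quantize the stored weights to coarse levels (limited storage precision)...
scale = np.abs(W).max() / 8
W_q = np.round(W / scale) * scale

# ...and perturb each analog read with noise (assumed proportional Gaussian).
W_read = W_q + 0.02 * np.abs(W_q) * rng.standard_normal(W_q.shape)

rel_err = np.linalg.norm(W_read @ x - W @ x) / np.linalg.norm(W @ x)
print(f"relative error of the approximate result: {rel_err:.3f}")
```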

The term “memory wall” refers to a limitation of current digital computers that in-memory computing could help overcome. What does the term mean and how can your research address that fundamental bottleneck?

In traditional digital computers, there is a physical separation between memory and processing, which means that data need to be shuttled back and forth between the two units. Over the years, processing units have become faster and more energy efficient at a rapid pace, whereas memory access and data transfer have lagged far behind, to the point that even if processing took zero energy, we would still consume a lot of energy and time just accessing and moving data. This is what’s referred to as the memory wall. As the name suggests, in-memory computing tackles this problem directly by computing in the memory itself.
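
A back-of-the-envelope calculation makes the imbalance vivid. The energy figures below are rough, order-of-magnitude values in the spirit of published 45 nm estimates (Horowitz, ISSCC 2014), not measurements of any particular system.

```python
# Back-of-the-envelope illustration of the memory wall, using rough,
# order-of-magnitude energy figures (not measurements of any real chip).
E_FP_MULT = 3.7e-12    # joules per 32-bit floating-point multiply
E_DRAM_READ = 640e-12  # joules per 32-bit read from off-chip DRAM

# One multiply whose two operands must both be fetched from DRAM:
movement_over_compute = (2 * E_DRAM_READ) / E_FP_MULT
print(f"data movement costs ~{movement_over_compute:.0f}x the arithmetic")
# -> a few hundred x: even if the multiply were free, little would change.
```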

Let’s back up a little. What brought you to IBM Research?

During my PhD, I was trained in mathematical engineering, which spans control theory, signal processing, and communications, among other things. This served as a foundation when I ventured into different fields of research. My initial foray was into nanoscale dynamics and control, which is what brought me to the IBM Research Zurich lab to work on a high-profile project called Millipede. During my PhD and initial years at IBM, I made some key contributions to the fields of nanopositioning, nanoscale sensing, and atomic force microscopy.

However, I became increasingly fascinated by nanoelectronic devices, especially those that can store information in terms of their atomic configurations, like phase-change memory (PCM). Over almost a decade of highly enjoyable work, my team and I made key contributions to understanding various aspects of PCM device physics. We also introduced new ways of designing PCM devices, proposed the use of simpler materials with nanoscale confinement to address the challenge of reset current, and developed new ways to write and read out information on PCM devices.

How did you get interested in in-memory computing?

Around eight years ago, we were getting increasingly fascinated by the nascent field of in-memory computing and felt that our leadership role in PCM technology, in particular our expertise in storing analog conductance values in single PCM devices, could lead us to make substantial contributions to the field. I also received a European Research Council grant that gave significant impetus to this work.

In subsequent years, we demonstrated ways to realize both synaptic and neuronal elements using PCM devices and established ways to achieve software-equivalent classification accuracies for both inference and training. We also showed several applications of in-memory computing in scientific computing, signal processing, reservoir computing, database query, and hyperdimensional computing. And with the help of the newly formed IBM Research AI Hardware Center and the global IBM Research teams, we proved many of these concepts in silicon by fabricating multiple generations of in-memory computing chips in 90 nm and 14 nm CMOS technology with embedded PCM. Together with external collaborators, we also extended the field to the photonic domain.
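
For a sense of what storing an analog conductance value involves in practice, devices are typically programmed with an iterative program-and-verify loop: apply a pulse, read the device back, and correct the residual error. The sketch below shows only that control logic against a made-up linear device model; the update rule, noise level, and conductance range are illustrative assumptions, not PCM physics.

```python
import numpy as np

rng = np.random.default_rng(2)

def apply_pulse(g, delta):
    """Hypothetical device model: a programming pulse nudges the
    conductance by roughly delta, with stochastic programming noise."""
    return float(np.clip(g + delta + 0.5e-6 * rng.standard_normal(), 0.0, 100e-6))

def program_and_verify(g_target, tol=1e-6, max_iters=20):
    g = 0.0  # start from the fully RESET (amorphous, low-conductance) state
    for _ in range(max_iters):
        error = g_target - g             # verify: read back and compare
        if abs(error) < tol:
            break
        g = apply_pulse(g, 0.5 * error)  # program: apply a partial correction
    return g

target = 25e-6  # desired analog conductance, in siemens
g_final = program_and_verify(target)
print(f"programmed {g_final * 1e6:.2f} uS against a target of {target * 1e6:.2f} uS")
```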

What do you see as the most promising implementations of in-memory computing?

Sometimes, in-memory computing also refers to approaches where some digital logic operations are performed right next to the memory tile, typically on the same memory chip. In the narrower version of in-memory computing that we work on, we perform computations collectively in the memory without having to read back the individual memory contents. In this category, the most advanced implementations use static random-access memory (SRAM)-based compute elements. However, SRAM is volatile, meaning it loses the stored information when powered down, and it will be challenging to reach gigabytes of on-chip storage capacity with SRAM. In contrast, PCM and resistive random-access memory (RRAM) retain their information even when powered down and can potentially offer higher areal density. A real game changer, however, would be a 3D memory that could be used for in-memory computing, as this would increase the weight capacity significantly.

Where does in-memory computing stand today?

In-memory computing is a very active field of research, both in academia and industry. The main electronics conferences, such as ISSCC, IEDM, and VLSI, all regularly have multiple sessions on in-memory computing, and the explosive growth of AI is further fueling research into IMC. Much of the initial work was based on simulations, single-device data, or small test chips. More recently, we’ve been seeing increasingly sophisticated, fully integrated chips, including those designed and fabricated within the IBM Research AI Hardware Center. There are even a few startups that have shown some impressive prototypes.

Where do you see the technology going in the next five to 10 years? What challenges are ahead?

In the next few years, I foresee many more technical advances, such as improved compute precision and compute density, highly optimized peripheral circuitry, and optimally architected systems in which IMC cores are effectively integrated with the rest of the digital compute blocks.

The biggest challenge will be to disrupt the existing hardware and software infrastructure that has been built up over the last half century. However, we have reached a point where disruptive ideas need to be tried, not just to bring down the energy cost of computing, but also to enable new applications. For example, what if our phones could translate across multiple languages in real time without access to cloud resources, or if we could run a large language model like GPT-3 in a toaster, as Geoff Hinton recently suggested? Fields such as autonomous driving, robotics, finance, and medicine will all benefit from these capabilities. IMC is also amenable to new AI paradigms that transcend deep learning. That’s why I am quite optimistic about the future of IMC.

Finally, what piece of advice would you give a young researcher interested in entering this field?

In-memory computing is a multidisciplinary field with lots of interesting problems that still need to be tackled. I would strongly encourage young researchers entering the field to first educate themselves on the fundamental open questions and try to address them rather than targeting potentially eye-catching contributions that, in the long run, may have little impact on the field.
