Analog in-memory computing can be used for two different deep learning tasks: training and inference. The first step is to train a model, typically on a labeled dataset. For instance, if you want your model to recognize different images, you would provide a set of labeled images during training. Once the model has been trained, it can be used for inference.
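The train-then-infer workflow above can be sketched in a few lines of Python. This is a minimal, hypothetical example (a toy dataset and a single linear unit, not any IBM model): gradient descent fits the model to labeled points, and the trained model then classifies points it has never seen.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Labeled dataset": points above the line y = x get label 1, below get 0.
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 1] > X[:, 0]).astype(float)

# Training: fit a single linear unit (logistic regression) by gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= 0.5 * (X.T @ (p - y) / len(y))      # cross-entropy gradient step
    b -= 0.5 * np.mean(p - y)

# Inference: the trained model classifies previously unseen points.
def predict(x):
    return int((x @ w + b) > 0)

print(predict(np.array([0.0, 0.9])))  # a point above the line y = x
print(predict(np.array([0.9, 0.0])))  # a point below the line y = x
```

Training is the expensive loop over the whole dataset; inference is a single cheap evaluation, which is why the two tasks place such different demands on hardware.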
Like most computing today, the training of AI models is a digital process performed on traditional computers with traditional architectures. These systems move information from memory into a queue and then to the CPU, where it is processed.
AI training can require vast amounts of data, all of which must move through this queue on its way to the CPU. The result is what is called "the von Neumann bottleneck," which can severely limit computation speed and efficiency. IBM Research is exploring technologies that can train AI models faster and with less energy, without the bottleneck created by data queuing. These technologies are analog: they represent information as a variable physical quantity, the way sound is captured in the wiggles of the grooves of a vinyl LP. We are exploring two types of devices for training: resistive random-access memory (RRAM) and electrochemical random-access memory (ECRAM). Both devices can store and process information. Because no data has to be transferred through a queue from memory to CPU, tasks can be performed in a fraction of the time while requiring much less energy.
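A short simulation makes the "store and process in the same place" idea concrete. In a resistive crossbar, each weight is stored as a device conductance; applying input voltages makes physics (Ohm's law at each crosspoint, Kirchhoff's current law along each row) perform the multiply-accumulate where the data already lives. The array size and conductance values below are illustrative, not a description of IBM's hardware.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each weight is stored as a device conductance G (in siemens) at a crosspoint.
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # 4 output rows x 3 input columns

# The input vector is applied as voltages on the columns.
V = np.array([0.2, 0.5, 0.1])              # volts

# Ohm's law gives a current G[r, c] * V[c] at each crosspoint, and Kirchhoff's
# current law sums the currents along each row -- the whole multiply-accumulate
# happens in the memory array, in one step:
I = G @ V                                   # amperes, one output current per row

# The same result computed step by step "digitally", for comparison:
I_digital = [sum(G[r, c] * V[c] for c in range(3)) for r in range(4)]
print(np.allclose(I, I_digital))
```

No weight ever travels to a CPU: only the small input and output vectors move, which is where the time and energy savings come from.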
Inference is the act of reaching a conclusion from known facts. Humans do this effortlessly, but when performed by a computer, inference is expensive and slow. IBM Research is taking on that challenge with an analog approach. When you think of analog, you might remember the pre-digital world of Polaroid instant cameras or vinyl LPs. Digital information is represented as long strings of 1s and 0s; analog information is represented as a continuously varying physical quantity, like the grooves in a record. Phase-change memory (PCM) is at the heart of our analog AI inference chips. It is a highly tunable analog technology that can both store and compute on information using pulses of electricity. The result is a considerably more energy-efficient chip.
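The "highly tunable" part can be sketched as follows: electrical pulses nudge a PCM cell's conductance anywhere within a continuous window, so a network weight can be written as an analog value rather than a string of bits. The conductance window, mapping, and noise level here are illustrative assumptions, not measured PCM characteristics.

```python
import numpy as np

rng = np.random.default_rng(2)

G_MIN, G_MAX = 0.1e-6, 10e-6   # assumed usable conductance window (siemens)

def program_weight(w):
    """Map a weight in [-1, 1] onto a cell conductance, with write noise."""
    target = G_MIN + (w + 1) / 2 * (G_MAX - G_MIN)
    noise = rng.normal(0.0, 0.01 * (G_MAX - G_MIN))  # imperfect analog write
    return np.clip(target + noise, G_MIN, G_MAX)

def read_weight(g):
    """Invert the mapping to recover the stored weight."""
    return 2 * (g - G_MIN) / (G_MAX - G_MIN) - 1

w = 0.37
g = program_weight(w)
w_read = read_weight(g)
print(w, w_read)   # the readback is close to, but not exactly, the target
```

The small write noise is a real trade-off of analog storage; neural networks tolerate it well, which is part of why inference is such a good fit for PCM.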
We are using PCM as a synaptic cell, an AI term for a single unit of information, or weight. Our analog AI inference chips have over 13 million of these PCM synaptic cells arranged in an architecture that allows us to build a large physical neural network populated with pretrained data, meaning it's ready to jam and perform inference on your AI workloads.
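To show what "populating a physical network with pretrained data" might look like, here is one common scheme from the analog-AI literature: a signed weight is encoded as the difference of two all-positive conductances (G_plus − G_minus), one cell pair per weight. Whether IBM's chip uses exactly this encoding is not stated above, so treat the scheme and all numbers as illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

G_MAX = 10e-6  # assumed maximum cell conductance, in siemens

def encode(W):
    """Split a signed weight matrix across two all-positive conductance arrays."""
    scale = G_MAX / np.abs(W).max()
    G_plus = np.where(W > 0, W, 0.0) * scale    # positive weights
    G_minus = np.where(W < 0, -W, 0.0) * scale  # magnitudes of negative weights
    return G_plus, G_minus, scale

def analog_matvec(G_plus, G_minus, scale, x):
    """Difference of the two rows' output currents, rescaled to weight units."""
    return (G_plus @ x - G_minus @ x) / scale

W = rng.normal(size=(4, 3))   # a small hypothetical pretrained layer
x = rng.normal(size=3)
print(np.allclose(analog_matvec(*encode(W), x), W @ x))
```

In this scheme each weight occupies two synaptic cells, so an array of millions of cells maps directly to a large pretrained layer that performs inference in place.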