In traditional computing systems, calculations are performed by repeatedly shuttling instructions and data between memory and the processor. AI workloads, however, have much higher computational requirements and operate on large quantities of data. So as you infuse AI into application workflows, it is critical to have a heterogeneous system in which CPU cores and AI cores are tightly integrated on the same chip to support very low-latency AI inference.
Telum achieves exactly that, providing a dedicated AI resource for future IBM systems alongside the traditional horsepower of the CPU. The CPU cores are effective for general-purpose software applications, while the AI cores are highly efficient at running deep learning workloads, and the tight coupling of the two types of cores enables fast data exchange between them.
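To make the latency argument concrete, here is a minimal Python sketch of the two dispatch paths. The function names and latency constants are invented for illustration only; they are not measured Telum figures. The structural point is that the on-chip path avoids the round trip to a separate inference server entirely.

```python
import time

# Hypothetical latency figures, for illustration only -- not measured
# Telum numbers. What matters is the shape of the two paths, not the
# absolute values.
OFF_PLATFORM_ROUND_TRIP_MS = 10.0   # network hop to a separate inference server
ON_CHIP_DISPATCH_MS = 0.01          # hand-off to a co-located AI core

def score_off_platform(features):
    """Ship the features to a remote scoring service and wait for the reply."""
    time.sleep(OFF_PLATFORM_ROUND_TRIP_MS / 1000)
    return 0.02  # placeholder risk score

def score_on_chip(features):
    """Dispatch to the on-die AI accelerator; the data never leaves the chip."""
    time.sleep(ON_CHIP_DISPATCH_MS / 1000)
    return 0.02  # placeholder risk score

def handle_transaction(features, scorer):
    """Run one scoring call and report how long the transaction waited on it."""
    start = time.perf_counter()
    risk = scorer(features)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return risk, elapsed_ms

for name, scorer in (("off-platform", score_off_platform),
                     ("on-chip", score_on_chip)):
    _, ms = handle_transaction([42.0, 1.0, 0.0], scorer)
    print(f"{name:>12}: {ms:.3f} ms per inference")
```

At tens of thousands of transactions per second, the difference between these two paths is what determines whether inference can sit inside the transaction at all.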
Telum continues IBM's history of taking a full-stack approach to system design, co-optimizing silicon technology, hardware, firmware, operating systems, and middleware for our clients' most critical workloads. With Telum, clients can process tens of thousands of AI-infused transactions every second.
The chip contains eight processor cores running at a clock frequency of more than 5GHz, optimized for the demands of enterprise-class workloads. The completely redesigned cache and chip-interconnection infrastructure provides 32MB of cache per core, or 256MB across the chip's eight cores. The chip also packs 22 billion transistors and 19 miles of wire on 17 metal layers.
This horsepower is necessary to keep up with the demands of enterprise-grade AI solutions. AI models are getting larger, as are the datasets they rely on, and the systems used to deploy them are getting increasingly complex. Telum is the first IBM product developed using research advances from the AI Hardware Center. We believe the future of AI is moving from systems that rely on reams of data and machine learning models toward systems that can reason and infer more like humans do, bringing context and nuance to future business decisions. We call this Neurosymbolic AI, and we're working on our own hardware to support this vision, not just in the lab but in the enterprise.
AI keeps proliferating into more and more enterprise systems. Even today, the fraud detection systems many financial institutions use to determine whether transactions are legitimate are based on AI. But even the fastest of these systems tend to react seconds, minutes, or hours after a transaction happens, and they tend to be quite compute-intensive. By then, the bad actor has had time to get away with the stolen goods or cash. According to a recent Federal Trade Commission report, consumers lost more than $3.3 billion to fraud in 2020, up from $1.8 billion in 2019.
With Telum, financial institutions will be able to move from fraud detection to fraud prevention, catching instances of fraud while the transaction is still in progress. The dedicated AI core in Telum makes future IBM systems ideal for deploying AI across a wide range of financial services workloads, such as payment and loan processing, clearing and settling trades, detecting money laundering, and risk analysis. And the potential use cases go beyond banking: our research team believes these AI cores could also apply to natural language processing, computer vision, speech, genomics, and more, enabling the AI revolution powered by deep learning.
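To illustrate the detection-versus-prevention distinction, here is a minimal Python sketch of an in-transaction scoring path. The `infer_risk` stub, the threshold, and the latency budget are all invented for this sketch; a real deployment would invoke a compiled deep learning model on the accelerator. The key structural idea is that the risk score is produced, and acted on, before the transaction commits.

```python
import random
import time

RISK_THRESHOLD = 0.9       # hypothetical decline threshold
LATENCY_BUDGET_MS = 1.0    # hypothetical budget the scorer must meet

def infer_risk(features):
    """Stand-in for a model running on the AI accelerator.

    Returns a dummy score so the sketch runs anywhere.
    """
    return random.random()

def process_payment(txn):
    """Score the transaction before it commits: prevention, not detection."""
    features = [txn["amount"], float(txn["merchant_id"]), float(txn["hour"])]
    start = time.perf_counter()
    risk = infer_risk(features)
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > LATENCY_BUDGET_MS:
        # Scoring missed the budget: approve now and flag for offline review.
        # This fallback is the after-the-fact "detection" path the text describes.
        return {"status": "approved", "flagged": True, "risk": risk}
    if risk > RISK_THRESHOLD:
        # Fraud blocked while the transaction is still in flight.
        return {"status": "declined", "flagged": False, "risk": risk}
    return {"status": "approved", "flagged": False, "risk": risk}

print(process_payment({"amount": 250.0, "merchant_id": 7, "hour": 3}))
```

The design choice worth noting is the fallback branch: in-transaction prevention only works if the model reliably answers within the budget, which is exactly what co-locating the AI cores with the CPU cores is meant to guarantee.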
We are at a pivotal moment in computing, in the midst of a transition where AI is being applied everywhere. That has implications for the hardware and systems we design and the software we build. Such transitions don’t happen often, perhaps once every 10 to 30 years. The last significant shift into a new form of compute was in the ’80s. With the new artificial intelligence workloads we’re facing, there’s an opportunity for great ideas from our team at IBM Research, working closely with IBM Systems, to create new AI compute capabilities and lay the foundation for the next generation of computing.