
Five takeaways from this year’s IBM AI Hardware Forum

A panel at the 2023 IBM AI Hardware Forum.

We’ve all seen the explosive growth in AI’s popularity this year. But beyond powerful chatbots and fun consumer applications, this year has really shown how AI can be applied to real business uses at the scale enterprise needs.

This summer, IBM launched watsonx, the new AI and data platform specifically designed to bring these sorts of enterprise use cases to life. But to realize the true potential of AI at scale, the world is going to need chips designed specifically for AI workloads, and infrastructure that’s flexible, robust, and secure enough to meet enterprise demand.

That’s why hundreds gathered at IBM Research’s Yorktown Heights, NY headquarters on November 30 to attend the fifth AI Hardware Forum. The event brings together semiconductor researchers, AI practitioners, and technologists from around IBM, enterprise, academia, and government agencies. Hosted by the IBM Research AI Hardware Research Center, the AI Hardware Forum aims to be a venue for the industry to come together to discuss the challenges and opportunities in designing infrastructure to support future AI workloads.

Mukesh Khare, general manager of IBM Semiconductors and VP of hybrid cloud research at IBM, opened the event, explaining how much AI has advanced over the last decade. “We’ve transitioned from AI being fun to AI being mission-critical,” Khare said. But many key advancements are still required to realize AI’s true potential, many of which involve those in attendance that day. “At the heart of this AI journey is the semiconductor industry,” he said.

General Manager of IBM Semiconductors Mukesh Khare opening up the day's proceedings.

Across the day’s discussions and keynotes, several key themes emerged. Here’s a quick rundown of the main takeaways from this year’s forum:

The new technology scaling AI workloads

While there have been countless breakthroughs in computer hardware over the decades, the primary devices used for AI training and inference today were not specifically designed for those tasks. GPUs were designed to render graphics for video games and multimedia, and many accelerators like FPGAs are not particularly dense or power efficient. To lower latency and increase chip speeds, the memory and processing units need to be closer than ever before.

A cluster of AIU chips installed at IBM Research's Think Lab in Yorktown Heights.

There are several ways to tackle this problem, and many were discussed at the forum. Jeff Burns, director of the IBM Research AI Hardware Center, announced that roughly 100 low-power IBM AIU chips on PCIe cards were installed in a completely new rack design at the Think Lab in Yorktown Heights. IBM’s AIU chip, first unveiled at last year's forum, is now powering the inference cluster and running internal IBM production workloads.

The AIU is designed for enterprise generative AI and optimized for inference tasks. The new research cluster is a complete system implementation containing storage, network, and compute nodes running Red Hat Linux and the OpenShift software stack designed for watsonx. The internal production workload currently running is using roughly eight times less power, with comparable throughput, than our pool of GPUs that have been optimized for training.
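To make the power comparison above concrete, here is a minimal sketch of the arithmetic in Python. The wattage and throughput figures are hypothetical placeholders chosen only to match the "roughly eight times less power, comparable throughput" ratio; they are not measurements from IBM's cluster, and the function name is an illustrative assumption.

```python
def throughput_per_watt(throughput: float, power_watts: float) -> float:
    """Return work done per watt, e.g. requests per second per watt."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return throughput / power_watts


# Hypothetical placeholder figures, NOT measurements from the article.
gpu_power_w = 8000.0             # assumed power draw of the GPU pool
aiu_power_w = gpu_power_w / 8    # "roughly eight times less power"
throughput = 1000.0              # "comparable throughput": assumed equal here

gpu_eff = throughput_per_watt(throughput, gpu_power_w)
aiu_eff = throughput_per_watt(throughput, aiu_power_w)

# With equal throughput, the efficiency gain equals the power ratio.
efficiency_gain = aiu_eff / gpu_eff
print(f"Throughput per watt improves by about {efficiency_gain:.1f}x")
```

Under these assumptions, equal throughput at one-eighth the power works out to about eight times the throughput per watt. If the throughput were only approximately equal, the gain would scale by the same proportion.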

Several other technologies were discussed in detail at the forum, including the future of analog AI chips. While still very much in the research stage, these chip ideas have the potential to greatly improve the energy efficiency of running AI models, as well as reducing latency with in-memory computing. And that’s important for the future of AI, according to Matt Baker, SVP for AI Strategy at Dell Technologies. “AI energy efficiency must be significantly improved to pave the way for mainstream enterprise adoption,” he said.

This year, IBM researchers had several breakthroughs in analog AI, two of which were published in Nature and Nature Electronics respectively. The first design showed it could be possible to run NLP keyword utterance tasks on analog hardware considerably faster than digital counterparts, with next to no energy when it isn’t actively looking for a wake word. And the second design was shown to be as capable at computer vision AI tasks as digital counterparts, while being considerably more energy efficient. “Analog has tremendous promise going forward from a performance perspective,” IBM’s Burns said.

Another area of interest was chiplets, which break the monolithic system on a chip down into its component parts. Chiplet devices could lead to entirely new ways of building processors for AI.

One issue that pervaded the day was the argument that while there are many research projects underway, a bottleneck in the current availability of chips for AI persists. But between these projects maturing, and more investments in hardware production, that will change soon. “Right now if you’re out there trying to procure hardware, it’s pretty difficult,” said Dell’s Baker. “But there’s a coming dam break.”

Developing fabs of the future — and their workforce

During a presentation at the forum on the state of AI compute, IDC’s Ashish Nadkarni said that there are 83 new foundries around the world that have either been announced or are under construction, with investments in countries beyond those that have traditionally produced chips over the last few decades.

IDC’s Nadkarni leading a panel discussion on the role of government in semiconductors and AI.

One of the biggest investments in the future of AI chip production was on full display at the forum. Last year, Rapidus, a new company founded by semiconductor experts with backing from eight of the biggest Japanese tech companies, announced that it would license IBM’s nanosheet technology to develop and package 2 nanometer node chips. Yasumitsu Orii, the senior managing executive officer of the 3D assembly department at Rapidus, explained how the project has been going. So far, close to 100 Rapidus researchers are working at IBM's Albany NanoTech Complex to learn more about the 2 nm production process. The company is on track to start producing at scale in 2027.

When asked by a member of the audience why Rapidus chose the city of Chitose in Hokkaido for its fab location, and whether the close proximity to Sapporo and its famous brewery was a factor, Orii said that much like with beer, semiconductor production uses a lot of fresh water, so the location was ideal. Or as he put it: “Good water, good beer, good semiconductors.”

As for the workforce of tomorrow, a discussion between government, research, and academic groups at the forum produced a few ideas for how to ensure there’s a robust pipeline for AI practitioners and semiconductor fabricators alike. Right now, labor is at a premium, partially due to a lack of programs to educate people on cutting-edge semiconductors and fabrication methods. But new programs at colleges and universities with strong ties to research and production facilities could help close the gap. “The workforce of five years from now is [going to be] very different,” SUNY’s Provost-in-Charge Shadi Shahedipour-Sandvik said.

Albert Heuberger, executive director of Germany’s Fraunhofer Institute for Integrated Circuits, said that the institute’s close ties to industry and academia are a model to replicate. Staff brought in on limited-time contracts can learn from their experience at the institute and move on to a role within industry. SUNY’s Shahedipour-Sandvik took the concept even further, suggesting that co-locating educational spaces within industry is a key way to ensure that future workers are ready for the workforce as quickly as possible. By being in those spaces, academic institutions can tailor their curricula to what’s actually happening on production lines today, ensuring that students are ready for the specific challenges of the industry as soon as they graduate. “It’s a very smooth transition: your curriculum is tailored,” Shahedipour-Sandvik said.

Open-source will be key to advancing AI

While there are a few louder companies dominating the AI conversation right now, innovation won’t be able to continue without burgeoning open-source communities. And we’re starting to see that. “Shared problems are solved faster,” said Steven Huels, Red Hat’s general manager for AI. “Never in the history of AI has it been accessible as it is today.”

There are several challenges for companies that want to adopt AI workflows, from ensuring AI workloads are consistent and repeatable, to the complexity of AI platforms, to managing AI systems on a fleet of devices. Repeated failures with rolling out AI tend to lead to a lack of confidence in AI technology itself, which limits the potential for a business, said Huels. This is where open source can help.

Priya Nagpurkar, IBM’s VP for hybrid cloud and AI platform research, said IBM sees open-source tools like PyTorch as critical to the watsonx platform. “Everybody just reinventing the wheel is just not efficient,” she said.

Integrating with open, hybrid, and cloud-native systems allows you to get the most out of hardware choices, Nagpurkar said. “This will help make the ROI on AI effective.”

PyTorch is the most widely adopted machine-learning framework for foundation models in the world. Ownership of the framework was transferred from Meta (where it was built within Facebook) to the Linux Foundation last year. Soumith Chintala, VP of AI Research at Meta, was one of the original founders of PyTorch, and spoke with Nagpurkar at the forum. In his view, PyTorch and its growing governance community have the potential to make it easier for anyone (enterprises, software designers, and hardware manufacturers alike) to get into AI that can make a difference. “The role we want to play with PyTorch is to facilitate entry to the AI market,” Chintala said.

Advanced chip packaging is at the center of the AI chip future

For years, when you thought about advances in computing, you were usually thinking about smaller or more performant processors. What those were packaged in was often less considered. But in recent years, as processors have gotten down to the nanoscale, the connection between the materials used for chips and what they’re packaged in has gotten much closer.

The future of AI will require even more advanced packaging solutions than those we’ve seen today. In a presentation, Raja Swaminathan, AMD’s corporate VP for packaging, said that the cost of adding transistors to a chip is actually increasing. Extrapolating current trends, a computing system capable of zettascale computing would likely require about as much energy to run as a nuclear power plant generates. Obviously, something has to change. “We have to bend the curve,” Swaminathan said.

AMD, like IBM and others, is working on advanced packaging concepts that redefine the notion of a chip, such as moving to high-bandwidth memory that’s accessed by CPUs and GPUs together on a single system. Packaging devices up together, or closer together, or even in a three-dimensional space, all have the potential to make future computers faster and more energy efficient. “Energy per bit is reducing, but not when normalized to channel quality,” Swaminathan said.

IBM's Khare checking out one of the demos at the forum.

The AI revolution is here to stay

Throughout history, there have been technology trends that have been hyped to death. Countless experts have sworn that some new technology will be the answer to all our problems in the near future, only for unforeseen hurdles to emerge and interest to eventually die out. AI itself has experienced decades of “winter” before the recent boom. But at this week’s event, the consensus was that the growth the AI industry has experienced recently isn’t about to slow down. “We’re just in the first inning of AI,” IBM’s Khare said. Dell’s Baker took it even further: “We’re not even at batting practice, we’re not even at the stadium yet.”

And adoption is starting to mirror expectations. Red Hat’s Huels argued that while the time to operationalize AI systems is long today, with recent advancements in hardware and infrastructure, that’s likely to fall precipitously in the coming years. “We believe every application that consumers deploy moving forward will have some form of AI in it,” he added.




  1. Note 1: The center in Albany, NY, is focused on enabling next-generation chips and systems that support the tremendous processing power and unprecedented speed that AI requires to realize its full potential.