Meet AI-Hilbert, a new algorithm for transforming scientific discovery

In a new Nature Communications paper, IBM researchers and collaborators outline an ‘AI scientist’ called AI-Hilbert that turns existing theories and data into new, consistent, interpretable, mathematical models. With this new tool, they hope to revolutionize the very process of scientific discovery.

If you can’t make a new discovery on your own, the limiting factor may be your human brain. At least, that’s what IBM Research mathematician and senior manager Lior Horesh suspects. Humans have long attempted to outsource tasks to other people or to time-saving devices, but what if machines could address challenges we’ve never been able to tackle, like solving the unanswered mysteries of our universe?

It sounds lofty, but Horesh and his team at IBM Research are trying to do just that. AI-Hilbert, described in a new Nature Communications paper, is an ‘AI scientist’ that can generate and derive new equations from a combination of existing theory and data, with the goal of filling in gaps in scientific knowledge, including introducing new theories. While there are fruitful efforts to utilize AI to accelerate discovery, it’s important to note AI-Hilbert is addressing a complementary objective: how we can augment the scientific method itself, rather than accelerate its various steps independently.

Horesh, along with project co-lead Sanjeeb Dash, an optimization expert at IBM, and their team, developed AI-Hilbert to aid the discovery of these unknowns, which can only be done by symbiotically integrating theoretical knowledge and empirical data. They achieved their aims with the help of scientists who came through IBM’s Herman Goldstine Memorial Postdoctoral Fellowship for mathematical and computer sciences.

The personal motivation behind this project is a crisis that many scientists may relate to: As the years add up behind the bench, that one big discovery has not yet happened.

“With my own gray-hair-to-gray-matter ratio, I will not be able to come up with a Nobel-worthy discovery,” Horesh says. “If I cannot do that, maybe I can build a machine that can.” This is where AI-Hilbert comes in.

Chatbots and other systems are bringing generative AI to multiple aspects of life, from customer service to writing computer code. But higher-order tasks like formal reasoning and scientific innovation are still primarily human territory. Discovery is difficult for AI because it is not just about generating many equations; the results must also stand up to experimentation and theoretical validation. This is a high bar to meet.

But Horesh, Dash, and their team are in the business of scientific creativity — and for them, that mission involves more math than art. They’re seeking mathematical models of the universe, consistent links between what goes into a system and what comes out. The art of crafting these models comes with some major challenges, one of which involves complexity. “We often make some compromises between the complexity of the model and the realism of how descriptive it is,” Horesh says.

Scientists must also balance other objectives when devising mathematical models, including accuracy, universality, and interpretability. With AI-Hilbert, named for mathematician David Hilbert (whose work on algebraic geometry transformed the whole field), the team attempted to strike a balance among these traits. Mathematical modeling can be divided into two paradigms: The first-principles approach that’s based on deductive derivations of fundamental laws, and the data-driven approach based on inductive generation of hypotheses from empirical observations.

The first-principles models are often quite neat, universal, and compact. Galilei, Kepler, Newton, Einstein, and Maxwell (Horesh’s favorite of the bunch) all discovered equations that describe fundamental laws of physics — all without massive servers full of data to draw upon. Maxwell, Horesh points out, used only four compact equations to describe “everything you ever wanted to know about electromagnetism.”

These physicists had something in common: They were working before digital computers existed.

Data-driven discovery, on the other hand, is a more modern approach. This strategy uses the vast swathes of available scientific data to determine equations linking cause and effect in a system, whether it involves traffic patterns, brain activity, or air turbulence. This approach is great when scientists have tons of data but little in the way of formal knowledge, Horesh says.
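To make the data-driven idea concrete, here is a minimal sketch that recovers a power law from observations alone. It uses Kepler's third law only as a familiar test case, with approximate textbook values for six planets. The function name `fit_power_law` and the choice of a log-log least-squares fit are assumptions made for this illustration; this is not how AI-Hilbert itself works.

```python
import math

# Approximate textbook values: (semi-major axis in AU, orbital period in years).
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.86),
    "Saturn": (9.537, 29.46),
}


def fit_power_law(points):
    """Least-squares fit of log(p) = k * log(d) + b; returns (k, exp(b))."""
    xs = [math.log(d) for d, _ in points]
    ys = [math.log(p) for _, p in points]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    k = sxy / sxx
    b = y_mean - k * x_mean
    return k, math.exp(b)


exponent, prefactor = fit_power_law(list(PLANETS.values()))
print(f"period = {prefactor:.3f} * distance^{exponent:.3f}")
```

The fitted exponent lands close to 3/2, which is the familiar statement that the square of the period scales with the cube of the distance. The fit assumes a power-law form in advance, which is exactly the kind of pre-determined functional form that can make data-driven models rigid.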

But collecting data can be expensive and difficult, and some data-driven models can be overly rigid as they attempt to fit the data into a pre-determined functional form. Others can be hard to interpret and understand. The ideal outcome is equations that are frugal and economical, capturing the intricacies of a system in the simplest possible terms while still describing it accurately. Many equations may appear to achieve this aim, but in reality, they merely fit the data. In the absence of theory, they may all seem plausible. Often, AI will generate equations that “overfit” the data, spitting out relations that generalize poorly.
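A small sketch shows what overfitting looks like in practice. The data here are synthetic and invented purely for illustration: five points drawn from the law y = 2x with a small alternating error added. One model is a straight line fitted by least squares; the other is a degree-four polynomial forced through every point. The helper names `fit_line` and `interpolate` are hypothetical.

```python
# Synthetic data (an assumption for illustration): true law y = 2x plus small noise.
TRAIN_X = [0.0, 1.0, 2.0, 3.0, 4.0]
NOISE = [0.1, -0.1, 0.1, -0.1, 0.1]
TRAIN_Y = [2 * x + e for x, e in zip(TRAIN_X, NOISE)]


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


def interpolate(xs, ys):
    """Lagrange polynomial that passes exactly through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def max_train_error(model):
    """Largest absolute miss on the training points."""
    return max(abs(model(x) - y) for x, y in zip(TRAIN_X, TRAIN_Y))


line = fit_line(TRAIN_X, TRAIN_Y)
wiggly = interpolate(TRAIN_X, TRAIN_Y)

# Compare both models on a point outside the training range.
x_new = 6.0
true_y = 2 * x_new
line_error = abs(line(x_new) - true_y)
wiggly_error = abs(wiggly(x_new) - true_y)

print(f"training error: line {max_train_error(line):.3f}, polynomial {max_train_error(wiggly):.3f}")
print(f"error at x = {x_new}: line {line_error:.3f}, polynomial {wiggly_error:.3f}")
```

The polynomial matches the training data perfectly, yet it misses badly at the new point, while the simpler line stays close to the true law. With data alone, nothing flags the polynomial as the worse choice.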

AI-Hilbert changes the process of discovery at its core, unifying the otherwise sequential process of generating hypotheses (either in a data-driven fashion or via deductive theoretical derivation) and then testing them against theoretical knowledge or empirical data. To make this integration possible, the project team chose to narrow down the language used to express background theory and hypotheses, and convey knowledge in the form of multivariate polynomial expressions to enable the use of algebraic geometry machinery, an area collaborator El Khadir specialized in.

Figure: AI-Hilbert unifies hypothesis generation and hypothesis testing, augmenting the traditional process of scientific discovery.

“Allowing only polynomial expressions in AI-Hilbert did restrict the set of problems that could be tackled, but we could do more with this restricted set of problems than we could do before,” says Dash, who also manages a team of optimizers and probabilists. “We could search over the space of polynomial expressions that are consistent with the background theory and explain the data.”

So far, AI-Hilbert has successfully replicated influential scientific laws, including Kepler’s third law of planetary motion, the Hagen-Poiseuille equation, Einstein’s time dilation law, and the radiated gravitational wave power equation. It has also shown that it can rediscover the Bell inequalities of quantum mechanics.

AI-Hilbert does this by ingesting every relevant piece of symbolic background knowledge (referred to as axioms), in formal logic. Then the algorithm combines these axioms with data to search for new expressions that both conform with the background theory and honor the data, all while minimizing complexity. For now, the team has made the AI-Hilbert code available to anyone who wants to try a new path to discovery.
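What it means for a polynomial expression to conform with background theory can be sketched with a toy example. Assume three simplified axioms for a light planet on a circular orbit: Newton's law of gravitation, the centripetal force law, and the link between angular velocity and period. Each is written as a polynomial that equals zero. A candidate law q is consistent with these axioms if it can be written as a combination of them with polynomial multipliers. The axioms, the multipliers a1 to a3, and the tiny polynomial helpers below are all assumptions made for this illustration, and the multipliers are written by hand. How AI-Hilbert searches for such combinations is not shown here.

```python
from collections import defaultdict

# Symbols in a fixed order; "c" stands for the constant 2*pi.
VARS = ("G", "m1", "m2", "d", "F", "w", "p", "c")


def mono(coeff, **powers):
    """One-term polynomial, e.g. mono(-1, G=1, m1=1, m2=1) is -G*m1*m2."""
    return {tuple(powers.get(v, 0) for v in VARS): coeff}


def add(*polys):
    """Sum polynomials, dropping terms whose coefficients cancel."""
    out = defaultdict(int)
    for poly in polys:
        for exps, coeff in poly.items():
            out[exps] += coeff
    return {e: k for e, k in out.items() if k != 0}


def mul(a, b):
    """Multiply two polynomials term by term."""
    out = defaultdict(int)
    for ea, ka in a.items():
        for eb, kb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ka * kb
    return {e: k for e, k in out.items() if k != 0}


# Simplified axioms, each meaning "this polynomial equals zero".
h1 = add(mono(1, F=1, d=2), mono(-1, G=1, m1=1, m2=1))  # F*d^2 - G*m1*m2
h2 = add(mono(1, F=1), mono(-1, m2=1, w=2, d=1))        # F - m2*w^2*d
h3 = add(mono(1, w=1, p=1), mono(-1, c=1))              # w*p - 2*pi

# Candidate law: m2 * (G*m1*p^2 - (2*pi)^2 * d^3), Kepler's third law in this setting.
q = add(mono(1, G=1, m1=1, m2=1, p=2), mono(-1, c=2, m2=1, d=3))

# Hand-written polynomial multipliers for the three axioms.
a1 = mono(-1, p=2)
a2 = mono(1, p=2, d=2)
a3 = add(mono(1, m2=1, d=3, w=1, p=1), mono(1, m2=1, d=3, c=1))

# q is derivable from the axioms if q == a1*h1 + a2*h2 + a3*h3 exactly.
combo = add(mul(a1, h1), mul(a2, h2), mul(a3, h3))
certificate_holds = combo == q
print("q follows from the axioms:", certificate_holds)
```

Because every axiom equals zero, any such combination equals zero too, so the identity acts as a short, checkable proof that q follows from the theory. This sketch covers only the theory-consistency half of the search; fitting the data and minimizing complexity are separate requirements.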

This work builds on the team’s previous effort in this area, another AI scientist tool called AI-Descartes, named for the French philosopher and scientist René Descartes, who emphasized the importance of deductive reasoning in scientific discovery rather than mere reliance upon empirical evidence. AI-Descartes followed the classical sequence, generating hypotheses from data and then testing them against theory. It is meant to work especially well with noisy real-world data, because testing each hypothesis against theory ensures that the system does not get distracted by every blip. The eventual goal is to provide the answers that have eluded scientists.

The 20th-century physicist Paul Dirac noted that the early days of quantum mechanics saw a boom in discoveries. The young field had many open questions, so whenever a scientist solved one, they could make an important scientific contribution. “It was very easy in those days for any second-rate physicist to do first-rate work,” Dirac said. “There has not been such a glorious time since. It is very difficult now for a first-rate physicist to do second-rate work.” Horesh feels this in his bones, pointing out that the rate of new discoveries is slowing down.

Perhaps we have picked all the low-hanging fruit, and now it’s time to reach beyond human capacity. AI-Hilbert is just one step along the road to evolving the scientific method. The team behind AI-Hilbert has shown that it can independently derive existing laws of physics with just a modest amount of data and theory. They hope to not only tackle some of the unanswered questions in physics, especially in areas where little data is provided and the underlying background theory is incomplete, but also to further transform the scientific method.


Notes

Note 1: This included Bachir El Khadir (now at Two Sigma) and Ryan Cory-Wright (now a faculty member at Imperial College). Another team member and former IBMer, Cristina Cornelio, now works at Samsung AI.
