Machine Learning for Drug Development and Causal Inference

Combining foundation models with causal inference to accelerate early drug development


The Machine Learning for Drug Development and Causal Inference group develops machine learning models for innovative drug discovery technologies and brings them to fruition for IBM clients. Our researchers believe that drug discovery can benefit from technologies that learn from the rich clinical, omics, and molecular data now being collected in large quantities. The team’s vision is that, building on recent advances in AI and foundation models, biomedical foundation models can drive critical tasks in computational drug discovery, with a focus on omics data analysis.

Such analysis yields models that can differentiate between cell states using very little labeled data. For example, they can identify stages of disease progression, responses to treatment, drug resistance, and more. However, finding new protein targets for drug development requires uncovering the underlying mechanisms that lead to these differences. That, in turn, requires accounting for potential confounding variables in order to distinguish the genes and pathways that drive a difference from those that are merely affected by it. To this end, we use our open-source Causallib library, applying bias correction through causal inference to estimate the causal effect of each potential effector gene.
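To make the role of confounding concrete, here is a minimal sketch of bias correction by stratified (backdoor) adjustment on a toy dataset. It is an illustration only: it does not use Causallib's API, and the confounder, the "gene high/low" treatment flag, the outcome values, and all function names are hypothetical assumptions chosen so that the true effect is known to be 1.0.

```python
from collections import defaultdict


def make_toy_records():
    """Build (confounder z, treatment a, outcome y) tuples.

    Assumption for illustration: y = 2*z + 1*a, so the true effect of a is 1.0.
    The confounder z (say, a cell-state flag) also makes a=1 more likely,
    which biases a naive comparison.
    """
    counts = {(0, 0): 80, (0, 1): 20, (1, 0): 20, (1, 1): 80}
    records = []
    for (z, a), n in counts.items():
        records.extend([(z, a, 2.0 * z + 1.0 * a)] * n)
    return records


def mean(values):
    return sum(values) / len(values)


def naive_difference(records):
    """Difference in mean outcome between treated and untreated, ignoring z."""
    treated = [y for _, a, y in records if a == 1]
    control = [y for _, a, y in records if a == 0]
    return mean(treated) - mean(control)


def adjusted_difference(records):
    """Average the within-stratum differences, weighted by stratum size."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, a, y in records:
        strata[z][a].append(y)
    total = len(records)
    effect = 0.0
    for z, groups in strata.items():
        if not groups[0] or not groups[1]:
            raise ValueError(f"stratum {z!r} lacks treated or untreated samples")
        weight = (len(groups[0]) + len(groups[1])) / total
        effect += weight * (mean(groups[1]) - mean(groups[0]))
    return effect


records = make_toy_records()
naive = naive_difference(records)
adjusted = adjusted_difference(records)
print(f"naive difference:    {naive:.2f}")
print(f"adjusted difference: {adjusted:.2f}")
```

In this toy setup the naive comparison overstates the effect, because samples with the treatment flag set are concentrated in the stratum that already has higher outcomes, while the adjusted estimate recovers the value built into the data. Real omics data would involve many continuous confounders, where model-based approaches such as those offered by causal inference libraries are more appropriate than simple stratification.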

Together, these technologies combine the latest advances in generative AI and foundation models with well-established data analysis methods to provide reliable tools for preclinical drug discovery.