Causal inference is a set of methods for estimating the effect of an intervention on an outcome from observational data. With the IBM Causal Inference 360 Toolkit, people can use multiple tools to move their decision-making processes from a “best guess” scenario to concrete answers based on data.
The IBM Causality 360 library is an open-source Python library that uses ML models internally and, unlike most packages, allows users to plug in almost any ML model they want. It also provides methods for selecting the best ML models and their parameters, based on ML paradigms such as cross-validation and on both well-established and novel causal-specific metrics.
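To make the plug-in idea concrete, here is a minimal pure-Python sketch. It is not the toolkit's API: the class `FrequencyPropensityModel`, the function `ipw_effect`, and the data are all hypothetical, and we assume a scikit-learn-style `fit`/`predict_proba` convention for the plugged-in model. The causal estimator (here, inverse probability weighting) only relies on that interface, so the model behind it can be swapped freely.

```python
# Illustrative sketch only: NOT the toolkit's API. All names and data are made up.

class FrequencyPropensityModel:
    """Hypothetical stand-in for an ML classifier.

    Estimates P(treated | x) as the treated fraction among units sharing x.
    """

    def fit(self, X, a):
        counts = {}
        for x, treated in zip(X, a):
            n, k = counts.get(x, (0, 0))
            counts[x] = (n + 1, k + treated)
        self.p_treated_ = {x: k / n for x, (n, k) in counts.items()}
        return self

    def predict_proba(self, X):
        # One [P(untreated), P(treated)] pair per unit.
        return [[1.0 - self.p_treated_[x], self.p_treated_[x]] for x in X]


def ipw_effect(model, X, a, y):
    """Inverse probability weighting with any model offering fit/predict_proba."""
    proba = model.fit(X, a).predict_proba(X)
    # Weight each unit by 1 / P(the treatment it actually received | x).
    weights = [1.0 / p[treated] for p, treated in zip(proba, a)]

    def weighted_mean(group):
        pairs = [(w, yi) for w, yi, t in zip(weights, y, a) if t == group]
        return sum(w * yi for w, yi in pairs) / sum(w for w, _ in pairs)

    return weighted_mean(1) - weighted_mean(0)


# Synthetic rows of (covariate, treated, outcome). By construction, the
# treated outcome is 1.0 higher than the untreated one within each group.
rows = (
    [("low", 1, 5.0)] * 30 + [("low", 0, 4.0)] * 70
    + [("high", 1, 9.0)] * 70 + [("high", 0, 8.0)] * 30
)
X, a, y = zip(*rows)
effect = ipw_effect(FrequencyPropensityModel(), X, a, y)
print(round(effect, 6))
```

Because `ipw_effect` never looks inside the model, any other object with the same two methods could be passed in its place, which is the design choice the paragraph above describes.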
At IBM’s research lab in Haifa, Israel, we have been using the causal inference toolkit as part of our work on drug repurposing.[1] Drug repurposing, or repositioning, is a method for finding new therapeutic uses for accepted drugs. Here, the question we asked was: “What would happen if patient X took drug Y?”
The result? The discovery of two potential new treatments for the dementia that typically accompanies Parkinson’s disease. More specifics on how the causal modeling in this research worked can be found in a blog post from April of this year by our colleague Michal Rosen-Zvi.
The team also used the toolkit in a collaboration with Assuta health services, the largest private network of hospitals in Israel, to analyze the impact of COVID-19 on access to care.[2] Specifically, the team analyzed more than 300,000 invitations sent to women for breast screening exams, focusing on instances where the women did not show up for their appointments. The causal analysis revealed that, although it first seemed that the government's nonpharmaceutical interventions caused the no-shows, it was in fact the number of newly infected people that influenced whether the women showed up for their appointments.
In another example, we wanted to understand whether new irrigation practices contribute to a desired reduction in pollution and nutrient runoff. To do this, we used a dataset that captured multiple aspects of the agricultural use of the land, including its irrigation method, and that measured the amount of runoff.
At first, the data showed little effect. Then we used the causal inference toolkit to correct for the fact that the choice of irrigation method depends heavily on the type of land use and the type of crop. The outcome changed: we showed that introducing these novel irrigation techniques does reduce runoff. It could save fertilizer and water and reduce pollution of the watershed. This reduction can be further quantified to estimate the tradeoff between savings and initial investment.
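To see how such a correction can flip a conclusion, consider the following toy sketch. The numbers are synthetic and chosen only for illustration; they are not the data from our study, and the adjustment shown (stratifying by crop type and averaging the per-stratum differences) is one simple technique, not necessarily the one used in the analysis above. The function names are hypothetical.

```python
from collections import defaultdict

# Synthetic records of (crop_type, uses_new_irrigation, runoff).
# Made-up values: crop B has higher baseline runoff AND adopts the new
# irrigation more often, so crop type confounds the raw comparison.
RECORDS = (
    [("A", 0, 10.0)] * 60 + [("A", 1, 8.0)] * 40
    + [("B", 0, 20.0)] * 40 + [("B", 1, 18.0)] * 60
)


def mean(values):
    return sum(values) / len(values)


def naive_difference(records):
    """Raw mean runoff of treated fields minus untreated fields."""
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return mean(treated) - mean(control)


def adjusted_difference(records):
    """Average the within-stratum differences, weighted by stratum size.

    Assumes every stratum contains both treated and untreated records.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, t, y in records:
        by_stratum[stratum][t].append(y)
    total = len(records)
    effect = 0.0
    for groups in by_stratum.values():
        weight = (len(groups[0]) + len(groups[1])) / total
        effect += weight * (mean(groups[1]) - mean(groups[0]))
    return effect


print("naive:   ", naive_difference(RECORDS))
print("adjusted:", adjusted_difference(RECORDS))
```

In this toy data the raw comparison shows no difference at all, while the stratified estimate recovers the reduction of 2 units of runoff that was built into each crop type.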
With the new IBM Causal Inference 360 Toolkit capability and website, we hope to allow people in the field of causal inference to easily apply machine learning methodologies, and to allow ML practitioners to move from asking purely predictive questions to asking “what-if” questions using causal inference.
1. Laifenfeld, D.; Yanover, C.; Ozery-Flato, M.; et al. Emulated Clinical Trials from Longitudinal Real-World Data Efficiently Identify Candidates for Neurological Disease Modification: Examples from Parkinson’s Disease. Front. Pharmacol. (2021).
2. Ozery-Flato, M.; Pinchasov, O.; Dabush-Kasa, M.; et al. Predictive and Causal Analysis of No-Shows for Medical Exams During COVID-19: A Case Study of Breast Imaging in a Nationwide Israeli Health Organization. medRxiv 2021.03.12.21253358 (2021).