Accelerating clinical trials

Developing AI and analytics to understand the drivers of study or clinical trial efficiency.


Clinical trials are considered the gold standard for studying whether new drugs or interventions are safe and effective in humans. Still, several known challenges introduce inefficiencies that make trials difficult, expensive, and slow to execute and evaluate. Our aim is to develop Machine Learning (ML) and Artificial Intelligence (AI) techniques that help transform key steps in clinical trials, improving their efficiency through better design, recruitment, or engagement. We do this by identifying or creating novel composite features or biomarkers that are important to trial efficiency. We then develop models to study the impact of these features in practice using real-world data. We also aim to identify and extract new features that span domains (e.g., social and behavioral features that slow down trial execution or skew outcomes). Our research can be summarized in three parts:

  • Knowledge Representation and Organization – Defining novel representations that integrate clinical trial data with other relevant data (social, behavioral, clinical, etc.) to support, for example, a composite 360° view of an individual patient, cohort or study.
  • Information Extraction and Discovery of New Features – Developing advanced analytics and language models that can help extract new biomarkers, features or patterns from multiple sources and types of data.
  • Advanced Analytics and Machine Learning – Developing AI models that leverage our representation and extraction tools and lead to a better understanding of trial inefficiencies (e.g., recruitment of a more diverse trial population, avoiding dropouts).


Together with IBM’s Deep Search platform, we are working on a number of ML and AI technologies to address the unmet needs of studies and trials:

  • Knowledge Graphs and Ontologies
  • Natural Language Processing Tools and Language-based Models
  • Statistical Graph Networks (Inference on Bayesian Networks and Functional Graphical Models)
  • Predictive Modeling and Feature Selection, Deep Search Frameworks and Graph Neural Networks
  • IBM Deep Search

Selected Assets

  • RCT-extract: an extractor of more than 80 entities in clinical trials (available via Deep Search)
  • Health & Social Person-centric Ontology: an ontology connecting the clinical and social determinants of health around a 360° view of an individual (available via HSPO)
  • UMLS Tagger: a UMLS-based annotator that detects concepts in text (available via Deep Search)
  • Relation Extraction Module: a classifier that detects semantic relations in text (available via Deep Search)

Research Collaborations

  • Cleveland Clinic Foundation

Our team is working with experts at the Cleveland Clinic as part of the Accelerate Discovery Ecosystem for Healthcare. The goal is to help identify, extract, and understand the impact of new biomarker features (such as the social and behavioral determinants of health) in clinical trials.

  • Scaling up Proactive Digital Integrated Care (SEURO)

SEURO is an EU-funded Horizon 2020 project in which a team from IBM Research Dublin will investigate novel techniques to examine outputs of clinical trials for predicting the long-term impact of adopting new technologies. We are developing new tools to identify the success of clinical trials (as measured by engagement or clinical endpoints). Tools developed through this work can contribute to risk stratification for patient selection and to understanding features that may be linked to engagement. Please see a video of SEURO’s ProInsight tool below.

  • Human Behavior Change Project

Our teams have also been working with behavioral scientists, computer scientists, and systems architects on the Human Behavior Change Project (HBCP). Through this project, we have used Natural Language Processing and Machine Learning to extract information from intervention evaluation reports and to answer key questions about the evidence.