Active feature acquisition

Learning complex patterns in knowledge graphs


This project is about learning patterns in knowledge graphs (KGs), with a focus on digital healthcare applications such as triage. In brief, a patient's medical record can be modelled as a graph in which nodes represent encounters (i.e., visits to the doctor), symptoms, or medical tests, and edges capture how the nodes are related. Our goal is to learn the connectivity patterns among nodes to help healthcare professionals make decisions, for example, which medical test to carry out to speed up the diagnosis of a (rare) disease.

View of a full medical record KG (left), and the KG with encounters, observations, and conditions (right).
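
To make the graph model concrete, here is a minimal sketch of a tiny patient record stored as typed nodes and labelled edges. The node identifiers (`enc1`, `fever`, ...), the relation names (`has_symptom`, `ordered_test`, `followed_by`), and the helper functions are hypothetical and chosen only for illustration; they are not the project's actual schema or code.

```python
from collections import defaultdict

# Hypothetical nodes: identifier -> node type (encounter, symptom, or test).
nodes = {
    "enc1": "encounter",
    "enc2": "encounter",
    "fever": "symptom",
    "cough": "symptom",
    "blood_test": "test",
}

# Hypothetical edges as (source, relation, target) triples.
edges = [
    ("enc1", "has_symptom", "fever"),
    ("enc1", "has_symptom", "cough"),
    ("enc1", "ordered_test", "blood_test"),
    ("enc1", "followed_by", "enc2"),
    ("enc2", "has_symptom", "cough"),
]


def build_adjacency(edge_list):
    """Group outgoing (relation, target) pairs by source node."""
    adjacency = defaultdict(list)
    for source, relation, target in edge_list:
        adjacency[source].append((relation, target))
    return adjacency


def neighbours(adjacency, node, relation=None):
    """Return the targets reachable from `node`, optionally filtered by relation."""
    return [
        target
        for rel, target in adjacency.get(node, [])
        if relation is None or rel == relation
    ]


adjacency = build_adjacency(edges)
print(neighbours(adjacency, "enc1", "has_symptom"))  # ['fever', 'cough']
```

A triple list like this is a common starting point because it converts directly into the edge-index and edge-type arrays that graph learning libraries typically expect.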

The project is part of a larger effort within IBM Research to accelerate scientific discovery in healthcare. The team is composed of Mykhaylo Zayats, Víctor Valls, and Alessandra Pascale from the Dublin lab.

The project's short-term goals are:

  • The design of graph neural networks (GNNs) that can learn complex patterns in KGs
  • The development of accelerated algorithms for training GNNs that can handle sparse and incomplete data
  • The validation of the approach within the digital health use case of medical triage

A challenge of working with healthcare KGs is that the graph is often incomplete, which is analogous to working with corrupted data in classical ML (e.g., images with missing pixels). The mid-term goal is to develop algorithms that can discover the unknown parts of healthcare KGs and thereby improve the quality of medical action recommendations.

In the long term, we expect the tool to be applied to other domains such as chemistry, drug discovery, and clinical trials.