
Composite biomarker discovery

Modern discovery and development of new patient treatments depend on clinical biomarkers from multiple data modalities, including clinical records, imaging, and molecular data, to identify personalized composite phenotypes.
Archived

Overview

Modern discovery and development of new patient treatments depend on clinical biomarkers from multiple data modalities, including clinical records, imaging, and molecular data, to identify personalized composite phenotypes. These phenotypes, and the modalities they are extracted from, operate at different biological, contextual, and time scales. Multimodal representations can accelerate the detection and quantification of composite clinical biomarkers and improve patient stratification, enabling better prediction of disease progression and treatment response.

Enabling technology

The generation of multimodal representations is facilitated by several IBM technologies that generate and integrate modality-specific encodings. These range from molecular analyses, through deep learning on medical images, tissue images, and clinical data, to multiple data integration and fusion techniques (see FuseMedML). While biomarkers from modality-specific representations can enhance clinical trials through patient stratification and disease staging, an integrated 360-degree view of the patient that also encodes the patient journey can accelerate the next generation of discovery by revealing the connections between biomarkers across biological, contextual, and time scales (Figure 1).

Figure 1: The role and contribution of the various modalities in capturing biological, contextual, and time-relevant information to accelerate the discovery of composite biomarker-based patient representations that improve patient stratification and the prediction of treatment response. (Examples of the tools leveraged are shown at the bottom.)
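The idea of integrating modality-specific encodings into one patient representation can be illustrated with a minimal sketch. It assumes the simplest possible fusion strategy: each modality's encoding is scaled to unit length and the results are concatenated in a fixed order. This is not the FuseMedML API and not necessarily the method used in the work described here; the function names, modality names, and toy values are all hypothetical.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a vector to unit length so that no modality dominates by magnitude."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_encodings(encodings, order):
    """Concatenate per-modality encodings in a fixed modality order.

    encodings: dict mapping a modality name to a list of floats.
    order: list of modality names; fixes the layout of the fused vector.
    """
    fused = []
    for modality in order:
        fused.extend(l2_normalize(encodings[modality]))
    return fused


# Hypothetical toy encodings for a single patient (illustration only).
patient = {
    "imaging": [3.0, 4.0],
    "clinical": [0.0, 2.0],
    "molecular": [1.0, 0.0, 0.0],
}

fused = fuse_encodings(patient, ["imaging", "clinical", "molecular"])
print(len(fused), fused)
```

A fixed modality order keeps the fused vectors comparable across patients. In practice, learned fusion layers are often preferred over plain concatenation, but the sketch shows the basic shape of the data.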

Diagnosis and response prediction in cancer

Using different modalities, our technologies succeeded in identifying novel biomarkers and disease mechanisms to better understand breast cancer. These include characterizing spatial heterogeneity, tracing clonal evolution in drug resistance, and deconvolving cell-free DNA from multiple lesions (3). We also demonstrated the value of integrating our deep learning frameworks across modalities. The patient representations derived from 3D mammograms and clinical records were shown to be useful for reducing radiologists’ workloads by 40% (Figure 2). Further, we showed that our technologies can leverage imaging and clinical data to enable the pre-biopsy histopathological subtyping of breast lesions and to predict future metastases.

Figure 2. Reader study results. (A) Readers and artificial intelligence (AI) model receiver operating characteristic (ROC) curves. All ROC curves include Kolmogorov-Smirnov confidence bands (19,20), marked by a blue area around the readers’ curve, mean reader, and dotted lines for AI. Dots on the curves mark the sensitivity and specificity achieved by the readers. In the high-sensitivity range, AI exceeds the readers’ performance. (B) Each cell depicts agreement, as measured by Cohen κ (21), between pairs of readers or between the AI classification and each of the five readers. For this comparison, an AI operating point of 0.79 sensitivity and 0.66 specificity was chosen because it was the closest point to the reader mean of 0.79 sensitivity and 0.67 specificity. Although most readers are in moderate agreement, with κ values between 0.4 and 0.65 (warmer colors), the AI differs from the human readers, with κ values between 0.24 and 0.34 (darker, colder colors). (C) Inter-reader variability per cancer-free decision. Each bar shows the percentage of readers who provided an identical interpretation (the number on each bar represents the number of examinations). For example, all five readers agreed correctly on 35% of examinations, whereas none of them answered correctly on 10% of the examinations. (D) The AI answers on each of the inter-reader variability bars (the number on each bar represents the number of examinations). AUC = area under the receiver operating characteristic curve.
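The two quantities used in panel (B) can be sketched in a few lines. The example below computes Cohen's kappa between two raters and picks the ROC operating point closest to a target sensitivity and specificity. It is an illustration under stated assumptions: the caption does not say how "closest" was measured, so Euclidean distance is assumed here, and the reader labels and candidate operating points (other than the 0.79/0.66 point and the 0.79/0.67 reader mean quoted in the caption) are hypothetical.

```python
from math import hypot


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both raters labeled independently at their own rates.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def closest_operating_point(roc_points, target):
    """Return the (sensitivity, specificity) pair nearest to the target.

    Assumption: Euclidean distance in sensitivity/specificity space.
    """
    return min(
        roc_points,
        key=lambda p: hypot(p[0] - target[0], p[1] - target[1]),
    )


# Hypothetical binary decisions (1 = recall, 0 = cancer-free) on 8 examinations.
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0]
ai_model = [1, 0, 0, 0, 1, 1, 1, 0]
kappa = cohen_kappa(reader_1, ai_model)
print(kappa)  # 0.5

# Candidate AI operating points; only (0.79, 0.66) comes from the caption.
candidates = [(0.90, 0.45), (0.79, 0.66), (0.70, 0.75)]
chosen = closest_operating_point(candidates, target=(0.79, 0.67))
print(chosen)  # (0.79, 0.66)
```

In the toy data, the two raters agree on 6 of 8 examinations (0.75 observed agreement) while chance alone would give 0.5, which yields a kappa of 0.5.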