Artificial Intelligence for Reducing Workload in Breast Cancer Screening with Digital Breast Tomosynthesis

Yoel Shoshan; Ran Bakalo; Flora Gilboa-Solomon; Vadim Ratner; Ella Barkan; Michal Ozery-Flato; Mika Amit; Daniel Khapun; Emily Ambinder; Eniola Oluyemi; Babita Panigrahi; Philip Dicarlo; Michal Rosen-Zvi; Lisa Mullen

doi:10.1148/radiol.211105

Radiology

Paper

18 Jan 2022

Abstract

Background: Digital breast tomosynthesis (DBT) has higher diagnostic accuracy than digital mammography, but interpretation time is substantially longer. Artificial intelligence (AI) could improve reading efficiency.

Purpose: To evaluate the use of AI to reduce workload by filtering out normal DBT screens.

Materials and Methods: The retrospective study included 13 306 DBT examinations from 9919 women performed between June 2013 and November 2018 from two health care networks. The cohort was split into training, validation, and test sets (3948, 1661, and 4310 women, respectively). A workflow was simulated in which the AI model classified cancer-free examinations that could be dismissed from the screening worklist and used the original radiologists’ interpretations on the rest of the worklist examinations. The AI system was also evaluated with a reader study of five breast radiologists reading the DBT mammograms of 205 women. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and recall rate were evaluated in both studies. Statistics were computed across 10 000 bootstrap samples to assess 95% CIs, noninferiority, and superiority tests.
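The simulated workflow and bootstrap described above can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the record fields (`ai_score`, `recalled`, `cancer`), the `threshold` parameter, the rule that an examination is dismissed when its score falls below the threshold, and resampling at the examination level are all assumptions made for the example. Dismissed examinations are counted as not recalled, and the original radiologist decision is kept for the rest.

```python
import random


def simulate_workflow(exams, threshold):
    """Simulate AI triage on top of the original radiologist decisions.

    Each exam is a dict with hypothetical keys:
      ai_score (float), recalled (bool), cancer (bool).
    Exams scoring below `threshold` are dismissed (assumed rule).
    """
    total = len(exams)
    cancers = sum(1 for e in exams if e["cancer"])
    kept = [e for e in exams if e["ai_score"] >= threshold]
    recalls = sum(1 for e in kept if e["recalled"])
    detected = sum(1 for e in kept if e["recalled"] and e["cancer"])
    return {
        "workload_reduction": 1 - len(kept) / total,
        "sensitivity": detected / cancers if cancers else float("nan"),
        "recall_rate": recalls / total,
    }


def bootstrap_ci(exams, threshold, metric, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for one metric of the simulated workflow."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        # Resample examinations with replacement (assumed resampling unit).
        sample = rng.choices(exams, k=len(exams))
        value = simulate_workflow(sample, threshold)[metric]
        if value == value:  # skip NaN from resamples without any cancer
            values.append(value)
    values.sort()
    lower = values[int((alpha / 2) * len(values))]
    upper = values[int((1 - alpha / 2) * len(values)) - 1]
    return lower, upper


# Toy data, invented purely to exercise the functions.
toy_exams = [
    {"ai_score": 0.9, "recalled": True, "cancer": True},
    {"ai_score": 0.1, "recalled": False, "cancer": False},
    {"ai_score": 0.2, "recalled": True, "cancer": False},
    {"ai_score": 0.8, "recalled": False, "cancer": False},
]
result = simulate_workflow(toy_exams, threshold=0.5)
ci = bootstrap_ci(toy_exams, 0.5, "recall_rate", n_boot=200)
```

In the toy data, two of the four examinations are dismissed, one of which was a false-positive recall, so the simulated recall rate drops while the single cancer is still detected.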

Results: The model was tested on 4310 screened women (mean age, 60 years ± 11 [standard deviation]; 5182 DBT examinations). Compared with the radiologists’ performance (417 of 459 detected cancers [90.8%], 477 recalls in 5182 examinations [9.2%]), the use of AI to automatically filter out cases would reduce workload by 39.6%, with noninferior sensitivity (413 of 459 detected cancers; 90.0%; P = .002) and a 25% relative reduction in recall rate (358 recalls in 5182 examinations; 6.9%; P = .002). In the reader study, the AUC of the standalone AI was higher than that of the mean reader (0.84 vs 0.81; P = .002).
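The reported percentages follow directly from the counts given above. The short check below recomputes them. The helper name `pct` is introduced here for illustration only.

```python
def pct(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


# Counts taken from the Results paragraph.
radiologist_sensitivity = pct(417, 459)   # original radiologist reads
ai_workflow_sensitivity = pct(413, 459)   # simulated AI-filtered workflow
radiologist_recall = pct(477, 5182)
ai_workflow_recall = pct(358, 5182)

# Relative reduction in recalls: (477 - 358) / 477.
relative_recall_reduction = pct(477 - 358, 477)

print(round(radiologist_sensitivity, 1), round(ai_workflow_sensitivity, 1))
print(round(radiologist_recall, 1), round(ai_workflow_recall, 1))
print(round(relative_recall_reduction))
```

The 25% figure is the relative change in the number of recalls (477 to 358). The absolute difference in recall rate is about 2.3 percentage points (9.2% to 6.9%).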

Conclusion: The artificial intelligence model was able to identify normal digital breast tomosynthesis screening examinations, which decreased the number of examinations that required radiologist interpretation in a simulated clinical workflow.
