ECCV 2022
Conference paper

Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images

Multiple Instance Learning (MIL) methods have become increasingly popular for classifying gigapixel-sized Whole-Slide Images (WSIs) in digital pathology. Most MIL methods operate at a single WSI magnification and process all tissue patches at that magnification. This formulation incurs high computational cost and restricts the context captured by the WSI-level representation to a single scale. Some MIL methods extend to multiple scales, but they are even more computationally demanding. In this paper, inspired by the diagnostic process of pathologists, we propose ZoomMIL, a method that learns to perform multi-level zooming in an end-to-end manner. ZoomMIL builds WSI representations by aggregating tissue-context information from multiple magnifications. The proposed method outperforms state-of-the-art MIL methods in WSI classification on two large datasets, while reducing computational demands, measured in Floating-Point Operations (FLOPs) and processing time, by a factor of 40 to 50. Our code is available at: