Probabilistic feature matching for fast scalable visual prompting

Thomas Frick; Cezary Skura; Filip Michal Janicki; Roy Assaf; Niccolo Avogaro; Daniel Caraballo-Rivera; Yagmur Gizem Cinar; Brown Ebouky; Ioana Giurgiu; Takayuki Katsuki; Piotr Sebastian Kluska; Cristiano Malossi; Haoxiang Qiu; Tomoya Sakai; Florian Scheidegger; Andrej Simeski; Daniel Yang; Andrea Bartezzaghi; Mattia Rigotti

IJCAI 2024

Demo paper

03 Aug 2024

Abstract

We propose a framework for image segmentation guided by visual prompting that leverages vision foundation models. Our approach integrates multiple large-scale pretrained models to address segmentation tasks in which the only supervision is a limited set of sparse annotations provided interactively by a user. The method combines a frozen feature-extraction backbone with a scalable and efficient probabilistic feature correspondence (soft matching) procedure, derived from Optimal Transport, that couples pixels between reference and target images. A pretrained segmentation model translates user scribbles into reference masks and matched target pixels into output target segmentation masks. The resulting framework, which we name Softmatcher, is a versatile, fast, training-free architecture for image segmentation by visual prompting. We demonstrate the efficiency and scalability of Softmatcher for real-time interactive image segmentation and showcase it in diverse visual domains, including technical visual inspection use cases.
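To make the soft matching idea concrete, here is a minimal sketch of entropy-regularized Optimal Transport between two small sets of feature vectors, solved with plain Sinkhorn iterations. It is not Softmatcher's implementation, which the abstract does not specify: the function name `soft_match`, the cosine-distance cost, the uniform marginals, and the values of `epsilon` and `n_iters` are all assumptions made for illustration.

```python
import numpy as np


def soft_match(ref_feats, tgt_feats, epsilon=0.05, n_iters=200):
    """Return a soft correspondence (transport plan) between two feature sets.

    Hypothetical helper: ref_feats has shape (n, d), tgt_feats has shape (m, d).
    Entry (i, j) of the result is the mass coupling reference feature i
    to target feature j.
    """
    # L2-normalise so the dot product is a cosine similarity.
    ref = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)

    # Assumed cost: cosine distance, which lies in [0, 2].
    cost = 1.0 - ref @ tgt.T

    # Gibbs kernel of the entropy-regularised problem.
    kernel = np.exp(-cost / epsilon)

    # Assumed uniform marginals over reference and target features.
    n, m = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)

    # Sinkhorn iterations: alternately rescale rows and columns.
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)

    return u[:, None] * kernel * v[None, :]


# Toy usage: the target features are a shuffled copy of the reference ones.
rng = np.random.default_rng(0)
ref_feats = rng.normal(size=(5, 16))
perm = rng.permutation(5)
tgt_feats = ref_feats[perm]
plan = soft_match(ref_feats, tgt_feats)
best_ref_for_each_target = plan.argmax(axis=0)
```

In a pipeline like the one described, the rows would correspond to backbone features of reference pixels (or patches) and the columns to those of the target image. A smaller `epsilon` gives sharper, closer to one-to-one couplings at the cost of numerical stability; a larger one gives smoother matches.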
