Publication
SSE 2024
Short paper

CLA-RA: Collaborative Active Learning Amidst Relabeling Ambiguity

Abstract

Obtaining diverse, high-quality labeled data for training efficient classifiers remains a practical challenge. Crowdsourcing, which employs multiple weak labelers, is a popular way to address this issue. However, crowd labelers often introduce noise and inaccuracies and typically have limited domain knowledge. In this paper, we propose CLA-RA, a novel framework that optimizes the labeling process by determining what to label next and assigning tasks to the most suitable annotators. Our technique aims to maximize classifier performance by harnessing the collective wisdom of multiple annotators while limiting the influence of error-prone annotations. The key contributions of our work are an annotator-disagreement-based instance selection mechanism, which identifies the noise present in instance annotations, and an instance-dependent annotator confidence model, which identifies the annotator most likely to label an instance correctly. These methods, combined with a similarity-based annotator inference method, improve classifier accuracy while reducing annotation effort. Experimental results on 9 datasets demonstrate significant improvements over state-of-the-art multi-annotator active learning methods, highlighting the effectiveness of our approach in obtaining high-quality labeled data for training classifiers with minimal labeling cost and error.
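The abstract does not specify the disagreement criterion, but disagreement-based instance selection is commonly realized as vote entropy over the crowd labels collected for each instance. The following minimal Python sketch illustrates that generic idea only; the function names, the toy data, and the entropy criterion itself are assumptions for illustration, not the authors' actual method:

```python
import math
from collections import Counter

def vote_entropy(labels):
    """Disagreement score for one instance: entropy of annotator votes.
    Illustrative stand-in for a disagreement-based selection criterion."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_instances(annotation_pool, k):
    """Pick the k instances whose crowd annotations disagree the most."""
    ranked = sorted(annotation_pool, key=lambda item: vote_entropy(item[1]),
                    reverse=True)
    return [idx for idx, _ in ranked[:k]]

# Toy pool: (instance id, labels from three annotators)
pool = [(0, ["cat", "cat", "cat"]),    # unanimous: entropy 0
        (1, ["cat", "dog", "dog"]),    # partial disagreement
        (2, ["cat", "dog", "bird"])]   # maximal disagreement
print(select_instances(pool, 2))  # -> [2, 1]
```

Instances with high vote entropy are those where the annotations are noisiest, so routing them back for relabeling (ideally to a high-confidence annotator) yields the largest expected gain per label.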
