Active learning for BERT: An empirical study

Liat Ein-Dor; Alon Halfon; Ariel Gera; Eyal Shnarch; Lena Dankin; Leshem Choshen; Marina Danilevsky; Ranit Aharonov; Yoav Katz; Noam Slonim

EMNLP 2020

Conference paper

16 Nov 2020

Abstract

Real-world scenarios present a challenge for text classification, since labels are usually expensive and the data is often characterized by class imbalance. Active Learning (AL) is a ubiquitous paradigm for coping with data scarcity. Recently, pre-trained NLP models, and BERT in particular, have received massive attention due to their outstanding performance in various NLP tasks. However, the use of AL with deep pre-trained models has so far received little consideration. Here, we present a large-scale empirical study on active learning techniques for BERT-based classification, addressing a diverse set of AL strategies and datasets. We focus on practical scenarios of binary text classification, where the annotation budget is very small and the data is often skewed. Our results demonstrate that AL can boost BERT performance, especially in the most realistic scenario, in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the minority class. We release our research framework, aiming to facilitate future research along the lines explored here.
