A general methodology to quantify biases in natural language data

Jiawei Chen; Anbang Xu; Zhe Liu; Yufan Guo; Xiaotong Liu; Yingbei Tong; Rama Akkiraju; John M. Carroll

doi:10.1145/3334480.3382949

CHI EA 2020

Conference paper

25 Apr 2020

Abstract

Biases in data, such as gender and racial stereotypes, are propagated through intelligent systems and amplified in end-user applications. Existing studies detect and quantify biases based on pre-defined attributes. In practice, however, it is difficult to gather a comprehensive list of sensitive concepts for every category of bias. We propose a general methodology to quantify dataset biases by measuring the difference between a dataset's distribution and that of a reference dataset using Maximum Mean Discrepancy. For the case of natural language data, we show that lexicon-based features quantify explicit stereotypes, while deep learning-based features further capture implicit stereotypes represented by complex semantics. Our method provides a more flexible way to detect potential biases.
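
To make the central quantity concrete, the following is a minimal sketch of a Maximum Mean Discrepancy estimate between a target sample and a reference sample. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian (RBF) kernel, the bandwidth parameter `gamma`, the biased estimator, and the two-dimensional toy vectors standing in for lexicon-based or deep learning-based text features are all choices made here, and the helper names (`rbf_kernel`, `mean_kernel`, `mmd_squared`, `toy_sample`) are hypothetical.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(target, reference, gamma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors.

    Assumption: an RBF kernel with bandwidth parameter gamma; the abstract
    does not say which kernel or estimator the paper uses.
    """
    return (
        mean_kernel(target, target, gamma)
        + mean_kernel(reference, reference, gamma)
        - 2.0 * mean_kernel(target, reference, gamma)
    )


def toy_sample(center, n, rng):
    """Hypothetical stand-in for text features: 2-D Gaussian points."""
    return [[rng.gauss(center, 1.0), rng.gauss(center, 1.0)] for _ in range(n)]


rng = random.Random(0)
reference = toy_sample(0.0, 50, rng)  # plays the role of the reference dataset
similar = toy_sample(0.0, 50, rng)    # drawn from the same distribution
shifted = toy_sample(2.0, 50, rng)    # drawn from a shifted distribution

score_similar = mmd_squared(similar, reference, gamma=0.5)
score_shifted = mmd_squared(shifted, reference, gamma=0.5)
print("similar vs reference:", score_similar)
print("shifted vs reference:", score_shifted)
```

In this toy setup the shifted sample should score noticeably higher than the sample drawn from the reference distribution, which is the sense in which a larger discrepancy flags a dataset as distributed differently from the reference. With real text, the feature vectors would come from a lexicon-based or learned representation rather than random points.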
