Data Augmentation for Fairness in Personal Knowledge Graph Population

Lingraj S Vannur; Balaji Ganesan; Lokesh Nagalapatti; Hima Patel; MN Thippeswamy

doi:10.1007/978-3-030-75015-2_15

PAKDD 2021

Workshop paper

10 May 2021

Abstract

Cold start knowledge base population (KBP) is the problem of populating a knowledge base from unstructured documents. While neural networks have improved the individual tasks that make up KBP, the overall F1 of the end-to-end system remains quite low. This problem is more acute in personal knowledge bases, which present additional challenges with regard to data protection, fairness and privacy. In this work, we use data augmentation to populate a more complete personal knowledge base from the TACRED dataset. We then use explainability techniques and representative set sampling to show that the augmented knowledge base is also fairer and more diverse.

Short paper