Pre-training BERT on domain resources for short answer grading

Chul Sung; Tengfei Ma; Tejas I. Dhamecha; Vinay Reddy; Swarnadeep Saha; Rishi Arora

EMNLP-IJCNLP 2019

Conference paper

03 Nov 2019

Abstract

Pre-trained BERT contextualized representations have achieved state-of-the-art results on multiple downstream NLP tasks by fine-tuning with task-specific data. While task-specific fine-tuning has received considerable attention, there has been limited work on improving the pre-trained representations themselves. In this paper, we explore ways of improving the pre-trained contextual representations for automatic short answer grading, a critical component of intelligent tutoring systems. We show that the pre-trained BERT model can be improved by augmenting its pre-training data with domain-specific resources such as textbooks. We also present a new approach that uses labeled short answer grading data to further enhance the language model. Empirical evaluation on multi-domain datasets shows that task-specific fine-tuning on the enhanced pre-trained language model achieves superior performance for short answer grading.
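
As a rough illustration of what pre-training on domain text involves, the sketch below prepares one masked-language-model training example from a textbook-style sentence. It assumes that domain pre-training reuses BERT's standard masking objective (select about 15% of tokens; replace 80% of those with [MASK], 10% with a random token, and leave 10% unchanged), which the abstract does not spell out. The function name mask_tokens, the example sentence, and the whitespace tokenization are hypothetical simplifications; BERT itself uses WordPiece tokens and a fixed masking budget per sequence.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (masked_tokens, labels) using BERT's 80/10/10 masking rule.

    labels[i] is the original token when position i must be predicted,
    and None when position i is not part of the training loss.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)              # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK)         # 80%: replace with [MASK]
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: random vocabulary token
            else:
                masked.append(tok)          # 10%: keep the original token
        else:
            masked.append(tok)              # not selected: left untouched
            labels.append(None)
    return masked, labels

# Hypothetical textbook sentence, whitespace-tokenized for simplicity.
sentence = "a force acting on an object changes its velocity".split()
vocab = sorted(set(sentence))
masked, labels = mask_tokens(sentence, vocab, rng=random.Random(0))
```

In a full pipeline, pairs like (masked, labels) built from textbook passages would be fed to the pre-trained model for additional training steps before task-specific fine-tuning on graded answers.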
