Big Data 2020
Conference paper

Supervised Topic Compositional Neural Language Model for Clinical Narrative Understanding

Abstract

Clinical narratives that describe complex medical events are often accompanied by meta-information such as a patient's demographics, diagnoses, and medications. This structured information implicitly relates to the logical and semantic structure of the entire narrative and thus influences the vocabulary choices made in composing it. To leverage this meta-information, we propose a supervised topic compositional neural language model, called MeTRNN, that integrates the strength of supervised topic modeling in capturing global semantics with the capacity of contextual recurrent neural networks (RNNs) in modeling local word dependencies. MeTRNN generates interpretable topics from global meta-information and uses them to help the contextual RNN model local dependencies in the text. For efficient training of MeTRNN, we develop an autoencoding variational Bayes inference method. We evaluate MeTRNN on word prediction tasks using public text datasets, where it consistently outperforms all baselines across all datasets, improving perplexity by 5% to 40%. Our case studies on real-world electronic health record (EHR) data show that MeTRNN can learn and benefit from meaningful topics.
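
To make the general idea concrete, the sketch below shows one plausible way a topic vector derived from patient meta-information could condition an RNN language model, together with a perplexity readout. This is a minimal illustration only, not the paper's MeTRNN architecture or its autoencoding variational Bayes training procedure; the class name TopicConditionedLM, the layer sizes, and parameters such as meta_dim and num_topics are illustrative assumptions.

    # Hypothetical sketch: a topic-conditioned GRU language model (not the paper's MeTRNN).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TopicConditionedLM(nn.Module):
        def __init__(self, vocab_size, meta_dim, num_topics=50, embed_dim=128, hidden_dim=256):
            super().__init__()
            # Maps structured meta-information (demographics, diagnoses, medications
            # encoded as a feature vector) to a distribution over latent topics.
            self.topic_net = nn.Sequential(
                nn.Linear(meta_dim, 128), nn.ReLU(), nn.Linear(128, num_topics)
            )
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            # Projects the global topic vector into the RNN's hidden space so it can
            # bias next-word prediction at every position.
            self.topic_proj = nn.Linear(num_topics, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens, meta):
            theta = F.softmax(self.topic_net(meta), dim=-1)    # (batch, num_topics)
            h, _ = self.rnn(self.embed(tokens))                # (batch, seq, hidden)
            h = h + self.topic_proj(theta).unsqueeze(1)        # inject global topic signal
            return self.out(h)                                 # next-word logits

    # Toy usage on random data: predict next tokens and report perplexity.
    vocab_size, meta_dim, batch, seq_len = 1000, 20, 4, 12
    model = TopicConditionedLM(vocab_size, meta_dim)
    tokens = torch.randint(0, vocab_size, (batch, seq_len + 1))
    meta = torch.rand(batch, meta_dim)
    logits = model(tokens[:, :-1], meta)
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
    print("perplexity:", loss.exp().item())

Perplexity here is simply the exponential of the average next-word cross-entropy, which is the quantity on which the paper reports its 5% to 40% improvements.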