Decoupling Encoder and Decoder Networks for Abstractive Document Summarization

Ying Xu; Jey Han Lau; Timothy Baldwin; Trevor Cohn

Multiling@EACL 2017

Conference paper

03 Apr 2017

Abstract

Abstractive document summarization seeks to automatically generate a summary for a document, based on some abstract “understanding” of the original document. State-of-the-art techniques traditionally use attentive encoder–decoder architectures. However, due to the large number of parameters in these models, they require large training datasets and long training times. In this paper, we propose decoupling the encoder and decoder networks, and training them separately. We encode documents using an unsupervised document encoder, and then feed the document vector to a recurrent neural network decoder. With this decoupled architecture, we decrease the number of parameters in the decoder substantially, and shorten its training time. Experiments show that the decoupled model achieves performance comparable to state-of-the-art models on in-domain documents, but performs less well on out-of-domain documents.
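
The decoupled setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoder, the decoder cell, or how the document vector is injected, so everything below is an assumption made for illustration. The sketch assumes a frozen document vector (a hard-coded list standing in for the output of a separately trained encoder), a hypothetical Elman-style decoder named `ToyDecoder`, and conditioning through the decoder's initial hidden state. The weights are random and untrained, so the generated token ids carry no meaning; the point is only that the decoder's trainable parameters exclude the encoder entirely.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyDecoder:
    """Hypothetical Elman-style RNN decoder conditioned on a fixed document vector."""

    def __init__(self, doc_dim, hidden_dim, vocab_size, seed=0):
        rng = random.Random(seed)
        self.W_init = make_matrix(hidden_dim, doc_dim, rng)    # doc vector -> initial state
        self.W_emb = make_matrix(vocab_size, hidden_dim, rng)  # one embedding row per token
        self.W_hh = make_matrix(hidden_dim, hidden_dim, rng)   # recurrent weights
        self.W_out = make_matrix(vocab_size, hidden_dim, rng)  # state -> vocabulary scores

    def num_parameters(self):
        """Count decoder weights only; the encoder is frozen and trained elsewhere."""
        matrices = (self.W_init, self.W_emb, self.W_hh, self.W_out)
        return sum(len(m) * len(m[0]) for m in matrices)

    def greedy_decode(self, doc_vector, bos_id, eos_id, max_len):
        """Generate token ids one at a time, always taking the highest-scoring token."""
        # Assumption: the document vector sets the decoder's initial hidden state.
        h = [math.tanh(x) for x in matvec(self.W_init, doc_vector)]
        token, output = bos_id, []
        for _ in range(max_len):
            recurrent = matvec(self.W_hh, h)
            h = [math.tanh(e + r) for e, r in zip(self.W_emb[token], recurrent)]
            scores = matvec(self.W_out, h)
            token = max(range(len(scores)), key=scores.__getitem__)
            if token == eos_id:
                break
            output.append(token)
        return output


# Stand-in for the output of a separately trained, frozen document encoder.
doc_vector = [0.2, -0.5, 0.1, 0.7]

decoder = ToyDecoder(doc_dim=4, hidden_dim=8, vocab_size=10)
summary_ids = decoder.greedy_decode(doc_vector, bos_id=0, eos_id=1, max_len=5)
```

In this toy configuration only the four decoder matrices would be updated during training, which is the sense in which decoupling reduces the number of parameters to be learned for the summarization task.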
