Publication
SLT 2010
Conference paper

Call transcript segmentation using word cooccurrence model

Abstract

In this paper, we propose a word cooccurrence model to perform topic segmentation of call center conversational speech. This model is estimated from training data to discriminatively represent how likely various pairs of words are to cooccur within homogeneous topic segments. We show that such a model provides an effective measure of lexical cohesion and hence provides useful evidence of topical coherence, or lack thereof, between various parts of the call transcripts. We propose two approaches to using this evidence for segmentation: 1) an efficient dynamic programming algorithm that performs segmentation using only the word cooccurrence model; 2) extracting features based on the word cooccurrence model and using them as additional features in conditional random field (CRF) based segmentation. Experimental evaluation of these approaches against state-of-the-art approaches shows the effectiveness of the word cooccurrence model for the topic segmentation task. ©2010 IEEE.
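
The abstract does not specify the scoring function or the recurrence, so the following is only an illustrative sketch of how a dynamic programming segmenter driven by pairwise word scores might look, not the paper's algorithm. It assumes a hypothetical table `pair_score` of word-pair scores (positive when two words tend to share a topic), a hypothetical score `unseen` for pairs missing from the table, and a hypothetical per-segment `penalty` that discourages over-segmentation.

```python
from itertools import combinations

def segment_cohesion(utterances, pair_score, unseen=-1.0):
    """Sum of pairwise scores over the distinct words in a span (assumed measure)."""
    words = sorted({w for u in utterances for w in u.split()})
    return sum(pair_score.get(frozenset(p), unseen) for p in combinations(words, 2))

def segment(utterances, pair_score, penalty=1.0, unseen=-1.0):
    """Return (start, end) index pairs of the highest-scoring segmentation.

    best[j] is the best total score for the first j utterances:
    best[j] = max over i < j of best[i] + cohesion(utterances[i:j]) - penalty.
    """
    n = len(utterances)
    best = [0.0] + [float("-inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cohesion = segment_cohesion(utterances[i:j], pair_score, unseen)
            score = best[i] + cohesion - penalty
            if score > best[j]:
                best[j], back[j] = score, i
    # Follow the back-pointers from the end to recover the segments.
    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    return segments[::-1]

# Toy data (invented for illustration): two topics, account access and billing.
utterances = ["reset password", "password login", "invoice payment", "payment refund"]
same_topic = [("reset", "password"), ("password", "login"), ("reset", "login"),
              ("invoice", "payment"), ("payment", "refund"), ("invoice", "refund")]
pair_score = {frozenset(p): 1.0 for p in same_topic}

boundaries = segment(utterances, pair_score)
print(boundaries)
```

This sketch recomputes the cohesion of every candidate span from scratch, which is simple but not efficient; an efficient version would presumably update span scores incrementally.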

Date

01 Dec 2010
