Publication
INTERSPEECH 2012
Conference paper
Spoken document clustering using word confusion networks
Abstract
In this paper, we propose a word confusion network (WCN) based approach to clustering spoken documents and analyze its ability to handle the influence of speech recognition errors. A WCN compactly represents multiple confidence-weighted recognition hypotheses. It therefore offers scope for improving clustering accuracy, because the correct transcription is likely to be present among the alternative hypotheses in cases where the 1-best transcript is erroneous. On the other hand, many of the remaining hypotheses are incorrect and could pose a challenge during clustering. In our approach, we extract TF-IDF vectors from the WCNs and cluster them using the K-Means algorithm. The components of the TF-IDF vectors are further weighted with the word posterior probabilities, in order to down-weight vector components contributed by incorrect hypotheses with low posterior probabilities. Experimental results on Switchboard data illustrate the usefulness of the rich information in the WCN for clustering, showing up to 4% absolute improvement in the normalized mutual information metric.
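The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the WCN representation (a list of slots, each a list of `(word, posterior)` pairs), the smoothed IDF formula, the L2 normalization, the deterministic K-Means initialization, the geometric-mean normalization of NMI, and all function names and toy data are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def expected_tf(wcn):
    """Posterior-weighted term frequency: sum each word's posteriors over all slots."""
    tf = defaultdict(float)
    for slot in wcn:
        for word, posterior in slot:
            tf[word] += posterior
    return dict(tf)

def tfidf_vectors(wcns):
    """Build L2-normalized TF-IDF vectors (smoothed IDF is an assumption)."""
    tfs = [expected_tf(w) for w in wcns]
    vocab = sorted({w for tf in tfs for w in tf})
    n_docs = len(tfs)
    idf = {w: math.log((1 + n_docs) / (1 + sum(w in tf for tf in tfs))) + 1
           for w in vocab}
    vectors = []
    for tf in tfs:
        vec = [tf.get(w, 0.0) * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def kmeans(vectors, k, iters=20):
    """Plain Lloyd's algorithm; the first k vectors seed the centroids (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(v, centroids[c])))
                  for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def nmi(labels_a, labels_b):
    """Normalized mutual information, normalized by sqrt(H(A) * H(B)) (assumption)."""
    n = len(labels_a)
    pa, pb = Counter(labels_a), Counter(labels_b)
    pab = Counter(zip(labels_a, labels_b))
    mi = sum(c / n * math.log(c * n / (pa[a] * pb[b]))
             for (a, b), c in pab.items())
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    return mi / math.sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0

# Toy WCNs (hypothetical data): each slot holds competing words with posteriors.
wcns = [
    [[("music", 0.9), ("muse", 0.1)], [("guitar", 0.8), ("qatar", 0.2)]],
    [[("football", 0.7), ("foot", 0.3)], [("game", 0.9), ("gain", 0.1)]],
    [[("guitar", 0.6), ("game", 0.4)], [("music", 0.95), ("mused", 0.05)]],
    [[("game", 0.85), ("gain", 0.15)], [("football", 0.9), ("music", 0.1)]],
]
reference = ["music", "sports", "music", "sports"]

vectors = tfidf_vectors(wcns)
labels = kmeans(vectors, k=2)
score = nmi(labels, reference)
print(labels, round(score, 3))
```

In this sketch a low-posterior alternative such as `("qatar", 0.2)` still contributes to the vector, but only in proportion to its posterior, which is the down-weighting effect the abstract describes. A 1-best baseline could be approximated by keeping only the highest-posterior word in each slot before calling `tfidf_vectors`.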