Phonetic confusion matrix based spoken document retrieval

S. Srinivasan; D. Petkovic

doi:10.1145/345508.345552

ACM SIGIR Forum

Conference paper

01 Jan 2000



Abstract

Combined word-based indexes and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on the probabilistic formulation of term weighting using phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.
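
The abstract does not give the scoring formula, so the following is only a minimal sketch of how phone confusion data might be used to score a query against a phonetic transcript. It is not the authors' CueVideo implementation. The confusion probabilities, the floor value, the phone symbols, and the function names `window_log_likelihood` and `best_match` are all hypothetical, and the sketch computes only a likelihood term P(recognized | spoken) over fixed-length windows, ignoring insertions, deletions, and any prior.

```python
import math

# Hypothetical confusion data: P(recognized phone | spoken phone).
# Real values would be estimated from recognizer output; these are made up.
CONFUSION = {
    ("k", "k"): 0.80, ("k", "g"): 0.15,
    ("ae", "ae"): 0.70, ("ae", "eh"): 0.20,
    ("t", "t"): 0.75, ("t", "d"): 0.20,
}
FLOOR = 0.01  # assumed probability for phone pairs missing from the table


def window_log_likelihood(query, window):
    """Sum of log P(recognized | spoken) over aligned phone pairs."""
    return sum(
        math.log(CONFUSION.get((spoken, recognized), FLOOR))
        for spoken, recognized in zip(query, window)
    )


def best_match(query, transcript):
    """Slide the query over the transcript; return (best score, start index)."""
    n = len(query)
    best_score, best_index = float("-inf"), -1
    for i in range(len(transcript) - n + 1):
        score = window_log_likelihood(query, transcript[i:i + n])
        if score > best_score:
            best_score, best_index = score, i
    return best_score, best_index


# Query phones for a word, and a transcript in which the recognizer
# substituted "g" for "k" and "eh" for "ae".
query = ["k", "ae", "t"]
transcript = ["s", "g", "eh", "t", "ih"]
score, index = best_match(query, transcript)
```

In this toy example the misrecognized window starting at index 1 still scores highest, because the confusion table assigns the substitutions a much higher probability than the floor. A full Bayesian treatment would combine this likelihood with a prior and normalize across candidates.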
