A cross-lingual spoken content search system

Jitendra Ajmera; Ashish Verma

INTERSPEECH 2011

Conference paper

01 Dec 2011

Abstract

This paper presents an approach to enabling audio search for languages in which training an automatic speech recognition (ASR) system is difficult owing to a lack of training resources. Our work is related to earlier approaches that address search for out-of-vocabulary terms. A phonetic recognizer converts the audio data into phonetic lattices. In the proposed approach, the acoustic models (AM) for the phonetic recognizer are trained on a base language for which training data is available, and are then used to search content in a similar language. A phonetic language model (PLM) is trained independently for each language using text data available from a variety of sources, including the web. We performed experiments to evaluate this approach for searching a Gujarati corpus, with the AM trained on an Indian-English corpus. The experimental results show that this approach can achieve a precision at 10 (P@10) of up to 0.65. Copyright © 2011 ISCA.
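
For readers unfamiliar with the metric, the following is a minimal sketch of how precision at 10 is commonly computed for a single query. It is not taken from the paper: the function name `precision_at_k`, the segment identifiers, and the convention of dividing by k even when fewer than k results are returned are all assumptions made for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked results that are relevant.

    Assumption: the denominator is always k, so a system that returns
    fewer than k results is penalized for the missing ones.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical example: audio segments returned for one query, best first.
ranked_segments = ["seg03", "seg11", "seg07", "seg42", "seg05",
                   "seg19", "seg23", "seg08", "seg31", "seg02"]
# Hypothetical ground truth: segments that actually contain the query term.
relevant_segments = {"seg03", "seg07", "seg05", "seg19", "seg23", "seg31", "seg99"}

score = precision_at_k(ranked_segments, relevant_segments, k=10)
```

In this made-up example, six of the ten returned segments are relevant. A figure reported over a whole test set would typically be an average of such per-query scores, although the abstract does not state how the reported value was aggregated.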
