Publication
SLT 2012
Conference paper
Noisy channel adaptation in language identification
Abstract
Language identification (LID) of speech data recorded over noisy communication channels is a challenging problem, especially when the LID system is tested on speech data from a communication channel not seen in training. In this paper, we consider the scenario in which a small amount of adaptation data is available from a new communication channel. We investigate various approaches for using the adaptation data efficiently in both supervised and unsupervised settings. In a supervised adaptation framework, we show that support vector machines (SVMs) with higher-order polynomial kernels (HO-SVM), trained on lower-dimensional representations of the Gaussian mixture model supervectors (GSVs), provide significant performance improvements over the baseline SVM-GSV system. In these LID experiments, we obtain a 30% reduction in error rate with 6 hours of adaptation data for a new channel. For unsupervised adaptation, we develop an iterative procedure for re-labeling the development data using a co-training framework. In these experiments, we obtain considerable improvements (a relative improvement of 13%) over a self-training framework with the HO-SVM models. © 2012 IEEE.
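As a rough illustration of the higher-order polynomial kernel mentioned in the abstract, the sketch below evaluates the standard polynomial kernel K(x, y) = (gamma * <x, y> + coef0)^degree on a few low-dimensional vectors. It is not the paper's implementation: the function names, the parameter values (`degree=3`, `gamma=1.0`, `coef0=1.0`), and the toy two-dimensional vectors standing in for reduced GSVs are all assumptions made for this example, and the abstract does not say how the dimensionality reduction was performed.

```python
from typing import List, Sequence


def polynomial_kernel(x: Sequence[float], y: Sequence[float],
                      degree: int = 3, gamma: float = 1.0,
                      coef0: float = 1.0) -> float:
    """Standard polynomial kernel: (gamma * <x, y> + coef0) ** degree.

    The default parameter values are illustrative assumptions, not
    settings taken from the paper.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def gram_matrix(vectors: Sequence[Sequence[float]], degree: int = 3,
                gamma: float = 1.0, coef0: float = 1.0) -> List[List[float]]:
    """Pairwise kernel values; an SVM solver would consume this matrix."""
    return [[polynomial_kernel(u, v, degree, gamma, coef0) for v in vectors]
            for u in vectors]


# Hypothetical stand-ins for dimensionality-reduced GSVs (toy values only).
reduced_gsvs = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
K = gram_matrix(reduced_gsvs, degree=3)
```

With `degree=1` and `coef0=0.0` the kernel reduces to a plain inner product, that is, a linear kernel. Raising the degree lets the SVM model interactions between supervector dimensions, which is more tractable when the input dimension has been reduced first.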