Online speaker diarization using adapted i-vector transforms
Weizhong Zhu, Jason Pelecanos
ICASSP 2016
Language identification (LID) of speech recorded over noisy communication channels is a challenging problem, especially when the LID system is tested on speech from a communication channel not seen in training. In this paper, we consider the scenario in which a small amount of adaptation data is available from a new communication channel. We investigate various approaches for using the adaptation data efficiently in both supervised and unsupervised settings. In a supervised adaptation framework, we show that support vector machines (SVMs) with higher order polynomial kernels (HO-SVM), trained using lower dimensional representations of the Gaussian mixture model supervectors (GSVs), provide significant performance improvements over the baseline SVM-GSV system. In these LID experiments, we obtain a 30% reduction in error rate with 6 hours of adaptation data for a new channel. For unsupervised adaptation, we develop an iterative procedure for re-labeling the development data using a co-training framework. In these experiments, we obtain considerable improvements (a relative improvement of 13%) over a self-training framework with the HO-SVM models. © 2012 IEEE.
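As a rough illustration of the "higher order polynomial kernel" mentioned in the abstract, the sketch below computes the standard polynomial kernel (gamma * <x, y> + coef0) ** degree between low-dimensional feature vectors. The abstract does not state the kernel degree, the scaling constants, or how the GSVs are reduced in dimension, so the function names, the default parameter values, and the toy vectors are all assumptions made for illustration, not the authors' configuration.

```python
from typing import List, Sequence


def polynomial_kernel(
    x: Sequence[float],
    y: Sequence[float],
    degree: int = 3,      # assumed value; the abstract does not give the order
    gamma: float = 1.0,   # assumed scaling of the inner product
    coef0: float = 1.0,   # assumed additive constant
) -> float:
    """Return the polynomial kernel (gamma * <x, y> + coef0) ** degree."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def gram_matrix(vectors: Sequence[Sequence[float]], **kernel_args) -> List[List[float]]:
    """Build the symmetric kernel (Gram) matrix over a set of vectors."""
    return [
        [polynomial_kernel(u, v, **kernel_args) for v in vectors]
        for u in vectors
    ]


# Toy stand-ins for dimension-reduced supervectors (hypothetical values).
reduced_vectors = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
K = gram_matrix(reduced_vectors, degree=2)
```

A Gram matrix of this kind can be handed to any SVM trainer that accepts a precomputed kernel; working in a lower dimensional space keeps each inner product cheap.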