Oznur Alkan, Massimilliano Mattetti, et al.
INFORMS 2020
IBM's submission for the Phase II speaker recognition evaluation of the DARPA-sponsored Robust Automatic Transcription of Speech (RATS) program is examined. The objectives of the paper are threefold: (1) to provide a system description, (2) to identify key techniques for performance improvement, and (3) to quantify their contribution. In the system design, the fundamental idea revolves around exploiting diversity and modeling complementary information at all levels. To speed up system development, a push-button system is designed whereby all system development steps can be rapidly completed. Noise robustness is improved by utilizing two speech activity detectors (SADs) and five acoustic feature extractors. Furthermore, the probabilistic linear discriminant analysis (PLDA) based back-ends were trained with two different data subsets. To better exploit the complementary information, system combination was performed in two modules. The first module trained new PLDA back-ends from concatenated compact representations, while the second combined all the system scores and duration-related side information in a neural network. The official results from the Phase II evaluation are also examined. The results indicate that for the 30s-30s task the performance of the overall system was better than the best single system by 46% and 40% on the internal and evaluation test sets, respectively. Copyright © 2013 ISCA.
Rajesh Balchandran, Leonid Rachevsky, et al.
INTERSPEECH 2009
Casey Dugan, Werner Geyer, et al.
CHI 2010
Seyed Omid Sadjadi, Jason W. Pelecanos, et al.
INTERSPEECH 2014