Publication
INTERSPEECH 2013
Conference paper
TRAP language identification system for RATS phase II evaluation
Abstract
Automatic language identification or detection of audio data has become an important preprocessing step for speech/speaker recognition and audio data mining. In many surveillance applications, language detection has to be performed on highly degraded audio inputs. In this paper, we present our work on language detection in highly degraded radio channel scenarios. We provide a brief description of the Targeted Robust Audio Processing (TRAP) language detection system built for the Phase II Evaluation of the Robust Automatic Transcription of Speech (RATS) program. This system is a combination of 15 systems with different frontends and speech activity decisions. We also analyze the usefulness of multi-layer perceptron (MLP) based non-linear projection of i-vectors before SVM classification. The proposed backend reduces the Equal Error Rate (EER) by 11%-25% relative compared to the baseline PCA-based feature representation for SVM classification, on the RATS test data consisting of data from eight high-frequency radio communication channels. Copyright © 2013 ISCA.
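The abstract reports results as a relative reduction in Equal Error Rate. As a rough illustration of what that metric measures, the sketch below estimates EER by sweeping a decision threshold over detection scores and then computes a relative reduction between two systems. It is a minimal sketch, not the evaluation procedure used in the paper: the function names `equal_error_rate` and `relative_reduction` are hypothetical, the scores are made-up toy values, and the threshold sweep is a simple approximation (no interpolation between operating points).

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= threshold.
    Hypothetical helper for illustration only.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Miss rate: fraction of target-language trials rejected.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: fraction of non-target trials accepted.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (miss + false_alarm) / 2.0
    return eer


def relative_reduction(baseline_eer, proposed_eer):
    """Relative EER reduction of a proposed system over a baseline."""
    return (baseline_eer - proposed_eer) / baseline_eer


# Toy scores (assumed values, not taken from the paper).
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
toy_eer = equal_error_rate(targets, nontargets)

# Hypothetical EERs chosen only to show the arithmetic.
toy_gain = relative_reduction(0.10, 0.08)
```

Under this definition, a system whose EER drops from a hypothetical 10% to 8% has a 20% relative reduction, which is the sense in which the 11%-25% range above should be read.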