Publication
INTERSPEECH 2015
Conference paper
Score stabilization for speaker recognition trained on a small development set
Abstract
State-of-the-art speaker recognition systems now obtain quite accurate results on both text-independent and text-dependent tasks, provided they are trained on a fair amount of development data from the target domain (assuming clean speech). In this work, we address the challenge of building a speaker recognition system from a small development dataset in the target domain, without using any out-of-domain data. When development data is limited, the Nuisance Attribute Projector (NAP) algorithm is in general superior to the i-vector approach. We investigate the relative degradation contributed by the different components of a NAP system trained on a small dataset and conclude that score normalization is a major source of degradation. We introduce a novel method for stabilizing the normalized scores: we explicitly estimate a low-dimensional subspace in supervector space that accounts for high variability in the score normalization parameters, and then compensate for the estimated subspace. We report experiments on both text-dependent and text-independent tasks that validate our method and show large error reductions.
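The abstract does not say how the subspace is estimated or compensated, so the following is only a generic illustration of the idea, not the paper's algorithm. It assumes the subspace is taken as the top principal directions of a set of supervector deviations and that compensation means projecting those directions out, as in NAP-style projection. The function names, the synthetic data, and the choice of SVD are all assumptions made for this sketch.

```python
import numpy as np

def estimate_subspace(deviations, k):
    """Return a (d, k) orthonormal basis for the top-k directions of variation.

    `deviations` is a hypothetical (n, d) array of supervector deviations;
    how such deviations are obtained is not described in the abstract.
    """
    # Rows of vt are the right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(deviations, full_matrices=False)
    return vt[:k].T

def compensate(supervectors, basis):
    """Remove the subspace spanned by `basis`: x - U U^T x for each row x."""
    return supervectors - (supervectors @ basis) @ basis.T

# Synthetic example: strong variability confined to k directions, plus small noise.
rng = np.random.default_rng(0)
n, d, k = 50, 20, 2
true_dirs, _ = np.linalg.qr(rng.standard_normal((d, k)))
deviations = (5.0 * rng.standard_normal((n, k))) @ true_dirs.T
deviations += 0.01 * rng.standard_normal((n, d))

basis = estimate_subspace(deviations, k)
cleaned = compensate(deviations, basis)
```

After compensation, the rows of `cleaned` have no component along the estimated directions, so only the small residual noise remains in this toy setup. In a real system the dimension `k` would have to be tuned on development data.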