About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ICASSP 2006
Conference paper
Quantization for adapted GMM-based speaker verification
Abstract
State-of-the-art speaker verification systems are built around the likelihood ratio test, using Gaussian Mixture Models (GMM) for likelihood functions, a universal background model (UBM) for alternative speaker representation, and a form of Bayesian adaptation to derive speaker models from the UBM. This work tackles optimal quantizer design of the speech cepstral features (MFCCs) for such systems. The problem is posed as the minimization of loss of log-likelihood ratio between the quantized and unquantized speech features. First we show that the conventional mean squared error (MSE) quantizer for the top-scoring UBM Gaussian is optimal under practical assumptions. Then we derive the optimal bit allocation strategy across the dimensions of the feature vectors. Finally we demonstrate the validity of the approach against various quantization and bit allocation schemes by running experiments on the appropriately modified IBM Speaker Verification system. Experimental results on the HUB4 corpora show negligible impact on verification performance for bit rates as low as less than 1 bit per dimension on average in contrast to 32 bits per dimension in the original system. © 2006 IEEE.