Transformation enhanced multi-grained modeling for text-independent speaker recognition

Upendra V. Chaudhari; Jiří Navrátil; Stéphane H. Maes; Ramesh Gopinath

ICSLP 2000

Conference paper

16 Oct 2000

Abstract

We describe our formulation of transformation-enhanced data modeling, which we use to develop a multi-grained data analysis approach to text-independent speaker recognition. The broad goal is to address difficulties caused by sparse training and test data. First, we detail our development of maximum-likelihood, transformation-based recognition with diagonally constrained Gaussian mixture models, and give results showing its robustness to decreasing amounts of training data. Then, using these models as building blocks, we develop a multi-grained model structure. For this, the training data must be labeled, e.g. with an HMM-based phone labeler. A graduated phone class structure is then used to train the speaker model at various levels of detail. This structure is a tree whose root node contains all the phones; subsequent levels partition the phones into increasingly fine-grained linguistic classes. We demonstrate the effectiveness of the modeling with identification and verification experiments.
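
The graduated phone class structure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phone inventory, the class names (`all`, `vowels`, `consonants`, `stops`, `fricatives`) and the helpers `PhoneClassNode`, `path_for_phone` and `group_frames_by_class` are hypothetical, chosen only to show a tree whose root holds every phone and whose children partition their parent's phones. Training a speaker model at each node (for example, fitting a Gaussian mixture model to that node's frames) is left out.

```python
from dataclasses import dataclass, field


@dataclass
class PhoneClassNode:
    """One node of a phone class tree (hypothetical structure)."""
    name: str
    phones: frozenset
    children: list = field(default_factory=list)

    def add_children(self, *nodes):
        """Attach child classes; they must partition this node's phones."""
        union = set()
        total = 0
        for node in nodes:
            union |= node.phones
            total += len(node.phones)
        # Equal union means full coverage; equal count means no overlap.
        if union != set(self.phones) or total != len(self.phones):
            raise ValueError(f"children do not partition {self.name!r}")
        self.children = list(nodes)
        return self


def path_for_phone(root, phone):
    """Return the class names from the root down to the finest class holding `phone`."""
    path = []
    node = root if phone in root.phones else None
    while node is not None:
        path.append(node.name)
        node = next((c for c in node.children if phone in c.phones), None)
    return path


def group_frames_by_class(root, labeled_frames):
    """Assign each (phone, frame) pair to every class on the phone's path."""
    groups = {}
    for phone, frame in labeled_frames:
        for name in path_for_phone(root, phone):
            groups.setdefault(name, []).append(frame)
    return groups


# Hypothetical toy inventory; a real system would use the labeler's phone set.
stops = PhoneClassNode("stops", frozenset({"p", "t"}))
fricatives = PhoneClassNode("fricatives", frozenset({"s", "z"}))
consonants = PhoneClassNode("consonants", stops.phones | fricatives.phones)
consonants.add_children(stops, fricatives)
vowels = PhoneClassNode("vowels", frozenset({"aa", "iy", "uw"}))
root = PhoneClassNode("all", vowels.phones | consonants.phones)
root.add_children(vowels, consonants)

# Frames are stand-in integers here; in practice they would be feature vectors.
frames = [("s", 0), ("iy", 1), ("t", 2), ("s", 3)]
groups = group_frames_by_class(root, frames)
```

In this sketch, every labeled frame reaches the root, while deeper nodes receive only the frames of their own phones. This mirrors the idea that coarse classes retain the most data when training material is sparse, while finer classes capture more detail.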
