Scott Axelrod, Vaibhava Goel, et al.
IEEE Transactions on Audio, Speech and Language Processing
We introduce a class of Gaussian mixture models (GMMs) in which the covariances or the precisions (inverse covariances) are restricted to lie in subspaces spanned by rank-one symmetric matrices. The rank-one bases are shared between the Gaussians according to a sharing structure. We describe an algorithm for estimating the parameters of the GMM in a maximum likelihood framework given a sharing structure. We employ these models for modeling the observations in the hidden states of a hidden Markov model based speech recognition system. We show that this class of models provides improvements in accuracy and computational efficiency over well-known covariance modeling techniques such as classical factor analysis, shared factor analysis, and maximum likelihood linear transformation based models, all of which are special instances of this class. We also investigate different sharing mechanisms. We show that, for the same number of parameters, modeling precisions leads to better performance than modeling covariances. Modeling precisions also gives a distinct advantage in computational and memory requirements.
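As a rough illustration of the precision-subspace construction described in this abstract, the NumPy sketch below builds per-Gaussian precision matrices as nonnegative combinations of shared rank-one symmetric bases and evaluates a Gaussian log-density directly from the precision. All names and sizes here (A, W, d, K, the small ridge term) are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
d, K, num_gauss = 4, 6, 3          # feature dim, basis size, number of Gaussians

# Shared rank-one basis: a_k a_k^T for direction vectors a_k (rows of A).
A = rng.standard_normal((K, d))
basis = np.einsum("ki,kj->kij", A, A)               # K symmetric rank-one matrices

# Per-Gaussian nonnegative weights keep each combination positive semidefinite;
# a small ridge makes the toy precisions strictly positive definite.
W = rng.random((num_gauss, K))
precisions = np.einsum("gk,kij->gij", W, basis) + 1e-3 * np.eye(d)

means = rng.standard_normal((num_gauss, d))

def log_density(x, mu, P):
    """Log N(x; mu, P^{-1}) computed directly from the precision matrix P."""
    diff = x - mu
    sign, logdet = np.linalg.slogdet(P)
    assert sign > 0, "precision must be positive definite"
    return 0.5 * (logdet - d * np.log(2 * np.pi) - diff @ P @ diff)

x = rng.standard_normal(d)
print([log_density(x, means[g], precisions[g]) for g in range(num_gauss)])

Note that with a pure subspace precision (no ridge term), the quadratic form factorizes as diff @ P @ diff = sum over k of W[g, k] * ((A @ diff)[k]) ** 2, so the K projections A @ diff of each frame can be computed once and shared across all Gaussians; this sharing is one way to realize the computational advantage of precision modeling that the abstract mentions.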
Scott Axelrod, Vaibhava Goel, et al.
IEEE Transactions on Audio, Speech and Language Processing
Ankur Gandhe, Rashmi Gangadharaiah, et al.
IJCNLP 2011
Mahesh Viswanathan, Homayoon S.M. Beigi, et al.
IJDAR
Brian Kingsbury, Lidia Mangu, et al.
INTERSPEECH - Eurospeech 2003