Stereo-based stochastic mapping for robust speech recognition
Mohamed Afify, Xiaodong Cui, et al.
ICASSP 2007
The bilinear transformation (BT) is used for vocal tract length normalization (VTLN) in speech recognition systems. We prove two properties of the bilinear mapping that motivated the band-diagonal transform proposed in M. Afify and O. Siohan, "Constrained maximum likelihood linear regression for speaker adaptation," in Proc. ICSLP, Beijing, China, Oct. 2000. This contradicts the statement in M. Pitz and H. Ney, "Vocal tract length normalization equals linear transformation in cepstral space," IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 930-944, September 2005, that the transform of Afify and Siohan was motivated by empirical observations. © 2007 IEEE.
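As a rough illustration of the frequency warping that a bilinear transformation induces, the sketch below assumes the common first-order all-pass form, in which a normalized frequency ω (in radians, between 0 and π) maps to ω + 2·arctan(α·sin ω / (1 − α·cos ω)). The function name `bilinear_warp` and the parameter name `alpha` are illustrative and do not come from the paper, and the sign convention for the warping parameter varies between authors.

```python
import math


def bilinear_warp(omega, alpha):
    """Warp a normalized frequency omega (radians, 0..pi).

    Assumes the first-order all-pass (bilinear) warping with
    parameter alpha, |alpha| < 1. alpha = 0 leaves omega unchanged.
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy |alpha| < 1")
    # The denominator 1 - alpha*cos(omega) is always positive for |alpha| < 1,
    # so atan2 here is equivalent to a plain arctangent of the ratio.
    return omega + 2.0 * math.atan2(
        alpha * math.sin(omega), 1.0 - alpha * math.cos(omega)
    )


if __name__ == "__main__":
    # Print the warped frequency at a few points for one illustrative alpha.
    for k in range(5):
        w = k * math.pi / 4
        print(f"omega={w:.3f} -> warped={bilinear_warp(w, 0.1):.3f}")
```

Under this assumed form, the endpoints 0 and π stay fixed and the mapping is monotonic, so the warp stretches or compresses the frequency axis without reordering it.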
Ruhi Sarikaya, Bowen Zhou, et al.
ICASSP 2007
Olivier Siohan, Bhuvana Ramabhadran, et al.
ICSLP 2004
Bhuvana Ramabhadran, Olivier Siohan, et al.
ICSLP 2004