Orly Stettiner, Dan Chazan
ICPR 1994
This paper presents a novel low-complexity, frequency-domain algorithm for reconstructing speech from the mel-frequency cepstral coefficients (MFCC) commonly used by speech recognition systems, together with the pitch frequency values. The reconstruction technique is based on the sinusoidal speech representation. A set of sine-wave frequencies is derived from the pitch frequency and voicing decisions, and a synthetic phase is then assigned to each sine wave. The sine-wave amplitudes are generated by sampling a linear combination of frequency-domain basis functions. The basis function gains are chosen so that the mel-frequency binned spectrum of the reconstructed speech is similar to the mel-frequency binned spectrum obtained from the original MFCC vector by IDCT and antilog operations. This procedure yields natural-sounding, intelligible speech of good quality.
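The two most concrete steps in the abstract, recovering a mel-frequency binned spectrum from an MFCC vector by IDCT and antilog, and deriving sine-wave frequencies from the pitch, can be sketched as follows. This is a minimal illustration and not the paper's implementation. It assumes the MFCCs are an unnormalized DCT-II of the natural log of the mel bin values, that a truncated cepstrum is zero-padded before inversion, and that a voiced frame uses harmonics of the pitch up to a cutoff. The function names (`dct2`, `idct2`, `mfcc_to_mel_bins`, `harmonic_frequencies`) are hypothetical, and the basis-function fit for the amplitudes is not shown because the abstract does not specify it.

```python
import math


def dct2(x):
    """Unnormalized DCT-II (assumed here as the forward MFCC transform)."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
        for k in range(n)
    ]


def idct2(c):
    """Exact inverse of dct2 above (a scaled DCT-III)."""
    n = len(c)
    return [
        (c[0] / 2.0
         + sum(c[k] * math.cos(math.pi * k * (j + 0.5) / n) for k in range(1, n)))
        * 2.0 / n
        for j in range(n)
    ]


def mfcc_to_mel_bins(mfcc, n_bins):
    """Recover a mel-binned spectrum from an MFCC vector: IDCT, then antilog.

    A truncated MFCC vector is zero-padded to n_bins, which gives a
    smoothed version of the original binned spectrum (an assumption).
    """
    padded = list(mfcc) + [0.0] * (n_bins - len(mfcc))
    log_bins = idct2(padded)
    return [math.exp(v) for v in log_bins]


def harmonic_frequencies(pitch_hz, max_hz):
    """Sine-wave frequencies for a voiced frame: pitch harmonics up to max_hz."""
    freqs = []
    k = 1
    while k * pitch_hz <= max_hz:
        freqs.append(k * pitch_hz)
        k += 1
    return freqs


if __name__ == "__main__":
    bins = [1.0, 2.0, 4.0, 8.0]
    mfcc = dct2([math.log(b) for b in bins])
    print(mfcc_to_mel_bins(mfcc, len(bins)))    # close to the original bins
    print(harmonic_frequencies(100.0, 450.0))   # [100.0, 200.0, 300.0, 400.0]
```

With the full cepstrum the round trip reproduces the binned spectrum up to floating-point error. With fewer coefficients than bins, the result is a smoothed target that the basis function gains would then be fitted to.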
R. Donovan
ICASSP 2000
Jiri Navratil, Jan Kleindienst, et al.
ICASSP 2000
Raul Fernandez, Asaf Rendel, et al.
ICASSP 2013