Publication
ICASSP 1995
Conference paper
Context dependent phonetic duration models for decoding conversational speech
Abstract
Phonetic context was used to predict the durations of phones using a decision tree. These predictions were used to calculate context-dependent HMM transition probabilities for the phone models, which were then used to decode telephone conversations from the SwitchBoard corpus. We observed that the duration models do not appreciably improve the word error rate, and that more can be gained by modeling phone durations within words than by adjusting for local average speaking rates. We conclude that local or global variations in speaking rate are not major contributors to the high error rates observed for SwitchBoard.
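The abstract does not say how a predicted duration is turned into transition probabilities. As an illustration only, one common textbook relationship is sketched below. It assumes a left-to-right phone HMM whose states each have a single self-loop, so that the time spent in a state is geometrically distributed with mean 1/(1 - p) frames. The function names, the three-state topology, and the even split of duration across states are assumptions made for this sketch, not details taken from the paper.

```python
def self_loop_probability(predicted_frames: float, n_states: int = 3) -> float:
    """Map a predicted phone duration (in frames) to a per-state self-loop probability.

    Assumption (not from the paper): the duration is split evenly across the
    states of a left-to-right HMM, and each state's occupancy is geometric,
    so its expected occupancy is 1 / (1 - p) frames.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    # Each state must be occupied for at least one frame.
    frames_per_state = max(predicted_frames / n_states, 1.0)
    return 1.0 - 1.0 / frames_per_state


def expected_duration(p_loop: float, n_states: int = 3) -> float:
    """Expected phone duration in frames implied by a per-state self-loop probability."""
    if not 0.0 <= p_loop < 1.0:
        raise ValueError("p_loop must be in [0, 1)")
    return n_states / (1.0 - p_loop)


if __name__ == "__main__":
    # Hypothetical decision-tree output: a phone predicted to last 9 frames.
    p = self_loop_probability(9.0)
    print(f"self-loop probability: {p:.3f}")
    print(f"implied mean duration: {expected_duration(p):.1f} frames")
```

Under these assumptions, a longer predicted duration yields a larger self-loop probability, and predictions shorter than the number of states are clamped to the minimum of one frame per state.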