Publication
ICASSP 2007
Conference paper
Segmental modeling for audio segmentation
Abstract
Trainable speech/non-speech segmentation and music detection algorithms usually consist of a frame-based scoring phase combined with a smoothing phase. This paper suggests a framework in which both phases are explicitly unified in a segment-based classifier. We propose a novel segment-based generative model in which audio segments are represented as supervectors and each class (speech, silence, music) is modeled by a distribution over the supervector space. Segment classes can then be modeled by generative models such as GMMs, or classified with SVMs. The proposed framework leads to a significant reduction in error rate. © 2007 IEEE.
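For concreteness, the sketch below illustrates one plausible reading of segment-level classification with supervectors, not the paper's exact recipe: each segment is summarized by a fixed-length supervector (here simply the stacked per-dimension mean and standard deviation of its frame features, an assumption for illustration), and one GMM per class (speech, silence, music) is fit over the supervector space. The function names, the supervector construction, and the GMM settings are all hypothetical choices; frame features (e.g., MFCCs) and segment boundaries are assumed to be computed elsewhere.

```python
# Minimal sketch, assuming segment-level supervectors built from
# stacked frame-feature statistics and one GMM per audio class.
# This is an illustrative reconstruction, not the paper's method.
import numpy as np
from sklearn.mixture import GaussianMixture

def segment_supervector(frames: np.ndarray) -> np.ndarray:
    """Map a (num_frames, feat_dim) block of frame features to a
    fixed-length supervector by stacking per-dimension statistics."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

def train_class_gmms(segments_by_class, n_components=4, seed=0):
    """Fit one GMM per class over that class's segment supervectors.
    segments_by_class: dict mapping class name -> list of frame arrays."""
    gmms = {}
    for label, segments in segments_by_class.items():
        X = np.stack([segment_supervector(s) for s in segments])
        gmms[label] = GaussianMixture(n_components=n_components,
                                      covariance_type="diag",
                                      random_state=seed).fit(X)
    return gmms

def classify_segment(frames, gmms):
    """Assign a segment to the class whose GMM gives the highest
    log-likelihood for the segment's supervector."""
    v = segment_supervector(frames).reshape(1, -1)
    return max(gmms, key=lambda label: gmms[label].score(v))
```

An SVM variant, as mentioned in the abstract, would replace the per-class GMMs with a single discriminative classifier trained directly on the supervectors.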