Publication
ICASSP 2004
Conference paper
Speech discrimination based on multiscale spectro-temporal modulations
Abstract
A novel approach to content-based audio classification is presented, based on multiscale spectro-temporal modulation features extracted using a model of the auditory cortex. The task is to discriminate speech from non-speech, which consists of animal vocalizations, music, and environmental sounds. Generalization of the system to signals with high levels of additive noise and reverberation is evaluated and compared to two existing approaches. The results demonstrate the advantages of the auditory model over the other two systems, especially at low SNRs and high reverberation.