Speech discrimination based on multiscale spectro-temporal modulations

Nima Mesgarani; Shihab Shamma; Malcolm Slaney

ICASSP 2004

Conference paper

28 Sep 2004

Abstract

A novel approach to content-based audio classification is presented, based on multiscale spectro-temporal modulation features extracted using a model of the auditory cortex. The task is to discriminate speech from non-speech, where the non-speech class consists of animal vocalizations, music, and environmental sounds. The system's generalization to signals with high levels of additive noise and reverberation is evaluated and compared with two existing approaches. The results demonstrate the advantages of the auditory model over the other two systems, especially at low SNRs and high reverberation.
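The additive-noise condition mentioned above can be illustrated with a minimal sketch of mixing noise into a clean signal at a chosen SNR. This is not the authors' evaluation code: the function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sampling rate, the 440 Hz test tone, and the use of white Gaussian noise are all assumptions made for illustration only.

```python
import math
import random


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mixture has the requested SNR (in dB), then add it.

    Hypothetical helper; uses SNR_dB = 10 * log10(P_signal / P_noise).
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise must have non-zero power")
    # Gain that brings the noise power to P_signal / 10^(SNR_dB / 10).
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR of a mixture, given the clean reference signal."""
    p_signal = sum(x * x for x in clean) / len(clean)
    p_resid = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_resid)


# Assumed test material: one second of a 440 Hz tone at 16 kHz, white noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

noisy_0db = mix_at_snr(clean, noise, snr_db=0.0)
noisy_10db = mix_at_snr(clean, noise, snr_db=10.0)
```

Sweeping `snr_db` over a range of values would produce the kind of progressively degraded test set on which a speech/non-speech classifier's robustness can be compared; reverberation would need a separate step, such as convolution with a room impulse response, which is not shown here.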