Publication
ICASSP 2015
Conference paper
Data augmentation for deep convolutional neural network acoustic modeling
Abstract
This paper investigates data augmentation based on label-preserving transformations for deep convolutional neural network (CNN) acoustic modeling to deal with limited training data. We show how stochastic feature mapping (SFM) can be carried out when training CNN models with log-Mel features as input and compare it with vocal tract length perturbation (VTLP). Furthermore, a two-stage data augmentation scheme with a stacked architecture is proposed to combine VTLP and SFM as complementary approaches. Improved performance has been observed in experiments conducted on the limited language pack (LLP) of Haitian Creole in the IARPA Babel program.
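As a rough illustration of the VTLP idea mentioned in the abstract, the sketch below applies a piecewise-linear frequency warp of the kind commonly described for VTLP, with a warp factor drawn at random per utterance. It is not taken from the paper: the function names, the 8 kHz sampling rate, the 3400 Hz boundary frequency, and the 0.9 to 1.1 warp range are all assumptions chosen for illustration, and the paper's exact variant may differ.

```python
import random


def vtlp_warp(freq_hz, alpha, sample_rate=8000.0, f_hi=3400.0):
    """Warp one frequency (Hz) with a piecewise-linear VTLP-style map.

    alpha       -- warp factor (1.0 means no warping)
    sample_rate -- assumed sampling rate; Nyquist is sample_rate / 2
    f_hi        -- assumed boundary frequency, must be below Nyquist
    """
    nyquist = sample_rate / 2.0
    # Below this breakpoint the frequency axis is simply scaled by alpha.
    breakpoint_hz = f_hi * min(alpha, 1.0) / alpha
    if freq_hz <= breakpoint_hz:
        return freq_hz * alpha
    # Above it, a second linear segment joins the scaled breakpoint to
    # (nyquist, nyquist), so the warped axis still ends at Nyquist.
    slope = (nyquist - f_hi * min(alpha, 1.0)) / (nyquist - breakpoint_hz)
    return nyquist - slope * (nyquist - freq_hz)


def sample_warp_factor(rng, low=0.9, high=1.1):
    """Draw a warp factor uniformly from an assumed range."""
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = sample_warp_factor(rng)
    # Hypothetical centre frequencies of a few filterbank channels.
    centres = [250.0, 1000.0, 2000.0, 3500.0]
    warped = [vtlp_warp(f, alpha) for f in centres]
    print(alpha, warped)
```

In a setup like this, the warped frequencies would be used to reposition the Mel filterbank before computing log-Mel features, so each training utterance yields a slightly different, label-preserving view of the same speech.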