Jihun Yun, Aurelie Lozano, et al.
NeurIPS 2021
In this paper, we present a novel bilevel optimization-based approach to training acoustic models for automatic speech recognition (ASR), which we term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST performs a lower-level and an upper-level optimization with an unsupervised loss and a supervised loss, respectively, leveraging recent advances in penalty-based bilevel optimization to solve this challenging ASR problem with affordable complexity and rigorous convergence guarantees. To evaluate BL-JUST, we conducted extensive experiments on the LibriSpeech and TED-LIUM v2 datasets. BL-JUST achieves superior performance over the commonly used strategy of pre-training followed by fine-tuning.
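As a rough illustration of the structure described above, a generic penalty-based bilevel problem can be written as follows. The symbols $\theta$ (model parameters), $\mathcal{L}_{\mathrm{sup}}$ (supervised loss), $\mathcal{L}_{\mathrm{unsup}}$ (unsupervised loss), and the penalty weight $\gamma$ are notation assumed here for illustration only; the paper's exact formulation, including how parameters are split between the two levels, may differ.

```latex
% Bilevel form (assumed notation): the upper level minimizes the supervised
% loss over parameters that also minimize the unsupervised loss.
\min_{\theta} \; \mathcal{L}_{\mathrm{sup}}(\theta)
\quad \text{s.t.} \quad
\theta \in \arg\min_{\theta'} \; \mathcal{L}_{\mathrm{unsup}}(\theta')

% Penalty-based single-level surrogate: the lower-level optimality gap
% is added to the upper-level objective with weight \gamma.
\min_{\theta} \; \mathcal{L}_{\mathrm{sup}}(\theta)
+ \gamma \Bigl( \mathcal{L}_{\mathrm{unsup}}(\theta)
- \min_{\theta'} \mathcal{L}_{\mathrm{unsup}}(\theta') \Bigr),
\qquad \gamma > 0
```

The penalty term is nonnegative and vanishes exactly when $\theta$ minimizes the unsupervised loss, so a larger $\gamma$ enforces the lower-level condition more strictly while keeping the problem amenable to gradient-based training.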
Imran Nasim, Michael E. Henderson
Mathematics
Ge Gao, Xi Yang, et al.
AAAI 2024
Smit Marvaniya, Jitendra Singh, et al.
ICASSP 2024