Naiyu Yin, Hanjing Wang, et al.
CVPR 2026
Pronunciation modeling in automatic speech recognition systems has had mixed results in the past; one likely reason for poor performance is the increased confusability in the lexicon from adding new pronunciation variants. In this work, we propose a new framework for determining lexically confusable words based on inverted finite state transducers (FSTs); we also present experiments designed to test some of the implementation details of this framework. The method is evaluated by examining how well the algorithm predicts the errors in an ASR system. The model is able to generalize confusions learned from a training set to predict errors made by the speech recognizer on an unseen test set. © 2005 Elsevier B.V. All rights reserved.
Luís Henrique Neves Villaça, Sean Wolfgand Matsui Siqueira, et al.
SBSI 2023
Mahesh Viswanathan, Homayoon S.M. Beigi, et al.
ICDAR 1999
C. Neti, Salim Roukos
ASRU 1997