Progress in dynamic network decoding
Abstract
We show how we boosted the efficiency of the dynamic network decoder in IBM's Attila speech recognition framework by changing the underlying search concept from token passing to word-conditioned search and by adding speedup methods such as sparse LM look-ahead. On several different tasks, we achieve efficiency improvements of 30 to 50% at equal precision. We compare the efficiency to that of a state-of-the-art WFST-based static decoder and note that the added methods improve the dynamic decoder under the conditions where it previously lagged behind the static decoder, specifically when using a relatively small LM. Overall, the new dynamic decoder performs similarly to the static decoder, with a lead for the dynamic decoder on tasks with a larger LM and a lead for the static decoder on tasks with a smaller LM. © 2014 IEEE.