About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
ICASSP 2011
Conference paper
The IBM 2009 GALE Arabic speech transcription system
Abstract
We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include improvements to context-dependent modeling in vowelized Arabic acoustic models; the use of neural-network features provided by the International Computer Science Institute; Model M language models; a neural network language model that uses syntactic and morphological features; and improvements to our system combination strategy. These advances were instrumental in achieving a word error rate of 8.9% on the Phase 4 evaluation set, and an absolute improvement of 1.6% word error rate over our 2008 system on the unsequestered Phase 3.5 evaluation data. © 2011 IEEE.