The IBM 2011 GALE Arabic speech transcription system

Lidia Mangu; Hong-Kwang Kuo; Stephen Chu; Brian Kingsbury; George Saon; Hagen Soltau; Fadi Biadsy

doi:10.1109/ASRU.2011.6163943

ASRU 2011

Conference paper

01 Dec 2011

Abstract

We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 5 machine translation evaluation. Key advances over our Phase 4 system include a new Bayesian Sensing HMM acoustic model; multistream neural network features; a MADA vowelized acoustic model; and the use of a variety of language model techniques with significant additive gains. These advances were instrumental in achieving a word error rate of 7.4% on the Phase 5 evaluation set, and a 0.9% absolute reduction in word error rate relative to our 2009 system on the unsequestered Phase 4 evaluation data. © 2011 IEEE.
