Syntax based reordering with automatically derived rules for improved statistical machine translation

Karthik Visweswariah; Jiri Navratil; Jeffrey Sorensen; Vijil Chenthamarakshan; Nanda Kambhatla

COLING 2010

Conference paper

01 Dec 2010

Abstract

Syntax-based reordering has been shown to be an effective way of handling word order differences between source and target languages in Statistical Machine Translation (SMT) systems. We present a simple, automatic method for learning rules that reorder source sentences to more closely match the target language word order, using only a source-side parse tree and automatically generated alignments. The resulting rules are applied to source language inputs as a pre-processing step and yield significant improvements in SMT systems across a variety of language pairs, including English to Hindi, English to Spanish, and English to French, as measured on a variety of internal test sets as well as a public test set.
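To make the idea in the abstract concrete, here is a minimal sketch of learning and applying reordering rules from a source-side parse and word alignments. It is an illustration under stated assumptions, not the authors' method: the rule format (one child permutation per parent label and child-label sequence), the "order children by mean aligned target position" heuristic, and the names `Node`, `learn_rules`, and `reorder` are all hypothetical, and the toy sentence and alignment are invented for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    """Parse tree node; leaves carry a word and its source position."""
    label: str
    children: list = field(default_factory=list)
    word: str = None
    index: int = None


def leaves(node):
    """Return the leaf nodes under `node` in source order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def learn_rules(examples):
    """Learn one child permutation per (parent label, child labels).

    `examples` is an iterable of (tree, alignment) pairs, where
    `alignment` maps a source index to a list of target indices.
    Assumption: children are ordered by the mean target position of
    their aligned words, and the most frequent order wins.
    """
    counts = defaultdict(Counter)
    for tree, alignment in examples:
        stack = [tree]
        while stack:
            node = stack.pop()
            stack.extend(node.children)
            if len(node.children) < 2:
                continue
            means = []
            for child in node.children:
                tgt = [t for leaf in leaves(child)
                       for t in alignment.get(leaf.index, [])]
                if not tgt:
                    break  # skip nodes with an unaligned child
                means.append(sum(tgt) / len(tgt))
            else:
                order = tuple(sorted(range(len(means)),
                                     key=means.__getitem__))
                key = (node.label,
                       tuple(c.label for c in node.children))
                counts[key][order] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def reorder(node, rules):
    """Apply learned permutations top-down; return the reordered words."""
    if not node.children:
        return [node.word]
    key = (node.label, tuple(c.label for c in node.children))
    order = rules.get(key, range(len(node.children)))
    return [w for i in order for w in reorder(node.children[i], rules)]


# Toy example: "he eats apples", aligned to a hypothetical verb-final
# target sentence (subject at 0, object at 1, verb at 2 and 3).
tree = Node("S", [
    Node("PRP", word="he", index=0),
    Node("VP", [
        Node("VBZ", word="eats", index=1),
        Node("NN", word="apples", index=2),
    ]),
])
alignment = {0: [0], 1: [2, 3], 2: [1]}

rules = learn_rules([(tree, alignment)])
reordered = reorder(tree, rules)
print(reordered)
```

In this sketch the reordered source sentence would then be passed to the SMT system in place of the original input. A practical system would need to handle unaligned words, rule sparsity, and conflicting evidence far more carefully than a simple majority vote does.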
