Reducing Asymmetry between Language-Pairs to Improve Alignment and Translation Quality
Abstract
This paper presents a novel method to remove asymmetry between the source and target languages, thereby improving alignment and machine translation (MT) quality. Some words in the source language are redundant for MT but are necessary for the source sentence to be grammatical. The paper proposes a method to detect such words automatically. In addition, the constraints under which these words should or should not be removed are extracted automatically from the target language. For test sentences, a lattice scheme provides alternate paths with and without these words. This constraint-based removal technique yields a statistically significant improvement (p < 0.001) of 5.29 BLEU points over a baseline phrase-based MT system for the English-Hindi language pair.
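The lattice idea can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes that the removable word positions have already been identified, and it enumerates every path in which each flagged word is either kept or skipped. The function name `lattice_paths`, the example sentence, and the choice of "the" as a removable word are all hypothetical; the abstract does not state which words the method flags.

```python
from itertools import product


def lattice_paths(tokens, removable):
    """Enumerate all paths through a simple keep-or-skip lattice.

    tokens    -- list of source-side words
    removable -- set of token indices flagged as candidates for removal
                 (hypothetical input; the detection step is not shown)
    """
    # Each position offers one or two arcs: the token itself, plus an
    # empty (skip) arc, encoded as None, when the token is removable.
    options = [
        (tok, None) if i in removable else (tok,)
        for i, tok in enumerate(tokens)
    ]
    # One path per combination of arcs; skipped tokens are dropped.
    return [
        [tok for tok in choice if tok is not None]
        for choice in product(*options)
    ]


# Hypothetical example: index 4 ("the") is flagged as removable.
sentence = "he is going to the market".split()
paths = lattice_paths(sentence, removable={4})
for path in paths:
    print(" ".join(path))
```

With k flagged positions the sketch produces 2^k paths, so a practical system would pass a compact lattice to the decoder rather than an explicit list of sentences.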