Publication
NAACL-HLT 2013
Conference paper
Discriminative training of 150 million translation parameters and its application to pruning
Abstract
Until recently, the application of discriminative training to log-linear based statistical machine translation has been limited to tuning the weights of a small number of features or to training features with a small number of parameters. In this paper, we propose to scale up the discriminative training approach of He and Deng (2012) to train features with 150 million parameters, an order of magnitude more than previously published efforts, and to apply discriminative training to redistribute probability mass that is lost due to model pruning. The experimental results confirm the effectiveness of our proposals on the NIST MT06 set over a strong baseline.
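The pruning problem the abstract refers to can be made concrete with a toy example. The sketch below is not the paper's method (which redistributes the lost mass via discriminative training); it only illustrates, with hypothetical numbers, how pruning a conditional translation distribution p(e|f) discards probability mass, and shows naive renormalization as a baseline for comparison.

```python
# Toy illustration (hypothetical values, not from the paper): pruning a
# conditional phrase-translation table p(e | f) drops low-probability
# entries, so the surviving probabilities no longer sum to one.

# Hypothetical translation distribution for one source phrase f.
p_e_given_f = {
    "house": 0.50,
    "home": 0.30,
    "household": 0.15,
    "dwelling": 0.05,
}

THRESHOLD = 0.10  # assumed pruning threshold for this illustration

# Prune entries below the threshold.
pruned = {e: p for e, p in p_e_given_f.items() if p >= THRESHOLD}

# Measure the probability mass lost to pruning.
lost_mass = 1.0 - sum(pruned.values())
print(f"mass lost to pruning: {lost_mass:.2f}")  # -> 0.05

# Naive baseline: renormalize so the retained entries sum to one.
# The paper instead learns how to redistribute this mass discriminatively.
total = sum(pruned.values())
renormalized = {e: p / total for e, p in pruned.items()}
print(renormalized)
```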