Shuang Chen, Herbert Freeman
International Journal of Pattern Recognition and Artificial Intelligence
This paper extends previous work on extracting parallel sentence pairs from comparable data (Munteanu and Marcu, 2005). For a given source sentence S, a maximum entropy (ME) classifier is applied to a large set of candidate target translations. A beam-search algorithm discards target sentences as non-parallel early in classification if they fall outside the beam. In this way, our novel algorithm avoids any document-level pre-filtering step. The algorithm significantly increases the number of extracted parallel sentence pairs, which leads to a BLEU improvement of about 1% on our Spanish-English data. © 2009 ACL and AFNLP.
Robert Farrell, Rajarshi Das, et al.
AAAI-SS 2010
Wei Zhang, Timothy Wood, et al.
ICAC 2014
Chen-chia Chang, Wan-hsuan Lin, et al.
ICML 2025