Improved Monolingual Hypothesis Alignment for Machine Translation System Combination

  • Xiaodong He
  • Mei Yang
  • Patrick Nguyen
  • Robert Moore

ACM Transactions on Asian Language Information Processing -- special issue on Machine Translation of Asian Languages

This paper presents a new hypothesis alignment method for combining the outputs of multiple machine translation (MT) systems. An indirect hidden Markov model (IHMM) is proposed to address the synonym matching and word ordering issues in hypothesis alignment. Unlike traditional HMMs, whose parameters are trained via maximum likelihood estimation (MLE), the parameters of the IHMM are estimated indirectly from a variety of sources, including word semantic similarity, word surface similarity, and a distance-based distortion penalty. The IHMM-based method significantly outperforms the state-of-the-art TER-based alignment model in our experiments on NIST benchmark datasets. Our combined SMT system using the proposed method achieved the best Chinese-to-English translation result in the constrained training track of the 2008 NIST Open MT Evaluation.
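
To make the idea of indirectly estimated parameters concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes that backbone positions act as hidden states and hypothesis words as observations, that the emission score is a linear interpolation of a semantic score and a surface score, and that the transition score decays with the distance from a monotone one-word step. The function names, the interpolation weight `alpha`, the decay exponent `k`, the use of `difflib` for surface similarity, and the hand-supplied `semantic` table are all illustrative assumptions; none of them come from the abstract.

```python
import math
from difflib import SequenceMatcher


def surface_similarity(a, b):
    """Character-level similarity in [0, 1] (assumed stand-in: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def emission(backbone_word, hyp_word, semantic, alpha=0.5):
    """Interpolate a supplied semantic score with the surface score (assumed form)."""
    sem = semantic.get((backbone_word, hyp_word), 0.0)
    return alpha * sem + (1.0 - alpha) * surface_similarity(backbone_word, hyp_word)


def transition(prev_i, i, n, k=2.0):
    """Distance-based distortion: jumps far from the monotone step +1 cost more."""
    weights = [(1.0 + abs(j - prev_i - 1)) ** -k for j in range(n)]
    return weights[i] / sum(weights)


def align(backbone, hypothesis, semantic=None, alpha=0.5, k=2.0, floor=1e-6):
    """Viterbi search: return one backbone position per hypothesis word."""
    semantic = semantic or {}
    n = len(backbone)

    def log_emit(i, word):
        # Floor the score so that log() is always defined.
        return math.log(max(emission(backbone[i], word, semantic, alpha), floor))

    # Uniform start: the first column holds emission scores only.
    score = [log_emit(i, hypothesis[0]) for i in range(n)]
    backpointers = []
    for word in hypothesis[1:]:
        new_score, pointers = [], []
        for i in range(n):
            best_prev = max(
                range(n),
                key=lambda p: score[p] + math.log(transition(p, i, n, k)),
            )
            step = math.log(transition(best_prev, i, n, k))
            new_score.append(score[best_prev] + step + log_emit(i, word))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path back from the highest-scoring final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


backbone = ["he", "bought", "a", "car"]
hypothesis = ["he", "purchased", "a", "car"]
# Hypothetical semantic score for a synonym pair; a real system would derive it from data.
semantic = {("bought", "purchased"): 0.9}
path = align(backbone, hypothesis, semantic)
```

In this toy example, "purchased" has little surface overlap with "bought", so the semantic table is what pulls it onto backbone position 1, while the distortion term keeps the remaining words in monotone order. No parameter here is trained by maximum likelihood; each score is computed directly from a similarity or distance measure, which is the sense of "indirect" the sketch tries to illustrate.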