Bayesian Learning of Non-Compositional Phrases with Synchronous Parsing

Hao Zhang; Chris Quirk; Bob Moore; Daniel Gildea

Proceedings of ACL-08: HLT | April 2008

Published by Association for Computational Linguistics

We combine the strengths of Bayesian modeling and synchronous grammar in unsupervised learning of basic translation phrase pairs. The structured space of a synchronous grammar is a natural fit for phrase pair probability estimation, though the search space can be prohibitively large. We therefore explore efficient algorithms for pruning this space that lead to empirically effective results. Incorporating a sparse prior using Variational Bayes biases the models toward generalizable, parsimonious parameter sets, leading to significant improvements in word alignment. This preference for sparse solutions, together with effective pruning methods, forms a phrase alignment regimen that produces better end-to-end translations than standard word alignment approaches.