DATE: | Thursday, January 17, 2013 |
TIME: | 4:00 pm |
PLACE: | Council Room (SITE 5-084) |
TITLE: | Still Phrase-Based and Proud of it: NRC's Participation in NIST OpenMT 2012 |
PRESENTER: | George Foster, National Research Council Canada (NRC) |
ABSTRACT:
NIST OpenMT is the longest-running series of evaluations of Machine Translation technology, and one of the most important, attracting participation from many top MT groups. In 2012, NRC's Portage entry did very well in this evaluation (http://www.nist.gov/itl/iad/mig/openmt12results.cfm). This result was quite surprising, since Portage is a phrase-based system that uses no linguistic syntax, and syntax is thought to be advantageous for language pairs such as Chinese-English. In this talk, I will describe the Portage NIST system, focusing on four techniques that were key to its success: batch MIRA tuning, hierarchical reordering, diverse phrase extraction, and domain adaptation using mixture models. |