|DATE:||Thursday, March 14, 2013|
|PLACE:||Council Room (SITE 5-084)|
|TITLE:||FUN-NRC's Paraphrase-augmented Phrase-based SMT for NTCIR-10 Patent MT OpenMT 2012|
|PRESENTER:||Atsushi FUJITA, Future University Hakodate|
I will report on the results of the recent Japanese-English patent MT evaluation campaign, "NTCIR-10 Patent MT". Our systems were developed on reasonably large-scale in-domain bilingual and monolingual corpora using NRC's statistical machine translation (SMT) system, PortageII 1.0. During system development, we first verified the effectiveness of Portage's various features and obtained a strong baseline system. We then augmented the systems with paraphrases: we investigated several sets of sub-sentential paraphrases, and ways of integrating them on both the source and target sides, for both the Japanese-to-English (JE) and English-to-Japanese (EJ) directions, and submitted the two best systems. The official results distributed to participants indicate a significant gap between our systems and the top scorers. However, follow-up experiments focusing on a reordering parameter revealed that, according to automatic measures, our systems can be comparable to some of the other superior systems (individual systems, not system combinations), including those that deal with sentence structures and manually tailored rule-based systems.