Third Machine Translation Marathon

Prague, Czech Republic

26-30 January 2009

 

Winter School

 

Contents

 

programme; complete slides

 

 

Monday 26 January

 

Adam Lopez: Statistical machine translation.

 

Adam Lopez: Evaluating translation quality.

 

Alon Lavie: Stat-XFER: a general framework for search-based syntax-driven MT. 

 

Jonathan Clark & Greg Hanneman: Czech-to-English translation.

 

Ondřej Bojar: TectoMT for plaintext freaks.

 

 

Tuesday 27 January

 

Barry Haddow: Word-based models and the EM algorithm.

 

Francis M. Tyers & Kevin Donnelly: apertium-cy: a collaboratively-developed RBMT system for Welsh to English. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]  [presentation]

 

João Graça, Kuzman Ganchev & Ben Taskar: PostCAT – posterior constrained alignment toolkit. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]

 

Antal van den Bosch & Peter Berck: Memory based machine translation and language modelling. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]  [presentation]

 

 

Wednesday 28 January

 

Chris Dyer: Decoding: phrase-based models. 

 

Zhifei Li, Chris Callison-Burch, Sanjeev Khudanpur, & Wren Thornton: Decoding in Joshua: open source, parsing-based machine translation. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]

 

Omar F. Zaidan: Z-MERT: a fully configurable open source tool for minimum error rate training of machine translation systems. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]

 

Ashish Venugopal & Andreas Zollmann: Grammar based statistical MT on Hadoop: an end-to-end toolkit for large scale PSCFG based MT. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]

 

Hieu Hoang & Josh Schroeder: Moses installation and training run-through.

 

 

Thursday 29 January

 

Zdeněk Žabokrtský: TectoMT: software framework for developing MT systems (and other NLP applications).

 

David Mareček: Analysis and alignment of parallel data in TectoMT.

 

Ondřej Bojar: Bad news, NLP hacking and feature fishing.

 

Jana Kravalová: TectoMT tutorial. 

 

Yvette Graham & Josef van Genabith: An open source rule induction tool for transfer-based SMT. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]

 

Ventsislav Zhechev: Unsupervised generation of parallel treebanks through sub-tree alignment. [published in Prague Bulletin of Mathematical Linguistics 91, 2009]  [presentation]

 

 

Friday 30 January

 

Philipp Koehn: Discriminative training and factored translation models. 

 

Panel: future of MT Marathon [not available]

 

 

Other presentations:

 

Adam Lopez: Introduction to statistical machine translation.

 

Hieu Hoang, Barry Haddow, & Abhishek Arun: Using factored models and MERT in Moses. [not available]

 

Minimum error rate training lab.

 

Michal Hrusecky, Tomas Caithaml, & Chris Dyer: Feature function overhaul.