Machine Translation Marathon 2012

September 3-8, Edinburgh, UK


Talks and lectures

Monday, 3rd September

Lecture: Introduction to Statistical Machine Translation
Chris Dyer

TrTok: A Fast and Trainable Tokenizer for Natural Languages
Jiří Maršík and Ondřej Bojar

High-Precision Sentence Alignment by Bootstrapping from Wood Standard Annotations [not available]
Eva Mújdricza-Maydt, Huiqin Körkel-Qu, Stefan Riezler and Sebastian Padó

Invited Talk: Discourse and SMT: Where and How?
Bonnie Webber

Labs: Moses and the Experiment Management System
Barry Haddow and Philipp Koehn


Tuesday, 4th September

Lecture: Word-based Models
Colin Cherry

Lecture: Phrase-based Models
Hieu Hoang

Phrasal Rank-Encoding: Exploiting Phrase Redundancy and Translational Relations for Phrase Table Compression
Marcin Junczys-Dowmunt

Parallel Phrase Scoring for Extra-large Corpora
Mohammed Mediani, Jan Niehues and Alex Waibel

Invited Talk: Large Scale Parallel Data-mining for Google Translate [not available]
Arne Mauser

Labs: Alignment [not available]
Marcello Federico, Colin Cherry and Dave Matthews


Wednesday, 5th September

Lecture: Decoding for Phrase-based Models
Colin Cherry

Lecture: Language Modelling
Marcello Federico

pycdec: A Python Interface to cdec
Victor Chahuneau, Noah A. Smith and Chris Dyer

Better Splitting Algorithms for Parallel Corpus Processing
Lane Schwartz

Invited Talk: Translation Process Research and the CRITT TPR Database
Michael Carl


Thursday, 6th September

Lecture: Hierarchical and Syntactic Models
Phil Blunsom

Lecture: Chart-based Decoding
Kenneth Heafield

Hierarchical Phrase-Based Translation with Jane 2
Matthias Huck, Jan-Thorsten Peter, Markus Freitag, Stephan Peitz and Hermann Ney [presentation]

Extending Hiero Decoding in Moses with Cube Growing
Wenduan Xu and Philipp Koehn

Invited Talk: MT R&D in Academia and Industry: Observations from the Trenches
Andy Way


Labs: Decoding [not available]


Discussion: The Future of Open-Source Machine Translation [not available]
Marcello Federico


Friday, 7th September

Lecture: Discriminative Training
Chris Dyer

Lecture: Computer Aided Translation
Philipp Koehn

Appraise: an Open-Source Toolkit for Manual Evaluation of MT Output
Christian Federmann [presentation]

DELiC4MT: A Tool for Diagnostic MT Evaluation over User-defined Linguistic Phenomena
Antonio Toral, Sudip Kumar Naskar, Federico Gaspari and Declan Groves

Invited Talk: Quality Estimation for MT: State of the Art and Challenges
Lucia Specia


Diagnostic evaluation of MT with DELiC4MT

Walid Aransa, Luong Ngoc Quang, & Antonio Toral


Document-level decoding in Moses

Nicola Bertoldi, Robert Grabowski, Liane Guillou, Michal Novak, Sorin Slavescu, Jose de Souza


Multiple reference translations for European languages

Christian Buck, Daniel Zeman, Eva Hasler


Sparse features in Moses

Colin Cherry, Barry Haddow


Building Moses training pipelines with Arrows

Jie Jiang, David Kolovratnik, Ian Johnson


Sparse features in Joshua

Matt Post, Juri Ganitkevitch


Bounded-memory language model building

Ivan Pouzyrevsky, Mohammed Mediani, Kenneth Heafield


Parallel corpus extraction from CommonCrawl

Hervé Saint-Amand, Jason Smith, Magdalena Plamada


New development functionality for the Asiya Suite: parameter optimization with MERT

Meritxell Gonzàlez, Cristina España-Bonet