Proceedings of the

International Workshop on

Spoken Language Translation

December 8 and 9, 2011

San Francisco, USA

Edited by

Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker


Table of contents


Keynotes                                                                                                                                                             8


Data-intensive approaches for ASR

Sadaoki Furui


Meaning-equivalent semantics for understanding, generation, translation, and evaluation

Daniel Marcu


Resource-rich research on natural language processing and understanding

Junichi Tsujii


Overview of the IWSLT 2011 evaluation campaign.                                                                                                   11-27

Marcello Federico, Luisa Bentivogli, Michael Paul, Sebastian Stüker


The NICT ASR system for IWSLT2011                                                                                                                          28-33

Kazuhiko Abe, Youzheng Wu, Chien-lin Huang, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka


The MIT-LL/AFRL IWSLT-2011 MT system.                                                                                                              34-40

A. Ryan Aminzadeh, Tim Anderson, Ray Slyh, Brian Ore, Eric Hansen, Wade Shen, Jennifer Drexler, Terry Gleason


The DCU machine translation systems for IWSLT 2011.                                                                                           41-48

Pratyush Banerjee, Hala Almaghout, Sudip Naskar, Johann Roturier, Jie Jiang, Andy Way, Josef van Genabith


The NICT translation system for IWSLT 2011.                                                                                                             49-56

Andrew Finch, Chooi-Ling Goh, Graham Neubig, Eiichiro Sumita


The MSR system for IWSLT 2011 evaluation.                                                                                                             57-61

Xiaodong He, Amittai Axelrod, Li Deng, Alex Acero, Mei-Yuh Hwang, Alisa Nguyen, Andrew Wang, Xiahui Huang


LIMSI’s experiments in domain adaptation for IWSLT11.                                                                                           62-67

Thomas Lavergne, Alexandre Allauzen, Hai-Son Le, François Yvon


LIG English-French spoken language translation system for IWSLT 2011                                                              68-72

Benjamin Lecouteux, Laurent Besacier, Hervé Blanchon


The KIT English-French translation systems for IWSLT 2011.                                                                                  73-78

Mohammed Mediani, Eunah Cho, Jan Niehues, Teresa Herrmann, Alex Waibel


LIUM’s systems for the IWSLT 2011 speech translation tasks.                                                                                79-85

Anthony Rousseau, Fethi Bougares, Paul Deléglise, Holger Schwenk, Yannick Estève


FBK@IWSLT 2011.                                                                                                                                                            86-93

N. Ruiz, A. Bisazza, F. Brugnara, D. Falavigna, D. Giuliani, S. Jaber, R. Gretter, M. Federico


The 2011 KIT English ASR system for the IWSLT evaluation.                                                                                  94-97

Sebastian Stüker, Kevin Kilgour, Christian Saam, Alex Waibel


DFKI's SC and MT submissions to IWSLT 2011.                                                                                                          98-105

David Vilar, Eleftherios Avramidis, Maja Popović, Sabine Hunsicker


The RWTH Aachen machine translation system for IWSLT 2011.                                                                         106-113

Joern Wuebker, Matthias Huck, Saab Mansour, Markus Freitag, Minwei Feng, Stephan Peitz, Christoph Schmidt, Hermann Ney


Advances on spoken language translation in the Quaero program.                                                                          114-120

Karim Boudahmane, Bianka Buschbeck, Eunah Cho, Josep Maria Crego, Markus Freitag, Thomas Lavergne, Hermann Ney, Jan Niehues, Stephan Peitz, Jean Senellart, Artem Sokolov, Alex Waibel, Tonio Wandmacher, Joern Wuebker, François Yvon


Speech recognition for machine translation in Quaero.                                                                                               121-128

Lori Lamel, Sandrine Courcinous, Julien Despres, Jean-Luc Gauvain, Yvan Josse, Kevin Kilgour, Florian Kraft, Viet Bac Le, Hermann Ney, Markus Nußbaum-Thom, Ilya Oparin, Tim Schlippe, Ralf Schlüter, Tanja Schultz, Thiago Fraga da Silva, Sebastian Stüker, Martin Sundermeyer, Bianca Vieru, Ngoc Thang Vu, Alexander Waibel, Cécile Woehrling


Protocol and lessons learnt from the production of parallel corpora for the evaluation of speech translation systems.                129-135

Victoria Arranz, Olivier Hamon, Karim Boudahmane, Martine Garnier-Rizet


Fill-up versus interpolation methods for phrase-based SMT adaptation                                                                 136-143

Arianna Bisazza, Nick Ruiz, Marcello Federico


Semantic smoothing and fabrication of phrase pairs for SMT.                                                                                 144-150

Boxing Chen, Roland Kuhn, George Foster


SCFG latent annotation for machine translation.                                                                                                         151-158

Tagyoung Chung, Licheng Fang, Daniel Gildea


Long-distance hierarchical structure transformation rules utilizing function words.                                              159-166

Chenchen Ding, Takashi Inui, Mikio Yamamoto


Investigation of the effects of ASR tuning on speech translation performance.                                                    167-174

Paul R. Dixon, Andrew Finch, Chiori Hori, Hideki Kashioka


Extending a probabilistic phrase alignment approach for SMT.                                                                                175-182

Mridul Gupta, Sanjika Hewavitharana, Stephan Vogel


Left language model state for syntactic machine translation.                                                                                   183-190

Kenneth Heafield, Hieu Hoang, Philipp Koehn, Tetsuo Kiso, Marcello Federico


Lexicon models for hierarchical phrase-based machine translation.                                                                 191-198

Matthias Huck, Saab Mansour, Simon Wiesler, Hermann Ney


The 2011 KIT QUAERO speech-to-text system for Spanish.                                                                                     199-205

Kevin Kilgour, Christian Saam, Christian Mohr, Sebastian Stüker, Alex Waibel


Named entity translation using anchor texts.                                                                                                                206-213

Wang Ling, Pável Calado, Bruno Martins, Isabel Trancoso, Alan Black, Luísa Coheur


Unsupervised vocabulary selection for simultaneous lecture translation.                                                                214-221

Paul Maergner, Kevin Kilgour, Ian Lane, Alex Waibel


Combining translation and language model scoring for domain-specific data filtering.                                       222-229

Saab Mansour, Joern Wuebker, Hermann Ney


Using Wikipedia to translate domain-specific terms in SMT.                                                                                     230-237

Jan Niehues, Alex Waibel


Modeling punctuation prediction as machine translation.                                                                                           238-245

Stephan Peitz, Markus Freitag, Arne Mauser, Hermann Ney


Soft string-to-dependency hierarchical machine translation.                                                                                     246-253

Jan-Thorsten Peter, Matthias Huck, Hermann Ney, Daniel Stein


Speaker alignment in synthesised, machine translation communication.                                                                254-260

Anne H. Schneider, Saturnino Luz


How good are your phrases? Assessing phrase quality with single class classification.                                     261-268

Nadi Tomeh, Marco Turchi, Guillaume Wisniewski, Alexandre Allauzen, François Yvon


Annotating data selection for improving machine translation.                                                                                  269-274

Keiji Yasuda, Hideo Okuma, Masao Utiyama, Eiichiro Sumita