Machine Translation Archive

Index to evaluation measures

Publications 2005-2009

Automatic evaluation see Evaluation measures and metrics

Back translation

(2009) Dmitry Davidov & Ari Rappoport: Enhancements of lexical concepts using cross-lingual web mining. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.852-861. [PDF, 162KB]

(2009) Alain Désilets & Matthieu Hermet: Using automatic roundtrip translation to repair general errors in second language writing. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.198-206. [PDF, 187KB]

(2009) Reinhard Rapp: The back-translation score: automatic MT evaluation at the sentence level without reference translations. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.133-136. [PDF, 146KB]

(2005) Harold Somers: Round-trip translation: what is it good for? Australasian Language Technology Workshop 2005 (ALTW 2005): Proceedings of the Workshop, 10-11 December 2005, University of Sydney; pp.71-77. [PDF, 106KB]

Confidence measures

(2009) Jennifer DeCamp: What is missing in user-centric MT? MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 489-485. [PDF, 454KB]

(2009) Fei Huang: Confidence measure for word alignment. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.932-940. [PDF, 727KB]

(2009) Gregor Leusch & Hermann Ney: Edit distances with block movements and error rate confidence estimates [abstract]. Machine Translation 23 (2/3), September 2009; pp.129-140.

(2009) Sylvain Raybaud, Caroline Lavecchia, David Langlois, & Kamel Smaïli: Word- and sentence-level confidence measures for machine translation. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.104-111. [PDF, 305KB]

(2009) Sylvain Raybaud, David Langlois & Kamel Smaili: Efficient combination of confidence measures for machine translation. Interspeech 2009: 10th Annual Conference of the International Speech Communication Association, 6-10 September 2009, Brighton, UK; abstract [PDF]

(2009) Kuniko Saito & Kenji Imamura: Tag confidence measure for semi-automatically updating named entity recognition. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.168-176. [PDF, 204KB]

(2009) Lucia Specia, Nicola Cancedda, Marc Dymetman, Marco Turchi, & Nello Cristianini: Estimating the sentence-level quality of machine translation systems. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.28-35 [PDF, 379KB]

(2009) Lucia Specia, Craig Saunders, Marco Turchi, Zhuoran Wang & John Shawe-Taylor: Improving the confidence of machine translation quality estimates. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.136-143. [PDF, 82KB]

(2009) Lucia Specia, Nicola Cancedda, Marc Dymetman, Craig Saunders, Marco Turchi, Nello Cristianini, Zhuoran Wang & John Shawe-Taylor: Sentence-level confidence estimation for MT. SMART Workshop at EACL 2009, Barcelona, Spain, 13 May 2009. 35 slides. [PDF of PPT, 549KB]

(2008) Nguyen Bach, Qin Gao, & Stephan Vogel: Improving word alignment with language model based confidence scores. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.151-154. [PDF, 607KB]

(2008) Youssef Kadri & Jian-Yun Nie: A comparative study for query translation using linear combination and confidence measure. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.181-188. [PDF, 429KB]

(2007) Alberto Sanchis, Alfons Juan, & Enrique Vidal: Estimation of confidence measures for machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.407-412 [PDF, 129KB]

(2007) Nicola Ueffing & Hermann Ney: Word-level confidence estimation for machine translation. Computational Linguistics 33 (1), pp. 9-40. [PDF, 279KB]

(2006) Toshiyuki Takezawa & Tohru Shimizu: Performance improvement of dialog speech translation by rejecting unreliable utterances. Interspeech 2006: ICSLP Ninth International Conference on  Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006, paper 1100; abstract [PDF, 78KB]

(2005) Nicola Ueffing & Hermann Ney: Word-level confidence estimation for machine translation using phrase-based translation models. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 763-770. [PDF, 336KB]

(2005) Nicola Ueffing & Hermann Ney: Application of word-level confidence measures in interactive statistical machine translation. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 262-270. [PDF, 173KB]

Edit distance

(2009) Petr Homola, Vladislav Kuboň, & Pavel Pecina: A simple automatic MT evaluation metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.33-36. [PDF, 108KB]

(2009) Gregor Leusch & Hermann Ney: Edit distances with block movements and error rate confidence estimates [abstract]. Machine Translation 23 (2/3), September 2009; pp.129-140.

(2008) Meghan Lammie Glenn, Stephanie Strassel, Lauren Friedman, & Haejoong Lee: Management of large annotation projects involving multiple human judges: a case study of GALE machine translation post-editing. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 144KB]

(2008) Cong-Phap Huynh, Christian Boitet, & Hervé Blanchon: SECTra_w.1: an online collaborative system for evaluating, post-editing and presenting MT translation corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 824KB]

(2008) Damianos Karakos, Jason Eisner, Sanjeev Khudanpur, & Markus Dreyer: Machine translation system combination using ITG-based alignments. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Short papers, June 16-17, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 81-84. [PDF, 65KB]

(2008) Antti-Veikko I. Rosti, Bing Zhang, Spyros Matsoukas, & Richard Schwartz: Incremental hypothesis alignment for building confusion networks with application to machine translation system combination. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.183-186. [PDF, 61KB]

(2007) Christopher Cieri, Stephanie Strassel, Meghan Lammie Glenn, & Lauren Friedman: Linguistic resources in support of various evaluation metrics. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 34pp. [PDF of PPT presentation, 1007KB]

(2006) Andrew T. Freeman, Sherri L. Condon & Christopher M. Ackerman: Cross linguistic name matching in English and Arabic: a “one to many mapping” extension of the Levenshtein edit distance algorithm. HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 471-478 [PDF, 371KB]

(2006) Gregor Leusch, Nicola Ueffing, & Hermann Ney: CDER: efficient MT evaluation using block movements. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Trento, Italy, April 3-7, 2006; pp.241-248 [PDF, 146KB]

(2006) Agam Patel & Dragomir R.Radev: Lexical similarity can distinguish between automatic and manual translations. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1230-1235 [PDF, 326KB]

(2006) Mark Przybocki, Gregory Sanders, & Audrey Le: Edit distance: a metric for machine translation evaluation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2038-2043 [PDF, 446KB]

(2006) Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, & John Makhoul: A study of translation edit rate with targeted human annotation.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.223-231 [PDF, 185KB]

(2005) Takao Doi, Hirofumi Yamamoto, & Eiichiro Sumita: Graph-based retrieval for example-based machine translation using edit-distance. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.51-58. [PDF, 2335KB]

Error detection and correction

(2009) Joshua S.Albrecht, Rebecca Hwa, & G.Elisabeta Marai: Correcting automatic translations through collaborations between MT and monolingual target-language users.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.60-68. [PDF, 178KB]

(2009) Michael Auli, Adam Lopez, Hieu Hoang & Philipp Koehn: A systematic analysis of translation model search spaces. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.224-232. [PDF, 101KB]

(2009) Jennifer DeCamp: What is missing in user-centric MT? MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 489-485. [PDF, 454KB]

(2009) Alain Désilets & Matthieu Hermet: Using automatic roundtrip translation to repair general errors in second language writing. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.198-206. [PDF, 187KB]

(2009) Mireia Farrús, Marta R.Costa-jussà, Marc Poch, Adolfo Hernández, & José B.Mariño: Improving a Catalan-Spanish statistical translation system using morphosyntactic knowledge. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.52-57. [PDF, 399KB]

(2009) Cyril Goutte, David Kurokawa, & Pierre Isabelle: Improving SMT by learning translation direction. SMART Workshop at EACL 2009, Barcelona, Spain, 13 May 2009. 24 slides. [PDF of PPT, 116KB]

(2009) Gregor Leusch & Hermann Ney: Edit distances with block movements and error rate confidence estimates [abstract]. Machine Translation 23 (2/3), September 2009; pp.129-140.

(2009) Wei-Yun Ma & Kathleen McKeown: Where’s the verb? Correcting machine translation during question answering. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.333-336. [PDF, 170KB]

(2009) Kuniko Saito & Kenji Imamura: Tag confidence measure for semi-automatically updating named entity recognition. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.168-176. [PDF, 204KB]

(2009) Anders Søgaard & Jonas Kuhn: Empirical lower bounds on alignment error rates in syntax-based machine translation.  Proceedings of SSST-3: Third Workshop on Syntax and Structure in Statistical Translation, Boulder, Colorado, 5 June 2009; pp.19-27. [PDF, 138KB]

(2009) Phan Thi Thanh Thao: Grammatical and lexical errors analysis of English-Vietnamese translation texts with the Google & EVTRAN engines and post-editing tasks.  ISMTCL: International Symposium on Data and Sense Mining, Machine Translation and Controlled Languages, and their application to emergencies and safety critical domains, July 1-3, 2009, Centre Tesnière, University of Franche-Comté, Besançon, France (Presses universitaires de Franche-Comté, 2009); pp.190-197 [abstract]

(2008) Takeshi Abekawa & Kyo Kageura: Constructing a corpus that indicates patterns of modification between draft and final translations by human translators. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 65KB]

(2008) Michael Gamon, Jianfeng Gao, Chris Brockett, Alexandre Klementiev, William B.Dolan, Dmitriy Belenko, & Lucy Vanderwende: Using contextual speller techniques and language modelling for ESL error correction. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.449-456. [PDF, 475KB]

(2008) Jesús Giménez & Lluís Màrquez: Towards heterogeneous automatic MT error analysis. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 90KB]

(2008) Masaki Itagaki & Takako Aikawa: Post-MT term swapper: supplementing a statistical machine translation system with a user dictionary.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 352KB]

(2008) Tim Schlippe, ThuyLinh Nguyen, & Stephan Vogel: Diacritization as a machine translation problem and as a sequence labeling problem. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.270-278 [PDF, 701KB]

(2008) Ming Zhou, Bo Wang, Shujie Liu, Mu Li, Dongdong Zhang, & Tiejun Zhao: Diagnostic evaluation of machine translation systems using automatically constructed check-points. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.1121-1128. [PDF, 231KB]

(2007) Yi Chang, Ying Zhang, Stephan Vogel, & Jie Yang: Enhancing image-based Arabic document translation using noisy channel correction model. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.89-95 [PDF, 157KB]

(2007) Ariadna Font Llitjós, Jaime Carbonell, & Alon Lavie: Improving transfer-based MT systems with automatic refinements. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.183-190 [PDF, 339KB]

(2007) Ariadna Font Llitjós & William A. Ridmann: The inner works of an automatic rule refiner for machine translation. METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 10pp. [PDF, 412KB]

(2007) John Lee, Ming Zhou, & Xiaohua Liu: Detection of non-native sentences using machine-translated training data. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; Companion volume, pp.93-96 [PDF, 188KB]

(2007) Young-Ae Seo, Chang-Hyun Kim, Seong-Il Yang, & Young-gil Kim: Getting professional translation through user interaction. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.413-419 [PDF, 479KB]

(2007) Yokoyama Shoichi & Kennendai Shigehiro: Error correcting system for analysis of Japanese patent sentences. MT Summit XI Workshop on patent translation, 11 September 2007, Copenhagen, Denmark; pp.24-27. [PDF, 73KB]

(2006) Chris Brockett, William B.Dolan, & Michael Gamon: Correcting ESL errors using phrasal SMT techniques.  Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.249-256. [PDF, 109KB]

(2006) Jakob Elming: Transformation-based correction of rule-based MT. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.219-226 [PDF, 124KB]

(2006) Ariadna Font Llitjós: Can the Internet help improve machine translation? HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 219-222 [PDF, 135KB]

(2006) David Kauchak: Contributions to research on machine translation. PhD thesis, University of California, San Diego, 2006. xiv,92pp. [PDF, 491KB]

(2005) Graham Russell, Ngoc Tran Nguyen, & George Foster: Automatic detection of translation errors: the state of the art. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; Demonstration abstracts, p. 1. [PDF, 75KB]

(2005) Graham Russell: Automatic detection of translation errors: the TransCheck system.  Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 17pp. [PDF, 79KB]

(2005) Advaith Siddharthan & Kathleen McKeown: Improving multilingual summarization: using redundancy in the input to correct MT errors. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 33-40. [PDF, 255KB]

Evaluation measures and metrics

(2009) Enrique Amigó, Jesús Giménez, Julio Gonzalo, & Felisa Verdejo: The contribution of linguistic features to automatic machine translation evaluation. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.306-314. [PDF, 518KB]

(2009) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Evaluation-guided pre-editing of source text: improving MT-tractability of light verb constructions. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.36-43 [PDF, 336KB]

(2009) Chris Callison-Burch: Fast, cheap, and creative: evaluating translation quality using Amazon’s Mechanical Turk. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.286-295. [PDF, 289KB]

(2009) Jieun Chae & Ani Nenkova: Predicting the fluency of text with shallow structural features: case studies of machine translation and human-written text.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.139-147. [PDF, 120KB]

(2009) Yee Seng Chan & Hwee Tou Ng: MaxSim: performance and effects of translation fluency [abstract]. Machine Translation 23 (2/3), September 2009; pp.157-168.

(2009) Sherri Condon, Gregory A.Sanders, Dan Parvaz, Alan Rubenstein, Christy Doran, John Aberdeen, & Beatrice Oshika: Normalization for automated metrics: English and Arabic speech translation. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 33-40. [PDF, 326KB]

(2009) John DeNero: Estimation problems in machine translation (learning to translate). Presentation at Johns Hopkins University Summer School on Human Language Technology, June 12, 2009; 88 slides [PDF of PPT slides, 3496KB]

(2009) Stephen Doherty & Sharon O'Brien: Can MT output be evaluated through eye tracking? MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.214-221. [PDF, 103KB]

(2009) Hiroshi Echizen-ya, Terumasa Ehara, Sayori Shimohata, Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, Takehito Utsuro, & Noriko Kando: Meta-evaluation of automatic evaluation methods for machine translation using patent translation data in NTCIR-7. MT Summit XII: Third Workshop on Patent Translation, August 30, 2009, Ottawa, Ontario, Canada; pp. 9-16. [PDF, 502KB]

(2009) Daniel Galron, Sergio Penkale, Andy Way, & I.Dan Melamed: Accuracy-based scoring for DOT: towards direct error minimization for data-oriented translation. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.371-380. [PDF, 164KB]

(2009) Jesús Giménez: Empirical machine translation and its evaluation. SMART Workshop at EACL 2009, Barcelona, Spain, 13 May 2009. 144 slides. [PDF of PPT, 5606KB]

(2009) Jesús Giménez & Lluís Màrquez: On the robustness of syntactic and semantic features for automatic MT evaluation.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.250-258. [PDF, 171KB]

(2009) Yifan He & Andy Way: Improving the objective function in minimum error rate training. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.238-245. [PDF, 82KB]

(2009) Yifan He & Andy Way: Learning labeled dependencies in machine translation evaluation. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.44-51. [PDF, 426KB]

(2009) Petr Homola, Vladislav Kuboň, & Pavel Pecina: A simple automatic MT evaluation metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.33-36. [PDF, 108KB]

(2009) Jeremy G.Kahn, Matthew Snover, & Mari Ostendorf: Expected dependency pair match: predicting translation quality with expected syntactic structure [abstract]. Machine Translation 23 (2/3), September 2009; pp.169-179.

(2009) Kimmo Kettunen: Choosing the best MT programs for CLIR purposes -- can MT metrics be helpful? Advances in Information Retrieval: 31st European Conference on IR Research (ECIR 2009), ed. M.Boughanem et al., Toulouse, France, April 6-9, 2009 (Lecture Notes in Computer Science 5478, Springer, Berlin); pp. 706–712. [PDF, 166KB]

(2009) Kimmo Kettunen: Facing the machine translation Babel in CLIR – can MT metrics help in choosing CLIR resources? (IIS-2009) Recent Advances in Intelligent Information Systems, ed. M.A.Klopotek, A. Przepiórkowski, S.T. Wierzchon, K. Trojanowski, Kraków, Poland, June 15-18, 2009; pp.103-116. [PDF, 145KB]

(2009) Kimmo Kettunen: Packing it all up in search for a language independent MT quality measure tool. Proceedings of 4th Language and Technology Conference, November 6-8, 2009, Poznań, Poland; pp.280-284. [PDF, 939KB]

(2009) Kamil Kos & Ondřej Bojar: Evaluation of machine translation metrics for Czech as the target language. Prague Bulletin of Mathematical Linguistics, no.92, December 2009; pp.135-147 [PDF, 139KB]

(2009) Alon Lavie & Michael J.Denkowski: The METEOR metric for automatic evaluation of machine translation [abstract]. Machine Translation 23 (2/3), September 2009; pp.105-115.

(2009) Alon Lavie & Mark Przybocki: Introduction to the special issue on “Automated metrics for machine translation evaluation”. Machine Translation 23 (2/3), September 2009; pp.69-70. [see publication]

(2009) Gregor Leusch & Hermann Ney: Edit distances with block movements and error rate confidence estimates [abstract]. Machine Translation 23 (2/3), September 2009; pp.129-140.

(2009) Adam Lopez: Evaluating translation quality. Third Machine Translation Marathon, Prague, Czech Republic, 26-30 January 2009; pp.13-18 [PDF, 86KB]

(2009) Sebastian Padó, Daniel Cer, Michel Galley, Dan Jurafsky, & Christopher D.Manning: Measuring machine translation quality as semantic equivalence: a metric based on entailment features [abstract]. Machine Translation 23 (2/3), September 2009; pp.181-193.

(2009) Sebastian Padó, Michel Galley, Dan Jurafsky & Chris Manning: Robust machine translation evaluation with entailment features. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.297-305. [PDF, 437KB]

(2009) Sebastian Padó, Michel Galley, Dan Jurafsky, & Christopher D.Manning: Textual entailment features for machine translation evaluation.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.37-41. [PDF, 378KB]

(2009) Maja Popović & Hermann Ney: Syntax-oriented evaluation measures for machine translation output.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.29-32. [PDF, 278KB]

(2009) Mark Przybocki, Kay Peterson, Sebastian Bronsart, & Gregory Sanders: The NIST 2008 metrics for machine translation challenge—overview, methodology, metrics, and results [abstract]. Machine Translation 23 (2/3), September 2009; pp.71-103.

(2009) Reinhard Rapp: The back-translation score: automatic MT evaluation at the sentence level without reference translations. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.133-136. [PDF, 146KB]

(2009) Sylvain Raybaud, Caroline Lavecchia, David Langlois, & Kamel Smaïli: Word- and sentence-level confidence measures for machine translation. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.104-111. [PDF, 305KB]

(2009) Manny Rayner, Paula Estrella, Pierrette Bouillon, Sonia Halimi & Yukie Nakao: Using artificial data to compare the difficulty of using statistical machine translation in different language-pairs. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.300-307. [PDF, 169KB]

(2009) Matthew Snover, Nitin Madnani, Bonnie J.Dorr, & Richard Schwartz: Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.259-268. [PDF, 147KB]

(2009) Matthew G.Snover, Nitin Madnani, Bonnie Dorr, & Richard Schwartz: TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Error Rate [abstract]. Machine Translation 23 (2/3), September 2009; pp.117-127.

(2009) Lucia Specia, Nicola Cancedda, Marc Dymetman, Marco Turchi, & Nello Cristianini: Estimating the sentence-level quality of machine translation systems. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.28-35 [PDF, 379KB]

(2009) Midori Tatsumi: Correlation between automatic evaluation metric scores, post-editing speed, and some other factors. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.332-339. [PDF, 515KB]

(2009) Kommaluri Vijayanand, Inampudi Ramesh Babu, & Poonguzhali Sandiran: Testing and performance evaluation of machine transliteration system for Tamil language. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.48-51. [PDF, 118KB]

(2009) Bo Wang, Tiejun Zhao, Muyun Yang, & Sheng Li: References extension for the automatic evaluation of MT by syntactic hybridization.  Proceedings of SSST-3: Third Workshop on Syntax and Structure in Statistical Translation, Boulder, Colorado, 5 June 2009; pp.37-44. [PDF, 378KB]

(2009) Billy Wong & Chunyu Kit: ATEC: automatic evaluation of machine translation via word choice and word order [abstract]. Machine Translation 23 (2/3), September 2009; pp.141-151.

(2009) Omar F.Zaidan & Chris Callison-Burch: Feasibility of human-in-the-loop minimum error rate training. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.52-61. [PDF, 321KB]

(2009) Hongmei Zhao, Jun Xie, Qun Liu, Yajuan Lü, Dongdong Zhang, & Mu Li: Introduction to China's CWMT2008 machine translation evaluation. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 168-175. [PDF, 208KB]

(2009) The 2009 NIST machine translation evaluation plan (MT09). [NIST, 2009]; 8pp. [PDF, 360KB]

(2008) Abhaya Agarwal & Alon Lavie: Meteor, M-BLEU and M-TER: Evaluation metrics for high-correlation with human rankings of machine translation output.  ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.115-118. [PDF, 148KB]

(2008) Joshua S.Albrecht & Rebecca Hwa: Regression for machine translation evaluation at the sentence level [abstract].  Machine Translation 22 (1/2), March-June 2008; pp.1-27.

(2008) Joshua S. Albrecht & Rebecca Hwa: The role of pseudo reference in MT evaluation. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.187-190. [PDF, 53KB]

(2008) Bogdan Babych & Anthony Hartley: Sensitivity of automated MT evaluation metrics on higher quality MT output: BLEU vs task-based evaluation methods. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 57KB]

(2008) Alexandra Birch, Miles Osborne, & Philipp Koehn: Predicting success in machine translation.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.745-754. [PDF, 405KB]

(2008) Chris Callison-Burch, Trevor Cohn, & Mirella Lapata: ParaMetric: an automatic evaluation metric for paraphrasing. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.97-104. [PDF, 232KB]

(2008) Yee Seng Chan & Hwee Tou Ng: MAXSIM: a maximum similarity metric for machine translation evaluation. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 55-62. [PDF, 125KB]

(2008) Colin Cherry & Chris Quirk: Discriminative, syntactic language modeling through latent SVMs. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.65-74. [PDF, 609KB]

(2008) David Chiang, Steve DeNeefe, Yee Seng Chan & Hwee Tou Ng: Decomposability of translation metrics for improved evaluation and efficient algorithms.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.610-619. [PDF, 316KB]

(2008) Sherri Condon, Jon Phillips, Christy Doran, John Aberdeen, Dan Parvaz, Beatrice Oshika, Greg Sanders, & Craig Schlenoff: Applying automated metrics to speech translation dialogs. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 178KB]

(2008) Jennifer Doyon, Christine Doran, C.Donald Means, & Domenique Parr: Automated machine translation improvement through post-editing techniques: analyst and translator experiments. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.346-353. [PDF, 712KB]

(2008) Kevin Duh: Ranking vs. regression in machine translation evaluation. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.191-194. [PDF, 91KB]

(2008) Paula Estrella, Andrei Popescu-Belis, & Maghi King: Improving quality models for MT evaluation based on evaluators’ feedback. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 196KB]

(2008) Nicola Ferro & Carol Peters: From CLEF to TrebleCLEF: the evolution of the cross-language evaluation forum. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp.577-593. [PDF, 2080KB]

(2008) Lauren Friedman & Stephanie Strassel: Identifying common challenges for human and machine translation: a case study from the GALE program. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.364-369. [PDF, 548KB]

(2008) Lauren Friedman, Haejoong Lee, & Stephanie Strassel: A quality control framework for gold standard reference translations: the process and toolkit developed for GALE. Translating and the Computer 30, 27-28 November 2008, London; 6pp. [PDF, 77KB]

(2008) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Toward the evaluation of machine translation using patent information.  AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.97-106. [PDF, 649KB]

(2008) Jesús Giménez & Lluís Màrquez: A smorgasbord of features for automatic MT evaluation. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.195-198. [PDF, 98KB]

(2008) Jesús Giménez & Lluís Màrquez: Heterogeneous automatic MT evaluation through non-parametric metric combinations. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.319-326. [PDF, 327KB]

(2008) Jesús Giménez & Lluís Màrquez: Towards heterogeneous automatic MT error analysis. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 90KB]

(2008) Olivier Hamon & Djamel Mostefa: The impact of reference quality on automatic MT evaluation.  Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.39-42. [PDF, 66KB]

(2008) Meghan Lammie Glenn, Stephanie Strassel, Lauren Friedman, & Haejoong Lee: Management of large annotation projects involving multiple human judges: a case study of GALE machine translation post-editing. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 144KB]

(2008) Olivier Hamon, Djamel Mostefa, & Victoria Arranz: Diagnosing human judgments in MT evaluation: an example based on the Spanish language. MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; pp. 19-26. [PDF, 308KB]

(2008) Olivier Hamon & Djamel Mostefa: An experimental methodology for an end-to-end evaluation in speech-to-speech translation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 93KB]

(2008) Kimmo Kettunen: Machine translation meets frequent case generation in query translation based CLIR. SLTC 2008: Second Swedish Language Technology Conference, November 20-21, 2008, Stockholm; pp.61-62. [PDF, 322KB]

(2008) Katsunori Kotani, Takehiko Yoshimi, Takeshi Kutsumi, Ichiko Sata, & Hitoshi Isahara: A method of automatically evaluating machine translations using a word-alignment based classifier. MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; pp. 11-18. [PDF, 259KB]

(2008) Nitin Madnani, Philip Resnik, Bonnie J.Dorr, & Richard Schwartz: Are multiple reference translations necessary? Investigating the value of paraphrased reference translations in parameter optimization. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.143-152. [PDF, 655KB]

(2008) Arne Mauser, Saša Hasan, & Hermann Ney: Automatic evaluation measures for statistical machine translation – system optimization.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 74KB]

(2008) Joaquim Moré López & Salvador Climent Roca: A machine translationness typology for MT evaluations. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.130-139. [PDF, 496KB]

(2008) Sylwia Ozdowska: Cross-corpus evaluation of word alignment.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 63KB]

(2008) Mark Przybocki, Kay Peterson, & Sébastien Bronsart: Translation adequacy and preference evaluation tool (TAP-ET).  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 605KB]

(2008) Gregory A. Sanders, Sébastien Bronsart, Sherri Condon, & Craig Schlenoff: Odds of successful transfer of low-level concepts: a key metric for bidirectional speech-to-speech machine translation in DARPA’s TRANSTAC program.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 177KB]

(2008) Suqi Sun, Yin Chen & Jufeng Li: A re-examination on features in regression based approach to automatic MT evaluation. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the student research workshop, June 16, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 25-30. [PDF, 234KB]

(2008) A. Cüneyd Tantuğ, Kemal Oflazer, & İlknur D.El-Kahlout: BLEU+: a tool for fine-grained BLEU computation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 753KB]

(2008) Calandra R. Tate: A statistical analysis of automated MT evaluation metrics for assessments in task-based MT evaluation. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.182-191. [PDF, 583KB]

(2008) Clare R. Voss, Jamal Laoudi, & Jeffrey Micher: Exploitation of an Arabic language resource for MT evaluation: using Buckwalter-based lookup tool to augment CMU alignment algorithm.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 246KB]

(2008) Ming Zhou, Bo Wang, Shujie Liu, Mu Li, Dongdong Zhang, & Tiejun Zhao: Diagnostic evaluation of machine translation systems using automatically constructed check-points. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.1121-1128. [PDF, 231KB]

(2008) NIST: MetricsMATR challenge. [NIST, 2008]. 6pp. [PDF, 35KB]

(2008)  The 2008 NIST open machine translation evaluation plan (MT08). [NIST, 2008]; 7pp. [PDF, 360KB]

(2007) proceedings of IWSLT 2007: International Workshop on Spoken Language Translation, 15-16 October 2007, Trento, Italy.

(2007) Eneko Agirre, Bernardo Magnini, Oier Lopez de Lacalle, Arantxa Otegi, German Rigau, & Piek Vossen: SemEval-2007 task 01: evaluating WSD on cross-language information retrieval. ACL 2007: proceedings of the 4th International  Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.1-6 [PDF, 79KB]

(2007) Joshua S. Albrecht & Rebecca Hwa: Regression for sentence-level MT evaluation with pseudo references. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 296-303 [PDF, 98KB]

(2007) Joshua S. Albrecht & Rebecca Hwa: A re-examination of machine learning approaches for sentence-level MT evaluation. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 880-887 [PDF, 126KB]

(2007) Núria Bel: Review of Dybkjær, Laila; Hemsen, Holmer; Minker, Wolfgang (eds.) Evaluation of text and speech systems.  Machine Translation 21 (1), March 2007; pp.73-76. [see publication]

(2007) Hiroshi Echizen-ya & Kenji Araki: Automatic evaluation of machine translation based on recursive acquisition of an intuitive common parts continuum. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.151-158 [PDF, 589KB]

(2007) Paula Estrella, Olivier Hamon, & Andrei Popescu-Belis: How much data is needed for reliable MT evaluation? Using bootstrapping to study human and automatic metrics. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.167-174 [PDF, 143KB]

(2007) Paula Estrella, Andrei Popescu-Belis, & Maghi King: A new method for the study of correlations between MT evaluation metrics. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.55-64 [PDF, 949KB]; presentation [PDF, 95KB]

(2007) Alexander Fraser & Daniel Marcu: Measuring word alignment quality for statistical machine translation. Computational Linguistics 33 (3), pp. 293-303. [PDF, 163KB]

(2007) Jesús Giménez & Lluís Màrquez: Linguistic features for automatic evaluation of heterogenous MT systems. ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 256-264 [PDF, 134KB]

(2007) Olivier Gouirand: A probabilistic approach to linguistic analysis in machine translation output evaluation. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.46-54 [PDF, 2182KB]

(2007) Yvette Graham, Deirdre Hogan, & Josef van Genabith: Automatic evaluation of generation and parsing for machine translation with automatically acquired transfer rules. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.5-12 [PDF, 1796KB]

(2007) Olivier Hamon, Anthony Hartley, Andrei Popescu-Belis, & Khalid Choukri: Assessing human and automated quality judgments in the French MT evaluation campaign CESTA. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp. 231-238 [PDF, 93KB]

(2007) Douglas Jones, Martha Herzog, Hussny Ibrahim, Arvind Jairam, Wade Shen, Edward Gibson, & Michael Emonts: ILR-based MT comprehension test with multi-level questions. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; Companion volume, pp.77-80 [PDF, 67KB]

(2007) Sarvnaz Karimi, Andrew Turpin, & Falk Scholer: Corpus effects on the evaluation of automated transliteration systems. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 640-647 [PDF, 1215KB]

(2007) Maghi King, Andrei Popescu-Belis, & Paula Estrella: Context-based evaluation of MT systems: principles and tools. Tutorial at MT Summit XI, 10 September 2007, Copenhagen, Denmark; abstract, 1p. [PDF, 78KB]; outline and materials, 7pp. [PDF, 80KB]; presentation, 59pp. [PDF of PPT slides, 6897KB]

(2007) Katrin Kirchhoff, Owen Rambow, Nizar Habash, & Mona Diab: Semi-automatic error analysis for large-scale statistical machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.289-296 [PDF, 192KB]

(2007) Alon Lavie & Abhaya Agarwal: METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 228-231 [PDF, 159KB]

(2007) Ding Liu & Daniel Gildea: Source-language features and maximum correlation training for machine translation evaluation. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.41-48 [PDF, 201KB]

(2007) Dennis N. Mehay & Chris Brew: BLEUÂTRE: flattening syntactic dependencies for MT evaluation. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.122-131 [PDF, 370KB]; presentation [PDF, 802KB]

(2007) Joaquim Moré López & Salvador Climent Roca: A cheap MT evaluation method based on the notion of machine translationness.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 8pp. [PDF, 285KB]

(2007) Andrew Mutton, Mark Dras, Stephen Wan, & Robert Dale: GLEU: automatic evaluation of sentence-level fluency. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 344-351 [PDF, 157KB]

(2007) Karolina Owczarzak & Josef van Genabith: Evaluating machine translation with LFG dependencies [abstract]. Machine Translation 21 (2), June 2007; pp.95-119.

(2007) Karolina Owczarzak, Josef van Genabith, & Andy Way: Labelled dependencies in machine translation evaluation.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 104-111 [PDF, 239KB]

(2007) Karolina Owczarzak, Josef van Genabith, & Andy Way: Dependency-based automatic evaluation for machine translation. SSST, NAACL-HLT-2007 AMTA Workshop on Syntax and Structure in Statistical Translation, 26 April 2007, Rochester, NY; pp.80-87 [PDF, 245KB]

(2007) Michael Paul, Andrew Finch, & Eiichiro Sumita: Reducing human assessment of machine translation quality to binary classifiers. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.154-162 [PDF, 791KB]; presentation [PDF, 1127KB]

(2007) Patrizia Pierini: Quality in web translation: an investigation into UK and Italian tourism web sites. Journal of Specialised Translation 8 (July 2007); pp.85-103. [PDF, 329KB]

(2007) Andrei Popescu-Belis: Evaluation of NLG: some analogies and differences with machine translation and reference resolution. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.66-68 [PDF, 683KB]

(2007) Maja Popović & Hermann Ney: Word error rates: decomposition over POS classes and applications for error analysis. ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 48-55 [PDF, 159KB]

(2007) Alberto Sanchis, Alfons Juan, & Enrique Vidal: Estimation of confidence measures for machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.407-412 [PDF, 129KB]

(2007) Nicolas Stroppa & Karolina Owczarzak: A cluster-based representation for multi-system MT evaluation. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.221-230 [PDF, 306KB]; presentation [PDF, 336KB]

(2007) Calandra Rilette Tate: An investigation of the relationship between automated machine translation evaluation metrics and user performance on an information extraction task. PhD dissertation, University of Maryland; xv,145pp. [PDF, 1170KB]

(2007) Gregor Thurmair, Khalid Choukri, & Bente Maegaard (eds.): MT Summit XI Workshop: Automatic procedures in MT evaluation, organised by the ELRA Evaluation Committee, 11 September 2007, Copenhagen, Denmark, [Proceedings]; programme, 2pp. [PDF, 111KB]; contents

(2007) Kiyotaka Uchimoto, Katsunori Kotani, Yujie Zhang, & Hitoshi Isahara: Automatic evaluation of machine translation based on rate of accomplishment of sub-goals. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.33-40 [PDF, 124KB]

(2007) Nicola Ueffing & Hermann Ney: Word-level confidence estimation for machine translation. Computational Linguistics 33 (1), pp. 9-40. [PDF, 279KB]

(2007) Kees van Deemter & Albert Gatt: Content determination in GRE: evaluating the evaluator. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.101-103 [PDF, 754KB]

(2007) David Vilar, Gregor Leusch, Rafael E.Banchs, & Hermann Ney: Human evaluation of machine translation through binary system comparisons.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 96-103 [PDF, 203KB]

(2007) Martin Volk & Søren Harder: Evaluating MT with translations or translators: what is the difference? MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.499-506 [PDF, 61KB]

(2007) Chao Wang & Stephanie Seneff: Automatic assessment of student translations for foreign language tutoring. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.468-475 [PDF, 124KB]

(2007) Yang Ye, Ming Zhou, & Chin-Yew Lin: Sentence level machine translation evaluation as a ranking problem: one step aside from BLEU.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 240-247 [PDF, 311KB]

(2006) Enrique Amigó, Jesús Giménez, Julio Gonzalo, & Lluís Màrquez: MT evaluation: human-like vs human-acceptable. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.17-24. [PDF, 2788KB]

(2006) Necip Fazil Ayan & Bonnie J. Dorr: Going beyond AER: an extensive analysis of word alignments and their impact on MT. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.9-16. [PDF, 274KB]

(2006) Christian Boitet, Youcef Bey, Mutsuko Tomokiyo, Wenjie Cao, & Hervé Blanchon: IWSLT-06: experiments with commercial MT systems and lessons from subjective evaluations.  International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; 8pp.  [PDF, 324KB]

(2006) A.Bonafonte, H.Höge, I.Kiss, A.Moreno, U.Ziegenhain, H.van den Heuvel, H.-U.Hain, X.S.Wang, M.N.Garcia: TC-STAR: specifications of language resources and evaluation for speech synthesis. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.311-314 [PDF, 272KB]

(2006) Chris Callison-Burch, Miles Osborne, & Philipp Koehn: Re-evaluating the role of BLEU in machine translation research. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Trento, Italy, April 3-7, 2006; pp.249-256 [PDF, 151KB]

(2006) Matthias Eck, Stephan Vogel, & Alex Waibel: A flexible online server for machine translation evaluation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.89-94 [PDF, 111KB]

(2006) Jesús Giménez & Enrique Amigó: IQMT: a framework for automatic machine translation evaluation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.685-690 [PDF, 370KB]

(2006) O. Hamon & M. Rajman: X-score: automatic evaluation of machine translation grammaticality.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.155-160 [PDF, 318KB]

(2006) O. Hamon, A. Popescu-Belis, K.Choukri, M.Dabbadie, A.Hartley, W.Mustafa El Hadi, M.Rajman, & I.Timimi: CESTA: first conclusions of the Technolangue MT evaluation campaign.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.179-184 [PDF, 288KB]

(2006) Douglas Jones, Wade Shen, Brian Delaney, Martha Herzog, Michael Emonts, Sabine Atwell, James Dirgin, Neil Granoien, Sargon Jabri, Jurgen Sottung, Timothy Anderson, & Timothy Hunter: Toward an Interagency Language Roundtable based assessment of speech-to-speech translation capabilities. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.82-89 [PDF, 150KB]

(2006) David Kauchak: Contributions to research on machine translation. PhD thesis, University of California, San Diego, 2006. xiv,92pp. [PDF, 491KB]

(2006) David Kauchak & Regina Barzilay: Paraphrasing for automatic evaluation. HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 455-462 [PDF, 196KB]

(2006) Jamal Laoudi, Calandra R.Tate, & Clare R.Voss: Task-based MT evaluation: from who/when/where extraction to event understanding.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2048-2053 [PDF, 381KB]

(2006) Gregor Leusch, Nicola Ueffing, & Hermann Ney: CDER: efficient MT evaluation using block movements. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Trento, Italy, April 3-7, 2006; pp.241-248 [PDF, 146KB]

(2006) Ding Liu & Daniel Gildea: Stochastic iterative alignment for machine translation evaluation.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.539-546. [PDF, 201KB]

(2006) Adam Lopez & Philip Resnik: Word-based alignment, phrase-based translation: what’s the link?  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.90-99 [PDF, 183KB]

(2006) Joaquim Moré & Salvador Climent: A cheap MT-evaluation method based on Internet searches. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.19-26 [PDF, 260KB]

(2006) Sara Morrissey & Andy Way: Lost in translation: the problems of using mainstream MT evaluation metrics for sign language translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.91-98. [PDF, 465KB]

(2006) Karolina Owczarzak, Declan Groves, Josef Van Genabith, & Andy Way: Contextual bitext-derived paraphrases in automatic MT evaluation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 86-93 [PDF, 222KB]

(2006) Agam Patel & Dragomir R.Radev: Lexical similarity can distinguish between automatic and manual translations. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1230-1235 [PDF, 326KB]

(2006) Andrei Popescu-Belis, Paula Estrella, Margaret King, & Nancy Underwood: A model for context-based evaluation of language processing systems and its application to machine translation evaluation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.691-696 [PDF, 536KB]

(2006) Maja Popović, Adrià de Gispert, Deepa Gupta, Patrik Lambert, Hermann Ney, José B. Mariño, Marcello Federico, & Rafael Banchs: Morpho-syntactic information for automatic error analysis of statistical machine translation output.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 1-6 [PDF, 306KB]

(2006) Mark Przybocki, Gregory Sanders, & Audrey Le: Edit distance: a metric for machine translation evaluation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2038-2043 [PDF, 446KB]

(2006) Florence Reeder: Direct application of a language learner test to MT evaluation. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.166-175 [PDF, 123KB]

(2006) Florence Reeder: Measuring MT adequacy using Latent Semantic Analysis.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.176-184 [PDF, 180KB]

(2006) David M.Rojas & Takako Aikawa: Predicting MT quality as a function of the source language.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2534-2537 [PDF, 613KB]

(2006) Salim Roukos: Rosetta: an analyst’s co-pilot. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; 60pp.  [PDF, 2789KB]

(2006) Horacio Saggion: Multilingual multidocument summarization tools and evaluation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1312-1317 [PDF, 757KB]

(2006) Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, & John Makhoul: A study of translation edit rate with targeted human annotation.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.223-231 [PDF, 185KB]

(2006) Harold Somers: Language engineering and the pathway to healthcare: a user-oriented view.  HLT-NAACL 2006: Proceedings of the  Workshop on Medical Speech Translation, 9 June 2006, New York, NY, USA; pp.32-39 [PDF, 187KB]

(2006) Harold Somers, Federico Gaspari, & Ana Niño: Detecting inappropriate use of free online machine translation by language students. A special case of plagiarism detection. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.41-48 [PDF, 276KB]

(2006) Calandra R. Tate & Clare R. Voss: Combining evaluation metrics via loss functions. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.242-250 [PDF, 106KB]

(2006) David Vilar, Jia Xu, Luis Fernando D’Haro, & Hermann Ney: Error analysis of statistical machine translation output.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.697-702 [PDF, 272KB]

(2006) Clare R. Voss & Calandra R. Tate: Task-based evaluation of machine translation (MT) engines. Measuring how well people extract who, when, where-type elements in MT output. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.203-212 [PDF, 235KB]

(2006) Liang Zhou, Chin-Yew Lin, & Eduard Hovy: Re-evaluating machine translation results with paraphrase support.  EMNLP-2006: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, July 2006; pp. 77-84. [PDF, 324KB]

(2006) The 2006 NIST machine translation evaluation plan (MT06). [NIST, 2006]; 6pp. [PDF, 92KB]

(2005) proceedings of ELRA-HLT Evaluation Workshop, Malta, 1-2 December 2005 [HTML]

(2005) Bogdan Babych, Anthony Hartley & Debbie Elliott: Estimating the predictive power of n-gram MT evaluation metrics across language and text types. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.412-418. [PDF, 180KB]

(2005) Bogdan Babych: Information extraction technology in machine translation: IE methods for improving and evaluating MT quality. PhD thesis, University of Leeds, Centre for Translation Studies, March 2005. 186pp. [PDF, 859KB]

(2005) Satanjeev Banerjee & Alon Lavie: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. ACL-2005: Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, University of Michigan, Ann Arbor, 29 June 2005; pp. 65-72. [PDF, 225KB]

(2005) Christian Boitet: Gradable quality translations through mutualisation of human translation and revision, and UNL-based MT and coedition. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.395-412 [abstract, PDF, 41KB]

(2005) Chris Callison-Burch: [Statistical machine translation, lecture 4:] evaluation of translation quality. ESSLLI-2005: 17th European Summer School in Logic, Language and Information, Heriot-Watt University, Edinburgh, Scotland, 8-19 August 2005; 10pp. [PDF of PPT presentation, 226KB]

(2005) Andre Castilla, Alice Bacic, & Sergio Furuie: Machine translation on the medical domain: the role of BLEU/NIST and METEOR in a controlled vocabulary setting. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.47-54. [PDF, 156KB]

(2005) Etienne Denoual & Yves Lepage: BLEU in characters: towards automatic MT evaluation in languages without word delimiters. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.79-84. [PDF, 495KB]

(2005) Andrew Finch, Young-Sook Hwang, & Eiichiro Sumita: Using machine translation evaluation techniques to determine sentence-level semantic equivalence. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 17-24. [PDF, 111KB]

(2005) Jesús Giménez, Enrique Amigó, & Chiori Hori: Machine translation evaluation inside QARLA. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2005], 24-25 October, 2005, Pittsburgh, PA, USA; 8pp. [PDF, 91KB]

(2005) Michael Gamon, Anthony Aue, & Martine Smets: Sentence-level MT evaluation without reference translations: beyond language modeling. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 103-111. [PDF, 76KB]

(2005) Tony Hartley: MT evaluation – the little we know. ELRA-HLT Evaluation Workshop, Malta, 1-2 December 2005; 11pp. [PDF from PPT, 74KB]

(2005) John Henderson & William Morgan: Gaming fluency: evaluating the bounds and expectations of segment-based translation memory. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 175-182. [PDF, 194KB]

(2005) Douglas Jones, Edward Gibson, Wade Shen, Neil Granoien, Martha Herzog, Douglas Reynolds, & Clifford Weinstein: Measuring human readability of machine generated text: three case studies in speech recognition and machine translation. Proceedings of 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 18-23, 2005, Philadelphia, PA, USA; vol.5, pp. 1009-1012 [PDF, 233KB]

(2005) M.King: ISSCO and evaluation 1988-2005. ELRA-HLT Evaluation Workshop, Malta, 1-2 December 2005; 8pp. [PDF from PPT, 94KB]

(2005) Katsunori Kotani, Takehiko Yoshimi, Takeshi Kutsumi, Ichiko Sata & Hitoshi Isahara: Toward a unified evaluation method for multiple reading support systems: a reading speed-based procedure. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.244-249. [PDF, 113KB]

(2005) Katsunori Kotani, Takehiko Yoshimi, Takeshi Kutsumi, Ichiko Sata & Hitoshi Isahara: A useful-based evaluation of reading support systems: comprehension, reading speed and effective speed. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.419-426. [PDF, 264KB]

(2005) Yves Lepage & Etienne Denoual: Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 57-64. [PDF, 131KB]

(2005) Gregor Leusch, Nicola Ueffing, David Vilar, & Hermann Ney: Preprocessing and normalization for automatic evaluation of machine translation.  ACL-2005: Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, University of Michigan, Ann Arbor, 29 June 2005; pp. 17-24. [PDF, 544KB]

(2005) Lucian Vlad Lita, Monica Rogati, & Alon Lavie: BLANC: learning evaluation metrics for MT. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 33-40. [PDF, 215KB]

(2005) Ding Liu & Daniel Gildea: Syntactic features for evaluation of machine translation. ACL-2005: Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, University of Michigan, Ann Arbor, 29 June 2005; pp. 25-32. [PDF, 158KB]

(2005) Evgeny Matusov, Gregor Leusch, Oliver Bender, & Hermann Ney: Evaluating machine translation output with automatic sentence segmentation. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2005], 24-25 October, 2005, Pittsburgh, PA, USA; 7pp. [PDF, 113KB]

(2005) Keith J. Miller & Michelle Vanni: Inter-rater agreement measures, and the refinement of metrics in the PLATO MT evaluation paradigm. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.125-132. [PDF, 183KB]

(2005) Andrei Popescu-Belis: Some questions from the discussant [of paper by Tony Hartley on MT evaluation]. ELRA-HLT Evaluation Workshop, Malta, 1-2 December 2005; 8pp. [PDF from PPT, 48KB]

(2005) Andrei Popescu-Belis, Paula Estrella, Margaret King, & Nancy Underwood: Towards automatic generation of evaluation plans for context-based MT evaluation. (ISSCO Working paper 64.) Geneva: ISSCO/ETI University of Geneva, August 2005. 18pp. [PDF, 175KB]

(2005) Laura Ramirez Polo & Johann Haller: Controlled language and the implementation of machine translation for technical documentation. Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 9pp. [PDF, 109KB]; presentation [PDF, 117KB]

(2005) Stefan Riezler & John T.Maxwell III: On some pitfalls in automatic evaluation and significance testing for MT. ACL-2005: Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, University of Michigan, Ann Arbor, 29 June 2005; pp. 57-64. [PDF, 115KB]

(2005) Anil Kumar Singh & Samar Husain: Comparison, selection and use of sentence alignment algorithms for new language pairs. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 99-106. [PDF, 124KB]

(2005) Harold Somers: Round-trip translation: what is it good for? Australasian Language Technology Workshop 2005 (ALTW 2005): Proceedings of the Workshop, 10-11 December 2005, University of Sydney; pp.71-77. [PDF, 106KB]

(2005) Sylvain Surcin, Olivier Hamon, Antony Hartley, Martin Rajman, Andrei Popescu-Belis, Widad Mustafa El Hadi, Ismaïl Timimi, Marianne Dabbadie, & Khalid Choukri: Evaluation of machine translation with predictive metrics beyond BLEU/NIST: CESTA evaluation campaign # 1. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.117-124. [PDF, 162KB]

(2005) Toshiyuki Takezawa, Keiji Yasuda, Masahide Mizushima, & Genichiro Kikui: Assessing degradation of spoken language translation by measuring speech recognizer's output against non-native speakers' listening capabilities. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.203-210. [PDF, 2073KB]

(2005) Gr. Thurmair: Automatic means of MT evaluation. ELRA-HLT Evaluation Workshop, Malta, 1-2 December 2005; 40pp. [PDF from PPT, 2974KB]

(2005) Kiyotaka Uchimoto, Naoko Hayashida, Toru Ishida, & Hitoshi Isahara: Automatic rating of machine translatability. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.235-242. [PDF, 271KB]

(2005) Nicola Ueffing & Hermann Ney: Word-class confidence estimation for machine translation using phrase-based translation models. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 763-770. [PDF, 336KB]

(2005) Ashish Venugopal, Andreas Zollmann, & Alex Waibel: Training and evaluating error minimization rules for statistical machine translation.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 208-215. [PDF, 112KB]

(2005) Andy Way & Nano Gough: Controlled translation in an example-based environment: what do automatic evaluation metrics tell us? [abstract]. Machine Translation 19 (1), 2005; pp.1-36.

(2005) The 2005 NIST machine translation evaluation plan (MT-05). [NIST, 2005]; 6pp. [PDF, 51KB]

Evaluations of systems (see also User experiences)

(2009) Romaric Besançon, Djamel Mostefa, Ismaïl Timimi, Stéphane Chaudiron, Mariama Laïb, & Khalid Choukri: Arabic, English and French: three languages in a filtering systems evaluation project. MEDAR 2009: 2nd International Conference on Arabic Language Resources & Tools, 22-23 April 2009, Cairo, Egypt; pp.163-167. [PDF, 465KB]

(2009) Alexandra Birch, Phil Blunsom & Miles Osborne: A quantitative analysis of reordering phenomena. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.197-205. [PDF, 808KB]

(2009) Krzysztof Bogacki: Controlled languages and machine translation. ISMTCL: International Symposium on Data and Sense Mining, Machine Translation and Controlled Languages, and their application to emergencies and safety critical domains, July 1-3, 2009, Centre Tesnière, University of Franche-Comté, Besançon, France (Presses universitaires de Franche-Comté, 2009); pp.49-55 [abstract]

(2009) Chris Callison-Burch, Philipp Koehn, Christof Monz, & Josh Schroeder: Findings of the 2009 Workshop on Statistical Machine Translation. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.1-28. [PDF, 544KB]

(2009) Nicola Cancedda: SMART final review meeting: introduction and overview, Luxembourg, 27 November 2009; 19 slides [PDF of PPT, 890KB]

(2009) Jordi Carrera & Alex Yanishevsky: Technology for translators: what doesn’t kill you, makes you stronger. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 410-416. [PDF of PPT presentation, 109KB]

(2009) Yi-Chang Chen & Chia-Ping Chen: A framework for machine translation output combination. ROCLING 2009: Proceedings of the 21st Conference on Computational Linguistics and Speech Processing, Taichung, Taiwan, 2009; pp.309-317. [PDF, 342KB]

(2009) Barrou Diallo: EPO machine translation programme: from research to development – experience over European and Asian languages. IPWARE Summit 2009: International conference and exhibition on Software for Intellectual Property, 21-23 October 2009, Saint-Raphaël, France; abstract, 1 pp. [PDF, 70KB]

(2009) Rebecca Fiederer & Sharon O’Brien: Quality and machine translation: a realistic objective? Journal of Specialised Translation 11 (January 2009); pp.52-74. [PDF, 103KB]

(2009) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Exploiting patent information for the evaluation of machine translation. MT Summit XII: Third Workshop on Patent Translation, August 30, 2009, Ottawa, Ontario, Canada; pp. 1-8. [PDF, 108KB], presentation [PDF of PPT, 98KB]

(2009) Jesús Giménez & Lluís Màrquez: Discriminative phrase selection for SMT.  In: Cyril Goutte, Nicola Cancedda, Marc Dymetman, & George Foster (eds.) Learning machine translation. (Cambridge, Mass.: The MIT Press, 2009); pp.205-236.

(2009) Vishal Goyal & Gurpreet Singh Lehal: Evaluation of Hindi to Punjabi machine translation system. International Journal of Computer Science Issues, vol.4, no.1 (2009), pp.36-39. [PDF, 101KB]

(2009) Olivier Hamon, Christian Fügen, Djamel Mostefa, Victoria Arranz, Muntsin Kolss, Alex Waibel, & Khalid Choukri: End-to-end evaluation in simultaneous translation. EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.345-353. [PDF, 106KB]

(2009) Maxim Khalilov & José A.R.Fonollosa: N-gram-based statistical machine translation versus syntax augmented machine translation: comparison and system combination.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.424-432. [PDF, 564KB]

(2009) Haizhou Li, A.Kumaran, Vladimir Pervouchine, & Min Zhang: Report of NEWS 2009 machine transliteration shared task. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.1-18. [PDF, 631KB]

(2009) Tomoki Nagase, Katsunori Kotani, Masaaki Nagata, Nobutoshi Hatanaka, Yoshiyuki Sakamoto, Eiichiro Sumita, & Kiyotaka Uchimoto: Evaluation of Japanese-Chinese MT system using AAMT’s test-set. CWMT 2009: the 5th China Workshop on Machine Translation, Nanjing, China, October 16-17, 2009; 8pp. [PDF, 811KB]

(2009) William Ogden, Ron Zacharski, Sieun An, & Yuki Ishikawa: User choice as an evaluation metric for web translation services in cross language instant messaging applications. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 97-103. [PDF, 376KB]

(2009) Michael Paul: Overview of the IWSLT 2009 evaluation campaign. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; pp. 1-18. [PDF, 398KB]; presentation [PDF of PPT, 481KB]

(2009) Mark Przybocki, Kay Peterson, Sebastian Bronsart, & Gregory Sanders: The NIST 2008 metrics for machine translation challenge—overview, methodology, metrics, and results [abstract]. Machine Translation 23 (2/3), September 2009; pp.71-103.

(2009) Manny Rayner, Paula Estrella, Pierrette Bouillon, Beth Ann Hockey & Yukie Nakao: Using artificially generated data to evaluate statistical machine translation. ACL-IJCNLP 2009: Workshop on Grammar Engineering Across Frameworks (GEAF 2009), Proceedings of the workshop, 6 August 2009, Suntec, Singapore; pp.54-62. [PDF, 214KB]

(2009) Mitra Shahabi: An evaluation of output quality of machine translation program. [RANLP 2009] Student Research Workshop [held at], International conference: Recent Advances in Natural Language Processing. Proceedings ed. Irina Temnikova, Ivelina Nikolova, Natalia Konstantinova, Borovets, Bulgaria, 14-15 September 2009; pp. 71-75. [PDF, 133KB]

(2009) Marianne Starlander & Paula Estrella: Relating recognition and translation quality with usability of two different versions of MedSLT. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.324-331. [PDF, 84KB]

(2009) Phan Thi Thanh Thao: Grammatical and lexical errors analysis of English-Vietnamese translation texts with the Google & EVTRAN engines and post-editing tasks. ISMTCL: International Symposium on Data and Sense Mining, Machine Translation and Controlled Languages, and their application to emergencies and safety critical domains, July 1-3, 2009, Centre Tesnière, University of Franche-Comté, Besançon, France (Presses universitaires de Franche-Comté, 2009); pp.190-197 [abstract]

(2009) Gr.Thurmair: Will there be winners? [contribution to panel]. Translingual Europe 2009, May 13-14, Prague, Czech Republic; 4pp. [PDF of PPT, 172KB]

(2009) Carol Van Ess-Dykema, Dennis Perzanowski, Susan Converse, Rachel Richardson, John S.White, & Tucker Maney: Translation memory technology assessment. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 550-559 [PDF of PPT presentation, 2534KB]

(2009) Eric Wehrli, Violeta Seretan, Luka Nerima, & Lorenza Russo: Collocations in a rule-based MT system: a case study evaluation of their translation adequacy. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.128-135. [PDF, 375KB]

(2009) NIST 2009 open machine translation evaluation (MT09), Official release of results: Arabic-English. [NIST, 2009]; [HTML, 301KB]

(2009) NIST 2009 open machine translation evaluation (MT09), Official release of results: Urdu-English. [NIST, 2009]; [HTML, 125KB]

(2009) NIST 2009 open machine translation evaluation (MT09), Official release of results: Combination tests. [NIST, 2009]; [HTML, 114KB]

(2009) NIST 2009 open machine translation evaluation (MT09), Official release of results: Progress test. [NIST, 2009]; [HTML, 108KB]

(2008) see IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA.

(2008) A.Allauzen & H.Bonneau-Maynard: Training and evaluation of POS taggers on the French MULTITAG corpus. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 113KB]

(2008) Dimitra Anastasiou: Identification of idioms by machine translation: a hybrid research system vs. three commercial systems.  EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.12-20. [PDF, 498KB]

(2008) Sivaji Bandyopadhyay, Tapabrata Mondal, Sudip Kumar Naskar, Asif Ekbal, Rejwanul Haque, & Srinivasa Rao Godavarthy: Bengali, Hindi and Telegu to English ad-hoc bilingual task. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; abstract, p.66. [PDF, 12KB]

(2008) Alexandra Birch, Miles Osborne, & Philipp Koehn: Predicting success in machine translation.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.745-754. [PDF, 405KB]

(2008) Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz, & Josh Schroeder: Further meta-evaluation of machine translation. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.70-106. [PDF, 320KB]

(2008) Michael Carl: Using log-linear models for selecting best machine translation output. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 255KB]

(2008) Marine Carpuat & Dekai Wu: Evaluation of context-dependent phrasal translation lexicons for statistical machine translation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 122KB]

(2008) Shin Chang-Meadows: MT errors in CH-to-EN MT systems: user feedback. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.324-333.  [PDF of PPT presentation, 778KB]

(2008) Manoj Kumar Chinnakotla, Sagar Ranadive, Om P.Damani & Pushpak Bhattacharyya: Hindi and Marathi to English cross language information. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; abstract, p.64. [PDF, 12KB]

(2008) Sherri Condon, Jon Phillips, Christy Doran, John Aberdeen, Dan Parvaz, Beatrice Oshika, Greg Sanders, & Craig Schlenoff: Applying automated metrics to speech translation dialogs. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 178KB]

(2008) Jennifer Doyon, Christine Doran, C.Donald Means, & Domenique Parr: Automated machine translation improvement through post-editing techniques: analyst and translator experiments. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.346-353. [PDF, 712KB]

(2008) Loïc Dugast, Jean Senellart, & Philipp Koehn: Can we relearn an RBMT system? ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.175-178. [PDF, 87KB]

(2008) Andreas Eisele, Christian Federmann, Hervé Saint-Amand, Michael Jellinghaus, Teresa Herrmann, & Yu Chen: Using Moses to integrate multiple rule-based machine translation engines into a hybrid system. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.179-182. [PDF, 90KB]

(2008) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Overview of the patent translation task at the NTCIR-7 Workshop. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 389-400. [PDF, 637KB]

(2008) Cong-Phap Huynh, Christian Boitet, & Hervé Blanchon: SECTra_w.1: an online collaborative system for evaluating, post-editing and presenting MT translation corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 824KB]

(2008) Tatsuya Izuha, Akira Kumano, & Yuka Kuroda: Toshiba rule-based machine translation system at NTCIR-7 PAT MT. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 430-434. [PDF, 545KB]

(2008) Janne Bondi Johannessen, Torbjørn Nordgård, & Lars Nygaard: Evaluation of linguistics-based translation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 249KB]

(2008) Youssef Kadri & Jian-Yun Nie: A comparative study for query translation using linear combination and confidence measure. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.181-188. [PDF, 429KB]

(2008) Debasis Mandal, Sandipan Dandapat, Mayank Gupta, Pratyush Banerjee, & Sudeshna Sarkar: Bengali and Hindi to English CLIR evaluation. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; abstract, p.65. [PDF, 12KB]

(2008) Teruko Mitamura, Eric Nyberg, Hideki Shima, Tsuneaki Kato, Tatsunori Mori, Chin-Yew Lin,  Ruihua Song, Chuan-Jie Lin, Tetsuya Sakai, Donghong Ji, & Noriko Kando: Overview of the NTCIR-7 ACLIA tasks: advanced cross-lingual information access. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp.11-25. [PDF, 1062KB]

(2008) Mandar Mitra & Prasenjit Majumder: FIRE: forum for information retrieval evaluation. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; abstract, p.69. [PDF, 11KB]

(2008) Lene Offersgaard, Claus Povlsen, Lisbeth Almsten, & Bente Maegaard: Domain specific MT in use. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.150-159. [PDF, 554KB]

(2008) Constantin Orasan & Oana Andreea Chiorean: Evaluation of a cross-lingual Romanian-English multi-document summariser.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 74KB]

(2008) Michael Paul: Overview of the IWSLT 2008 evaluation campaign. IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA; pp. 1-17. [PDF, 262KB]; presentation [PDF, 765KB]

(2008) Carol Peters, Martin Braschler, Giorgio Di Nunzio, Nicola Ferro, Julio Gonzalo, & Mark Sanderson: From research to application in multilingual information access: the contribution of evaluation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 55KB]

(2008) Marianne Starlander, Pierrette Bouillon, Glenn Flores, Manny Rayner, & Nikos Tsourakis: Comparing two different bidirectional versions of the limited-domain medical spoken language translator MedSLT. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.176-181. [PDF, 521KB]

(2008) Vincent Vandeghinste, Peter Dirix, Ineke Schuurman, Stella Markantonatou, Sokratis Sofianopoulos, Marina Vassiliou, Olga Yannoutsou, Toni Badia, Maite Melero, Gemma Boleda, Michael Carl, & Paul Schmidt: Evaluation of a machine translation system for low resource languages: METIS-II.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 78KB]

(2008) Clare R.Voss, Matthew Aguirre, Jeffrey Micher, Richard Chang, Jamal Laoudi, & Reginald Hobbs: Boosting performance of weak MT engines automatically: using MT output to align segments & build statistical post-editors. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.192-201. [PDF, 506KB]

(2008) Clare R. Voss, Jamal Laoudi, & Jeffrey Micher: Exploitation of an Arabic language resource for MT evaluation: using Buckwalter-based lookup tool to augment CMU alignment algorithm.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 246KB]

(2008) Brian A. Weiss, Craig Schlenoff, Greg Sanders, Michelle P.Steves, Sherri Condon, Jon Phillips, & Dan Parvaz: Performance evaluation of speech translation systems.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 123KB]

(2008) Andreas Zollmann, Ashish Venugopal, Franz Och, & Jay Ponte: A systematic comparison of phrase-based, hierarchical and syntax-augmented statistical MT. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.1145-1152. [PDF, 153KB]

(2008) Simon Zwarts & Mark Dras: Choosing the right translation: a syntactically informed classification approach. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.1153-1160. [PDF, 194KB]

(2008) NIST: MetricsMATR challenge. [NIST, 2008]. 6pp. [PDF, 35KB]

(2008) NIST 2008 open machine translation evaluation – (MT08). Official evaluation results. [NIST, 2008]

(2007) proceedings of IWSLT 2007: International Workshop on Spoken Language Translation, 15-16 October 2007, Trento, Italy.

(2007) Maristella Agosti, Giorgio Maria di Nunzio, Nicola Ferro & Carol Peters: CLEF: ongoing activities and plans for the future. Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan; 12pp. [PDF, 2108KB]

(2007) Alison Alvarez, Lori Levin, Robert Frederking, & Jill Lehman: An assessment of language elicitation without the supervision of a linguist. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.1-10 [PDF, 415KB]; presentation [PDF, 412KB]

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Translating from under-resourced languages: comparing direct transfer against pivot translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.29-35 [PDF, 197KB]

(2007) Anja Belz & Albert Gatt: The attribute selection for GRE challenge: overview and evaluation results. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.75-83 [PDF, 438KB]

(2007) Chris Callison-Burch, Cameron Fordyce, Philipp Koehn, Christof Monz, & Josh Schroeder: (Meta-) evaluation of machine translation.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 136-158 [PDF, 373KB]

(2007) Olivier Hamon, Djamel Mostefa, & Khalid Choukri: End-to-end evaluation of a speech-to-speech translation system in TC-STAR. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.223-230 [PDF, 116KB]

(2007) Olivier Hamon: Experiences and conclusions from the CESTA evaluation project. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 22pp. [PDF of PPT presentation, 108KB]

(2007) Pierre Isabelle, Cyril Goutte, & Michel Simard: Domain adaptation of MT systems through automatic post-editing.  MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.255-261 [PDF, 151KB]

(2007) Heiki-Jaan Kaalep & Kaarel Veskis: Comparing parallel corpora and evaluating their quality. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.275-279 [PDF, 164KB]

(2007) Philipp Koehn & Chris Callison-Burch: Evaluating evaluation – lessons from the WMT 2007 shared task. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 38pp. [PDF of PPT presentation, 425KB]

(2007) Gorka Labaka, Nicolas Stroppa, Andy Way, & Kepa Sarasola: Comparing rule-based and data-driven approaches to Spanish-to-Basque machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.297-304 [PDF, 558KB]

(2007) Sharon O’Brien & Johann Roturier: How portable are controlled language rules? A comparison of two empirical MT studies. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.345-352 [PDF, 77KB]

(2007) Lene Offersgaard & Claus Povlsen: Patent documentation – comparison of two MT strategies. MT Summit XI Workshop on patent translation, 11 September 2007, Copenhagen, Denmark; pp.19-23. [PDF, 60KB]

(2007) Anna Sågvall Hein: Rule-based and statistical machine translation with a focus on Swedish [abstract]. Invited talk at TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; 1p. [PDF, 15KB]

(2007) Harold Somers: Machine translation and the World Wide Web. In: Khurshid Ahmad, Christopher Brewster, & Mark Stevenson (eds.): Words and intelligence II: essays in honor of Yorick Wilks (Dordrecht: Springer); pp.209-233.

(2007) Gregor Thurmair: Automatic evaluation in MT system production. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 28pp. [PDF of PPT presentation, 153KB]

(2007) Hua Wu & Haifeng Wang: Comparative study of word alignment heuristics and phrase-based SMT. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.507-514 [PDF, 168KB]

(2006) proceedings of International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan.

(2006) Azzah Al-Maskari & Mark Sanderson: The affect of machine translation on the performance of Arabic-English QA system. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multilingual Question Answering (MLQA06), Trento, Italy, April 4, 2006; pp.9-14 [PDF, 452KB]

(2006) Stephen Armstrong, Andy Way, Colm Caffrey, Marian Flanagan, Dorothy Kenny, & Minako O’Hagan: Improving the quality of automated DVD subtitles via example-based machine translation. Translating and the Computer 28: proceedings of the Twenty-eighth International Conference on Translating and the Computer, 16-17 November 2006, London. (London: Aslib, 2006); 13pp. [PDF, 159KB]

(2006) Rafael Banchs, Antonio Bonafonte, & Javier Pérez: Acceptance testing of a spoken language translation system. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2074-2079 [PDF, 262KB]

(2006) Nikos Chatzichrisafis, Pierrette Bouillon, Manny Rayner, Marianne Santaholma, Marianne Starlander & Beth Ann Hockey: Evaluating task performance for a unidirectional controlled language medical speech translation system.  HLT-NAACL 2006: Proceedings of the  Workshop on Medical Speech Translation, 9 June 2006, New York, NY, USA; pp.9-16 [PDF, 109KB]

(2006) John DeNero, Dan Gillick, James Zhang, & Dan Klein: Why generative phrase models underperform surface heuristics.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 31-38 [PDF, 393KB]

(2006) Donald A. DePalma: Evaluating MT prior to deployment.  AMTA 2006: 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; presentation on User Track [PDF of PPT presentation, 226KB]

(2006) O. Hamon, A. Popescu-Belis, K.Choukri, M.Dabbadie, A.Hartley, W.Mustafa El Hadi, M.Rajman, & I.Timimi: CESTA: first conclusions of the Technolangue MT evaluation campaign.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.179-184 [PDF, 288KB]

(2006) Gábor Hodász: Evaluation methods of a linguistically enriched translation memory system.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2044-2047 [PDF, 365KB]

(2006) Gábor Hodász: Towards a comprehensive evaluation method of memory-based translation systems. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.213-217 [PDF, 142KB]

(2006) Sattar Izwaini: Problems of Arabic machine translation: evaluation of three systems. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.118-148. [PDF, 281KB]

(2006) Howard Johnson, Fatiha Sadat, George Foster, Roland Kuhn, Michel Simard, Eric Joanis, & Samuel Larkin: PORTAGE, with smoothed phrase tables and segment choice models.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 134-137 [PDF, 118KB]

(2006) Kyo Kageura & Genichiro Kikui: A self-referring quantitative evaluation of the ATR Basic Travel Expression Corpus (BTEC). LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1945-1950 [PDF, 368KB]

(2006) Philipp Koehn & Christof Monz: Manual and automatic evaluation of machine translation between European languages.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 102-121 [PDF, 199KB]

(2006) Elina Lagoudaki: Translation memories survey 2006. Translation memory systems: enlightening users’ perspective. Imperial College London, November 2006; 39pp. [PDF, 1046KB]

(2006) Anne-Laure Ligozat, Brigitte Grau, Isabelle Robba, & Anne Vilnat: Evaluation and improvement of cross-language question answering strategies. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multilingual Question Answering (MLQA06), Trento, Italy, April 4, 2006; pp.23-30 [PDF, 551KB]

(2006) Elliott Macklovitch: TransType2: the last word.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.167-172 [PDF, 30KB]

(2006) Federica Mandreoli, Riccardo Martoglia, & Paolo Tiberio: EXTRA: a system for example-based translation assistance [abstract]. Machine Translation 20 (3), 2006; pp.167-197.

(2006) Keith J.Miller & Michelle Vanni: Formal vs. informal: register-differentiated Arabic MT evaluation in the PLATO paradigm.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.161-166 [PDF, 296KB]

(2006) Djamal Mostefa, Olivier Hamon, & Khalid Choukri: Evaluation of automatic speech recognition and speech language translation within TC-STAR: results from the first evaluation campaign.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.149-154 [PDF, 388KB]

(2006) Alexandre Patry, Fabrizio Gotti, & Philippe Langlais: Mood at work: Ramses versus Pharaoh. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 126-129 [PDF, 67KB]

(2006) Michael Paul: Overview of the IWSLT06 evaluation campaign. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; pp.1-15  [PDF, 134KB]

(2006) Evan Ratliff: Me translate pretty one day. Wired 14 (12), 2006; pp.210-213. [PDF, 32KB]

(2006) Toshiyuki Takezawa & Tohru Shimizu: Performance improvement of dialog speech translation by rejecting unreliable utterances. Interspeech 2006: ICSLP Ninth International Conference on  Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006, paper 1100; abstract [PDF, 78KB]

(2006) Taro Watanabe, Hajime Tsukada, & Hideki Isozaki: NTT system description for the WMT2006 shared task.  HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 122-125 [PDF, 91KB]

(2006) NIST 2006 machine translation evaluation. Official results. [NIST, 2006]; [HTML, 263KB]

(2005) Proceedings of the International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2005], 24-25 October, 2005, Pittsburgh, PA, USA.

(2005) Michael S.Blekhman, Olga Bezhanova, & Marina Byezhanova: Comparative analysis of the translation quality produced by three MT systems. International Journal of Translation 17 (1-2), Jan-Dec 2005; pp.39-85. [PDF, 236KB]

(2005) Marine Carpuat & Dekai Wu: Evaluating the word sense disambiguation performance of statistical machine translation. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.120-125. [PDF, 93KB]

(2005) Matthias Eck & Chiori Hori: Overview of the IWSLT 2005 Evaluation Campaign. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2005], 24-25 October, 2005, Pittsburgh, PA, USA; 22pp. [PDF, 256KB]

(2005) Declan Groves & Andy Way: Hybrid example-based SMT: the best of both worlds? ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp.183-190. [PDF, 65KB]

(2005) Declan Groves & Andy Way: Hybrid data-driven models of machine translation [abstract]. Machine Translation 19 (3-4), 2005; pp. 301-323.

(2005) Douglas Jones, Edward Gibson, Wade Shen, Neil Granoien, Martha Herzog, Douglas Reynolds, & Clifford Weinstein: Measuring human readability of machine generated text: three case studies in speech recognition and machine translation. Proceedings of 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 18-23, 2005, Philadelphia, PA, USA; vol.5, pp. 1009-1012 [PDF, 233KB]

(2005) Philipp Koehn & Christof Monz: Shared task: statistical machine translation between European languages. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 119-124. [PDF, 41KB]

(2005) Yves Lepage & Etienne Denoual: Purest ever example-based machine translation: detailed presentation and assessment [abstract]. Machine Translation 19 (3-4), 2005; pp.251-282.

(2005) Paula Estrella, Andrei Popescu-Belis, & Nancy Underwood: Finding the system that suits you best: towards the normalization of MT evaluation.  Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 12pp. [PDF, 637KB]

(2005) Liu Qun, Hou Hongxu, Lin Shouxun, Qian Yueliang, Zhang Yujie, & Isahara Hitoshi: Introduction to China’s HTRDP machine translation evaluation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit: invited paper; pp.i-18-22 [PDF, 310KB], also PDF of PPT presentation [502KB]

(2005) Stephan Oepen, Helge Dyvik, Dan Flickinger, Jan Tore Lønning, Paul Meurer, & Victoria Rosén: Holistic regression testing for high-quality MT: some methodological and technological reflections. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 198-204. [PDF, 173KB]

(2005) Nathalie de Preux: How much does using controlled language improve machine translation results? Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 14pp. [PDF, 109KB]

(2005) Manny Rayner, Pierrette Bouillon, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Hitoshi Isahara, Kyoko Kanzaki, & Yukie Nakao: A methodology for comparing grammar-based and robust approaches to speech understanding. Interspeech 2005 - Eurospeech: 9th European  Conference on  Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005; pp.1877-1880 [PDF, 52KB]; abstract [PDF, 55KB]

(2005) NIST 2005 machine translation evaluation. Official results. [NIST, 2005]; 6pp. [PDF, 38KB]

Examples of MT output

(2009) Dmitriy Genzel, Klaus Macherey & Jakob Uszkoreit: Creating a high-quality machine translation system for a low-resource language: Yiddish. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 41-48. [PDF, 157KB]

(2008) Jin’ichi Murakami, Masato Tokuhisa, & Satoru Ikehara: Statistical machine translation with long phrase table and without long parallel sentences. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 454-461. [PDF, 538KB]

(2005) Michael S.Blekhman, Olga Bezhanova, & Marina Byezhanova: Comparative analysis of the translation quality produced by three MT systems. International Journal of Translation 17 (1-2), Jan-Dec 2005; pp.39-85. [PDF, 236KB]

(2005) Mickel Grönroos & Ari Becks: Bringing intelligence to translation memory technology. Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 11pp. [PDF, 80KB]

Human-targeted Translation Edit Rate (HTER)

(2009) Chris Callison-Burch: Fast, cheap, and creative: evaluating translation quality using Amazon’s Mechanical Turk. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.286-295. [PDF, 289KB]

(2009) Matthew Snover, Nitin Madnani, Bonnie J.Dorr, & Richard Schwartz: Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.259-268. [PDF, 147KB]

(2007) Eduard Hovy: Investigating why BLEU penalizes non-statistical systems. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 10pp. [PDF of PPT presentation, 265KB]

(2006) Salim Roukos: Rosetta: an analyst’s co-pilot. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; 60pp.  [PDF, 2789KB]

(2006) Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, & John Makhoul: A study of translation edit rate with targeted human annotation.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.223-231 [PDF, 185KB]

Minimum error rate training [MERT]

(2009) Nicola Bertoldi, Barry Haddow, & Jean-Baptiste Fouet: Improved minimum error rate training in Moses. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.7-16. [PDF, 194KB] [presentation at MT Marathon 2009, 120KB]

(2009) George Foster & Roland Kuhn: Stabilizing minimum error rate training. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.242-249. [PDF, 126KB]

(2009) Katsuhiko Hayashi, Taro Watanabe, Hajime Tsukada, & Hideki Isozaki: Structural support vector machines for log-linear approach in statistical machine translation. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; pp. 144-151. [PDF, 352KB]; presentation [PDF of PPT, 220KB]

(2009) Yifan He & Andy Way: Improving the objective function in minimum error rate training. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.238-245. [PDF, 82KB]

(2009) Shankar Kumar, Wolfgang Macherey, Chris Dyer & Franz Och: Efficient minimum error rate training and minimum Bayes-risk decoding for translation hypergraphs and lattices. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.163-171. [PDF, 342KB]

(2009) Taraka Rama, Anil Kumar Singh, & Sudheer Kolachina: Modeling letter-to-phoneme conversion as a phrase based statistical machine translation problem with minimum error rate training. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Proceedings of the Student Research Workshop and Doctoral Consortium, Boulder, Colorado, June 1, 2009; pp.90-95. [PDF, 404KB]

(2009) Masao Utiyama, Hirofumi Yamamoto, & Eiichiro Sumita: Two methods for stabilizing MERT: NICT at IWSLT 2009.  IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; pp. 79-82. [PDF, 231KB]; presentation [PDF of PPT, 73KB]

(2009) Omar F.Zaidan & Chris Callison-Burch: Feasibility of human-in-the-loop minimum error rate training. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.52-61. [PDF, 321KB]

(2009) Omar F. Zaidan: Z-MERT: a fully configurable open source tool for minimum error rate training of machine translation systems. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.79-88. [PDF, 263KB]

(2009) Bing Zhao & Shengyuan Chen: A simplex Armijo downhill algorithm for optimizing statistical machine translation decoding parameters.  NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009; pp.21-24. [PDF, 601KB]

(2009) Minimum error rate training lab. Third Machine Translation Marathon, Prague, Czech Republic, 26-30 January 2009; 2pp. [PDF, 32KB]

(2008) David Chiang, Yuval Marton, & Philip Resnik: Online large-margin training of syntactic and structural translation features.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.224-233. [PDF, 162KB]

(2008) Kevin Duh & Katrin Kirchhoff: Beyond log-linear models: boosted minimum error rate training for n-best re-ranking. ACL 2008 HLT Short Papers, June 2008, Columbus, Ohio; pp.37-40. [PDF, 102KB]

(2008) Wolfgang Macherey, Franz Josef Och, Ignacio Thayer, & Jakob Uszkoreit: Lattice-based minimum error rate training for statistical machine translation.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.725-734. [PDF, 304KB]

(2008) Robert C.Moore & Chris Quirk: Random restarts in minimum error rate training for statistical machine translation.  Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.585-592. [PDF, 552KB]

Quality assurance

(2008) Julia Makoushina & Henrik J.Kockaert: Zen and the art of quality assurance - quality assurance automation in translation: needs, reality and expectations. Translating and the Computer 30, 27-28 November 2008, London; 9pp. [PDF, 101KB]

(2007) Julia Makoushina: Translation quality assurance tools: current state and future approaches.  Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 39pp. [PDF, 468KB]

Quality control

(2008) Lauren Friedman, Haejoong Lee, & Stephanie Strassel: A quality control framework for gold standard reference translations: the process and toolkit developed for GALE. Translating and the Computer 30, 27-28 November 2008, London; 6pp. [PDF, 77KB]

Quality improvement techniques (see also Interactive methods)

(2008) Reginald Hobbs, Jamal Laoudi, & Clare R.Voss: MTriage: web-enabled software for the creation, machine translation, and annotation of smart documents.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 433KB]

(2007) Ariadna Font Llitjós & William A. Ridmann: The inner works of an automatic rule refiner for machine translation. METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 10pp. [PDF, 412KB]

(2007) Hans-Udo Stadler & Ursula Peter-Spörndli: The quest for machine translation quality at CLS Communication. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.435-442 [PDF, 115KB]

(2006) David Kauchak: Contributions to research on machine translation. PhD thesis, University of California, San Diego, 2006. xiv,92pp. [PDF, 491KB]

(2006) Gregor Thurmair: Using corpus information to improve MT quality. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.45-48. [PDF, 371KB]

(2005) Zhu Jiang & Wang Haifeng: The effect of adding rules into the rule-based MT system. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.298-304. [PDF, 378KB]

(2005) Lars Ahrenberg: Codified close translation as a standard for MT. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 13-22. [PDF, 69KB]

(2005) Bart Mellebeek, Anna Khasin, Josef Van Genabith, & Andy Way: TransBooster: boosting the performance of wide-coverage machine translation systems. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 189-197. [PDF, 65KB]

(2005) R. Mahesh K. Sinha: Integrating CAT and MT in AnglaBharti-II architecture. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 235-244. [PDF, 205KB]

Quality measures see Evaluation measures and metrics

Reading and comprehension

(2009) Jieun Chae & Ani Nenkova: Predicting the fluency of text with shallow structural features: case studies of machine translation and human-written text.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.139-147. [PDF, 120KB]

(2009) Stephen Doherty & Sharon O'Brien: Can MT output be evaluated through eye tracking? MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.214-221. [PDF, 103KB]

(2009) Jesse Saba Kirchner, Justin Nuger, & Yi Zhang: An extensible crosslinguistic readability framework. ACL-IJCNLP-2009: Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.11-18. [PDF, 215KB]

(2009) Matthew Snover, Nitin Madnani, Bonnie J.Dorr, & Richard Schwartz: Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.259-268. [PDF, 147KB]

(2008) Gloria Corpas Pastor, Ruslan Mitkov, Naveed Afzal, & Viktor Pekar: Translation universals: do they exist? A corpus-based NLP study of convergence and simplification. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.75-81. [PDF, 566KB]

(2007) Douglas Jones, Martha Herzog, Hussny Ibrahim, Arvind Jairam, Wade Shen, Edward Gibson, & Michael Emonts: ILR-based MT comprehension test with multi-level questions. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; Companion volume, pp.77-80 [PDF, 67KB]

(2005) Douglas Jones, Edward Gibson, Wade Shen, Neil Granoien, Martha Herzog, Douglas Reynolds, & Clifford Weinstein: Measuring human readability of machine generated text: three case studies in speech recognition and machine translation. Proceedings of 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 18-23, 2005, Philadelphia, PA, USA; vol.5, pp. 1009-1012 [PDF, 233KB]

(2005) Katsunori Kotani, Takehiko Yoshimi, Takeshi Kutsumi, Ichiko Sata & Hitoshi Isahara: Toward a unified evaluation method for multiple reading support systems: a reading speed-based procedure. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.244-249. [PDF, 113KB]

(2005) Katsunori Kotani, Takehiko Yoshimi, Takeshi Kutsumi, Ichiko Sata & Hitoshi Isahara: A useful-based evaluation of reading support systems: comprehension, reading speed and effective speed. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.419-426. [PDF, 264KB]

Test corpora

(2009) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Exploiting patent information for the evaluation of machine translation. MT Summit XII: Third Workshop on Patent Translation, August 30, 2009, Ottawa, Ontario, Canada; pp. 1-8. [PDF, 108KB], presentation [PDF of PPT, 98KB]

(2009) Tomoki Nagase, Katsunori Kotani, Masaaki Nagata, Nobutoshi Hatanaka, Yoshiyuki Sakamoto, Eiichiro Sumita, & Kiyotaka Uchimoto: Evaluation of Japanese-Chinese MT system using AAMT’s test-set. CWMT 2009: the 5th China Workshop on Machine Translation, Nanjing, China, October 16-17, 2009; 8pp. [PDF, 811KB]

(2008) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Producing a test collection for patent machine translation in the seventh NTCIR workshop.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 172KB]

(2007) Kiyotaka Uchimoto, Katsunori Kotani, Yujie Zhang, & Hitoshi Isahara: Automatic evaluation of machine translation based on rate of accomplishment of sub-goals. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; pp.33-40 [PDF, 124KB]

Think-aloud protocols

(2005) Sharon O’Brien: Methodologies for measuring the correlations between post-editing effort and machine translatability [abstract]. Machine Translation 19 (1), 2005; pp.37-58.

Translatability

(2009) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Evaluation-guided pre-editing of source text: improving MT-tractability of light verb constructions. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.36-43 [PDF, 336KB]

(2009) Carol Van Ess-Dykema, Susan P.Converse, Dennis Perzanowski, & John S.White: Exploring translation memory for extensibility across genres: implications for usage and metrics. Translating and the Computer 31, 19-20 November 2009, London; 16pp. [PDF, 444KB]

(2007) Behrang Mohit & Rebecca Hwa: Localization of difficult-to-translate phrases.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 248-255 [PDF, 117KB]

(2006) Mike Dillinger: Tools and techniques for translatable content. Tutorial at AMTA 2006 conference, August 8, 2006, Cambridge, Massachusetts, USA; 25pp. [PDF of PPT presentation, 1398KB]

(2006) David M.Rojas & Takako Aikawa: Predicting MT quality as a function of the source language.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2534-2537 [PDF, 613KB]

(2006) Kiyotaka Uchimoto, Naoko Hayashida, Toru Ishida, & Hitoshi Isahara: Automatic detection and semi-automatic revision of non-machine-translatable parts of a sentence.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.703-708 [PDF, 330KB]

(2005) Sharon O’Brien: Methodologies for measuring the correlations between post-editing effort and machine translatability [abstract]. Machine Translation 19 (1), 2005; pp.37-58.

(2005) Laura Ramirez Polo & Johann Haller: Controlled language and the implementation of machine translation for technical documentation. Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 9pp. [PDF, 109KB]; presentation [PDF, 117KB]

(2005) Kiyotaka Uchimoto, Naoko Hayashida, Toru Ishida, & Hitoshi Isahara: Automatic rating of machine translatability. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.235-242. [PDF, 271KB]

Translation Edit Rate [TER] (see also Human-targeted Translation Edit Rate)

(2009) Sadaf Abdul-Rauf & Holger Schwenk: Exploiting comparable corpora with TER and TERp. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.46-54. [PDF, 186KB]

(2009) Matthew Snover, Nitin Madnani, Bonnie J.Dorr, & Richard Schwartz: Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.259-268. [PDF, 147KB]

(2009) Matthew G.Snover, Nitin Madnani, Bonnie Dorr, & Richard Schwartz: TER-Plus: paraphrase, semantic, and alignment enhancements to Translation Edit Rate [abstract]. Machine Translation 23 (2/3), September 2009; pp.117-127.

(2009) Anders Søgaard & Dekai Wu: Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars. IWPT-09: Proceedings of the 11th International Conference on Parsing Technologies, 7-9 October 2009, Paris, France; pp. 33-36. [PDF, 87KB]

(2008) Abhaya Agarwal & Alon Lavie: Meteor, M-BLEU and M-TER: Evaluation metrics for high-correlation with human rankings of machine translation output.  ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.115-118. [PDF, 148KB]

(2008) Almut Silja Hildebrand & Stephan Vogel: Combination of machine translation systems via hypothesis selection from combined n-best lists. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.254-261 [PDF, 681KB]

(2008) Lene Offersgaard, Claus Povlsen, Lisbeth Almsten, & Bente Maegaard: Domain specific MT in use. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.150-159. [PDF, 554KB]

(2008) Antti-Veikko I. Rosti, Bing Zhang, Spyros Matsoukas, & Richard Schwartz: Incremental hypothesis alignment for building confusion networks with application to machine translation system combination. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.183-186. [PDF, 61KB]

Translationese

(2009) Cyril Goutte, David Kurokawa, & Pierre Isabelle: Improving SMT by learning translation direction. SMART Workshop at EACL 2009, Barcelona, Spain, 13 May 2009. 24 slides. [PDF of PPT, 116KB]

(2009) David Kurokawa, Cyril Goutte & Pierre Isabelle: Automatic detection of translated text and its impact on machine translation. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 81-88. [PDF, 118KB]

Usability of systems

(2009) Jennifer DeCamp: What is missing in user-centric MT? MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 489-485. [PDF, 454KB]

(2008) Will Burgett & Julie Chang: The triple advantage factor of machine translation: cost, time-to-market and FAUT. [Keynote presentation at] AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.1-10. [PDF of PPT presentation, 853KB]

(2008) Lene Offersgaard, Claus Povlsen, Lisbeth Almsten, & Bente Maegaard: Domain specific MT in use. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.150-159. [PDF, 554KB]

(2006) Mark Seligman & Mike Dillinger: Usability issues in an interactive speech-to-speech translation system for healthcare. HLT-NAACL 2006: Proceedings of the  Workshop on Medical Speech Translation, 9 June 2006, New York, NY, USA; pp.1-8 [PDF, 288KB]

(2006) Harold Somers: Language engineering and the pathway to healthcare: a user-oriented view.  HLT-NAACL 2006: Proceedings of the  Workshop on Medical Speech Translation, 9 June 2006, New York, NY, USA; pp.32-39 [PDF, 187KB]

(2005) Leslie Barrett & Robert Levin: Usability considerations for a cellular-based text translator. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.471-475. [PDF, 235KB]