Machine Translation Archive

Index of data, corpora and resources

Publications 2005-2009

For other periods go to: Publications since 2010; publications 2000-2004; publications 1990-1999; publications 1970-1989; publications before 1970

Bilingual corpora [see also Example-based methods, Multilingual corpora]

(2009) R.Basili, D.De Cao, D.Croce, B.Coppola, & A.Moschitti: Cross-language frame semantic transfer in bilingual corpora [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 76KB]

(2009) Ondřej Bojar & Zdeněk Žabokrtský: CzEng 0.9: large parallel treebank with rich annotation. Prague Bulletin of Mathematical Linguistics, no.92, December 2009; pp.63-83 [PDF, 217KB]

(2009) E.Boldrini, S.Ferrández, R.Izquierdo, D.Tomás, & J.L.Vicedo: A parallel corpus labeled using open and restricted domain ontologies [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 65KB]

(2009) Budiono, Hammam Riza, & Charil Hakim: Resource report: building parallel text corpora for multi-domain translation system. ACL-IJCNLP-2009: 7th Workshop on Asian Language Resources (ALR-7), Proceedings of the workshop, 6-7 August 2009, Suntec, Singapore; pp. 92-95. [PDF, 243KB]

(2009) Han-Bin Chen, Jian-Cheng Wu & Jason S.Chang: Learning bilingual linguistic reordering model for statistical machine translation.  NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.254-262. [PDF, 333KB]

(2009) Guy De Pauw, Peter Waiganjo Wagacha, & Gilles-Maurice de Schryver: The SAWA corpus: a parallel corpus English – Swahili. Proceedings of the EACL Workshop on Language Technologies for African Languages (AfLaT 2009), Athens, Greece, 31 March 2009; pp.9-16. [PDF, 129KB]

(2009) Loic Dugast, Jean Senellart & Philipp Koehn: Selective addition of corpus-extracted phrasal lexical rules to a rule-based machine translation system. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.222-229. [PDF, 183KB]

(2009) Izaskun Fernandez, Iñaki Alegria, & Nerea Ezeiza: Using Wikipedia for named-entity translation. [SEPLN 2009] SALTMIL 2009, Donostia-San Sebastián, Spain. “Information retrieval and information extraction for less resourced languages”, Donostia-San Sebastián, September 7; pp.27-35. [PDF, 550KB]

(2009) Alexander Fraser, Renjing Wang, & Hinrich Schütze: Rich bitext projection features for parse reranking. EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.282-290. [PDF, 132KB]

(2009) Alon Halevy, Peter Norvig, & Fernando Pereira: The unreasonable effectiveness of data. IEEE Intelligent Systems, March-April 2009; pp.8-12. [PDF, 376KB]

(2009) Liang Huang, Wenbin Jiang & Qun Liu: Bilingually-constrained (monolingual) shift-reduce parsing. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.1222-1231. [PDF, 169KB]

(2009) Tatsuya Ishisaka, Kazuhide Yamamoto, Masao Utiyama, & Eiichiro Sumita: Development of a Japanese-English software manual parallel corpus. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.254-259. [PDF, 87KB]

(2009) Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, & Qingsheng Zhu: Mining bilingual data from the web with adaptively learnt patterns. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.870-878. [PDF, 306KB]

(2009) A.Kumaran, K.Saravanan, Naren Datha, B.Ashok, & Vikram Dendi: WikiBABEL: a Wiki-style platform for creation of parallel data. Proceedings of the ACL-IJCNLP 2009 Software Demonstrations, Suntec, Singapore, 3 August 2009; pp.29-32. [PDF, 769KB]

(2009) Bin Lu, Benjamin K.Tsou, Jingbo Zhu, Tao Jiang, & Oi Yee Kwong: The construction of a Chinese-English patent parallel corpus. MT Summit XII: Third Workshop on Patent Translation, August 30, 2009, Ottawa, Ontario, Canada; pp. 17-24. [PDF, 181KB]

(2009) Paul McNamee, James Mayfield, & Charles Nicholas: Translation corpus source and size in bilingual retrieval. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009; pp.25-28. [PDF, 177KB]

(2009) David Mareček: Analysis and alignment of parallel data in TectoMT. Third Machine Translation Marathon, Prague, Czech Republic, 26-30 January 2009; pp.57-58 [PDF, 122KB]

(2009) Preslav Nakov & Hwee Tou Ng: Improved statistical machine translation for resource-poor languages using related resource-rich languages. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.1358-1367. [PDF, 209KB]

(2009) Svetlin Nakov, Preslav Nakov, & Elena Paskaleva: Unsupervised extraction of false friends from parallel bi-texts using the web as a corpus.  [RANLP 2009] International conference: Recent Advances in Natural Language Processing. Proceedings ed. Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Nicolas Nicolov, Nikolai Nikolov, Borovets, Bulgaria, 14-16 September 2009; pp. 292-298. [PDF, 235KB]

(2009) Sylwia Ozdowska & Andy Way: Optimal bilingual data for French-English PB-SMT. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.96-103. [PDF, 224KB]

(2009) Raphaël Rubino: Exploring context variation and lexicon coverage in projection-based approach for term translation. [RANLP 2009] Student Research Workshop [held at], International conference: Recent Advances in Natural Language Processing. Proceedings ed. Irina Temnikova, Ivelina Nikolova, Natalia Konstantinova, Borovets, Bulgaria, 14-15 September 2009; pp.66-70. [PDF, 470KB]

(2009) Ibrahim M.Saleh & Nizar Habash: Automatic extraction of lemma-based bilingual dictionaries for morphologically rich languages. CAASL-3 – Third Workshop on Computational Approaches to Arabic Script-based Languages [at] MT Summit XII, August 26, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 622KB]

(2009) Felipe Sánchez-Martínez & Mikel L.Forcada: Inferring shallow-transfer machine translation rules from small parallel corpora. Journal of Artificial Intelligence Research 34, pp. 605-635. [PDF, 706KB]

(2009) Ruhi Sarikaya, Sameer Maskey, Rong Zhang & Ea-Ee Jan: Iterative sentence-pair extraction from quasi-parallel corpora for machine translation. Interspeech 2009: 10th Annual Conference of the International Speech Communication Association, 6-10 September 2009, Brighton, UK; abstract [PDF]

(2009) R.Mahesh K.Sinha: Automated mining of names using parallel Hindi-English corpus. ACL-IJCNLP-2009: 7th Workshop on Asian Language Resources (ALR-7), Proceedings of the workshop, 6-7 August 2009, Suntec, Singapore; pp. 11-18. [PDF, 312KB]

(2009) R.Mahesh K.Sinha: Mining complex predicates in Hindi using a parallel Hindi-English corpus. [ACL-IJCNLP-2009] Proceedings of the 2009 Workshop on Multiword Expressions, ACL-IJCNLP 2009, Suntec, Singapore, 6 August 2009; pp.40-46. [PDF, 112KB]

(2009) Benjamin Snyder, Tahira Naseem, & Regina Barzilay: Unsupervised multilingual grammar induction. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.73-81. [PDF, 273KB]

(2009) Sanghoun Song & Francis Bond: Online search interface for the Sejong Korean-Japanese bilingual corpus and auto-interpretation of phrase alignment. ACL-IJCNLP 2009: Third Linguistic Annotation Workshop (LAW III), Proceedings of the workshop, 6-7 August 2009, Suntec, Singapore; pp.146-149. [PDF, 153KB]

(2009) John Tinsley & Andy Way: Automatically generated parallel treebanks and their exploitability in machine translation [abstract]. Machine Translation 23 (1), February 2009; pp.1-22.

(2009) Masao Utiyama & Hitoshi Isahara: Mining patents for parallel corpora. In: Cyril Goutte, Nicola Cancedda, Marc Dymetman, & George Foster (eds.) Learning machine translation. (Cambridge, Mass.: The MIT Press, 2009); pp.41-58.

(2009) Kun Yu & Junichi Tsujii: Bilingual dictionary extraction from Wikipedia. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 379-386. [PDF, 984KB]

(2009) Qibo Zhu, Diana Inkpen & Ash Asudeh: Inducing translations from officially published materials in Canadian government websites. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 176-183. [PDF, 167KB]

(2008) Takeshi Abekawa & Kyo Kageura: Constructing a corpus that indicates patterns of modification between draft and final translations by human translators. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 65KB]

(2008) Farag Ahmed & Andreas Nürnberger: Arabic/English word translation disambiguation using parallel corpora and matching schemes. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.6-11. [PDF, 616KB]

(2008) Marianna Apidianaki: Translation-oriented word sense induction based on parallel corpora. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 232KB]

(2008) Ondřej Bojar, Miroslav Janiček, Zdeněk Žabokrtský, Pavel Češka, & Peter Beňa: CzEng 0.7: parallel corpus with community-supplied translations. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 73KB]

(2008) David Burkett & Dan Klein: Two languages are better than one (for syntactic parsing).  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.877-886. [PDF, 405KB]

(2008) Chris Callison-Burch: Syntactic constraints on paraphrases extracted from parallel corpora. EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.196-205. [PDF, 187KB]

(2008) Jorge Civera & Alfons Juan-Ciscar: Bilingual text classification using the IBM 1 translation model.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 77KB]

(2008) Alain Désilets, Benoit Farley, Marta Stojanovic, & Geneviève Patenaude: WeBiText: building large heterogeneous translation memories from parallel web content. Translating and the Computer 30, 27-28 November 2008, London; 11pp. [PDF, 470KB]

(2008) Mark Fishel & Heiki-Jaan Kaalep: Experiments on processing overlapping parallel corpora. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 186KB]

(2008) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Toward the evaluation of machine translation using patent information.  AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.97-106. [PDF, 649KB]

(2008) Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, & Takehito Utsuro: Producing a test collection for patent machine translation in the seventh NTCIR workshop.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 172KB]

(2008) Le An Ha, Gabriela Fernandez, Ruslan Mitkov, & Gloria Corpas: Mutual bilingual terminology extraction. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 87KB]

(2008) Olivier Hamon & Djamel Mostefa: The impact of reference quality on automatic MT evaluation.  Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.39-42. [PDF, 66KB]

(2008) Young-Sook Hwang, YoungKil Kim, & SangKyu Park: Paraphrasing depending on bilingual context toward generalization of translation knowledge. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.327-334. [PDF, 952KB]

(2008) Cong-Phap Huynh, Christian Boitet, & Hervé Blanchon: SECTra_w.1: an online collaborative system for evaluating, post-editing and presenting MT translation corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 824KB]

(2008) Alon Lavie, Alok Parlikar, & Vamshi Ambati: Syntax-driven learning of sub-sentential translation equivalents and translation rules from parsed parallel corpora. Second ACL Workshop on Syntax and Structure in Statistical Translation (ACL-08 SSST-2), Proceedings, 20 June 2008, Columbus, Ohio, USA; pp.87-95. [PDF, 1610KB]

(2008) Bo Li & Juan Liu: Mining Chinese-English parallel corpora from the web. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.847-852. [PDF, 457KB]

(2008) Lieve Macken, Els Lefever, & Veronique Hoste:  Linguistically-based sub-sentential alignment for terminology extraction from a bilingual automotive corpus. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.529-536. [PDF, 145KB]

(2008) Elliott Macklovitch, Guy Lapalme, & Fabrizio Gotti: TransSearch: what are translators looking for?  AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.412-419. [PDF, 757KB]

(2008) Kazuaki Maeda, Xiaoyi Ma, & Stephanie Strassel: Creating sentence-aligned parallel text corpora from a large archive of potential parallel text using BITS and Champollion.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 68KB]

(2008) Michael Mohler & Rada Mihalcea: BABYLON parallel text builder: gathering parallel texts for low-density languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 63KB]

(2008) Rogelio Nazar, Leo Wanner, & Jorge Vivaldi: Two-step flow in bilingual lexicon extraction from unrelated corpora. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.140-149. [PDF, 537KB]

(2008) Achim Ruopp & Fei Xia: Finding parallel texts on the web using cross-language information retrieval.  IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.18-25. [PDF, 239KB]

(2008) Lei Shi & Ming Zhou: Improved sentence alignment on parallel web pages using a stochastic tree alignment model.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.505-513. [PDF, 483KB]

(2008) Špela Vintar: Corpora in translation: a Slovene perspective. Journal of Specialised Translation, issue 10, July 2008; pp.40-55. [PDF, 187KB]

(2008) Keiji Yasuda, Ruiqiang Zhang, Hirofumi Yamamoto, & Eiichiro Sumita: Method of selecting training data to build a compact and efficient translation model. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.655-660. [PDF, 447KB]

(2008) Shiqi Zhao, Haifeng Wang, Ting Liu, & Sheng Li: Pivot approach for extracting paraphrase patterns from bilingual corpora. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 780-788. [PDF, 188KB]

(2007) Julia Aymerich & Hermes Camelo: Automatic extraction of entries for a machine translation dictionary using bitexts. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.21-27 [PDF, 88KB]

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: A dynamic dictionary for discovering indirect translation equivalents.  Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 10pp. [PDF, 150KB]

(2007) Matthias Buch-Kromann: Breaking the barrier of context-freeness: towards a linguistically adequate probabilistic dependency model of parallel texts. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.31-40 [PDF, 432KB]; presentation [PDF, 1285KB]

(2007) Matthias Buch-Kromann: Computing translation units and quantifying parallelism in parallel dependency treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.69-76 [PDF, 361KB]

(2007) Gloria Corpas Pastor: Lost in specialised translation: the corpus as an inexpensive and under-exploited aid for language service providers. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 18pp. [PDF, 159KB]

(2007) Gloria Corpas Pastor & Miriam Seghiri: Specialized corpora for translators: a quantitative method to determine representativeness. Translation Journal 11 (3), July 2007; 8pp. [PDF, 232KB]

(2007) Alain Désilets, Caroline Barrière, & Jean Quirion: Making WikiMedia resources more useful for translators. Wikimania 2007: the international Wikimedia conference, Taipei, Taiwan, 30 August 2007; 27 slides [PDF, 704KB]

(2007) Ulrich Germann: Two tools for creating and visualizing sub-sentential alignments of parallel text. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.121-124 [PDF, 247KB]

(2007) Xiaoguang Hu, Haifeng Wang, & Hua Wu: Using RBMT systems to produce bilingual corpus for SMT. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 287-295. [PDF, 131KB]

(2007) Masaki Itagaki, Takako Aikawa, & Xiaodong He: Automatic validation of terminology translation consistency with statistical method.  MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.269-274 [PDF, 416KB]

(2007) Krzysztof Jassem & Tomasz Kowalski: Machine translation using scarce bilingual corpora. TASK Quarterly 11, no.1-2, 21-33. [PDF, 216KB]

(2007) J. Howard Johnson, Joel Martin, George Foster & Roland Kuhn: Improving translation quality by discarding most of the phrasetable. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 967-975. [PDF, 182KB]

(2007) Heiki-Jaan Kaalep & Kaarel Veskis: Comparing parallel corpora and evaluating their quality. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.275-279 [PDF, 164KB]

(2007) Caroline Lavecchia, Kamel Smaïli, & David Langlois: Building a bilingual dictionary from movie subtitles based on inter-lingual triggers. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 19pp. [PDF, 139KB]

(2007) Caroline Lavecchia, Kamel Smaïli, David Langlois, & Jean-Paul Haton: Using inter-lingual triggers for machine translation. Interspeech 2007: 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; pp.2829-2832; abstract [PDF, 60KB]

(2007) Yajuan Lü, Jin Huang & Qun Liu: Improving statistical machine translation performance by training data selection and optimization. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 343-350. [PDF, 235KB]

(2007) Lieve Macken, Julia Trushkina, & Lidia Rura: Dutch parallel corpus: MT corpus and translator’s aid. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.313-320 [PDF, 132KB]

(2007) Jin’ichi Murakami, Masato Tokuhisa, & Satoru Ikehara: Statistical machine translation using large J/E parallel corpus and long phrase tables.  IWSLT 2007: International Workshop on Spoken Language Translation, 15-16 October 2007, Trento, Italy. 6pp. [PDF, 69KB]; presentation [PDF, 304KB]

(2007) Hwee Tou Ng & Yee Seng Chan: SemEval-2007 task 11: English lexical sample task via English-Chinese parallel text. ACL 2007: proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.54-58 [PDF, 84KB]

(2007) Chris Quirk, Raghavendra Udupa U., & Arul Menezes: Generative models of noisy translations with applications to parallel fragment extraction. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.377-384 [PDF, 249KB]

(2007) Monika Rosińska: Collecting Polish-German parallel corpora in the Internet. Proceedings of the International Multiconference on Computer Science and Information Technology, Wisla, Poland, 15-17 October 2007; pp.285-292. [PDF, 550KB]

(2007) Masao Utiyama & Hitoshi Isahara: A Japanese-English patent parallel corpus. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.475-482 [PDF, 113KB]

(2007) Vincent Vandeghinste: Removing the distinction between a translation memory, a bilingual dictionary and a parallel corpus. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 21pp. [PDF, 90KB]

(2007) Antal van den Bosch, Nicolas Stroppa, & Andy Way: A memory-based classification approach to marker-based EBMT.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 10pp. [PDF, 251KB]

(2007) Martin Volk, Joakim Lundborg, & Maël Mettler: A search tool for parallel treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.85-92 [PDF, 270KB]

(2007) Qibo Zhu, Diana Inkpen & Ash Asudeh: Automatic extraction of translations from web-based bilingual materials [abstract]. Machine Translation 21 (3), September 2007; pp.139-163.

(2006) Saba Amsalu: Data-driven Amharic-English bilingual lexicon acquisition. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.281-286 [PDF, 366KB]

(2006) Marco Baroni, Adam Kilgarriff, Jan Pomikálek, & Pavel Rychlý: WebBootCaT: instant domain-specific corpora to support human translators. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.247-252 [PDF, 191KB]

(2006) Ondřej Bojar & Zdeněk Žabokrtský: CzEng: Czech-English parallel corpus: release version 0.5. Prague Bulletin of Mathematical Linguistics, no.86, 2006; pp.59-62. [PDF, 92KB]

(2006) Helena M.Caseli, Maria das Graças V.Nunes, & Mikel L.Forcada: Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation [abstract]. Machine Translation 20 (4), 2006; pp.227-245.

(2006) A.Casillas, A. Díaz de Illarraza, J.Igartua, R. Martínez, & K. Sarasola: Compilation and structuring of a Spanish-Basque parallel corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.55-58. [PDF, 172KB]

(2006) Lea Cyrus: Building a resource for studying translation shifts.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.697-702 [PDF, 358KB]

(2006) Andreas Eisele: Parallel corpora and phrase-based statistical machine translation for new language pairs via multiple intermediaries.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.845-848 [PDF, 329KB]

(2006) Tomaž Erjavec: The English-Slovene ACQUIS corpus. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2138-2141 [PDF, 365KB]

(2006) Dan Flickinger: Identifying complex phenomena in a corpus via a treebank lens. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.125-129 [PDF, 89KB]

(2006) Ken’ichi Fukushima, Kenjiro Taura, & Takashi Chikayama: A fast and accurate method for detecting English-Japanese parallel texts. Coling-ACL 2006: Proceedings of the Workshop on Multilingual Language Resources and Interoperability, Sydney, July 2006; pp.60-67. [PDF, 258KB]

(2006) Silvia Hansen-Schirra, Stella Neumann, & Mihaela Vela: Multi-dimensional annotation and alignment in an English-German translation corpus. EACL-2006: Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-dimensional Markup in Natural Language Processing, April 4, 2006, Trento, Italy; pp.35-42. [PDF, 77KB]

(2006) Krzysztof Jassem & Tomasz Kowalski: An algorithm for extracting translation rules from scarce bilingual corpora. Proceedings of the International Multiconference on Computer Science and Information Technology, vol.1: XXII Autumn Meeting of Polish Information Processing Society, November 6-10, 2006, Wisla, Poland; pp.67-73. [PDF, 368KB]

(2006) Beáta Bandmann Megyesi, Anna Sågvall Hein, & Éva Csató Johanson: Building a Swedish-Turkish parallel corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2130-2133 [PDF, 692KB]

(2006) Márton Miháltz & Gábor Pohl: Exploiting parallel corpora for supervised word sense disambiguation in English-Hungarian machine translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1294-1297 [PDF, 339KB]

(2006) Dragos Stefan Munteanu & Daniel Marcu: Extracting parallel sub-sentential fragments from non-parallel corpora. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.81-88. [PDF, 1598KB]

(2006) Dragos Stefan Munteanu & Daniel Marcu: Improving machine translation performance by exploiting non-parallel corpora. Computational Linguistics 31 (4), pp. 477-504 [PDF, 1060KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging recurrent phrase structure in large-scale ontology translation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.141-150 [PDF, 686KB]

(2006) Sylwia Ozdowska: Projecting POS tags and syntactic dependencies from English and French to Polish in aligned corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.53-60 [PDF, 366KB]

(2006) Michael Paul & Eiichiro Sumita: Exploiting variant corpora for machine translation.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 113-116 [PDF, 86KB]

(2006) Alicia Pérez, Inés Torres, Francisco Casacuberta, & Víctor Guijarrubia: A Spanish-Basque weather forecast corpus for probabilistic speech translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.99-102. [PDF, 227KB]

(2006) Tao Tao, Su-Youn Yoon, Andrew Fister, Richard Sproat, & ChengXiang Zhai: Unsupervised named entity transliteration using temporal and phonetic correlation. EMNLP-2006: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, July 2006; pp. 250-257. [PDF, 139KB]

(2006) Gregor Thurmair: Using corpus information to improve MT quality. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.45-48. [PDF, 371KB]

(2006) Hitomi Tohyama & Shigeki Matsubara: Collection of simultaneous interpreting patterns by using bilingual spoken monologue corpus. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2564-2569 [PDF, 552KB]

(2006) Mihaela Vela & Silvia Hansen-Schirra: The use of multi-level annotation and alignment for the translator. Translating and the Computer 28: proceedings of the Twenty-eighth International Conference on Translating and the Computer, 16-17 November 2006, London. (London: Aslib, 2006); 19pp. [PDF, 187KB]

(2006) Haifeng Wang, Hua Wu, & Zhanyi Liu: Word alignment for languages with scarce resources using bilingual corpora of other language pairs. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.874-881. [PDF, 155KB]

(2006) Xinglong Wang & David Martinez: Word sense disambiguation using automatically translated sense examples. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.45-52 [PDF, 272KB]

(2006) Benjamin Wellington, Sonja Waxmonsky, & I.Dan Melamed: Empirical lower bounds on the complexity of translational equivalence. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.977-984. [PDF, 121KB]

(2006) Michael Wilkinson: Compiling corpora for use as translation resources. Translation Journal 10 (1), January 2006; 7pp. [PDF, 132KB]

(2006) Jia Xu, Richard Zens, & Hermann Ney: Partitioning parallel documents using binary segmentation. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 78-85 [PDF, 433KB]

(2005) Proceedings of ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005

(2005) Alison Alvarez, Lori Levin, Robert Frederking, Erik Peterson & Jeff Good: Semi-automated elicitation corpus generation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.388-395. [PDF, 131KB]

(2005) Naoki Asanoma, Setsuo Yamada, Osamu Furuse, & Masahiro Oku: Building a conversation corpus by text derivation from "germ dialogs". 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 27-32. [PDF, 82KB]

(2005) Colin Bannard & Chris Callison-Burch: Paraphrasing with bilingual parallel corpora. ACL-2005: 43rd Annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, 25-30 June 2005; pp. 597-604. [PDF, 196KB]

(2005) Chris Brockett & William B.Dolan: Support vector machines for paraphrase identification and corpus construction. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 1-8. [PDF, 115KB]

(2005) Martin Čmejrek, Jan Cuřín, Jan Hajič, & Jiří Havelka: Prague Czech-English dependency treebank: resource for structure-based MT. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 73-78. [PDF, 66KB]

(2005) Etienne Denoual: The influence of example-data homogeneity on EBMT quality. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.35-42. [PDF, 404KB]

(2005) John Fry: Assembling a parallel corpus from RSS news feeds. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.59-62. [PDF, 303KB]

(2005) Pablo Gamallo Otero: Extraction of translation equivalents from parallel corpora using sense-sensitive contexts. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 97-102. [PDF, 52KB]

(2005) Emmanuel Giguet: Multi-grained alignment of parallel texts with endogenous resources. International workshop: Modern approaches in translation technologies, Borovets, Bulgaria, 24 September 2005; pp.12-17 [PDF, 194KB]

(2005) Ebba Gustavii: Target language preposition selection - an experiment with transformation based learning and aligned bilingual data. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 112-118. [PDF, 45KB]

(2005) Fei Huang, Ying Zhang, & Stephan Vogel: Mining key phrase translations from web corpora. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 483-490. [PDF, 310KB]

(2005) Zhenglin Jin & Caroline Barrière: Exploring sentence variations with bilingual corpora. Corpus Linguistics 2005. Birmingham, UK, 14-17 July 2005; 14pp. [PDF, 226KB]

(2005) Toshiyuki Kanamaru, Masaki Murata, Kow Kuroda, & Hitoshi Isahara: Obtaining Japanese lexical units for semantic frames from Berkeley FrameNet using a bilingual corpus. IJCNLP-05: Sixth International Workshop on Linguistically Interpreted Corpora (LINC 2005). Proceedings of the workshop, 15 October 2005, Jeju Island, Korea; pp. 11-20. [PDF, 106KB]

(2005) Chunyu Kit, Xiaoyue Liu, KingKui Siu, & Jonathan J.Webster: Harvesting the bitexts of the laws of Hong Kong from the web. IJCNLP-05: Fifth Workshop on Asian Language Resources (ALR-05). Proceedings of the workshop, 14 October 2005, Jeju Island, Korea; pp. 71-78. [PDF, 665KB]

(2005) Grzegorz Kondrak: Cognates and word alignment in bitexts. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.305-312. [PDF, 179KB]

(2005) Jonas Kuhn: Parsing word-aligned parallel corpora in a grammar induction context. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 17-25. [PDF, 150KB]

(2005) Yves Lepage & Etienne Denoual: The ‘purest’ EBMT system ever built: no variables, no templates, no training, examples, just examples, only examples. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.81-90. [PDF, 400KB]

(2005) Karin Müller: Revealing phonological similarities between related languages from automatically generated parallel corpora.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 33-40. [PDF, 114KB]

(2005) Sylwia Ozdowska: Using bilingual dependencies to align words in English/French parallel corpora.  ACL-2005: Student Research Workshop, University of Michigan, Ann Arbor, June 2005; pp. 127-132. [PDF, 89KB]

(2005) Isamu Okada, Shinichiro Miyazawa, Kazunari Ishida, Nobuhiko Shimizu, & Toshizumi Ohta: Quality analysis of patent parallel corpus by the scale. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.29-34. [PDF, 82KB]

(2005) Sitthaa Phaholphinyo, Teerapong Modhiran, Nattapol Kritsuthikul, & Thepchai Supnithi: A practical of memory-based approach for improving accuracy of MT. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.41-46. [PDF, 297KB]

(2005) Masatsugu Tonoike, Mitsuhiro Kida, Toshihiro Takagi, Yasuhiro Sasaki, Takehito Utsuro, & Satoshi Sato: Effect of domain-specific corpus in compositional translation estimation for technical terms. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.114-119. [PDF, 193KB]

(2005) Michael Wilkinson: Discovering translation equivalents in a tourism corpus by means of fuzzy searching. Translation Journal 9 (4), October 2005; 6pp. [PDF, 182KB]

(2005) Toon Witkam: A new road to automatic translation. [Utrecht: private publication, 26 November 2005]. 13pp. [PDF, 177KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.85-90. [PDF, 936KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.71-78. [PDF, 1139KB]

(2005) Yujie Zhang, Qun Liu, Qing Ma, & Hitoshi Isahara: A multi-aligner for Japanese-Chinese parallel corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.133-140. [PDF, 420KB]

(2005) Bing Zhao & Alex Waibel: Learning a log-linear model with bilingual phrase-pair features for statistical machine translation. IJCNLP-05: Fourth SIGHAN  Workshop on Chinese Language Processing. Proceedings of the workshop, 14-15 October 2005, Jeju Island, Korea; pp. 79-86. [PDF, 1301KB]

Comparable corpora

(2009) Sadaf Abdul-Rauf & Holger Schwenk: Exploiting comparable corpora with TER and TERp. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.46-54. [PDF, 186KB]

(2009) Sadaf Abdul-Rauf & Holger Schwenk: On the use of comparable corpora to improve SMT performance. EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.16-23. [PDF, 100KB]

(2009) Ken Church: Repetition and language models and comparable corpora [abstract]. ACL-IJCNLP-2009: Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; p.1. [PDF, 61KB]

(2009) Thi-Ngoc-Diep Do, Viet-Bac Le, Brigitte Bigi, Laurent Besacier, & Eric Castelli: Mining a comparable text corpus for a Vietnamese-French statistical machine translation system.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.165-172. [PDF, 152KB]

(2009) Lorraine Goeuriot, Emmanuel Morin & Béatrice Daille: Compilation of specialized comparable corpora in French and Japanese. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.55-63. [PDF, 186KB]

(2009) Xiwu Han, Hanzhang Li, & Tiejun Zhao: Train the machine with what it can learn – corpus selection for SMT. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.27-33. [PDF, 184KB]

(2009) Heng Ji: Mining name translations from comparable corpora by creating bilingual information networks. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.34-47. [PDF, 120KB]

(2009) Miguel A.Jiménez-Crespo: Conventions in localization: a corpus study of original vs. translated web texts. Journal of Specialised Translation 12 (July 2009); pp.79-102. [PDF, 285KB]

(2009) Jesse Saba Kirchner, Justin Nuger, & Yi Zhang: An extensible crosslinguistic readability framework. ACL-IJCNLP-2009: Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.11-18. [PDF, 215KB]

(2009) Emmanuel Prochasson, Emmanuel Morin & Kyo Kageura: Anchor points for bilingual lexicon extraction from small comparable corpora. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.284-291. [PDF, 138KB]

(2009) Kun Yu & Junichi Tsujii: Extracting bilingual dictionary from comparable corpora with dependency heterogeneity. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009; pp.121-124. [PDF, 132KB]

(2008) Bogdan Babych, Serge Sharoff, & Anthony Hartley: Generalising lexical translation strategies for MT using comparable corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 412KB]

(2008) K.Saravanan & A.Kumaran: Some experiments in mining named entity transliteration pairs from comparable corpora. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.26-33. [PDF, 805KB]

(2008) Kathrin Spreyer, Jonas Kuhn, & Bettina Schrader: Identification of comparable argument-head relations in parallel corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 138KB]

(2007) Pablo Gamallo Otero: Learning bilingual lexicons from comparable English and Spanish corpora. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.191-197 [PDF, 509KB]

(2007) E.Morin, B.Daille, K.Takeuchi, & K.Kageura: Bilingual terminology mining – using brain, not brawn comparable corpora.  ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 664-671 [PDF, 143KB]

(2006) Iñaki Alegria, Nerea Ezeiza, & Izaskun Fernandez: Named entities translation based on comparable corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp.1-8 [PDF, 455KB]

(2006) Viktor Pekar, Ruslan Mitkov, Dimitar Blagoev, & Andrea Mulloni: Finding translations for low-frequency words in comparable corpora [abstract]. Machine Translation 20 (4), 2006; pp.247-266.

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using comparable corpora to solve problems difficult for human translators. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.739-746. [PDF, 250KB]

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using collocations from comparable corpora to find translation equivalents.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.465-470 [PDF, 1104KB]

(2006) Serge Sharoff: Translation as problem-solving: uses of comparable corpora.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.23-28. [PDF, 914KB]

 

Concordances

(2009) Stéphane Huet, Julien Bourdaillet, & Philippe Langlais: TS3: an improved version of the bilingual concordancer TransSearch. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.20-27 [PDF, 678KB]

(2009) N.W.Rees & J.D.Riding: Automatic concordance creation for texts in any language. Translating and the Computer 31, 19-20 November 2009, London; 11pp. [PDF, 455KB]

(2008) Elliott Macklovitch, Guy Lapalme, & Fabrizio Gotti: TransSearch: what are translators looking for?  AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.412-419. [PDF, 757KB]

(2007) Gloria Corpas Pastor: Lost in specialised translation: the corpus as an inexpensive and under-exploited aid for language service providers. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 18pp. [PDF, 159KB]

Crowd sourcing

(2009) Chris Callison-Burch: Fast, cheap, and creative: evaluating translation quality using Amazon’s Mechanical Turk. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.286-295. [PDF, 289KB]

(2009) David Lubensky & Salim Roukos: IBM deployment of real time translation services. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 437-444. [PDF of PPT presentation, 622KB]

(2009) Saverio Perrino: User-generated translation: the future of translation in a Web 2.0 environment. Journal of Specialised Translation 12 (July 2009); pp.55-78. [PDF, 244KB]

(2009) Ruhi Sarikaya, Sameer Maskey, Rong Zhang & Ea-Ee Jan: Iterative sentence-pair extraction from quasi-parallel corpora for machine translation. Interspeech 2009: 10th Annual Conference of the International Speech Communication Association, 6-10 September 2009, Brighton, UK; abstract [PDF]

(2009) Masao Utiyama, Takeshi Abekawa, Eiichiro Sumita, & Kyo Kageura: Hosting volunteer translators. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.348-355. [PDF, 827KB]

Data elicitation

(2008) Jonathan H. Clark, Robert Frederking, & Lori Levin: Toward active learning in data selection: automatic discovery of language features during elicitation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 141KB]

(2008) Christian Monson, Ariadna Font Llitjós, Vamshi Ambati, Lori Levin, Alon Lavie, Alison Alvarez, Roberto Aranovich, Jaime Carbonell, Robert Frederking, Erik Peterson, & Katharina Probst: Linguistic structure and bilingual informants help induce machine translation of lesser-resourced languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 141KB]

(2007) Alison Alvarez, Lori Levin, Robert Frederking, & Jill Lehman: An assessment of language elicitation without the supervision of a linguist. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.1-10 [PDF, 415KB]; presentation [PDF, 412KB]

(2005) Alison Alvarez, Lori Levin, Robert Frederking, Erik Peterson & Jeff Good: Semi-automated elicitation corpus generation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.388-395. [PDF, 131KB]

Domain identification

(2008) Jorge Civera & Alfons Juan-Ciscar: Bilingual text classification using the IBM 1 translation model.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 77KB]

(2007) Hirofumi Yamamoto & Eiichiro Sumita: Bilingual cluster based models for statistical machine translation. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 514-523. [PDF, 166KB]

(2005) Alfio Gliozzo & Carlo Strapparava: Cross language text categorization by acquiring multilingual domain models from comparable corpora. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 9-16. [PDF, 259KB]

Domain restriction, adaptation and specification

(2009) Nguyen Bach, Roger Hsiao, Matthias Eck, Paisarn Charoenpornsawat, Stephan Vogel, Tanja Schultz, Ian Lane, Alex Waibel, & Alan W.Black: Incremental adaptation of speech-to-speech translation. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009;  pp.149-152. [PDF, 111KB]

(2009) Nicola Bertoldi, Arianna Bisazza, Mauro Cettolo, Germán Sanchis-Trilles, & Marcello Federico: FBK @ IWSLT 2009. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; pp. 37-44. [PDF, 346KB]; poster [PDF, 411KB]

(2009) Nicola Bertoldi & Marcello Federico: Domain adaptation for statistical machine translation with monolingual resources. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.182-189. [PDF, 146KB]

(2009) E.Boldrini, S.Ferrández, R.Izquierdo, D.Tomás, & J.L.Vicedo: A parallel corpus labeled using open and restricted domain ontologies [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 65KB]

(2009) Loic Dugast, Jean Senellart & Philipp Koehn: Selective addition of corpus-extracted phrasal lexical rules to a rule-based machine translation system. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.222-229. [PDF, 183KB]

(2009) Lorraine Goeuriot, Emmanuel Morin & Béatrice Daille: Compilation of specialized comparable corpora in French and Japanese. [ACL-IJCNLP-2009] Proceedings of the 2nd Workshop on Building and Using Comparable Corpora, Suntec, Singapore, 6 August 2009; pp.55-63. [PDF, 186KB]

(2009) Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Xian Wu & Zhong Su: Domain adaptation with latent semantic association for named entity recognition. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.281-289. [PDF, 203KB]

(2009) Yanjun Ma & Andy Way: Bilingually motivated domain-adapted word segmentation for statistical machine translation.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.549-557. [PDF, 325KB]

(2009) Preslav Nakov & Hwee Tou Ng:  NUS at WMT09: domain adaptation experiments for English-Spanish machine translation of news commentary text.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.75-79. [PDF, 76KB]

(2009) Michael Paul, Andrew Finch & Eiichiro Sumita: NICT@WMT09: model adaptation and transliteration for Spanish-English SMT.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.105-109. [PDF, 95KB]

(2009) Zhixiang Ren, Yajuan Lü, Jie Cao, Qun Liu, & Yun Huang: Improving statistical machine translation using domain bilingual multiword expressions. [ACL-IJCNLP-2009] Proceedings of the 2009 Workshop on Multiword Expressions, ACL-IJCNLP 2009, Suntec, Singapore, 6 August 2009; pp.47-54. [PDF, 170KB]

(2009) Holger Schwenk & Jean Senellart: Translation model adaptation for an Arabic/French news translation system by lightly-supervised training. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.308-315. [PDF, 134KB]

(2009) Chris Wendt & Will Lewis: Pushing the quality of a customized SMT system using shared training data. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 450-456. [PDF of PPT presentation, 250KB]

(2008) Iñaki Alegria, Arantza Casillas, Arantza Diaz de Ilarraza, Jon Igartua, Gorka Labaka, Mikel Lersundi, Aingeru Mayor, & Kepa Sarasola: Spanish-to-Basque multiengine machine translation for a restricted domain. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.37-45. [PDF, 575KB]

(2008) A. Diaz de Ilarraza, G. Labaka, & K. Sarasola: Statistical post-editing: a valuable method in domain adaptation of RBMT systems for less-resourced languages. MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; pp. 35-40. [PDF, 364KB]

(2008) Emil Ettelaie, Panayiotis G.Georgiou, & Shrikanth S.Narayanan: Mitigation of data sparsity in classifier-based translation. Coling 2008: Proceedings of the Workshop on Speech Processing for Safety Critical Translation and Pervasive Applications, 23 August 2008, Manchester, UK; pp.1-4. [PDF, 138KB]

(2008) Andrew Finch & Eiichiro Sumita: Dynamic model interpolation for statistical machine translation. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.208-215. [PDF, 428KB]

(2008) Masaki Itagaki & Takako Aikawa: Post-MT term swapper: supplementing a statistical machine translation system with a user dictionary.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 352KB]

(2008) Takeshi Ito, Tomoyosi Akiba, & Katunobu Itou: Effect of the topic dependent translation models for patent translation – experiment at NTCIR-7. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 425-429. [PDF, 365KB]

(2008) Gareth J.F.Jones, Fabio Fantino, Eamonn Newman, & Ying Zhang: Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.34-41. [PDF, 401KB]

(2008) C.-L. Kao, S.Saleem, R.Prasad, F.Choi, P.Natarajan, D.Stallard, K.Krstovski, & M.Kamali: Rapid development of an English/Farsi speech-to-speech translation system. IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA; pp.166-173 [PDF, 143KB]; presentation [PDF, 95KB]

(2008) Mamoru Komachi, Masaaki Nagata, & Yuji Matsumoto: NAIST-NTT system description for patent translation task at NTCIR-7.  Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 435-440. [PDF, 669KB]

(2008) Preslav Nakov: Improving English-Spanish statistical machine translation: experiments in domain adaptation, sentence paraphrasing, tokenization, and recasing. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.147-150. [PDF, 110KB]

(2008) Udhyakumar Nallasamy, Alan W. Black, Tanja Schultz, & Robert Frederking: NineOneOne: recognizing and classifying speech for handling minority language emergency calls.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 95KB]

(2008) Lene Offersgaard, Claus Povlsen, Lisbeth Almsten, & Bente Maegaard: Domain specific MT in use. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.150-159. [PDF, 554KB]

(2008) Petya Osenova, Kiril Simov, & Eelco Mossel: Language resources for semantic document annotation and crosslingual retrieval.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 465KB]

(2008) Tadaaki Oshio, Tomoharu Mitsuhashi, & Tsuyoshi Kakita: Use of the technical field-oriented user dictionaries. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 462-465. [PDF, 382KB]

(2008) Mark Seligman & Mike Dillinger: Rapid portability among domains in an interactive spoken language translation system. Coling 2008: Proceedings of the Workshop on Speech Processing for Safety Critical Translation and Pervasive Applications, 23 August 2008, Manchester, UK; pp.40-47. [PDF, 129KB]

(2008) Marianne Starlander, Pierrette Bouillon, Glenn Flores, Manny Rayner, & Nikos Tsourakis: Comparing two different bidirectional versions of the limited-domain medical spoken language translator MedSLT. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.176-181. [PDF, 521KB]

(2008) Hua Wu, Haifeng Wang, & Chengqing Zong: Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.993-1000. [PDF, 182KB]

(2008) Jian-Cheng Wu, Peter Wei-Huai Hsu, Chiung-Hui Tseng, & Jason S. Chang: Mining the web for domain-specific translations. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.212-221. [PDF, 750B]

(2008) Mikio Yamamoto, Jyunya Norimatsu, Mitsuru Koshikawa, Takahiro Fukutomi, Taku Nishio, Kugatsu Sadamitsu, & Takehito Utsuro: Diversion of hierarchical phrases as reordering templates. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 466-470. [PDF, 917KB]

(2008) Keiji Yasuda, Andrew Finch, Hideo Okuma, Masao Utiyama, Hirofumi Yamamoto, & Eiichiro Sumita: System description of NiCT-ATR SMT for NTCIR-7.  Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 415-419. [PDF, 430KB]

(2008) Andreas Zollmann, Ashish Venugopal, & Stephan Vogel: The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.  IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA; pp. 18-25. [PDF, 208KB]; presentation [PDF, 109KB]

(2007) Jorge Civera & Alfons Juan: Domain adaptation in statistical machine translation with mixture modelling.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 177-180 [PDF, 94KB]

(2007) George Foster & Roland Kuhn: Mixture-model adaptation for SMT.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 128-135 [PDF, 380KB]

(2007) Pierre Isabelle, Cyril Goutte, & Michel Simard: Domain adaptation of MT systems through automatic post-editing.  MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.255-261 [PDF, 151KB]

(2007) Philipp Koehn & Josh Schroeder: Experiments in domain adaptation for statistical machine translation.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 224-227 [PDF, 131KB]

(2007) Arul Menezes & Chris Quirk: Using dependency order templates to improve generality in translation.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 1-8 [PDF, 153KB]

(2007) Eelco Mossel: Cross-lingual ontology-based document retrieval.  RANLP-2007: Workshop on Natural Language Processing and Knowledge Representation for eLearning Environments, September 26th, 2007, Borovets, Bulgaria.  8pp. [PDF, 340KB]

(2007) Preslav Nakov & Marti Hearst: UCB system description for the WMT 2007 shared task. ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 212-215 [PDF, 129KB]

(2007) Sharon O’Brien & Johann Roturier: How portable are controlled language rules? A comparison of two empirical MT studies. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.345-352 [PDF, 77KB]

(2007) Akitoshi Okumura: Human communication technology – development of speech translation for hand-held devices. Invited talk at MT Summit XI, 10-14 September 2007, Copenhagen, Denmark; 16pp. [PDF of PPT presentation, 518KB]

(2007) Kristin Precoda, Jing Zheng, Dimitra Vergyri, Horacio Franco, Colleen Richey, Andreas Kathol, & Sachin Kajarekar: IraqComm: a next generation translation system. Interspeech 2007: 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; pp.2841-2844; abstract [PDF, 23KB]

(2007) Anna Sågvall Hein: Rule-based and statistical machine translation with a focus on Swedish [abstract]. Invited talk at TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; 1p. [PDF, 15KB]

(2007) Nicola Ueffing, Gholamreza Haffari, & Anoop Sarkar: Semi-supervised model adaptation for statistical machine translation [abstract]. Machine Translation 21 (2), June 2007; pp.77-94.

(2007) Hirofumi Yamamoto & Eiichiro Sumita: Bilingual cluster based models for statistical machine translation. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 514-523. [PDF, 166KB]

(2007) Jia Xu, Yonggang Deng, Yuqing Gao, & Hermann Ney: Domain dependent statistical machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.515-520 [PDF, 289KB]

(2006) proceedings of HLT-NAACL 2006 workshop: Medical Speech Translation, 9 June 2006, New York, NY, USA.  72pp. [PDF, 1403KB]; table of contents

(2006) Marco Baroni, Adam Kilgarriff, Jan Pomikálek, & Pavel Rychlý: WebBootCaT: instant domain-specific corpora to support human translators. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.247-252 [PDF, 191KB]

(2006) Jesús Giménez & Lluís Màrquez: Low-cost enrichment of Spanish WordNet with automatically translated glosses: combining general and specialized models. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.287-294. [PDF, 108KB]

(2006) Young-Suk Lee: IBM Arabic-to-English translation for IWSLT 2006. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; pp.45-52  [PDF, 241KB]

(2006) Paul McNamee & James Mayfield: Translation of multiword expressions using parallel suffix arrays.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.100-109 [PDF, 733KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.945-952. [PDF, 159KB]

(2006) Holger Schwenk, Marta R. Costa-jussà & José A. R. Fonollosa: Continuous space language models for the IWSLT 2006 task. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; pp. 166-173 [PDF, 165KB]

(2006) Stephanie Seneff, Chao Wang & John Lee: Combining linguistic and statistical methods for bi-directional English Chinese translation in the flight domain. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.213-222 [PDF, 152KB]

(2006) David Stallard, Fred Choi, Kriste Krstovski, Prem Natarajan, Rohit Prasad, & Shirin Saleem: A hybrid phrase-based/statistical speech translation system. Interspeech 2006: ICSLP Ninth International Conference on  Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006, paper 1732; abstract [PDF, 74KB]

(2006) Chao Wang & Stephanie Seneff: High-quality speech translation in the flight domain. Interspeech 2006: ICSLP Ninth International Conference on  Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006, paper 1135; abstract [PDF, 72KB]

(2005) Naoki Asanoma, Setsuo Yamada, Osamu Furuse, & Masahiro Oku: Building a conversation corpus by text derivation from "germ dialogs". 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 27-32. [PDF, 82KB]

(2005) Aw AiTi, Zhang Min, Yeo PohKhim, Fan ZhenZhen & Su Jian: Input normalization for an English-to-Chinese SMS translation system. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.445-450. [PDF, 324KB]

(2005) Pierrette Bouillon, Manny Rayner, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Yukie Nakao, Kyoko Kanzaki, & Hitoshi Isahara: A generic multi-lingual open source platform for limited-domain medical speech translation. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 50-58. [PDF, 182KB]

(2005) Christian Champendal & Thierry Pitarque: Lexical sets and text-processing. MT Summit X, Phuket, Thailand, September 12, 2005, Proceedings of Workshop on Semantic Web Technologies for Machine Translation; pp.10-12. [PDF, 250KB]

(2005) Etienne Denoual: The influence of example-data: homogeneity on EBMT quality. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.35-42. [PDF, 404KB]

(2005) Hiroyuki Kaji: Domain dependence of lexical translation: a case study of patent abstracts. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.43-49. [PDF, 215KB]

(2005) Svetlana Sheremetyeva: "Less, easier and quicker" in language acquisition for patent MT. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.35-42. [PDF, 269KB]

(2005) Masatsugu Tonoike, Mitsuhiro Kida, Toshihiro Takagi, Yasuhiro Sasaki, Takehito Utsuro, & Satoshi Sato: Effect of domain-specific corpus in compositional translation estimation for technical terms. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.114-119. [PDF, 193KB]

(2005) Wu Hua, Wang Haifeng, & Liu Zhanyi: Alignment model adaptation for domain-specific word alignment.  ACL-2005: 43rd Annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, 25-30 June 2005; pp. 467-474. [PDF, 291KB]

Knowledge representation see Ontologies

Language resources (see also Bilingual corpora, Lexical resources, Multilingual corpora)

(2009) Anna Borovikov, Eugene Borovikov, Bradley Colquitt, & Kristen Summers: The EDEAL project for automated processing of African languages.  MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 546-549. [PDF, 73KB]

(2009) Budiono, Hammam Riza, & Charil Hakim: Resource report: building parallel text corpora for multi-domain translation system. ACL-IJCNLP-2009: 7th Workshop on Asian Language Resources (ALR-7), Proceedings of the workshop, 6-7 August 2009, Suntec, Singapore; pp. 92-95. [PDF, 243KB]

(2009) Khalid Choukri: MEDAR – Mediterranean Arabic language and speech technology: an intermediate report on the MEDAR survey of actors, projects, products. MEDAR 2009: 2nd International Conference on Arabic Language Resources & Tools, 22-23 April 2009, Cairo, Egypt; pp.186-192. [PDF, 445KB]

(2009) Bente Maegaard, M.Attia, K.Choukri, S.Krauwer, C.Mokbel, & M.Yaseen: MEDAR: Arabic language technology, state-of-the-art and a cooperation roadmap. MEDAR 2009: 2nd International Conference on Arabic Language Resources & Tools, 22-23 April 2009, Cairo, Egypt; pp.168-174. [PDF, 374KB]

(2008) I.Alegria, X.Arregi, A.Diaz de Ilarraza, G.Labaka, M.Lersundi, A.Mayor, & K.Sarasola: Strategies for sustainable MT for Basque: incremental design, reusability, standardization and open source. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.59-64. [PDF, 74KB]

(2008) Torbjørg Breivik: Establishing the Norwegian HLT resource collection. SLTC 2008: Second Swedish Language Technology Conference, November 20-21, 2008, Stockholm; pp.29-30. [PDF, 319KB]

(2008) Jonathan H. Clark, Robert Frederking, & Lori Levin: Toward active learning in data selection: automatic discovery of language features during elicitation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 141KB]

(2008) Jennifer DeCamp: Language Technology Resource Center. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; 7pp. [PDF, 678KB]

(2008) Bente Maegaard, M.Atiyya, K.Choukri, S.Krauwer, C.Mokbel, & M.Yaseen: MEDAR – collaboration between European and Mediterranean Arabic partners to support the development of language technology for Arabic. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; pp. 3609-3614. [PDF, 235KB]

(2008) Christian Monson, Ariadna Font Llitjós, Vamshi Ambati, Lori Levin, Alon Lavie, Alison Alvarez, Roberto Aranovich, Jaime Carbonell, Robert Frederking, Erik Peterson, & Katharina Probst: Linguistic structure and bilingual informants help induce machine translation of lesser-resourced languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 141KB]

(2008) Alexander E. Richman & Patrick Schone: Mining Wiki resources for multilingual named entity recognition. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 1-9. [PDF, 134KB]

(2008) Anju Saxena, Pranava Swaroop Madhyasta, & Joakim Nivre: Building the Uppsala Hindi corpus. SLTC 2008: Second Swedish Language Technology Conference, November 20-21, 2008, Stockholm; pp.11-12. [PDF, 339KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Suphanut Thayaboon, Chumpol Mokarat, & Hitoshi Isahara: Enhanced tools for online collaborative language resource development. IJCNLP 2008: Sixth Workshop on Asian Language Resources, Proceedings of the workshop, 11-12 January 2008, Hyderabad, India; pp.105-106. [PDF, 310KB]

(2007) Christopher Cieri, Stephanie Strassel, Meghan Lammie Glenn, & Lauren Friedman: Linguistic resources in support of various evaluation metrics. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 34pp. [PDF of PPT presentation, 1007KB]

(2007) Alain Désilets: Translation Wikified: how will massive online collaboration impact the world of translation? Keynote speech at Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 15pp. [PDF, 339KB]

(2006) Proceedings of 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, LREC-2006: Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, 23 May 2006. [PDF, 5391KB]

(2006) Ahmed Abdelali, James Cowie, Steve Helmreich, Wanying Jin, Maria Pilar Milagros, Bill Ogden, Hamid Mansouri Rad & Ron Zacharski: Guarani: a case study in resource development for quick ramp-up MT. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.1-9 [PDF, 344KB]

(2006) A.Bonafonte, H.Höge, I.Kiss, A.Moreno, U.Ziegenhain, H.van den Heuvel, H.-U.Hain, X.S.Wang, M.N.Garcia: TC-STAR: specifications of language resources and evaluation for speech synthesis. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.311-314 [PDF, 272KB]

(2006) Gil Francopoulo, Nuria Bel, Monte George, Nicoletta Calzolari, Monica Monachini, Mandy Pet, & Claudia Soria: Lexical markup framework (LMF) for NLP multilingual resources. Coling-ACL 2006: Proceedings of the Workshop on Multilingual Language Resources and Interoperability, Sydney, July 2006; pp.1-8. [PDF, 69KB]

(2006) Diana Inkpen, Muath Alzghool, Gareth J.F.Jones & Douglas W.Oard: Investigating cross-language speech retrieval for a spontaneous conversational speech collection.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 61-64 [PDF, 43KB]

(2006) Xiaoyi Ma & Christopher Cieri: Corpus support for machine translation at LDC.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.859-864 [PDF, 507KB]

(2006) Stephanie Strassel, Christopher Cieri, Andrew Cole, Denise DiPersio, Mark Lieberman, Xiaoyi Ma, Mohamed Maamouri, & Kazuaki Maeda: Integrated linguistic resources for language exploitation technologies.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.185-190 [PDF, 465KB]

(2006) Henk van den Heuvel, Khalid Choukri, Christian Gollan, Asuncion Moreno, & Djamal Mostefa: TC-STAR: new language resources for ASR and SLT purposes. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2570-2573 [PDF, 271KB]

Lexical resources and lexical acquisition (see also Terminology and MT)

(2009) Ming-Hong Bai, Jia-Ming You, Keh-Jiann Chen, & Jason S.Chang: Acquiring translation equivalences of multiword expressions by normalized correlation frequencies. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.478-486. [PDF, 371KB]

(2009) David Bamman & Gregory Crane: Automatically building bilingual dictionaries for Greek and Latin.  EAMT-2009 Workshop Machine Translation for Historical Languages, 13 May 2009, Barcelona; slides [PDF of PPT, 211KB]

(2009) Dmitry Davidov & Ari Rappoport: Enhancements of lexical concepts using cross-lingual web mining. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.852-861. [PDF, 162KB]

(2009) Beate Dorow, Florian Laws, Lukas Michelbacher, Christian Scheible, & Jason Utt: A graph-theoretic algorithm for automatic extension of translation lexicons. Proceedings of the EACL 2009 Workshop on GEMS: Geometrical Models of Natural Language Semantics, Athens, Greece, 31 March 2009; pp.91-95. [PDF, 81KB]

(2009) Loic Dugast, Jean Senellart & Philipp Koehn: Selective addition of corpus-extracted phrasal lexical rules to a rule-based machine translation system. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.222-229. [PDF, 183KB]

(2009) Loic Dugast, Jean Senellart, & Philipp Koehn: Statistical post editing and dictionary extraction: Systran/Edinburgh submissions for ACL-WMT2009.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.110-114. [PDF, 245KB]

(2009) Miguel García, Jesús Giménez & Lluís Màrquez: Enriching statistical translation models using domain-independent multilingual lexical knowledge base [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 19KB]

(2009) Nikesh Garera, Chris Callison-Burch & David Yarowsky: Improving translation lexicon induction from monolingual corpora via dependency contexts and part-of-speech equivalences. CoNLL-2009. Proceedings of the Thirteenth Conference on Computational Natural Language Learning, June 4-5, 2009, Boulder, Colorado; pp.129-137. [PDF, 450KB]

(2009) Rejwanul Haque, Sudip Kumar Naskar, Yanjun Ma & Andy Way: Using supertags as source language context in SMT. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.234-241. [PDF, 247KB]

(2009) Saša Hasan & Hermann Ney: Comparison of extended lexicon models in search and rescoring for SMT. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009; pp.17-20. [PDF, 167KB]

(2009) Els Lefever, Lieve Macken & Veronique Hoste: Language-independent bilingual terminology extraction from a multilingual parallel corpus.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.496-504. [PDF, 137KB]

(2009) Mausam, Stephen Soderland, Oren Etzioni, Daniel S.Weld, Michael Skinner & Jeff Bilmes: Compiling a massive, multilingual dictionary via probabilistic inference. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.262-270. [PDF, 704KB]

(2009) Davide Picca, Alfio Massimiliano Gliozzo, & Simone Campora: Bridging languages by supersense entity tagging. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.136-142. [PDF, 159KB]

(2009) Emmanuel Prochasson, Emmanuel Morin & Kyo Kageura: Anchor points for bilingual lexicon extraction from small comparable corpora. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.284-291. [PDF, 138KB]

(2009) Ibrahim M.Saleh & Nizar Habash: Automatic extraction of lemma-based bilingual dictionaries for morphologically rich languages. CAASL-3 – Third Workshop on Computational Approaches to Arabic Script-based Languages [at] MT Summit XII, August 26, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 622KB]

(2009) Yasser Salem & Brian Nolan: Designing an XML lexicon architecture for Arabic machine translation based on role and reference grammar.  MEDAR 2009: 2nd International Conference on Arabic Language Resources & Tools, 22-23 April 2009, Cairo, Egypt; pp.221-229. [PDF, 691KB]

(2009) Ruhi Sarikaya, Sameer Maskey, Rong Zhang & Ea-Ee Jan: Iterative sentence-pair extraction from quasi-parallel corpora for machine translation. Interspeech 2009: 10th Annual Conference of the International Speech Communication Association, 6-10 September 2009, Brighton, UK; abstract [PDF]

(2009) Svetlana Sheremetyeva: On extracting multiword NP terminology for MT. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.205-212. [PDF, 215KB]

(2009) Masao Utiyama, Daisuke Kawahara, Keiji Yasuda & Eiichiro Sumita: Mining parallel texts from mixed-language web pages. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.152-159. [PDF, 136KB]

(2009) Varga István & Yokoyama Shoichi: Bilingual dictionary generation for low-resourced language pairs. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.862-870. [PDF, 199KB]

(2009) Eric Wehrli, Luka Nerima & Yves Scherrer: Deep linguistic multilingual translation and bilingual dictionaries.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.90-94. [PDF, 71KB]

(2009) Kun Yu & Junichi Tsujii: Bilingual dictionary extraction from Wikipedia. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 379-386. [PDF, 984KB]

(2009) Kun Yu & Junichi Tsujii: Extracting bilingual dictionary from comparable corpora with dependency heterogeneity. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Short Papers, Boulder, Colorado, May 31 - June 5, 2009; pp.121-124. [PDF, 132KB]

(2009) Qibo Zhu, Diana Inkpen & Ash Asudeh: Inducing translations from officially published materials in Canadian government websites. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 176-183. [PDF, 167KB]

(2008) Eneko Agirre & Aitor Soroa: Using the multilingual central repository for graph-based word sense disambiguation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 114KB]

(2008) Vamshi Ambati & Alon Lavie: Improving syntax driven translation models by re-structuring divergent and non-isomorphic parse tree structures. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.235-244 [PDF, 734KB]

(2008) Todor Arnaudov & Ruslan Mitkov: Smarty – extendable framework for bilingual and multilingual comprehension assistants. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 256KB]

(2008) A.S.Andreyeva: Lexical-functional correspondences and their use in the system of machine translation ETAP-3. Coling 2008: Proceedings of the Workshop on Cognitive Aspects of the Lexicon, 24 August 2008, Manchester, UK; pp.64-72. [PDF,  164KB]

(2008) Toni Badia, Maite Melero, & Oriol Valentin: Rapid deployment of a new METIS language pair: Catalan-English. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 122KB]

(2008) Piotr Bański & Radosław Moszczyński: Enhancing an English-Polish electronic dictionary for multiword expression research. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 56KB]

(2008) Francis Bond, Seiji Okura, Yuji Yamamoto, Toshiki Murata, Kiyotaka Uchimoto, Michael Kato, Miwako Shimazu, & Tsugiyoshi Suzuki: Sharing user dictionaries across multiple systems with UTX-S (AAMT Sharing/Standardization Working Group).  AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp. 304-313. [PDF of PPT presentation, 180KB]

(2008) Pierrette Bouillon, Sonia Halimi, Yukie Nakao, Kyoko Kanzaki, Hitoshi Isahara, Nikos Tsourakis, Marianne Starlander, Beth Ann Hockey, & Manny Rayner: Developing non-European translation pairs in a medium-vocabulary medical speech translation system. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 122KB]

(2008) Marine Carpuat & Dekai Wu: Evaluation of context-dependent phrasal translation lexicons for statistical machine translation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 122KB]

(2008) Bruno Cartoni: Lexical resources for automatic translation of constructed neologisms: the case study of relational adjectives. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 57KB]

(2008) Helena M. Caseli, Maria das Graças V.Nunes, & Mikel L. Forcada: From free shallow monolingual resources to machine translation systems: easing the task. MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; pp. 41-48. [PDF, 454KB]

(2008) Debasri Chakrabarti, Hemang Mandalia, Ritwik Priya, Vaijayanthi Sarma, & Pushpak Bhattacharyya: Hindi compound verbs and their automatic extraction. Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.27-30. [PDF, 62KB]

(2008) Jonathan H. Clark, Robert Frederking, & Lori Levin: Inductive detection of language features via clustering minimal pairs: towards feature-rich grammars in machine translation.  Second ACL Workshop on Syntax and Structure in Statistical Translation (ACL-08 SSST-2), Proceedings, 20 June 2008, Columbus, Ohio, USA; pp.78-86. [PDF, 246KB]

(2008) Daiga Deksne, Raivis Skadiņš, & Inguna Skadiņa: Dictionary of multiword expressions for translation into highly inflected languages.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 262KB]

(2008) Matthias Eck, Stephan Vogel, & Alex Waibel: Communicating unknown words in machine translation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 303KB]

(2008) Rauf Fatullayev, Ali Abbasov, & Abulfat Fatullayev: Peculiarities of the development of the dictionary for the MT system from Azerbaijani. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.35-40. [PDF, 587KB]

(2008) Nikesh Garera & David Yarowsky: Minimally supervised multilingual taxonomy and translation lexicon induction. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.465-472. [PDF, 356KB]

(2008) Nikesh Garera & David Yarowsky: Translating compounds by learning component gloss translation models via multiple languages. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.403-410. [PDF, 497KB]

(2008) Frederic Gey, David Kirk Evans, & Noriko Kando: A Japanese-English technical lexicon for translation and language research. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 198KB]

(2008) Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, & Dan Klein: Learning bilingual lexicons from monolingual corpora. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 771-779. [PDF, 257KB]

(2008) Reginald Hobbs, Jamal Laoudi, & Clare R.Voss: MTriage: web-enabled software for the creation, machine translation, and annotation of smart documents.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 433KB]

(2008) Masaki Itagaki & Takako Aikawa: Post-MT term swapper: supplementing a statistical machine translation system with a user dictionary.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 352KB]

(2008) Gareth J.F.Jones, Fabio Fantino, Eamonn Newman, & Ying Zhang: Domain-specific query translation for multilingual information access using machine translation augmented with dictionaries mined from Wikipedia. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.34-41. [PDF, 401KB]

(2008) Mamoru Komachi, Masaaki Nagata, & Yuji Matsumoto: NAIST-NTT system description for patent translation task at NTCIR-7.  Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 435-440. [PDF, 669KB]

(2008) Gerhard Kremer, Andrea Abel, & Marco Baroni: Cognitively salient relations for multilingual lexicography. Coling 2008: Proceedings of the Workshop on Cognitive Aspects of the Lexicon, 24 August 2008, Manchester, UK; pp.94-101. [PDF, 84KB]

(2008) Qing Ma, Nakao Koichi, Masaki Murata, & Hitoshi Isahara: Selection of Japanese-English equivalents by integrating high-quality corpora and huge amounts of web data.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 79KB]

(2008) Denis Maurel: Prolexbase: a multilingual relational lexical database of proper names. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 145KB]

(2008) Aurélien Max & Michael Zock: Looking up phrase rephrasings via a pivot language. Coling 2008: Proceedings of the Workshop on Cognitive Aspects of the Lexicon, 24 August 2008, Manchester, UK; pp.77-84. [PDF, 257KB]

(2008) Rajat Kumar Mohanty & Pushpak Bhattacharyya: Lexical resources for semantic extraction.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 148KB]

(2008) Yohei Morishita, Takehito Utsuro, & Mikio Yamamoto: Integrating a phrase-based SMT model and a bilingual lexicon for human in semi-automatic acquisition of technical term translation lexicon. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.153-162. [PDF, 988KB]

(2008) Rajat Kumar Mohanty, Pushpak Bhattacharyya, Sraddha Kalele, Prabhakar Pandey, Adita Sharma, & Mitesh Kopra: Synset based multilingual dictionary: insights, applications and challenges. GWC-2008: the Fourth Global WordNet conference, Szeged, Hungary, January 22-25, 2008; pp. 321-332. [PDF, 1245KB]

(2008) Rogelio Nazar, Leo Wanner, & Jorge Vivaldi: Two-step flow in bilingual lexicon extraction from unrelated corpora. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.140-149. [PDF, 537KB]

(2008) Luka Nerima & Eric Wehrli: Generating bilingual dictionaries by transitivity. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 205KB]

(2008) Tadaaki Oshio, Tomoharu Mitsuhashi, & Tsuyoshi Kakita: Use of the technical field-oriented user dictionaries. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 462-465. [PDF, 382KB]

(2008) Marianne Santaholma & Nikos Chatzichrisafis: A knowledge-modeling approach for multilingual Regulus lexica. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 156KB]

(2008) Holger Schwenk, Jean-Baptiste Fouet, & Jean Senellart: First steps towards a general purpose French/English statistical machine translation system. ACL-08: HLT. Third Workshop on Statistical Machine Translation, Proceedings, June 19, 2008, The Ohio State University, Columbus, Ohio, USA (ACL WMT-08); pp.119-122. [PDF, 77KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Chumpol Mokarat, Hammam Riza, Hitoshi Isahara, & Purev Jaimal: Synset assignment for bi-lingual dictionary with limited resource. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.673-678. [PDF, 549KB]

(2008) Takashi Tsunakawa & Jun’ichi Tsujii: Bilingual synonym identification with spelling variations. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.457-464. [PDF, 691KB]

(2008) Takashi Tsunakawa, Naoaki Okazaki, & Jun’ichi Tsujii: Building a bilingual lexicon using phrase-based statistical machine translation via a pivot language. Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.127-130. [PDF, 115KB]

(2008) Takashi Tsunakawa, Naoaki Okazaki, & Jun’ichi Tsujii: Building bilingual lexicons using lexical translation probabilities via pivot languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 66KB]

(2008) Hua Wu, Haifeng Wang, & Chengqing Zong: Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.993-1000. [PDF, 182KB]

(2008) Jian-Cheng Wu, Peter Wei-Huai Hsu, Chiung-Hui Tseng, & Jason S. Chang: Mining the web for domain-specific translations. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.212-221 [PDF, 750B]

(2008) Xianchao Wu, Naoaki Okazaki, Takashi Tsunakawa, & Jun’ichi Tsujii: Improving English-to-Chinese translation for technical terms using morphological information. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.202-211 [PDF, 799KB]

(2007) Takeshi Abekawa & Kyo Kageura: A translation aid system with a stratified lookup interface. ACL 2007: proceedings of demo and poster sessions, Prague, Czech Republic, June 2007; pp. 5-8 [PDF, 334KB]

(2007) Julia Aymerich & Hermes Camelo: Automatic extraction of entries for a machine translation dictionary using bitexts. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.21-27 [PDF, 88KB]

(2007) Guihong Cao, Jianfeng Gao, & Jian-Yun Nie: A system to mine large-scale bilingual dictionaries from monolingual web pages. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.57-64 [PDF, 539KB]

(2007) Marine Carpuat & Dekai Wu: Context-dependent phrasal translation lexicons for statistical machine translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.73-80 [PDF, 425KB]

(2007) Sanae Fujita & Francis Bond: A method of creating new valency entries [abstract]. Machine Translation 21 (1), March 2007; pp.1-28.

(2007) Pablo Gamallo Otero: Learning bilingual lexicons from comparable English and Spanish corpora. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.191-197 [PDF, 509KB]

(2007) Federico Gaspari & Harold Somers: Making a sow’s ear out of a silk purse: (mis)using online MT services as bilingual dictionaries. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 15pp. [PDF, 138KB]

(2007) Masaki Itagaki, Takako Aikawa, & Xiaodong He: Automatic validation of terminology translation consistency with statistical method.  MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.269-274 [PDF, 416KB]

(2007) Jae Dong Kim & Stephan Vogel: Iterative refinement of lexicon and phrasal alignment. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.281-288 [PDF, 121KB]

(2007) Caroline Lavecchia, Kamel Smaïli, & David Langlois: Building a bilingual dictionary from movie subtitles based on inter-lingual triggers. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 19pp. [PDF, 139KB]

(2007) Caroline Lavecchia, Kamel Smaïli, David Langlois, & Jean-Paul Haton: Using inter-lingual triggers for machine translation. Interspeech 2007: 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; pp.2829-2832; abstract [PDF, 60KB]

(2007) Chengye Lu, Yue Xu, & Shlomo Geva: Improving translation accuracy in web-based translation extraction. Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan; pp.31-35. [PDF, 94KB]

(2007) Ruslan Mitkov, Viktor Pekar, Dimitar Blagoev, & Andrea Mulloni: Methods for extracting and classifying pairs of cognates and false friends [abstract]. Machine Translation 21 (1), March 2007; pp.29-53.

(2007) Hideo Okuma, Hirofumi Yamamoto, & Eiichiro Sumita: Introducing translation dictionary into phrase-based SMT. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.361-368 [PDF, 446KB]

(2007) Chris Quirk, Raghavendra Udupa U., & Arul Menezes: Generative models of noisy translations with applications to parallel fragment extraction. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.377-384 [PDF, 249KB]

(2007) Marcus Sammer & Stephen Soderland: Building a sense-distinguished multilingual lexicon from monolingual corpora and bilingual lexicons. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.399-406 [PDF, 263KB]

(2007) Yves Scherrer: Adaptive string distance measures for bilingual dialect lexicon induction. ACL 2007: proceedings of the Student Research Workshop, Prague, Czech Republic, June 2007; pp. 55-60 [PDF, 122KB]

(2007) Svetlana Sheremetyeva: On portability of resources for a quick ramp up of multilingual MT of patent claims. MT Summit XI Workshop on patent translation, 11 September 2007, Copenhagen, Denmark; pp.28-33. [PDF, 66KB]

(2007) Stuart M.Shieber: Probabilistic synchronous tree-adjoining grammars for machine translation: the argument from bilingual dictionaries. SSST, NAACL-HLT-2007 AMTA Workshop on Syntax and Structure in Statistical Translation, 26 April 2007, Rochester, NY; pp.88-95 [PDF, 152KB]

(2007) Koichi Takeuchi, Takashi Kanehila, Kazuki Hilao, Takeshi Abekawa, & Kyo Kageura: Flexible automatic look-up of English idiom entries in dictionaries. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.451-458 [PDF, 489KB]

(2007) Yik-Cheung Tam, Ian Lane, & Tanja Schultz: Bilingual LSA-based adaptation for statistical machine translation [abstract].  Machine Translation 21 (4) December 2007; pp.187-207.

(2007) Masatoshi Tsuchiya, Ayu Purwarianti, Toshiyuki Wakita, & Seiichi Nakagawa: Expanding Indonesian-Japanese small translation dictionary using a pivot language. ACL 2007: proceedings of demo and poster sessions, Prague, Czech Republic, June 2007; pp. 197-200 [PDF, 567KB]

(2007) Varga István & Yokoyama Shoichi: Japanese-Hungarian dictionary generation using ontology resources. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.483-490 [PDF, 167KB]

(2007) Yik-Cheung Tam & Tanja Schultz: Bilingual LSA-based translation lexicon adaptation for spoken language translation. Interspeech 2007: 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; pp.2461-2464; abstract [PDF, 23KB]

(2007) Yujie Zhang, Qing Ma, & Hitoshi Isahara: Building Japanese-Chinese translation dictionary based on EDR Japanese-English bilingual dictionary. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.551-557 [PDF, 323KB]

(2006) Kisuh Ahn & Matthew Frampton: Automatic generation of translation dictionaries using intermediary languages. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.41-44 [PDF, 278KB]

(2006) Saba Amsalu: Data-driven Amharic-English bilingual lexicon acquisition. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.281-286 [PDF, 366KB]

(2006) Marco Baroni, Adam Kilgarriff, Jan Pomikálek, & Pavel Rychlý: WebBootCaT: instant domain-specific corpora to support human translators. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.247-252 [PDF, 191KB]

(2006) Michael Carl & Ecaterina Rascu: A dictionary lookup strategy for translating of discontinuous phrases. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.49-57 [PDF, 197KB]

(2006) Helena M.Caseli, Maria das Graças V.Nunes, & Mikel L.Forcada: Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation [abstract]. Machine Translation 20 (4), 2006; pp.227-245.

(2006) Conrad Chen & Hsin-Hsi Chen: A high-accurate Chinese-English NE backward translation system combining both lexical information and web statistics.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.81-88. [PDF, 313KB]

(2006) Thierry Declerck, Asunción Gómez Pérez, Ovidiu Vela, Zeno Gantner, & David Manzano-Macho: Multilingual lexical semantic resources for ontology translation. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1492-1495 [PDF, 248KB]

(2006) Sabri Elkateb, William Black, Piek Vossen, David Farwell, Adam Pease, & Christiane Fellbaum: Arabic WordNet and the challenges of Arabic. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.15-24. [PDF, 295KB]

(2006) Gaolin Fang, Hao Yu, & Fumihito Nishino: Chinese-English term translation mining based on semantic prediction.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.199-206. [PDF, 240KB]

(2006) Ken’ichi Fukushima, Kenjiro Taura, & Takashi Chikayama: A fast and accurate method for detecting English-Japanese parallel texts. Coling-ACL 2006: Proceedings of the Workshop on Multilingual Language Resources and Interoperability, Sydney, July 2006; pp.60-67. [PDF, 258KB]

(2006) Pascale Fung & Benfeng Chen: Robust word sense translation by EM learning of frame semantics. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.239-246. [PDF, 223KB]

(2006) Emmanuel Giguet & Pierre-Sylvain Luquet: Multilingual lexical database generation from parallel texts in 20 European languages with endogenous resources. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.271-278. [PDF, 427KB]

(2006) Jesús Giménez & Lluís Màrquez: Low-cost enrichment of Spanish WordNet with automatically translated glosses: combining general and specialized models. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.287-294. [PDF, 108KB]

(2006) Rebecca Hwa, Carol Nichols & Khalil Sima’an: Corpus variations for translation lexicon induction. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.74-81 [PDF, 178KB]

(2006) Badam-Osor Khaltar, Atsushi Fujii, & Tetsuya Ishikawa: Extracting loan words from Mongolian corpora and producing a Japanese-Mongolian bilingual dictionary. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.657-664. [PDF, 297KB]

(2006) Jin-Shea Kuo, Haizhou Li, & Ying-Kuei Yang: Learning transliteration lexicons from the Web. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.1129-1136. [PDF, 428KB]

(2006) André Le Meur & Marie-Jeanne Derouin: Integrated bilingual specialist dictionaries – the LexTerm initiative. Translating and the Computer 28: proceedings of the Twenty-eighth International Conference on Translating and the Computer, 16-17 November 2006, London. (London: Aslib, 2006); 8pp. [PDF, 77KB]; presentation by Marie-Jeanne Derouin: 32 slides [PDF, 579KB]

(2006) Paul McNamee & James Mayfield: Translation of multiword expressions using parallel suffix arrays.  AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.100-109 [PDF, 733KB]

(2006) Slim Mesfar: Standard Arabic formalization and linguistic platform for its analysis. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.84-94. [PDF]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.945-952. [PDF, 159KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging recurrent phrase structure in large-scale ontology translation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.141-150 [PDF, 686KB]

(2006) Viktor Pekar, Ruslan Mitkov, Dimitar Blagoev, & Andrea Mulloni: Finding translations for low-frequency words in comparable corpora [abstract]. Machine Translation 20 (4), 2006; pp.247-266.

(2006) Scott S.L.Piao, Guangfan Sun, Paul Rayson, & Qi Yuan: Automatic extraction of Chinese multiword expressions with a statistical tool. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp.17-24 [PDF, 396KB]

(2006) Brock Pytlik & David Yarowsky: Machine translation for languages lacking bitext via multilingual gloss transduction. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.156-165 [PDF, 267KB]

(2006) Reinhard Rapp & Carlos Martin Vide: Example-based machine translation using a dictionary of word pairs.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1268-1273 [PDF, 352KB]

(2006) Violeta Seretan & Eric Wehrli: Accurate collocation extraction using a multilingual parser. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.953-960. [PDF, 168KB]

(2006) Dennis Spohr & Ulrich Heid: Modeling monolingual and bilingual collocation dictionaries in description logics. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp.65-72 [PDF, 357KB]

(2006) David Talbot & Miles Osborne: Modelling lexical redundancy for machine translation.  Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.969-976. [PDF, 207KB]

(2006) Tamás Váradi: Multiword units in an MT lexicon. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp. 73-78 [PDF, 635KB]

(2006) Eiko Yamamoto, Kyoko Kanzaki, & Hitoshi Isahara: Detection of inconsistencies in concept classifications in a large dictionary: toward an improvement of the EDR electronic dictionary.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2325-2330 [PDF, 569KB]

(2005) Igor Boguslavsky: Some lexical issues of UNL. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.101-108 [abstract, PDF, 54KB]

(2005) Ondřej Bojar, Petr Homola, & Vladislav Kuboň: Problems of reusing an existing MT system. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.179-184. [PDF, 73KB]

(2005) Christian Champendal & Thierry Pitarque: Lexical sets and text-processing. MT Summit X, Phuket, Thailand, September 12, 2005, Proceedings of Workshop on Semantic Web Technologies for Machine Translation; pp.10-12. [PDF, 250KB]

(2005) Hiroshi Echizen-ya, Kenji Araki, & Yoshio Momouchi: Automatic acquisition of bilingual rules for extraction of bilingual word pairs from parallel corpora. ACL-SIGLEX-2005: Workshop on Deep Lexical Acquisition, University of Michigan, Ann Arbor, 30 June 2005; pp. 87-96.  [PDF, 726KB]

(2005) Pablo Gamallo Otero: Extraction of translation equivalents from parallel corpora using sense-sensitive contexts. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 97-102. [PDF, 52KB]

(2005) Gregory Grefenstette, Nasredine Semmar, & Faïza Elkateb-Gara: Modifying a natural language processing system for European languages to treat Arabic in information processing and information retrieval applications. ACL-2005: Workshop on Computational Approaches to Semitic Languages, University of Michigan, Ann Arbor, 29 June 2005; pp. 31-38. [PDF]

(2005) Luis Iraola: Using WordNet for linking UWs to the UNL UW system. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.370-379 [abstract, PDF, 59KB]

(2005) Hiroyuki Kaji: Domain dependence of lexical translation: a case study of patent abstracts. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.43-49. [PDF, 215KB]

(2005) Takeshi Kutsumi, Takehiko Yoshimi, Katsunori Kotani, Ichiko Sata, & Hitoshi Isahara: Selection of entries for a bilingual dictionary from aligned translation equivalents using support vector machines. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.11-16. [PDF, 253KB]

(2005) Marjorie McShane, Sergei Nirenburg, & Stephen Beale: An NLP lexicon as a largely language-independent resource [abstract].  Machine Translation 19 (2), 2005; pp.139-173.

(2005) Palmira Marrafa: The representation of complex telic predicates in wordnets: the case of lexical-conceptual structure deficitary verbs. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.109-116 [abstract, PDF, 14KB]

(2005) Shigeko Nariyama, Eric Nichols, Francis Bond, Takaaki Tanaka, & Hiromi Nakaiwa: Extracting representative arguments from dictionaries for resolving zero pronouns. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.3-10. [PDF, 665KB]

(2005) Carol Nichols & Rebecca Hwa: Word alignment and cross-lingual resource acquisition. ACL-2005: Interactive Poster and Demonstration Sessions, University of Michigan, Ann Arbor, June 2005; pp. 69-72. [PDF, 331KB]

(2005) Constantin Orasan, Ted Marshall, Robert Clark, Le An Ha, & Ruslan Mitkov: Building a WSD module within an MT system to enable interactive resolution in the user's source language. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 205-211. [PDF, 67KB]

(2005) Maja Popovic & Hermann Ney: Exploiting phrasal lexica and additional morpho-syntactic language resources for statistical machine translation with scarce training data. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 212-218. [PDF, 65KB]

(2005) F.Sáenz & A.Vaquero: Knowledge representation issues and implementation of lexical data bases. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.430-442 [abstract, PDF, 13KB]

(2005) Patanakul Sathapornrungkij & Charnyote Pluempitiwiriyawej: Construction of Thai WordNet lexical database from machine readable dictionaries. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.87-92 [PDF, 322KB]

(2005) Svetlana Sheremetyeva: "Less, easier and quicker" in language acquisition for patent MT. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Workshop on Patent Translation; pp.35-42. [PDF, 269KB]

(2005) Hans-Udo Stadler: Lexicon-coding workflow at CLS Communication. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 255-261. [PDF, 56KB]

(2005) Masatsugu Tonoike, Mitsuhiro Kida, Toshihiro Takagi, Yasuhiro Sasaki, Takehito Utsuro, & Satoshi Sato: Effect of domain-specific corpus in compositional translation estimation for technical terms. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.114-119. [PDF, 193KB]

(2005) Nitin Verma & Pushpak Bhattacharyya: Automatic generation of multilingual lexicon by using WordNet. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.380-391 [abstract, PDF, 16KB]

(2005) Xinglong Wang & John Carroll: Word sense disambiguation using sense examples automatically acquired from a second language. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 684-691. [PDF, 311KB]

(2005) Hao Zhang & Daniel Gildea: Stochastic lexicalized inversion transduction grammar for alignment. ACL-2005: 43rd Annual meeting of the Association for Computational Linguistics, University of Michigan, Ann Arbor, 25-30 June 2005; pp. 475-482. [PDF, 102KB]

(2005) Bing Zhao & Stephan Vogel: A generalized alignment-free phrase extraction.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 141-144. [PDF, 174KB]

Monolingual corpora

(2009) Delphine Bernhard & Iryna Gurevych: Combining lexical semantic resources with question & answer archives for translation-based answer finding. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.728-736. [PDF, 145KB]

(2009) Nicola Bertoldi & Marcello Federico: Domain adaptation for statistical machine translation with monolingual resources. Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.182-189. [PDF, 146KB]

(2009) Jing-Shin Chang & Sheng-Sian Lin: Improving translation fluency with search-based decoding and monolingual statistical machine translation model for automatic post-editing. ROCLING 2009: Proceedings of the 21st Conference on Computational Linguistics and Speech Processing, Taichung, Taiwan, 2009; pp.195-207. [PDF, 590KB]

(2009) Nikesh Garera, Chris Callison-Burch & David Yarowsky: Improving translation lexicon induction from monolingual corpora via dependency contexts and part-of-speech equivalences. CoNLL-2009. Proceedings of the Thirteenth Conference on Computational Natural Language Learning, June 4-5, 2009, Boulder, Colorado; pp.129-137. [PDF, 450KB]

(2009) Zhongjun He, Yao Meng, Yajuan Lü, Hao Yu, & Qun Liu: Reducing SMT rule table with monolingual key phrase. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.121-124. [PDF, 120KB]

(2009) Shachar Mirkin, Lucia Specia, Nicola Cancedda, Ido Dagan, Marc Dymetman, & Idan Szpektor: Source-language entailment modeling for translating unknown terms. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.791-799. [PDF, 152KB]

(2009) Yuval Marton, Chris Callison-Burch, & Philip Resnik: Improved statistical machine translation using monolingually-derived paraphrases. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.381-390. [PDF, 160KB]

(2009) Sujith Ravi & Kevin Knight: Learning phoneme mappings for transliteration without parallel data. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.37-45. [PDF, 198KB]

(2009) Massoud Sharifi-Atashgah & Mahmood Bijankhan: Corpus-based analysis for multi-token units in Persian. CAASL-3 – Third Workshop on Computational Approaches to Arabic Script-based Languages [at] MT Summit XII, August 26, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 817KB]

(2008) Toni Badia, Maite Melero, & Oriol Valentin: Rapid deployment of a new METIS language pair: Catalan-English. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 122KB]

(2008) Carmen Banea, Rada Mihalcea, Janyce Wiebe, & Samer Hassan: Multilingual subjectivity analysis using machine translation. EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.127-135. [PDF, 15403KB]

(2008) Michael Carl, Maite Melero, Toni Badia, Vincent Vandeghinste, Peter Dirix, Ineke Schuurman, Stella Markantonatou, Sokratis Sofianopoulos, Marina Vassiliou, & Olga Yannoutsou: METIS-II: low resource machine translation [abstract]. Machine Translation 22 (1/2), March-June 2008; pp.67-99.

(2008) Matthias Eck, Stephan Vogel, & Alex Waibel: Communicating unknown words in machine translation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 303KB]

(2008) Nikesh Garera & David Yarowsky: Minimally supervised multilingual taxonomy and translation lexicon induction. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.465-472. [PDF, 356KB]

(2008) Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, & Dan Klein: Learning bilingual lexicons from monolingual corpora. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 771-779. [PDF, 257KB]

(2008) Adrien Lardilleux & Yves Lepage: Multilingual alignments by monolingual string differences.  Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.55-58. [PDF, 164KB]

(2008) Zhifei Li & David Yarowsky: Unsupervised translation induction for Chinese abbreviations using monolingual corpora. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 425-433. [PDF, 181KB]

(2008) Holger Schwenk: Investigations on large-scale lightly-supervised training for statistical machine translation. IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA; pp.182-189 [PDF, 141KB]; presentation [PDF, 182KB]

(2008) Matthew Snover, Bonnie Dorr, & Richard Schwartz: Language and translation model adaptation using comparable corpora. EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.857-866. [PDF, 132KB]

(2008) Takashi Tsunakawa & Jun’ichi Tsujii: Bilingual synonym identification with spelling variations. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.457-464. [PDF, 691KB]

(2008) Hua Wu, Haifeng Wang, & Chengqing Zong: Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora. Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.993-1000. [PDF, 182KB]

(2008) Fan Yang, Jun Zhao, Bo Zou, Kang Liu, & Feifan Liu: Chinese-English backward transliteration assisted with mining monolingual web pages. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 541-549. [PDF, 593KB]

(2007) Bernd Bohnet: The induction and evaluation of word order rules using corpora based on the two concepts of topological models. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.38-45 [PDF, 1835KB]

(2007) Guihong Cao, Jianfeng Gao, & Jian-Yun Nie: A system to mine large-scale bilingual dictionaries from monolingual web pages. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.57-64 [PDF, 539KB]

(2007) Michael Carl, Sandrine Garnier, & Paul Schmidt: Demonstration of the German to English METIS-II MT system. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.41-42 [PDF, 251KB]; poster [PDF, 276KB]

(2007) Peter Dirix, Vincent Vandeghinste, & Ineke Schuurman: Demonstration of the Dutch-to-English METIS-II MT system. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.53-54 [PDF, 316KB]; presentation [PDF, 6104KB]

(2007) Stella Markantonatou, Sokratis Sofianopoulos, Vassiliki Spilioti, Marina Vassiliou, & Olga Yannoutsou: An MT system embedding pattern knowledge.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 8pp. [PDF, 195KB]

(2007) Maite Melero, Antoni Oliver, Toni Badia & Teresa Suñol: Dealing with bilingual divergences in MT using target language n-gram models.  METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 8pp. [PDF, 63KB]

(2007) Maite Melero & Toni Badia: Demonstration of the Spanish to English METIS-II MT system. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.132-133 [PDF, 220KB]

(2007) Marcus Sammer & Stephen Soderland: Building a sense-distinguished multilingual lexicon from monolingual corpora and bilingual lexicons. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.399-406 [PDF, 263KB]

(2007) Sofianopoulos Sokratis, Spilioti Vassiliki, Vassiliou Marina, Yannoutsou Olga, & Markantonatou Stella: Demonstration of the Greek to English METIS-II system. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.199-205 [PDF, 295KB]

(2007) Nicola Ueffing, Gholamreza Haffari, & Anoop Sarkar: Transductive learning for statistical machine translation. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 25-32 [PDF, 309KB]

(2007) Vincent Vandeghinste, Peter Dirix, & Ineke Schuurman: The effect of a few rules on a data-driven MT system. METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 8pp. [PDF, 154KB]

(2007) Vincent Vandeghinste: Removing the distinction between a translation memory, a bilingual dictionary and a parallel corpus. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 21pp. [PDF, 90KB]

(2007) Keiji Yasuda, Hirofumi Yamamoto, & Eiichiro Sumita: Method of selecting training sets to build compact and efficient language model. MT Summit XI Workshop: Using corpora for natural language generation: language generation and machine translation (UCNLG+MT), 11 September 2007, Copenhagen, Denmark; pp.31-37 [PDF, 1632KB]

(2006) Juri Apresjan, Igor Boguslavsky, Boris Iomdin, Leonid Iomdin, Andrei Sannikov, & Victor Sizov: A syntactically and semantically tagged corpus of Russian: state of the art and prospects.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1378-1381 [PDF, 596KB]

(2006) Jaime Carbonell, Steve Klein, David Miller, Michael Steinbaum, Tomer Grassiany, & Jochen Frei: Context-based machine translation. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.19-28 [PDF, 338KB]

(2006) Pascale Fung & Benfeng Chen: Robust word sense translation by EM learning of frame semantics. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.239-246. [PDF, 223KB]

(2006) Kyo Kageura & Genichiro Kikui: A self-referring quantitative evaluation of the ATR Basic Travel Expression Corpus (BTEC). LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1945-1950 [PDF, 368KB]

(2006) Svetla Koeva, Svetlozara Lesseva, & Maria Todorova: Bulgarian sense tagged corpus.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.79-86. [PDF, 494KB]

(2006) Stella Markantonatou, Sokratis Sofianopoulos, Vassiliki Spilioti, George Tambouratzis, Marina Vassiliou, & Olga Yannoutsou: Using patterns for machine translation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.239-245 [PDF, 237KB]

(2006) Brock Pytlik & David Yarowsky: Machine translation for languages lacking bitext via multilingual gloss transduction. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.156-165 [PDF, 267KB]

(2006) Nicola Ueffing: Using monolingual source-language data to improve MT performance. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2006], November 27-28, 2006, Kyoto, Japan; pp. 174-181 [PDF, 103KB]

(2005) Toni Badia, Gemma Boleda, Maite Melero, & Antoni Oliver: An n-gram approach to exploiting a monolingual corpus for machine translation. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.1-7. [PDF, 1877KB]

(2005) Chris Brockett & William B.Dolan: Support vector machines for paraphrase identification and corpus construction. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 1-8. [PDF, 115KB]

(2005) Carme Colominas: BancTrad: a web interface for integrated access to annotated corpora. International workshop: Modern approaches in translation technologies, Borovets, Bulgaria, 24 September 2005; pp.7-8 [PDF, 114KB]

(2005) Peter Dirix, Ineke Schuurman, & Vincent Vandeghinste: METIS-II: example-based machine translation using monolingual corpora - system description. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.43-50. [PDF, 435KB]

(2005) William B.Dolan & Chris Brockett: Automatically constructing a corpus of sentential paraphrases. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 9-16. [PDF, 115KB]

(2005) Atsushi Fujita & Kentaro Inui: A class-oriented approach to building a paraphrase corpus. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 25-32. [PDF, 300KB]

(2005) Yves Lepage & Etienne Denoual: Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation. IJCNLP-05: Third International Workshop on Paraphrasing (IWP 2005). Proceedings of the workshop, 12 October 2005, Jeju Island, Korea; pp. 57-64. [PDF, 131KB]

(2005) Stella Markantonatou, Sokratis Sofianopoulos, Vassiliki Spilioti, Yiorgos Tambouratzis, Marina Vassiliou, Olga Yannoutsou, & Nikos Ioannou: Monolingual corpus-based MT using chunks. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.91-98. [PDF, 694KB]

(2005) Hirokazu Suzuki & Akira Kumano: Learning translations from monolingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.33-40. [PDF, 439KB]

(2005) Vincent Vandeghinste, Peter Dirix, & Ineke Schuurman: Example-based translation without parallel corpora: first experiments on a prototype. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.135-142. [PDF, 370KB]

(2005) Shyamsundar Jayaraman & Alon Lavie: Multi-engine machine translation guided by explicit word matching. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 143-152. [PDF, 63KB]

(2005) Maja Popovic & Hermann Ney: Exploiting phrasal lexica and additional morpho-syntactic language resources for statistical machine translation with scarce training data. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 212-218. [PDF, 65KB]

Multilingual corpora

(2009) Reza Bosagh Zadeh: Building strong multilingual aligned corpora. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.176-181. [PDF, 497KB]

(2009) Gosse Bouma, Sergio Duarte, & Zahurul Islam: Cross-lingual alignment and completion of Wikipedia templates. NAACL-HLT-2009: The Third International Workshop on Cross Lingual Information Access: Addressing the Information Need of Multilingual Societies (CLIAWS3), Proceedings of the Workshop, June 4, 2009, Boulder, Colorado; pp.21-29. [PDF, 160KB]

(2009) Yu Chen, Martin Kay, & Andreas Eisele: Intersecting multilingual data for faster and better statistical translations.  NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.128-136. [PDF, 159KB]

(2009) Dmitry Davidov & Ari Rappoport: Enhancements of lexical concepts using cross-lingual web mining. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.852-861. [PDF, 162KB]

(2009) Miquel Esplà-Gomis: Bitextor, a free/open-source software to harvest translation memories from multilingual websites. MT Summit XII - Workshop: Beyond Translation Memories: New Tools for Translators MT, August 29, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 730KB]

(2009) Elena Filatova: Directions for exploiting asymmetries in multilingual Wikipedia. NAACL-HLT-2009: The Third International Workshop on Cross Lingual Information Access: Addressing the Information Need of Multilingual Societies (CLIAWS3), Proceedings of the Workshop, June 4, 2009, Boulder, Colorado; pp.30-37. [PDF, 114KB]

(2009) Alexandre Klementiev & Dan Roth: Named entity transliteration and discovery in multilingual corpora.  In: Cyril Goutte, Nicola Cancedda, Marc Dymetman, & George Foster (eds.) Learning machine translation. (Cambridge, Mass.: The MIT Press, 2009); pp.79-92.

(2009) Adrien Lardilleux, Jonathan Chevelu, Yves Lepage, Ghislain Putois, & Julien Gosme: Lexicons or phrase tables? An investigation in sampling-based multilingual alignment. Proceedings of the 3rd International Workshop on Example-Based Machine Translation, 12-13 November 2009, Dublin City University, Dublin, Ireland, ed. Mikel L. Forcada [and] Andy Way; pp.45-52. [PDF, 345KB]; presentation [PDF of PPT, 256KB]

(2009) Els Lefever, Lieve Macken & Veronique Hoste: Language-independent bilingual terminology extraction from a multilingual parallel corpus.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.496-504. [PDF, 137KB]

(2009) Els Lefever & Veronique Hoste: SemEval-2010 Task 3: cross-lingual word sense disambiguation. NAACL-HLT-2009 (SEW-2009): Semantic Evaluations: Recent Achievements and Future Directions, Proceedings of the workshop, June 4, 2009, Boulder, Colorado; pp.76-81. [PDF, 146KB]

(2009) Djamel Mostefa, Mariama Laïb, Stéphane Chaudiron, Khalid Choukri, & Gaël de Chalendar: A multilingual named entity corpus for Arabic, English and French.  MEDAR 2009: 2nd International Conference on Arabic Language Resources & Tools, 22-23 April 2009, Cairo, Egypt; pp.213-216. [PDF, 571KB]

(2009) Alexandre Rafalovitch & Robert Dale: United Nations general assembly resolutions: a six-language parallel corpus. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.292-299. [PDF, 221KB]

(2009) Benjamin Snyder, Tahira Naseem, Jacob Eisenstein, & Regina Barzilay: Adding more languages improves unsupervised multilingual part-of-speech tagging: a Bayesian non-parametric approach. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.83-91. [PDF, 599KB]

(2009) Raghavendra Udupa, K.Saravanan, A.Kumaran, & Jagadeesh Jagarlamudi: MINT: a method for effective and scalable mining of named entity transliterations from large comparable corpora.  EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.799-807. [PDF, 1120KB]

(2009) Masao Utiyama, Daisuke Kawahara, Keiji Yasuda & Eiichiro Sumita: Mining parallel texts from mixed-language web pages. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.152-159. [PDF, 136KB]

(2008) Eneko Agirre & Aitor Soroa: Using the multilingual central repository for graph-based word sense disambiguation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 114KB]

(2008) Brett W.Bader & Peter A.Chew: Enhancing multilingual latent semantic analysis with term alignment information.  Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.49-56. [PDF, 295KB]

(2008) Anurag Bhardwaj, Damien Jose, & Venu Govindaraju: Script independent word spotting in multilingual documents. IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.48-54. [PDF, 122KB]

(2008) Yu Chen, Andreas Eisele, & Martin Kay: Improving statistical machine translation efficiency by triangulation.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 122KB]

(2008) João Graça, Joana Paulo Pardal, Luisa Coheur, & Diamantino Caseiro: Building a golden collection of parallel multi-language word alignments. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 301KB]

(2008) Adrien Lardilleux & Yves Lepage: Multilingual alignments by monolingual string differences.  Coling 2008:  22nd International Conference on Computational Linguistics, Posters and demonstrations, 18-22 August 2008, Manchester UK; pp.55-58. [PDF, 164KB]

(2008) Adrien Lardilleux & Yves Lepage: A truly multilingual, high coverage, accurate, yet simple, subsentential alignment method. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.125-132. [PDF, 633KB]

(2008) Denis Maurel: Prolexbase: a multilingual relational lexical database of proper names. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 145KB]

(2008) Luka Nerima & Eric Wehrli: Generating bilingual dictionaries by transitivity. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 205KB]

(2008) Petya Osenova, Kiril Simov, & Eelco Mossel: Language resources for semantic document annotation and crosslingual retrieval.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 465KB]

(2008) Doaa Samy & Ana González-Ledesma: Pragmatic annotation of discourse markers in a multilingual parallel corpus (Arabic-Spanish-English).  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 301KB]

(2008) Marianne Santaholma & Nikos Chatzichrisafis: A knowledge-modeling approach for multilingual Regulus lexica. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 156KB]

(2008) Lane Schwartz: Multi-source translation methods. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.279-288 [PDF, 650KB]

(2008) Julia Trushkina, Lieve Macken, & Hans Paulussen: Sentence alignment in DPC: maximizing precision, minimizing human effort.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 52KB]

(2008) Hans van Halteren: Source language markers in EUROPARL translations.  Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.937-944. [PDF, 144KB]

(2008) Wolodja Wentland, Johannes Knopp, Carina Silberer, & Matthias Hartung: Building a multilingual lexical resource for named entity disambiguation, translation and transliteration.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 451KB]

(2007) Peter A.Chew & Ahmed Abdelali: Benefits of the ‘massively parallel Rosetta stone’: cross-language information retrieval with over 30 languages. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 872-879 [PDF, 145KB]

(2007) Trevor Cohn & Mirella Lapata: Machine translation by triangulation: making effective use of multi-parallel corpora. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 728-735 [PDF, 197KB]

(2007) Fernando Diaz & Donald Metzler: Pseudo-aligned multilingual corpora. IJCAI-07: Twentieth International Joint conference on Artificial Intelligence, Hyderabad, India, 6-12 January 2007; pp.2727-2732. [PDF, 126KB]

(2007) Shankar Kumar, Franz Och, & Wolfgang Macherey: Improving word alignment with bridge languages. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 42-50. [PDF, 183KB]

(2007) Marcus Sammer & Stephen Soderland: Building a sense-distinguished multilingual lexicon from monolingual corpora and bilingual lexicons. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.399-406 [PDF, 263KB]

(2006) Simona Balbi & Michelangelo Misuraca: Rotated canonical correlation analysis for multilingual corpora. JADT 2006: 8es Journées internationales d’Analyse statistique des Données Textuelles, 19-21 avril 2006, Besançon, France; pp.99-106. [PDF, 216KB]

(2006) Thierry Declerck, Asunción Gómez Pérez, Ovidiu Vela, Zeno Gantner, & David Manzano-Macho: Multilingual lexical semantic resources for ontology translation. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1492-1495 [PDF, 248KB]

(2006) Sisay Fissaha Adafre & Maarten de Rijke: Finding similar sentences across multiple languages in Wikipedia. EACL-2006: Proceedings of the Workshop on New Text: Wikis and blogs and other dynamic text sources, April 4, 2006, Trento, Italy; pp.62-69. [PDF, 131KB]

(2006) Gil Francopoulo, Nuria Bel, Monte George, Nicoletta Calzolari, Monica Monachini, Mandy Pet, & Claudia Soria: Lexical markup framework (LMF) for NLP multilingual resources. Coling-ACL 2006: Proceedings of the Workshop on Multilingual Language Resources and Interoperability, Sydney, July 2006; pp.1-8. [PDF, 69KB]

(2006) Emmanuel Giguet & Pierre-Sylvain Luquet: Multilingual lexical database generation from parallel texts in 20 European languages with endogenous resources. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.271-278. [PDF, 427KB]

(2006) Alexandre Klementiev & Dan Roth: Weakly supervised named entity transliteration and discovery from multilingual comparable corpora. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.817-824. [PDF, 173KB]

(2006) Owen Rambow, Bonnie Dorr, David Farwell, Rebecca Green, Nizar Habash, Stephen Helmreich, Eduard Hovy, Lori Levin, Keith J.Miller, Teruko Mitamura, Florence Reeder, & Advaith Siddharthan: Parallel syntactic annotation of multiple languages.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.559-564 [PDF, 286KB]

(2006) Doaa Samy, Antonio Moreno Sandoval, Jose M. Guirao, & Enrique Alfonseca: Building a parallel multilingual corpus (Arabic-Spanish-English).  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2176-2181 [PDF, 473KB]

(2006) Dan Ştefănescu & Dan Tufiş: Aligning multilingual thesauri. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.475-478 [PDF, 461KB]

(2006) Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaž Erjavec, Dan Tufiş, & Daniel Varga: The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2142-2147 [PDF, 411KB]

(2006) Lonneke van der Plas & Jörg Tiedemann: Finding synonyms using automatic word alignment and measures of distributional similarity.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.866-873. [PDF, 223KB]

(2005) Alfio Gliozzo & Carlo Strapparava: Cross language text categorization by acquiring multilingual domain models from comparable corpora. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 9-16. [PDF, 259KB]

(2005) Hideki Kashioka: Training data modification for SMT: considering groups of synonymous sentences. ACL-2005: Workshop on Empirical Modeling of Semantic Equivalence and Entailment, University of Michigan, Ann Arbor, 30 June 2005; pp. 19-24. [PDF, 113KB]

(2005) Philipp Koehn: Europarl: a parallel corpus for statistical machine translation. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.79-86. [PDF, 123KB]

(2005) Carol Nichols & Rebecca Hwa: Word alignment and cross-lingual resource acquisition. ACL-2005: Interactive Poster and Demonstration Sessions, University of Michigan, Ann Arbor, June 2005; pp. 69-72. [PDF, 331KB]

(2005) Pavol Zavarsky, Yoshiki Mikami & Shota Wada: Language and encoding scheme identification of extremely large sets of multilingual text. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.354-355. [PDF, 84KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.85-90. [PDF, 936KB]

(2005) Yujie Zhang, Kiyotaka Uchimoto, Qing Ma, & Hitoshi Isahara: Building an annotated Japanese-Chinese parallel corpus – a part of NICT multilingual corpora. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.71-78. [PDF, 1139KB]

Ontologies

(2009) E.Boldrini, S.Ferrández, R.Izquierdo, D.Tomás, & J.L.Vicedo: A parallel corpus labeled using open and restricted domain ontologies [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 65KB]

(2009) Philipp Cimiano, Antje Schultz, Sergej Sizov, Philipp Sorg, & Steffen Staab: Explicit versus latent concept models for cross-language information retrieval. IJCAI-09: Twenty-first International Joint conference on Artificial Intelligence, Pasadena, California, USA, 11-17 July 2009; pp.1513-1518. [PDF, 493KB]

(2009) Dmitry Davidov & Ari Rappoport: Translation and extension of concepts across languages. EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.175-183. [PDF, 141KB]

(2009) Miguel García, Jesús Giménez & Lluís Màrquez: Enriching statistical translation models using domain-independent multilingual lexical knowledge base [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 19KB]

(2009) Kentaro Torisawa: Monolingual knowledge acquisition and a multilingual information environment – invited talk. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; 43pp. [PDF of PPT, 7552KB]

(2009) Jantine Trapman & Paola Monachesi: Ontology engineering and knowledge extraction for crosslingual retrieval. [RANLP 2009] International conference: Recent Advances in Natural Language Processing. Proceedings ed. Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Nicolas Nicolov, Nikolai Nikolov, Borovets, Bulgaria, 14-16 September 2009; pp.455-459. [PDF, 133KB]

(2008) Eneko Agirre & Aitor Soroa: Using the multilingual central repository for graph-based word sense disambiguation. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 114KB]

(2008) Rajat Kumar Mohanty & Pushpak Bhattacharyya: Lexical resources for semantic extraction.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 5pp. [PDF, 148KB]

(2008) Rajat Kumar Mohanty, Pushpak Bhattacharyya, Sraddha Kalele, Prabhakar Pandey, Adita Sharma, & Mitesh Kopra: Synset based multilingual dictionary: insights, applications and challenges. GWC-2008: the Fourth Global WordNet conference, Szeged, Hungary, January 22-25, 2008; pp. 321-332. [PDF, 1245KB]

(2008) Petya Osenova, Kiril Simov, & Eelco Mossel: Language resources for semantic document annotation and crosslingual retrieval.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 465KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Kergrit Robkop, & Hitoshi Isahara: KUI: self-organizing multi-lingual WordNet construction tool. GWC-2008: the Fourth Global WordNet conference, Szeged, Hungary, January 22-25, 2008; pp. 419-427. [PDF, 814KB]

(2008) Yorick Wilks: On whose shoulders? Computational Linguistics 34(4), pp. 471-486. [PDF, 113KB]

(2007) Varga István & Yokoyama Shoichi: Japanese-Hungarian dictionary generation using ontology resources. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.483-490 [PDF, 167KB]

(2007) Rajat Kumar Mohanty, M. Krishna Prasad, Lakshmi Narayanaswamy, & Pushpak Bhattacharyya: Semantically relatable sequences in the context of interlingua based machine translation. ICON-2007: 5th International Conference on Natural Language Processing, IIIT Hyderabad, India, 4-6 January 2007; 8pp. [PDF, 925KB]

(2007) Eelco Mossel: Cross-lingual ontology-based document retrieval.  RANLP-2007: Workshop on Natural Language Processing and Knowledge Representation for eLearning Environments, September 26th, 2007, Borovets, Bulgaria.  8pp. [PDF, 340KB]

(2006) Thierry Declerck, Asunción Gómez Pérez, Ovidiu Vela, Zeno Gantner, & David Manzano-Macho: Multilingual lexical semantic resources for ontology translation. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1492-1495 [PDF, 248KB]

(2006) Pascale Fung & Benfeng Chen: Robust word sense translation by EM learning of frame semantics. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.239-246. [PDF, 223KB]

(2006) Daniel T. Heinze, Alexander Turchin & V. Jagannathan: Automated interpretation of clinical encounters with cultural cues and electronic health record generation.  HLT-NAACL 2006: Proceedings of the  Workshop on Medical Speech Translation, 9 June 2006, New York, NY, USA; pp.24-31 [PDF, 251KB]

(2006) Eduard Hovy: A gentle introduction to ontologies.  Tutorial at AMTA 2006 conference, August 8, 2006, Cambridge, Massachusetts, USA; 60pp. [PDF of PPT presentation, 2050KB]

(2006) Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw & Ralph Weischedel: OntoNotes: the 90% solution.  HLT-NAACL 2006: Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, New York, NY, USA, June 2006; pp. 57-60 [PDF, 44KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging reusability: cost-effective lexical acquisition for large-scale ontology translation. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.945-952. [PDF, 159KB]

(2006) G. Craig Murray, Bonnie J. Dorr, Jimmy Lin, Jan Hajič, & Pavel Pecina: Leveraging recurrent phrase structure in large-scale ontology translation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.141-150 [PDF, 686KB]

(2006) E. Saquete, P.Martínez-Barco, R.Muñoz, M.Negri, M.Speranza, & R.Sprugnoli: Multilingual extension of a temporal expression normalizer using annotated corpora. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Cross-Language Knowledge Induction Workshop, Trento, Italy, April 3, 2006; pp.1-8 [PDF, 261KB]

(2006) Dennis Spohr & Ulrich Heid: Modeling monolingual and bilingual collocation dictionaries in description logics. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Workshop on Multi-word expressions in a Multilingual Context, Trento, Italy, April 3, 2006; pp.65-72 [PDF, 357KB]

(2005) Proceedings of Workshop on Semantic Web Technologies for Machine Translation, MT Summit X, 12 September 2005, Phuket, Thailand [HTML]

(2005) Tânia C.D.Bueno, Hugo C.Hoeschl, Andre Bortolon, Eduardo S.Mattos, Cristina Santos, & Ricardo M.Barcia: Knowledge engineering suite: a tool to create ontologies for automatic knowledge representation in intelligent systems. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.337-346 [abstract, PDF, 13KB]

(2005) Jesús Cardeñosa, Carolina Gallardo & Luis Iraola: An XML-UNL model for knowledge-based annotation. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.300-308 [abstract, PDF, 99KB]

(2005) Claire-Lise Mottaz Jiang, Gabriela Tissiani, Gilles Falquet, & Rodolfo Pinto da Luz: Facilitating communication between languages and cultures: a computerized interface and knowledge base. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.359-369 [abstract, PDF, 51KB]

(2005) Marjorie McShane, Sergei Nirenburg, & Stephen Beale: An NLP lexicon as a largely language-independent resource [abstract].  Machine Translation 19 (2), 2005; pp.139-173.

(2005) F.Sáenz & A.Vaquero: Knowledge representation issues and implementation of lexical data bases. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.430-442 [abstract, PDF, 13KB]

(2005) Virach Sornlertlamvanich, Canasai Kruengkrai, Shisanu Tongchin, Prapass Srichaivattana, & Hitoshi Isahara: Term-based ontology alignment. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.138-144 [abstract, PDF, 70KB]

(2005) Ronaldo Teixeira Martins & Maria das Graças Volpe Nunes: On the aboutness of UNL. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.51-63 [abstract, PDF, 57KB]

Open source

(2009) Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Francis M.Tyers (eds.): Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, 2-3 November 2009, Universitat d’Alacant, Alacant, Spain

(2009) Miquel Esplà-Gomis: Bitextor, a free/open-source software to harvest translation memories from multilingual websites. MT Summit XII - Workshop: Beyond Translation Memories: New Tools for Translators MT, August 29, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 730KB]

(2009) David Farwell & Lluís Padró: FreeLing: from a multilingual open-source analyzer suite to an EBMT platform. Proceedings of the 3rd International Workshop on Example-Based Machine Translation, 12-13 November 2009, Dublin City University, Dublin, Ireland, ed. Mikel L. Forcada [and] Andy Way; pp.37-43. [PDF, 372KB]; presentation [PDF of PPT, 331KB]

(2009) Mikel L.Forcada: Why free/open-source EBMT? Presentation at Proceedings of the 3rd International Workshop on Example-Based Machine Translation, 12-13 November 2009, Dublin City University, Dublin, Ireland, ed. Mikel L. Forcada [and] Andy Way; 15 slides. [PDF, 310KB]

(2009) João Graça, Kuzman Ganchev & Ben Taskar: PostCAT – posterior constrained alignment toolkit. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.27-36. [PDF, 134KB]

(2009) Yvette Graham & Josef van Genabith: An open source rule induction tool for transfer-based SMT. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.37-46. [PDF, 321KB]

(2009) Hendrik J.Groenewald & Wildrich Fourie: Introducing the Autshumato integrated translation environment. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.190-196. [PDF, 410KB]

(2009) Hieu Hoang, Philipp Koehn, & Adam Lopez: A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; pp. 152-159. [PDF, 231KB]; presentation [PDF of PPT, 125KB]

(2009) Sadao Kurohashi: Free/open-source EBMT. Presentation at Proceedings of the 3rd International Workshop on Example-Based Machine Translation, 12-13 November 2009, Dublin City University, Dublin, Ireland, ed. Mikel L. Forcada [and] Andy Way; 1 slide. [PDF, 38KB]

(2009) Zhifei Li, Chris Callison-Burch, Sanjeev Khudanpur, & Wren Thornton: Decoding Joshua: open source, parsing-based machine translation. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.47-56. [PDF, 146KB]

(2009) Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren N.G.Thornton, Jonathan Weese, & Omar F.Zaidan: Demonstration of Joshua: an open source toolkit for parsing-based machine translation. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Software Demonstrations, Suntec, Singapore, 3 August 2009; pp.25-28. [PDF, 429KB]

(2009) Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren N.G.Thornton, Jonathan Weese, & Omar F.Zaidan: Joshua: an open source toolkit for parsing-based machine translation.  Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30 March – 31 March 2009; pp.135-139. [PDF, 109KB]

(2009) Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren Thornton, Jonathan Weese, & Omar Zaidan: Joshua: open source toolkit for parsing-based machine translation. Third Machine Translation Marathon, Prague, Czech Republic, 26-30 January 2009; 23pp.  [PDF, 933KB]

(2009) Aaron B. Phillips & Ralf D. Brown: Cunei machine translation platform: system description. Proceedings of the 3rd International Workshop on Example-Based Machine Translation, 12-13 November 2009, Dublin City University, Dublin, Ireland, ed. Mikel L. Forcada [and] Andy Way; pp.29-36. [PDF, 109KB]; presentation [PDF of PPT, 1356KB]

(2009) Francis M. Tyers & Kevin Donnelly: apertium-cy: a collaboratively-developed RBMT system for Welsh to English. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.57-66. [PDF, 134KB]  [presentation at MT Marathon 2009, 1376KB]

(2009) Omar F. Zaidan: Z-MERT: a fully configurable open source tool for minimum error rate training of machine translation systems. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.79-88. [PDF, 263KB]

(2008) I.Alegria, X.Arregi, A.Diaz de Ilarraza, G.Labaka, M.Lersundi, A.Mayor, & K.Sarasola: Strategies for sustainable MT for Basque: incremental design, reusability, standardization and open source. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.59-64. [PDF, 74KB]

(2008) Pierrette Bouillon, Glenn Flores, Maria Georgescul, Sonia Halimi, Beth Ann Hockey, Hitoshi Isahara, Kyoko Kanzaki, Yukie Nakao, Manny Rayner, Marianne Santaholma, Marianne Starlander, & Nikos Tsourakis: Many-to-many multilingual medical speech translation on a PDA. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.314-323. [PDF, 666KB]

(2008) Shin Chang-Meadows: MT errors in CH-to-EN MT systems: user feedback. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.324-333.  [PDF of PPT presentation, 778KB]

(2008) Ze-Jung Chuang & Yuen-Hsien Tseng: NTCIR-7 experiments in patent translation based on open-source statistical machine translation tools. Proceedings of NTCIR-7 Workshop Meeting, December 16-19, 2008, Tokyo, Japan; pp. 423-424. [PDF, 365KB]

(2008) Hieu Hoang & Philipp Koehn: Design of the Moses decoder for statistical machine translation. ACL-08 HLT: Software Engineering, Testing, and Quality Assurance for Natural Language Processing, June 20, 2008, The Ohio State University, Columbus, Ohio, USA; pp.58-65. [PDF, 137KB]

(2008) Philipp Koehn: Moses: moving open source MT towards linguistically richer models [abstract only]. Invited talk at: MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; p. 71. [PDF, 266KB]

(2008) Alexandre Patry & Philippe Langlais: MISTRAL: a statistical machine translation decoder for speech recognition lattices.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 265KB]

(2008) Manny Rayner, Pierrette Bouillon, Beth Ann Hockey, & Yukie Nakao: Almost flat functional semantics for speech translation.  Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.713-720. [PDF, 96KB]

(2008) Manny Rayner, Pierrette Bouillon, Glenn Flores, Farzad Ehsani, Marianne Starlander, Beth Ann Hockey, Jane Brotanek, & Lukas Biewald: A small-vocabulary shared task for medical speech translation. Coling 2008: Proceedings of the Workshop on Speech Processing for Safety Critical Translation and Pervasive Applications, 23 August 2008, Manchester, UK; pp.60-63. [PDF, 62KB]

(2008) Holger Schwenk & Yannick Esteve: Data selection and smoothing in an open-source system for the 2008 NIST machine translation evaluation. Interspeech 2008: 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008; pp.2727-2730; abstract [PDF, 64KB]

(2008) Andreas Zollmann, Ashish Venugopal, & Stephan Vogel: The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.  IWSLT 2008: Proceedings of the International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA; pp. 18-25. [PDF, 208KB]; presentation [PDF, 109KB]

(2008) About OpenMT project. MATMT 2008: Mixing Approaches to Machine Translation, Donostia-San Sebastian [Spain], February 14th 2008: Proceedings; p. 7.  [PDF, 74KB]

(2007) Yu Chen, Andreas Eisele, Christian Federmann, Eva Hasler, Michael Jellinghaus, & Silke Theison: Multi-engine machine translation with an open-source decoder for statistical machine translation.  ACL 2007: proceedings of the Second Workshop on Statistical Machine Translation, June 23, 2007, Prague, Czech Republic; pp. 193-196 [PDF, 96KB]

(2007) Alain Désilets: Translation Wikified: how will massive online collaboration impact the world of translation? Keynote speech at Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 15pp. [PDF, 339KB]

(2007) Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, & Evan Herbst: Moses: open source toolkit for statistical machine translation. ACL 2007: proceedings of demo and poster sessions, Prague, Czech Republic, June 2007; pp. 177-180 [PDF, 119KB]

(2007) Eric Nichols, Francis Bond, Darren Scott Appling, & Yuji Matsumoto: Combining resources for open source machine translation. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.134-143 [PDF, 404KB]; poster [PDF, 67KB]

(2007) Antoni Oliver & Mercè Vàzquez: A free terminology extraction suite. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 28pp. [PDF, 182KB]

(2007) Felipe Sánchez-Martínez & Mikel L.Forcada: Automatic induction of shallow-transfer rules for open-source machine translation. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.181-190 [PDF, 336KB]; poster [PDF, 29KB]

(2007) Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, & Mikel L. Forcada: Integrating corpus-based and rule-based approaches in an open-source machine translation system. METIS-II Workshop: New Approaches to Machine Translation, Centre for Computational Linguistics, Katholieke Universiteit Leuven, Belgium, 11 January 2007; 10pp. [PDF, 180KB]

(2006) Carme Armentano i Oller & Mikel L. Forcada: Open-source machine translation between small languages: Catalan and Aranese Occitan.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.51-54. [PDF, 62KB]

(2006) Mikel L.Forcada: Open source machine translation: an opportunity for minor languages. LREC-2006: Fifth International Conference on Language Resources and Evaluation. 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, Genoa, Italy, 23 May 2006; pp.1-6. [PDF, 118KB]

(2006) Marian Olteanu, Chris Davis, Ionut Volosen, & Dan Moldovan: Phramer – an open source statistical phrase-based translator. HLT-NAACL 2006: Proceedings of the Workshop on Statistical Machine Translation, New York, NY, USA, June 2006; pp. 146-149 [PDF, 125KB]

(2006) Alexandre Patry, Fabrizio Gotti, & Philippe Langlais: MOOD: a modular object-oriented decoder for statistical machine translation.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.709-714 [PDF, 281KB]

(2006) Javier Pérez & Antonio Bonafonte: GAIA: a common framework for the development of speech translation technologies.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.2560-2563 [PDF, 531KB]

(2006) Gema Ramírez-Sánchez, Felipe Sánchez-Martínez, Sergio Ortiz-Rojas, Juan Antonio Pérez-Ortiz & Mikel L.Forcada: Opentrad Apertium open-source machine translation system: an opportunity for business and research. Translating and the Computer 28: proceedings of the Twenty-eighth International Conference on Translating and the Computer, 16-17 November 2006, London. (London: Aslib, 2006); 15pp. [PDF, 145KB]

(2006) Oliver Streiter, Kevin P.Scannell, & Martin Stuflesser: Implementing NLP projects for noncentral languages: instructions for funding bodies, strategies for developers [abstract]. Machine Translation 20 (4), 2006; pp.267-289.

(2005) proceedings of Workshop on Open Source Machine Translation, MT Summit X, 16 September 2005, Phuket, Thailand [HTML]

(2005) Antonio M.Corbi-Bellot, Mikel L. Forcada, Sergio Ortiz-Rojas, Juan Antonio Pérez-Ortiz, Gema Ramírez-Sánchez, Felipe Sánchez-Martínez, Iñaki Alegria, Aingeru Mayor, & Kepa Sarasola: An open-source shallow-transfer machine translation engine for the Romance languages of Spain. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 79-86. [PDF, 130KB]

Scarce resources (see also Language resources, Rapid development of MT)

(2009) proceedings of SALTMIL 2009, “Information retrieval and information extraction for less resourced languages”, Donostia-San Sebastián, September 7 2009. [PDF, 16458KB]

(2009) Eneko Agirre, Aitziber Atutxa, Gorka Labaka, Mikel Lersundi, Aingeru Mayor, & Kepa Sarasola: Use of rich linguistic information to translate prepositions and grammatical cases to Basque. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.58-65. [PDF, 340KB]

(2009) Vamshi Ambati & Jaime Carbonell: Proactive learning for building machine translation systems for minority languages. NAACL-HLT-2009: Active Learning for Natural Language Processing (ALNLP-09), Proceedings of the workshop, June 5, 2009, Boulder, Colorado; pp.58-61. [PDF, 101KB]

(2009) Kathy Baker, Steven Bethard, Michael Bloodgood, Ralf Brown, Chris Callison-Burch, Glen Coppersmith, Bonnie Dorr, Wes Filardo, Kendall Giles, Anni Irvine, Mike Kayser, Lori Levin, Justin Martineau, Jim Mayfield, Scott Miller, Aaron Phillips, Andrew Philpot, Christine Piatko, Lane Schwartz, & David Zajic: Semantically informed machine translation (SIMT). Final report of the 2009 Summer Camp for Applied Language Exploration [Johns Hopkins University], November 2009; 152pp. [PDF, 5705KB]

(2009) Rodolfo Delmonte, Antonella Bristot, Sara Tonelli, & Emanuele Pianta: English/Veneto resource poor machine translation with STILVEN.  ISMTCL: International Symposium on Data and Sense Mining, Machine Translation and Controlled Languages, and their application to emergencies and safety critical domains, July 1-3, 2009, Centre Tesnière, University of Franche-Comté, Besançon, France (Presses universitaires de Franche-Comté, 2009); pp.82-89 [abstract]

(2009) Dmitriy Genzel, Klaus Macherey & Jakob Uszkoreit: Creating a high-quality machine translation system for a low-resource language: Yiddish. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 41-48. [PDF, 157KB]

(2009) Hendrik J.Groenewald & Wildrich Fourie: Introducing the Autshumato integrated translation environment. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.190-196. [PDF, 410KB]

(2009) Gholamreza Haffari, Maxim Roy, & Anoop Sarkar: Active learning for statistical phrase-based machine translation. NAACL HLT 2009. Human Language Technologies: the 2009 annual conference of the North American Chapter of the ACL, Boulder, Colorado, May 31 - June 5, 2009; pp.415-423. [PDF, 190KB]

(2009) Preslav Nakov & Hwee Tou Ng: Improved statistical machine translation for resource-poor languages using related resource-rich languages. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.1358-1367. [PDF, 209KB]

(2009) Kepa Sarasola: Matxin: developing sustainable machine translation for a less-resourced language [abstract]. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, 2-3 November 2009, Universitat d’Alacant, Alacant, Spain; ed. Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Francis M.Tyers; p.1.

(2009) Stephen Soderland, Christopher Lim, Mausam, Bo Qin, Oren Etzioni, & Jonathan Pool: Lemmatic machine translation. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.128-135. [PDF, 214KB]

(2009) Jörg Tiedemann: Translating questions for cross-lingual QA. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.112-119. [PDF, 370KB]

(2009) Varga István & Yokoyama Shoichi: Bilingual dictionary generation for low-resourced language pairs. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.862-870. [PDF, 199KB]

(2008) Michael Carl, Maite Melero, Toni Badia, Vincent Vandeghinste, Peter Dirix, Ineke Schuurman, Stella Markantonatou, Sokratis Sofianopoulos, Marina Vassiliou, & Olga Yannoutsou: METIS-II: low resource machine translation [abstract]. Machine Translation 22 (1/2), March-June 2008; pp.67-99.

(2008) Steve DeNeefe, Ulf Hermjakob & Kevin Knight: Overcoming vocabulary sparsity in MT using lattices. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.89-96. [PDF, 492KB]

(2008) Emil Ettelaie, Panayiotis G.Georgiou, & Shrikanth S.Narayanan: Mitigation of data sparsity in classifier-based translation. Coling 2008: Proceedings of the Workshop on Speech Processing for Safety Critical Translation and Pervasive Applications, 23 August 2008, Manchester, UK; pp.1-4. [PDF, 138KB]

(2008) Reginald L.Hobbs & Clare R.Voss: Designing and executing machine translation workflows through the Kepler framework. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.380-389.  [PDF of PPT presentation, 1204KB]

(2008) Arnar Jensson, Koji Iwano, & Sadaoki Furui: Development of a speech recognition system for Icelandic using machine translated text. First International Workshop on Spoken Languages Technologies for Under-resourced languages (SLTU-2008), Hanoi University of Technology, Hanoi, Vietnam, May 5-7, 2008; pp. 18-21. [PDF, 116KB]

(2008) Andreas Kathol & Jing Zheng: Strategies for building a Farsi-English SMT system from limited resources. Interspeech 2008: 9th Annual Conference of the International Speech Communication Association, Brisbane, Australia, September 22-26, 2008; pp.2731-2734; abstract [PDF, 62KB]

(2008) Karine Megerdoomian & Dan Parvaz: Low-density language bootstrapping: the case of Tajiki Persian.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 331KB]

(2008) Rada Mihalcea & Vivi Nastase: How to add a new language on the NLP map: building resources and tools for languages with scarce resources. [Abstract of tutorial at] IJCNLP 2008: Sixth SIGHAN Workshop on Chinese Language Processing, Proceedings of the workshop, 11-12 January 2008, Hyderabad, India; p.938. [PDF, 249KB]

(2008) Michael Mohler & Rada Mihalcea: BABYLON parallel text builder: gathering parallel texts for low-density languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 63KB]

(2008) Christian Monson, Ariadna Font Llitjós, Vamshi Ambati, Lori Levin, Alon Lavie, Alison Alvarez, Roberto Aranovich, Jaime Carbonell, Robert Frederking, Erik Peterson, & Katharina Probst: Linguistic structure and bilingual informants help induce machine translation of lesser-resourced languages. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 141KB]

(2008) Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz, & Mikel L.Forcada: Using target-language information to train part-of-speech taggers for machine translation [abstract]. Machine Translation 22 (1/2), March-June 2008; pp.29-66.

(2008) Tanja Schultz: Rapid language adaptation tools and technologies for multilingual speech processing systems. [abstract of] Invited talk at: First International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU-2008), Hanoi, Vietnam, 5-7 May 2008; 1p. [PDF, 51KB]

(2008) Virach Sornlertlamvanich: Cross language resource sharing. IJCNLP 2008: Workshop on NLP for Less Privileged Languages. Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.3-4. [PDF, 37KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Chumpol Mokarat, Hammam Riza, Hitoshi Isahara, & Purev Jaimal: Synset assignment for bi-lingual dictionary with limited resource. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.673-678. [PDF, 549KB]

(2008) Kathrin Spreyer, Jonas Kuhn, & Bettina Schrader: Identification of comparable argument-head relations in parallel corpora.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 138KB]

(2008) Sebastian Stüker & Alex Waibel: Towards human translations guided language discovery for ASR systems. First International Workshop on Spoken Languages Technologies for Under-resourced languages (SLTU-2008), Hanoi University of Technology, Hanoi, Vietnam, May 5-7, 2008; pp. 76-79. [PDF, 218KB]

(2008) Vincent Vandeghinste, Peter Dirix, Ineke Schuurman, Stella Markantonatou, Sokratis Sofianopoulos, Marina Vassiliou, Olga Yannoutsou, Toni Badia, Maite Melero, Gemma Boleda, Michael Carl, & Paul Schmidt: Evaluation of a machine translation system for low resource languages: METIS-II.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 8pp. [PDF, 78KB]

(2008) Daniel Zeman & Philip Resnik: Cross-language parser adaptation between related languages. IJCNLP 2008: Workshop on NLP for Less Privileged Languages. Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.35-42. [PDF, 150KB]

(2008) Imed Zitouni & Radu Florian: Mention detection crossing the language barrier. EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.600-609. [PDF, 149KB]

(2007) Alison Alvarez, Lori Levin, Robert Frederking, & Jill Lehman: An assessment of language elicitation without the supervision of a linguist. TMI-2007: Proceedings of the 11th International Conference on Theoretical and Methodological Issues in Machine Translation, Skövde [Sweden], 7-9 September 2007; pp.1-10 [PDF, 415KB]; presentation [PDF, 412KB]

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Translating from under-resourced languages: comparing direct transfer against pivot translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.29-35 [PDF, 197KB]

(2007) Krzysztof Jassem & Tomasz Kowalski: Machine translation using scarce bilingual corpora. TASK Quarterly 11, no.1-2, 21-33. [PDF, 216KB]

(2007) Chris Quirk, Raghavendra Udupa U., & Arul Menezes: Generative models of noisy translations with applications to parallel fragment extraction. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.377-384 [PDF, 249KB]

(2007) Hua Wu & Haifeng Wang: Pivot language approach for phrase-based statistical machine translation [abstract].  Machine Translation 21 (3), September 2007; pp.165-181.

(2006) proceedings of 5th SALTMIL Workshop on Minority Languages: “Strategies for developing machine translation for minority languages”, LREC-2006: Fifth International Conference on Language Resources and Evaluation, Genoa, Italy, 23 May 2006. [PDF, 5391KB]

(2006) Andreas Eisele: Parallel corpora and phrase-based statistical machine translation for new language pairs via multiple intermediaries.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.845-848 [PDF, 329KB]

(2006) Emmanuel Giguet & Pierre-Sylvain Luquet: Multilingual lexical database generation from parallel texts in 20 European languages with endogenous resources. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.271-278. [PDF, 427KB]

(2006) Krzysztof Jassem & Tomasz Kowalski: An algorithm for extracting translation rules from scarce bilingual corpora. Proceedings of the International Multiconference on Computer Science and Information Technology, vol.1: XXII Autumn Meeting of Polish Information Processing Society, November 6-10, 2006, Wisla, Poland; pp.67-73. [PDF, 368KB]

(2006) Thai Phuong Nguyen & Akira Shimazu: Improving phrase-based statistical machine translation with morphosyntactic transformation [abstract]. Machine Translation 20 (3),2006; pp.147-166.

(2006) Ayu Purwarianti, Masatoshi Tsuchiya, & Seiichi Nakagawa: Indonesian-Japanese CLIR using only limited resource. Coling-ACL 2006: Proceedings of the workshop on How can computational linguistics improve information retrieval? Sydney, July 2006; pp.1-8. [PDF, 80KB]

(2006) Jason Riesa, Behrang Mohit, Kevin Knight, & Daniel Marcu: Building an English-Iraqi Arabic machine translation system for spoken utterances with limited resources. Interspeech 2006: ICSLP Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, September 17-21, 2006, paper 2012; abstract [PDF, 80KB]

(2006) Jason Riesa & David Yarowsky: Minimally supervised morphological segmentation with applications to machine translation. AMTA 2006: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; pp.185-192 [PDF, 613KB]

(2006) Oliver Streiter, Kevin P.Scannell, & Martin Stuflesser: Implementing NLP projects for noncentral languages: instructions for funding bodies, strategies for developers [abstract]. Machine Translation 20 (4), 2006; pp.267-289.

(2006) Vincent Vandeghinste, Ineke Schuurman, Michael Carl, Stella Markantonatou, & Toni Badia: METIS-II: machine translation for low-resource languages. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1284-1289 [PDF, 409KB]

(2006) Haifeng Wang, Hua Wu, & Zhanyi Liu: Word alignment for languages with scarce resources using bilingual corpora of other language pairs. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.874-881. [PDF, 155KB]

(2005) Youjin Chung & Jong-Hyeok Lee: Practical word-sense disambiguation using co-occurring concept codes [abstract]. Machine Translation 19 (1),2005; pp.59-82.

(2005) Hiroshi Echizen-ya, Kenji Araki, & Yoshio Momouchi: Automatic acquisition of bilingual rules for extraction of bilingual word pairs from parallel corpora. ACL-SIGLEX-2005: Workshop on Deep Lexical Acquisition, University of Michigan, Ann Arbor, 30 June 2005; pp. 87-96.  [PDF, 726KB]

(2005) Andreas Kathol, Kristin Precoda, Dimitra Vergyri, Wen Wang, & Susanne Riehemann: Speech translation for low-resource languages: the case of Pashto. Interspeech 2005 - Eurospeech: 9th European  Conference on  Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005; pp.2273-2276; abstract [PDF, 44KB]

(2005) Jonas Kuhn: Parsing word-aligned parallel corpora in a grammar induction context. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 17-25. [PDF, 150KB]

(2005) Adam Lopez & Philip Resnik: Improved HMM alignment models for languages with scarce resources.  ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 83-86. [PDF, 103KB]

(2005) Joel Martin, Rada Mihalcea, & Ted Pedersen: Word alignment for languages with scarce resources. ACL-2005: Workshop on Building and Using Parallel Texts – Data-driven machine translation and beyond, University of Michigan, Ann Arbor, 29-30 June 2005; pp. 65-74. [PDF, 170KB]

(2005) Maja Popovic & Hermann Ney: Exploiting phrasal lexica and additional morpho-syntactic language resources for statistical machine translation with scarce training data. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 212-218. [PDF, 65KB]

Software resources

(2009) Cristina Vertan: Machine (aided) translation for historical texts – an overview of current solutions. EAMT-2009 Workshop Machine Translation for Historical Languages, 13 May 2009, Barcelona; slides [PDF of PPT, 453KB]

(2008) Nicola Bertoldi: A tutorial on the IRSTLM library. Second Machine Translation Marathon, Wandlitz, Berlin, Germany, 12-17 May 2008; 34 slides. [PDF, 401KB]

Spoken language resources

(2006) Thomas Koller: [review of] Emanuela Cresti and Massimo Moneglia (eds.), C-ORAL-ROM, Integrated reference corpora for spoken Romance languages. Machine Translation 20 (4),2006; pp.297-300. [see publication]

Translation archive

(2009) Masao Utiyama, Takeshi Abekawa, Eiichiro Sumita, & Kyo Kageura: Minna no Hon’yaku: a website for hosting, archiving, and promoting translations. Translating and the Computer 31, 19-20 November 2009, London; 12pp. [PDF, 1921KB]

Treebanks (see also Semantic analysis and representation, Thesaurus method)

(2009) Ondřej Bojar & Zdeněk Žabokrtský: CzEng 0.9: large parallel treebank with rich annotation. Prague Bulletin of Mathematical Linguistics, no.92, December 2009; pp.63-83 [PDF, 217KB]

(2009) John Tinsley & Andy Way: Automatically generated parallel treebanks and their exploitability in machine translation [abstract]. Machine Translation 23 (1), February 2009; pp.1-22.

(2009) John Tinsley, Mary Hearne, & Andy Way: Exploiting parallel treebanks to improve phrase-based statistical machine translation [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 21KB]

(2009) Vincent Vandeghinste: Tree-based target language modeling. EAMT-2009: Proceedings of the 13th Annual Conference of the European Association for Machine Translation, ed. Lluís Màrquez and Harold Somers, 14-15 May 2009, Universitat Politècnica de Catalunya, Barcelona, Spain; pp.152-159. [PDF, 391KB]

(2009) Hai Zhao, Yan Song, Chunyu Kit, & Guodong Zhou: Cross language dependency parsing using a bilingual lexicon. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.55-63. [PDF, 547KB]

(2009) Ventsislav Zhechev: Unsupervised generation of parallel treebanks through sub-tree alignment. Prague Bulletin of Mathematical Linguistics 91, 2009; pp.89-98. [PDF, 478KB]  [presentation at MT Marathon 2009, 3281KB]

(2008) Colin Cherry & Chris Quirk: Discriminative, syntactic language modeling through latent SVMs. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.65-74. [PDF, 609KB]

(2008) David Mareček, Zdeněk Žabokrtský, & Václav Novák: Automatic alignment of Czech and English deep syntactic dependency trees. EAMT 2008: 12th annual conference of the European Association for Machine Translation, September 22 & 23, 2008, Hamburg, Germany. Proceedings, ed. John Hutchins and Walther v.Hahn; pp.104-113. [PDF, 586KB]

(2008) Beáta Megyesi, Bengt Dahlqvist, Eva Pettersson, & Joakim Nivre: Swedish-Turkish parallel treebank. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 80KB]

(2008) Joakim Nivre, Igor M.Boguslavsky, & Leonid L.Iomdin: Parsing the SynTagRus treebank of Russian.  Coling 2008:  22nd International Conference on Computational Linguistics, Proceedings of the conference, 18-22 August 2008, Manchester UK; pp.641-648. [PDF, 266KB]

(2008) Daniel Zeman & Philip Resnik: Cross-language parser adaptation between related languages. IJCNLP 2008: Workshop on NLP for Less Privileged Languages. Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.35-42. [PDF, 150KB]

(2007) Matthias Buch-Kromann: Computing translation units and quantifying parallelism in parallel dependency treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.69-76 [PDF, 361KB]

(2007) Martin Volk, Joakim Lundborg, & Maël Mettler: A search tool for parallel treebanks. ACL 2007: proceedings of the Linguistic Annotation Workshop, Prague, Czech Republic, 28-29 June 2007; pp.85-92 [PDF, 270KB]

(2006) Yafa Al-Raheb, A.Akrout, J. van Genabith, & J. Dichy: DCU 250 Arabic dependency bank: an LFG gold standard resource for the Arabic Penn treebank. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.105-116. [PDF, 355KB]

(2006) Sabine Buchholz & Erwin Marsi: CoNLL-X shared task on multilingual dependency parsing. CoNLL-X: Proceedings of the 10th Conference on Computational Natural Language Learning, New York City, June 2006; pp.149-164. [PDF, 158KB]

(2006) Dan Flickinger: Identifying complex phenomena in a corpus via a treebank lens. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.125-129 [PDF, 89KB]

(2006) Mohamed Maamouri, Ann Bies, & Seth Kulick: Diacritization: a challenge to Arabic treebank annotation and parsing.  The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.35-47 [PDF, 223KB]

(2006) Stephan Oepen & Jan Tore Lønning: Discriminant-based MRS banking.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.1250-1255 [PDF, 430KB]

(2006) Otakar Smrž: Tips and tricks of the Prague Arabic dependency treebank. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.25-34. [PDF]

(2006) Martin Volk, Sofia Gustafson-Capková, Joakim Lundborg, Torsten Marek, Yvonne Samuelsson, & Frida Tidström: XML-based phrase alignment in parallel treebanks. EACL-2006: Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-dimensional Markup in Natural Language Processing, April 4, 2006, Trento, Italy; pp.93-96. [PDF, 103KB]

(2005) Martin Čmejrek, Jan Cuřín, Jan Hajič, & Jiří Havelka: Prague Czech-English dependency treebank: resource for structure-based MT. 10th EAMT conference "Practical applications of machine translation", 30-31 May 2005, Budapest; pp. 73-78. [PDF, 66KB]

(2005) Martin Jansche: Treebank transfer. IWPT 2005: Proceedings of the Ninth International Workshop on Parsing Technologies, Vancouver, October 2005; pp.74-82. [PDF, 120KB]

Wikis

(2009) Alain Désilets, Louis-Philippe Huberdeau, Marc Laporte, & Jean Quirion: Building a collaborative multilingual terminology system. Translating and the Computer 31, 19-20 November 2009, London; 11pp. [PDF, 349KB]

(2008) David Calvert: Wiki behind the firewall – microscale online collaboration in a translation agency. Translating and the Computer 30, 27-28 November 2008, London; 14pp. [PDF, 200KB]

(2008) Louis-Philippe Huberdeau, Sébastien Paquet, & Alain Désilets: The cross-lingual wiki engine: enabling collaboration across language barriers. Proc. WikiSym 2008, Sept.8-10, Porto, Portugal; 14pp. [PDF, 364KB]

Wordnets (see also WordNet in index of systems)

(2009) Jinho D.Choi, Martha Palmer, & Nianwen Xue: Using parallel Propbanks to enhance word-alignments. ACL-IJCNLP 2009: Third Linguistic Annotation Workshop (LAW III), Proceedings of the workshop, 6-7 August 2009, Suntec, Singapore; pp.121-124. [PDF, 139KB]

(2009) Dmitry Davidov & Ari Rappoport: Translation and extension of concepts across languages. EACL-2009: Proceedings of the 12th Conference of the European Chapter of the ACL, Athens, Greece, 30 March – 3 April 2009; pp.175-183. [PDF, 141KB]

(2009) Miguel García, Jesús Giménez & Lluís Màrquez: Enriching statistical translation models using domain-independent multilingual lexical knowledge base [abstract]. CICLING 2009: 10th International Conference on Intelligent Text Processing and Computational Linguistics, Mexico City, Mexico, March 1-7, 2009; 1p. [PDF, 19KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Suphanut Thayaboon, Chumpol Mokarat, & Hitoshi Isahara: Enhanced tools for online collaborative language resource development. IJCNLP 2008: Sixth Workshop on Asian Language Resources, Proceedings of the workshop, 11-12 January 2008, Hyderabad, India; pp.105-106. [PDF, 310KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Chumpol Mokarat, Hammam Riza, Hitoshi Isahara, & Purev Jaimal: Synset assignment for bi-lingual dictionary with limited resource. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.673-678. [PDF, 549KB]

(2008) Virach Sornlertlamvanich, Thatsanee Charoenporn, Kergrit Robkop, & Hitoshi Isahara: KUI: self-organizing multi-lingual WordNet construction tool. GWC-2008: the Fourth Global WordNet conference, Szeged, Hungary, January 22-25, 2008; pp. 419-427. [PDF, 814KB]

(2007) Eneko Agirre, Bernardo Magnini, Oier Lopez de Lacalle, Arantxa Otegi, German Rigau, & Piek Vossen: SemEval-2007 task 01: evaluating WSD on cross-language information retrieval. ACL 2007: proceedings of the 4th International  Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.1-6 [PDF, 79KB]

(2006) Sabri Elkateb, William Black, Piek Vossen, David Farwell, Adam Pease, & Christiane Fellbaum: Arabic WordNet and the challenges of Arabic. The Challenge of Arabic for NLP/MT. International conference at the British Computer Society, London, 23 October 2006; pp.15-24. [PDF, 295KB]

(2006) Pascale Fung & Benfeng Chen: Robust word sense translation by EM learning of frame semantics. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.239-246. [PDF, 223KB]

(2006) Jesús Giménez & Lluís Màrquez: Low-cost enrichment of Spanish WordNet with automatically translated glosses: combining general and specialized models. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.287-294. [PDF, 108KB]

(2005) Palmira Marrafa: The representation of complex telic predicates in wordnets: the case of lexical-conceptual structure deficitary verbs. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.109-116 [abstract, PDF, 14KB]

(2005) Siddharth Patwardhan, Satanjeev Banerjee & Ted Pedersen: SenseRelate::TargetWord – a generalized framework for word sense disambiguation. ACL-2005: Interactive Poster and Demonstration Sessions, University of Michigan, Ann Arbor, June 2005; pp. 73-76. [PDF, 54KB]

World Wide Web [see also Internet, Semantic Web]

(2009) Caroline Barrière: The web as a source of informative background knowledge. MT Summit XII - Workshop: Beyond Translation Memories: New Tools for Translators MT, August 29, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 787KB]

(2009) Hervé Blanchon, Christian Boitet, & Cong-Phap Huynh: A web service enabling gradable post-edition of pre-translations produced by existing translation tools: practical use to provide high-quality translation of an online encyclopedia. MT Summit XII - Workshop: Beyond Translation Memories: New Tools for Translators MT, August 29, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 2435KB]

(2009) Janara Christensen, Mausam & Oren Etzioni: A rose is a roos is a ruusu: querying translations for web image search. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.193-196. [PDF, 101KB]

(2009) Dmitry Davidov & Ari Rappoport: Enhancements of lexical concepts using cross-lingual web mining. EMNLP-2009: proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Singapore, 6-7 August 2009; pp.852-861. [PDF, 162KB]

(2009) Miquel Esplà-Gomis: Bitextor, a free/open-source software to harvest translation memories from multilingual websites. MT Summit XII - Workshop: Beyond Translation Memories: New Tools for Translators MT, August 29, 2009, Ottawa, Ontario, Canada; 8pp. [PDF, 730KB]

(2009) Wei Gao, John Blitzer, Ming Zhou, & Kam-Fai Wong: Exploiting bilingual information to improve web search. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.1075-1083. [PDF]

(2009) Alon Halevy, Peter Norvig, & Fernando Pereira: The unreasonable effectiveness of data. IEEE Intelligent Systems, March-April 2009; pp.8-12. [PDF, 376KB]

(2009) Chiori Hori, Sakriani Sakti, Michael Paul, Noriyuki Kimura, Yutaka Ashikari, Ryosuke Isotani, Eiichiro Sumita, & Satoshi Nakamura: Network-based speech-to-speech translation. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; p. 168. [PDF, 286KB]; presentation [PDF of PPT, 485KB]

(2009) Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, & Qingsheng Zhu: Mining bilingual data from the web with adaptively learnt patterns. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.870-878. [PDF, 306KB]

(2009) Miguel A.Jiménez-Crespo: Conventions in localization: a corpus study of original vs. translated web texts. Journal of Specialised Translation 12 (July 2009); pp.79-102. [PDF, 285KB]

(2009) Philipp Koehn: A web-based interactive computer aided translation tool. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Software Demonstrations, Suntec, Singapore, 3 August 2009; pp.17-20. [PDF, 127KB]

(2009) Svetlin Nakov, Preslav Nakov, & Elena Paskaleva: Unsupervised extraction of false friends from parallel bi-texts using the web as a corpus.  [RANLP 2009] International conference: Recent Advances in Natural Language Processing. Proceedings ed. Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Nicolas Nicolov, Nikolai Nikolov, Borovets, Bulgaria, 14-16 September 2009; pp. 292-298. [PDF, 235KB]

(2009) Ondřej Odcházal & Ondřej Bojar: Computer-aided translation backed by machine translation. Translating and the Computer 31, 19-20 November 2009, London; 8pp. [PDF, 143KB]

(2009) Saverio Perrino: User-generated translation: the future of translation in a Web 2.0 environment. Journal of Specialised Translation 12 (July 2009); pp.55-78. [PDF, 244KB]

(2009) Feiliang Ren, Muhua Zhu, Huizhen Wang, & Jingbo Zhu: Chinese-English organization name translation based on correlative expansion. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.143-151. [PDF, 123KB]

(2009) Victor M.Sánchez-Cartagena & Juan Antonio Pérez-Ortiz: An open-source highly scalable web service architecture for the Apertium machine translation engine. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, 2-3 November 2009, Universitat d’Alacant, Alacant, Spain; ed. Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Francis M.Tyers; pp. 51-58. [PDF, 240KB]

(2009) Kentaro Torisawa: Monolingual knowledge acquisition and a multilingual information environment – invited talk. IWSLT 2009: Proceedings of the International Workshop on Spoken Language Translation, National Museum of Emerging Science and Innovation, Tokyo, Japan, December 1-2, 2009; 43pp. [PDF of PPT, 7552KB]

(2009) Marco Trombetti: Creating the world's largest translation memory. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 9-16. [PDF of PPT presentation, 707KB]

(2009) Masao Utiyama, Daisuke Kawahara, Keiji Yasuda & Eiichiro Sumita: Mining parallel texts from mixed-language web pages. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp.152-159. [PDF, 136KB]

(2009) Masao Utiyama, Takeshi Abekawa, Eiichiro Sumita, & Kyo Kageura: Minna no Hon’yaku: a website for hosting, archiving, and promoting translations. Translating and the Computer 31, 19-20 November 2009, London; 12pp. [PDF, 1921KB]

(2009) Fan Yang, Jun Zhao, & Kang Liu: A Chinese-English organization name translation system using heuristic web mining and asymmetric alignment. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Suntec, Singapore, 2-7 August 2009; pp.387-395. [PDF, 178KB]

(2009) Yuejie Zhang, Yang Wang, & Xiangyang Xue: English-Chinese bi-directional OOV translation based on web mining and supervised learning. [ACL-IJCNLP-2009] Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP, Short Papers, Suntec, Singapore, 4 August 2009; pp.129-132. [PDF, 325KB]

(2009) Yilu Zhou: Maximum n-gram HMM-based name transliteration: experiment in NEWS 2009 on English-Chinese corpus. [ACL-IJCNLP-2009] Proceedings of the 2009 Named Entities Workshop ACL-IJCNLP 2009, Suntec, Singapore, 7 August 2009; pp.128-131. [PDF, 244KB]

(2009) Qibo Zhu, Diana Inkpen & Ash Asudeh: Inducing translations from officially published materials in Canadian government websites. MT Summit XII: proceedings of the twelfth Machine Translation Summit, August 26-30, 2009, Ottawa, Ontario, Canada; pp. 176-183. [PDF, 167KB]

(2008) Alain Désilets, Benoit Farley, Marta Stojanovic, & Geneviève Patenaude: WeBiText: building large heterogeneous translation memories from parallel web content. Translating and the Computer 30, 27-28 November 2008, London; 11pp. [PDF, 470KB]

(2008) Reginald Hobbs, Jamal Laoudi, & Clare R.Voss: MTriage: web-enabled software for the creation, machine translation, and annotation of smart documents.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 7pp. [PDF, 433KB]

(2008) Roderick Holland & Brendan Keyes: ClipperRSS: a light-weight prototype for the cross-language exploitation of syndicated feeds. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.390-393. [PDF, 661KB]

(2008) HaiXiang Huang & Atsushi Fujii: Effects of related term extraction in transliteration into Chinese. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.643-648. [PDF, 374KB]

(2008) Jin-Shea Kuo, Haizhou Li, & Chih-Lung Lin: Mining transliterations from web query results: an incremental approach. IJCNLP 2008: Sixth SIGHAN Workshop on Chinese Language Processing, Proceedings of the workshop, 11-12 January 2008, Hyderabad, India; pp.16-23. [PDF, 400KB]

(2008) Jin-Shea Kuo & Haizhou Li: Multi-view co-training of transliteration model. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.373-380. [PDF, 465KB]

(2008) Bo Li & Juan Liu: Mining Chinese-English parallel corpora from the web. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.847-852. [PDF, 457KB]

(2008) Dekang Lin, Shaojun Zhao, Benjamin Van Durme, & Marius Paşca: Mining parenthetical translations from the web by word alignment. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 994-1002. [PDF, 419KB]

(2008) Qing Ma, Nakao Koichi, Masaki Murata, & Hitoshi Isahara: Selection of Japanese-English equivalents by integrating high-quality corpora and huge amounts of web data.  LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 6pp. [PDF, 79KB]

(2008) Jong-Hoon Oh & Hitoshi Isahara: Hypothesis selection in machine transliteration: a web mining approach. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.233-240. [PDF, 461KB]

(2008) Achim Ruopp & Fei Xia: Finding parallel texts on the web using cross-language information retrieval.  IJCNLP 2008: 2nd International Workshop on Cross-Lingual Information Access (CLIA) Proceedings of the workshop, 11 January 2008, Hyderabad, India; pp.18-25. [PDF, 239KB]

(2008) Lei Shi & Ming Zhou: Improved sentence alignment on parallel web pages using a stochastic tree alignment model.  EMNLP 2008: Proceedings of  the 2008 Conference on Empirical Methods in Natural Language Processing, 25-27 October 2008, Honolulu, Hawaii, USA; pp.505-513. [PDF, 483KB]

(2008) Rohini U, Vamshi Ambati, & Vasudev Varma: Statistical machine translation models for personalized search. IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.521-528. [PDF, 432KB]

(2008) Yu-Chun Wang, Richard Tzong-Han Tsai, & Wen-Lian Hsu: Learning patterns from the web to translate named entities for cross language information retrieval.  IJCNLP 2008: Third International Joint Conference on Natural Language Processing, January 7-12, 2008, Hyderabad, India; pp.281-288. [PDF, 464KB]

(2008) Yorick Wilks: On whose shoulders? Computational Linguistics 34(4), pp. 471-486. [PDF, 113KB]

(2008) Fai Wong & Kwok Kit Leung: The design of web based machine translation server based on grid infrastructure. The Fourth International Conference on Networked Computing and Advanced Information Management (NCM2008), Gyeongju, Korea, 2008; pp.713-718. [PDF, 226KB]

(2008) Jian-Cheng Wu, Peter Wei-Huai Hsu, Chiung-Hui Tseng, & Jason S. Chang: Mining the web for domain-specific translations. AMTA-2008. MT at work: Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas, Waikiki, Hawai’i, 21-25 October 2008; pp.212-221. [PDF, 750B]

(2008) Fan Yang, Jun Zhao, Bo Zou, Kang Liu, & Feifan Liu: Chinese-English backward transliteration assisted with mining monolingual web pages. ACL-08: HLT. 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the conference, June 15-20, 2008, The Ohio State University, Columbus, Ohio, USA; pp. 541-549. [PDF, 593KB]

(2007) Guihong Cao, Jianfeng Gao, & Jian-Yun Nie: A system to mine large-scale bilingual dictionaries from monolingual web pages. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.57-64 [PDF, 539KB]

(2007) Ignacio Garcia: Power shifts in web-based translation memory [abstract]. Machine Translation 21 (1), March 2007; pp.55-68.

(2007) Federico Gaspari: The role of online MT in webpage translation. PhD thesis, University of Manchester, [June] 2007; 317pp. [PDF, 8342KB]

(2007) Federico Gaspari & Harold Somers: Using free online MT in multilingual websites. Tutorial at MT Summit XI, 10 September 2007, Copenhagen, Denmark; abstract, 1p. [PDF, 75KB]

(2007) Long Jiang, Ming Zhou, Lee-Feng Chien & Cheng Niu: Named entity translation with web mining and transliteration. IJCAI-07: Twentieth International Joint conference on Artificial Intelligence, Hyderabad, India, 6-12 January 2007; pp.1629-1634. [PDF, 491KB]

(2007) Peng Yuan Liu, TieJun Zhao, & Mu Yun Yang: HIT-WSD: using search engine for multilingual Chinese-English lexical sample task. ACL 2007: proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23-24 June 2007; pp.160-172 [PDF, 113KB]

(2007) Chengye Lu, Yue Xu, & Shlomo Geva: Improving translation accuracy in web-based translation extraction. Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan; pp.31-35. [PDF, 94KB]

(2007) Masaki Murata, Jong-Hoon Oh, Qing Ma & Hitoshi Isahara: Applying multiple characteristics and techniques in the NICT information retrieval system in NTCIR-6. Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan; pp.97-104. [PDF, 164KB]

(2007) Patrizia Pierini: Quality in web translation: an investigation into UK and Italian tourism web sites. Journal of Specialised Translation 8 (July 2007); pp.85-103. [PDF, 329KB]

(2007) Tanja Schultz, Alan W.Black, Sameer Badaskar, Matthew Hornyak, & John Kominek: SPICE: web-based tools for rapid language adaptation in speech processing systems. Interspeech 2007: 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007; pp.2125-2128; abstract [PDF, 23KB]

(2007) Harold Somers: Machine translation and the World Wide Web. In: Khurshid Ahmad, Christopher Brewster, & Mark Stevenson (eds.) Words and intelligence II: essays in honor of Yorick Wilks (Dordrecht: Springer); pp.209-233.

(2007) Jian-Cheng Wu & Jason S.Chang: Learning to find English to Chinese transliterations on the web. EMNLP-CoNLL-2007: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, June 28-30, 2007, Prague, Czech Republic; pp. 858-857. [PDF, 187KB]

(2007) Chien-Cheng Wu & Jason S.Chang: Learning to find transliteration on the Web. NAACL-HLT-2007 Human Language Technology: the conference of the North American Chapter of the Association for Computational Linguistics, 22-27 April 2007, Rochester, NY; Demonstration Program, pp.21-22 [PDF, 1621KB]

(2007) Dong Zhou, Mark Truran, Tim Brailsford & Helen Ashman: NTCIR-6 experiments using pattern matched translation extraction. Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan; pp.145-151. [PDF, 130KB]

(2007) Qibo Zhu, Diana Inkpen & Ash Asudeh: Automatic extraction of translations from web-based bilingual materials [abstract]. Machine Translation 21 (3), September 2007; pp.139-163.

(2006) Marco Baroni, Adam Kilgarriff, Jan Pomikálek, & Pavel Rychlý: WebBootCaT: instant domain-specific corpora to support human translators. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.247-252 [PDF, 191KB]

(2006) Youcef Bey, Christian Boitet, & Kyo Kageura: The TRANSBey prototype: an online collaborative Wiki-based CAT environment for volunteer translators.  LREC-2006: Fifth International Conference on Language Resources and Evaluation. Third International Workshop on Language Resources for Translation Work, Research & Training (LR4Trans-III), Genoa, Italy, 28 May 2006; pp.49-54. [PDF, 670KB]

(2006) Chris Brockett, William B.Dolan, & Michael Gamon: Correcting ESL errors using phrasal SMT techniques.  Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.249-256. [PDF, 109KB]

(2006) Roldano Cattoni, Nicola Bertoldi, Mauro Cettolo, Boxing Chen, & Marcello Federico: A web-based demonstrator of a multi-lingual phrase-based translation system. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Posters and demonstrations, Trento, Italy, April 5-6, 2006; pp.91-94 [PDF, 184KB]

(2006) Conrad Chen & Hsin-Hsi Chen: A high-accurate Chinese-English NE backward translation system combining both lexical information and web statistics.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.81-88. [PDF, 313KB]

(2006) Matthias Eck, Stephan Vogel, & Alex Waibel: A flexible online server for machine translation evaluation. EAMT-2006: 11th Annual Conference of the European Association for Machine Translation, June 19-20, 2006, Oslo, Norway. Proceedings; pp.89-94 [PDF, 111KB]

(2006) Gaolin Fang, Hao Yu, & Fumihito Nishino: Chinese-English term translation mining based on semantic prediction.  Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.199-206. [PDF, 240KB]

(2006) Jin-Shea Kuo, Haizhou Li, & Ying-Kuei Yang: Learning transliteration lexicons from the Web. Coling-ACL 2006: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, 17-21 July 2006; pp.1129-1136. [PDF, 428KB]

(2006) Michael C. McCord: MT for social impact (contribution to panel on “Machine translation for social impact”).  AMTA 2006: 7th Conference of the Association for Machine Translation in the Americas, “Visions for the Future of Machine Translation”, August 8-12, 2006, Cambridge, Massachusetts, USA; [PDF of PPT presentation, 1022KB]

(2006) Jonathan Pool: Can controlled languages scale to the Web? CLAW 2006: 5th International Workshop on Controlled Language Applications, Cambridge, MA, USA, August 12, 2006; 12pp. [PDF, 136KB]

(2006) Rolf Schwitter & Marc Tilbrook: Writing RSS feeds in a machine-processable controlled natural language. CLAW 2006: 5th International Workshop on Controlled Language Applications, Cambridge, MA, USA, August 12, 2006; 12pp. [PDF, 132KB]

(2005) Lumar Bértoli Jr., Rodolfo Pinto da Luz, & Rogério Cid Bastos: A WEB platform using UNL: CELTA’s showcase. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.276-285 [abstract, PDF, 15KB]

(2005) Carme Colominas: BancTrad: a web interface for integrated access to annotated corpora. International workshop: Modern approaches in translation technologies, Borovets, Bulgaria, 24 September 2005; pp.7-8 [PDF, 114KB]

(2005) John Fry: Assembling a parallel corpus from RSS news feeds. MT Summit X, Phuket, Thailand, September 16, 2005, Proceedings of Second Workshop on Example-Based Machine Translation; pp.59-62. [PDF, 303KB]

(2005) Federico Gaspari: Embedding free online machine translation into monolingual websites for multilingual dissemination: a case study of implementation. Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 25pp. [PDF, 1255KB]

(2005) Najeh Hajlaoui & Christian Boitet: A “pivot” XML-based architecture for multilingual, multiversion documents: parallel monolingual documents aligned through a central correspondence descriptor and possible use of UNL. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.309-325 [abstract, PDF, 85KB]

(2005) Fei Huang, Ying Zhang, & Stephan Vogel: Mining key phrase translations from web corpora. HLT-EMNLP-2005: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, October 2005; pp. 483-490. [PDF, 310KB]

(2005) Chunyu Kit, Xiaoyue Liu, KingKui Siu, & Jonathan J.Webster: Harvesting the bitexts of the laws of Hong Kong from the web. IJCNLP-05: Fifth Workshop on Asian Language Resources (ALR-05). Proceedings of the workshop, 14 October 2005, Jeju Island, Korea; pp. 71-78. [PDF, 665KB]

(2005) Jinming Min, Le Sun, & Junlin Zhang: ISCAS in English-Chinese CLIR at NTCIR-5.  Proceedings of NTCIR-5 Workshop Meeting, December 6-9, 2005, Tokyo, Japan; 7pp. [PDF, 277KB]

(2005) Reinhard Schäler: Reverse localisation. Translating and the Computer 27: proceedings of the Twenty-seventh International Conference on Translating and the Computer, 24-25 November 2005, London. (London: Aslib, 2005); 15pp. [PDF, 125KB]; presentation [PDF, 2986KB]

(2005) Masatsugu Tonoike, Mitsuhiro Kida, Toshihiro Takagi, Yasuhiro Sasaki, Takehito Utsuro, & Satoshi Sato: Effect of domain-specific corpus in compositional translation estimation for technical terms. IJCNLP-05: Second International Joint Conference on Natural Language Processing, 11-13 October 2005, Jeju Island, Republic of Korea; pp.114-119. [PDF, 193KB]

(2005) Wang-Ju Tsai: A platform for experimenting with UNL. In: Jesús Cardeñosa, Alexander Gelbukh, Edmundo Tovar (eds.): Universal Networking Language: advances in theory and applications (Mexico City: National Polytechnic Institute); pp.261-267 [abstract, PDF, 15KB]

(2005) Sophie Vandeputte: Multilingual websites: the European Schoolnet’s approach. In: Pascaline Merten (ed.) La traduction à l’heure de la localisation: outils, méthodes et formation. Equivalences no.32/1, 2005; pp. 79-91. [PDF, 140KB]

(2005) Jian-Cheng Wu, Tracy Lin, & Jason S.Chang: Learning source-target surface patterns for web-based terminology translation. ACL-2005: Interactive Poster and Demonstration Sessions, University of Michigan, Ann Arbor, June 2005; pp. 37-40. [PDF, 153KB]

(2005) Ying Zhang & Phil Vines: Using the web for translation disambiguation: RMIT University at NTCIR-5 Chinese-English CLIR.  Proceedings of NTCIR-5 Workshop Meeting, December 6-9, 2005, Tokyo, Japan; 6pp. [PDF, 394KB]