Machine Translation Archive

Index of data, corpora and resources

Publications 2000-2004

Bilingual corpora [see also Comparable corpora, Example-based methods, Multilingual corpora]

(2004) proceedings of Workshop: The amazing utility of parallel and comparable corpora. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 25 May 2004. [PDF, 2226KB]

(2004) Michael Barlow: Parallel concordancing and translation. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 11pp. [PDF, 84KB]

(2004) Robert S.Belvin, Win May, Shrikanth Narayanan, Panayiotis Georgiou, & Shadi Ganjavi: Creation of a doctor-patient dialogue corpus using standardized patients. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.187-190. [PDF, 480KB]

(2004) Indrajit Bhattacharya, Lise Getoor, & Yoshua Bengio: Unsupervised sense disambiguation using bilingual probabilistic models.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 287-294. [PDF, 166KB]

(2004) Michael Carl, Ecaterina Rascu, & Johann Haller: Using weighted abduction to align term variant translations in bilingual texts. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1973-1976. [PDF, 294KB]

(2004) Chen Benfeng & Pascale Fung: Automatic construction of an English-Chinese bilingual FrameNet.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA – Short Papers; pp. 29-32. [PDF, 185KB]

(2004) Luisa Bentivogli, Pamela Forner, & Emanuele Pianta: Evaluating cross-language annotation transfer in the MultiSemCor corpus. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 66KB]

(2004) Thomas C.Chuang, Jian-Cheng Wu, Tracy Lin, Wen-chie Shei & Jason S.Chang: Bilingual sentence alignment based on punctuation statistics and lexicon. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.224-232. [abstract]

(2004) Jan Cuřin, Martin Čmejrek, Jiří Havelka & Vladislav Kuboň: Building a parallel bilingual syntactically annotated corpus. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.168-176. [abstract]

(2004) Yuan Ding & Martha Palmer: Automatic learning of parallel dependency treelet pairs. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.233-243. [abstract]

(2004) Sanae Fujita & Francis Bond: A method of creating new bilingual valency entries using alternations.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.47-54. [PDF, 176KB]

(2004) Pascale Fung & Percy Cheung: Mining very-non-parallel corpora: parallel sentence and lexicon extraction via bootstrapping and EM. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 276KB]

(2004) Tamás Grőbler, Gábor Hodász, & Balázs Kis: MetaMorpho TM: a rule-based translation corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.339-342. [PDF, 390KB]

(2004) Mihoko Kitamura & Yuji Matsumoto: Practical translation pattern acquisition from combined language resources. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.244-253. [abstract]

(2004) Michael Kluck: Evaluation of cross-language information retrieval using the domain-specific GIRT data as parallel German-English corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1343-1346. [PDF, 1533KB]

(2004) Jonas Kuhn: Experiments in parallel-text based grammar induction. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.470-477. [PDF, 99KB]

(2004) Tadashi Kumano, Hideki Kashioka, Hideki Tanaka & Takahiro Fukusima: Acquiring bilingual named entity translations from content-aligned corpora. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.177-186. [abstract]

(2004) Alon Lavie, Katharina Probst, Erik Peterson, Stephan Vogel, Lori Levin, Ariadna Font-Llitjos, & Jaime Carbonell: A trainable transfer-based MT approach for languages with limited resources. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp. 116-123. [PDF, 265KB]

(2004) Hang Li & Cong Li: Word translation disambiguation using bilingual bootstrapping. Computational Linguistics 30 (1), pp. 1-22. [PDF, 2311KB]

(2004) Li Weigang, Liu Ting, Wang Zhen, & Li Sheng: Aligning bilingual corpora using sentences location information. [ACL-2004] Proceedings of the Third SIGHAN Workshop on Chinese Language Processing, 25 July 2004, Barcelona; 7pp. [PDF, 100KB]

(2004) Tracy Lin, Jian-Cheng Wu, & Jason S. Chang: Extraction of name and transliteration in monolingual and parallel corpora. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 177-186. [go to publisher details]

(2004) Francisco Nevado, Francisco Casacuberta, & Josu Landa: Translation memories enrichment by statistical bilingual segmentation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.335-338. [PDF, 354KB]

(2004) Sylwia Ozdowska: Identifying correspondences between words: an approach based on a bilingual syntactic analysis of French/English parallel corpora. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.55-62. [PDF, 213KB]

(2004) Chris Pike & I.Dan Melamed: An automatic filter for non-parallel texts. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 128KB]

(2004) Monica Rogati & Yiming Yang: Customizing parallel corpora at the document level. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 80KB]

(2004) Keita Tsuji & Kyo Kageura: Extracting low-frequency translation pairs from Japanese-English bilingual corpora. Coling 2004, CompuTerm 2004: 3rd International Workshop on Computational Technology, Proceedings of the Workshop, 29th August 2004, Geneva, Switzerland; 8pp. [PDF, 178KB]

(2004) Dan Tufiş, Radu Ion, & Nancy Ide: Fine-grained word sense disambiguation based on parallel corpora, word alignment, word clustering and aligned wordnets. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 192KB]

(2004) Agnès Tutin, Meriam Haddara, Ruslan Mitkov, & Constantin Orasan: Annotation of anaphoric expressions in an aligned bilingual corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.267-270. [PDF, 524KB]

(2004) Martin Volk & Yvonne Samuelsson: Bootstrapping parallel treebanks. Coling 2004: Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora, August 29th 2004, Geneva, Switzerland; 7pp. [PDF, 192KB]

(2004) Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen, & Liang-Chih Yu: Automated alignment and extraction of bilingual ontology for cross-language domain-specific applications. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 196KB]

(2003) Eiji Aramaki, Sadao Kurohashi, Hideki Kashioka, & Hideki Tanaka: Word selection for EBMT based on monolingual similarity and translation confidence. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 1565KB]

(2003) Chris Callison-Burch & Miles Osborne: Bootstrapping parallel corpora. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 83KB]

(2003) Katri A. Clodfelder: An LSA implementation against parallel texts in French and English. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 119KB]

(2003) Pernilla Danielsson: Units of meaning in translation – how to make real use of corpus evidence. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 15pp. [PDF, 71KB]

(2003) Dinh Dien & Hoang Kiem: POS-tagger for English Vietnamese bilingual corpus. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 202KB]

(2003) Yuan Ding, Daniel Gildea, & Martha Palmer: An algorithm for word-level alignment of parallel dependency trees. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 242KB]

(2003) Takao Doi, Eiichiro Sumita, & Hirofumi Yamamoto: Adaptation using out-of-domain corpus within EBMT. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 3pp. [PDF, 33KB]

 (2003) Federico Gaspari: Relevance of parallel corpora to the latest developments of machine translation and computer-assisted translation. International Journal of Translation 15 (1), Jan-June 2003; pp.27-41. [PDF, 77KB]

(2003) Fei Huang, Stephan Vogel, & Alex Waibel: Automatic extraction of named entity translingual equivalence based on multi-feature cost minimization. ACL-2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 262KB]

(2003) Kenji Imamura, Eiichiro Sumita, & Yuji Matsumoto: Feedback cleaning of machine translation rules using automatic evaluation. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 62KB]

(2003) Kenji Imamura, Eiichiro Sumita & Yuji Matsumoto: Automatic construction of machine translation knowledge using translation literalness. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.155-162 [PDF, 397KB]

(2003) Genichiro Kikui, Eiichiro Sumita, Toshiyuki Takezawa, & Seiichi Yamamoto: Creating corpora for speech-to-speech translation.  Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.381-384; abstract [PDF, 34KB]

(2003) Tadashi Kumano, Hideki Kashioka, Hideki Tanaka, & Takahiro Fukusima: Construction and analysis of Japanese-English broadcast news corpus with named entity tags. ACL-2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 56KB]

(2003) Qing Ma, Yujie Zhang, Masaki Murata, & Hitoshi Isahara: Semantic maps for word alignment in bilingual parallel corpora. ACL-2003: Second SIGHAN Workshop on Chinese Language Processing, July 11-12, 2003, Sapporo, Japan; 6pp. [PDF, 199KB]

(2003) Sara Laviosa: Corpora and the translator. In: Harold Somers (ed.) Computers and translation: a translator’s guide (Amsterdam/Philadelphia: John Benjamins Publishing Company, 2003); pp.105-117.

(2003) Elliott Macklovitch: On the pleasure of being bi-textual; or My life in parallel text. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF from PPT, 301KB]

(2003) Lluís Màrquez, Adrià de Gispert, Xavier Carreras, & Lluís Padró: Low-cost named entity classification for Catalan: exploiting multilingual resources and unlabeled data. ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 80KB]

(2003) Joel Martin, Howard Johnson, Benoit Farley, and Anna Maclachlan: Aligning and using an English-Inuktitut parallel corpus. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 30KB]

(2003) Robert C.Moore: Learning translations of named-entity phrases from parallel corpora. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.259-266 [PDF, 377KB]

(2003) Francisco Nevado, Francisco Casacuberta, & Enrique Vidal: Parallel corpora segmentation using anchor words. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 33-40 [PDF, 382KB]

(2003) Hwee Tou Ng, Bin Wang, & Yee Seng Chan: Exploiting parallel texts for word sense disambiguation: an empirical study. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 364KB]

(2003) Stephen Nightingale & Hideki Tanaka: Comparing the sentence alignment yield from two news corpora using a dictionary-based alignment system. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 142KB]

(2003) Daniel Ortíz, Ismael García-Varea, Francisco Casacuberta, Antonio Lagarda, & Jorge González: On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA). MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.299-306. [PDF, 88KB]

(2003) Katharina Probst: Using ‘smart’ bilingual projection to feature-tag a monolingual dictionary. HLT-NAACL 2003: proceedings of Seventh Conference on Natural Language Learning, May 27 – June 1, 2003, Edmonton, Canada; 8pp. [PDF, 108KB]

(2003) Philip Resnik & Noah A. Smith: The web as a parallel corpus. Computational Linguistics 29 (3), pp.349-380. [PDF, 8130KB]

(2003) Lee Schwartz, Takako Aikawa, & Chris Quirk: Disambiguation of English PP attachment using multilingual aligned data. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.330-337. [PDF, 98KB]

(2003) Toshiyuki Takezawa & Genichiro Kikui: Collecting machine-translation-aided bilingual dialogues for corpus-based speech translation. Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.2757-2760; abstract [PDF, 34KB]

(2003) Stephan Vogel: Using noisy bilingual data for statistical machine translation. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.175-178 [PDF, 194KB]

(2003) Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan, Diana Steffen, & Paul Buitelaar: Unsupervised monolingual and bilingual word-sense disambiguation of medical documents using UMLS. ACL 2003 Workshop on Natural Language Processing in Biomedicine, July 2003, Sapporo, Japan; 8pp. [PDF, 124KB]

(2003) Hua Wu & Ming Zhou: Optimizing synonym extraction using monolingual and bilingual resources. ACL 2003 International Workshop on Paraphrasing, July 11, 2003, Sapporo, Japan; 8pp. [PDF, 142KB]

(2003) Hua Wu & Ming Zhou: Synonymous collocation extraction using translation information. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 139KB]

(2003) Jian-Cheng Wu, Kevin C.Yeh, Thomas C.Chuang, Wen-Chi Shei, & Jason S.Chang: TotalRecall: a bilingual concordance for computer assisted translation and language learning. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 76KB]

(2003) Kaoru Yamamoto, Taku Kudo, Yuta Tsuboi, & Yuji Matsumoto: Learning sequence-to-sequence correspondences from parallel corpora via sequential pattern mining. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 101KB]

(2003) Bing Zhao & Stephan Vogel: Word alignment based on bilingual bracketing. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 56KB]

(2003) Bing Zhao, Klaus Zechner, Stephan Vogel, & Alex Waibel: Efficient optimization for bilingual sentence alignment based on linear regression. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 93KB]

(2002) proceedings of the Workshop: Language resources for translation work and research, LREC-2002: Third International Conference on Language Resources and Evaluation, Las Palmas Canary Islands, 27 May 2002. [PDF, 653KB]

(2002) Mosleh H.Al-Adhaileh, Tang Enya Kong, & Zaharin Yusoff: A synchronization structure of SSTC and its applications in machine translation; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 8pp. [PDF, 272KB]

(2002) Yaser Al-Onaizan & Kevin Knight: Translating named entities using monolingual and bilingual resources; ACL-2002: 40th Annual meeting of the Association for Computational Linguistics, July 2002, Philadelphia, USA; pp.400-408 [PDF, 186KB]

(2002) Toni Badia, Gemma Boleda, Carme Colominas, Agnès González, Mireia Garmendia, Martí Quixal: BancTrad: a web interface for integrated access to parallel annotated corpora. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.15-19. [PDF, 106KB]

(2002) Michael Barlow: ParaConc: concordance software for multilingual parallel corpora. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.20-24. [PDF, 97KB]

(2002) Ralf Brown: Corpus-driven splitting of compound words. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [PDF, 95KB]

(2002) Yves Champollion: Automated translation: the next frontier.  Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 7pp. [PDF, 47KB]; presentation:  10pp. [PDF, 44KB]

(2002) Baobao Chang, Pernilla Danielsson, & Wolfgang Teubert: Extraction of translation unit from Chinese-English parallel corpora; Coling-2002: First SIGHAN Workshop on Chinese Language Processing, 1 September 2002, Taipei, Taiwan; 5pp. [PDF, 157KB]

(2002) Chang Baobao, Zhang Huarui, Yu Shiwen, & Kang Shiyong: Bilingual corpus construction and its management for Chinese-English machine translation. In: Chan Sin-wai (ed.) Translation and Information Technology (Hong Kong: Chinese University Press, 2002); pp.31-41.

(2002) Dien Dinh: Building a training corpus for word sense disambiguation in English-to-Vietnamese machine translation; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 281KB]

(2002) Mathieu Guidère: Towards a corpus-based machine translation for standard Arabic. Translation Journal 6 (1), January 2002; 12pp. [PDF, 173KB]

(2002) Nancy Ide, Tomaž Erjavec, & Dan Tufiş: Sense discrimination with parallel corpora. ACL-2002 SIGLEX/SENSEVAL workshop on Word Sense Disambiguation "Recent successes and future directions", 11 July 2002, Philadelphia, USA; pp. 54-60 [PDF, 387KB]

(2002) Takahiro Ikeda, Shinichi Ando, Kenji Satoh, Akitoshi Okumura, & Takao Watanabe: Automatic interpretation system integrating free-style sentence translation and parallel text based translation; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 85-92 [PDF, 240KB]

(2002) Kenji Imamura & Eiichiro Sumita: Bilingual corpus cleaning focusing on translation literality. ICSLP 2002, Interspeech 2002: 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1713-1716; abstract [PDF, 47KB]

(2002) Shankar Kumar & William Byrne: Minimum Bayes-risk word alignments of bilingual texts. EMNLP-2002: Proceedings of the 2002 conference on Empirical Methods in Natural Language Processing, July 2002, Philadelphia, USA; pp.140-147 [PDF, 263KB]

(2002) Shigeki Matsubara, Akira Takagi, Nobuo Kawaguchi, & Yasuyoshi Inagaki: Bilingual spoken monologue corpus for simultaneous machine interpretation research. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.153-159. [PDF, 108KB]

(2002) Robert C. Moore: Fast and accurate sentence alignment of bilingual corpora. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 135-144. [go to publisher details]

(2002) Masaki Murata, Masao Utiyama, Kiyotaka Uchimoto, Qing Ma and Hitoshi Isahara: Correction of errors in a modality corpus used for machine translation using machine-learning. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp.125-134. [PDF, 106KB]

(2002) Jessie Pinkham & Martine Smets: Modular MT with a learned bilingual dictionary: rapid deployment of a new language pair. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 59KB]

(2002) Andrei Popescu-Belis, Margaret King, & Houcine Benantar: Towards a corpus of corrected human translations. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Machine translation evaluation: human evaluators meet automated metrics, Las Palmas Canary Islands, 27 May 2002; pp.17-21. [PDF, 50KB]

(2002) Magnus Sahlgren, Preben Hansen & Jussi Karlgren: English-Japanese cross-lingual query expansion using random indexing of aligned bilingual text data. NTCIR Workshop 3: Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, Automatic Text Summarization and Question Answering, October 8-10, 2002, Tokyo, Japan; 5pp. [PDF, 82KB]

(2002) Deepak Sharma, K.Vikram, Manav Ratan Mital, Amitabha Mukerjee & Achla M.Raina: An integrated discourse semantic model for bilingual corpora. ICUKL-2002: International Conference on Universal Knowledge and Language, 25th-29th November 2002, Goa, India, organised by UNDL Foundation and Indian Institute of Technology Bombay; 16pp. [PDF, 645KB]

(2002) Noah A.Smith: From words to corpora: recognizing translation. EMNLP-2002: Proceedings of the 2002 conference on Empirical Methods in Natural Language Processing, July 2002, Philadelphia, USA; pp.95-102 [PDF, 303KB]

(2002) Le Sun, Song Xue, Weimin Qu, Xiaofeng Wang, & Yufang Sun: Constructing a large-scale Chinese-English parallel corpus. Coling-2002: Third Workshop on Asian Language resources and International Standardization, 31 August 2002, Taipei, Taiwan; 8pp. [PDF, 359KB]

(2002) Toshiyuki Takezawa, Eiichiro Sumita, Fumiaki Sugaya, Hirofumi Yamamoto, & Seiichi Yamamoto: Toward a broad-coverage bilingual corpus for speech translation of travel conversations in the real world. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.147-152. [PDF, 216KB]

(2002) Keita Tsuji, Beatrice Daille, & Kyo Kageura: Extracting French-Japanese word pairs from bilingual corpora based on transliteration rules. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.499-502. [PDF, 120KB]

(2002) Taro Watanabe, Mitsuo Shimohata, & Eiichiro Sumita: Statistical machine translation on paraphrased corpora. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1954-1957. [PDF, 133KB]

(2002) Setsuo Yamada, Kenji Imamura and Kazuhide Yamamoto: Corpus-assisted expansion of manual MT knowledge. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp.198-208. [PDF, 111KB]

(2001) Michael Carl: Inducing probabilistic invertible translation grammars from aligned texts. ACL-EACL 2001 Workshop on Computational Natural Language Learning (CoNLL), July 2001, Toulouse, France; 7pp. [PDF, 160KB]

(2001) Tatsuya Izuha: Machine translation using bilingual term entries extracted from parallel texts. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 169-173 [PDF, 200KB]

(2001) F.Jelinek, W.Byrne, S.Khudanpur, B.Hladká, H.Ney, F.J.Och, J.Cuřin, J.Psutka: Robust knowledge discovery from parallel speech and text sources. HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 3pp. [PDF, 44KB]

(2001) Philippe Langlais, George Foster & Guy Lapalme: Integrating bilingual lexicons in a probabilistic translation assistant. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.197-202. [PDF, 95KB]

(2001) Benoit Lavoie, Michael White, & Tanya Korelsky: Inducing lexico-structural transfer rules from parsed bi-texts. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.17-24. [PDF, 60KB]

(2001) Gideon S.Mann & David Yarowsky: Multipath translation lexicon induction via bridge languages.  [NAACL-2001] Language Technologies 2001: the Second meeting of the North American Chapter of the Association for Computational Linguistics, Carnegie Mellon University, Pittsburgh, PA, 2-7 June 2001; 8pp. [PDF, 181KB]

(2001) Arul Menezes & Stephen D. Richardson: A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Workshop on Example-Based Machine Translation. [PDF, 72KB]

(2001) Arul Menezes & Stephen D.Richardson: A best-first alignment algorithm for automatic extraction of transfer mappings from bilingual corpora. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.39-46. [PDF, 84KB]

(2001) Robert C.Moore: Towards a simple and accurate statistical approach to learning translation relationships among words. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.79-86. [PDF, 55KB]

(2001) Sonja Nießen & Hermann Ney: Toward hierarchical models for statistical machine translation of inflected languages. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.47-54. [PDF, 62KB]

(2001) Jessie Pinkham, Monica Corston-Oliver, Martine Smets & Martine Pettenaro: Rapid assembly of a large-scale French-English MT system. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.277-281. [PDF, 180KB]

(2001) Philip Resnik: [review of] Parallel text processing: alignment and use of translation corpora [ed. by] Jean Véronis. Computational Linguistics 27 (4), pp.592-595. [PDF, 333KB]

(2001) Jean Senellart, Mirko Plitt, Christophe Bailly & Françoise Cardoso: Resource alignment for machine translation or implicit transfer. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.317-323. [PDF, 112KB]

(2001) Kaoru Yamamoto, Yuji Matsumoto, & Mihoko Kitamura: A comparative study on translation units for bilingual lexicon extraction. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.87-94. [PDF, 91KB]

(2001) Keiji Yasuda, Fumiaki Sugaya, Toshiyuki Takezawa, Seiichi Yamamoto & Masuzo Yanagida: An automatic evaluation method of translation quality using translation answer candidates queried from a parallel corpus. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.373-378. [PDF, 193KB]

(2001) Ying Zhang, Ralf Brown, Robert Frederking & Alon Lavie: Pre-processing of bilingual corpora for Mandarin-English EBMT.  MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.385-390. [PDF, 2374KB]

(2000) Yaser Al-Onaizan, Ulrich Germann, Ulf Hermjakob, Kevin Knight, Philip Koehn, Daniel Marcu, & Kenji Yamada: Translating with scarce resources. 17th National conference of the American Association for Artificial Intelligence (AAAI 2000), July 30 - August 3, 2000, Austin, Texas. [PDF, 166KB]

 (2000) Ingeborg Blank: Terminology extraction from parallel technical texts [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 237-252.

(2000) Lars Borin: You'll take the high road and I'll take the low road: using a third language to improve bilingual word alignment. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 97-103 [PDF, 534KB]

(2000) Michael Carl: Extracting invertible translations from pre-aligned texts.  LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 6pp. [PDF, 321KB]

(2000) Arantza Casillas, Joseba Abaitua, & Raquel Martínez: Recycling annotated parallel corpora for bilingual document composition. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.117-126. [go to publisher details]

(2000) Arantza Casillas, Joseba Abaitua & Raquel Martínez: DTD-driven bilingual document generation. INLG’2000 : Proceedings of the First International Conference on Natural Language Generation, Mitzpe Ramon, Israel, 12-16 June 2000; pp.32-38. [PDF, 586KB]

(2000) David Chambers: Automatic bilingual terminology extraction: a practical approach. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 19pp.  [PDF, 383KB]

(2000) Jiang Chen & Jian-Yun Nie: Automatic construction of parallel English-Chinese corpus for cross-language information retrieval.  ANLP-NAACL-2000: proceedings of the Sixth conference on Applied Natural Language Processing and 1st Meeting of the North American Chapter of the Association for Computational Linguistics, April 29 – May 4, 2000, Seattle, Washington; pp.7-12. [PDF, 753KB]

(2000) Pernilla Danielsson & Katarina Mühlenbock: Small but efficient: the misconception of high-frequency words in Scandinavian translation. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.158-168. [go to publisher details]

(2000) Tomaž Erjavec: Slovene-English datasets for MT. Fifth EAMT Workshop "Harvesting existing resources", May 11 - 12, 2000, Ljubljana, Slovenia. [PDF, 104KB]

 (2000) Pascale Fung: A statistical view on bilingual lexicon extraction: from parallel corpora to non-parallel corpora [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 219-236.

(2000) Zhao-Ming Gao: Automatic acquisition of a high-precision translation lexicon from parallel Chinese-English corpora. In: IAI Working Paper no.36, 2000; 12pp. [PDF, 193KB]

(2000) Jin-Xia Huang & Key-Sun Choi: Chinese-Korean word alignment based on linguistic comparison. ACL-2000: 38th Annual meeting of the Association for Computational Linguistics, Hong Kong, October 2000. [PDF, 208KB]

 (2000) Hitoshi Isahara & Masahiko Haruno: Japanese-English aligned bilingual corpora [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 313-334.

(2000) Zheng Jie & Mao Yuhang: A word sense disambiguation model using bilingual corpus. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 517-521. [PDF, 85KB]

(2000) Olivier Kraif: Evaluation of statistical tools for automatic extraction of lexical correspondences between parallel texts. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1579KB]

(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]

(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Word alignment of English-Chinese bilingual corpus based on chunks. EMNLP-2000: Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 7-8 October 2000, Hong Kong; pp. 110-116 [PDF, 509KB]

(2000) Elliott Macklovitch: Two types of translation memory. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 15pp.  [PDF, 198KB]

(2000) Bernardo Magnini & Carlo Strapparava: Experiments in word domain disambiguation for parallel texts. ACL 2000 Workshop on Word Senses and Multi-linguality, Hong Kong, October 2000; pp. 27-33 [PDF, 605KB]

(2000) Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufmann, & Stanley Peters: A bootstrapping method for extracting bilingual text pairs. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 1066-1070 [PDF, 464KB]

(2000) Masumi Narita: A corpus-based English language assistant to Japanese software engineers. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1844KB]

 (2000) John Nerbonne: Parallel texts in computer-assisted language learning [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 299-311.

(2000) Uwe Reinke: Towards a closer integration of termbases, translation memories and parallel corpora: a translation-oriented view. In: IAI Working Paper no.36, 2000; 14pp. [PDF, 216KB]

(2000) António Ribeiro, Gabriel Lopes, & João Mexia: Using confidence bands for parallel texts alignment. ACL-2000: 38th Annual meeting of the Association for Computational Linguistics, Hong Kong, October 2000. [PDF, 260KB]

 (2000) Sukhdave Singh, Tony McEnery, & Paul Baker: Building a parallel corpus of English/Panjabi [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 335-346.

(2000) Jörg Tiedemann: Extracting phrasal terms using bitext. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 7pp. [PDF, 160KB]

(2000) Tamás Váradi: Lexical and translation equivalence in parallel corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 539-543. [PDF, 756KB]

(2000) Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000.

(2000) Špela Vintar: Extracting terminological collocations from a parallel corpus. Fifth EAMT Workshop "Harvesting existing resources", May 11 - 12, 2000, Ljubljana, Slovenia; pp.21-30. [PDF, 132KB]

(2000) Jean Véronis: From the Rosetta stone to the information society: a survey of parallel text processing [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp.1-24. 

(2000) S.Vogel & H.Ney: Construction of a hierarchical translation memory. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 1131-1135 [PDF, 396KB]

(2000) Hideo Watanabe, Sadao Kurohashi, & Eiji Aramaki: Finding structural correspondences from bilingual parsed corpus for corpus-based translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 906-912 [PDF, 701KB]

(2000) Kaoru Yamamoto & Yuji Matsumoto: Acquisition of phrase-level bilingual correspondence using dependency structure. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 933-939 [PDF, 592KB]

Bi-text see Bilingual corpora

Comparable corpora

(2004) Proceedings of Workshop: The amazing utility of parallel and comparable corpora. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 25 May 2004. [PDF, 2226KB]

(2004) Pascale Fung & Percy Cheung: Multi-level bootstrapping for extracting parallel sentences from a quasi-comparable corpus.  Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 245KB]

(2004) E. Gaussier, J.-M.Renders, I.Matveeva, C.Goutte, & H.Déjean: A geometric view on bilingual lexicon extraction from comparable corpora.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.526-533. [PDF, 122KB]

(2004) Hiroyuki Kaji: Adapted seed lexicon and combined bidirectional similarity measures for translation equivalent extraction from comparable corpora; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.115-124. [PDF, 378KB]

(2004) Dragos Stefan Munteanu, Alexander Fraser, & Daniel Marcu: Improved machine translation performance via parallel sentence extraction from comparable corpora.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; pp. 265-272. [PDF, 1125KB]

(2004) Li Shao & Hwee Tou Ng: Mining new word translations from comparable corpora. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 305KB]

(2003) Fatiha Sadat, Masatoshi Yoshikawa, & Shunsuke Uemura: Bilingual terminology acquisition from comparable corpora and phrasal translation to cross-language information retrieval. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 61KB]

(2003) Fatiha Sadat, Masatoshi Yoshikawa, & Shunsuke Uemura: Learning bilingual translations from comparable corpora to cross-language information retrieval: hybrid statistics-based and linguistics-based approach. IRAL 2003: Sixth International Workshop on Information Retrieval with Asian Languages, July 7, 2003, Sapporo, Japan; 8pp. [PDF, 122KB]

(2003) Takehito Utsuro, Takashi Horiuchi, Kohei Hino, Takeshi Hamamoto & Takeaki Nakayama: Effect of cross-language IR in bilingual lexical acquisition from comparable corpora. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.355-362 [PDF, 734KB]

(2002) Yun-Chuang Chiao & Pierre Zweigenbaum: Looking for candidate translational equivalents in specialized, comparable corpora. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 82KB]

 (2000) Pascale Fung: A statistical view on bilingual lexicon extraction: from parallel corpora to non-parallel corpora [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 219-236.

(2000) Hiroshi Nakagawa: Disambiguation of lexical translations based on bilingual comparable corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 6pp. [PDF, 100KB]

(2000) Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000.

(2000) Jean Véronis: From the Rosetta stone to the information society: a survey of parallel text processing [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp.1-24. 

 (2000) Dekai Wu: Bracketing and aligning words and constituents in parallel text using stochastic inversion transduction grammars [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 139-167.

Concordances

(2004) Michael Barlow: Parallel concordancing and translation. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 11pp. [PDF, 84KB]

 (2004) Lynne Bowker & Michael Barlow: Bilingual concordancers and translation memories: a comparative evaluation. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp.70-79. [PDF, 88KB]

(2003) Pernilla Danielsson: Units of meaning in translation – how to make real use of corpus evidence. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 15pp. [PDF, 71KB]

(2003) Jian-Cheng Wu, Kevin C.Yeh, Thomas C.Chuang, Wen-Chi Shei, & Jason S.Chang: TotalRecall: a bilingual concordance for computer assisted translation and language learning. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 76KB]

(2002) Michael Barlow: ParaConc: concordance software for multilingual parallel corpora. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.20-24. [PDF, 97KB]

(2000) Elliott Macklovitch: Two types of translation memory. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 15pp.  [PDF, 198KB]

Data elicitation

(2002) Katharina Probst and Lori Levin: Challenges in automated elicitation of a controlled bilingual corpus. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp.157-167. [PDF, 173KB]

(2002) Fumiaki Sugaya, Toshiyuki Takezawa, Genichiro Kikui, & Seiichi Yamamoto: Proposal of a very-large-corpus acquisition method by cell-formed registration. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.326-328. [PDF, 32KB]

(2001) Katharina Probst, Ralf Brown, Jaime Carbonell, Alon Lavie, Lori Levin & Erik Peterson: Design and implementation of controlled elicitation for machine translation of low-density languages. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Towards a Road Map for MT. [PDF, 90KB]

Domain identification

(2000) Bernardo Magnini & Carlo Strapparava: Experiments in word domain disambiguation for parallel texts. ACL 2000 Workshop on Word Senses and Multi-linguality, Hong Kong, October 2000; pp. 27-33 [PDF, 605KB]

Domain restriction, adaptation and specification

(2004) Victoria Arranz, Elisabet Comelles, David Farwell, Climent Nadeu, Jaume Padrell, Albert Febrer, Dorcas Alexander, & Kay Peterson: A speech-to-speech translation system for Catalan, Spanish, and English. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 7-16. [go to publisher details]

(2004) Necip Fazil Ayan, Bonnie J. Dorr, & Nizar Habash: Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 17-26. [go to publisher details]

(2004) Luisa Bentivogli, Pamela Forner, Bernardo Magnini, & Emanuele Pianta: Revising the Wordnet Domains hierarchy: semantics, coverage and balancing. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.101-108. [PDF, 80KB]

(2004) Hervé Blanchon: HLT modules scalability within the NESPOLE! Project. Interspeech 2004 – ICSLP 8th International  Conference on  Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004; pp.1641-1644; abstract [PDF, 72KB]

(2004) Violetta Cavalli-Sforza, Jaime G. Carbonell, & Peter J. Jansen: Developing language resources for a transnational digital government system. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.945-948. [PDF, 271KB]

(2004) Matthias Eck, Stephan Vogel, & Alex Waibel: Improving statistical machine translation in the medical domain using the Unified Medical Language System. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 81KB]

(2004) Matthias Eck, Stephan Vogel, & Alex Waibel: Language model adaptation for statistical machine translation based on information retrieval. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.327-330. [PDF, 291KB]

(2004) Michael Kluck: Evaluation of cross-language information retrieval using the domain-specific GIRT data as parallel German-English corpus. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1343-1346. [PDF, 1533KB]

(2004) Kristin Precoda, Horacio Franco, Ascander Dost, Michael Frandsen, John Fry, Andreas Kathol, Colleen Ritchie, Susanne Riehemann, Dimitra Vergyri, & Jing Zheng: Limited-domain speech-to-speech translation between English and Pashto. HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; demonstration paper, 4pp. [PDF, 57KB]

(2004) Manny Rayner, Pierrette Bouillon, Beth Ann Hockey, Nikos Chatzichrisafis & Marianne Starlander: Comparing rule-based and statistical approaches to speech understanding in a limited domain speech translation system; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.21-29. [PDF, 86KB]

(2004) Stephen D. Richardson: Machine translation of online product support articles using data-driven MT system. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 246-251. [go to publisher details]

(2004) Tanja Schultz, Dorcas Alexander, Alan W.Black, Kay Peterson, Sinaporn Suebvisai, & Alex Waibel: A Thai speech translation system for medical dialogs. HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA; demonstration paper, 2pp. [PDF, 122KB]

(2004) Mitsuo Shimohata, Eiichiro Sumita & Yuji Matsumoto: Method for retrieving a similar sentence and its application to machine translation; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.105-114. [PDF, 112KB]

(2004) Toshiyuki Takezawa & Genichiro Kikui: A comparative study on human communication behaviors and linguistic characteristics for speech-to-speech translation.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1589-1592. [PDF, 585KB]

(2004) Hua Wu & Haifeng Wang: Improving domain-specific word alignment with a general bilingual corpus. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 262-271. [go to publisher details]

(2004) Wu Hua & Wang Haifeng: Improving domain-specific word alignment for computer assisted translation. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 221KB]

(2004) Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen, & Liang-Chih Yu: Automated alignment and extraction of bilingual ontology for cross-language domain-specific applications. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 196KB]

(2003) Necip Fazil Ayan, Bonnie J.Dorr, & Okan Kolak: Evaluation techniques applied to domain tuning of MT lexicons. "Towards systematizing MT evaluation": a workshop on machine translation evaluation at the MT Summit IX, New Orleans, USA, 27 September 2003; pp.3-11. [PDF, 328KB]

(2003) Takao Doi, Eiichiro Sumita, & Hirofumi Yamamoto: Adaptation using out-of-domain corpus within EBMT. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; 3pp. [PDF, 33KB]

(2003) Lori Levin, Chad Langley, Alon Lavie, Donna Gates, Dorcas Wallace, & Kay Peterson: Domain specific speech acts for spoken language translation. Proceedings of 4th SIGdial Workshop on Discourse and Dialogue (SIGDIAL-2003), Sapporo, Japan, July 2003; 10pp. [PDF, 103KB]

(2003) Manny Rayner, Pierrette Bouillon, Vol Van Dalsem III, Hitoshi Isahara, Kyoko Kanzaki, & Beth Ann Hockey: A limited-domain English to Japanese medical speech translator built using REGULUS 2. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 41KB]

(2002) Chan Sin-wai: The making of TransRecipe: a translational approach to the machine translation of Chinese cookbooks. In: Chan Sin-wai (ed.) Translation and Information Technology (Hong Kong: Chinese University Press, 2002); pp.3-22.

(2002) Robert E. Frederking, Alan W. Black, Ralf D. Brown, Alexander Rudnicky, John Moody, & Eric Steinbrecher: Speech translation on a tight budget without enough data; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 77-84 [PDF, 156KB]

(2002) Adria de Gispert & José B.Mariño: Using x-grams for speech-to-speech translation. ICSLP 2002, Interspeech 2002:7th International Conference on  Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1885-1888; abstract [PDF, 65KB]

(2002) Benoit Lavoie, Michael White, & Tanya Korelsky: Learning domain-specific transfer rules: an experiment with Korean to English translation; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 43KB]

(2002) Solange Rossato, Hervé Blanchon, & Laurent Besacier: Speech-to-speech translation system evaluation: results for French for the NESPOLE! project first showcase.  ICSLP 2002, Interspeech 2002:7th International Conference on  Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1905-1908; abstract [PDF, 68KB]

(2002) Nestor Rychtyckyj: An assessment of machine translation for vehicle assembly process planning at Ford Motor Company. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 207-215. [go to publisher details]

(2002) David Stallard, Premkumar Natarajan, Mohammed Noamany, Richard Schwartz, & John Makhoul: Design for a speech-to-speech translator for field use. ICSLP 2002, Interspeech 2002:7th International Conference on  Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1705-1708; abstract [PDF, 46KB]

(2001) Srinivas Bangalore & Giuseppe Riccardi: A finite-state approach to machine translation. [NAACL-2001] Language Technologies 2001: the Second meeting of the North American Chapter of the Association for Computational Linguistics, Carnegie Mellon University, Pittsburgh, PA, 2-7 June 2001; 8pp. [PDF, 210KB]

(2001) Alon Lavie, Chad Langley, Alex Waibel, Fabio Pianesi, Gianni Lazzari, Paolo Coletti, Loredana Taddei, & Franco Balducci: Architecture and design considerations in NESPOLE!: a speech translation system for e-commerce applications.  HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 4pp. [PDF, 51KB]

(2001) Alon Lavie, Lori Levin, Tanja Schultz, Chad Langley, Benjamin Han, Alicia Tribble, Donna Gates, Dorcas Wallace, & Kay Peterson: Domain portability in speech-to-speech translation.  HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 5pp. [PDF, 58KB]

(2001) Jessie Pinkham & Monica Corston-Oliver: Adding domain specificity to an MT system. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.103-110. [PDF, 73KB]

(2000) Srinivas Bangalore & Giuseppe Riccardi: Stochastic finite-state models for spoken language machine translation; ANLP/NAACL 2000 workshop: Embedded machine translation systems, May 4, 2000, Seattle, Washington, [USA]; pp.52-59. [PDF, 638KB]

(2000) Hans Ulrich Block, Stefanie Schachtl, & Manfred Gehrke: Adapting a large scale MT system for spoken language. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 394-410. [abstract]

(2000) Alexander Franz, Keiko Horiguchi, Lei Duan, Doris Ecker, Eugene Koontz, & Kazami Uchida: An integrated architecture for example-based machine translation. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 1031-1035 [PDF, 369KB]

(2000) Estela Saquete & Patricio Martínez-Barco: Grammar specification for the recognition of temporal expressions. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 7pp. [PDF, 1397KB]

(2000) Sayori Shimohata: An empirical method for identifying and translating technical terminology. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 782-788 [PDF, 513KB]

(2000) John Weisgerber, Jin Yang, & Pete Fisher: Pacific Rim portable translator. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.196-201. [go to publisher details]

Knowledge representation see Ontologies

Language resources (see also Bilingual corpora, Lexical resources, Multilingual corpora)

(2004) Violetta Cavalli-Sforza, Jaime G. Carbonell, & Peter J. Jansen: Developing language resources for a transnational digital government system. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.945-948. [PDF, 271KB]

(2004) M.Gavrilidou, P.Labropoulou, E.Desipri, V.Giouli, V.Antonopoulos, & S.Piperidis: Building parallel corpora for eContent professionals.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.97-100. [PDF, 34KB]

(2004) Bente Maegaard: NEMLAR – an Arabic language resources project. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.109-112. [PDF, 279KB]

(2004) Bente Maegaard: The NEMLAR project on Arabic language resources. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp.124-128. [PDF, 123KB]

(2004) Reinhard Schäler: Language resources and localisation. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp. 27-35. [PDF, 174KB]

(2004) Gregor Thurmair: Multilingual content processing. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.xii-xvi. [PDF, 391KB]

(2004) Darinka Verdonik, Matej Rojc, & Zdravko Kačič: Creating Slovenian language resources for development of speech-to-speech translation components.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1399-1402. [PDF, 753KB]

(2004) Cristina Vertan: Language resources for the Semantic Web – perspectives for machine translation. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp. 37-41. [PDF, 232KB]

(2004) Elia Yuste: Corporate language resources in multilingual content creation, maintenance and leverage. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp. 9-15. [PDF, 71KB]

(2003) Woosung Kim & Sanjeev Khudanpur: Cross-lingual lexical triggers in statistical language modeling. EMNLP-2003: proceedings of the 2003 conference on Empirical Methods in Natural Language Processing, a meeting of SIGDAT, a special interest group of the ACL, held in conjunction with ACL-03, 11-12 July 2003, Sapporo, Japan; 8pp. [PDF, 96KB]

(2002) Ruvan Weerasinghe: Bootstrapping the lexicon building process for machine translation between 'new' languages. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 177-186. [go to publisher details]

 (2000) Jeff Allen: The ELRA language resources survey: languages needed. International Journal for Language and Documentation 5, June 2000; pp.41-42. [PDF, 545KB]

 (2000) Susanne Burger, Karl Weilhammer, Florian Schiel, &  Hans G. Tillmann: Verbmobil data collection and annotation. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 537-549. [abstract]

 (2000) Jaro Lajovic: EAMT 2000 Workshop looks at resources for MT development.  In: MT News International no.25, Summer 2000. [PDF]

(2000) Svetlana Sheremetyeva & Sergei Nirenburg: Towards a universal tool for NLP resource acquisition. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 761-768. [PDF, 80KB]

Lexical acquisition see Lexical resources

Lexical resources and lexical acquisition (see also Terminology and MT in index of applications)

(2004) Chen Benfeng & Pascale Fung: Automatic construction of an English-Chinese bilingual FrameNet.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA – Short Papers; pp. 29-32. [PDF, 185KB]

(2004) Luisa Bentivogli, Pamela Forner, Bernardo Magnini, & Emanuele Pianta: Revising the Wordnet Domains hierarchy: semantics, coverage and balancing. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.101-108. [PDF, 80KB]

(2004) Igor Boguslavsky, Leonid Iomdin, & Victor Sizov: Multilinguality in ETAP-3: reuse of lexical resources. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.7-14. [PDF, 582KB]

(2004) James Breen: JMdict: a Japanese-multilingual dictionary. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.71-78. [PDF, 191KB]

(2004) Pu-Jen Cheng, Yi-Cheng Pan, Wen-Hsiang Lu, & Lee-Feng Chien: Creating multilingual translation lexicons with regional variations using web corpora.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 534-541. [PDF, 283KB]

(2004) Martin Čmejrek, Jan Cuřin, Jiří Havelka, Jan Hajič, & Vladislav Kuboň: Prague Czech-English dependency treebank: syntactically annotated resources for machine translation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1597-1600. [PDF, 292KB]

(2004) Sabri Elkateb & Bill Black: English-Arabic dictionary for translators.  Proceedings of the 7th Annual Research Colloquium of the UK Special Interest Group for Computational Linguistics [CLUK-2004], Birmingham, UK, 6-7 January 2004; 6pp. [PDF, 200KB]

(2004) Georges Fafiotte: Building and sharing multilingual speech resources, using ERIM generic platforms. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.39-46. [PDF, 327KB]

(2004) Hanne Fersøe, Elviira Hartikainen, Henk van den Heuvel, Giulio Maltese, Asuncion Moreno, Shaunie Shammass, & Ute Ziegenhain: Creation and validation of large lexica for speech-to-speech translation purposes. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1431-1434. [PDF, 701KB]

(2004) Atsushi Fujii, Tetsuya Ishikawa, & Jong-Hyeok Lee: Term extraction from Korean corpora via Japanese. Coling 2004, CompuTerm 2004: 3rd International Workshop on Computational Terminology, Proceedings of the Workshop, 29th August 2004, Geneva, Switzerland; 8pp. [PDF, 107KB]

(2004) Sanae Fujita & Francis Bond: An automatic method of creating valency entries using plain bilingual dictionaries; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.55-64. [PDF, 147KB]

(2004) Sanae Fujita & Francis Bond: A method of creating new bilingual valency entries using alternations.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.47-54. [PDF, 176KB]

(2004) Pascale Fung & Percy Cheung: Mining very-non-parallel corpora: parallel sentence and lexicon extraction via bootstrapping and EM. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 276KB]

(2004) Pascale Fung & Benfeng Chen: BiFrameNet: bilingual frame semantics resource construction by cross-lingual induction. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 432KB]

(2004) E. Gaussier, J.-M.Renders, I.Matveeva, C.Goutte, & H.Déjean: A geometric view on bilingual lexicon extraction from comparable corpora.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.526-533. [PDF, 122KB]

 (2004) Lorena Guerra Martínez: PROMT Professional and the importance of building and updating machine translation dictionaries. International Journal of Translation 16 (1), Jan-June 2004; pp.121-139. [PDF, 254KB]

(2004) Le An Ha: Co-training applied in automatic term extraction. Proceedings of the 7th Annual Research Colloquium of the UK Special Interest Group for Computational Linguistics [CLUK-2004], Birmingham, UK, 6-7 January 2004; 7pp. [PDF, 115KB]

(2004) Krzysztof Jassem: Applying Oxford-PWN English-Polish dictionary to machine translation. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp.98-105. [PDF, 209KB]

(2004) Hiroyuki Kaji: Adapted seed lexicon and combined bidirectional similarity measures for translation equivalent extraction from comparable corpora; TMI-2004: proceedings of the Tenth Conference on Theoretical and Methodological Issues in Machine Translation, October 4-6, 2004, Baltimore, Maryland, USA; pp.115-124. [PDF, 378KB]

(2004) Alon Lavie, Katharina Probst, Erik Peterson, Stephan Vogel, Lori Levin, Ariadna Font-Llitjos, & Jaime Carbonell: A trainable transfer-based MT approach for languages with limited resources. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp. 116-123. [PDF, 265KB]

(2004) Kyonghee Paik, Satoshi Shirai, & Hiromi Nakaiwa: Automatic construction of a transfer dictionary considering directionality. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.31-38. [PDF, 183KB]

(2004) Emanuele Pianta & Luisa Bentivogli: Knowledge intensive word alignment with KNOWA. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 70KB]

(2004) Catarina Ribeiro, Ricardo Santos, Rui Pedro Chaves, & Palmira Marrafa: Semi-automatic UNL dictionary generation using WordNet.PT.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.279-282. [PDF, 344KB]

(2004) Anna Samiotou, Lambros Kranias, George Papadopoulos, Marita Asunmaa, & Gudrun Magnusdottir: Exploitation of parallel texts for populating MT & TM databases. LREC-2004. Workshop, 25th May 2004: The amazing utility of parallel and comparable corpora; pp. 1-4. [PDF, 323KB]

(2004) Charles Schafer & David Yarowsky: Exploiting aggregate properties of bilingual dictionaries for distinguishing senses of English words and inducing English sense clusters. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 199KB]

(2004) Gilles Sérasset: A generic collaborative platform for multilingual lexical database development.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.79-86. [PDF, 503KB]

(2004) Li Shao & Hwee Tou Ng: Mining new word translations from comparable corpora. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 305KB]

(2004) Aree Teeraparbseree: Qualitative evaluation of automatically calculated acception based MLDB.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.23-30. [PDF, 261KB]

(2004) Keita Tsuji & Kyo Kageura: Extracting low-frequency translation pairs from Japanese-English bilingual corpora. Coling 2004, CompuTerm 2004: 3rd International Workshop on Computational Terminology, Proceedings of the Workshop, 29th August 2004, Geneva, Switzerland; 8pp. [PDF, 178KB]

(2004) Stephan Vogel & Christian Monson: Augmenting manual dictionaries for statistical machine translation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1593-1596. [PDF, 264KB]

(2003) Necip Fazil Ayan, Bonnie J.Dorr, & Okan Kolak: Evaluation techniques applied to domain tuning of MT lexicons. "Towards systematizing MT evaluation": a workshop on machine translation evaluation at the MT Summit IX, New Orleans, USA, 27 September 2003; pp.3-11. [PDF, 442KB]

(2003) Ondřej Bojar: Towards automatic extraction of verb frames. Prague Bulletin of Mathematical Linguistics, no.79/80, 2003; 21pp. [PDF, 390KB]

(2003) Francis Bond & Sanae Fujita: Evaluation of a method of creating new valency entries. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 108KB]

(2003) David Conejero, Jesus Gimenez, Victoria Arranz, Antonio Bonafonte, Neus Pascual, Nuria Castell, & Asunción Moreno: Lexica and corpora for speech-to-speech translation: a trilingual approach. Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.1593-1596; abstract [PDF, 34KB]

(2003) Joseph Dichy & Ali Farghaly: Roots & patterns vs. stems plus grammar-lexis specifications: on what basis should a multilingual database centred on Arabic be built? MT Summit IX -- workshop: Machine translation for Semitic languages, New Orleans, USA, 23 September 2003 [PDF, 258KB]

(2003) Limin Du & Boxing Chen: Automatic extraction of bilingual chunk lexicon for spoken language translation. Eurospeech 2003 - Interspeech 2003 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.2333-2336; abstract [PDF, 34KB]

(2003) Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi, & Koji Tochinai: Effectiveness of automatic extraction of bilingual collocations using recursive chain-link-type learning. MT Summit IX, New Orleans, USA, 23-27 September 2003 [PDF, 687KB]

(2003) Ali Farghaly & Jean Senellart: Intuitive coding of the Arabic lexicon. MT Summit IX -- workshop: Machine translation for Semitic languages, New Orleans, USA, 23 September 2003 [PDF, 168KB]

(2003) Kenji Imamura, Eiichiro Sumita & Yuji Matsumoto: Automatic construction of machine translation knowledge using translation literalness. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.155-162 [PDF, 397KB]

(2003) Hyungsuk Ji, Sabine Ploux, & Eric Wehrli: Lexical knowledge representation with contextonyms. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.194-201. [PDF, 187KB]

(2003) Hiroyuki Kaji: Word sense acquisition from bilingual comparable corpora. HLT-NAACL 2003: conference combining Human Language Technology conference series and the North American Chapter of the Association for Computational Linguistics conference series, May 27 – June 1, 2003, Edmonton, Canada; pp.32-39 [PDF, 339KB]

(2003) Burcu Karagol-Ayan, David Doermann, & Bonnie J. Dorr: Acquisition of bilingual MT lexicons from OCRed dictionaries. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.208-215. [PDF, 285KB]

(2003) Rob Koeling, Adam Kilgarriff, David Tugwell, & Roger Evans: An evaluation of a lexicographer's workbench: building lexicons for machine translation. 7th EAMT Workshop, "Improving machine translation through other language technology tools", 13 April 2003, Budapest, Hungary; pp. 9-16 [PDF, 250KB]

(2003) Brigitte Orliac & Mike Dillinger: Collocation extraction for machine translation. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.292-298. [PDF, 203KB]

(2003) Katharina Probst: Using ‘smart’ bilingual projection to feature-tag a monolingual dictionary. HLT-NAACL 2003: proceedings of Seventh Conference on Natural Language Learning, May 27 – June 1, 2003, Edmonton, Canada; 8pp. [PDF, 108KB]

(2003) Jean Senellart, Jin Yang, & Anabel Rebollo: SYSTRAN intuitive coding technology. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.346-353. [PDF, 445KB]

(2003) Gregor Thurmair: Making term extraction tools usable. Controlled language translation, EAMT-CLAW-03, Dublin City University, 15-17 May 2003; pp.170-179. [PDF, 213KB]

(2003) Takehito Utsuro, Takashi Horiuchi, Kohei Hino, Takeshi Hamamoto & Takeaki Nakayama: Effect of cross-language IR in bilingual lexical acquisition from comparable corpora. EACL 2003: 10th Conference of the European Chapter of the Association for Computational Linguistics, April 12-17, 2003, Budapest, Hungary. Proceedings; pp.355-362 [PDF, 734KB]

(2003) Hua Wu & Ming Zhou: Synonymous collocation extraction using translation information. ACL-2003: 41st Annual meeting of the Association for Computational Linguistics, July 7-12, 2003, Sapporo, Japan. [PDF, 139KB]

(2003) Rémi Zajac, Elke Lange, & Jin Yang: Customizing complex lexical entries for high-quality MT. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.433-438. [PDF, 277KB]

(2002) Timothy Baldwin and Francis Bond: Alternation-based lexicon reconstruction. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [PDF, 187KB]

(2002) Hans C. Boas: Bilingual FrameNet dictionaries for machine translation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1364-1371. [PDF, 123KB]

(2002) Christian Boitet, Mathieu Mangeot, & Gilles Sérasset: The Papillon project: cooperatively building a multilingual lexical data-base to derive open source dictionaries & lexicons. Coling-2002: Second Workshop on NLP and XML (NLPXML-2002), August 2002, Taipei, Taiwan; 3pp. [PDF, 163KB]

(2002) Igor Boguslavsky: Some lexical issues of UNL. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: First international workshop on UNL, other interlinguas and their applications, Las Palmas Canary Islands, 27 May 2002; pp.19-22. [PDF, 146KB]

(2002) Rafael C.Carrasco & Mikel L.Forcada: Incremental construction and maintenance of minimal finite-state automata. Computational Linguistics 28 (2), pp. 207-216 [PDF, 103KB]

(2002) Baobao Chang, Pernilla Danielsson, & Wolfgang Teubert: Extraction of translation unit from Chinese-English parallel corpora; Coling-2002: First SIGHAN Workshop on Chinese Language Processing, 1 September 2002, Taipei, Taiwan; 5pp. [PDF, 157KB]

(2002) Lawrence Cheung, Tom Lai, Robert Luk, Oi Yee Kwong, King Kui Sin, & Benjamin K.Tsou: Some considerations on guidelines for bilingual alignment and terminology extraction; Coling-2002: First SIGHAN Workshop on Chinese Language Processing, 1 September 2002, Taipei, Taiwan; 5pp. [PDF, 177KB]

(2002) Yun-Chuang Chiao & Pierre Zweigenbaum: Looking for candidate translational equivalents in specialized, comparable corpora. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 82KB]

(2002) Claudia Gdaniec & Esmé Manandise: Using word formation rules to extend MT lexicons. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 64-73. [go to publisher details]

(2002) Sanae Fujita and Francis Bond: A method of adding new entries to a valency dictionary by exploiting existing lexical resources. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [PDF, 179KB]

(2002) Sanae Fujita & Francis Bond: Extending the coverage of a valency dictionary; Coling-2002 workshop "Machine translation in Asia", 1 September 2002, Taipei, Taiwan; 7pp. [PDF, 180KB]

(2002) Barbara Gawronska, Björn Erlendsson and Hanna Duczak: Extracting semantic classes and morphosyntactic features for English-Polish machine translation. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [PDF, 252KB]

(2002) Lee Gillam, Khurshid Ahmad, David Dulby, & Christopher Cox: Knowledge exchange and terminology interchange: the role of standards. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 22pp. [PDF, 277KB]; presentation by Lee Gillam [with minor defects], 40 slides [PDF of PPT, 389KB]

(2002) Mutsumi Imai, Etsuko Haryu and Hiroyuki Okada: Building up the lexicon: how Japanese children learn meanings to novel nouns and verbs. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002. [slides, PDF, 1189KB]

(2002) Kenji Imamura & Eiichiro Sumita: Bilingual corpus cleaning focusing on translation literality. ICSLP 2002, Interspeech 2002: 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA; pp.1713-1716; abstract [PDF, 47KB]

(2002) Hiroshi Kanayama: An iterative algorithm for translation acquisition of adpositions. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002; pp.85-95. [PDF, 268KB]

(2002) Genichiro Kikui & Hirofumi Yamamoto: Finding translation pairs from English-Japanese untokenised aligned corpora; ACL-2002 workshop "Speech-to-speech translation", 11 July 2002, Philadelphia, USA; pp. 23-30 [PDF, 2510KB]

(2002) Philipp Koehn & Kevin Knight: Learning a translation lexicon from monolingual corpora. Unsupervised Lexical Acquisition: Proceedings of the Workshop of the ACL Special Interest Group on the Lexicon (SIGLEX), July 2002, Philadelphia, Pennsylvania, USA; pp.9-16 [PDF, 187KB]

(2002) Natalie Kübler: Creating a term base to customise an MT system: reusability of resources and tools from the translator’s point of view. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: Language resources for translation work and research, Las Palmas Canary Islands, 27 May 2002; pp.44-48. [PDF, 31KB]

(2002) Oi Yee Kwong, Benjamin K.Tsou, Tom B.Y.Lai, Robert W.P.Luk, Lawrence Y.L.Cheung, & Francis C.Y.Chik: Alignment and extraction of bilingual legal terminology from context profiles; Coling-2002: Second international workshop on computational terminology (COMPUTERM 2002), 31 August 2002, Taipei, Taiwan; 7pp. [PDF, 224KB]

(2002) Hyo-Kyung Lee: Classification approach to word selection in machine translation. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 114-123. [go to publisher details]

(2002) Teruko Mitamura, Eric Nyberg, Kathy Baker, Peter Cramer, Jeongwoo Ko, David Svoboda, & Michael Duggan: The KANTOO MT system: controlled language checker and lexical maintenance tool. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 244-247. [go to publisher details]

(2002) Gábor Prószéky, Mátyás Naszódi & Balázs Kis: Recognition assistance: treating errors in text acquired from various recognition processes. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 63KB]

(2002) Charles Schafer & David Yarowsky: Inducing translation lexicons via diverse similarity measures and bridge languages. Coling-2002: Sixth Conference on Natural Language Learning 2002 (CoNLL-2002), 31 August 2002, Taipei, Taiwan; 8pp. [PDF, 361KB]

(2002) Fumiaki Sugaya, Toshiyuki Takezawa, Genichiro Kikui, & Seiichi Yamamoto: Proposal of a very-large-corpus acquisition method by cell-formed registration. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.326-328. [PDF, 32KB]

(2002) Kazutaka Takao, Kenji Imamura, & Hideki Kashioka: Comparing and extracting paraphrasing words with 2-way bilingual dictionaries. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1016-1022. [PDF, 72KB]

(2002) Jörg Tiedemann: MatsLex – a multilingual lexical database for machine translation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1909-1912. [PDF, 172KB]

(2002) Keita Tsuji, Beatrice Daille, & Kyo Kageura: Extracting French-Japanese word pairs from bilingual corpora based on transliteration rules. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.499-502. [PDF, 120KB]

(2002) Takehito Utsuro, Takashi Horiuchi, Yasunobu Chiba, & Takeshi Hamamoto: Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 165-176. [go to publisher details]

(2002) Ruvan Weerasinghe: Bootstrapping the lexicon building process for machine translation between 'new' languages. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 177-186. [go to publisher details]

 (2001) Igor Bolshakov & Alexander Gelbukh: A large database of collocations and semantic references: interlingual applications.  International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.167-187. [PDF, 292KB]

(2001) Francis Bond, Ruhaida Binti Sulong, Takefumi Yamazaki & Kentaro Ogura: Design and construction of a machine-tractable Japanese-Malay dictionary.  MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 53-58. [PDF, 103KB]

(2001) Lynne Cahill: Semi-automatic construction of multilingual lexicons.  In: Machine Translation Review, issue 12: December 2001; pp.58-66.

(2001) Nicoletta Calzolari, Alessandro Lenci, Antonio Zampolli, Nuria Bel, Marta Villegas & Gregor Thurmair: The ISLE in the ocean. Transatlantic standards for multilingual lexicons (with an eye to machine translation). MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 67-71. [PDF, 193KB]

(2001) Mike Dillinger: Dictionary development workflow for MT: design and management. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 83-87. [PDF, 133KB]

(2001) Ismael García-Varea, Franz J. Och, Hermann Ney, & Francisco Casacuberta: Refined lexicon models for statistical machine translation using a maximum entropy approach  ACL-EACL-2001: 39th Annual meeting [of the Association for Computational Linguistics] and 10th Conference of the European Chapter [of ACL], July 9th - 11th 2001, Toulouse, France; pp.204-211. [PDF, 72KB]

(2001) Barbara Gawronska: PolVerbNet: an experimental database for Polish verbs. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp. 121-126. [PDF, 255KB]

(2001) Hodong Lee & Jong C.Park: Automatic augmentation of translation dictionary with database terminologies in multilingual query interpretation. ACL 2001 Workshop on Human Language Technology and Knowledge Management, July 2001, Toulouse, France; 10pp. [PDF, 108KB]

(2001) Christian Lieske, Susan McCormick & Gregor Thurmair: The Open Lexicon Interchange Format (OLIF) comes of age. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.211-216. [PDF, 184KB]

(2001) Gideon S.Mann & David Yarowsky: Multipath translation lexicon induction via bridge languages.  [NAACL-2001] Language Technologies 2001: the Second meeting of the North American Chapter of the Association for Computational Linguistics, Carnegie Mellon University, Pittsburgh, PA, 2-7 June 2001; 8pp. [PDF, 181KB]

(2001) Robert C.Moore: Towards a simple and accurate statistical approach to learning translation relationships among words. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.79-86. [PDF, 55KB]

(2001) Masaaki Nagata, Teruka Saito, & Kenji Suzuki: Using the web as a bilingual dictionary. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.95-102. [PDF, 216KB]

 (2001) Rajmund Piotrowski, Yuri Romanov, Yuri Tovmach, Natalia Zaitseva, & Michael Blekhman: Dictionary organization in Linguistic Automaton for oriental languages.  International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.105-118. [PDF, 108KB]

(2001) Katharina Probst, Ralf Brown, Jaime Carbonell, Alon Lavie, Lori Levin & Erik Peterson: Design and implementation of controlled elicitation for machine translation of low-density languages. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Towards a Road Map for MT [PDF, 90KB]

(2001) Philip Resnik, Douglas Oard, & Gina Levow: Improved cross-language retrieval using backoff translation. HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 3pp. [PDF, 66KB]

(2001) Jean Senellart, Mirko Plitt, Christophe Bailly & Françoise Cardoso: Resource alignment for machine translation or implicit transfer. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.317-323. [PDF, 112KB]

(2001) Sayori Shimohata, Mihoko Kitamura, Tatsuya Sukehiro, & Toshiki Murata: Collaborative translation environment on the Web. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.331-334. [PDF, 364KB]

(2001) Enya Kong Tang & Mosleh H. Al-Adhaileh: Converting a bilingual dictionary into a bilingual knowledge bank based on the synchronous SSTC.  MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.351-356. [PDF, 76KB]

(2001) Kaoru Yamamoto, Yuji Matsumoto, & Mihoko Kitamura: A comparative study on translation units for bilingual lexicon extraction. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.87-94. [PDF, 91KB]

(2000) Luisa Bentivogli, Emanuele Pianta, & Fabio Pianesi: Coping with lexical gaps when building aligned multilingual wordnets. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 993-997. [PDF, 32KB]

 (2000) Ralf D.Brown, Jaime G.Carbonell, & Yiming Yang: Automatic dictionary extraction for cross-language information retrieval [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 275-298.

(2000) Lynne Cahill: Semi-automatic construction of multilingual lexicons. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 10pp. [PDF, 2016KB]

(2000) Bob Clark: MoBiMouse, the world’s first “no-click” dictionary program. International Journal for Language and Documentation 3, January 2000; pp.26-27. [PDF, 626KB]

(2000) Arantxa Diaz de Ilarraza, Aingeru Mayor, & Kepa Sarasola: Building a lexicon for an English-Basque MT system from heterogenous wide-coverage dictionaries. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 9pp. [PDF, 1751KB]

(2000) Bonnie J. Dorr, Gina-Anne Levow, & Dekang Lin: Building a Chinese-English mapping between verb concepts for multilingual applications. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.1-12. [go to publisher details]

(2000) Bonnie J. Dorr, Gina-Anne Levow, Dekang Lin, & Scott Thomas: Chinese-English semantic resource construction. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 757-760. [PDF, 31KB]

 (2000) Pascale Fung: A statistical view on bilingual lexicon extraction: from parallel corpora to non-parallel corpora [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 219-236.

(2000) Zhao-Ming Gao: Automatic acquisition of a high-precision translation lexicon from parallel Chinese-English corpora. In: IAI Working Paper no.36, 2000; 12pp. [PDF, 193KB]

 (2000) Dafydd Gibbon & Harold Lüngen: Speech lexica and consistent multilingual vocabularies. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 296-307. [abstract]

(2000) Toru Hisamitsu, Yoshiki Niwa, Shingo Nishioka, Hirofumi Sakurai, Osamu Imaichi, Makoto Iwayama, & Akihiko Takano: Term extraction using a new measure of term representativeness.  LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 8pp. [PDF, 224KB]

(2000) Olivier Kraif: Evaluation of statistical tools for automatic extraction of lexical correspondences between parallel texts. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 8pp. [PDF, 1579KB]

(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]

(2000) Fang Li & Wilhelm Weisweber: Bilingual lexicon extraction from Internet. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 7pp. [PDF, 59KB]

(2000) Timothy Meekhof & David Clements: L&H lexicography toolkit for machine translation. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.213-218. [go to publisher details]

(2000) Tanapong Potipiti, Virach Sornlertlamvanich, & Thatsanee Charoenporn: Towards building a corpus-based dictionary for non-word-boundary languages. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 5pp. [PDF, 56KB]

(2000) Carole Tiberius & Lynne Cahill: Incorporating metaphonemes in a multilingual lexicon. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp.1126-1130. [PDF, 405KB]

(2000) Jörg Tiedemann: Extracting phrasal terms using bitext. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 7pp. [PDF, 160KB]

(2000) Tamás Váradi: Lexical and translation equivalence in parallel corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 539-543. [PDF, 756KB]

(2000) Špela Vintar: Extracting terminological collocations from a parallel corpus. Fifth EAMT Workshop "Harvesting existing resources", May 11 - 12, 2000, Ljubljana, Slovenia; pp.21-30. [PDF, 132KB]

Monolingual corpora

(2004) Bill Dolan, Chris Quirk, & Chris Brockett: Unsupervised construction of large paraphrase corpora: exploiting massively parallel news sources. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 107KB]

(2004) Tracy Lin, Jian-Cheng Wu, & Jason S. Chang: Extraction of name and transliteration in monolingual and parallel corpora. Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Washington, DC, September 28 – October 2, 2004; ed. Robert E.Frederking and Kathryn B.Taylor (Berlin: Springer Verlag, 2004); pp. 177-186. [go to publisher details]

(2004) Yajuan Lü & Ming Zhou: Collocation translation acquisition using monolingual corpora. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp.167-174. [PDF, 95KB]

(2004) Chris Quirk, Chris Brockett, & William Dolan: Monolingual machine translation for paraphrase generation. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 127KB]

(2004) Mitsuo Shimohata, Eiichiro Sumita, & Yuji Matsumoto: Building a paraphrase corpus for speech translation.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1407-1410. [PDF, 672KB]

(2004) Dan Tufis, Radu Ion, & Nancy Ide: Word sense disambiguation as a WordNets’ validation method in Balkanet. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1071-1074. [PDF, 422KB]

(2004) Wei Wang & Ming Zhou: Improving word alignment models using structured monolingual corpora. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 209KB]

(2004) Bing Zhao, Matthias Eck, & Stephan Vogel: Language model adaptation for statistical machine translation with structured query models. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 224KB]

(2003) Eiji Aramaki, Sadao Kurohashi, Hideki Kashioka, & Hideki Tanaka: Word selection for EBMT based on monolingual similarity and translation confidence. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 1565KB]

(2003) Regina Barzilay & Noemie Elhadad: Sentence alignment for monolingual comparable corpora. EMNLP-2003: proceedings of the 2003 conference on Empirical Methods in Natural Language Processing, a meeting of SIGDAT, a special interest group of the ACL, held in conjunction with ACL-03, 11-12 July 2003, Sapporo, Japan; 8pp. [PDF, 94KB]

(2003) Pernilla Danielsson: Units of meaning in translation – how to make real use of corpus evidence. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 15pp. [PDF, 71KB]

(2003) Yannis Dologlou, Stella Markantonatou, George Tambouratzis, Olga Yannoutsou, Athanassia Fourla, & Nikos Iannou: Using monolingual corpora for statistical machine translation: the METIS system. Controlled language translation, EAMT-CLAW-03, Dublin City University, 15-17 May 2003 [PDF, 296KB]

(2003) Lluís Màrquez, Adrià de Gispert, Xavier Carreras, & Lluís Padró: Low-cost named entity classification for Catalan: exploiting multilingual resources and unlabeled data. ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 80KB]

(2003) Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan, Diana Steffen, & Paul Buitelaar: Unsupervised monolingual and bilingual word-sense disambiguation of medical documents using UMLS. ACL 2003 Workshop on Natural Language Processing in Biomedicine, July 2003, Sapporo, Japan; 8pp. [PDF, 124KB]

(2003) Hua Wu & Ming Zhou: Optimizing synonym extraction using monolingual and bilingual resources. ACL 2003 International Workshop on Paraphrasing, July 11, 2003, Sapporo, Japan; 8pp. [PDF, 142KB]

(2002) Yaser Al-Onaizan & Kevin Knight: Translating named entities using monolingual and bilingual resources. ACL-2002: 40th Annual meeting of the Association for Computational Linguistics, July 2002, Philadelphia, USA; pp.400-408 [PDF, 186KB]

(2002) Philipp Koehn & Kevin Knight: Learning a translation lexicon from monolingual corpora. Unsupervised Lexical Acquisition: Proceedings of the Workshop of the ACL Special Interest Group on the Lexicon (SIGLEX), July 2002, Philadelphia, Pennsylvania, USA; pp.9-16 [PDF, 187KB]

(2002) Radu Soricut, Kevin Knight, & Daniel Marcu: Using a large monolingual corpus to improve translation accuracy. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 155-164. [go to publisher details]

Multilingual corpora

(2004) Proceedings of Workshop: The amazing utility of parallel and comparable corpora. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Lisbon, Portugal, 25 May 2004. [PDF, 2226KB]

(2004) Victoria Arranz, Núria Castell, Josep Maria Crego, Jesús Giménez, Adrià de Gispert, & Patrick Lambert: Bilingual connections for trilingual corpora: an XML approach. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1459-1462. [PDF, 674KB]

(2004) Lea Cyrus & Hendrik Feddes: A model for fine-grained alignment of multilingual texts. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.15-22. [PDF, 141KB]

(2004) Georges Fafiotte: Building and sharing multilingual speech resources, using ERIM generic platforms. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.39-46. [PDF, 327KB]

(2004) M.Gavrilidou, P.Labropoulou, E.Desipri, V.Giouli, V.Antonopoulos, & S.Piperidis: Building parallel corpora for eContent professionals.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.97-100. [PDF, 34KB]

(2004) Najeh Hajlaoui & Christian Boitet: PolyphraZ: a tool for the quantitative and subjective evaluation of parallel corpora. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2004], September 30 – October 1, 2004, Kyoto, Japan; pp. 123-129 [PDF, 1616KB]

(2004) Najeh Hajlaoui & Christian Boitet: PolyphraZ: a tool for the management of parallel corpora. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.109-116. [PDF, 528KB]

(2004) Stephen Helmreich, David Farwell, Florence Reeder, Keith Miller, Bonnie Dorr, Nizar Habash, Eduard Hovy, Lori Levin, Owen Rambow, & Advaith Siddharthan: Interlingual annotation of multilingual text corpora. HLT/NAACL 2004: Frontiers in Corpus Annotation, Proceedings of the Workshop, Boston, Massachusetts, May 6, 2004; 8pp. [PDF, 126KB]

(2004) Genichiro Kikui, Toshiyuki Takezawa, & Seiichi Yamamoto: Multilingual corpora for speech-to-speech translation research. Interspeech 2004 – ICSLP 8th International  Conference on  Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004; pp.357-360; abstract [PDF, 56KB]

(2004) Yves Lepage: Lower and higher estimates of “true analogies” between sentences contained in a large multilingual corpus. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 999KB]

(2004) Nadia Mana, Roldano Cattoni, Emanuele Pianta, Franca Rossi, Fabio Pianesi, & Susanne Burger: The Italian NESPOLE! Corpus: a multilingual database with interlingua annotation in tourism and medical domains. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1467-1470. [PDF, 753KB]

(2004) I. Dan Melamed, Giorgio Satta, & Benjamin Wellington: Generalized multitext grammars.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 661-668. [PDF, 156KB]

(2004) Stelios Piperidis, Iason Demiros,  Prokopis Prokopidis, Peter Vanroose, Anja Hoethker, Walter Daelemans, Elsa Sklavounou, Manos Konstantinou, & Yannis Karavidis: Multimodal multilingual resources in the subtitling process. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.205-208. [PDF, 319KB]

(2004) Lee Schwartz & Takako Aikawa: Multilingual corpus-based approach to the resolution of English –ing.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.959-962. [PDF, 301KB]

(2004) Aree Teeraparbseree: Qualitative evaluation of automatically calculated acception based MLDB.  Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.23-30. [PDF, 261KB]

(2004) Gregor Thurmair: Multilingual content processing. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.xii-xvi. [PDF, 391KB]

(2004) Dan Tufiş: Term translations in multilingual corpora: discovery and consistency check. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1981-1984. [PDF, 749KB]

(2003) Hsin-Hsi Chen, Changhua Yang, & Ying Lin: Learning formulation and transformation rules for multilingual named entities. ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 285KB]

(2003) David Conejero, Jesus Gimenez, Victoria Arranz, Antonio Bonafonte, Neus Pascual, Nuria Castell, & Asunción Moreno: Lexica and corpora for speech-to-speech translation: a trilingual approach. Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.1593-1596; abstract [PDF, 34KB]

(2003) Joseph Dichy & Ali Farghaly: Roots & patterns vs. stems plus grammar-lexis specifications: on what basis should a multilingual database centred on Arabic be built? MT Summit IX -- workshop: Machine translation for semitic languages, New Orleans, USA, 23 September 2003 [PDF, 258KB]

(2003) Lee Schwartz, Takako Aikawa, & Chris Quirk: Disambiguation of English PP attachment using multilingual aligned data. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.330-337. [PDF, 98KB]

(2003) Stephanie Strassel, Alexis Mitchell, & Shudong Huang: Multilingual resources for entity extraction. ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition, July 12, 2003, Sapporo, Japan; 8pp. [PDF, 138KB]

(2002) Christian Boitet, Mathieu Mangeot, & Gilles Sérasset: The Papillon project: cooperatively building a multilingual lexical data-base to derive open source dictionaries & lexicons. Coling-2002: Second Workshop on NLP and XML (NLPXML-2002), August 2002, Taipei, Taiwan; 3pp. [PDF, 163KB]

(2002) Erica Costantini, Susanne Burger, & Fabio Pianesi: NESPOLE!’s multilingual and multimodal corpus. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.165-170. [PDF, 84KB]

(2002) Daniel Gervais: The full-text multilingual corpus: breaking the translation memory bottleneck. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 12pp. [PDF, 160KB]; presentation, 22 slides [PDF of PPT, 2753KB]

(2002) Jörg Tiedemann: MatsLex – a multilingual lexical database for machine translation. LREC-2002: Third International Conference on Language Resources and Evaluation. Proceedings, Las Palmas de Gran Canaria, Spain, 27 May – 2 June 2002; pp.1909-1912. [PDF, 172KB]

(2001) Susanne Burger, Laurent Besacier, Paolo Coletti, Florian Metze, & Céline Morel: The Nespole! VoIP dialogue database. Eurospeech 2001 Scandinavia: 7th European Conference on Speech Communication and Technology, 2nd Interspeech Event, Aalborg, Denmark, September 3-7, 2001; pp.2043-2046 [PDF, 84KB]; abstract [PDF, 81KB]

(2001) Vivian Tsang & Suzanne Stevenson: Automatic verb classification using multilingual resources. ACL-EACL 2001 Workshop on Computational Natural Language Learning (CoNLL), July 2001, Toulouse, France; 8pp. [PDF, 139KB]

(2000) Ingeborg Blank: Terminology extraction from parallel technical texts [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 237-252.

(2000) Lynne Cahill: Semi-automatic construction of multilingual lexicons. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 10pp. [PDF, 2016KB]

(2000) Mona Diab: An unsupervised method for multilingual word sense tagging using parallel corpora: a preliminary investigation. ACL 2000 Workshop on Word Senses and Multi-linguality, Hong Kong, October 2000; pp.1-9 [PDF, 785KB]

(2000) Anthony McEnery, Paul Baker, Rob Gaizauskas, & Hamish Cunningham: EMILLE: building a corpus of South Asian languages. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 9pp. [PDF, 2172KB]

(2000) Stelios Piperidis, Harris Papageorgiou, & Sotiris Boutsis: From sentences to words and clauses [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 117-138.

(2000) Laurent Romary & Patrice Bonhomme: Parallel alignment of structured documents [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 201-217.

(2000) Michel Simard: Multilingual text alignment: aligning three or more versions of a text [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 49-67.

Ontologies

(2004) In-Su Kang, Jae-Hak J.Bae & Jong-Hyeok Lee: Natural language database access using semi-automatically constructed translation knowledge. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.280-289. [abstract]

(2004) Jui-Feng Yeh, Chung-Hsien Wu, Ming-Jun Chen, & Liang-Chih Yu: Automated alignment and extraction of bilingual ontology for cross-language domain-specific applications. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 196KB]

(2003) Galia Angelova: Structuring terminology: between lexicons and domain knowledge representation. In: Wolfgang Menzel & Cristina Vertan (eds.) Natural Language Processing between Linguistic Inquiry and System Engineering. Iaşi: Editura Universităţii “Alexandru Ioan Cuza”, 2003; pp.1-7. [PDF, 110KB]

(2003) Hyungsuk Ji, Sabine Ploux, & Eric Wehrli: Lexical knowledge representation with contextonyms. MT Summit IX, New Orleans, USA, 23-27 September 2003; pp.194-201. [PDF, 187KB]

(2003) Kaoru Yamamoto & Yuji Matsumoto: Extracting translation knowledge from parallel corpora. In: Michael Carl & Andy Way (eds.) Recent advances in example-based machine translation (Dordrecht: Kluwer Academic Publishers, 2003), pp. 365-395.

(2002) Christian Boitet: A rationale for using UNL as an interlingua and more in various domains. LREC-2002: Third International Conference on Language Resources and Evaluation. Workshop: First international workshop on UNL, other interlinguas and their applications, Las Palmas Canary Islands, 27 May 2002; pp.23-26. [PDF, 68KB]

(2002) Dong-il Kim, Zheng Cui, Jinji Li, & Jong-Hyeok Lee: A knowledge based approach to identification of serial verb construction in Chinese-to-Korean machine translation system. Coling-2002: First SIGHAN Workshop on Chinese Language Processing, 1 September 2002, Taipei, Taiwan; 5pp. [PDF, 536KB]

(2002) Eric Nyberg, Teruko Mitamura, Kathryn Baker, David Svoboda, Brian Peterson, & Jennifer Williams: Deriving semantic knowledge from descriptive texts using an MT system. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp.145-154. [go to publisher details]

(2001) Sin-Jae Kang & Jong-Hyeok Lee: Ontology-based word sense disambiguation using semi-automatically constructed ontology. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.181-186. [PDF, 287KB]

(2001) Young-Suk Lee, Wu Sok Yi, Stephanie Seneff, & Clifford J. Weinstein: Interlingua-based broad-coverage Korean-English translation at CCLINC. HLT-2001: Proceedings of the First International Conference on Human Language Technology Research, San Diego, CA, March 18-21, 2001; 6pp. [PDF, 80KB]

(2001) Mark Stevenson & Yorick Wilks: The interaction of knowledge sources in word sense disambiguation. Computational Linguistics 27 (3), pp. 321-349 [PDF, 2076KB]

(2001) Enya Kong Tang & Mosleh H. Al-Adhaileh: Converting a bilingual dictionary into a bilingual knowledge bank based on the synchronous SSTC.  MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.351-356. [PDF, 76KB]

(2000) David Farwell & Stephen Helmreich: An interlingual-based approach to reference resolution. NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP, Seattle, May 2000; pp. 1-11 [PDF, 809KB]

(2000) M. Abdus Salam: Machine translation and multilingual communication on the internet. MT2000: machine translation and multilingual applications in the new millennium: international conference at the University of Exeter, 20-22 November 2000, organised by the British Computer Society. [London: BCS]; 5pp. [PDF, 1003KB]

Open source

(2002) Christian Boitet: A roadmap for MT: four «keys» to handle more languages, for all kinds of tasks, while making it possible to improve quality (on demand).  ICUKL-2002: International Conference on Universal Knowledge and Language, 25th-29th November 2002, Goa, India, organised by UNDL Foundation and Indian Institute of  Technology Bombay; 12pp. [PDF, 482KB]

(2002) Christian Boitet, Mathieu Mangeot, & Gilles Sérasset: The Papillon project: cooperatively building a multilingual lexical data-base to derive open source dictionaries & lexicons. Coling-2002: Second Workshop on NLP and XML (NLPXML-2002), August 2002, Taipei, Taiwan; 3pp. [PDF, 163KB]

(2001) Bert Esselink: Free translation memory: “not a bad vision”. Lionbridge releases translation technology. [Interview with Henri Broekmate.] Language International 13 (6), December 2001; pp.8-9. [PDF, 393KB]

(2001) Bernard Scott: The Logos model: principles and motivations underlying the OpenLogos MT system. [Logos Institute, revised 2009]; 36pp. [PDF, 301KB]

Scarce resources (see also Language resources, Rapid development of MT)

(2004) Patrick A.V.Hall: Localising nations, saving languages: moving from Unicode to language engineering. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 12pp. [PDF, 172KB]

(2004) Alon Lavie, Katharina Probst, Erik Peterson, Stephan Vogel, Lori Levin, Ariadna Font-Llitjos, & Jaime Carbonell: A trainable transfer-based MT approach for languages with limited resources. 9th EAMT Workshop, "Broadening horizons of machine translation and its applications", 26-27 April 2004, Malta; pp. 116-123. [PDF, 265KB]

(2004) Evgeny Matusov, Maja Popovic, Richard Zens & Hermann Ney: Statistical machine translation of spontaneous speech with scarce resources. International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation [IWSLT 2004], September 30 – October 1, 2004, Kyoto, Japan; pp. 139-146 [PDF, 197KB]

(2004) Sonja Nießen & Hermann Ney: Statistical machine translation with scarce resources using morpho-syntactic information. Computational Linguistics 30 (2), pp.181-204. [PDF, 147KB]

(2003) Alon Lavie, Stephan Vogel, Lori Levin, Erik Peterson, Katharina Probst, Ariadna Font Llitjós, Rachel Reynolds, Jaime Carbonell, & Richard Cohen: Experiments with a Hindi-to-English transfer-based MT system under a miserly data scenario. ACM Transactions on Asian Language Information Processing (TALIP) 2 (2), June 2003; pp.143-163. [PDF, 349KB]

(2003) Viet Bac Le, Brigitte Bigi, Laurent Besacier, & Eric Castelli: Using the web for fast language model construction in minority languages. Eurospeech 2003 - Interspeech 2003 8th European  Conference on  Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.3117-3120; abstract [PDF, 34KB]

(2003) Dan Tufis, Ana-Maria Barbu, & Radu Ion: TREQ-AL: a word alignment system with limited language resources. HLT-NAACL 2003 Workshop, "Building and using parallel texts: data driven machine translation and beyond", 31 May 2003, Edmonton, Canada. [PDF, 164KB]. [A new version describing the TREQ-AL system after bug fixes is also available: PDF, 53KB]

(2002) Jaime Carbonell, Katharina Probst, Erik Peterson, Christian Monson, Alon Lavie, Ralf Brown, & Lori Levin: Automatic rule learning for resource-limited MT. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 1-10. [go to publisher details]

(2000) Jeff Allen: The risks of spelling variation and reform. International Journal for Language and Documentation 4, April 2000; pp.41-42. [PDF, 589KB]

(2000) Jeff Allen: What about statistical MT? International Journal for Language and Documentation 7, October/November 2000; pp.41-42. [PDF, 826KB]

(2000) Yaser Al-Onaizan, Ulrich Germann, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Daniel Marcu, & Kenji Yamada: Translating with scarce resources. 17th National conference of the American Association for Artificial Intelligence (AAAI 2000), July 30 - August 3, 2000, Austin, Texas. [PDF, 166KB]

(2000) Sukhdave Singh, Tony McEnery, & Paul Baker: Building a parallel corpus of English/Panjabi [abstract]. In: Jean Véronis (ed.) Parallel text processing: alignment and use of translation corpora. Dordrecht/Boston/London: Kluwer Academic Publishers, 2000; pp. 335-346.

Software resources

(2000) Bert Esselink: Translators take to the Web. Language International 12 (6), December 2000; pp.34-35. [PDF, 522KB]

Spoken language resources

(2004) Georges Fafiotte: Building and sharing multilingual speech resources, using ERIM generic platforms. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.39-46. [PDF, 327KB]

Termbanks

(2004) Janet Carter-Sigglow: Who’s afraid of CAT? Redefining the boundaries of translation. Translating and the Computer 26: proceedings of the Twenty-sixth International Conference on Translating and the Computer, 18-19 November 2004, London. (London: Aslib, 2004); 17pp. [PDF, 159KB]; presentation [PDF, 3800KB]

(2003) Sylvia Ball: Joined-up terminology – the IATE system enters production. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 17pp. [PDF, 583KB]

(2002) Lee Gillam, Khurshid Ahmad, David Dulby, & Christopher Cox: Knowledge exchange and terminology interchange: the role of standards. Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 22pp. [PDF, 277KB]; presentation by Lee Gillam [with minor defects], 40 slides [PDF of PPT, 389KB]

(2001) D.Rummel & S.Ball: The IATE project – towards a single terminological database for the European Union. Translating and the Computer 23: papers from the Aslib conference held on 29 & 30 November 2001 (London: Aslib, 2001); 28pp. [PDF, 551KB]

(2000) Ian Johnson & Alastair MacPhail: IATE – Inter-Agency Terminology Exchange: development of a single central terminology database for the institutions and agencies of the European Union. LREC-2000: Second International Conference on Language Resources and Evaluation. Workshop on Terminology Resources and Computation, Athens, Greece, 29 May 2000; 7pp. [PDF, 55KB]

(2000) Ian Johnson & Maria-José Palos Caravina: Validation and quality control issues in a new web-based, interactive terminology database for the institutions and agencies of the European Union.  Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 9pp.  [PDF, 54KB]

Treebanks (see also Semantic analysis and representation, Thesaurus method)

(2004) Martin Čmejrek, Jan Cuřín, Jiří Havelka, Jan Hajič, & Vladislav Kuboň: Prague Czech-English dependency treebank: syntactically annotated resources for machine translation. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1597-1600. [PDF, 292KB]

(2004) Martin Čmejrek, Jan Cuřín, & Jiří Havelka: Prague Czech-English dependency treebank: any hopes for a common annotation scheme? HLT/NAACL 2004: Frontiers in Corpus Annotation, Proceedings of the Workshop, Boston, Massachusetts, May 6, 2004; 8pp. [PDF, 230KB]

(2004) Jan Cuřín, Martin Čmejrek, Jiří Havelka & Vladislav Kuboň: Building a parallel bilingual syntactically annotated corpus. First International Joint Conference on Natural Language Processing, Hainan Island, China, March 22-24, 2004; pp.168-176. [abstract]

(2004) Kiyotaka Uchimoto, Yujie Zhang, Kiyoshi Sudo, Masaki Murata, Satoshi Sekine, & Hitoshi Isahara: Multilingual aligned parallel treebank corpus reflecting contextual information and its applications. Coling 2004: Proceedings of the Workshop on Multilingual Linguistic Resources (MLR2004), August 28th 2004, University of Geneva, Switzerland; pp.63-70. [PDF, 162KB]

(2004) Martin Volk & Yvonne Samuelsson: Bootstrapping parallel treebanks. Coling 2004: Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora, August 29th 2004, Geneva, Switzerland; 7pp. [PDF, 192KB]

(2000) Erhard W. Hinrichs, Julia Bartels, Yasuhiro Kawata, Valia Kordoni & Heike Telljohann: The Tübingen treebanks for spoken German, English, and Japanese. In: Wolfgang Wahlster (ed.) Verbmobil: foundations of speech-to-speech translation. (Berlin: Springer, 2000); pp. 550-574. [abstract]

Wordnets (see also WordNet in index of systems)

(2004) Chen Benfeng & Pascale Fung: Automatic construction of an English-Chinese bilingual FrameNet.  HLT-NAACL 2004: Human Language Technology conference and North American Chapter of the Association for Computational Linguistics annual meeting, May 2-7, 2004, The Park Plaza Hotel, Boston, USA – Short Papers; pp. 29-32. [PDF, 185KB]

(2004) Key-Sun Choi, Hee-Sook Bae, Wonseok Kang, Juho Lee, Eunhe Kim, Hekyeong Kim, Donghee Kim, Youngbin Song, & Hyosik Shin: Korean-Chinese-Japanese multilingual wordnet with shared semantic hierarchy.  LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.1131-1134. [PDF, 518KB]

(2004) Dan Tufiş, Radu Ion, & Nancy Ide: Fine-grained word sense disambiguation based on parallel corpora, word alignment, word clustering and aligned wordnets. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 192KB]

(2003) Nuno M.F.Dionisio: Nominal taxonomies and word sense disambiguation. PhD thesis, University of East Anglia, Norwich, November 15, 2003. 312pp. [PDF, 3062KB]

(2002) Chu-Ren Huang, I-Ju E.Tseng, & Dylan B.S.Tsai: Translating lexical semantic relations: the first step towards multilingual wordnets. Coling-2002: Workshop SEMANET: Building and using Semantic Networks, August 2002, Taipei, Taiwan; 7pp. [PDF, 129KB]

(2001) Guo-Wei Bian & Chi-Ching Lin: Trans-EZ at NTCIR-2: synset co-occurrence method for English-Chinese cross-lingual information retrieval. NTCIR Workshop 2: Proceedings of the Second NTCIR Workshop on Research in Chinese & Japanese Text retrieval and Text Summarization, March 7-9, 2001, Tokyo, Japan; 6pp. [PDF, 41KB]

(2001) Igor Bolshakov & Alexander Gelbukh: A large database of collocations and semantic references: interlingual applications. International Journal of Translation 13 (1-2), Jan-Dec 2001; pp.167-187. [PDF, 292KB]

(2000) Luisa Bentivogli, Emanuele Pianta, & Fabio Pianesi: Coping with lexical gaps when building aligned multilingual wordnets. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 993-997. [PDF, 32KB]

World Wide Web [see also Internet, Semantic Web]

(2004) Pu-Jen Cheng, Yi-Cheng Pan, Wen-Hsiang Lu, & Lee-Feng Chien: Creating multilingual translation lexicons with regional variations using web corpora.  ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 534-541. [PDF, 283KB]

(2004) Federico Gaspari: Controlled language, web usability and machine translation services on the Internet. International Journal of Translation 16 (1), Jan-June 2004; pp.41-54. [PDF, 70KB]

(2004) Fuminori Kimura, Akira Maeda & Shunsuke Uemura: CLIR using web directory at NTCIR4. Proceedings of NTCIR-4, Tokyo, 2-4 June 2004; 5pp. [PDF, 368KB]

(2004) Jin-Shea Kuo & Ying-Kuei Yang: Constructing transliteration lexicons from web corpora. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the interactive poster and demonstration sessions, 21-26 July 2004, Barcelona, Spain; 4pp. [PDF, 127KB]

(2004) Chris Quirk, Chris Brockett, & William Dolan: Monolingual machine translation for paraphrase generation. EMNLP-2004: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 25-26 July 2004, Barcelona, Spain; 8pp. [PDF, 127KB]

(2004) Angelika Zerfass: Teaching translation tools over the web. Coling 2004: Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training, 28th August, University of Geneva, Switzerland; pp. 61-67. [PDF, 67KB]

(2003) Ian Harris: Translation web services – a reality. Translating and the Computer 25: proceedings of the Twenty-fifth International Conference on Translating and the Computer, 20-21 November 2003, London. (London: Aslib, 2003); 11pp. [PDF, 154KB]

(2003) Fuminori Kimura, Akira Maeda, Masatoshi Yoshikawa, & Shunsuke Uemura: Cross-language information retrieval based on category matching between language versions of a web directory. IRAL 2003: Sixth International Workshop on Information Retrieval with Asian Languages,  July 7,  2003, Sapporo, Japan; 8pp. [PDF, 78KB]

(2003) Viet Bac Le, Brigitte Bigi, Laurent Besacier, & Eric Castelli: Using the web for fast language model construction in minority languages. Eurospeech 2003 - Interspeech 2003 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, September 1-4, 2003; pp.3117-3120; abstract [PDF, 34KB]

(2003) Wessel Kraaij, Jian-Yun Nie, & Michel Simard: Embedding web-based statistical translation models in cross-language information retrieval. Computational Linguistics 29 (3), pp.381-419. [PDF, 209KB]

(2003) Philip Resnik & Noah A. Smith: The web as a parallel corpus. Computational Linguistics 29 (3), pp.349-380. [PDF, 8130KB]

(2003) Andy Way & Nano Gough: wEBMT: developing and validating an example-based machine translation system using the World Wide Web. Computational Linguistics 29 (3), pp.421-457. [PDF, 169KB]

(2002) Yunbo Cao & Hang Li: Base noun phrase translation using Web data and the EM algorithm. Coling 2002, Taipei, Taiwan, 26-30 August 2002 [PDF, 206KB]

(2002) Bill Dunlap: Website Translation version 2.0. Language International 14 (3), June 2002; pp.10-12. [PDF, 581KB]

(2002) Mikel L. Forcada: Using multilingual content on the web to build fast finite-state direct translation systems. TMI-2002 conference, Keihanna, Japan, March 13-17, 2002: Workshop. [PDF, 155KB]

(2002) Nano Gough, Andy Way, & Mary Hearne: Example-based machine translation via the Web. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 74-83. [go to publisher details]

(2002) Mike Roche: Web services for translation.  Translating and the Computer 24: proceedings from the Aslib conference held on 21-22 November 2002 (London: Aslib, 2002); 8pp. [PDF, 60KB]

(2002) Gabriela Tissiani, Hugo Cesar Hoeschl, & Ricardo Miranda Barcia: Semiotic approach for the design of adaptive graphical user interfaces using Universal Networking Language. ICUKL-2002: International Conference on Universal Knowledge and Language, 25th-29th November 2002, Goa, India, organised by UNDL Foundation and Indian Institute of Technology Bombay; 6pp. [PDF, 468KB]

(2002) Laraine Tunick: Web site translation is fastest growing segment of worldwide language translation industry.  Allied Business Intelligence, October 29, 2002. 1p. [PDF, 55KB]

(2002) Takehito Utsuro, Takashi Horiuchi, Yasunobu Chiba, & Takeshi Hamamoto: Semi-automatic compilation of bilingual lexicon entries from cross-lingually relevant news articles on WWW news sites. Machine translation: from research to real users: 5th conference of the Association for Machine Translation in the Americas, AMTA 2002, Tiburon, CA, October 2002; ed. Stephen D. Richardson (Berlin: Springer Verlag, 2002); pp. 165-176. [go to publisher details]

(2001) Lorna Balkan: Exploiting the WWW for MT teaching. MT Summit VIII, Santiago de Compostela, Spain, 18-22 September 2001. Workshop on Teaching Machine Translation [PDF, 61KB]

(2001) Bert Esselink: Web design: going native. Language International 13 (1), February 2001; pp.16-18. [PDF, 666KB]

(2001) Masaaki Nagata, Teruka Saito, & Kenji Suzuki: Using the web as a bilingual dictionary. ACL-EACL 2001 workshop "Data-driven machine translation", July 7, 2001, Toulouse, France; pp.95-102. [PDF, 216KB]

(2001) Julian Perkin: Multilingual websites widen the way to a new online world. [From Financial Times]. Language International 13 (2), April 2001; p.7. [PDF, 161KB]

(2001) Sayori Shimohata, Mihoko Kitamura, Tatsuya Sukehiro, & Toshiki Murata: Collaborative translation environment on the Web. MT Summit VIII: Machine Translation in the Information Age, Proceedings, Santiago de Compostela, Spain, 18-22 September 2001; pp.331-334. [PDF, 364KB]

(2000) Charlie Baxter: You ain’t seen nothing yet. Language International 12 (1), February 2000; pp.10-11. [PDF, 422KB]

(2000) Eve Lindenmuth Bodeux: Tongue tied on the Web. Language International 12 (2), April 2000; pp.20-22. [PDF, 787KB]

(2000) Michael Fleming & Robin Cohen: Mixed-initiative translation of Web pages. Envisioning machine translation in the information future: 4th conference of the Association for Machine Translation in the Americas, AMTA 2000, Cuernavaca, Mexico, October 2000; ed. John S. White (Berlin: Springer Verlag, 2000); pp.25-29. [go to publisher details]

(2000) Sarah Grip: Case study: machine translation comes to Corning Cable Systems. Language International 12 (5), October 2000; pp.18-20. [PDF, 555KB]

(2000) Bert Esselink: Translators take to the Web. Language International 12 (6), December 2000; pp.34-35. [PDF, 522KB]

(2000) Christopher Hogan & Robert Frederking: WebDIPLOMAT: a web-based interactive machine translation system. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 1041-1045. [PDF, 473KB]

(2000) Ian Johnson & Maria-José Palos Caravina: Validation and quality control issues in a new web-based, interactive terminology database for the institutions and agencies of the European Union.  Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 9pp.  [PDF, 54KB]

(2000) Andrew Joscelyne: Is it showtime? Language International 12 (3), June 2000; pp.18-19, 40-41. [PDF, 1011KB]

(2000) Elliott Macklovitch: Two types of translation memory. Translating and the Computer 22: proceedings of the Twenty-second International Conference… 16-17 November 2000 (London: Aslib, 2000); 15pp.  [PDF, 198KB]

(2000) Elliott Macklovitch, Michel Simard, & Philippe Langlais: TransSearch: a free translation memory on the World Wide Web. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 1201-1208. [PDF, 214KB]

(2000) Vladimir Oboronko: Wired for peace and multi-language communication: virtual diplomacy in northeast Asia and web-based translation. AMTA 2000 pre-conference workshop “Machine translation in practice: from old guard to new guard”, Cuernavaca, Mexico, October 10, 2000. 5pp. [PDF, 183KB]

(2000) Thomas Schneider: Stop the presses. Language International 12 (4), August 2000; pp.22-24. [PDF, 692KB]

(2000) Howard Schwartz: Creating a global website. International Journal for Language and Documentation 6, August/September 2000; pp.14-16. [PDF, 2892KB]

(2000) Gilles Sérasset & Christian Boitet: On UNL as the future "html of the linguistic content" & reuse of existing NLP components in UNL-related applications with the example of a UNL-French deconverter. Coling 2000 in Europe: the 18th International Conference on Computational Linguistics. Proceedings of the conference, Universität des Saarlandes, Saarbrücken, Germany, 31 July - 4 August 2000; pp. 768-774. [PDF, 562KB]

(2000) Bob Sheng: Automated translation for the deployment of dynamic and mission critical content. AMTA 2000 pre-conference workshop “Machine translation in practice: from old guard to new guard”, Cuernavaca, Mexico, October 10, 2000. 1p. [abstract only] [PDF, 58KB]

(2000) Ronaldo Teixeira Martins, Lucia Helena Machado Rino, Maria das Graças Volpe Nunes, Gisele Montilha, & Osvaldo Novais de Oliveira: An interlingua aiming at communication on the Web: How language-independent can it be? NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP, Seattle, May 2000; pp. 24-33. [PDF, 628KB]

(2000) Feiyu Xu, Klaus Netter, & Holger Stenzhorn: A system for uniform and multilingual access to structured database and web information in a tourism domain. ACL-2000: 38th Annual meeting of the Association for Computational Linguistics, Hong Kong. Demonstration notes, 3-6 October 2000; pp.9-10 [PDF, 207KB]