Building and Using Parallel Texts:
Data Driven Machine Translation and Beyond

HLT-NAACL 2003, Edmonton, Canada

Workshop, May 31, 2003

Proceedings

 

On the Pleasure of Being Bi-textual; or My Life in Parallel Text -- Elliott Macklovitch [PDF, 301KB]

 

Word Alignment Shared Task

An Evaluation Exercise for Word Alignment -- Rada Mihalcea and Ted Pedersen [PDF, 131KB]

ProAlign: Shared Task System Description -- Dekang Lin and Colin Cherry [PDF, 85KB]
Word Alignment Based on Bilingual Bracketing -- Bing Zhao and Stephan Vogel [PDF, 56KB]

Statistical Translation Alignment with Compositionality Constraints -- Michel Simard and Philippe Langlais [PDF, 65KB]
Reducing Parameter Space for Word Alignment -- Herve Dejean, Eric Gaussier, Cyril Goutte and Kenji Yamada [PDF, 42KB]
Word Alignment Baselines -- John C. Henderson [PDF, 54KB]

TREQ-AL: A Word Alignment System with Limited Language Resources -- Dan Tufis, Ana-Maria Barbu and Radu Ion [PDF, 164KB]. A new version describing the TREQ-AL system after bug fixes is also available [PDF, 53KB].
The Duluth Word Alignment System -- Bridget Thomson McInnes and Ted Pedersen [PDF, 52KB]
Phrase-based Evaluation of Word-to-Word Alignments -- Michael Carl and Sisay Fissaha [PDF, 253KB]
 

Regular Papers

Bootstrapping Parallel Corpora -- Chris Callison-Burch and Miles Osborne [PDF, 83KB]
Retrieving Meaning-equivalent Sentences for Example-based Rough Translation -- Mitsuo Shimohata and Eiichiro Sumita and Yuji Matsumoto [PDF, 43KB]
Word Selection for EBMT based on Monolingual Similarity and Translation Confidence -- Eiji Aramaki and Sadao Kurohashi and Hideki Kashioka and Hideki Tanaka [PDF, 1565KB]

Translation Spotting for Translation Memories -- Michel Simard [PDF, 120KB]
Learning Sequence-to-Sequence Correspondences from Parallel Corpora via Sequential Pattern Mining -- Kaoru Yamamoto and Taku Kudo and Yuta Tsuboi and Yuji Matsumoto [PDF, 101KB]
Efficient Optimization for Bilingual Sentence Alignment Based on Linear Regression -- Bing Zhao and Klaus Zechner and Stephan Vogel and Alex Waibel [PDF, 93KB]

POS-Tagger for English-Vietnamese Bilingual Corpus -- Dinh Dien and Hoang Kiem [PDF, 202KB]
Acquisition of English-Chinese Transliterated Word Pairs from Parallel-Aligned Texts using a Statistical Machine Transliteration Model -- Chun-Jen Lee and Jason S. Chang [PDF, 344KB]
Input Sentence Splitting and Translating -- Takao Doi and Eiichiro Sumita [PDF, 201KB]

 

Short Papers

An LSA Implementation Against Parallel Texts in French and English -- Katri A. Clodfelder [PDF, 119KB]
Aligning and Using an English-Inuktitut Parallel Corpus -- Joel Martin and Howard Johnson and Benoit Farley and Anna Maclachlan [PDF, 30KB]
Comparing the Sentence Alignment Yield from Two News Corpora Using a Dictionary-Based Alignment System -- Stephen Nightingale and Hideki Tanaka [PDF, 142KB]

 

Resources for Word Alignment

For details of the shared task organized during the workshop, see http://cs.unt.edu/~rada/wpt/