ACL 2013
Sixth Workshop on Building and Using
Comparable Corpora
Proceedings of the Workshop
August 8, 2013
Table of contents
Cross-lingual WSD for translation extraction from comparable corpora.
Marianna Apidianaki, Nikola Ljubešić, and Darja Fišer… 1-10
Bilingual lexicon extraction via pivot language and word alignment tool.
Hong-Seok Kwon, Hyeong-Won Seo, and Jae-Hoon Kim… 11-15
Using WordNet and semantic similarity for bilingual terminology mining from comparable corpora.
Dhouha Bouamor, Nasredine Semmar, and Pierre Zweigenbaum… 16-23
A comparison of smoothing techniques for bilingual lexicon extraction from comparable corpora.
Amir Hazem and Emmanuel Morin… 24-33
Chinese-Japanese parallel sentence extraction from quasi-comparable corpora.
Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi… 34-42
A modular open-source focused crawler for mining monolingual and bilingual corpora from the web.
Vassilis Papavassiliou, Prokopis Prokopidis, and Gregor Thurmair… 43-51
Building basic vocabulary across 40 languages.
Judit Ács, Katalin Pajkossy, and András Kornai… 52-58
Scientific registers and disciplinary diversification: a comparable corpus approach.
Elke Teich, Stefania Degaetano-Ortlieb, Hannah Kermes, and Ekaterina Lapshinova-Koltunski… 59-68
Improving MT system using extracted parallel fragments of text from comparable corpora.
Rajdeep Gupta, Santanu Pal, and Sivaji Bandyopadhyay… 69-76
VARTRA: a comparable corpus for analysis of translation variation.
Ekaterina Lapshinova-Koltunski… 77-86
Building ontologies from collaborative knowledge bases to search and interpret multilingual corpora.
Yegin Genc, Elizabeth A. Lennon, Winter Mason, and Jeffrey V. Nickerson… 87-94
Using a random forest classifier to recognise translations of biomedical terms across languages.
Georgios Kontonatsios, Ioannis Korkontzelos, Jun’ichi Tsujii, and Sophia Ananiadou… 95-104
Comparing multilingual comparable articles based on opinions.
Motaz Saad, David Langlois, and Kamel Smaili… 105-111
Mining for domain-specific text from Wikipedia.
Gathering and generating paraphrases from Twitter with application to normalization.
Wei Xu, Alan Ritter, and Ralph Grishman… 121-128
Learning comparable corpora from latent semantic analysis simplified document space.
Ekaterina Stambolieva… 129-137
Finding more bilingual webpages with high credibility via link analysis.
Chengzhi Zhang, Xuchen Yao, and Chunyu Kit… 138-143