Slovenščina 2.0: Empirične, Aplikativne in Interdisciplinarne Raziskave, 2013

Best friends or just faking it? Corpus-based extraction of Slovene-Croatian translation equivalents and false friends

Darja Fišer, Nikola Ljubešić

Keywords: automatic bilingual lexicon extraction, distributional semantics, closely related languages, cognates, false friends

Abstract:

In this paper we present a corpus-based approach to the automatic extraction of translation equivalents and false friends for Slovene and Croatian, a pair of closely related languages. While taking advantage of the orthographic similarities between the two languages, the approach relies on a simple but powerful assumption of distributional semantics: words with similar meanings tend to be used in similar contexts in both languages. On the one hand, this enables us to quickly generate a Slovene-Croatian bilingual lexicon from minimal knowledge sources, namely weakly comparable web corpora. On the other hand, it can be used to identify cognates that look similar on the surface but in fact express different concepts in the two languages. The presented approach is language-independent and therefore attractive for natural language processing tasks that often lack lexical resources and cannot afford to build them by hand. It is also useful in lexicography and language pedagogy, where it can highlight the lexical characteristics specific to a given language pair or domain.
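
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the similarity measures (difflib's ratio for spelling, cosine over co-occurrence counts for context), the thresholds, the small seed lexicon used to map context words across languages, and all function names are assumptions made for illustration. The tokens in the toy corpora are invented placeholders, not real Slovene or Croatian words.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def context_vector(corpus, target, window=2):
    """Count the words that occur within `window` tokens of `target`."""
    counts = Counter()
    for sentence in corpus:
        for i, token in enumerate(sentence):
            if token != target:
                continue
            start, end = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[sentence[j]] += 1
    return counts


def translate_vector(vector, seed_lexicon):
    """Map source-language context words into the target language.

    Context words missing from the (hypothetical) seed lexicon are dropped.
    """
    translated = Counter()
    for word, count in vector.items():
        if word in seed_lexicon:
            translated[seed_lexicon[word]] += count
    return translated


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify_pair(src_word, tgt_word, src_corpus, tgt_corpus, seed_lexicon,
                  ortho_threshold=0.8, context_threshold=0.5):
    """Label a word pair using spelling similarity, then context similarity.

    Both thresholds are arbitrary illustrative values.
    """
    ortho = SequenceMatcher(None, src_word, tgt_word).ratio()
    if ortho < ortho_threshold:
        return "not a cognate candidate"
    src_vec = translate_vector(context_vector(src_corpus, src_word), seed_lexicon)
    tgt_vec = context_vector(tgt_corpus, tgt_word)
    if cosine(src_vec, tgt_vec) >= context_threshold:
        return "likely equivalent"
    return "possible false friend"


# Invented placeholder data: "tarno" is a made-up word form shared by both sides.
seed = {"s_eat": "t_eat", "s_food": "t_food"}
src_corpus = [["s_eat", "tarno", "s_food"], ["s_food", "tarno", "s_eat"]]
tgt_same_usage = [["t_eat", "tarno", "t_food"]]
tgt_other_usage = [["t_car", "tarno", "t_road"]]

print(classify_pair("tarno", "tarno", src_corpus, tgt_same_usage, seed))
print(classify_pair("tarno", "tarno", src_corpus, tgt_other_usage, seed))
```

In this sketch, identically spelled words that share translated contexts are labelled as likely equivalents, while identically spelled words with disjoint contexts are flagged as possible false friends. A realistic system would need far larger corpora, weighted context features, and thresholds tuned on evaluation data.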
