Name | Modified | Size | Downloads / Week |
---|---|---|---|
First document | 2014-10-29 | ||
Second document | 2014-10-29 | ||
README.txt | 2014-10-29 | 742 Bytes | |
Totals: 3 Items | | 742 Bytes | 0 |
The first document contains the annotated text files. There are six genres, each with 10 text files of approximately 350-500 words. Words preceded by two brackets were judged "almost cognates" by our bilingual annotator, while words preceded by a single bracket were judged "obvious cognates". Both are cognates, but opinions may differ slightly on the "almost cognates".

The second document is a list of the online resources from which we collected our true cognates.

When using the annotated text files, please cite the following paper: Haoxing Wang and Laurianne Sitbon. Multilingual lexical resources to detect cognates in non-aligned texts. Australasian Language Technology Association, 2014.
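
The bracket convention used in the annotated files can be read programmatically. The following is only a minimal sketch, not a tool shipped with this data. It assumes that the bracket character is "[" and that it is attached directly to the front of the word (for example "[[word" or "[word"); neither detail is stated in this README, so adjust BRACKET after inspecting the files. The function name extract_cognates is hypothetical.

```python
import re

# Assumption: the marker is "[" written directly in front of the word.
# Change BRACKET if the annotated files use a different character.
BRACKET = "["

# One or two brackets at the start of a token, followed by the word itself.
TOKEN_RE = re.compile(
    r"(?<!\S)(" + re.escape(BRACKET) + r"{1,2})(\w[\w'-]*)"
)


def extract_cognates(text):
    """Return two lists: (obvious cognates, almost cognates)."""
    obvious, almost = [], []
    for marks, word in TOKEN_RE.findall(text):
        if len(marks) == 2:
            almost.append(word)   # two brackets: "almost cognate"
        else:
            obvious.append(word)  # one bracket: "obvious cognate"
    return obvious, almost


if __name__ == "__main__":
    # Made-up sample sentence, not taken from the data set.
    sample = "The [hospital is [[important for the nation."
    obvious, almost = extract_cognates(sample)
    print("obvious:", obvious)
    print("almost:", almost)
```

Keeping the two classes in separate lists makes it easy to evaluate with or without the "almost cognates", on which annotators may disagree.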