The ICONV/OCONV keywords do not work on words in the private dictionary. Such words are usually saved by applications independently.
I think Hunspell::add() and Hunspell::add_with_affix() need to apply the ICONV conversion.
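To illustrate what I mean, here is a rough sketch of the kind of input conversion an application would have to do itself before adding a private word. It is not Hunspell's code: apply_iconv and the table are hypothetical names, and I assume simple greedy longest-match replacement, which may differ from what Hunspell really does with ICONV.

```python
def apply_iconv(word, table):
    """Rewrite word using a pattern -> replacement table (hypothetical helper).

    Assumption: at each position the longest matching pattern wins;
    characters that match no pattern are copied unchanged.
    """
    patterns = sorted(table, key=len, reverse=True)  # try longest patterns first
    out = []
    i = 0
    while i < len(word):
        for pat in patterns:
            if pat and word.startswith(pat, i):
                out.append(table[pat])
                i += len(pat)
                break
        else:
            # no pattern matched at position i
            out.append(word[i])
            i += 1
    return "".join(out)
```

The converted string is what would then be handed to Hunspell::add(), so that private words are stored in the same form as the words in the dic file.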
Hi, also thanks for this report.
A note about Korean spell checking possibilities in OpenOffice.org: CWS hunspell4thesaurus with Hunspell 1.2.8 is ready for QA. I hope OOo 3.0.1 will be released with Hunspell 1.2.8, so that you can use ICONV/OCONV for your dictionary. You can try the working test builds here:
Thanks for the news. I've already tried the dynamically-linked OOo Debian package and libhunspell from 1.2.8. Here is a successful screenshot:
It is still far from real use. (It's a heavy job to write Korean affix rules.) But it's being improved.
There is a quick method to develop the first version of the Korean spelling dictionary:
1. Download Korean Wikipedia from download.wikipedia.org (>80 thousand articles, http://download.wikimedia.org/kowiki/20081126/kowiki-20081126-pages-articles.xml.bz2)
2. Extract page texts and convert to jamo
3. Use affixcompress (hunspell/src/tools) on the (LC_ALL=C) sorted word list, and convert the result (aff and dic file) to Hangul.
In fact, this is a compression of the words of the Korean Wiki, including all uncommon words. A future version of affixcompress will support filtering of uncommon words and statistical classification (~describe real morphology) of the words of (agglutinative) languages.
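Steps 2 and 3 could be prepared roughly like this. It is only a sketch under my own assumptions: the function names are made up, "jamo" is taken to mean Unicode NFD decomposition into conjoining jamo (your dictionary may use another jamo encoding), and the input is already plain text extracted from the XML dump. Sorting by UTF-8 bytes stands in for the LC_ALL=C sort.

```python
import re
import unicodedata

# Runs of precomposed Hangul syllables (U+AC00..U+D7A3)
HANGUL_WORD = re.compile("[\uac00-\ud7a3]+")

def to_jamo(text):
    """Decompose Hangul syllables into conjoining jamo (Unicode NFD)."""
    return unicodedata.normalize("NFD", text)

def to_hangul(text):
    """Recompose conjoining jamo into Hangul syllables (Unicode NFC)."""
    return unicodedata.normalize("NFC", text)

def jamo_word_list(text):
    """Return the unique Hangul words of text in jamo form, byte-sorted."""
    words = {to_jamo(w) for w in HANGUL_WORD.findall(text)}
    # Comparing UTF-8 bytes gives the same order as sort under LC_ALL=C
    return sorted(words, key=lambda w: w.encode("utf-8"))
```

The list from jamo_word_list(), written one word per line, is the input for affixcompress. to_hangul() is for converting the words in the resulting aff and dic files back afterwards.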
It looks promising. I'll try it.
But the Korean Wikipedia lacks a large set of words and affixes, because many Korean words take different agglutinations depending on the speaker/audience relationship, while all the Wikipedia articles have a consistent style. I think such a dictionary will be good for reports or news articles, but not for other types of text.