comparing spellcheckers

giomach
2013-05-18
2013-06-08
  • giomach
    2013-05-18

    Which, if any, spellcheckers can be configured to act like this?  (The examples are from English but the real need comes from other languages.)

    First, if necessary, allow the dictionary to contain words with apostrophe "'" and hyphen "-" in any position.

    Now, when checking text:

    1. Accept a word containing a hyphen if the dictionary contains EITHER the whole word including the hyphen ("hotch-potch") OR both parts separately ("half-moon").

    2. With a dictionary containing "'twas" but not "twas", accept "'twas".

    3. With a dictionary containing "well" but not "'well", reject "'well".
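
The three rules above can be pinned down as a small reference check, which may be handy when comparing the output of different spellcheckers. This is only a sketch of the desired behaviour, not how Aspell, Hunspell or MSWord work internally. The names `accepts` and `dictionary` are made up for illustration, the dictionary is assumed to be a plain set of exact strings, case is not handled, and rule 1 is generalised from "both parts" to "every part" for words with more than one hyphen.

```python
def accepts(word, dictionary):
    """Return True if `word` passes the three rules listed above.

    `dictionary` is assumed to be a set of exact strings in which
    apostrophes and hyphens are ordinary word characters.
    """
    # Whole-word match: covers "hotch-potch" and "'twas" when listed as such.
    if word in dictionary:
        return True
    # Rule 1: a hyphenated word is also accepted when every part is listed.
    if "-" in word:
        parts = word.split("-")
        return all(part and part in dictionary for part in parts)
    # Rules 2 and 3: apostrophes are never stripped, so "'well" is not
    # accepted just because "well" is listed, and "twas" is not accepted
    # just because "'twas" is listed.
    return False


# Hypothetical dictionary built from the examples in the question.
dictionary = {"hotch-potch", "half", "moon", "'twas", "well"}

for w in ["hotch-potch", "half-moon", "'twas", "twas", "well", "'well"]:
    print(w, "ok" if accepts(w, dictionary) else "REJECTED")
```

Running the same word list through a real spellchecker and comparing against this table shows quickly which of the three tests it passes.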

    In my experiments, Aspell (0.50-3 on Windows) fails all these tests.  But perhaps I'm missing a configuration step, or perhaps other implementations behave differently.
    The spellchecker in MSWord passes all the tests, but unfortunately it is not available for use in other applications.
    I haven't been able to test Hunspell.

    I would be interested to hear anyone else's evidence or experience.

     
  • Esben Aaberg
    2013-06-08

    I think this is what you are looking for in Hunspell:

    test_TEST.aff:
    SET UTF-8
    WORDCHARS -'

    test_TEST.dic:
    2
    'twas
    well

    echo twas \'twas well \'well |hunspell -d test_TEST -G
    'twas
    well

    echo twas \'twas well \'well |hunspell -d test_TEST -l
    twas
    'well