#163 SIMPLIFIEDTRIPLE and COMPOUNDMIN issue

open-fixed
nobody
None
3
2010-10-20
2010-10-20
No

I felt like reopening 3090957. I thought it was "fixed" by rearranging some flags in the .aff, but now it doesn't work again.

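When SIMPLIFIEDTRIPLE is used together with COMPOUNDMIN, the COMPOUNDMIN check is off by one: the length compared against it is one less than the real length of the second part, so the limit behaves differently from what is defined in the .aff file.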

Example:
.aff
COMPOUNDMIN 4
SIMPLIFIEDTRIPLE

.dic
kontroll/
kork/
ork/
trafikk/

$ echo trafikkork trafikkkork trafikkontroll | hunspell -d nb_NO
says:
& trafikkork 8 0: trafikk ork, trafikk-ork […snip…]
& trafikkkork 6 11: trafikk kork, trafikk-kork […snip…]
*

BUT if I change COMPOUNDMIN to 3, then trafikkork is also checked OK (*), and I get the correct suggestion "trafikkork" for the triple-k word.

It seems hunspell currently "fixes" the word and identifies it as SIMPLIFIEDTRIPLE, and then tries to check

1) "trafikk" 2) "ork" -- oops, "ork" is shorter than COMPOUNDMIN. -> Give up and suggest "space split" and "dash split".

If the word has been identified as SIMPLIFIEDTRIPLE, I would suggest splitting it /before/ checking whether a part is shorter than COMPOUNDMIN:
1) trafikk/ork check first part
2) trafik/kork check second part [if it's at least COMPOUNDMIN characters long]
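
The suggested order of checks can be sketched as a toy model. This is not Hunspell's code: `toy_compound_ok` and its parameters are hypothetical names made up for illustration, it only handles two-part compounds of ASCII words, and it does not model any other compound flags. The point it shows is that the COMPOUNDMIN comparison is applied to the second part after the dropped letter has been restored.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Toy model (hypothetical, not Hunspell's implementation): accepts `word` if it
// splits into two dictionary words that are each at least `cmin` bytes long.
// With `simplified_triple`, a split right after a doubled letter may also
// restore the dropped third letter at the start of the second part.
bool toy_compound_ok(const std::string& word, const std::set<std::string>& dict,
                     std::size_t cmin, bool simplified_triple) {
    for (std::size_t i = 1; i < word.size(); ++i) {
        const std::string first = word.substr(0, i);
        if (first.size() < cmin || dict.count(first) == 0) continue;

        const std::string rest = word.substr(i);
        // Plain split: first + rest.
        if (rest.size() >= cmin && dict.count(rest) != 0) return true;

        // Simplified triple: the first part ends in a doubled letter, so the
        // second part may have lost that letter ("trafikk" + "kork").
        if (simplified_triple && i >= 2 && word[i - 1] == word[i - 2]) {
            const std::string restored = word[i - 1] + rest;
            // The length check happens on the restored part, not on "ork".
            if (restored.size() >= cmin && dict.count(restored) != 0) return true;
        }
    }
    return false;
}
```

With the four .dic entries above and a minimum of 4, this sketch accepts both trafikkork (as trafikk + kork) and trafikkontroll when the simplified-triple option is on, and rejects both when it is off.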

Discussion

  • Arno Teigseth - 2010-10-20
    • priority: 5 --> 3
    • status: open --> open-fixed
     
  • Arno Teigseth - 2010-10-20

    A fix that works for me is:

    changing line 1494 in affixmgr.cxx, function compound_check

    from:
    for (i = cmin; i < cmax; i++) {

    to:
    for (i = cmin; i < cmax + 1; i++) {

    Please note that I don't know the side effects of that +1, though. Beware. But it fixes this bug :D
    output now:
    $ echo trafikkkork | hunspell -d nb_NO
    Hunspell 1.2.12
    & trafikkkork 7 0: trafikk-kork, trafikk kork, trafikkork, …
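
A possible reading of why the +1 helps, as a sketch only. It assumes that `cmin` equals COMPOUNDMIN and that `cmax` is the word length minus COMPOUNDMIN plus 1; neither is stated in this report, so treat both as assumptions. `toy_split_positions` is a hypothetical helper, not a Hunspell function. It just lists the split positions the loop would visit.

```cpp
#include <vector>

// Hypothetical helper (not part of Hunspell): returns the split positions i
// visited by `for (i = cmin; i < cmax + extra; i++)`.
// Assumption: cmin = cpdmin and cmax = len - cpdmin + 1.
std::vector<int> toy_split_positions(int len, int cpdmin, int extra) {
    const int cmin = cpdmin;
    const int cmax = len - cpdmin + 1;
    std::vector<int> positions;
    for (int i = cmin; i < cmax + extra; i++) {
        positions.push_back(i);
    }
    return positions;
}
```

Under those assumptions, "trafikkork" (10 letters) with COMPOUNDMIN 4 is only split at positions 4, 5 and 6, so the split after "trafikk" (position 7) is never tried. With the +1, or with COMPOUNDMIN 3, position 7 is included, which matches the behaviour described above.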

     
  • Nobody/Anonymous

    Tracker.. Reposted it :)