#45 Aspell flags "doesn’t"

closed-fixed
nobody
None
4
2011-01-18
2011-01-14
No

One possible replacement being "doesn't". However, I do not want to write "doesn't". I ended up writing "doesn‘t" several times, flagged in Firefox but perfectly good for Aspell…

Discussion

  • Kevin Atkinson

    Kevin Atkinson - 2011-01-14

    I think this is a Firefox or Hunspell specific issue. doesn't is in the dictionary.

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-14
    • status: open --> closed-fixed
     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-14

    This bug report is really confusing. On my browser all three forms of the single quote look almost identical. Here is the problem as I see it:
    Both Aspell and Firefox/Hunspell accept only the first (ASCII) of these three forms:
    doesn't (' = U+0027) ASCII
    doesn’t (’ = U+2019) Unicode, correct
    doesn‘t (‘ = U+2018) Unicode, wrong
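
    Since the three forms are hard to tell apart by eye, here is a small Python sketch (not part of Aspell or Hunspell) that prints the code point and Unicode name of the apostrophe in each spelling. The helper name describe_apostrophe is made up for this illustration.

```python
import unicodedata

# The three spellings from the list above, written with explicit escapes
# so the difference does not depend on the font.
FORMS = ["doesn\u0027t", "doesn\u2019t", "doesn\u2018t"]


def describe_apostrophe(word):
    """Return (code point, Unicode name) for the character after 'doesn'."""
    ch = word[5]
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for form in FORMS:
    code_point, name = describe_apostrophe(form)
    print(form, code_point, name)
```

    U+0027 is named APOSTROPHE, U+2019 is RIGHT SINGLE QUOTATION MARK and U+2018 is LEFT SINGLE QUOTATION MARK.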

    You don't want to use doesn't (ASCII); you want to use doesn’t (Unicode), correct?

    There is not a simple or clean solution to this problem, and the solution for Aspell and Firefox/Hunspell will likely be different.

    (1) Shall we let both in the dictionary? Well then, whenever someone misspells a word with a ' in it, both forms will appear, and they might look identical to the user (depending on the font). The user will think this is a bug and randomly select one of them. The end result is a mix of ', some ASCII and some Unicode.

    Or, (2) Shall we convert the Unicode ' to ASCII? Well, the spell checker will accept it, but then if you misspell "doesn't" it will only offer the ASCII one. Now we can convert the ASCII back to Unicode, but for most users ASCII is the preferred form.

    (2) is a better solution, but only if the back conversion (ASCII back to Unicode) is an option selected by the user or somehow automatically detected (harder).
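
    A rough Python sketch of what solution (2) could look like, assuming a toy word list and a deliberately naive suggester. None of this is Aspell or Hunspell code; DICTIONARY, is_correct, suggest and the prefer_unicode switch are hypothetical names used only to show the forward conversion on lookup and the optional back conversion on suggestions.

```python
ASCII_APOS = "\u0027"
UNICODE_APOS = "\u2019"

# Toy word list standing in for a real dictionary (an assumption).
DICTIONARY = {"doesn't", "don't", "does"}


def normalize(word):
    """Map the Unicode apostrophe (U+2019) to ASCII (U+0027) before lookup."""
    return word.replace(UNICODE_APOS, ASCII_APOS)


def is_correct(word):
    """Accept a word if its ASCII-apostrophe form is in the dictionary."""
    return normalize(word) in DICTIONARY


def _key(word):
    # Naive matching key: drop apostrophes and ignore case.
    return normalize(word).replace(ASCII_APOS, "").lower()


def suggest(word, prefer_unicode=False):
    """Suggest dictionary words that match once apostrophes are ignored.

    With prefer_unicode=True the ASCII apostrophe in each suggestion is
    converted back to U+2019 (the user-selected back conversion).
    """
    hits = sorted(w for w in DICTIONARY if _key(w) == _key(word))
    if prefer_unicode:
        hits = [w.replace(ASCII_APOS, UNICODE_APOS) for w in hits]
    return hits
```

    With this sketch, "doesn’t" (U+2019) is accepted and "doesn‘t" (U+2018) is still rejected. A misspelling such as "doesnt" yields the ASCII form by default and the Unicode form only when prefer_unicode is set.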

    Aspell also has the problem that it doesn't treat ’ (Unicode) as a valid word character. I will eventually fix this, but it's not a high priority because it will only be a partial solution.

    I will let the Hunspell author speak for it.

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-14
    • priority: 5 --> 4
    • status: closed-fixed --> open
     
  • Christopher Yeleighton

    Aspell flags "doesn’t" within KWrite, and also when called explicitly with LANG=en_US.utf8; however, in the latter case it does not come up with the suggestion "doesn't". OTOH, it rejects "doesn‘t" as well (which is good).

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-15

    I guess it depends on who does the tokenization. Aspell's internal tokenizer will treat "doesn’t" (Unicode) as two words.
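
    To illustrate the effect, here is a minimal regex tokenizer in Python. It is not Aspell's tokenizer, only a sketch under the assumption that a word is a run of ASCII letters optionally joined by an allowed in-word character; the extra_word_chars parameter is hypothetical.

```python
import re


def tokenize(text, extra_word_chars=""):
    """Split text into words.

    The ASCII apostrophe (U+0027) is always allowed inside a word;
    any characters in extra_word_chars are allowed there as well.
    """
    inner = re.escape("'" + extra_word_chars)
    pattern = rf"[A-Za-z]+(?:[{inner}][A-Za-z]+)*"
    return re.findall(pattern, text)
```

    Here tokenize("doesn’t") returns two tokens, "doesn" and "t", while tokenize("doesn’t", "\u2019") keeps the word in one piece once U+2019 counts as a word character.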

    Also, please explicitly state which ' you are using, otherwise it is very confusing to read. I have to nearly double my font size in order to tell the difference.

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-15

    And it appears that newer versions of Hunspell (at least as used with Firefox) partly implement solution (2) (Chrome rejects doesn’t (Unicode), however). That is, it will accept doesn’t (Unicode), but if you misspell it, it will only suggest the ASCII version.

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-15

    Assuming you are using Aspell 0.60, try replacing the existing iso-8859-1.cmap with the attached one. Use "aspell --config data-dir" to find out the location. (Of course, back up the original.) With this, Aspell should accept the Unicode '.

    If you want suggestions to always contain the Unicode version, then add the config option "norm-form hack" somewhere (see the Aspell manual) or use "--norm-form=hack" on the command line. If you spell check in other languages this option will likely break things because, as the name implies, this is a hack and will only work with languages using the iso-8859-1 charset internally. Things might also go wrong if you try to check a non-Unicode document with the option enabled (it will always try to map U+0027 to U+2019, which, if it doesn't exist in the target charset, will then get mapped to '?').
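
    The '?' failure mode can be mimicked with Python's own codecs. This is only an analogy, assuming Python's "replace" error handler behaves like the fallback described above; it does not call Aspell, and hack_norm and to_charset are hypothetical helper names.

```python
def hack_norm(text):
    """Always map the ASCII apostrophe (U+0027) to U+2019."""
    return text.replace("\u0027", "\u2019")


def to_charset(text, charset):
    """Round-trip text through a charset; unencodable characters become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


# U+2019 does not exist in ISO-8859-1, so the apostrophe is lost there,
# while a Unicode target such as UTF-8 keeps it.
latin1_result = to_charset(hack_norm("doesn't"), "iso-8859-1")
utf8_result = to_charset(hack_norm("doesn't"), "utf-8")
```

    In this sketch latin1_result is "doesn?t", whereas utf8_result keeps the Unicode apostrophe.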

    I will try to get the first change into the next version of Aspell 0.60 (which I hope to release sometime before the end of the month). The "hack" norm form probably won't make it in unless I can figure out how to make it less of a hack (unlikely).

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-18

    I just committed the first part of the last change to Aspell 0.60. I am considering this bug closed.

    This bug really belongs with the Aspell project anyway, as it is not a dictionary issue.

    Being able to suggest with the Unicode ' is a separate issue and should be filed as a Feature Request for Aspell.

     
  • Kevin Atkinson

    Kevin Atkinson - 2011-01-18
    • status: open --> closed-fixed
     
