#185 Sentence begin error detection not working in some cases

2.1
closed-fixed
None
3
2013-09-12
2013-07-02
No

Summary

The previous version of LanguageTool we used was 1.8. We have now upgraded to release 2.2. Since then, some of our unit tests fail because the sentence begin error detection no longer works properly. Depending on the entire text to be checked, in some cases LanguageTool doesn't detect that a sentence starts with a lowercase letter.

How can you reproduce the issue?

  • Visit the LanguageTool homepage: http://www.languagetool.org/
    • At the top of the page you can enter the text to be checked
  • Switch the language to German
  • Enter the following text:

    etwas beginnen. und der auch nicht

  • Let LT check the text
  • Result: Only "etwas" will be marked as sentence begin error, but not "und"

  • Enter the following text:

    etwas beginnen. und der auch nicht. weil dann gehts fast

  • Result: Now "etwas" and "und" are marked as wrong, but "weil" is missing

  • Switch the language to English

  • Enter the following text:

    this is a sentence. and this as well.

  • Result: "this" is marked as wrong, while "and" is not

  • Enter the following text:

    this is a sentence. and this as well. what is going on

  • Result: "this" and "and" are marked, but "what" is missing

  • Enter the following text:

    this is a sentence. and this as well. what is going on?

  • Result: Now, each sentence begin error is marked

Discussion

  • Andreas PAX Lück

    This is a fatal issue in our environment because right now, the grammar check's appearance is inconsistent.

     
  • Andreas PAX Lück

    One further strange behavior:

    • Enter the following German text:

      etwas beginnen. und der auch nicht. weil das geht nicht

    • Result: All three sentence begin errors are detected
    • Now remove the very last word ("nicht"):

      etwas beginnen. und der auch nicht. weil das geht

    • Result: Sentence begin error "weil" isn't detected.
     
  • Jaume Ortolà i Font

    I have taken a look at the code. When you end a sentence properly with a punctuation mark (.?!…), everything is OK. This behavior was introduced (in some languages) in order to avoid false alarms in tables (which can contain sentences starting in lower case and without a punctuation mark at the end). Any solution has pros and cons, as there is no way to know if a sentence has to end with a punctuation mark (titles, tables, etc., usually do not).

     
  • Andreas PAX Lück

    Hi Jaume.

    Thanks for your analysis!

    Suppose we are within a table cell and somebody writes the following:

    Etwas beginnen. und der ist welcher dort war ohne zu gucken oder zu blinken

    Then the writer forgot the punctuation mark at the end, and there is also a sentence begin error ("und"). In my environment, the grammar checker would indicate no errors at all. Our customers wouldn't accept this behavior.

    Would it be imaginable to make this behavior configurable?

     
  • Jaume Ortolà i Font

    Hi Andreas,

    I think you are right in your claim. The errors in your examples should definitely be detected. But I think that making this behavior configurable is not a good solution. We'll try to find the "least bad" solution.

     
  • Andreas PAX Lück

    Hello Jaume.

    Thank you very much! Your fix solved my problem! Good work! :-)

    Cheers

    PAX

     

    Last edit: Andreas PAX Lück 2013-07-03
  • Jaume Ortolà i Font

    Well, this is not definitive. The current solution causes a lot of false alarms in Wikipedia checks. I will try a midway solution, and I hope we can reach an agreement among the maintainers.

     
  • Emmeran Seehuber

    I get some false positives because the sentences are not split correctly. For this German text:

    Wir heiraten am 19.09.12 und würden uns freuen, wenn Ihr diesen ganz besonderen Tag mit uns feiern würdet. Zur kirchlichen Trauung in der St. Wolfgangkirche in Regensburg laden wir um 11:00 Uhr ein.
    

    I get a warning that "Zur" should be lowercase, because it is not at the beginning of a sentence. I don't get this error when using the spellchecker on www.languagetool.org. But when I use the fallback form (http://www.languagetool.org/simple-check/) I can reproduce this problem.

    Is there any way I can work around this problem? Or is there any reason why the spellchecker on the homepage works, but not on the fallback form? (I'm using LanguageTool 2.2 from Maven Central.)

    Thanks

     
  • Daniel Naber

    Daniel Naber - 2013-07-09

    Emmeran, I cannot reproduce either of your issues. Which browser are you using? Which language did you select? If spell checking doesn't work, that's often because a language without a variant ("German") has been selected instead of e.g. "German (Germany)". I suggest you open a new issue (or even better, post to the forum or mailing list) so this one doesn't get too crowded.

     
  • Daniel Naber

    Daniel Naber - 2013-09-12

    Jaume, can this issue be closed?

     
  • Jaume Ortolà i Font

    Daniel,
    I think it can be closed. The current solution is not perfect, but it is reasonably good.

     
  • Daniel Naber

    Daniel Naber - 2013-09-12
    • status: open --> closed-fixed
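
The trade-off Jaume describes in the discussion (skip text that does not end in a punctuation mark so that table cells and titles do not raise false alarms, at the cost of missing real errors) can be illustrated with a small Python sketch. This is not LanguageTool's code and does not reproduce every result reported above; the splitter is deliberately naive, and the names `split_sentences`, `lowercase_starts` and `skip_unterminated` are hypothetical, chosen only for this illustration.

```python
import re

# Characters treated as sentence-ending punctuation, as listed in the discussion.
SENTENCE_END = ".?!…"


def split_sentences(text):
    """Naive splitter: cut after ., ?, ! or … when followed by whitespace."""
    parts = re.split(r"(?<=[.?!…])\s+", text.strip())
    return [part for part in parts if part]


def lowercase_starts(text, skip_unterminated=True):
    """Return the first word of every sentence that starts in lower case.

    With skip_unterminated=True (an assumed switch, for illustration only),
    a sentence without closing punctuation is ignored, which avoids false
    alarms in table cells but also hides genuine errors.
    """
    flagged = []
    for sentence in split_sentences(text):
        if skip_unterminated and sentence[-1] not in SENTENCE_END:
            continue
        first_word = sentence.split()[0]
        if first_word[0].islower():
            flagged.append(first_word)
    return flagged


if __name__ == "__main__":
    cell = "Etwas beginnen. und der ist welcher dort war ohne zu gucken oder zu blinken"
    print(lowercase_starts(cell))                           # lenient mode
    print(lowercase_starts(cell, skip_unterminated=False))  # strict mode
    print(lowercase_starts("first name", skip_unterminated=False))
```

In the lenient mode the table-cell example from the discussion yields no findings, which is the behavior Andreas objects to; in the strict mode "und" is reported, but so is a harmless cell such as "first name". Neither setting is right for every input, which is why the thread settles on a compromise.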
     
