
#869 Tokenizer language discovery at runtime not working

4.1
closed-fixed
None
5
2017-06-23
2017-05-24
Tomi Pieski
No

Hi!
We are developing an FST-based tokeniser for OmegaT so that we can use our FST tools.
We noticed a bug in the way BaseTokenizer detects the language. The same problem may exist elsewhere, too.

getLanguage() has 'languages[0].equals(Tokenizer.DISCOVER_AT_RUNTIME)' which never returns true, because string equals() compares object reference.

All best,
Tomi

Discussion

  • Aaron Madlon-Kay

    I believe you have it backwards: equals() compares content, while == compares the reference. Indeed, BaseTokenizer.getLanguage() does use ==; I believe the assumption was that the string would be interned and thus a reference comparison would be better. However, when inspecting at runtime, it appears that the value from the annotation is a different object, so the reference comparison fails.
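
The difference between the two comparisons can be illustrated with a minimal, self-contained sketch. It is not the OmegaT code: the class `LanguageDiscoveryDemo`, its two helper methods, and the constant's value are hypothetical stand-ins for `Tokenizer.DISCOVER_AT_RUNTIME`, and `new String(...)` merely simulates a value that arrives as a separate object with the same content.

```java
/** Minimal illustration of reference vs. content comparison for strings. */
final class LanguageDiscoveryDemo {

    // Hypothetical stand-in for Tokenizer.DISCOVER_AT_RUNTIME; the real value is assumed.
    static final String DISCOVER_AT_RUNTIME = "DISCOVER_AT_RUNTIME";

    /** Reference comparison: true only if both variables point to the same object. */
    static boolean matchesByReference(String language) {
        return language == DISCOVER_AT_RUNTIME;
    }

    /** Content comparison: true whenever the characters are equal. */
    static boolean matchesByContent(String language) {
        return DISCOVER_AT_RUNTIME.equals(language);
    }

    public static void main(String[] args) {
        // new String(...) forces a distinct object with identical content.
        String fromElsewhere = new String(DISCOVER_AT_RUNTIME);

        System.out.println(matchesByReference(fromElsewhere));          // false
        System.out.println(matchesByContent(fromElsewhere));            // true
        // intern() returns the pooled instance, which is the constant itself.
        System.out.println(matchesByReference(fromElsewhere.intern())); // true
    }
}
```

Under these assumptions, a content comparison with equals() succeeds regardless of where the string object came from, whereas == only succeeds when the value happens to be the interned constant.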

     

    Last edit: Aaron Madlon-Kay 2017-05-25
  • Aaron Madlon-Kay

    • assigned_to: Aaron Madlon-Kay
     
  • Aaron Madlon-Kay

    • status: open --> open-fixed
     
  • Aaron Madlon-Kay

    This should now be fixed in trunk.

     
  • Aaron Madlon-Kay

    • summary: BaseTokenizer.getLanguage() change string.equals() to '==' --> Tokenizer language discovery at runtime not working
     
  • Didier Briel

    Didier Briel - 2017-06-23
    • status: open-fixed --> closed-fixed
     
  • Didier Briel

    Didier Briel - 2017-06-23

    Fixed in the released version 4.1.2 of OmegaT.

    Didier

     
