
#1306 OmegaT treats as repetitions segments that are not identical

Milestone: 6.0.1
Status: closed-invalid
Owner: nobody
Labels: None
Priority: 5
Updated: 2025-10-09
Created: 2025-09-12
Private: No

How come OmegaT thinks the following two segments are identical?

  • <x0> has received the highest number of emails (<x1>).</x1></x0>
  • <x0> has the highest number of recipients (<x1>).</x1></x0>

Something odd seems to happen with stemming, and it leads to problems like this. It also makes OmegaT automatically insert TM matches that are not 100%, although I have set it to insert only 100% matches.

Discussion

  • Epameinondas Soufleros

    By the way, I am using 6.0.1, but I could not find it in the list of "Milestones".

     
  • Jean-Christophe Helary

    We need a sample project to reproduce the issue.

     
  • Epameinondas Soufleros

    Can't share the project, sorry. It should be easy to reproduce, though.

    Right now, my source segment is:
    Books
    And the translation memory has:
    Bookings

    The translation of this TM hit is inserted in the editor as a 100% match.

    The lack of match percentages in the editor pane, along with the lack of a confirmation mechanism for segments, makes this behaviour highly dangerous and decreases my confidence in OmegaT for mission-critical work.
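For readers following the thread, here is a minimal Python sketch of how stemming could make "Books" and "Bookings" score as identical. It is a toy suffix-stripping stemmer, not OmegaT's tokenizer; the names `toy_stem`, `similarity` and `SUFFIXES` are hypothetical and exist only for this illustration.

```python
from difflib import SequenceMatcher

# Hypothetical suffix list for illustration; real stemmers are far more careful.
SUFFIXES = ("ings", "ing", "s")


def toy_stem(word: str) -> str:
    """Lower-case a word and strip the first matching suffix."""
    w = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def similarity(a: list[str], b: list[str]) -> int:
    """Token-level similarity between two token lists, as a whole percentage."""
    return int(SequenceMatcher(None, a, b).ratio() * 100)


source = "Books".split()
tm_entry = "Bookings".split()

# Compare once on stemmed tokens and once on the raw tokens.
stemmed = similarity([toy_stem(t) for t in source], [toy_stem(t) for t in tm_entry])
unstemmed = similarity(source, tm_entry)

print(stemmed, unstemmed)  # prints "100 0"
```

Both words reduce to `book`, so the stemmed comparison reports 100% while the raw comparison reports 0%. Whether OmegaT auto-inserts on the basis of a stemmed score is exactly what this thread disputes; the sketch only shows why a stemmed score and an exact score can differ.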

     
    • Jean-Christophe Helary

      We need a sample project. We don’t need you to share the original.
      Just create a source file with two different segments that are considered identical in OmegaT. Put that in a project with a corresponding one-segment TMX, and check that you can reproduce the issue before sending it to us.

      We’ve had OmegaT used in mission-critical work for more than 20 years by thousands of professional translators who depend on reliable tools to pay their bills, and we’re pretty responsive to bugs that we can reproduce.
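The files for such a sample project could be generated with a short Python script like the one below. This is only a sketch under assumptions: the function name `write_repro`, the file names `repro.txt` and `repro.tmx`, the target language `fr` and the placeholder translation are all hypothetical, and it assumes the plain-text filter yields one segment per line. The `source` and `tm` folder names follow OmegaT's default project layout; the project itself still has to be created in OmegaT.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree serializes this qualified name as the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def write_repro(root: Path, src_lang: str = "en", tgt_lang: str = "fr") -> tuple[Path, Path]:
    """Write a two-segment source file and a one-segment TMX under root."""
    source_file = root / "source" / "repro.txt"
    tmx_file = root / "tm" / "repro.tmx"
    source_file.parent.mkdir(parents=True, exist_ok=True)
    tmx_file.parent.mkdir(parents=True, exist_ok=True)

    # One segment per line: the two strings reported as being treated as identical.
    source_file.write_text("Books\nBookings\n", encoding="utf-8")

    # Minimal TMX 1.4 document with the header attributes the format requires.
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "repro-script",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    tu = ET.SubElement(body, "tu")
    # Single translation unit: "Bookings" with a placeholder target text.
    for lang, text in ((src_lang, "Bookings"), (tgt_lang, "PLACEHOLDER TRANSLATION")):
        tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
        ET.SubElement(tuv, "seg").text = text

    ET.ElementTree(tmx).write(str(tmx_file), encoding="utf-8", xml_declaration=True)
    return source_file, tmx_file


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp())
    src, tmx_path = write_repro(out_dir)
    print(src)
    print(tmx_path)
```

After copying the two files into a fresh project, opening it and checking whether "Books" is auto-filled from the "Bookings" entry shows whether the issue reproduces before the project is sent.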

       
    • Jean-Christophe Helary

      When OmegaT inserts a match above the set threshold, it also inserts a marker that indicates that the match was automatically inserted.

      To confirm that the insertion is valid, the user either manually removes the marker, or uses the "insert match" function to manually insert it.

      I’m not sure I understand what you mean by "the lack of match percentages in the editor pane". Which percentage should be displayed?

       
  • Epameinondas Soufleros

    Here it is. Just two segments. Once you translate "Bookings", it auto-completes it as a 100% match for "Books".

     
    • Jean-Christophe Helary

      Please, make sure that the project opens properly on your side.

       
  • Epameinondas Soufleros

    Let's just leave it here. The app does stemming for TM matches, which is clearly incorrect. If you don't think this is a bug, that's fine with me.

     

    Last edit: Epameinondas Soufleros 2025-10-09
  • Jean-Christophe Helary

    • status: open --> closed-invalid
    • Group: 6.0.2 --> 6.0.1
     
  • Jean-Christophe Helary

    If you have specific requests like:

    • make stemming optional
    • display some sort of match % somewhere in the Editor pane

    feel free to create an RFE.

    The project that you sent behaves properly, and OmegaT does not consider the two segments equivalent: if you put the first translated segment in a TMX inside /tm/auto, OmegaT will not automatically translate the second segment.
     
