#1406 Allow user to choose which MT suggestion to insert

4.1
closed-fixed
None
5
2018-11-14
2018-10-05
Marc Riera
No

Using OmegaT, I have noticed that currently it is impossible to select which Machine Translation result to insert when using more than one MT engine at the same time. The "Replace with Machine Translation" (Ctrl+M) option seems to always choose the first result.

Given how MT accuracy depends on many factors and how it can be useful to see different MT results while translating, it would be great to have the option to select which result to use and have more control over MT.

Thanks!

Related

Feature Requests: #816

Discussion

1 2 > >> (Page 1 of 2)
  • Thomas CORDONNIER

    Hi Marc

    Have you heard of DGT-OmegaT ( http://185.13.37.79/ )?
    Yes, it is a fork, but we have an agreement with the OmegaT team that I can help port my features to OmegaT 4, if they are interested.

    About your request, we already have something like this in DGT-OmegaT: http://185.13.37.79/?q=/node/28

    • you can set a preferred MT engine, which will be used during automatic insertion as well as during insertion using the menu "Replace with machine translation"
    • if you occasionally want to insert the result from another engine, you can right-click in the "machine translation" pane; this opens a menu that inserts the result from the engine whose translation you have just selected with the mouse.

    Tell us if one of these approaches would cover your needs for OmegaT 4. Then we will see if the core team is interested.

    Regards
    Thomas Cordonnier

     

    Last edit: Thomas CORDONNIER 2018-10-05
  • Marc Riera

    Marc Riera - 2018-10-09

    Hello Thomas,

    Thanks for your quick reply. I had not heard about DGT-OmegaT, but I have taken the time to try it out and explore its features, and it is great to see all the additions that make translation even easier.

    I have checked out the extra MT features you mentioned and I find the approach well-balanced. On the one hand, setting a preferred MT engine makes it possible to check the results by multiple engines and to have a way to quickly access a specific one (the user's preferred, probably because it is perceived as offering reliable results). On the other hand, the right-click option makes it possible to easily keep the translation flow when the preferred engine's result is not good enough (which sometimes happens). Two clicks are very fast compared to what has to be done in the latest main version (select MT text with mouse, copy all, click on segment, paste).

    The approach by DGT-OmegaT definitely covers my needs (and I bet other translators' needs too) and it would be very useful to have it included in the main version. I have no knowledge of Java to give you a hand, but if there is any other way I can help I will be glad to do it.

    Thanks,

    Marc Riera

     
    • Thomas CORDONNIER

      Hello Marc

      Thanks for your reply. It is not a problem that you don't know Java: DGT-OmegaT does not need new developers for the moment. What we need more is any kind of feedback (not necessarily always positive) about our specific features. Of course, don't post it in this thread; use either the contact form or a comment on DGT's web site instead. I did not write here to advertise, but only because you were asking for a feature which is present in our project but not in standard OmegaT for the moment.

      About this specific feature, let's note here that you agree with our approach. Now the only question is what the OmegaT core team thinks about it. I am not a member of the core team, so even though I can write the Java code, I do not have write access to the SourceForge repository, and whether my modifications are integrated does not depend on me. All we can do is wait for their reaction, or possibly contact them if nothing happens. Once they confirm what they want, I will take the patch from DGT-OmegaT and merge it into OmegaT 4. But publication of the result will then not be my responsibility.

      Regards
      Thomas

       
  • Didier Briel

    Didier Briel - 2018-10-11

    This RFE is more or less a duplicate of [#816]. I have nothing against the preferred machine translation and a right-click for exceptions, but I also like the idea of cycling through suggestions with Ctrl+M.

    Didier

     

    Related

    Feature Requests: #816


    Last edit: Didier Briel 2018-10-11
    • Thomas CORDONNIER

      Nice that you found this old RFE. I had simply not thought about the cycling option. First impressions:
      1 - it is a little harder to implement: menus are usually stateless, so to implement this we must store somewhere how many times the menu has been called, and not forget to reset this counter each time you move to another segment;
      2 - about the right-click menu, DGT-OmegaT actually implements two menus: replace and insert. Cycling is only possible for replacement, which is at least one reason to consider these two options as not mutually exclusive;
      3 - DGT-OmegaT can also auto-insert MT, as was already the case for translation memories ( http://185.13.37.79/?q=/node/5 ). The rule is that translation memories have priority; MT is only used if no match above the selected percentage was found.
      The "preferred MT engine" feature was first written for this auto-insertion system and only later extended to CTRL+M. That's why I am asking you first whether you are also interested in this feature (which we already discussed years ago, see the last comments on https://sourceforge.net/p/omegat/feature-requests/776/) or whether you prefer to put the "preferred MT engine" dropdown list in another pane.
      4 - currently, when you use CTRL+M you have no way to identify immediately which engine has been used: you must read the contents of the MT pane again and try to work out which of the strings has been inserted. In DGT-OmegaT, that was one of the first modifications I made, because my clients make a lot of use of various auto-insertion systems, but only if they can identify immediately which one has just been used. If you are interested in this idea, test it in DGT-OmegaT and tell me if you want it like that ( http://185.13.37.79/?q=/node/4 ), or whether you can think of another place to put this useful information.
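
      The counter described in point 1 could look roughly like the Java sketch below. It is only an illustration under stated assumptions, not code from OmegaT or DGT-OmegaT: the class MtCycler, its next method, and the use of an integer segment id are hypothetical names chosen for the example.

```java
import java.util.List;

/** Hypothetical helper: remembers how often "replace with MT" was called on one segment. */
class MtCycler {
    private int segmentId = -1; // segment the counter currently belongs to (assumed int id)
    private int calls = 0;      // number of calls made on that segment so far

    /** Returns the suggestion to insert for this call, or null if there are none. */
    String next(int currentSegmentId, List<String> suggestions) {
        if (currentSegmentId != segmentId) {
            // The user moved to another segment: reset the counter.
            segmentId = currentSegmentId;
            calls = 0;
        }
        if (suggestions.isEmpty()) {
            return null;
        }
        // Wrap around so repeated calls cycle through all suggestions.
        String chosen = suggestions.get(calls % suggestions.size());
        calls++;
        return chosen;
    }
}
```

      Keeping the segment id next to the counter means the reset happens inside the helper, so the menu action itself can stay stateless.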

      Tell me what you think about points 3 and 4. On my side, I can already prepare the "popup menus" patch (this one is independent of the other options) and study the others later.

      Regards
      Thomas

       

      Last edit: Thomas CORDONNIER 2018-10-11
      • Marc Riera

        Marc Riera - 2018-10-17

        Points 3 and 4 sound very exciting too and the implementation in DGT-OmegaT seems good enough for what I would like to see.

        There is only one aspect that I am not really sure about: as newer versions of OmegaT have only one configuration window with many sections, and one which is specifically "Machine Translation", wouldn't it make more sense to add the "preferred MT engine" selector there? Even if it gets separated from the other option ("auto-populate with MT", which I would expect to find in the "Editor" section), it feels more intuitive to find it there. It could even simply be a radio button next to the "Enabled" checkbox instead of a full dropdown list.

        Regards,

        Marc

         
        • Thomas CORDONNIER

          as newer versions of OmegaT have only one configuration window with many sections, and one which is specifically "Machine Translation", wouldn't it make more sense to add the "preferred MT engine" selector there?

          As I said, I wrote auto-insertion first, which is why the notion of a preferred engine was first put in a section associated with it. But if the core team is only interested in CTRL+M, then your position is the correct one.
          That is one reason why I am waiting for Didier's answer before writing any code.

          Regards
          Thomas

           
    • Thomas CORDONNIER

      I finally found a simple solution for cycling insertions. See the attachment.
      Now the preferred engine seems useless (it is definitely linked to auto-insertion only, which you do not have). But tell me if the popup menu would still be useful.

       
      • Kos Ivantsov

        Kos Ivantsov - 2018-11-12

        As a layman I should admit, great job, Thomas! This is what I have wanted for years. Would it also be possible to somehow indicate which MT result is being used, maybe by making it bold, similar to the way the active fuzzy match is indicated? Also, if there's a selection in the MT pane, it would be logical for the selected text to be inserted regardless of what has or has not been used previously. That would be in line with the way selection is treated in the Fuzzy Matches pane. With the ability to insert any MT result, I don't see any need for the context menu in the MT pane.
        I'm using your patch already, but looking forward to further improvements.

        --
        Kos

         
        • Thomas CORDONNIER

          Hi Kos

          I know that you did not see this discussion before, so I cannot blame you. In the previous messages I refer to the approach I took in DGT-OmegaT, which is totally different. But in the end I find Didier's cycling approach not bad at all, so I am proposing an implementation.
          About seeing where data comes from, see what I did in DGT-OmegaT (my clients use auto-insertion much more, so data can come from multiple locations, which is why it is important to see this information) and tell us what you think about my approach.
          As usual I am ready to port my work to OmegaT, but since we sometimes have different approaches, I want to be sure that OmegaT users agree with mine before starting the work.

          Regards
          Thomas

           
          • Kos Ivantsov

            Kos Ivantsov - 2018-11-12

            I did read all of the previous conversation, and my comment had to do with the way the old request (which Didier reminded us of) was implemented. I really like it: just simple cycling, without extra menus or user choices. I will surely look at the DGT version, but for now there's a little glitch in the current patch. When "Automatically Fetch Translations" is disabled, Ctrl + M doesn't fetch anything; it seems to be ignored. When the option is enabled, all works fine.

            --
            Kos

             

            Last edit: Kos Ivantsov 2018-11-13
            • Thomas CORDONNIER

              In one of the previous messages I said that once this is implemented, somebody would probably mention that it is difficult to identify which engine's result has been inserted. And this is exactly what you did. That's why I thought you had not read it.
              I mentioned DGT-OT only to emphasize that there is a solution to this (identifying which engine is used), which is implemented in it. I do not say that this is the ideal solution, but in order to make a choice, I prefer that people test it live rather than my trying to explain it in a long comment.

              somehow indicate which MT result is being used, maybe by making it bold,

              Yes, making it bold is another solution as well. Not so easy, because currently the MT pane does not store the location of each engine's result in the text, but not impossible.
              As said before, in DGT-OT I used another solution, for one reason: data can come from MT, but also from the matches pane. That means that if you press CTRL+M but suddenly change your mind and press CTRL+R, then I should unbold in the MT pane, and where should I put the information that the data now comes from translation memory? (In the Matches pane, bold is already used for a totally different thing.)
              As you can see, "bold" is not such a simple solution...

              When "Automatically Fetch Translations" is disabled, Ctrl + M doesn't fetch anything, it seems to be ignored.

              OK, I have found the bug and it is easy to correct. But first, I want to be sure about the expected behaviour. To be honest, I was a little surprised by how it worked before the patch: when you press CTRL+M it launches the search, but then you have to press CTRL+M a second time to do the insertion. I made a correction which restores exactly this behaviour; is this correct, or was it unexpected?

               
              • Kos Ivantsov

                Kos Ivantsov - 2018-11-13

                I mentioned DGT-OT only to emphasize that there is a solution to this (identifying which engine is used), which is implemented in it. I do not say that this is the ideal solution, but in order to make a choice, I prefer that people test it live rather than my trying to explain it in a long comment.

                Yes, using segment marker is real nice, but I guess, it needs to be discussed separately as it pertains not only to this case. If something like this is implemented (and I hope it will be), its use should be consistent in different scenarios.

                Yes, making it bold is another solution as well. Not so easy, because currently the MT pane does not store the location of each engine's result in the text, but not impossible.
                As said before, in DGT-OT I used another solution, for one reason: data can come from MT, but also from the matches pane. That means that if you press CTRL+M but suddenly change your mind and press CTRL+R, then I should unbold in the MT pane, and where should I put the information that the data now comes from translation memory? (In the Matches pane, bold is already used for a totally different thing.)
                As you can see, "bold" is not such a simple solution...

                I'm speaking just as a layman here, so I'm not implying how difficult or easy it is to implement, but I don't think you have to unbold anything in the MT pane. There can be several panes with bold in them. Now we have bold in Fuzzy Matches and that indicates the active match. At the same time we can have bold in Glossary pane, it indicates entries from the writable glossary. I don't use dictionary in OmegaT, but that pane probably can have bold simultaneously with other panes too, so MT pane can have bold as well, indicating the MT result which is going to be inserted should the user press Ctrl+M.

                OK, I have found the bug and it is easy to correct. But first, I want to be sure about the expected behaviour. To be honest, I was a little surprised by how it worked before the patch: when you press CTRL+M it launches the search, but then you have to press CTRL+M a second time to do the insertion. I made a correction which restores exactly this behaviour; is this correct, or was it unexpected?

                If autofetching is off, pressing Ctrl + M once "forces" grabbing MT results for the current segment, but it only populates the pane; it doesn't insert anything. Once the MT pane is populated, pressing Ctrl + M for the second time replaces target with the result that happened to be on the top (and with your patch it should cycle). If the pane is populated with cached results, even for untranslated segments that the user just jumped into, pressing Ctrl + M doesn't fetch new MT results, but simply replaces the target. Sorry for describing the expected behaviour again, but it's probably better to be annoying than misunderstood (and be assured it's done in a very friendly spirit).
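
                The expected behaviour described above can be summarised in a small Java sketch. This is a simplified model written for this explanation, not the patch itself: MtPaneModel, enterSegment and pressCtrlM are hypothetical names, and the fetch is modelled as a synchronous Supplier, whereas real engines answer asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model of the MT pane and the Ctrl+M shortcut. */
class MtPaneModel {
    private List<String> results = new ArrayList<>(); // what the MT pane shows
    private int nextIndex = 0;                        // position in the cycle

    /** Called when the user enters a segment; cached results may be empty. */
    void enterSegment(List<String> cachedResults) {
        results = new ArrayList<>(cachedResults);
        nextIndex = 0;
    }

    /**
     * Models one Ctrl+M press. Returns the text that replaces the target,
     * or null when the press only populated the pane.
     */
    String pressCtrlM(Supplier<List<String>> fetcher) {
        if (results.isEmpty()) {
            // Empty pane: the first press only fetches, nothing is inserted.
            results = new ArrayList<>(fetcher.get());
            return null;
        }
        // Populated pane (fetched or cached): replace and advance the cycle.
        String chosen = results.get(nextIndex % results.size());
        nextIndex++;
        return chosen;
    }
}
```

                With an empty pane, the first press returns null and the second returns the first result; with cached results, the first press already replaces the target.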

                 
                • Thomas CORDONNIER

                  If something like this is implemented (and I hope it will be), its use should be consistent in different scenarios.

                  What is implemented in DGT-OT is consistent with various scenarios (auto-insertion or using CTRL shortcuts, data coming from MT or TM or source, etc.). But I agree with you that this may be discussed separately: probably we should close this RFE once my patch is applied and open a new one whose title could be something like "find a solution to identify where inserted text comes from". Right?

                  so MT pane can have bold as well, indicating the MT result which is going to be inserted should the user press Ctrl+M.

                  But done that way, it provides nothing for the other scenarios (data coming from translation memory, for example), so if we later implement another solution, like the one in DGT-OT for example, this will no longer be consistent with the other scenarios.

                  Once the MT pane is populated, pressing Ctrl + M for the second time replaces target

                  OK, now that it is confirmed, I am sending the corrected patch (replacing the older one, not added to it!) as an attachment.

                  with the result that happened to be on the top

                  Wow, actually I did not really implement an order. The fact that the first result was always the one inserted in the previous version was simply a side effect of the fact that results are added to the MT pane in the order in which the threads finish: there was absolutely no implemented rule, and there is none in the patch either. In fact the order is not predictable; it is decided by Java. Does that prevent committing it?

                   

                  Last edit: Thomas CORDONNIER 2018-11-13
  • Marc Riera

    Marc Riera - 2018-11-13

    Great work, the approach is simple but effective, and it is really easy to use it.

    I think that a "Preferred MT engine" option still makes sense if we want to decide the order of MT suggestions (and consequently alter the Ctrl+M cycle). I also agree with Kos regarding the use of bold to highlight the current engine in the cycle.

    Regards,
    Marc

     
  • Kos Ivantsov

    Kos Ivantsov - 2018-11-13

    If something like this is implemented (and I hope it will be), its use should be consistent in different scenarios.

    What is implemented in DGT-OT is consistent with various scenarios (auto-insertion or using CTRL shortcuts, data coming from MT or TM or source, etc.). But I agree with you that this may be discussed separately: probably we should close this RFE once my patch is applied and open a new one whose title could be something like "find a solution to identify where inserted text comes from". Right?

    If automatically inserting MT results is disabled in DGT version, I don't see that segment marker is changed when the user manually selects which match to insert. So using bold/different color/other visual cue in the MT pane would work just as well, without requiring a mouse click (provided your patch for cycling is used), while providing the same information as to where the text comes from, and it would be in line with current OmegaT behavior in other panes.

    But what I actually meant in the previous comment is that in the DGT version the segment marker is used for other indications too, like segment status etc. I like the idea and have a few ideas/requests on how the use of that marker could be expanded even further, too. I can't speak from the development point of view, but to me as an end user it seems that just throwing in pieces of functionality (however useful, and I do think what you have done with the segment marker is in principle quite useful) without having a good plan of how it integrates with other things is a good recipe for making the program less user-friendly despite making it more functional. But I'm just sharing my not very educated guesses; Aaron and Didier and other devs are in a much better position to give their expert judgment.

    so MT pane can have bold as well, indicating the MT result which is going to be inserted should the user press Ctrl+M.

    But done that way, it provides nothing for the other scenarios (data coming from translation memory, for example), so if we later implement another solution, like the one in DGT-OT for example, this will no longer be consistent with the other scenarios.

    I'm not sure I understand how it conflicts with other scenarios. If data comes from local TM, but populates the MT pane (like your local engine the idea of which I just love), it should get "bolded" and be inserted in cycle just as all the other MT results. So whatever makes it into the MT pane, is included in the cycle. I admit, I'm slow most of the time, so if I misunderstood you, please bear with me.

    Once the MT pane is populated, pressing Ctrl + M for the second time replaces target

    OK, now that it is confirmed, I am sending the corrected patch (replacing the older one, not added to it!) as an attachment.

    with the result that happened to be on the top

    Wow, actually I did not really implement an order. The fact that the first result was always the one inserted in the previous version was simply a side effect of the fact that results are added to the MT pane in the order in which the threads finish: there was absolutely no implemented rule, and there is none in the patch either. In fact the order is not predictable; it is decided by Java. Does that prevent committing it?

    I can't say what qualifies for a commit, but the current behavior with randomly ordered results didn't stop people from using MT, so it's not a big deal, at least for me. Maybe if there was a way to implement an arbitrary sorting order for MT results, it would make it even more useful, at least more predictable.
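
    One possible way to get a predictable order is sketched below in Java. It is purely illustrative and not part of the patch: the MtResult and MtOrdering classes are hypothetical, and the rule used (an optional preferred engine first, then engines by name) is only an assumption borrowed from the "preferred MT engine" idea mentioned earlier in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical holder for one engine's result. */
class MtResult {
    final String engine;
    final String text;

    MtResult(String engine, String text) {
        this.engine = engine;
        this.text = text;
    }
}

class MtOrdering {
    /**
     * Returns a copy of the results in a fixed order, regardless of the
     * order in which the engine threads finished. preferredEngine may be
     * null, in which case results are sorted by engine name only.
     */
    static List<MtResult> stableOrder(List<MtResult> arrived, String preferredEngine) {
        List<MtResult> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator
                // false sorts before true, so the preferred engine comes first
                .comparing((MtResult r) -> !r.engine.equals(preferredEngine))
                .thenComparing((MtResult r) -> r.engine));
        return sorted;
    }
}
```

    Whichever rule is chosen, sorting a copy when the pane is filled keeps the Ctrl+M cycle the same from one segment to the next.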

     
    • Thomas CORDONNIER

      If automatically inserting MT results is disabled in DGT version, I don't see that segment marker is changed when the user manually selects which match to insert.

      You are right, thanks for the info. I can consider it a bug to be corrected in the next version.
      In practice the idea is that the info "data comes from MT" and "data comes from Matches" (or from the source, or from already translated text) should be in the same location, considering the fact that it is totally impossible that they are true at the same time.
      And that is not the case if we follow your option to show this in the MT pane: if we do so, when you have a doubt about where the data come from, you have to look at every pane individually until you see one of them with something in bold.

      I'm not sure I understand how it conflicts with other scenarios.

      Because if you want to mark the fact that a segment comes from translation memory, logically you should bold in the matches pane... but you can't, because here bold has a different meaning (active segment)

      like your local engine the idea of which I just love

      This local engine already has its own RFE: https://sourceforge.net/p/omegat/feature-requests/776/ . I submitted it to OmegaT years ago, but it was totally forgotten. Maybe it would be useful to "relaunch" the RFE?

      segment marker is used for other indications too, like segment status etc.

      No, this is not an "other indication": if you see "already translated", the data comes from the project memory (project_save.tmx); if you see "Machine translation", it comes from the MT pane. So in both cases the information you receive is where the content of the editing zone comes from, and nothing else. Maybe the term "already translated" should be replaced by "from project memory", to avoid confusion.

       I can't say what qualifies for a commit
      

      Then we should ask Didier or Aaron to have a look at it, now that it is ready. Didier says that he normally receives notifications of our discussions, but if he does not react, maybe we should write to him to say that a new patch is ready.

      if there was a way to implement an arbitrary sorting order for MT results,

      If I do so, the selected order will make some people happy and others not, and eventually we will have to make it configurable. Not a good idea; better to ask Didier and implement the order most users prefer.

       
  • Didier Briel

    Didier Briel - 2018-11-13

    I do receive notifications. I have included the patch in SVN (/trunk, [r10484]).

    (I agree with Kos that bold would be nice. Let's keep that for a future improvement.)

    Didier

     

    Related

    Commit: [r10484]

  • Didier Briel

    Didier Briel - 2018-11-13
    • status: open --> open-fixed
    • assigned_to: Thomas CORDONNIER
     
  • Kos Ivantsov

    Kos Ivantsov - 2018-11-13

    If automatically inserting MT results is disabled in DGT version, I don't see that segment marker is changed when the user manually selects which match to insert.

    You are right, thanks for the info. I can consider it a bug to be corrected in the next version.
    In practice the idea is that the info "data comes from MT" and "data comes from Matches" (or from the source, or from already translated text) should be in the same location, considering the fact that it is totally impossible that they are true at the same time.
    And that is not the case if we follow your option to show this in the MT pane: if we do so, when you have a doubt about where the data come from, you have to look at every pane individually until you see one of them with something in bold.

    I'm not sure I understand how it conflicts with other scenarios.

    Because if you want to mark the fact that a segment comes from translation memory, logically you should bold in the matches pane... but you can't, because here bold has a different meaning (active segment)

    Ok, now that I understand what all those different marks in the segment marker in the DGT version mean, I understand your point better. Thanks for explaining and sorry for being so slow (I warned you =). But still, while OmegaT lacks that functionality, having a visual cue in the MT pane would be in line with the way Fuzzy Matches behave. When the user inserts a fuzzy match, it's not indicated in the segment itself in OmegaT (mainstream version, not DGT). But seeing which one is marked bold in the Fuzzy Matches pane is really helpful; not having that hint would be rather confusing. Plus, even though I have bold in the Editor, in the Fuzzy Matches pane, and in the Glossary pane, I can't think of a time when I pressed a key combination and was wondering where that text in my target field came from. Pressing Ctrl+R is quite different from pressing Ctrl+M, or even Ctrl+Shift+I, at least for me (again, here I'm not saying anything about information in the segment marker being or not being helpful, I'm only speaking about what is currently available to non-DGT users). But before I press Ctrl+R or Ctrl+I to insert a fuzzy match, I need to see what I'm inserting. The same logic can be extrapolated to MT results. I'm presented with a few text blocks in the MT pane, and if one of them is somehow different from others to indicate that pressing Ctrl+M will make it appear in my current target field, it would make using it so much more intuitive.

    It probably means reworking the complete MT pane, and perhaps this is beyond the scope of this RFE. If what you have shown as the patch is acceptable (Didier and Aaron follow the discussion, no need to write to them separately), I'd be quite content with it even without all the extra goodies, as your current fix keeps things the way they were, but adds the requested cycling through suggestions.

    --
    Kos

     
    • Thomas CORDONNIER

      But seeing which one is marked bold in the Fuzzy Matches pane is really helpful;

      OK, so now consider the following scenario: you press CTRL+R to insert the current match, then CTRL+2 to select the next match (but without inserting it). From then on, what you see highlighted in the matches pane does not reflect what is inserted in the editor.
      By contrast, the contents of the segment marker reflect what is in the editor, even if in the meantime you changed the highlighting in the matches pane.

      But OK, now that the RFE is closed, maybe we should consider opening a new one to discuss this?

       
      • Kos Ivantsov

        Kos Ivantsov - 2018-11-13

        OK, so now consider the following scenario: you press CTRL+R to insert the current match, then CTRL+2 to select the next match (but without inserting it). From then on, what you see highlighted in the matches pane does not reflect what is inserted in the editor.
        By contrast, the contents of the segment marker reflect what is in the editor, even if in the meantime you changed the highlighting in the matches pane.

        As I said, I was only speaking about what is currently available in the mainstream OmegaT, not about the usefulness of what you've developed for segment markers, as it seemed to go beyond the scope of this RFE. After you've explained how the marker works in your version, I see your point much more clearly and even tend to agree that it might be helpful in a number of cases, though I personally don't recall ever wondering about the origin of the text in my target field after deliberately and knowingly inserting it there.

        But yes, using segment markers for additional indication is another RFE altogether, and I think there might be older ones about this. Let's stop the conversation here. Thanks a lot for developing a fix.

         
  • Konstantin

    Konstantin - 2018-11-14

    Hi all,
    I read about 80% of the above posts, so I hope I didn't miss one mentioning my following thought.

    What about using different colours?
    Every pane (TM, MT, glossary etc.) uses black bold to mark the active match. Then you decide to use e.g. match number 3 from the MT pane, and it becomes e.g. green; you have it in the editor for better reading, but you cycle through the other matches to see if you like another one better.
    This of course would mean that you have two markings in the pane at this given moment: green for the used match and black bold for the active one.

    To make it better, we could have e.g. bold red colour for cycling through the active pane, green for the used match and bold black for the active matches in the other panes.

    This way we have the following information at a glance:

    1. the matches that are selected in every pane (black bold)
    2. the active pane (red bold) (this makes sense only if you use a shortcut like CTRL+1, CTRL+2 etc. to cycle through matches, otherwise CTRL+R, CTRL+M etc. would suffice)
    3. the used match (green bold)

    I really do not know if this is technically feasible, but I am curious to see if my thought makes any sense to you.

    Best Regards
    Konstantin

     
  • Didier Briel

    Didier Briel - 2018-11-14

    As this RFE is now fixed, please create new ones or comment on open ones. For general discussions, you can also use the Yahoo user group.

    Didier

     
  • Didier Briel

    Didier Briel - 2018-11-14
    • status: open-fixed --> closed-fixed
     