#158 [1.6] Exact OmegaT version in TMX

1.6
closed-fixed
None
3
2006-07-23
2006-02-22
Anonymous
No

OmegaT should indicate precisely which version the TMX has been
created with.

Currently all the RCs are identified as 1.6 which could lead people into
thinking that all RCs are equivalent.

OmegaT should indicate the exact version (including RC number and, eventually,
build date) and produce an error message when a project is opened
with a non-matching OmegaT version (like "are you sure you want to open
this project with this version: TMX creation OmegaT version=..., current
OmegaT version=... there may be segmentation and TM problems if you
proceed").

I know it should be an RFE but with RC7 released and 1.6.0 close, it is very
important that users are informed of the possible issues related to
upgrading.

JC
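
A minimal sketch of the check requested above, assuming the version is stored in the `creationtoolversion` attribute of the TMX `<header>` (a standard TMX attribute). The sample header, the version strings such as "1.6 RC6", and the function names are hypothetical and only illustrate the idea; this is not OmegaT's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical TMX file; the version string format is an assumption.
SAMPLE_TMX = """<tmx version="1.1">
  <header creationtool="OmegaT" creationtoolversion="1.6 RC6"
          segtype="sentence" o-tmf="OmegaT TMX" adminlang="EN-US"
          srclang="EN-US" datatype="plaintext"/>
  <body/>
</tmx>"""


def tmx_creation_version(tmx_text):
    """Return (tool, version) read from the TMX <header>, or (None, None)."""
    header = ET.fromstring(tmx_text).find("header")
    if header is None:
        return None, None
    return header.get("creationtool"), header.get("creationtoolversion")


def version_alert(tmx_text, current_version):
    """Return an alert text if the TMX was written by another OmegaT version."""
    tool, created_with = tmx_creation_version(tmx_text)
    if tool != "OmegaT" or created_with == current_version:
        return None  # foreign TMX or same version: nothing to report
    return (
        "Are you sure you want to open this project with this version? "
        f"TMX creation OmegaT version={created_with}, "
        f"current OmegaT version={current_version}. "
        "There may be segmentation and TM problems if you proceed."
    )


print(version_alert(SAMPLE_TMX, "1.6 RC7"))
```

If every RC writes only "1.6" into the header, as described above, this comparison cannot tell the RCs apart, which is the point of the report.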

Discussion

  • Logged In: YES
    user_id=488500

    JC,
    It's OK to indicate the exact version number, but why should
    OmegaT produce an error message importing any TMX, and why
    on Earth on its own (!!!) TMX?

     
  • Logged In: YES
    user_id=915082

    I have a project created with RC6, I use RC7 with it: all the tagged segments are
    wrong so there _will_ be problems.

    An alert (not error, sorry :) message should be there to remind me that I have
    created the project TM with such version and that using a different version can
    result in project TM matching problems.

     
  • Henry Pijffers
    2006-02-23

    Logged In: YES
    user_id=545103

    I definitely agree that the user should be aware of the
    issue. It should be mentioned in the documentation, and I
    think the issue warrants an alert message too. However,
    it'll soon get obnoxious if you work with older TMs a lot,
    so I'd strongly advise adding the familiar "Do not bother me
    again" option to such a message dialog.

    Furthermore, this issue only appears when working with RC7
    or later with TMs created by RC6 or earlier, so before
    displaying any messages, that should be checked, so messages
    aren't displayed unnecessarily.
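
The condition described in this comment could be sketched roughly as follows. The version format "1.6 RC<n>", the `suppressed` flag standing in for the "Do not bother me again" option, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical version format "1.6 RC<n>"; the real strings may differ.
_RC = re.compile(r"RC\s*(\d+)", re.IGNORECASE)


def rc_number(version):
    """Extract the release-candidate number, or None if there is none."""
    match = _RC.search(version)
    return int(match.group(1)) if match else None


def should_warn(tm_version, app_version, suppressed=False):
    """Warn only for a TM from RC6 or earlier opened in RC7 or later."""
    if suppressed:  # the user ticked "Do not bother me again"
        return False
    tm_rc, app_rc = rc_number(tm_version), rc_number(app_version)
    if tm_rc is None or app_rc is None:
        return False  # unknown format: this sketch stays silent
    return tm_rc <= 6 and app_rc >= 7
```

A final release without an RC number is not handled here; a fuller check would have to order such versions after all release candidates.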

     
  • Henry Pijffers
    2006-02-26

    Logged In: YES
    user_id=545103

    Instead of displaying warning messages, can't these tag
    errors be automatically corrected? Is it possible to
    automatically convert the TMX to the new format?

     
  • Logged In: YES
    user_id=488500

    Hi Henry,

    Well... That's some work to rewrite tag definitions
    on-the-fly, and in case of TMX files in /tm folder OmegaT
    cannot (by contract) modify these files.

     
  • Henry Pijffers
    2006-02-27

    Logged In: YES
    user_id=545103

    I didn't intend to overwrite the TM files, but to apply the
    conversion in-memory only.

    So, is it possible? Can you renumber the tags in a segment
    correctly, and simply stick missing ones in front and behind?

     
  • Logged In: YES
    user_id=488500

    Not really, because there's no longer any information there
    (only, say, "<f3>"), but OmegaT can more or less reliably
    guess that.
    Not easy, though...

     
  • Henry Pijffers
    2006-02-27

    Logged In: YES
    user_id=545103

    If I encounter the following segment:

    <seg>Word</f0><f1>more words</f2>, plus <f3>more words</seg>

    I could very simply convert it to the following:

    <seg><f0>Word</f0><f1>more words</f1>, plus <f2>more
    words</f2></seg>

    You don't really need to know about the original tags, do you?
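
The conversion in this comment can be sketched as a single pass over the tags. This is only an illustration: `repair_tags` is a hypothetical name, the input is the content of one `<seg>` element, and standalone tags such as `<f6/>` are not handled.

```python
import re

TAG = re.compile(r"<(/?)f(\d+)>")


def repair_tags(segment):
    """Renumber <fN> tags into matched pairs, adding missing partners."""
    out = []          # rebuilt text and tags, in order
    prefix = []       # invented opening tags, outermost first
    open_stack = []   # new numbers of openers still waiting for a closer
    next_number = 0
    pos = 0
    for match in TAG.finditer(segment):
        out.append(segment[pos:match.start()])
        pos = match.end()
        if match.group(1):                      # closing tag
            if open_stack:
                number = open_stack.pop()
            else:                               # no opener: invent one in front
                number = next_number
                next_number += 1
                prefix.insert(0, f"<f{number}>")
            out.append(f"</f{number}>")
        else:                                   # opening tag
            number = next_number
            next_number += 1
            open_stack.append(number)
            out.append(f"<f{number}>")
    out.append(segment[pos:])
    # Close whatever is still open, innermost first.
    suffix = [f"</f{n}>" for n in reversed(open_stack)]
    return "".join(prefix + out + suffix)


print(repair_tags("Word</f0><f1>more words</f2>, plus <f3>more words"))
# <f0>Word</f0><f1>more words</f1>, plus <f2>more words</f2>
```

The sketch ignores the original numbers entirely and relies only on the order of opening and closing tags within one segment.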

     
  • Logged In: YES
    user_id=488500

    Henry, you are right. The only issue I see is detecting and
    fixing this problem if the paragraph was split into several
    segments. No dark magic, but not simple anyway...

     
  • Henry Pijffers
    2006-02-27

    Logged In: YES
    user_id=545103

    If you apply the same conversion when splitting paragraphs,
    then the problem is solved.

     
  • Logged In: YES
    user_id=488500

    Henry, probably you misunderstood me (or I misunderstood
    your last comment).

    Let me repeat myself. For a paragraph like
    "Word</f0><f1>more words</f2><f3>. Plus </f3>more words."
    OmegaT would create two segments:
    <seg>Word</f0><f1>more words</f2><f3>.</seg>
    <seg>Plus </f3>more words</seg>

    1. Renumbering <f3> in the second segment based on how many
    tags there were in the previous one is more or less
    simple: (n+1)/2 should almost always work.

    2. The difficult thing here is NOT to add <f3> to the
    beginning of the second segment, and not add </f3> to the
    end of the first one.

    3. Also difficult are standalone tags (<f6/> in current
    notation) -- earlier OmegaT versions didn't add a slash, and
    it's totally unclear how to detect them given only TMX files...
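
One possible reading of point 1, sketched under stated assumptions: n is taken to be the old tag number, the division is integer division, and old numbering starts at 0 as in the example paragraph. The function name `renumber` is hypothetical. The sketch only renumbers and never adds tags, which is what point 2 asks for across a segment boundary.

```python
import re

TAG = re.compile(r"<(/?)f(\d+)>")


def renumber(segment):
    """Map each old tag number n to (n + 1) // 2 without adding any tags."""
    def replace(match):
        slash, old = match.group(1), int(match.group(2))
        return f"<{slash}f{(old + 1) // 2}>"
    return TAG.sub(replace, segment)


first = renumber("Word</f0><f1>more words</f2><f3>.")
second = renumber("Plus </f3>more words")
print(first)   # Word</f0><f1>more words</f1><f2>.
print(second)  # Plus </f2>more words
```

Here the `<f2>` opened at the end of the first segment is closed by the `</f2>` in the second one, so no tag is invented at the boundary. If the old numbering started at 1 instead, the mapping would need adjusting, and standalone tags (point 3) remain unsolved.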

     
  • Henry Pijffers
    2006-02-27

    Logged In: YES
    user_id=545103

    Ok, you convinced me, I see the showstoppers now :(

     
  • Logged In: YES
    user_id=915082

    Besides the fact that the "old" numbering was <seg>blabla</f1>blabla</seg>,
    not "</f0>".

    To reply to earlier parts of the discussion:
    1) I don't think automatic conversion of anything without explicit user
    authorization is something we should do, unless we back up everything and
    create a new project TMX, but leave the old one intact.
    2) as for in which case the alert message should be displayed, we seem to
    already have something in log.txt, but as far as I know, before 1.6RCn we did
    not have segmentation rules, so basically _any_ TM before 1.6 may have
    issues. This can easily be identified by using the "segtype" info in the TMX.
    3) also I am only talking about the project TMX, not the reference TMX in
    /tm/; for those we have the log and for now that should be enough. Right now
    we have the possibility to take an old TM, rename it project_save.tmx, put it
    in /omegat/ and load it. Or we can switch versions (and that will be the case
    when a huge majority of users move from 1.4.5 to 1.6.0 stable), sometimes in
    the middle of a project. So this alert message will be displayed only in very
    limited cases where the project_save.tmx may end up having compatibility
    issues with the application settings. We are talking about a transition period,
    _and_ about cases where old TMs are deliberately used as project TM for
    convenience. Case one will be over in a few months hopefully and case two is
    rare enough that it only deserves an alert message so that people who start
    using OmegaT at 1.6.0 but who get old TM files know what is going on.

    As for automatically replacing the tags, I'd say that fits the "Tools" menu as
    a "convert old OmegaT TM" item or something.
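
The "segtype" check from point 2 might look roughly like this. `segtype` is a standard attribute of the TMX `<header>`; the rule that anything other than "sentence" may predate segmentation rules is an assumption made for this sketch, and the helper names and sample headers are hypothetical.

```python
import xml.etree.ElementTree as ET


def sample_tmx(segtype, version):
    """Build a hypothetical minimal TMX document for the examples."""
    return (
        '<tmx version="1.1">'
        f'<header creationtool="OmegaT" creationtoolversion="{version}" '
        f'segtype="{segtype}" o-tmf="OmegaT TMX" adminlang="EN-US" '
        'srclang="EN-US" datatype="plaintext"/>'
        "<body/></tmx>"
    )


def tmx_segtype(tmx_text):
    """Return the segtype attribute of the TMX <header>, or None."""
    header = ET.fromstring(tmx_text).find("header")
    return None if header is None else header.get("segtype")


def may_predate_segmentation(tmx_text):
    """Assumed heuristic: only sentence-segmented TMs are considered safe."""
    return tmx_segtype(tmx_text) != "sentence"


print(may_predate_segmentation(sample_tmx("paragraph", "1.4.5")))  # True
print(may_predate_segmentation(sample_tmx("sentence", "1.6")))     # False
```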

     
  • Logged In: YES
    user_id=488500

    Hi JC,

    You mean incorporating something like
    TMX Resegmenter
    (http://sourceforge.net/project/showfiles.php?group_id=68187&package_id=118970&release_id=396350)
    directly into OmegaT core?

    Maxym

     
  • Logged In: YES
    user_id=915082

    Well, not specifically: first, because I was not aware of the tool when I
    wrote that :); second, because it seems to me the tool is limited in scope,
    since it only addresses the case when the number of sentences is exactly the
    same in source and target, which does not take into account weird syntactic
    structures in source that need to be regrouped in target or the opposite; and
    third, because I simply meant:

    offer a way to fix old TMs including:

    <seg>aaaaa</f1>eeeee<f2>ggggg</seg>

    so that they look like:

    <seg><f0>aaaaa</f0>eeeee<f1>ggggg</f1></seg>

    :)

    As far as the tool is concerned, I think it should include a mechanism to
    display a list of segments that don't fit so that the user can fix them
    externally before resegmenting.

     
    • assigned_to: nobody --> mihmax
    • summary: OmegaT version number in the TMX --> Exact OmegaT version in TMX
    • milestone: --> 1.6
    • priority: 5 --> 3
    • status: open --> open-accepted
     
    • summary: Exact OmegaT version in TMX --> [1.6] Exact OmegaT version in TMX
     
  • Logged In: YES
    user_id=488500

    implemented in 1.6.0 Release Candidate 12

     
    • status: open-accepted --> closed-fixed