OmegaT should indicate precisely which version the TMX has been
created with.
Currently all the RCs are identified as 1.6, which could lead people into
thinking that all RCs are equivalent.
OmegaT should indicate the exact version (including the RC number and,
eventually, the build date) and produce an error message when a project is
opened with a non-matching OmegaT version (something like: "Are you sure you
want to open this project with this version? TMX creation OmegaT version=...,
current OmegaT version=... There may be segmentation and TM problems if you
proceed.").
I know it should be an RFE, but with RC7 released and 1.6.0 close, it is very
important that users are informed of the possible issues related to
upgrading.
JC
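A minimal sketch of the check proposed above, assuming the creating version is recorded in the standard TMX header attribute `creationtoolversion`. The function names, the sample version strings ("1.6 RC6", "1.6 RC7") and the message wording are illustrative assumptions, not OmegaT's actual code.

```python
import xml.etree.ElementTree as ET

def tmx_creation_version(tmx_text):
    """Return the creationtoolversion attribute of the TMX header, or None."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    return header.get("creationtoolversion") if header is not None else None

def version_alert(tmx_text, current_version):
    """Return an alert string when the versions differ, otherwise None."""
    created_with = tmx_creation_version(tmx_text)
    if created_with == current_version:
        return None
    return ("TMX creation OmegaT version=%s, current OmegaT version=%s: "
            "there may be segmentation and TM problems if you proceed."
            % (created_with, current_version))

# Hypothetical project TMX; the version string format is an assumption.
SAMPLE = """<tmx version="1.1">
  <header creationtool="OmegaT" creationtoolversion="1.6 RC6"
          segtype="sentence" srclang="EN" adminlang="EN"
          datatype="plaintext" o-tmf="OmegaT TMX"/>
  <body/>
</tmx>"""

print(version_alert(SAMPLE, "1.6 RC7"))
```

The comparison is a plain string equality, so it only helps once every RC writes its own distinct version string into the header.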
Logged In: YES
user_id=488500
JC,
It's OK to indicate the exact version number, but why should
OmegaT produce an error message importing any TMX, and why
on Earth on its own (!!!) TMX?
Logged In: YES
user_id=915082
I have a project created with RC6, I use RC7 with it: all the tagged segments are
wrong so there _will_ be problems.
An alert (not error, sorry :) message should be there to remind me that I
created the project TM with such-and-such a version and that using a different
version can result in project TM matching problems.
Logged In: YES
user_id=545103
I definitely agree that the user should be aware of the
issue. It should be mentioned in the documentation, and I
think the issue warrants an alert message too. However,
it'll soon get obnoxious if you work with older TMs a lot,
so I'd strongly advise adding the familiar "Do not bother me
again" option to such a message dialog.
Furthermore, this issue only appears when working with RC7
or later with TMs created by RC6 or earlier, so before
displaying any messages, that should be checked, so messages
aren't displayed unnecessarily.
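A rough sketch of that check, assuming the version strings carry an RC number in a form like "1.6 RC7" (a hypothetical format) and that the "Do not bother me again" choice is stored as a simple boolean. The names `rc_number` and `should_warn` are made up for illustration.

```python
import re

def rc_number(version):
    """Extract the release-candidate number from a string such as '1.6 RC7'.

    Returns None when the string carries no RC number.
    """
    match = re.search(r"RC\s*(\d+)", version)
    return int(match.group(1)) if match else None

def should_warn(tm_version, current_version, suppressed=False):
    """Warn only for a TM from RC6 or earlier opened with RC7 or later."""
    if suppressed:  # the user ticked "Do not bother me again"
        return False
    tm_rc = rc_number(tm_version)
    current_rc = rc_number(current_version)
    if tm_rc is None or current_rc is None:
        # Assumption: stay silent when the RC number cannot be determined.
        return False
    return tm_rc <= 6 and current_rc >= 7

print(should_warn("1.6 RC6", "1.6 RC7"))  # True
print(should_warn("1.6 RC7", "1.6 RC8"))  # False
```

A final release such as 1.6.0 has no RC number, so a real check would need an ordering that covers it too; the sketch simply does not warn in that case.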
Logged In: YES
user_id=545103
Instead of displaying warning messages, can't these tag
errors be automatically corrected? Is it possible to
automatically convert the TMX to the new format?
Logged In: YES
user_id=488500
Hi Henry,
Well... That's some work to rewrite tag definitions
on-the-fly, and in case of TMX files in /tm folder OmegaT
cannot (by contract) modify these files.
Logged In: YES
user_id=545103
I didn't intend to overwrite the TM files, but to apply the
conversion in-memory only.
So, is it possible? Can you renumber the tags in a segment
correctly, and simply stick missing ones in front and behind?
Logged In: YES
user_id=488500
Not really, because there's no longer any information there
(only, say, "<f3>"), but OmegaT can more or less reliably
guess that.
Not easy, though...
Logged In: YES
user_id=545103
If I encounter the following segment:
<seg>Word</f0><f1>more words</f2>, plus <f3>more words</seg>
I could very simply convert it to the following:
<seg><f0>Word</f0><f1>more words</f1>, plus <f2>more
words</f2></seg>
You don't really need to know about the original tags, do you?
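The conversion could be sketched roughly like this. It assumes that every tag in the segment is either an opening or a closing tag and that the segment holds a whole paragraph; `convert_old_tags` is a hypothetical helper, not existing OmegaT code.

```python
import re

TAG = re.compile(r"<(/?)f(\d+)>")

def convert_old_tags(text):
    """Renumber old-style tags into pairs and add the missing outer tags."""
    out = []          # pieces of the converted segment
    open_stack = []   # pair numbers of tags opened but not yet closed
    unmatched = []    # pair numbers of closing tags that had no opener
    next_pair = 0
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        if m.group(1):                  # closing tag
            if open_stack:
                n = open_stack.pop()
            else:                       # opener is missing: invent one
                n = next_pair
                next_pair += 1
                unmatched.append(n)
            out.append("</f%d>" % n)
        else:                           # opening tag
            n = next_pair
            next_pair += 1
            open_stack.append(n)
            out.append("<f%d>" % n)
    out.append(text[pos:])
    # Stick the missing openers in front and the missing closers behind.
    prefix = "".join("<f%d>" % n for n in reversed(unmatched))
    suffix = "".join("</f%d>" % n for n in reversed(open_stack))
    return prefix + "".join(out) + suffix

old = "Word</f0><f1>more words</f2>, plus <f3>more words"
print(convert_old_tags(old))
# <f0>Word</f0><f1>more words</f1>, plus <f2>more words</f2>
```

The original tag numbers are ignored entirely; only the order of opening and closing tags is used.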
Logged In: YES
user_id=488500
Henry, you are right. The only issue I see is detecting and
fixing this problem if the paragraph was split into several
segments. No dark magic, but not simple anyway...
Logged In: YES
user_id=545103
If you apply the same conversion when splitting paragraphs,
then the problem is solved.
Logged In: YES
user_id=488500
Henry, probably you misunderstood me (or I misunderstood
your last comment).
Let me repeat myself. For a paragraph like
"Word</f0><f1>more words</f2><f3>. Plus </f3>more words."
OmegaT would create two segments:
<seg>Word</f0><f1>more words</f2><f3>.</seg>
<seg>Plus </f3>more words</seg>
1. Renumbering <f3> in the second segment based on how many
tags there were in the previous one is more or less
simple: (n+1)/2 should almost always work.
2. The difficult thing here is NOT to add <f3> to the
beginning of the second segment, and not add </f3> to the
end of the first one.
3. Also difficult are standalone tags (<f6/> in current
notation) -- earlier OmegaT versions didn't add a slash, and
it's totally unclear how to detect them given only TMX files...
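To illustrate points 1 and 2, here is a small sketch that applies (n+1)/2 to the two segments above, reading the division as integer division (an assumption). The helper name `renumber` is hypothetical.

```python
import re

TAG = re.compile(r"<(/?)f(\d+)(/?)>")

def renumber(segment):
    """Map each old tag number n to (n + 1) // 2, keeping slashes as they are."""
    def repl(m):
        new_n = (int(m.group(2)) + 1) // 2
        return "<%sf%d%s>" % (m.group(1), new_n, m.group(3))
    return TAG.sub(repl, segment)

first = "Word</f0><f1>more words</f2><f3>."
second = "Plus </f3>more words"
print(renumber(first))   # Word</f0><f1>more words</f1><f2>.
print(renumber(second))  # Plus </f2>more words
```

After renumbering, the pair <f2>...</f2> straddles the segment boundary. A per-segment fixer that balances tags would append </f2> to the first segment and prepend <f2> to the second, which is exactly what must not happen, and nothing in the TMX file says the two segments came from one paragraph.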
Logged In: YES
user_id=545103
Ok, you convinced me, I see the showstoppers now :(
Logged In: YES
user_id=915082
Besides the fact that the "old" numbering was
<seg>blabla</f1>blabla</seg>, not "</f0>".
To reply to earlier parts of the discussion:
1) I don't think automatic conversion of anything without explicit user
authorization is a good idea, unless we back up everything and create a
new project TMX, but leave the old one intact.
2) as for in which case the alert message should be displayed, we seem to
already have something in log.txt, but as far as I know, before 1.6RCn we did
not have segmentation rules, so basically _any_ TM before 1.6 may have
issues. This can easily be identified by using the "segtype" info in the TMX.
3) also, I am only talking about the project TMX, not the reference TMX in
/tm/; for those we have the log, and for now that should be enough. Right now
we have the possibility to take an old TM, rename it project_save.tmx, put it
in /omegat/ and load it. Or we can switch versions (and that will be the case
when a huge majority of users shift from 1.4.5 to 1.6.0 stable) sometime in
the middle of a project. So this alert message will be displayed only in very
limited cases where the project_save.tmx may end up having compatibility
issues with the application settings. We are talking about a transition period,
_and_ about cases where old TMs are deliberately used as the project TM for
convenience. Case one will be over in a few months, hopefully, and case two is
rare enough that it only deserves an alert message, so that people who start
using OmegaT at 1.6.0 but who get old TM files know what is going on.
As for automatically replacing the tags, I'd say that fits the "Tools" menu as
a "convert old OmegaT TM" item or something.
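For point 2, a minimal sketch of reading the "segtype" info, assuming it sits in the standard TMX header attribute `segtype`. The function names and the sample header are illustrative, and comparing against the project's own segmentation setting is an assumed use, not confirmed OmegaT behaviour.

```python
import xml.etree.ElementTree as ET

def tmx_segtype(tmx_text):
    """Return the segtype attribute of the TMX header, or None."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    return header.get("segtype") if header is not None else None

def segmentation_mismatch(tmx_text, project_segtype):
    """True when the TM was segmented differently from the current project."""
    return tmx_segtype(tmx_text) != project_segtype

# Hypothetical header of an older, paragraph-segmented project TMX.
OLD_TMX = """<tmx version="1.1">
  <header creationtool="OmegaT" creationtoolversion="1.4.5"
          segtype="paragraph" srclang="EN" adminlang="EN"
          datatype="plaintext" o-tmf="OmegaT TMX"/>
  <body/>
</tmx>"""

print(segmentation_mismatch(OLD_TMX, "sentence"))  # True
```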
Logged In: YES
user_id=488500
Hi JC,
You mean incorporating something like
TMX Resegmenter
(http://sourceforge.net/project/showfiles.php?group_id=68187&package_id=118970&release_id=396350)
directly into OmegaT core?
Maxym
Logged In: YES
user_id=915082
Well, not specifically: first, because I was not aware of the tool when I wrote
that :); second, because it seems to me the tool is limited in scope, since it
only addresses the case where the number of sentences is exactly the same in
source and target, which does not take into account weird syntactic
structures in source that need to be regrouped in target, or the opposite; and
third, because I simply meant:
offer a way to fix old TMs including:
<seg>aaaaa</f1>eeeee<f2>ggggg</seg>
so that they look like:
<seg><f0>aaaaa</f0>eeeee<f1>ggggg</f1></seg>
:)
As far as the tool is concerned, I think it should include a mechanism to
display a list of segments that don't fit so that the user can fix them
externally before resegmenting.
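Something along these lines, as a very rough sketch: the sentence counter below is a crude stand-in for real segmentation rules, and `count_sentences` and `unfit_pairs` are hypothetical names, not part of TMX Resegmenter.

```python
import re

def count_sentences(text):
    """Crude count: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])

def unfit_pairs(pairs):
    """Return (index, source, target) for pairs whose sentence counts differ."""
    return [(i, src, tgt)
            for i, (src, tgt) in enumerate(pairs)
            if count_sentences(src) != count_sentences(tgt)]

pairs = [
    ("One. Two.", "Un. Deux."),     # same count: can be resegmented
    ("One. Two.", "Un et deux."),   # regrouped in target: needs a manual fix
]
for index, src, tgt in unfit_pairs(pairs):
    print(index, src, "|", tgt)
```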
Logged In: YES
user_id=488500
Accepted as required because of the new feature to upgrade
OpenOffice tags for a new filter:
http://sourceforge.net/support/tracker.php?aid=1523768
http://sourceforge.net/support/tracker.php?aid=1523773
Logged In: YES
user_id=488500
Implemented in 1.6.0 Release Candidate 12.