After translating a .docx file, OmegaT failed to create the translated document. The file was first converted from .doc and processed with CodeZapper.
There seems to be no problem with OmegaT parsing and segmenting the file, only with creating the translated document.
Attached is a bogus project that recreates the issue. I am using OmegaT 3.1.2. If there is any more information you need, please let me know.
I can't reproduce the issue, and the target file opens fine with Word 2010. There may be an issue with your configuration files.
You can send them to me privately if you want.
Didier
To be clear: the target Word file attached to your bug report is corrupted; it is not even a valid .zip archive. When I write "I can't reproduce the issue", I mean that I have no problem running Create Target Documents, and that the resulting Word file opens correctly in Word.
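For anyone who wants to check a target file the same way, here is a minimal sketch in Python. It is not part of OmegaT; the function name `inspect_docx` is made up for illustration, and it assumes the main document part sits at the conventional `word/document.xml` path, which is where Word normally puts it.

```python
import zipfile


def inspect_docx(path):
    """Return (ok, message) saying whether path looks like a usable .docx container.

    Hypothetical helper for illustration only; it checks the container,
    not whether Word will accept the XML inside it.
    """
    # A .docx file is a ZIP archive, so a file that fails this check
    # cannot be opened by Word at all.
    if not zipfile.is_zipfile(path):
        return False, "not a valid ZIP archive"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the name of the
            # first one with a bad CRC or header, or None if all are fine.
            bad_member = archive.testzip()
            if bad_member is not None:
                return False, f"corrupted member: {bad_member}"
            # Assumption: the main document part uses the conventional name.
            if "word/document.xml" not in archive.namelist():
                return False, "word/document.xml is missing"
    except zipfile.BadZipFile:
        return False, "not a valid ZIP archive"
    return True, "archive looks intact"
```

Calling `inspect_docx("target/file.docx")` on the generated file (the path is only an example) tells you whether the problem is at the archive level, as it was here, or further inside the document.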
Didier
Last edit: Didier Briel 2014-08-05
I sent a reply to your private message in Sourceforge, but I'm not sure it was really sent. If not, contact me through normal email (i.e., not through a Sourceforge message).
Didier
It seems it wasn't really sent. I don't know your e-mail address.
I tried deleting all my settings, and it seems the culprit is filters.xml.
Whenever I uncheck the option "Remove leading and trailing tags", OmegaT creates a corrupted Word file, like the one in my bug report. When I "Restore defaults", I can create a valid Word file.
On Sourceforge, it's my user name (which you can get by hovering on my name here) followed by @users.sf.net. On Yahoo, my email address is displayed for all my messages in an email program.
Thank you, I was finally able to reproduce it (I must have done something wrong previously with your configuration files).
Didier
I found the issue.
It's the combination of leaving Remove leading and trailing tags unchecked and your unusual segmentation rules, which put these tags in standalone segments.
Of course, the bug should be corrected, but what is the reason for doing this? Usually, when one unchecks Remove..., that's because they want to be able to change the tag location, for formatting purposes. With your segmentation rules, this is not possible.
Didier
Fixed in SVN (/trunk).
Didier
Great! I reproduced the issue in 3.1.2 with my segmentation settings both enabled and disabled. It also happened in 3.1.4. On the other hand, I could not reproduce it in 3.1.1_1; that is, this version could create the target document with "Remove..." disabled even when using my unusual segmentation settings.
Regarding those settings, they come from a post by Roman Mironov. I noticed his approach was useful for me too, but his regex didn't account for all my cases, so I worked on improving it. The result is what you found under the rule "Leading and trailing tags off", which I think covers any possible combination of tags.
Here is Roman's full post.
The idea behind Roman's solution is to provide a setting that gives a sort of "context-specific" way to include or remove leading and trailing tags from segments when need be.
Closed in the released version 3.1.5 of OmegaT.
Didier