In the Open XML filter, Excel strings are loaded in a "random" order. In fact, it is the physical order in the source document.
Below is a possible algorithm for reading the strings in logical order:
Load all the strings from sharedStrings.xml in an array sharedStrings
For each sheet (sheet1.xml...sheet<n>.xml)
    For each tag of the form:
        <c r="B1" t="s">
            <v>0</v>
        </c>
        index = the content of the <v> tag
        string = sharedStrings[index]
    End for each tag
End for each sheet
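For illustration, the algorithm above could be sketched in Python roughly as follows. This is only a sketch, not OmegaT code. It assumes the shared strings are stored at xl/sharedStrings.xml and the sheets at xl/worksheets/sheet<n>.xml, and that the sheet numbers reflect the intended order (a robust reader would resolve both through workbook.xml and the package relationships). The function names are hypothetical.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace, used by sharedStrings.xml and the sheet parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}

# Assumed conventional location of the sheet parts inside the .xlsx archive.
SHEET_PART = re.compile(r"xl/worksheets/sheet(\d+)\.xml")


def load_shared_strings(archive):
    """Return the shared strings in physical (file) order."""
    root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    strings = []
    for si in root.findall("m:si", NS):
        # Join every <t> below the <si>, which also covers rich-text runs.
        # Simplification: phonetic runs, if present, are not filtered out.
        strings.append("".join(t.text or "" for t in si.iter("{%s}t" % MAIN_NS)))
    return strings


def strings_in_logical_order(source):
    """Return the strings in the order the cells reference them, sheet by sheet."""
    with zipfile.ZipFile(source) as archive:
        shared = load_shared_strings(archive)

        # Collect the sheet parts and sort them by their numeric suffix.
        numbered = []
        for name in archive.namelist():
            match = SHEET_PART.fullmatch(name)
            if match:
                numbered.append((int(match.group(1)), name))

        ordered = []
        for _, name in sorted(numbered):
            root = ET.fromstring(archive.read(name))
            for cell in root.iter("{%s}c" % MAIN_NS):
                # Only cells with t="s" hold an index into the shared strings.
                if cell.get("t") != "s":
                    continue
                value = cell.find("m:v", NS)
                if value is not None and value.text is not None:
                    ordered.append(shared[int(value.text)])
        return ordered
```

Calling strings_in_logical_order("example.xlsx") (a hypothetical file name) would return the strings in the order the cells appear in each sheet, rather than the order in which they are stored in sharedStrings.xml. A string referenced by several cells appears once per reference.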
Didier
@didierbr is the problem still observed in OmegaT 6.1 weekly with StaX filters?
I tried to test, but the StaX filter in 6.1 doesn't seem to open my .xlsx files at all.
Didier
This is one of the files that does not work.
Didier
Last edit: Didier Briel 2023-10-06
This sample loads correctly on my computer (using a freshly built master/6.1); I see 5 segments.
Anything in your log?
Just another reference point.
The file also opens just fine for me in both OmegaT 6.0 and the latest 6.1 master (October 9: 0b31aec58481d1a837a9347b916a1035707ea9ef).
This is on an Arch Linux system, and I see the same five segments in OmegaT as I do if I open the file in LibreOffice.
Do you have a more complex file (with data in several columns, for example) that doesn't load properly on your system?
Didier, since this file works for us, can you also share your filters.xml? Maybe something in the configuration...
I tried deleting filters.xml, using Restore defaults, still the same behaviour (both on 6.0 and 6.1).
It's under Windows 11, Java 11
I attach the newly generated filters.xml
Same behaviour also when deleting omegat.prefs
Didier
It does seem to be an issue with the filters file.
If I replace my original filters.xml file with Didier's and try to load my test project with the latest OmegaT master (90625acc2), OmegaT tells me there are no source files in the project.
If I close OmegaT and replace Didier's file with my initial filters.xml file, everything works again. I'm attaching my own file for reference. (Note that it was originally created by OmegaT in 2021.)
Interestingly, if I replace Didier's file with mine without closing OmegaT and reload the project, it gets overwritten with the contents of Didier's file and the Excel file in the source folder is no longer recognized. I'm not sure whether overwriting the filters.xml file at project reload and close is intended behaviour, nor whether it is desirable if it is, in fact, intended.
What happens if you use "Restore defaults" in File Filters?
Or if you delete filters.xml outside of OmegaT before launching it?
Didier
Didier's filters.xml configuration
Phillippe's filter configuration
I've observed the behavior with the Okapi OpenXML filter.
Hiroshi's comment on the differences in the filters.xml configuration made me realize I had reused a configuration created with a pre-6.x version of OmegaT. (The Okapi filters shouldn't be in my test configurations for 6.x).
I tried again with both 6.0 and 6.1 (latest master), with the same results in both:
/source subdirectory.
It looks like the StaX filter doesn't recognize Excel files. (I haven't tried other MS Office files.)
Good. I think you should create a bug report with that specific issue (Excel files being ignored by the StaX filter).
Didier