#288 Open XML: Excel strings are read in random order

SVN
open
nobody
5
2023-10-28
2007-02-02
No

In the Open XML filter, Excel strings appear to be loaded in a "random" order. In fact, they are loaded in their physical order in the source document.

Below is a possible algorithm to read the strings in logical order:

Load all the strings from sharedStrings.xml into an array sharedStrings
For each sheet (sheet1.xml...sheet<n>.xml)
    For each tag of the form:
        <c r="B1" t="s">
            <v>0</v>
        </c>
        index = the content of the <v> tag
        string = sharedStrings[index]
    End for each tag
End for each sheet
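
The algorithm above can be sketched in Python using only the standard library. This is an illustration, not the OmegaT filter code: the function names are hypothetical, the sheets are assumed to live under xl/worksheets/ and are visited in file-name order (sheet1.xml, sheet2.xml, ...) as in the pseudocode, and rich-text strings are flattened by concatenating every <t> element inside an <si> entry.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by SpreadsheetML parts (sharedStrings.xml, sheet<n>.xml).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
SHEET_RE = re.compile(r"xl/worksheets/sheet(\d+)\.xml")


def read_shared_strings(zf):
    """Return the shared strings in physical (storage) order."""
    root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
    strings = []
    for si in root.findall(f"{{{MAIN_NS}}}si"):
        # Simplification: join all <t> descendants of the <si> entry.
        parts = [t.text or "" for t in si.iter(f"{{{MAIN_NS}}}t")]
        strings.append("".join(parts))
    return strings


def strings_in_logical_order(xlsx):
    """Return the strings in the order the cells reference them.

    `xlsx` is a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(xlsx) as zf:
        shared = read_shared_strings(zf)
        # Collect sheet parts and sort them by their numeric suffix.
        sheets = []
        for name in zf.namelist():
            match = SHEET_RE.fullmatch(name)
            if match:
                sheets.append((int(match.group(1)), name))
        result = []
        for _, name in sorted(sheets):
            root = ET.fromstring(zf.read(name))
            for cell in root.iter(f"{{{MAIN_NS}}}c"):
                # Only cells with t="s" hold an index into sharedStrings.
                if cell.get("t") != "s":
                    continue
                value = cell.find(f"{{{MAIN_NS}}}v")
                if value is not None and value.text is not None:
                    result.append(shared[int(value.text)])
        return result
```

Calling strings_in_logical_order("sample.xlsx") (the file name is a placeholder) would return the strings in cell order within each sheet. A string referenced by several cells appears once per reference, so a real filter would probably need to decide how to handle such duplicates.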

Didier

Related

Feature Requests: #1254

Discussion

  • Jean-Christophe Helary

    • Group: CVS --> SVN
     
  • Hiroshi Miura

    Hiroshi Miura - 2023-09-27

    @didierbr is the problem still observed in the OmegaT 6.1 weekly build with the StAX filters?

     
    • Didier Briel

      Didier Briel - 2023-10-04

      I tried to test, but the StAX filter in 6.1 doesn't seem to open my .xlsx files at all.

      Didier

       
  • Didier Briel

    Didier Briel - 2023-10-06

    This is one of the files that do not work.

    Didier

     

    Last edit: Didier Briel 2023-10-06
    • Thomas CORDONNIER

      This sample loads correctly on my computer (using a freshly built master/6.1); I see 5 segments.
      Is there anything in your log?

       
    • Philippe

      Philippe - 2023-10-09

      Just another reference point.

      The file also opens just fine for me in both OmegaT 6.0 and the latest 6.1 master (October 9: 0b31aec58481d1a837a9347b916a1035707ea9ef).

      This is on an Arch Linux system, and I see the same five segments in OmegaT as I do if I open the file in LibreOffice.

      Do you have a more complex file (with data in several columns, for example) that doesn't load properly on your system?

       
    • Thomas CORDONNIER

      Didier, since this file works for us, can you also share your filters.xml? Maybe it is something in the configuration...

       
      • Didier Briel

        Didier Briel - 2023-10-10

        I tried deleting filters.xml and using Restore defaults; the behaviour is still the same (on both 6.0 and 6.1).

        This is under Windows 11 with Java 11.

        I attach the newly generated filters.xml.

        The behaviour is also the same after deleting omegat.prefs.

        Didier

         
        • Philippe

          Philippe - 2023-10-25

          It does seem to be an issue with the filters file.

          If I replace my original filters.xml file with Didier's and try to load my test project with the latest OmegaT master (90625acc2), OmegaT tells me there are no source files in the project.

          If I close OmegaT and replace Didier's file with my initial filters.xml file, everything works again. I'm attaching my own file for reference. (Note that it was originally created by OmegaT in 2021.)

          Interestingly, if I replace Didier's file with mine without closing OmegaT and reload the project, my file gets overwritten with the contents of Didier's file and the Excel file in the source folder is no longer recognized. I'm not sure whether overwriting the filters.xml file at project reload and close is intended behaviour, nor whether it is desirable if it is, in fact, intended.

           
          • Didier Briel

            Didier Briel - 2023-10-25

            What happens if you use "Restore defaults" in File Filters?

            Or if you delete filters.xml outside of OmegaT before launching it?

            Didier

             
          • Hiroshi Miura

            Hiroshi Miura - 2023-10-26

            Didier's filters.xml configuration

            • OpenXMLFilter disabled (enabled by default in 6.0, disabled by default in 6.1)
            • MsOfficeFileFilter enabled (not supported in 6.0, enabled by default in 6.1)
            • No Okapi filter configured

            Philippe's filters.xml configuration

            • OpenXMLFilter enabled
            • Okapi OpenXMLFilter enabled
            • MsOfficeFileFilter enabled
             
  • Hiroshi Miura

    Hiroshi Miura - 2023-10-15

    I've observed the behavior with the Okapi OpenXML filter.

     
  • Philippe

    Philippe - 2023-10-28

    Hiroshi's comment on the differences in the filters.xml configuration made me realize I had reused a configuration created with a pre-6.x version of OmegaT. (The Okapi filters shouldn't be in my test configurations for 6.x.)

    I tried again with both 6.0 and 6.1 (latest master), with the same results in both:

    • If I only enable MsOfficeFileFilter, OmegaT does not see the Excel file in the /source subdirectory.
    • If I enable OpenXMLFilter in addition to MsOfficeFileFilter, the file loads with the legacy filter.

    It looks like the StAX filter doesn't recognize Excel files. (I haven't tried other MS Office files.)

     
    • Didier Briel

      Didier Briel - 2023-10-28

      Good. I think you should create a bug report for that specific issue (Excel files being ignored by the StAX filter).

      Didier

       
