#1260 Excel files not correctly read by StaX filter

6.0
open
None
5
2026-01-28
2024-04-22
No

Following the discussion on the mailing list and in #288, I was able to establish that Excel files without paragraphs (which is the case for most Excel files) are not read correctly by the StaX filter, although the old filter reads them correctly.
Word and PowerPoint files, which always contain paragraphs, are not affected.

Discussion

  • Jean-Christophe Helary

    @miurahr9 I am not seeing this fix in the change.txt file.

     
    • Hiroshi Miura

      Hiroshi Miura - 2024-09-13

      Yes, it is still open.

       
  • Jean-Christophe Helary

    @t_cordonnier, can you point @miurahr9 to your PR so that he can merge it?

     
  • Jean-Christophe Helary

    @t_cordonnier, I guess the PR was about disabling the Excel handling in StaX, right?

     
  • Thomas CORDONNIER

    No. This ticket is about a bug in the StaX filter that prevents some Excel files from being read correctly. We had agreed together to deactivate the filter for Excel files until I find a solution, but that is another ticket, whose number I don't remember. That other ticket is probably closed, while the current one is still open.
    So far I have not found a correct solution. I had implemented something, but the fix affected Word files, so I could not publish it as a pull request.
    It seems that the Excel format is a little different from the other OpenXML formats. And more generally, the OpenXML format is very complicated.

     

    Last edit: Thomas CORDONNIER 2025-02-24
  • Hiroshi Miura

    Hiroshi Miura - 2025-08-19

    The filter is disabled by default in 6.0:
    https://github.com/omegat-org/omegat/pull/1611

     
  • Hiroshi Miura

    Hiroshi Miura - 2026-01-28

    @t_cordonnier, could you share an update on the StaX filter enhancement? This has been pending for a while, so a status update would be appreciated.

     
  • Thomas CORDONNIER

    @miurahr9, in case the message from 2025-02-24 was not clear: I could only make a diagnosis, not find a solution. So far my attempts to solve this have had side effects on other Excel files, which is why I cannot publish anything.

     
    • Hiroshi Miura

      Hiroshi Miura - 2026-01-28

      Thanks for the status update. The StaX filter is very complex and difficult for other developers to modify, and modifications can easily produce side effects...

      If you are OK with sharing your attempt on the dev list, it would be valuable for educating other developers.

       
