#288 exporting archive file - duplicate entries

open
Normaliser (38)
6
2014-09-25
2009-11-17
No

Testing Branch - Xena 4.3.14

Ran several archive files through Xena. Some GZIP and ZIP files normalise correctly and others fail.

The current Stable Branch version of Xena (4.3.0) can normalise all of these files OK.

Some of the errors are:

For a GZIP file :

The supplied data appears to be in the Office 2007+ XML. POI only supports OLE2 Office documents
Trace:
org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:108)
org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
au.gov.naa.digipres.xena.plugin.office.MicrosoftOfficeGuesser.officeTypeMatched(MicrosoftOfficeGuesser.java:129)
au.gov.naa.digipres.xena.plugin.office.MicrosoftOfficeGuesser.guess(MicrosoftOfficeGuesser.java:103)
au.gov.naa.digipres.xena.plugin.office.spreadsheet.XlsxGuesser.guess(XlsxGuesser.java:70)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.getBestGuess(GuesserManager.java:376)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.mostLikelyType(GuesserManager.java:260)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.mostLikelyType(GuesserManager.java:240)
au.gov.naa.digipres.xena.plugin.archive.ArchiveNormaliser.parse(ArchiveNormaliser.java:96)
au.gov.naa.digipres.xena.plugin.archive.gzip.GZipNormaliser.parse(GZipNormaliser.java:114)
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.parse(NormaliserManager.java:817)
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.normalise(NormaliserManager.java:1005)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:595)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:539)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseFile(NormalisationThread.java:324)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseStandard(NormalisationThread.java:246)
au.gov.naa.digipres.xena.litegui.NormalisationThread.run(NormalisationThread.java:187)

Another GZIP file:

org.xml.sax.SAXException: Cannot connect to OpenOffice.org - possibly something wrong with the input file
Trace:
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.parse(NormaliserManager.java:826)
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.normalise(NormaliserManager.java:1005)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:595)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:539)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseFile(NormalisationThread.java:324)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseStandard(NormalisationThread.java:246)
au.gov.naa.digipres.xena.litegui.NormalisationThread.run(NormalisationThread.java:187)

For a ZIP file:

The supplied data appears to be in the Office 2007+ XML. POI only supports OLE2 Office documents
Trace:
org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:108)
org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
au.gov.naa.digipres.xena.plugin.office.MicrosoftOfficeGuesser.officeTypeMatched(MicrosoftOfficeGuesser.java:129)
au.gov.naa.digipres.xena.plugin.office.MicrosoftOfficeGuesser.guess(MicrosoftOfficeGuesser.java:103)
au.gov.naa.digipres.xena.plugin.office.spreadsheet.XlsxGuesser.guess(XlsxGuesser.java:70)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.getBestGuess(GuesserManager.java:376)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.mostLikelyType(GuesserManager.java:260)
au.gov.naa.digipres.xena.kernel.guesser.GuesserManager.mostLikelyType(GuesserManager.java:240)
au.gov.naa.digipres.xena.plugin.archive.ArchiveNormaliser.parse(ArchiveNormaliser.java:96)
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.parse(NormaliserManager.java:817)
au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.normalise(NormaliserManager.java:1005)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:595)
au.gov.naa.digipres.xena.core.Xena.normalise(Xena.java:539)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseFile(NormalisationThread.java:324)
au.gov.naa.digipres.xena.litegui.NormalisationThread.normaliseStandard(NormalisationThread.java:246)
au.gov.naa.digipres.xena.litegui.NormalisationThread.run(NormalisationThread.java:187)
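
The POI message in the GZIP and ZIP traces above is raised when ZIP-based data is handed to `POIFSFileSystem`, which only reads OLE2 containers. One way a guesser could avoid that call is to look at the leading bytes first. The following is a minimal sketch of that idea only; `ContainerSniffer` and its `Kind` enum are hypothetical names and are not part of Xena or POI.

```java
import java.util.Arrays;

/** Hypothetical helper: classifies a file header before any OLE2 parsing is attempted. */
final class ContainerSniffer {
    // OLE2 compound document signature (legacy .doc/.xls/.ppt files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\003\004" (shared by .jar, .odt, .xlsx).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    enum Kind { OLE2, ZIP, UNKNOWN }

    /** Returns the container kind suggested by the first bytes of a file. */
    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Kind.ZIP;
        return Kind.UNKNOWN; // includes 0-length and too-short input
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

With a check like this, only input reported as `Kind.OLE2` would be passed on to the OLE2 reader; anything else would be left to the other guessers.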

Discussion

  • Justin Waddell

    Justin Waddell - 2009-11-24
    • assigned_to: nobody --> acunliffe
    • labels: --> Normaliser
    • status: open --> open-fixed
     
  • Justin Waddell

    Justin Waddell - 2009-11-24

    There were a number of problems with archive files that needed to be fixed:

    * 0-length archive files are not valid and cause problems when attempting to operate on them. 0-length files are now more likely to be guessed as binary files.
    * The wrong magic number was being used for jar files, causing them to be guessed as .odt files.
    * Problems with the plaintext normaliser for files inside the archives.

    Fixes made in Xena v4.3.16, archive v1.2.4, plaintext v3.4.3 (testing branches)
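
The first two points above can be illustrated with a short sketch. Because .jar and .odt files are both ZIP containers, the leading bytes alone cannot separate them, so this example looks at entry names instead: an ODF text document carries a `mimetype` entry and a jar normally carries `META-INF/MANIFEST.MF`. This is an assumption-laden illustration, not the fix that went into archive v1.2.4; `ZipFlavourGuesser`, `Flavour` and `zipOf` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical guesser: tells 0-length input, ODT packages and jars apart. */
final class ZipFlavourGuesser {
    enum Flavour { EMPTY, ODT, JAR, OTHER }

    private static final String ODT_MIME = "application/vnd.oasis.opendocument.text";

    static Flavour guess(byte[] data) {
        if (data.length == 0) return Flavour.EMPTY; // not a valid archive at all
        boolean hasManifest = false;
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals("mimetype")) {
                    // readAllBytes (Java 9+) reads only the current entry here.
                    String type = new String(in.readAllBytes(), StandardCharsets.US_ASCII).trim();
                    if (type.equals(ODT_MIME)) return Flavour.ODT;
                }
                if (entry.getName().equals("META-INF/MANIFEST.MF")) hasManifest = true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hasManifest ? Flavour.JAR : Flavour.OTHER;
    }

    /** Small helper for trying the guesser out: builds a ZIP from name/content pairs. */
    static byte[] zipOf(String... nameContentPairs) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            for (int i = 0; i + 1 < nameContentPairs.length; i += 2) {
                zip.putNextEntry(new ZipEntry(nameContentPairs[i]));
                zip.write(nameContentPairs[i + 1].getBytes(StandardCharsets.US_ASCII));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

For example, `ZipFlavourGuesser.guess(ZipFlavourGuesser.zipOf("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n"))` classifies the input as a jar, while a 0-length array is reported as `EMPTY` so that it can fall through to a binary guess.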

     
  • Allan Cunliffe

    Allan Cunliffe - 2009-11-25

    Tested in Testing Branch - Xena v4.3.16, archive v1.2.4, plaintext v3.4.3.

    Ran a selection of ZIP and GZIP files through Xena. All were normalised correctly.

     
  • Allan Cunliffe

    Allan Cunliffe - 2009-11-25
    • status: open-fixed --> closed-fixed
     
  • Allan Cunliffe

    Allan Cunliffe - 2009-11-26

    Tested in Testing Branch - Xena v4.3.16, archive v1.2.4, plaintext v3.4.3.

    I've tested this a bit more thoroughly and there appear to be some problems with exporting archive files.

    I normalised a group of archive files (.zip, .gz, .jar).
    I opened each archive from the Normalisation Results.
    From within Xena Viewer, I attempted to export the archive.
    Some archives export successfully and others throw the error:

    "java.util.zip.ZipException: duplicate entry. org/w3c/dom/UserDataHandler.class"

    Terminal output:

    "au.gov.naa.digipres.xena.kernel.XenaException: java.util.zip.ZipException: duplicate entry: org/w3c/dom/UserDataHandler.class
    at au.gov.naa.digipres.xena.core.Xena.export(Xena.java:765)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewFrame.exportXenaFile(NormalisedObjectViewFrame.java:275)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewFrame.access$300(NormalisedObjectViewFrame.java:69)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewFrame$2.actionPerformed(NormalisedObjectViewFrame.java:157)
    ..."

    This only happens with some archive files. They are too large to attach here, but one of them is xena.jar.

     
  • Allan Cunliffe

    Allan Cunliffe - 2009-11-26
    • assigned_to: acunliffe --> jwaddell
    • status: closed-fixed --> open-fixed
     
  • Daniel Black

    Daniel Black - 2010-11-17
    • assigned_to: jwaddell --> matthewoliver
     
  • Daniel Black

    Daniel Black - 2011-01-18

    Reopening based on the last comment (though this is a different problem from the one originally reported).

     
  • Daniel Black

    Daniel Black - 2011-01-18
    • summary: GZIP files not normalising --> exporting archive file - duplicate entries
    • priority: 5 --> 6
    • status: open-fixed --> open
     
  • Allan Cunliffe

    Allan Cunliffe - 2011-02-25

    Tested in Stable Branch v5.0.0

    It looks like this is still an issue in the current stable branch.

    When exporting Xena.jar, I'm still getting the error:

    "java.util.zip.ZipException: duplicate entry.
    org/w3c/dom/UserDataHandler.class"

     
  • Michael Carden

    Michael Carden - 2011-08-01
    • assigned_to: matthewoliver --> terryoneill
     
  • Allan Cunliffe

    Allan Cunliffe - 2011-08-29

    Tested in Xena imageMagicFix branch (Date: Thu Aug 25 13:49:19 2011 +1000)

    Still an issue.

    Console output:

    Destination: /home/al/Xena/Destination/xena.jar_Zip.xena
    au.gov.naa.digipres.xena.kernel.XenaException: org.xml.sax.SAXException: Problem exporting archive entry org/w3c/dom/UserDataHandler.class
    java.util.zip.ZipException: duplicate entry: org/w3c/dom/UserDataHandler.class
    at au.gov.naa.digipres.xena.core.Xena.export(Xena.java:911)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewDialog.exportXenaFile(NormalisedObjectViewDialog.java:305)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewDialog.access$300(NormalisedObjectViewDialog.java:72)
    at au.gov.naa.digipres.xena.viewer.NormalisedObjectViewDialog$2.actionPerformed(NormalisedObjectViewDialog.java:174)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2012)
    at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2335)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:404)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
    at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:253)
    at java.awt.Component.processMouseEvent(Component.java:6203)
    at javax.swing.JComponent.processMouseEvent(JComponent.java:3267)
    at java.awt.Component.processEvent(Component.java:5968)
    at java.awt.Container.processEvent(Container.java:2105)
    at java.awt.Component.dispatchEventImpl(Component.java:4564)
    at java.awt.Container.dispatchEventImpl(Container.java:2163)
    at java.awt.Component.dispatchEvent(Component.java:4390)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4461)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4125)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4055)
    at java.awt.Container.dispatchEventImpl(Container.java:2149)
    at java.awt.Window.dispatchEventImpl(Window.java:2478)
    at java.awt.Component.dispatchEvent(Component.java:4390)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:649)
    at java.awt.EventQueue.access$000(EventQueue.java:96)
    at java.awt.EventQueue$1.run(EventQueue.java:608)
    at java.awt.EventQueue$1.run(EventQueue.java:606)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:105)
    at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:116)
    at java.awt.EventQueue$2.run(EventQueue.java:622)
    at java.awt.EventQueue$2.run(EventQueue.java:620)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:105)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:619)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:275)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:200)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:190)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:185)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:177)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:138)
    Caused by: org.xml.sax.SAXException: Problem exporting archive entry org/w3c/dom/UserDataHandler.class
    java.util.zip.ZipException: duplicate entry: org/w3c/dom/UserDataHandler.class
    at au.gov.naa.digipres.xena.plugin.archive.ArchiveDeNormaliser.startElement(ArchiveDeNormaliser.java:166)
    at org.xml.sax.helpers.XMLFilterImpl.startElement(XMLFilterImpl.java:551)
    at au.gov.naa.digipres.xena.kernel.metadatawrapper.DefaultUnwrapper.startElement(DefaultUnwrapper.java:40)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.export(NormaliserManager.java:1666)
    at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.export(NormaliserManager.java:1420)
    at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.export(NormaliserManager.java:1391)
    at au.gov.naa.digipres.xena.kernel.normalise.NormaliserManager.export(NormaliserManager.java:1347)
    at au.gov.naa.digipres.xena.core.Xena.export(Xena.java:905)
    ... 39 more
    Caused by: java.util.zip.ZipException: duplicate entry: org/w3c/dom/UserDataHandler.class
    at java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:192)
    at au.gov.naa.digipres.xena.plugin.archive.ArchiveDeNormaliser.startElement(ArchiveDeNormaliser.java:151)
    ... 56 more
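
The innermost cause in the trace above is `ZipOutputStream.putNextEntry` rejecting a second entry with a name it has already written. The sketch below reproduces that behaviour and shows one possible workaround, which is to track the names already written and skip repeats. It is only an illustration: `DuplicateSafeZip` is a hypothetical class, the payloads are placeholders, and whether skipping (rather than renaming or reporting) is the right behaviour for `ArchiveDeNormaliser` is an open question. It also does not establish why the entry name appears twice in the normalised xena.jar.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

/** Hypothetical export helper that tolerates repeated entry names. */
final class DuplicateSafeZip {

    /** Writes one entry per name, skipping names already written; returns the skipped names. */
    static List<String> write(List<String> names, ByteArrayOutputStream sink) {
        Set<String> seen = new HashSet<>();
        List<String> skipped = new ArrayList<>();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            for (String name : names) {
                if (!seen.add(name)) { // add() returns false for a repeated name
                    skipped.add(name);
                    continue;
                }
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8)); // placeholder payload
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return skipped;
    }

    /** Reproduces the failure: a second putNextEntry with the same name throws ZipException. */
    static String naiveFailureMessage(String name) {
        try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
            zip.putNextEntry(new ZipEntry(name));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry(name)); // rejected as a duplicate
            return null;
        } catch (ZipException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `write` with a list that names `org/w3c/dom/UserDataHandler.class` twice produces an archive containing the entry once and returns that name in the skipped list, which could then be logged for the user.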

     
