#852 Bug

Milestone: v1.0_(example)
Status: closed
Owner: nobody
Labels: None
Priority: 1
Updated: 2015-01-15
Created: 2014-10-30
Creator: Anonymous
Private: No

I just added a folder to be indexed (the folder is very large), and some time later I noticed a window with the following error information. I haven't been able to index my folder yet because the program says that there is not enough memory, which is a real problem because I have already allocated 1024 MB to it.

program.name=DocFetcher
program.version=1.1.12
program.build=20140925-0049
program.portable=false
java.runtime.name=Java(TM) SE Runtime Environment
java.runtime.version=1.8.0_25-b18
java.version=1.8.0_25
sun.arch.data.model=64
os.arch=amd64
os.name=Windows 7
os.version=6.1
user.language=en
org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException: Fail to save: an error occurs while saving the package : duplicate entry: docProps/core.xml
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:500)
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1417)
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1404)
at org.apache.poi.openxml4j.opc.ZipPackage.closeImpl(ZipPackage.java:349)
at org.apache.poi.openxml4j.opc.OPCPackage.close(OPCPackage.java:420)
at com.google.common.io.Closeables.close(Closeables.java:80)
at com.google.common.io.Closeables.closeQuietly(Closeables.java:99)
at net.sourceforge.docfetcher.model.parse.MSOffice2007Parser.doParse(MSOffice2007Parser.java:145)
at net.sourceforge.docfetcher.model.parse.MSOffice2007Parser.parse(MSOffice2007Parser.java:85)
at net.sourceforge.docfetcher.model.parse.ParseService.doParse(ParseService.java:309)
at net.sourceforge.docfetcher.model.parse.ParseService.parse(ParseService.java:233)
at net.sourceforge.docfetcher.model.index.file.FileContext.index(FileContext.java:146)
at net.sourceforge.docfetcher.model.index.file.FileIndex$1.handleFile(FileIndex.java:288)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.runWithHtmlPairing(HtmlFileLister.java:124)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.doRun(HtmlFileLister.java:57)
at net.sourceforge.docfetcher.util.Stoppable.run(Stoppable.java:57)
at net.sourceforge.docfetcher.model.index.file.FileIndex.visitDirOrZip(FileIndex.java:275)
at net.sourceforge.docfetcher.model.index.file.FileIndex.access$200(FileIndex.java:51)
at net.sourceforge.docfetcher.model.index.file.FileIndex$1.handleDir(FileIndex.java:386)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.runWithHtmlPairing(HtmlFileLister.java:145)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.doRun(HtmlFileLister.java:57)
at net.sourceforge.docfetcher.util.Stoppable.run(Stoppable.java:57)
at net.sourceforge.docfetcher.model.index.file.FileIndex.visitDirOrZip(FileIndex.java:275)
at net.sourceforge.docfetcher.model.index.file.FileIndex.access$200(FileIndex.java:51)
at net.sourceforge.docfetcher.model.index.file.FileIndex$1.handleDir(FileIndex.java:386)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.runWithHtmlPairing(HtmlFileLister.java:145)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.doRun(HtmlFileLister.java:57)
at net.sourceforge.docfetcher.util.Stoppable.run(Stoppable.java:57)
at net.sourceforge.docfetcher.model.index.file.FileIndex.visitDirOrZip(FileIndex.java:275)
at net.sourceforge.docfetcher.model.index.file.FileIndex.access$200(FileIndex.java:51)
at net.sourceforge.docfetcher.model.index.file.FileIndex$1.handleDir(FileIndex.java:386)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.runWithHtmlPairing(HtmlFileLister.java:145)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.doRun(HtmlFileLister.java:57)
at net.sourceforge.docfetcher.util.Stoppable.run(Stoppable.java:57)
at net.sourceforge.docfetcher.model.index.file.FileIndex.visitDirOrZip(FileIndex.java:275)
at net.sourceforge.docfetcher.model.index.file.FileIndex.access$200(FileIndex.java:51)
at net.sourceforge.docfetcher.model.index.file.FileIndex$1.handleDir(FileIndex.java:386)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.runWithHtmlPairing(HtmlFileLister.java:145)
at net.sourceforge.docfetcher.model.index.file.HtmlFileLister.doRun(HtmlFileLister.java:57)
at net.sourceforge.docfetcher.util.Stoppable.run(Stoppable.java:57)
at net.sourceforge.docfetcher.model.index.file.FileIndex.visitDirOrZip(FileIndex.java:275)
at net.sourceforge.docfetcher.model.index.file.FileIndex.doUpdate(FileIndex.java:159)
at net.sourceforge.docfetcher.model.TreeIndex.update(TreeIndex.java:148)
at net.sourceforge.docfetcher.model.index.Task.update(Task.java:98)
at net.sourceforge.docfetcher.model.index.IndexingQueue.threadLoop(IndexingQueue.java:163)
at net.sourceforge.docfetcher.model.index.IndexingQueue.access$100(IndexingQueue.java:46)
at net.sourceforge.docfetcher.model.index.IndexingQueue$2.run(IndexingQueue.java:118)
Caused by: org.apache.poi.openxml4j.exceptions.OpenXML4JException: duplicate entry: docProps/core.xml
at org.apache.poi.openxml4j.opc.internal.marshallers.ZipPackagePropertiesMarshaller.marshall(ZipPackagePropertiesMarshaller.java:59)
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:482)
... 46 more

Discussion

  • Nam-Quang Tran - 2014-10-30

    Hi,

    The crash above is unrelated to memory issues. Rather, there seems to be a corrupted MS Office file that causes DocFetcher to crash.

    Best regards
    q:-) <= Quang
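
    For readers wondering where the "duplicate entry: docProps/core.xml" text in the trace comes from: it matches the message that the JDK's java.util.zip.ZipOutputStream raises when the same entry name is written twice. The following is a minimal sketch of that behavior only, assuming (not confirmed by this ticket) that POI's ZipPackage writes through such a stream when the package is saved on close. The class name DuplicateEntryDemo and the helper writeTwice are hypothetical and are not part of DocFetcher or POI.

    ```java
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipException;
    import java.util.zip.ZipOutputStream;

    // Hypothetical demo class; it only exercises the JDK zip API.
    class DuplicateEntryDemo {

        // Writes an entry with the given name twice into an in-memory zip
        // and returns the message of the resulting ZipException,
        // or null if the second write was unexpectedly accepted.
        static String writeTwice(String name) throws IOException {
            try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
                zip.putNextEntry(new ZipEntry(name));
                zip.closeEntry();
                try {
                    // Second entry with the same name: rejected by the JDK.
                    zip.putNextEntry(new ZipEntry(name));
                    return null;
                } catch (ZipException e) {
                    return e.getMessage();
                }
            }
        }

        public static void main(String[] args) throws IOException {
            System.out.println(writeTwice("docProps/core.xml"));
        }
    }
    ```

    This does not reproduce the DocFetcher crash itself. It only shows that the message is a zip-level complaint about entry names, which is consistent with the problem lying in one particular Office file rather than in the memory setting.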

     
  • Nam-Quang Tran - 2015-01-15
    • status: open --> closed
     
  • Nam-Quang Tran - 2015-01-15

    Fixed in DocFetcher 1.1.13.

     
