
#37 PDF Indexer

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2005-11-07
Created: 2005-11-07
Creator: Steve Pugh
Private: No

We were getting a lot of errors from PDF indexing.
There are two issues:
1. Many of our PDF files are not parsable by the PDFBox
library for various reasons, causing exceptions to be
thrown.
2. The file handle is not closed correctly when
PDFBox fails to parse, because the thrown exception
prevents the file from being closed.

The patch is designed to address both issues by:
1. Taking less severe action when an item is not parsable: log
the details but return empty content to the indexer.
2. Surrounding the parser with try / finally to ensure the file
handle is always closed.
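
For readers without the attachment, the two changes can be sketched roughly as
below. This is not the patch itself. TextExtractor is a hypothetical stand-in
for the PDFBox parsing call, and the class and method names are assumptions
made for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the PDFBox parsing call; not a PDFBox API. */
interface TextExtractor {
    String extract(InputStream in) throws IOException;
}

/** Illustrative sketch only; names are assumptions, not taken from the patch. */
class SafePdfIndexing {
    private static final Logger LOG =
            Logger.getLogger(SafePdfIndexing.class.getName());

    /**
     * Returns the extracted text, or an empty string if the file cannot be
     * parsed. The stream is closed in every case.
     */
    static String extractOrEmpty(InputStream in, TextExtractor extractor,
                                 String itemName) {
        try {
            return extractor.extract(in);
        } catch (Exception e) {
            // Change 1: log the details and hand empty content to the indexer
            // instead of letting the exception abort indexing.
            LOG.log(Level.WARNING,
                    "Could not parse PDF " + itemName + "; indexing empty content", e);
            return "";
        } finally {
            // Change 2: the close runs whether or not parsing threw.
            try {
                in.close();
            } catch (IOException e) {
                LOG.log(Level.WARNING, "Could not close stream for " + itemName, e);
            }
        }
    }
}
```

Catching Exception rather than only IOException is a deliberate choice in this
sketch, on the assumption that an unparsable file may also surface as a
runtime exception.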

Discussion

  • Steve Pugh

    Steve Pugh - 2005-11-07

    pdf indexer patch

     
  • Steve Pugh

    Steve Pugh - 2005-11-11

    By the way, we have also upgraded PDFBox to version
    PDFBox-0.7.2-log4j, which seems to be working OK.

     
