#511 NullPointerException in PDPageNode.getAllKids

Status: closed-out-of-date
Labels: parsing (91)
Priority: 5
Updated: 2014-12-09
Created: 2008-07-02
Creator: Art Peel
Private: No

The parser cannot seem to find the Pages object in files created with Acrobat Pro 9. A sample file is attached.

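import java.util.ArrayList;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.pdmodel.PDPage;
import org.pdfbox.pdmodel.PDPageNode;

public class GetAllKidsRepro {
    public static void main(String[] argv) throws Exception {
        String name = "./test.pdf";
        PDDocument doc = PDDocument.load(name);
        // Walk the page tree; getAllKids throws the NullPointerException shown below.
        PDPageNode root = doc.getDocumentCatalog().getPages();
        ArrayList<PDPage> pages = new ArrayList<PDPage>();
        root.getAllKids(pages);
        System.out.println("pages.size() == " + pages.size());
        // Close the document only after the page tree has been read.
        doc.close();
    }
}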

Exception in thread "main" java.lang.NullPointerException
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)

Discussion

  • Art Peel

    Art Peel - 2008-07-02

    created with Acrobat 9 Pro, default settings

     
  • Art Peel

    Art Peel - 2008-07-02

    This happens with the latest code from CVS and also in older versions.

     
  • James Wilson

    James Wilson - 2008-07-14

    We are experiencing the same problem. The offending PDF is available if any of you need it (jwilson@nmcourt.fed.us). It looks like PDFBox does not support some new feature introduced in Acrobat 9.

     
  • Art Peel

    Art Peel - 2008-07-14

    In Acrobat 8, the default was to generate PDFs following version 1.4 of the PDF specification. In Acrobat 9, the default is to generate PDFs following version 1.5 of the PDF specification. PDF 1.5 has objects known as cross-reference streams, and it turns out that PDFBox does not parse them correctly.

     
  • Kevin Day

    Kevin Day - 2008-10-27

    I can confirm foundart's comments: Acrobat 9 is indeed using XRef streams. This is going to become a pretty big problem as Acrobat 9 is adopted.

    Information on cross reference streams is here:

    Section 7.5.8 of http://www.adobe.com/devnet/acrobat/pdfs/PDF32000_2008.pdf

    Long and short: the xref parser will need to be enhanced to read the cross-reference data from a stream.

     
  • ptarver

    ptarver - 2008-11-04

    I'd also like to confirm this issue. We have a custom web server application that uses PDFBox to merge FDFs with template PDF contracts, and after we upgraded to Acrobat 9, the Java server began throwing NullPointerExceptions with the following messages:

    [#|2008-11-03T07:13:45.375-0600|INFO|sun-appserver-ee8.2|javax.enterprise.system.stream.out|_ThreadID=15;| java.lang.NullPointerException
    at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
    at org.pdfbox.pdmodel.PDPageNode.getKids(PDPageNode.java:171)
    at org.pdfbox.pdmodel.PDPageNode.updateCount(PDPageNode.java:90)
    at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:606)
    at org.pdfbox.pdmodel.PDDocument.save(PDDocument.java:592)

    I'm hoping that this problem is being reviewed, but I see that the priority is set to only 5. We are going to have to revert to Acrobat 6, our previous version, in order to get our contracts working again. I can provide several PDF documents created with Acrobat 9 that failed if examples are necessary.

    Thanks!

     
  • Ben Litchfield

    Ben Litchfield - 2010-04-07

    PDFBox has moved to Apache. Bugs have been moved over to the Apache bug tracking system. If you don't see the bug there and it's still not fixed in the current release, please create a new bug on the Apache site.

    http://pdfbox.apache.org

     
  • Ben Litchfield

    Ben Litchfield - 2010-04-07
    • status: open --> closed-out-of-date
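
For readers unfamiliar with the cross-reference streams mentioned in the discussion above, the following is a minimal sketch of how their entries are laid out, based on section 7.5.8 of the PDF specification. It is not PDFBox code and not the fix that was eventually applied. The class and method names (XRefStreamSketch, Entry, readField, decode) are hypothetical, the input is assumed to be the already decompressed stream data, and the default of 0 for omitted second and third fields is a simplifying assumption. Each entry is a fixed-width row of three big-endian integers whose byte widths come from the stream dictionary's /W array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: decodes the rows of an already decompressed cross-reference stream.
class XRefStreamSketch {

    // One decoded row. type 0 = free object, 1 = in use at a byte offset,
    // 2 = compressed inside an object stream.
    static final class Entry {
        final int type;
        final long field2; // type 1: byte offset; type 2: object number of the object stream
        final long field3; // type 1: generation number; type 2: index within the object stream

        Entry(int type, long field2, long field3) {
            this.type = type;
            this.field2 = field2;
            this.field3 = field3;
        }
    }

    // Reads a big-endian unsigned integer of the given byte width.
    // A width of 0 means the field is omitted, so the default is returned.
    static long readField(byte[] data, int pos, int width, long defaultValue) {
        if (width == 0) {
            return defaultValue;
        }
        long value = 0;
        for (int i = 0; i < width; i++) {
            value = (value << 8) | (data[pos + i] & 0xFF);
        }
        return value;
    }

    // Splits the data into rows of w[0] + w[1] + w[2] bytes and decodes each row.
    static List<Entry> decode(byte[] data, int[] w) {
        int rowSize = w[0] + w[1] + w[2];
        if (rowSize == 0 || data.length % rowSize != 0) {
            throw new IllegalArgumentException("data length does not match the /W widths");
        }
        List<Entry> entries = new ArrayList<Entry>();
        for (int pos = 0; pos < data.length; pos += rowSize) {
            // The type field defaults to 1 when its width is 0.
            int type = (int) readField(data, pos, w[0], 1);
            // Assumption of this sketch: omitted second and third fields default to 0.
            long field2 = readField(data, pos + w[0], w[1], 0);
            long field3 = readField(data, pos + w[0] + w[1], w[2], 0);
            entries.add(new Entry(type, field2, field3));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Example rows for /W [1 2 1].
        byte[] data = {
            0, 0x00, 0x00, (byte) 0xFF, // free entry, generation 255
            1, 0x01, 0x2C, 0,           // in use at byte offset 0x012C, generation 0
            2, 0x00, 0x05, 3            // stored in object stream 5, at index 3
        };
        for (Entry e : decode(data, new int[] {1, 2, 1})) {
            System.out.println(e.type + " " + e.field2 + " " + e.field3);
        }
    }
}
```

A parser that only understands the classic "xref" table never sees these rows, so objects such as the Pages node cannot be located. That is consistent with the NullPointerException reported above, although the exact failure path inside PDFBox is not confirmed here.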
     
