#20 add support for using crop box for pdf import

Chris Gilling

I needed to use the crop box when importing PDFs, and this was not implemented. Here is a basic implementation. To use it, pass the following option:
-define pdf:use-cropbox=true

I still have one question about the implementation, though. While testing the patch, I checked whether I needed to modify the loop that looks for /MediaBox entries to find the page size. Initially I thought I would have to, because we are now looking for /CropBox entries, and they should take priority over /MediaBox entries. But in testing I couldn't see it making any difference. I even did this right after the end of the loop:


And it didn't make any difference in the final output image.

So if this loop does need to be changed, it would be nice if someone could clarify what it is used for. (It only gets one size, whereas each page could be a different size, and it doesn't necessarily get the largest size either.) It also seems to me that it would need to be made page-aware somehow, because the /CropBox entry would need to be chosen for pages that have one, and the /MediaBox entry otherwise.
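
To make the page-aware idea concrete, here is a minimal sketch of the selection rule described above: prefer /CropBox when a page has one, otherwise fall back to /MediaBox. This is not the code from the patch. The helper name `page_box` and the regular expression are hypothetical, and the sketch assumes each page's dictionary is available as uncompressed text; it ignores boxes inherited from parent /Pages nodes and objects stored in compressed streams.

```python
import re

# Hypothetical pattern: matches e.g. "/CropBox [36 36 576 756]".
_BOX_RE = re.compile(
    r"/(CropBox|MediaBox)\s*\[\s*"
    r"(-?\d*\.?\d+)\s+(-?\d*\.?\d+)\s+(-?\d*\.?\d+)\s+(-?\d*\.?\d+)\s*\]"
)

def page_box(page_text, use_cropbox=True):
    """Return (box_name, (llx, lly, urx, ury)) for one page, or None.

    Assumes page_text is the uncompressed text of a single page dictionary.
    """
    boxes = {}
    for m in _BOX_RE.finditer(page_text):
        # Keep the first occurrence of each box type found on the page.
        boxes.setdefault(m.group(1), tuple(float(v) for v in m.groups()[1:]))
    # /CropBox takes priority only when requested and present on this page.
    if use_cropbox and "CropBox" in boxes:
        return "CropBox", boxes["CropBox"]
    if "MediaBox" in boxes:
        return "MediaBox", boxes["MediaBox"]
    return None
```

Calling `page_box` once per page, rather than scanning the whole file for a single size, is what would make the loop page-aware.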


  • Chris Gilling

    PDF Import CropBox Patch

    • assigned_to: nobody --> bfriesen
  • It does seem likely that the MediaBox info is no longer used. Prior to 2009-03-31 the MediaBox and Rotate info was used to define bounding box arguments to pass to Ghostscript. Even with just one page, sometimes the info from the PDF was wrong, yet Ghostscript could produce a correct result so we stopped passing a bounding box request to Ghostscript.

    An alternative implementation would be to invoke Ghostscript on a per-page basis and expect Ghostscript to efficiently locate and return each page. This would allow the parser to step through the PDF and pass page-specific requests to Ghostscript.

  • Patch applied to CVS HEAD.

    • status: open --> closed
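
For reference, the per-page alternative mentioned in the comments above could look roughly like the sketch below. It only builds a Ghostscript argument list for a single page and does not run anything. The function name `gs_page_command` and its defaults (device, resolution) are assumptions for illustration, not what the coder actually passes; the switches themselves (`-dFirstPage`, `-dLastPage`, `-dUseCropBox`, `-sOutputFile`) are standard Ghostscript options.

```python
def gs_page_command(pdf_path, page, output_path,
                    use_cropbox=False, device="png16m", dpi=72):
    """Build a Ghostscript argv list that renders exactly one page.

    Hypothetical helper; device and dpi defaults are illustrative only.
    """
    if page < 1:
        raise ValueError("Ghostscript page numbers start at 1")
    cmd = [
        "gs", "-q", "-dBATCH", "-dNOPAUSE", "-dSAFER",
        f"-sDEVICE={device}",
        f"-r{dpi}",
        # Restrict rendering to the single requested page.
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
    ]
    if use_cropbox:
        # Ask Ghostscript to size the page from /CropBox instead of /MediaBox.
        cmd.append("-dUseCropBox")
    cmd += [f"-sOutputFile={output_path}", pdf_path]
    return cmd
```

A caller stepping through the PDF could then decide per page whether to set `use_cropbox`, and hand the resulting list to `subprocess.run`.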