#5 Advanced PDF Merging

Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2013-03-08
Created: 2012-07-30
Private: No

This patch extends the current PageManager.Add functionality so that all destinations are merged as well. More importantly, it tries to solve the problem with circular references and therefore allows links in both PDFs to stay functional and valid during the merging process.

The main idea is to allocate new object IDs as soon as they are needed, but to "fill" them only once their content is finished. This way, other objects can easily reference these already allocated object IDs.
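
The allocate-first, fill-later idea described above can be sketched outside of any PDF library. The following is a minimal toy model, not the patch itself and not PDF Clown's API: a "file" is assumed to be a plain dict mapping object IDs to values, a reference is assumed to be the tuple ("ref", id), and the names clone_into, reserve and fill are hypothetical.

```python
def is_ref(value):
    """A reference in this toy model is the tuple ("ref", object_id)."""
    return isinstance(value, tuple) and len(value) == 2 and value[0] == "ref"


def clone_into(source, target, root_id):
    """Copy root_id and everything reachable from it from source into target.

    Both files are plain dicts {object_id: value}. Returns the new ID of the root.
    """
    id_map = {}  # source object ID -> ID already reserved in target

    def reserve(src_id):
        # Allocate the target ID as soon as it is needed, before the content exists.
        if src_id in id_map:
            return id_map[src_id], False
        new_id = max(target, default=0) + 1
        target[new_id] = None  # reserved, not filled yet
        id_map[src_id] = new_id
        return new_id, True

    def copy_value(value):
        if is_ref(value):
            new_id, fresh = reserve(value[1])
            if fresh:
                fill(value[1], new_id)
            # A back-reference to an object still being copied simply reuses
            # its reserved ID, so cycles terminate.
            return ("ref", new_id)
        if isinstance(value, dict):
            return {key: copy_value(item) for key, item in value.items()}
        if isinstance(value, list):
            return [copy_value(item) for item in value]
        return value

    def fill(src_id, new_id):
        # Fill the reserved slot once the copied content is complete.
        target[new_id] = copy_value(source[src_id])

    new_root, fresh = reserve(root_id)
    if fresh:
        fill(root_id, new_root)
    return new_root
```

With a page whose link annotation points back at that same page, the copy terminates and both references end up pointing at the newly allocated IDs instead of looping or dangling.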

Discussion

  • Andreas Pinter

    Andreas Pinter - 2012-08-14

    I have found a bug with /GoTo actions referencing pages.

     
  • Stefano Chizzolini

    • status: open --> closed-fixed
     
  • Andreas Pinter

    Andreas Pinter - 2013-02-25

    Hi Stefano,

    the cloning of annotations and links seems to work just fine.
    But I saw that you didn't include my function for copying NamedDestinations in the PageManager.Add() functions. Is there another way (one I haven't found yet) to copy those destinations? All the links are rather useless if the destination is not copied as well.

    Nevertheless, thanks for fixing that circular reference issue (in a much more elegant way than I did).

     
  • Andreas Pinter

    Andreas Pinter - 2013-03-04

    Hi Stefano,

    I hate to bother you again, but I can't get it to work.
    As far as I understand your solution, you are now simply cloning each page and recursively cloning everything inside it using the "Cloner". Since none of the objects (like LocalDestination) has a .clone method any more, it's rather hard to debug and find out whether a LocalDestination is actually cloned or not. When a link like << /Type /Action /S /GoTo /D (Ne76e096b) >> is copied to the target PDF, it is indeed copied with a new object ID, but unfortunately 'Ne76e096b' does not show up in the /Names section of the PDF.

    Could this be because named destinations are siblings of pages (hierarchically speaking) and not children?
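
The question above can be illustrated with a small reachability check. This is only a sketch under assumptions: the document is modelled as a plain dict of objects, references are ("ref", id) tuples, the /Parent back-link of the page is left out for brevity, and the helper names refs_in and reachable are hypothetical. It does not use or describe PDF Clown's Cloner. The point it shows is that a GoTo action stores the destination as a name string, while the entry for that name sits in the name tree under the catalog, so a walk that starts at the page never reaches it.

```python
def refs_in(value):
    """Yield every object ID referenced (directly or nested) inside value."""
    if isinstance(value, tuple) and len(value) == 2 and value[0] == "ref":
        yield value[1]
    elif isinstance(value, dict):
        for item in value.values():
            yield from refs_in(item)
    elif isinstance(value, list):
        for item in value:
            yield from refs_in(item)


def reachable(objects, start_id):
    """Return the set of object IDs reachable from start_id by following references."""
    seen = set()
    stack = [start_id]
    while stack:
        object_id = stack.pop()
        if object_id in seen:
            continue
        seen.add(object_id)
        stack.extend(refs_in(objects[object_id]))
    return seen


# Toy document: the link's destination is only a *name*, not a reference.
document = {
    1: {"Type": "Catalog", "Pages": ("ref", 2), "Names": ("ref", 5)},
    2: {"Type": "Pages", "Kids": [("ref", 3)]},
    3: {"Type": "Page", "Annots": [("ref", 4)]},
    4: {"Subtype": "Link", "A": {"S": "GoTo", "D": "Ne76e096b"}},
    5: {"Dests": ("ref", 6)},
    6: {"Names": ["Ne76e096b", [("ref", 3), "Fit"]]},
}

copied_with_page = reachable(document, 3)   # page and its link annotation only
copied_with_catalog = reachable(document, 1)  # includes the name tree (5, 6)
```

In this model, a page-rooted copy would have to look up each destination name in the source name tree explicitly and carry the matching entry over.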

     
  • Stefano Chizzolini

    Hi Andreas,
    to definitely solve this issue, the best thing would be if you could send me a PDF to test your assertion.

    I'm looking forward to your sample, thank you!
    Stefano

     
  • Andreas Pinter

    Andreas Pinter - 2013-03-04

    Hi Stefano,

    attached are two example PDFs and the resulting file.
    I changed "open_source.pdf" so that the "Introduction" link on the first page uses a named destination.

    The merging was done by opening "alice.pdf" with a PageManager and calling manager.add(3, "open_source.pdf"); I know that this is not totally correct, but I guess I can spare you the details of how to open the files etc.

    I saved the resulting file using orig.Save("path", StandardMode);

    As you can see, "result.pdf" does not have any named destinations.

    Hope this helps

     
  • Andreas Pinter

    Andreas Pinter - 2013-03-05

    Unfortunately I'm still not totally happy.
    See the attached C# source to see how I am using the merging. When using alice.pdf as "file1" and open_source.pdf as "file2", the "Introduction" link from open_source.pdf works just fine in the result, but Acrobat Pro 9 does not seem to find any named destinations to display in its list, although the "Introduction" destination is present in the PDF source (see result.pdf).

    Another little issue (which may be my own problem): when including a PDF whose links point to named destinations that are not present (yet), the cloner throws an exception and aborts the whole process. A more resilient approach would be better in my opinion.

    Greetings
    -- Andreas
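
One way to picture the "more resilient approach" asked for above is to collect unresolved destination names and report them, rather than raising on the first one. This is a hypothetical sketch on plain dicts, not the behaviour of PDF Clown's cloner; copy_destinations and its parameters are made-up names.

```python
def copy_destinations(link_names, source_dests, target_dests):
    """Copy the destinations used by link_names from source_dests into target_dests.

    Names that cannot be resolved in the source are returned instead of
    aborting the whole merge. Entries already present in the target are
    left untouched here; name clashes are a separate concern.
    """
    missing = []
    for name in link_names:
        if name in source_dests:
            target_dests.setdefault(name, source_dests[name])
        else:
            missing.append(name)
    return missing
```

The caller can then decide whether a non-empty result is a warning or an error.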

     
    Last edit: Andreas Pinter 2013-03-05
  • Andreas Pinter

    Andreas Pinter - 2013-03-08

    Works fine, thanks.

    A last note:
    If there is a named destination "foo" in both PDFs, one set of links* to "foo" is semantically broken after the merge, because their original "foo" is not the same any more.
    But I guess that's a problem I need to take care of prior to merging the PDFs, especially because some destination names have additional meaning for me and can't simply be replaced by some hash value.

    * I can't really say that the "inserted links" are the broken ones, since I also have an example where the original links get changed to point to the newly inserted destination.

    Anyway keep up the good work, it's a pleasure to work with the clown so far!
    -- Andreas
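
The pre-merge check mentioned in the note above could look roughly like the following. It is a sketch on plain dicts and lists rather than on real PDF objects; find_clashes, rename_clashes and the readable prefix are assumptions made for illustration. A prefix is used instead of a hash so that the original name stays recognisable.

```python
def find_clashes(dests_a, dests_b):
    """Return the destination names defined in both documents, sorted."""
    return sorted(set(dests_a) & set(dests_b))


def rename_clashes(dests, links, clashes, prefix):
    """Rename clashing destinations in one document and retarget its links.

    dests maps destination name -> target, links is the list of destination
    names used by that document's links. Returns new (dests, links).
    """
    mapping = {name: prefix + name for name in clashes}
    new_dests = {mapping.get(name, name): target for name, target in dests.items()}
    new_links = [mapping.get(name, name) for name in links]
    return new_dests, new_links
```

Applied to the document being inserted before the merge, this keeps both "foo" destinations distinct, at the cost of changing the name in one of them.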

     
    Last edit: Andreas Pinter 2013-03-11
