Output files size

  • Anonymous - 2007-04-24


    (Version info: pdfsam-basic 0.6sr3)

    I stumbled on a small "issue": splitting a 4MB file into two parts produces two 4MB files.
    The user who reported it to me wanted to reduce the total size of what he had to send by email, so this was not really the expected result.

    I tried to have a look at the code and, being ignorant when it comes to Java, I merely understand that the splitting process begins by PdfCopy'ing the original document and then inserts the pages we want. Knowing a little bit about PDF structure, I guess that every part resulting from the split operation holds the whole object collection, and only the page structure description is adjusted to reflect the content we want displayed.
    Would it be possible to have a cleaning pass afterward, before closing the resulting file, to ensure that unreferenced objects are thrown away?
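
The cleaning pass suggested above amounts to a reachability check: start from the pages a part keeps, follow every reference, and drop whatever is never reached. Below is a minimal sketch of that idea in Java. It is a toy model under loud assumptions: the class name `UnreferencedObjectSweep`, the object numbers, and the reference map are all hypothetical, and none of it is pdfsam or iText code.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: object numbers and their outgoing references are made up,
// and nothing here is pdfsam or iText code.
class UnreferencedObjectSweep {

    // Returns every object number reachable from the given roots
    // (for a split part: the page objects that part keeps).
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs,
                                  Collection<Integer> roots) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Integer obj = pending.pop();
            if (!seen.add(obj)) {
                continue; // already visited; also guards against reference cycles
            }
            for (Integer target : refs.getOrDefault(obj, List.of())) {
                pending.push(target);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> refs = new HashMap<>();
        refs.put(1, List.of(10, 20)); // page 1 -> content stream 10, shared font 20
        refs.put(2, List.of(11, 20)); // page 2 -> content stream 11, shared font 20
        refs.put(3, List.of(12, 30)); // page 3 -> content stream 12, large image 30

        // The first part of the split keeps pages 1 and 2 only.
        Set<Integer> kept = reachable(refs, List.of(1, 2));
        System.out.println("kept: " + kept);
        // Objects 3, 12 and 30 are not in the set, so a cleaning pass could drop them.
    }
}
```

In this model the large image (object 30) is only referenced by page 3, so it would not need to be written into the part that holds pages 1 and 2. Whether the real split code can do this depends on how the library copies objects, which the thread does not establish.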

    Should I file a bug report or a feature request?

    Thanks in advance for your support,

    Best regards,

    • Andrea Vacondio - 2007-04-26

      Hi Francis,

      I have already rewritten the split process and it will be available in the next release (there were some other issues, such as annotations), but I don't know when I'll be able to release this new version.

      I'm even planning to add a compression feature, but I don't know yet, we'll see. Anyway, I hope to release this new split process soon.
      Please tell me if this issue is still present in the new version.

      Best regards,

    • Anonymous - 2007-04-27


      Glad to know it.

      No problem, I'll post a follow-up on this as soon as the next release is out. Thank you!

      Best regards,

    • Tony Gravagno - 2007-06-01

      I just experienced the same issue with 0.6.sr3. I have an 11MB file that I split at page 114 of 199. After about 40 minutes of CPU-intensive processing, the first file was 47MB and the second was 37MB. Adobe Acrobat seems to have a problem with "large" files, which is the reason I wanted to split _down_ from 11MB. Thanks for helping with the split, but I'm now also looking for a compression tool because these files are useless as-is.

      FYI, I was going to use the latest beta from source but I'm off-site, the PC I'm using doesn't have Java development tools, and there is no information in the packages about what the files are or what we need to do to compile.  I wouldn't mind trying to work from source and seeing if I can help with the code, but that can't happen for a while.


    • Andrea Vacondio - 2007-06-02

      The svn repository contains the pdfsam source code as an Eclipse project, so you simply need to check it out (external libraries are not included, so you also have to download iText, dom4j, jaxen, jcmdline and jlooks). An Ant script is also included to create the jars.
      I'm going to release soon and the new version uses a new console split method, so perhaps this issue will be fixed. Anyway, I added a "compression feature" to my TODO list for the next release.
      It would be nice if I could have this big PDF document to run some tests.
      Write me an email if you want more detail about the source code or if you want to send me the document.

