Page removal

Help
2012-03-06
2013-01-26
  • Hi all,

    a user wrote to me about an issue with page removal:

    I want to use PDF Clown's ability to remove pages from PDF files in a .NET project.
    To do so, I use the following code:

    public static void deletePages(string inputPdfFile, string outputPdfFile, int[] deletePages)
    {
      org.pdfclown.files.File inputFile = new org.pdfclown.files.File(inputPdfFile);
      org.pdfclown.documents.Document inputDocument = inputFile.Document;
      foreach(int pageIndex in deletePages)
      {inputDocument.Pages.Remove(inputDocument.Pages[pageIndex]);}
      inputFile.Save(outputPdfFile, SerializationModeEnum.Standard);
    }
    

    The problem is that the deleted pages still seem to exist physically: after executing my deletePages method, the newly generated outputPdfFile has the same file size as inputPdfFile.

    I suggest you read the User Guide in the downloadable distribution: it describes the basic mechanisms involved in PDF document manipulation through PDF Clown. In particular, each high-level object (i.e. one inheriting from PdfObjectWrapper, like Page) has two orthogonal referential dimensions:
    * horizontal: a reference within the same abstraction level (e.g. a Page is contained in a Pages collection)
    * vertical: a reference across multiple abstraction levels (e.g. a Page is contained in an indirect object)
    Removing a page from a pages collection (as you did above) cuts only the horizontal reference, leaving the page orphaned within the document, so its data is still saved with the file. To remove a page completely, you also have to call its delete() method (which detaches the object from its document). To simplify these operations, there's a specialized helper class (PageManager; see PageManagementSample in the downloadable distribution for a practical demonstration).
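
    As a rough sketch of the two steps described above (not taken from the distribution's samples), the method from the question could be adapted as follows. It assumes that the .NET binding exposes delete() as Delete() on Page; PageRemovalSketch, DeletePages and NormalizeIndexes are names chosen here for illustration only. The sketch also resolves the pages before removing any of them, because removing a page shifts the indexes of the pages after it.

    ```csharp
    using System.Collections.Generic;
    using org.pdfclown.documents;
    using org.pdfclown.files;

    public static class PageRemovalSketch
    {
      // Hypothetical helper: sorts the requested indexes and drops duplicates.
      public static List<int> NormalizeIndexes(int[] pageIndexes)
      {return new List<int>(new SortedSet<int>(pageIndexes));}

      public static void DeletePages(string inputPdfFile, string outputPdfFile, int[] pageIndexes)
      {
        org.pdfclown.files.File inputFile = new org.pdfclown.files.File(inputPdfFile);
        Document inputDocument = inputFile.Document;

        // Resolve every index to its Page object before removing anything,
        // since each removal shifts the indexes of the following pages.
        List<Page> pagesToDelete = new List<Page>();
        foreach(int pageIndex in NormalizeIndexes(pageIndexes))
        {pagesToDelete.Add(inputDocument.Pages[pageIndex]);}

        foreach(Page page in pagesToDelete)
        {
          // Horizontal reference: take the page out of the Pages collection.
          inputDocument.Pages.Remove(page);
          // Vertical reference: detach the page object from its document
          // (assumption: delete() is exposed as Delete() in the .NET binding).
          page.Delete();
        }

        inputFile.Save(outputPdfFile, SerializationModeEnum.Standard);
      }
    }
    ```

    A call such as PageRemovalSketch.DeletePages("in.pdf", "out.pdf", new int[]{0, 2}) would then be expected to drop the first and third pages of the original document; the file names are placeholders.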

    I hope this helps!
    Stefano