PDF File Creation Anomaly between compose and clone.

2015-07-22
2015-08-02
  • Steven Swenson

    Steven Swenson - 2015-07-22

    Hello, I wrote some test code that creates a new source PDFClown.File, à la "Hello World", and saves it.

    A new destination PdfClown.File is then created and its Document is assigned to a variable. The source page collection is also assigned to a variable, each specified page is cloned and added to the destination, and the resulting file is saved.

    foreach (int page in this.PageList)
    {
        int pageIndex = page - 1; // Page numbers are 1-based; the page collection is 0-based.
        Page myPage = (Page) srcPages[pageIndex].Clone(destfile.Document);
        destPages.Add(myPage);
    }
    

    So far so good: the documents appear identical in a reader, so the process seems to work. However, I used the following to check the result externally:

    byte[] sourcestream = SYSFILE.ReadAllBytes(mySourceFile);
    byte[] deststream = SYSFILE.ReadAllBytes(myOutputFile);
    
    Assert.True(sourcestream.Length == deststream.Length, String.Format("Source was {0} bytes long Dest was {1} bytes long", sourcestream.Length, deststream.Length));
    

    The test fails:

    KauffmanClassTests.TestTask.TestTaskPageRetrieval:
    Source was 893 bytes long Dest was 975 bytes long
    Expected: True
    But was: False

    The Filenames:
    test.pdf
    output.pdf

    Is there anything particular to PDFClown's Clone method that would explain this discrepancy? Did I miss something?

     
  • Steven Swenson

    Steven Swenson - 2015-07-22

    I added the two files. The PDF headers differ, while the stream contents appear to be identical. The xref counts are different.

    /Type /Kids shows 5 0 R in test.pdf, and 4 0 R in output.pdf.

    test.pdf lists 7 objects; output.pdf shows 8.

    Are these document differences important? Is an empty PDFClown.File with a composed page expected to differ from one with an added clone of that page?

     
  • Stefano Chizzolini

    Why are you so obsessed with the binary identity of distinct files with cloned pages? :-) PDF is a somewhat relaxed format when it comes to token spacing, object numbers or object composition -- it doesn't make any sense to lament a "creation anomaly" based on wrong assumptions: the assertion in your test is nonsensical, as the PDF specification doesn't require any particular binary representation of the file as long as the resulting contents look the same.

    PS: This forum ("Open Discussion") is about the development of the library -- any question about the use of the library should be posted on the Help forum.
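
    As an illustration of comparing content rather than file size, here is a minimal sketch that pulls the raw bytes between each "stream" and "endstream" keyword out of two files and compares them, which is roughly the check described above as "stream interior appears to be identical". It does not use PDF Clown. The names PdfStreamCompare, ExtractStreamBodies and SameStreamBodies are hypothetical, and the keyword scan is a naive assumption that should hold for simple files like the "Hello World" ones here, but not for PDFs with cross-reference streams or stream data that happens to contain the keywords.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of PDF Clown.
public static class PdfStreamCompare
{
    // Latin-1 maps every byte to exactly one char, so binary data survives
    // the byte[] -> string conversion unchanged.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Naive scan (assumption): the "stream" keyword followed by an end-of-line,
    // then the shortest run of data up to the next "endstream" keyword.
    // The lookbehind keeps "endstream" itself from being read as "stream".
    private static readonly Regex StreamBody = new Regex(
        @"(?<!end)stream\r?\n(.*?)\r?\n?endstream",
        RegexOptions.Singleline);

    // Returns the raw (still encoded) body of every stream, in file order.
    public static List<string> ExtractStreamBodies(byte[] pdfBytes)
    {
        string text = Latin1.GetString(pdfBytes);
        return StreamBody.Matches(text)
                         .Cast<Match>()
                         .Select(m => m.Groups[1].Value)
                         .ToList();
    }

    // True when both files contain the same stream bodies in the same order,
    // regardless of object numbers, xref layout or total file length.
    public static bool SameStreamBodies(byte[] a, byte[] b)
    {
        return ExtractStreamBodies(a).SequenceEqual(ExtractStreamBodies(b));
    }
}
```

    In the test above, Assert.True(PdfStreamCompare.SameStreamBodies(sourcestream, deststream)) could stand in for the length comparison, assuming both files store their streams in the same order and with the same filters. A sturdier test would load both files with a PDF library and compare page counts or extracted text.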

     

    Last edit: Stefano Chizzolini 2015-08-02
