PDF File Creation Anomaly between compose and clone.

Forum: Open Discussion

Creator: Steven Swenson

Created: 2015-07-22

Updated: 2015-08-02

Steven Swenson - 2015-07-22

Hello, I wrote some test code that creates a new source PDFClown.File, à la "Hello World", and saves it.

A new destination PdfClown.File is then created, and its Document is assigned to an object. The source page collection is assigned to another object, and each specified page is cloned and added to the destination. The resulting file is saved.

    foreach (int page in (this.PageList))
    {
        int pageIndex = page - 1; //Page count is 1 based, array is not.
        Page myPage = (Page) srcPages[pageIndex].Clone(destfile.Document);
        destPages.Add(myPage);
    }

So far so good... The Documents appear identical in a reader so the process seems to work. However, using the following to test that externally:

    byte[] sourcestream = SYSFILE.ReadAllBytes(mySourceFile);
    byte[] deststream = SYSFILE.ReadAllBytes(myOutputFile);
    Assert.True(
        sourcestream.Length == deststream.Length,
        String.Format("Source was {0} bytes long Dest was {1} bytes long", sourcestream.Length, deststream.Length));

The test fails:

KauffmanClassTests.TestTask.TestTaskPageRetrieval:
Source was 893 bytes long Dest was 975 bytes long
Expected: True
But was: False

The Filenames:
test.pdf
output.pdf

Is there anything particular to PDFClown's Clone method that would explain this discrepancy? Did I miss something?

Steven Swenson - 2015-07-22

Added the two files. The PDF headers differ; the stream interiors appear to be identical. The xref counts are different.

The /Kids entry shows 5 0 R in test and 4 0 R in output.

Test lists 7 objects; output shows 8.

Are these document differences important? Should an empty PDFClown.File with a composed page be expected to differ from one with an added clone of the composed page?

output.pdf

test.pdf


Stefano Chizzolini - 2015-08-02

Why are you so obsessed with the binary identity of distinct files with cloned pages? :-) PDF is a somewhat relaxed format when it comes to token spacing, object numbers or object composition -- it doesn't make any sense to lament a "creation anomaly" based on wrong assumptions: the assertion in your test is nonsensical, as the PDF specification doesn't require the binary representation of a file to be preserved as long as the resulting contents look the same.
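For anyone who wants to see where such differences come from (for example the 7 versus 8 objects reported above), here is a minimal sketch that counts indirect object headers ("N G obj") in the raw bytes of a file. It uses only the .NET base class library, not PDF Clown; the class name PdfObjectCount and its Count method are hypothetical names made up for this illustration. It is a rough heuristic, assuming uncompressed objects: objects packed inside object streams are not seen, and binary stream data could in principle produce a false match.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of PDF Clown.
static class PdfObjectCount
{
    // Matches indirect object headers such as "5 0 obj".
    // "endobj" is not matched because the pattern requires two numbers first.
    static readonly Regex ObjHeader = new Regex(@"\d+\s+\d+\s+obj\b");

    public static int Count(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so binary
        // stream data cannot make the decoding step fail.
        string text = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);
        return ObjHeader.Matches(text).Count;
    }

    static void Main(string[] args)
    {
        // Print size and object count for each file given on the command line.
        foreach (string path in args)
        {
            byte[] bytes = File.ReadAllBytes(path);
            Console.WriteLine("{0}: {1} bytes, {2} indirect objects",
                path, bytes.Length, Count(bytes));
        }
    }
}
```

Run against test.pdf and output.pdf, this only reports how the two files differ in structure. Differing counts or byte lengths do not by themselves mean that the page content differs.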

PS: This forum ("Open Discussion") is about the development of the library -- any question about the use of the library should be posted on the Help forum.

Last edit: Stefano Chizzolini 2015-08-02
