#153 Again on the Stacking distortion in gscan2pdf - ID: 3382109

open
nobody
None
5
2013-01-28
2012-12-20
No

I am submitting another bug report related to the so-called stacking distortion in gscan2pdf (ID: 3382109). This bug, whose effect can be described as a horizontal shift of large or small rectangular blocks of the image, has been elusive but shows up regularly when gscan2pdf is used as a front end for digitizing documents. I have succeeded in reproducing it deterministically, and I have attached the log file of the session, a copy of the standard error of the session, and a precise description of the steps that trigger it. Before describing those steps, let me specify my setup: I am running a fully updated Ubuntu 12.10 on a Vaio VGN-FE48M with gscan2pdf 1.1.0 (the latest stable?) installed, and the file I am working on is an 1857 paper by Faà di Bruno, taken from a copy of the Quarterly Journal of Pure and Applied Mathematics made freely available by Google Books:
1) I downloaded a PDF copy of the whole Journal from the link http://books.google.com/books?id=7BELAAAAYAAJ&pg=PA359
2) I launched gscan2pdf from the command line: gscan2pdf --log=log 2> std_err_log
3) I opened the downloaded PDF file and extracted the two pages I am interested in, i.e. pages 425 and 426.
4) I removed Google's watermarks
5) I renumbered the pages. I think this step is not significant: during the recorded test I had to do it twice because I chose a wrong initial value for the page number (2 instead of 1).
6) I cleaned the two pages with unpaper using the standard (i.e. factory default) options. This step, the following one and their relative order of execution may be important for fixing the bug, see below.
7) I then cleaned the same two pages with gimp (version 2.8.2), in order to remove the upper part of the text on the first page (the concluding remarks of the preceding paper), the black streaks at the bottom of the second page, and some big black spots on both pages that unpaper cannot remove.
8) I saved the file in PDF format, and the saved file shows stacking distortion.
I repeated the test three times. The first run succeeded in reproducing the bug while the second one did not, but in that run I did step 7 before step 6. I also noticed that the files opened in gimp were in PNG instead of PNM format. For the third and last run, to which the logs and the result file included here refer, I powered off the computer, restarted it and then executed exactly the sequence described above. I hope all this will help in fixing the bug: if you need me to experiment more, please let me know. Thank you for your attention.
Best,
Daniele Tampieri

Discussion

  • Daniele Tampieri

    The log file of the execution

     
  • Daniele Tampieri

    The copy of the standard error during the same execution

     
  • Daniele Tampieri

    Addendum: note that the stacking distortion always appears on the second page, in the same position. You can check this by comparing the PDF results of the first and the second tests, i.e. qjpam_1857_359-360.pdf and Francesco Faà di Bruno 2012-12-20_II.pdf.
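
A minimal sketch of how such a comparison could be automated. It assumes both pages have already been rasterized to the same size and loaded as lists of pixel rows (that step is not shown), and the function names `best_row_shift` and `shifted_bands` are hypothetical. It only illustrates the idea of locating horizontally shifted bands of rows; it is not how gscan2pdf or PDF::API2 work internally.

```python
def best_row_shift(ref_row, test_row, max_shift=4):
    """Return the horizontal offset (in pixels) that best aligns test_row with ref_row."""
    best_shift, best_score = 0, -1
    # Try small offsets first so that ties favour "no shift".
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        score = sum(
            1
            for x, value in enumerate(ref_row)
            if 0 <= x + shift < len(test_row) and test_row[x + shift] == value
        )
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def shifted_bands(reference, test, max_shift=4):
    """Group consecutive rows with the same non-zero shift into (first_row, last_row, shift)."""
    bands = []
    for y, (ref_row, test_row) in enumerate(zip(reference, test)):
        shift = best_row_shift(ref_row, test_row, max_shift)
        if shift == 0:
            continue
        if bands and bands[-1][1] == y - 1 and bands[-1][2] == shift:
            # Extend the current band by one row.
            bands[-1] = (bands[-1][0], y, shift)
        else:
            bands.append((y, y, shift))
    return bands


# Toy example: rows 1 and 2 of the "distorted" page are shifted right by 2 pixels.
clean_row = [0, 0, 1, 1, 0, 0, 0, 0]
moved_row = [0, 0, 0, 0, 1, 1, 0, 0]
reference_page = [clean_row] * 4
distorted_page = [clean_row, moved_row, moved_row, clean_row]
print(shifted_bands(reference_page, distorted_page))
```

If two distorted outputs report the same bands, that would support the observation that the distortion always lands in the same position.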

     
  • Daniele Tampieri

    I must clarify that Bruno 2012-12-20_II.pdf is the result of the third test and not of the second, as correctly stated in the bug report. I apologize for the mistake.

     
  • Jeffrey Ratcliffe

    I have seen similar problems before. In those cases the bug was in PDF::API2, and the workaround was to use PNG compression rather than LZW in the save PDF dialog.

     
  • Daniele Tampieri

    Thanks, Jeff (may I call you that?). I also saw that the module is currently not maintained by anyone, so there is little hope of seeing it fixed soon. However, I have decided to report this bug on its SourceForge tracker, with a comment and a link from there to here: let's leave some work for anyone who would like to take over its maintenance. :-D

     
  • Jeffrey Ratcliffe

    Can you confirm that PNG compression is a viable workaround?

    Would you mind adding a link to the new bug?

     
  • Daniele Tampieri

    Here is the link to the new bug:

    https://sourceforge.net/tracker/?func=detail&atid=428259&aid=3601130&group_id=40547

    I'll test the workaround on a few old PDF papers over the weekend: if it gives reasonable results (and I have no doubt it will), I'll post a confirmation here. After that it may also be useful if you wrote a few lines about it on the software's web page, but let's see how the test goes.

    Best, Daniele

     
  • Daniele Tampieri

    Ok, all done. A little late, but I managed to run all the tests, so let me describe the setup.
    1) I used the same file that served to identify the bug, proceeding exactly as described before.
    2) After editing the files with gimp, I returned to gscan2pdf and opened the save dialog to save my work in PDF format.
    3) I saved the file several times, choosing a different compression method each time, and I obtained the following results:
    --Automatic compression: stacking error present in the output
    --JPEG compression: stacking error absent in the output
    --LZW compression: stacking error present in the output
    --Packbits compression: stacking error present in the output
    --PNG compression: stacking error absent in the output
    --ZIP compression: stacking error absent in the output
    4) Conclusions: except for the Automatic, LZW and Packbits compression methods, all other compression methods gave an output which is free of the stacking error. Therefore I CONFIRM THE WORKAROUND of the bug proposed by Jeff. However, if you need LZW compression for some reason and also need to process your images with gimp, then process them with gimp BEFORE processing them with unpaper.
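
To check which compression a saved PDF actually uses, one option is to look at the stream filter names inside the file. The sketch below scans the raw bytes for `/Filter` entries and counts them. The filter names (`LZWDecode`, `FlateDecode`, `DCTDecode`, `RunLengthDecode`) come from the PDF specification, but how gscan2pdf's save options map onto them is an assumption here (presumably LZW to `LZWDecode`, PNG/ZIP to `FlateDecode`, JPEG to `DCTDecode`). The helper name `count_stream_filters` is hypothetical, and the scan will miss dictionaries stored inside compressed object streams.

```python
import re
from collections import Counter

# Matches "/Filter /Name" as well as "/Filter [/Name1 /Name2]".
FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Count the filter names found in /Filter entries of a PDF's raw bytes."""
    counts = Counter()
    for match in FILTER_RE.finditer(pdf_bytes):
        for name in NAME_RE.findall(match.group(1)):
            counts[name.decode("ascii")] += 1
    return counts


# Synthetic fragment standing in for a real file (an assumption for illustration).
sample = (
    b"<< /Type /XObject /Subtype /Image /Filter /LZWDecode >>\n"
    b"<< /Length 42 /Filter [/ASCII85Decode /FlateDecode] >>\n"
)
print(count_stream_filters(sample))
```

For a real file, read it with `open(path, "rb").read()` and pass the bytes in. A page image reported as `LZWDecode` would be a candidate for re-saving with PNG, ZIP or JPEG compression.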

     
