gscan2pdf generates large file size pdfs
Congratulations on the software. It's so much easier to use than the other scanning software in Ubuntu and has a lot more functionality. I like it a lot.
On the other hand, I noticed that the PDFs generated by gscan2pdf are a lot larger than those from simple-scan. The same document that occupies 1.7 MB in greyscale at 150 dpi in simple-scan occupies 5.5 MB in gscan2pdf at the same resolution (that is an average across all the compression options). I know that there is a JPG quality option that can reduce the file size even further, but then the document becomes useless.
Am I the only one noticing that?
Thank you
André
Which compression option are you using when saving the PDF?
Can you attach test PDFs from simplescan and gscan2pdf so that I can see better what sort of image you are scanning?
Sure, I tried all the file compression options, and the generated files are in the zip file. All the files were created with the option to reduce to 90 ppi, except one, which is marked in the zip file.
Thanks
http://dl.dropbox.com/u/1610398/Gscan%20pdf%20documents.zip
The reason that the PDF from simple scan is smaller is that simple scan has saved the images as black & white, rather than greyscale. If I import the PNG-compressed PDF, threshold the pages and save, I get a PDF of 500 kB, which is less than a third of the size of the one from simple scan.
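To illustrate why thresholding makes such a difference, here is a minimal sketch in Python. It is not gscan2pdf code: the page is synthetic (a made-up pattern of dark bars on light paper with some sensor-like noise), and `zlib` stands in for the Deflate compression that PNG uses. The names `synthetic_grey_page` and `threshold_and_pack` are hypothetical helpers invented for this example.

```python
import random
import zlib

WIDTH, HEIGHT = 400, 300  # small synthetic "page", not a real scan


def synthetic_grey_page(seed: int = 0) -> bytes:
    """Return 8-bit greyscale pixels: light paper, dark text-like bars, plus noise."""
    rng = random.Random(seed)
    pixels = bytearray()
    for y in range(HEIGHT):
        for x in range(WIDTH):
            is_ink = (y % 20) < 6 and (x % 50) < 35
            base = 40 if is_ink else 220
            # Noise of +/-15 grey levels imitates scanner noise (an assumption).
            pixels.append(base + rng.randint(-15, 15))
    return bytes(pixels)


def threshold_and_pack(grey: bytes, cutoff: int = 128) -> bytes:
    """Threshold to black & white and pack 8 pixels into each byte.

    Assumes WIDTH is a multiple of 8, which holds for the constants above.
    """
    packed = bytearray()
    for row_start in range(0, len(grey), WIDTH):
        row = grey[row_start:row_start + WIDTH]
        for i in range(0, WIDTH, 8):
            byte = 0
            for value in row[i:i + 8]:
                byte = (byte << 1) | (1 if value >= cutoff else 0)
            packed.append(byte)
    return bytes(packed)


grey = synthetic_grey_page()
bw = threshold_and_pack(grey)

grey_size = len(zlib.compress(grey, 9))
bw_size = len(zlib.compress(bw, 9))

print(f"greyscale: {len(grey)} raw bytes, {grey_size} compressed")
print(f"black & white: {len(bw)} raw bytes, {bw_size} compressed")
```

The black & white version starts out at one eighth of the raw size, and thresholding also removes the noise that a lossless compressor cannot squeeze out of the greyscale data. Real PDF writers may use dedicated bilevel codecs rather than Deflate, so the exact ratio for a real scan will differ.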
Propose to change to 'open'.
Version 1.5.2 on Kubuntu 16.10 - but also noted on earlier versions.
Greyscale files are disproportionately large. Scanning 4 pages as binary results in a PDF of about 500 kB; 8-bit greyscale creates a 19.1 MB file. PDF compression is set to 'auto'. Even an 8-bit colour scan of the four pages results in a reasonable file size of 4.2 MB. So the greyscale file is really unreasonably large.
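For a rough sense of scale, the uncompressed size of a scanned page can be estimated from its dimensions, resolution and bit depth. The sketch below is in Python and rests on assumptions that are not stated in the report above: an A4 page, 150 dpi (the resolution mentioned in the first post), and 24 bits per pixel for colour. The function name `raw_page_bytes` is made up for this example.

```python
import math

MM_PER_INCH = 25.4


def raw_page_bytes(width_mm: float, height_mm: float, dpi: int, bits_per_pixel: int) -> int:
    """Estimate the uncompressed image size of one page in bytes.

    Rows are padded to a whole number of bytes, as is usual for packed 1-bit data.
    """
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


# Assumed page: A4 (210 x 297 mm) at 150 dpi.
for label, bits in [("binary", 1), ("8-bit greyscale", 8), ("24-bit colour", 24)]:
    size = raw_page_bytes(210, 297, 150, bits)
    print(f"{label}: {size} bytes per page, {4 * size} bytes for 4 pages")
```

Comparing these raw figures with the size of the saved PDF gives a rough indication of how much the chosen compression is actually achieving for each mode.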
Other than that, thanks for providing this very useful piece of software. The possibility to scan into a PDF with on-the-fly OCR is really, really useful for digitally archiving documents.
Cheers
Syiad
Last edit: Syiad 2016-12-17
Auto compression for 8-bit greyscale is PNG. LZW gives better compression, but there is a bug in the PDF::API2 module for LZW, which means that it sometimes corrupts the image. You can use it, but you should check the results and switch to a different compression method if necessary.
Thanks for the info. Using LZW isn't really practicable then, if I would have to check the resulting file page by page afterwards. So I'm basically left with binary (fast) and 8-bit colour (which takes longer for the scanning process than greyscale).
Any chance of fixing greyscale file size in a future version of gscan2pdf?
The LZW bug is not in gscan2pdf, but in the Perl module PDF::API2.
I accept, however, that it is a significant problem for some people, so I have reopened this bug report.
I have already had a stab at fixing the bug, without success. I am making another attempt, but I don't have a solution yet.