I'm seeing regular page size corruption when scanning using an Epson
NX100 in gscan2pdf 1.5.1.
From the log file (near the end):
DEBUG - Started saving /home/alistair/Desktop/Renovation/Expenses/2016-07-22 Paragon.pdf
INFO - Using /usr/share/fonts/TTF/Aegean_hint.ttf for non-ASCII text
INFO - Depth of /tmp/gscan2pdf-40nE/jYgsq3crFJ.pgm is 8
INFO - Type of /tmp/gscan2pdf-40nE/jYgsq3crFJ.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-40nE/jYgsq3crFJ.pgm to /tmp/gscan2pdf-40nE/pra0kpNAw8.png
INFO - Resizing /tmp/gscan2pdf-40nE/pra0kpNAw8.png to 1241.14815799884 x 1753.93700787402
INFO - Writing temporary image /tmp/gscan2pdf-40nE/pra0kpNAw8.png
INFO - Defining page at 595.751115839444pt x 841.889763779528pt
INFO - Added /tmp/gscan2pdf-40nE/pra0kpNAw8.png at 150 PPI
INFO - Depth of /tmp/gscan2pdf-40nE/LWqB0jdHBq.pgm is 8
INFO - Type of /tmp/gscan2pdf-40nE/LWqB0jdHBq.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-40nE/LWqB0jdHBq.pgm to /tmp/gscan2pdf-40nE/wWKlkBOuZ7.png
INFO - Resizing /tmp/gscan2pdf-40nE/wWKlkBOuZ7.png to 1255.39490156924 x 1753.93700787402
INFO - Writing temporary image /tmp/gscan2pdf-40nE/wWKlkBOuZ7.png
INFO - Defining page at 602.589552753237pt x 841.889763779528pt
INFO - Added /tmp/gscan2pdf-40nE/wWKlkBOuZ7.png at 150 PPI
INFO - Depth of /tmp/gscan2pdf-40nE/icvjuc74VZ.pgm is 8
INFO - Type of /tmp/gscan2pdf-40nE/icvjuc74VZ.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-40nE/icvjuc74VZ.pgm to /tmp/gscan2pdf-40nE/SkV6_B9lB1.png
INFO - Resizing /tmp/gscan2pdf-40nE/SkV6_B9lB1.png to 1240.10194258633 x 1753.93700787402
INFO - Writing temporary image /tmp/gscan2pdf-40nE/SkV6_B9lB1.png
INFO - Defining page at 595.248932441439pt x 841.889763779528pt
INFO - Added /tmp/gscan2pdf-40nE/SkV6_B9lB1.png at 150 PPI
INFO - Depth of /tmp/gscan2pdf-40nE/hCTYjcUAJe.pgm is 8
INFO - Type of /tmp/gscan2pdf-40nE/hCTYjcUAJe.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-40nE/hCTYjcUAJe.pgm to /tmp/gscan2pdf-40nE/LQFCUKANaf.png
INFO - Resizing /tmp/gscan2pdf-40nE/LQFCUKANaf.png to 11504.1666666667 x 15462.5
INFO - Writing temporary image /tmp/gscan2pdf-40nE/LQFCUKANaf.png
INFO - Defining page at 5522pt x 7422pt
INFO - Added /tmp/gscan2pdf-40nE/LQFCUKANaf.png at 150 PPI
INFO - Closing PDF
DEBUG - Finished saving /home/alistair/Desktop/Renovation/Expenses/2016-07-22 Paragon.pdf
INFO - Wrote config to /home/alistair/.gscan2pdf
All but the last page have a size of around 600 x 842pts. The last page
is 5522 x 7422pts. The page size in the Geometry tab is set to A4, and
the NX100 scanner glass area is only slightly larger than A4 (flatbed
scanner).
pdfinfo returns the following information:
Creator: gscan2pdf v1.5.1
Producer: PDF::API2
CreationDate: þÿ
ModDate: þÿ
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 4
Encrypted: no
Page 1 size: 595.75 x 841.89 pts (A4)
Page 1 rot: 0
Page 2 size: 602.59 x 841.89 pts
Page 2 rot: 0
Page 3 size: 595.25 x 841.89 pts (A4)
Page 3 rot: 0
Page 4 size: 5522 x 7422 pts
Page 4 rot: 0
File size: 2438077 bytes
Optimized: no
PDF version: 1.4
I'm not able to reproduce this problem reliably, but it happens quite
frequently, just about every scanning session, and is incredibly
annoying as it is more likely to happen the more pages that are scanned
(thus taking longer to rescan).
The complete log file is attached.
Additional version information:
INFO - Perl version v5.24.0
INFO - Glib-Perl version 1.321
INFO - Built for Glib 2.48.1
INFO - Running with Glib 2.48.1
INFO - Gtk2-Perl version 1.2498
INFO - Built for GTK 2.24.30
INFO - Running with GTK 2.24.30
INFO - Gscan2pdf::Document version 1.5.1
INFO - Using GtkImageView version 1.6.4
INFO - Using Gtk2::ImageView version 0.05
INFO - Using PDF::API2 version 2.027
INFO - Using Sane version 1.0.25
INFO - Using libsane-perl version 0.05
Arch Linux 4.6.4-1-ARCH
I haven't changed the priority from 5, but this is significantly impacting
usability from my perspective.
Any suggestions on where to look next to help track this down?
Thanks very much,
Alistair
It looks like this is triggered by scanning a page that is significantly smaller than A4, in particular one that is narrow, e.g. receipts from cash registers.
HTH,
Alistair
It looks to me as though the textcleaner process is losing the resolution of the last page. You are scanning at 600 PPI and gscan2pdf is sometimes importing the output from textcleaner at 72 PPI.
There are two ways of testing this theory:
If you can confirm the above, I can see what we can do about it.
Hi Jeff,
Thanks for the hints. I've tried scanning several times, and of course the problem hasn't reappeared (with textcleaner). I'll keep testing and let you know.
Thanks again,
Alistair
Hi Jeff,
It looks like your hypothesis about textcleaner is correct. Before
running textcleaner the resolution of the first two pages was 600, after
they were 70 (the third page stayed at 600). Note that the resolution
displayed by properties, 70.0, and the resolution mentioned in the log
file below, 72, obviously aren't exactly the same.
DEBUG - Free space in /tmp/gscan2pdf-RmoL (Mb): 3847.15234375 (warning at 10)
INFO - textcleaner -e stretch -f 23 -o 6 -u -T -p 10 /tmp/gscan2pdf-RmoL/3gpWZRS6N8.pnm /tmp/gscan2pdf-RmoL/mQheECLnEB.pnm
INFO - New page filename /tmp/gscan2pdf-RmoL/mQheECLnEB.pnm, format Portable graymap format (gray scale)
INFO - textcleaner -e stretch -f 23 -o 6 -u -T -p 10 /tmp/gscan2pdf-RmoL/bA53zjjrkr.pnm /tmp/gscan2pdf-RmoL/jJZeZ4GyW4.pnm
INFO - Replaced /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm at page 1 with /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm, resolution 72
INFO - New page filename /tmp/gscan2pdf-RmoL/jJZeZ4GyW4.pnm, format Portable graymap format (gray scale)
INFO - textcleaner -e stretch -f 23 -o 6 -u -T -p 10 /tmp/gscan2pdf-RmoL/5d4TmgcSfY.pnm /tmp/gscan2pdf-RmoL/7r408Dn5Ey.pnm
INFO - Replaced /tmp/gscan2pdf-RmoL/s5tC1nObpn.pgm at page 2 with /tmp/gscan2pdf-RmoL/s5tC1nObpn.pgm, resolution 72
INFO - New page filename /tmp/gscan2pdf-RmoL/7r408Dn5Ey.pnm, format Portable graymap format (gray scale)
INFO - Replaced /tmp/gscan2pdf-RmoL/sppfviqxye.pgm at page 3 with /tmp/gscan2pdf-RmoL/sppfviqxye.pgm, resolution 603.185858585859
(end of log)
Some file info:
The pnm and pgm image sizes obviously don't agree, however the
differences are much less than the change in dpi would suggest (600 ->
70).
Modifying the textcleaner command to avoid any options which
change the image size appears to avoid the problem, i.e. the unrotate
(-u) and padding (-p) options.
I can write a script to rescale the image to the original size, but
would be interested to know if you have a better solution.
Thanks very much,
Alistair
P.S. I'm finding the log extract a bit confusing:
INFO - textcleaner -e stretch -f 23 -o 6 -u -T -p 10 /tmp/gscan2pdf-RmoL/3gpWZRS6N8.pnm /tmp/gscan2pdf-RmoL/mQheECLnEB.pnm
INFO - New page filename /tmp/gscan2pdf-RmoL/mQheECLnEB.pnm, format Portable graymap format (gray scale)
However this filename doesn't exist (this is immediately after
textcleaner is run, no other operations have been performed).
INFO - Replaced /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm at page 1 with /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm, resolution 72
The two log lines above were referencing a pnm and now the page is
replaced with a pgm. Is there a conversion message missing?
Saving the file produces:
DEBUG - save filename dialog returned ok
DEBUG - save filename dialog returned ok
DEBUG - Started saving /home/alistair/Desktop/Renovation/Expenses/2016-08-15 UCB sro.pdf
INFO - Using /usr/share/fonts/TTF/Aegean_hint.ttf for non-ASCII text
INFO - Depth of /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm is 8
INFO - Type of /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-RmoL/8xVhl1ztn_.pgm to /tmp/gscan2pdf-RmoL/yiNaylZW5_.png
INFO - Resizing /tmp/gscan2pdf-RmoL/yiNaylZW5_.png to 10700 x 14897.9166666667
INFO - Writing temporary image /tmp/gscan2pdf-RmoL/yiNaylZW5_.png
INFO - Defining page at 5136pt x 7151pt
INFO - Added /tmp/gscan2pdf-RmoL/yiNaylZW5_.png at 150 PPI
INFO - Depth of /tmp/gscan2pdf-RmoL/s5tC1nObpn.pgm is 8
INFO - Type of /tmp/gscan2pdf-RmoL/s5tC1nObpn.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-RmoL/s5tC1nObpn.pgm to /tmp/gscan2pdf-RmoL/SvKaqkPAr2.png
INFO - Resizing /tmp/gscan2pdf-RmoL/SvKaqkPAr2.png to 10739.5833333333 x 14918.75
INFO - Writing temporary image /tmp/gscan2pdf-RmoL/SvKaqkPAr2.png
INFO - Defining page at 5155pt x 7161pt
INFO - Added /tmp/gscan2pdf-RmoL/SvKaqkPAr2.png at 150 PPI
INFO - Depth of /tmp/gscan2pdf-RmoL/sppfviqxye.pgm is 8
INFO - Type of /tmp/gscan2pdf-RmoL/sppfviqxye.pgm is Grayscale
INFO - Selecting png compression
INFO - Converting /tmp/gscan2pdf-RmoL/sppfviqxye.pgm to /tmp/gscan2pdf-RmoL/yJcMJLWyQR.png
INFO - Resizing /tmp/gscan2pdf-RmoL/yJcMJLWyQR.png to 1242.90049133054 x 1753.93700787402
INFO - Writing temporary image /tmp/gscan2pdf-RmoL/yJcMJLWyQR.png
INFO - Defining page at 596.592235838661pt x 841.889763779528pt
INFO - Added /tmp/gscan2pdf-RmoL/yJcMJLWyQR.png at 150 PPI
INFO - Closing PDF
DEBUG - Finished saving /home/alistair/Desktop/Renovation/Expenses/2016-08-15 UCB sro.pdf
I don't think anything needs to be rescaled. The image formats returned by the scanner (or rather SANE) are pnm (pgm or pbm). This is a really primitive format, which doesn't include metadata like resolution. Therefore, gscan2pdf either knows the resolution, because it used it when scanning, or otherwise tried to guess it given your current list of paper sizes. If the pnm output from textcleaner is not a standard size, gscan2pdf defaults to 72.
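As a rough illustration of the guessing step described above, here is a minimal Python sketch. It is not the actual gscan2pdf code (which is Perl); the function name, the aspect-ratio test and the 0.01 tolerance are assumptions made for illustration. The 4998 x 7053 pixel size is inferred from the page 3 figures in the log (596.59pt x 841.89pt at resolution 603.185858585859), and the receipt dimensions are hypothetical.

```python
MM_PER_INCH = 25.4
DEFAULT_PPI = 72.0  # fallback mentioned in the thread

# Paper sizes in millimetres (width, height). A4 is the only size named in the thread.
PAPER_SIZES_MM = {"A4": (210.0, 297.0)}


def guess_resolution(width_px, height_px, papers=PAPER_SIZES_MM, ratio_tolerance=0.01):
    """Guess the PPI of an image that carries no resolution metadata.

    Assumption: an image "matches" a paper size when its aspect ratio is
    within ratio_tolerance of the paper's; the tolerance is hypothetical.
    """
    image_ratio = width_px / height_px
    for name, (width_mm, height_mm) in papers.items():
        if abs(image_ratio - width_mm / height_mm) <= ratio_tolerance:
            # Resolution that makes the image exactly as tall as the paper.
            return height_px / (height_mm / MM_PER_INCH), name
    # No paper size fits: fall back to the default resolution.
    return DEFAULT_PPI, None


# An A4-shaped scan is assigned a resolution close to the real one.
print(guess_resolution(4998, 7053))
# A narrow, receipt-shaped image matches nothing and falls back to 72.
print(guess_resolution(1800, 7000))
```

With these inputs the first call gives about 603.19 PPI for A4, in line with the resolution logged for page 3, while the receipt-shaped image falls through to 72.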
The image itself is not affected, just what gscan2pdf thinks the resolution, and therefore the page size, should be. For example, if it thinks the resolution is 72 when it is in fact 144, then the page will be twice as long as it should be.
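The arithmetic can be checked against the log above with a short Python sketch. The function names are hypothetical, and the 5136 x 7151 pixel size is inferred from the "Defining page at 5136pt x 7151pt" line together with the logged resolution of 72; it is not stated directly anywhere in the log.

```python
POINTS_PER_INCH = 72.0


def page_size_pt(width_px, height_px, ppi):
    """Page size in PDF points for an image of the given pixel size and PPI."""
    return (width_px * POINTS_PER_INCH / ppi, height_px * POINTS_PER_INCH / ppi)


def resize_target_px(width_pt, height_pt, output_ppi):
    """Pixel size needed to fill a page of the given point size at output_ppi."""
    return (width_pt / POINTS_PER_INCH * output_ppi,
            height_pt / POINTS_PER_INCH * output_ppi)


# Resolution wrongly taken as 72: one pixel becomes one point.
wrong = page_size_pt(5136, 7151, 72)
print(wrong)
# Downsampling target at 150 PPI for that oversized page.
print(resize_target_px(*wrong, 150))
# The same pixels at the 600 PPI that was actually scanned.
print(page_size_pt(5136, 7151, 600))
```

At 72 PPI this reproduces the 5136pt x 7151pt page and the 10700 x 14897.92 resize target from the log. At 600 PPI the same image comes out at roughly 616pt x 858pt, slightly larger than A4 (about 595pt x 842pt).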
The annoying workaround for you would be to set the resolution in the image properties dialog back to whatever it should be.
Another solution (assuming textcleaner understands png and the resolution metadata field) would be for me to add an option to force gscan2pdf to convert images from pnm to png before doing anything with them.
On reflection, a better solution: if the output from a user-defined tool is pnm, then it is probably better to assume that the resolution has not changed (although the image could have been rescaled), and thus to use the resolution from the input image.
This was easy to implement, so I have done so. BTW this doesn't just affect v1.5.1, but every earlier version, too.
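The rule described above might look something like the following Python sketch. This is only an illustration of the idea, not the change that went into gscan2pdf (which is written in Perl); the function name, its parameters and the handling of non-pnm output are assumptions.

```python
# Formats with no resolution field; "pnm" covers pbm, pgm and ppm.
PNM_FORMATS = {"pnm", "pbm", "pgm", "ppm"}


def resolution_after_tool(input_ppi, output_format, metadata_ppi=None, guessed_ppi=72.0):
    """Pick the resolution for the output of a user-defined tool.

    metadata_ppi is whatever the output file itself declares (None if nothing);
    guessed_ppi stands in for the existing paper-size guess (assumption).
    """
    if metadata_ppi is not None:
        # The output format carries its own resolution, so trust it.
        return metadata_ppi
    if output_format.lower() in PNM_FORMATS:
        # pnm cannot store a resolution: assume the tool left it unchanged.
        return input_ppi
    return guessed_ppi


# textcleaner output written as pnm keeps the 600 PPI of the scan.
print(resolution_after_tool(600, "pnm"))
```

The trade-off is the one discussed in this thread: if the tool pads or crops the image, the pixel count changes while the resolution stays the same, so the resulting page size changes slightly.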
Hi Jeff,
First off, thanks very much for looking at this. This will certainly
solve my problem. It has the page size drawback I mention below,
however I can easily work around that by either printing with "scale to
fit" or scaling the image back to the original size.
The rest of this is comments and questions in case you want to look at
it further...
Isn't the output format specified by gscan2pdf? The two options are to
edit the image in place, or for gscan2pdf to specify a separate output
file, right?
I think the drawback with this is that it can make the page slightly
bigger. If the page isn't being scaled down while printing, there's a
possibility that it will be clipped because it is outside the printable
area. (This might not be an issue in practice because of the "scale
to fit" option that is typically available).
I don't know much about png, but assuming it doesn't lose any
information (from the pnm file), and allows the page size & resolution to be set,
this seems like the best solution.
textcleaner does understand png (it's just a wrapper around
imagemagick). If it doesn't allow the metadata to be edited, I can
easily do that separately.
I think what would suit me best is to keep the print size the same
regardless of the number of pixels / resolution, because what I am
typically scanning is something that I plan to print on an A4 page, and
that way I don't have to worry about the drawback mentioned above.
Thanks again,
Alistair
P.S.
I wrote:
Just to be clear: I'm not asking for a bespoke solution here, but a
general one that is likely to meet the most scenarios (including mine
:-)). Your png suggestion seems like the best option to me.
Thanks again,
Alistair
Just to be clear: this should have been fixed in v1.5.2. If not, please say so.
Hi Jeffrey,
Yep, I haven't seen this problem since I upgraded to 1.5.2. I've just
upgraded to 1.6.0, still no problems (only a few pages scanned since
the latest upgrade, but I'm not expecting any problems).
Thanks!
Alistair
On 9 December 2016 at 19:41, Jeffrey Ratcliffe ra28145@users.sf.net wrote:
Related
Bugs:
#227