convert and gs security restrictions break pdfsandwich
In my case, it was this classic issue.
I think the answer lies at https://sourceforge.net/p/pdfsandwich/code/HEAD/tree/trunk/src/pdfsandwich.ml#l255, but I don't know enough to check.
Large file size - any way to use original images?
pdfsandwich: permissions are too restrictive, not respecting umask
It's a design feature of Tesseract: https://github.com/tesseract-ocr/tesseract/issues/1769. Since most PDF viewers transparently display the bitmap data under the GlyphLessFont text, the solution would rather be to fix the bug in Evince.
After some research, I found that the issue is with the current version of Ghostscript (9.27). It occurs on a Mac, with ImageMagick (and Ghostscript) installed via MacPorts. The solution is to downgrade to the prior version of Ghostscript. Here is a list of commands in case anyone can benefit from them:
git clone --single-branch https://github.com/macports/macports-ports.git
cd macports-ports
git checkout af11993aab38d7ba77c2df895aa5ce9b405c5681
cd print/ghostscript
sudo port install
When asked...
Resulting PDF has skewed pages
Abort trap: 6
Strangely enough, we don't see any specific error message from convert, only the message that it doesn't work. Do you see more if you execute the same command "manually", i.e.
convert -units PixelsPerInch -type Bilevel -density 300x300 "mypdf.pdf[1]" "output.pbm"
Does this work or do you see any error?
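To capture whatever convert prints, the manual call above can also be scripted. This is only a minimal sketch, assuming Python 3.7+ and an ImageMagick `convert` on the PATH. The helper names `convert_page_cmd` and `run_and_report` are made up for illustration and are not part of pdfsandwich.

```python
import subprocess


def convert_page_cmd(pdf, page, out, density=300):
    """Build the argv for the convert call quoted above (one page to a bilevel image)."""
    return [
        "convert",
        "-units", "PixelsPerInch",
        "-type", "Bilevel",
        "-density", f"{density}x{density}",
        f"{pdf}[{page}]",  # ImageMagick page selector, e.g. mypdf.pdf[1]
        out,
    ]


def run_and_report(cmd):
    """Run the command and return (exit code, stderr text) so nothing is swallowed.

    Raises FileNotFoundError if the executable is not installed.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    code, err = run_and_report(convert_page_cmd("mypdf.pdf", 1, "output.pbm"))
    print("exit code:", code)
    print(err or "(no message on stderr)")
```

A non-zero exit code together with the stderr text is usually what is needed in a bug report.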
pdfsandwich fails after 'convert' error
Problem re-appeared in Ubuntu 19.04 with ImageMagick 6.9.10-14 :(
[Feature] Allow passing multiple files as arguments
Outputs only blank pages.
Thank you that did the trick, and your program works beautifully!
Hi, this looks like an ImageMagick problem and might be a duplicate of Bug #22: On some systems, ImageMagick does not have sufficient permissions to handle PDF files. You can easily fix this. Look, for instance, here: https://stackoverflow.com/questions/42928765/convertnot-authorized-aaaa-error-constitute-c-readimage-453 Does that help?
Outputs only blank pages.
unpaper option --output-pages 2
Great to hear, although that does not solve the problem with 18.04 yet. It seems there is no other way than having users change the policy file manually. And even this only works if the user has admin rights.
The version distributed with Ubuntu 18.10 does not seem to forbid PDF handling by default. Here is a link to that version if you want to test it on your installation.
I don't have an idea myself but asked in the official forum.
Disable tesseract parallelization
Thanks for bringing this up, this is pretty awful. I don't see a way to override this policy with -limit or with an environment variable. Do you have any ideas? The only way seems to be to edit the policy.xml file manually, which would be a nuisance for occasional users.
Maybe you can override it with an environment variable or the -limit command line option. See ImageMagick architecture and search for policy.xml. I noticed that in Ubuntu 18.04 the problem is a coder policy, while in the linked commit it is a module policy. Anyway, I hope you find a way to override this policy!
Maybe you can override it with an environment variable or the -limit command line option. See ImageMagick architecture and search for policy.xml. Note that the commit linked in the original bug description is quite new and thus documentation does not include the module policy yet.
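To check whether a given policy.xml is what blocks PDF handling, something like the following might help. It is a rough sketch, not part of pdfsandwich: the function names are hypothetical, it looks at both coder and module entries, and it treats only rights="none" as a block. Pattern matching is simplified to exact names and brace lists such as {PS,PDF}, whereas real policy patterns can be globs. On Ubuntu the file is commonly /etc/ImageMagick-6/policy.xml, but that path is an assumption about your system.

```python
import xml.etree.ElementTree as ET


def pdf_policies(policy_xml, name="PDF"):
    """Return (domain, pattern, rights) for coder/module policies that name `name`.

    Simplification: only exact names and brace lists like {PS,PDF} are matched.
    """
    root = ET.fromstring(policy_xml)
    hits = []
    for policy in root.iter("policy"):
        domain = policy.get("domain", "")
        if domain not in ("coder", "module"):
            continue
        pattern = policy.get("pattern", "")
        names = [n.strip().upper() for n in pattern.strip("{}").split(",")]
        if name.upper() in names:
            hits.append((domain, pattern, policy.get("rights", "")))
    return hits


def is_blocked(policy_xml, name="PDF"):
    """True if any matching policy grants no rights at all."""
    return any(rights.lower() == "none" for _, _, rights in pdf_policies(policy_xml, name))


if __name__ == "__main__":
    sample = (
        "<policymap>"
        '<policy domain="coder" rights="none" pattern="PDF" />'
        "</policymap>"
    )
    print(pdf_policies(sample))
    print("PDF blocked:", is_blocked(sample))
```

If this reports a block, the manual fix discussed in this thread is to change the rights of that entry in policy.xml, which needs admin rights.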
I just noticed this is fixed in the latest version. This issue can be closed.
Disable tesseract parallelization
ImageMagick policy forbids handling PDF files
Seems that one hunk escaped... See updated, attached patch:
website updated
Please create temp dir under subdirectory
From version 0.1.7 on, a global temporary directory for all pdfsandwich temp files is used.
Unable to convert .tif (multipages .tif file) to pdf using pdfsandwich on Ubuntu
files put into /tmp that are world readable.
This has been indirectly fixed by generating a user readable temp directory which now contains all the temp files.
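For illustration only (pdfsandwich is written in OCaml and this is not its code): the idea of one private directory holding all temp files can be sketched in Python, where `tempfile.mkdtemp` creates a directory accessible only by the creating user on POSIX systems. The names `private_workdir` and `group_other_bits` are made up for this example.

```python
import contextlib
import os
import shutil
import stat
import tempfile


@contextlib.contextmanager
def private_workdir(prefix="pdfsandwich"):
    """Create one user-only directory for all temp files and remove it on exit."""
    path = tempfile.mkdtemp(prefix=prefix)  # created with mode 0700 on POSIX
    try:
        yield path
    finally:
        # Clean up everything inside, even if processing failed midway.
        shutil.rmtree(path, ignore_errors=True)


def group_other_bits(path):
    """Permission bits granted to group and others (0 means private)."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o077


if __name__ == "__main__":
    with private_workdir() as workdir:
        page = os.path.join(workdir, "page0001.pbm")  # hypothetical temp file name
        open(page, "wb").close()
        print(workdir, oct(group_other_bits(workdir)))
```

Because every intermediate file lives below that directory, other users cannot read the files regardless of each file's own mode, and a single removal at exit cleans up all of them.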
Spelling errors (patch)
Fixed in version 0.1.7.
Currently, pdfsandwich only accepts pdf files as input files. Could you convert your tiff to pdf (there should be command line utilities for this) and try it with the resulting pdf as input to pdfsandwich?
website update
global temp dir completed
bugfix to prevent hang ups of tesseract
Unable to convert .tif (multipages .tif file) to pdf using pdfsandwich on Ubuntu
I have installed pdfsandwich on Ubuntu and I'm trying to run the command below to convert a multipage .tif file to a .pdf file, but it fails with the error message below. Can you please help me with this?
$ /usr/bin/pdfsandwich -verbose -lang spa+eng+fra Sample_3_Multi_page.tif -o Sample_3_Multi_page.pdf
pdfsandwich version 0.1.4
Checking for convert:
convert -version
Version: ImageMagick 6.8.9-9 Q16 x86_64 2018-07-10 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2014 ImageMagick Studio LLC
Features:...
I've come up with the attached patch. Note that I'm not very familiar with OCaml, so you might want to review it thoroughly. Thanks!
Spelling errors (patch)
files put into /tmp that are world readable.
It might also be a good idea to remove temporary files as soon as they are no longer needed (e.g. when a page is processed successfully). Further, I have a lot of magick-xxx files left behind even after a successful exit.
trunk/src/pdfsandwich.ml: ouput changed to output
convert options: parameter order fixed
Please create temp dir under subdirectory
crash in pdfunite with 10k pages pdf: too many open files
pdfsandwich increases PDF size by 7 times!
bugfix: convert command line order
Superfluous convert command and missing resolution units
Good points. The seemingly superfluous 2nd call of convert is indeed due to unpaper...
version 0.1.6 finalized
new option: grayscale output; bug fix
Because of unpaper in the middle, though it's not clear to me why unpaper can't handle...
Superfluous convert command and missing resolution units
small bug fixes
Lithuanian symbols are not recognized
Yes, the Lithuanian chars are now recognized. Thank you.
Should cope with 'unpaper' without '-version' support.
Unpaper version is now (pdfsandwich 0.1.5) checked with -V, which should work with...
Should cope with 'unpaper' without '-version' support.
Ghostscript is replaced in version 0.1.5 now. Does that solve the problem?
Could not determine number of pages
spaces between almost every character, somewhat scrambled select order
Use less ghostscript
Fixed in version 0.1.5. Ghostscript is now only used for resizing pdf pages (which...
deb packaging bugs fixed
little bug fixes
Confirmed. "unpaper -V" works on Fedora 24.
I would advise against using pdftk. It depends on iText which changed its license....
Oh, thanks for pointing this out. It seems unpaper -V (capital V) works with all...
It works nice for me (trunk version 62), Tobias. Thanks. I needed to make one change...
Great, Michal, the tif conversion really seems to do the trick. We don't need mogrify...
website updated
gs only required for resizing now
removing gs further
manual modified
gs replaced by pdfunite