Re: [gscan2pdf-help] ocropus integration?
From: Jeffrey R. <jef...@gm...> - 2010-03-11 21:11:45
On Wed, Feb 03, 2010 at 02:51:56PM +0100, Bernhard Reiter wrote:
> I'm away from my scanner right now, but i had an old (newspaper) scan
> sample png which i imported and ran unpaper and ocropus on.
> Interestingly, the error did not occur in this case. Pdf export,
> however, did not work (stuck at half of the progress bar); and text
> export, again, produced an empty file.
[...]
> ocroscript recognize --tesslanguage=eng /tmp/BBR6OrlpE6/edzE5xhGfF.pnm > /tmp/BBR6OrlpE6/HDQHZ61OhG.txt
> Forked PID 3697
> ocroscript: /usr/share/ocropus/scripts/recognize.lua:113: CHECK ./ocr-utils/ocr-utils.cc:833 background_seems_white(a)
> Process 3697 exited.

Apologies for the late response.

I wonder if the output from ocropus is somehow confusing gscan2pdf, which
is therefore not writing the PDF correctly.

Please reproduce the problem with a test image and send it to me so that
I can check the output from ocropus. If you can't do that, please at
least post the output from the ocropus command above, adjusted, of
course, for filenames.

I had to refactor the hocr parser to cope with cuneiform, so it is
possible that the problem will go away with the new release.

Regards

Jeff