Which version of ocropus do you have?
What does
echo $OCROSCRIPTS
which ocroscript
ls /usr/share/ocropus/scripts /usr/local/share/ocropus/scripts
give?
ocropus version is 0.3.1-2
office@office:~$ echo $OCROSCRIPTS
office@office:~$ which ocroscript
/usr/bin/ocroscript
office@office:~$ ls /usr/share/ocropus/scripts /usr/local/share/ocropus/scripts
ls: cannot access /usr/local/share/ocropus/scripts: No such file or directory
/usr/share/ocropus/scripts:
align-lines-wordwise.lua rec-bpnet.lua
align.lua rec-guided.lua
align-transcription.lua rec-line.lua
build-ngram-model.lua rec-ltess.lua
check-train-valid-bpnet-feature.lua rec-minimal.lua
degrade.lua recognize.lua
deskew.lua rec-tess-complete.lua
editdist.lua reflow.lua
erode3.lua sauvola.lua
eval-bpnet-on-words.lua segment-line.lua
eval-editdist-layout.lua show.lua
eval-on-word-list.lua showseg.lua
hocr-to-text.lua strict.lua
lib text-to-hocr.lua
line-clean.lua train-bpnet-isolated.lua
matra-clipping.lua train-bpnet-lines.lua
rec-bpnet-isolated.lua
What about:
[[ -n ${OCROSCRIPTS} ]] && echo "variable is set and not blank"
?
[[ -n ${OCROSCRIPTS} ]] && echo "variable is set and not blank"
returns nothing...
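An empty result from that test doesn't say whether OCROSCRIPTS is unset or set to an empty string. Here is a minimal sketch that tells the two cases apart. The function name check_ocroscripts is made up for illustration and is not part of ocropus or gscan2pdf.

```shell
# Report whether OCROSCRIPTS is unset, empty, or set to a value.
# check_ocroscripts is a hypothetical helper name used only for this example.
check_ocroscripts() {
    # ${VAR+x} expands to "x" only when VAR is set, even if it is empty.
    if [ -z "${OCROSCRIPTS+x}" ]; then
        echo "OCROSCRIPTS is unset"
    elif [ -z "$OCROSCRIPTS" ]; then
        echo "OCROSCRIPTS is set but empty"
    else
        echo "OCROSCRIPTS=$OCROSCRIPTS"
    fi
}

check_ocroscripts
```

In the session above, both the blank echo and the silent test are consistent with either of the first two cases.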
I had the same problem...
Try changing line 20 in this file:
/gscan2pdf-1.0.1/lib/Gscan2pdf/Ocropus.pm
from:
local $ENV{OCROSCRIPTS} = "$_/share/ocropus/scripts"
to:
$ENV{OCROSCRIPTS} = "$_/share/ocropus/scripts"
That got it working for me.
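For anyone wondering why dropping "local" could matter: in Perl, "local" restores the previous value when the enclosing block ends, so a child process started after that point would not see the variable. Whether that is really what happened inside Ocropus.pm is my assumption, not something confirmed here. The same scoping effect can be sketched in shell, where child_sees is a made-up stand-in for ocroscript:

```shell
# Hypothetical stand-in for ocroscript: a real child process that prints
# the value of OCROSCRIPTS it inherited, or "<unset>" if it got none.
child_sees() {
    sh -c 'echo "${OCROSCRIPTS:-<unset>}"'
}

unset OCROSCRIPTS

# Scoped assignment: the export lives only inside the parentheses (a subshell)
# and is gone again once they close, much like a Perl "local" at block exit.
( export OCROSCRIPTS=/usr/share/ocropus/scripts )
scoped=$(child_sees)

# Persistent assignment: the export stays in effect for later child processes.
export OCROSCRIPTS=/usr/share/ocropus/scripts
persistent=$(child_sees)

echo "after scoped assignment:     $scoped"
echo "after persistent assignment: $persistent"
```

The first child sees no value and the second sees the scripts path, which would match the behaviour before and after the change.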
Patch
Thanks for the suggestion. Please test the attached patch, which should fix the problem without changing the environment.
Applied your patch and then copied the new file over to its place in /root. It seems to work okay for me now.
However, Ocropus won't work with a simple binary GIF file. A binary G4 TIFF (US Patent Office image) and a grayscale PNG work okay. The black-and-white GIF file creates this in the logfile:
INFO - Found tesseract language eng
INFO - Using ocroscript with recognize.
INFO - tesslanguage=eng ocroscript recognize /tmp/Mx7NMOWh67.png
ocroscript: /usr/share/ocropus/scripts//lib/hocr.lua:28: rectangle parsing error
INFO - tesslanguage=eng ocroscript recognize /tmp/WpXZBl1pYA.png
ocroscript: /usr/share/ocropus/scripts//lib/hocr.lua:28: rectangle parsing error
INFO - tesslanguage=eng ocroscript recognize /tmp/w46jhcRY9Z.png
ocroscript: /usr/share/ocropus/scripts//lib/hocr.lua:28: rectangle parsing error
I switched to GOCR right after that and it worked okay; well, it did the best it could with it...
Not sure, but I think that problem may be related to something else, which I'll file in a different report.
Thanks for all your hard work and the patch!
OK, rebuilt from scratch with the patch and it was fixed as expected. Thanks for the great job on this.
Ignore the problem I suggested in my earlier post:
> However Ocropus wont work with a simple binary gif file...
The problem is in Ocropus; it doesn't like something about the contents of the GIF file I used for testing this.
But, you already knew that...
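If anyone wants to check a suspect image against Ocropus directly, without gscan2pdf in the way, something like this rough sketch should do. The ocroscript command is copied from the logfile above. The helper name try_recognize and the file names in the usage comment are placeholders I made up.

```shell
# Run the command gscan2pdf logged, once per image, so that any ocroscript
# error on stderr appears right under the name of the image that caused it.
# try_recognize is a hypothetical helper name used only for this example.
try_recognize() {
    for img in "$@"; do
        echo "== $img"
        # Discard the recognition output on stdout; keep stderr visible.
        tesslanguage=eng ocroscript recognize "$img" >/dev/null
    done
}

# Usage (placeholder file names):
#   try_recognize /tmp/from-gif.png /tmp/from-tiff.png
```

If the rectangle parsing error shows up here too, the problem is on the Ocropus side rather than in gscan2pdf.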