
#156 Can't do OCR of loaded image with cuneiform

v1.0_(example)
closed-works-for-me
nobody
None
5
2013-03-06
2013-02-13
No

In 1.1.2, I loaded a PDF and then tried to do Tools | OCR on it with cuneiform. It failed with this error in the window from which I launched gscan2pdf:

Cuneiform for Linux 1.1.0
PUMA_XFinalrecognition failed.

Both gocr and tesseract worked where cuneiform failed.

Cuneiform did work when I scanned directly from the scanner instead of loading a PDF.

It may be relevant that the images in the PDF were monochrome, i.e., black and white rather than color or greyscale.

Discussion

  • Jeffrey Ratcliffe

    Are you sure you are using 1.1.2? I fixed this in 1.1.1.

    If so, can you post an example image that cuneiform doesn't process, please?

     
    • Jonathan Kamens

      Jonathan Kamens - 2013-02-14

      > Are you sure you are using 1.1.2? I fixed this in 1.1.1.

      Yes, I am using 1.1.2.

      > If so, can you post an example image that cuneiform doesn't process, please?

      The document I have that exhibits this problem has some text in it that I do not want to make public, and I don't have time to produce a redacted version, so if you send me your email address (jik@kamens.us) I will email it to you privately, assuming you will keep it private and delete it when you are done with it.

       

      Last edit: Jonathan Kamens 2013-02-14
  • Jeffrey Ratcliffe

    gscan2pdf/cuneiform processed your sample PDF perfectly - i.e. I can't reproduce your problem.

    Which distro and architecture are you using?

     
  • Jonathan Kamens

    Jonathan Kamens - 2013-02-20

    Fedora 18, x86_64, cuneiform version 1.1.0.

     
  • Jeffrey Ratcliffe

    I wonder whether this is down to different versions of ImageMagick converting the image in different ways - or alternatively the Fedora version of cuneiform not linking to ImageMagick properly.

    Which version of ImageMagick do you have?

    What do you get if you import the PDF you sent me into gscan2pdf, save one of the pages as a PNG, and then

    identify <image.png>

    from the command line?

     
    • Jonathan Kamens

      Jonathan Kamens - 2013-02-23

      $ rpm -q ImageMagick
      ImageMagick-6.7.7.5-3.fc18.x86_64
      $ identify ~/Desktop/foo.png
      /home/jik/Desktop/foo.png PNG 2544x3299 2544x3299+0+0 8-bit PseudoClass 2c 64.2KB 0.000u 0:00.000
      $
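The `identify` line above reports a depth of 8 bits even though the image contains only two colours (`2c`). As a small sketch, the shell function below pulls those two values out of such a line. The function name `describe_identify_line` is made up for illustration, and the parsing assumes the ImageMagick 6 output layout shown in this comment; other versions may order or name the fields differently.

```shell
# Hypothetical helper: summarise one line of `identify` output.
# Assumes the ImageMagick 6 layout seen in this thread:
#   file format WxH page-geometry depth class colours size ...
describe_identify_line() {
    # Find the depth token ("8-bit") and the colour-count token ("2c")
    # wherever they appear after the file name.
    printf '%s\n' "$1" | awk '{
        depth = ""; colors = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+-bit$/) { depth = $i; sub(/-bit$/, "", depth) }
            else if ($i ~ /^[0-9]+c$/) { colors = $i; sub(/c$/, "", colors) }
        }
        print "depth=" depth " colors=" colors
    }'
}

# The line reported in this comment.
line='/home/jik/Desktop/foo.png PNG 2544x3299 2544x3299+0+0 8-bit PseudoClass 2c 64.2KB 0.000u 0:00.000'
describe_identify_line "$line"   # prints: depth=8 colors=2
```

A result of `colors=2` together with `depth=8` suggests a black-and-white page stored at 8-bit depth, which matches the behaviour discussed in the rest of this ticket.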

       
  • Jeffrey Ratcliffe

    The image imports from PDF as 8-bit - i.e. greyscale or colour.

    What happens if you use Tools/Threshold after importing? Can you then get cuneiform to process the image?

     
  • Jonathan Kamens

    Jonathan Kamens - 2013-02-28

    OCR with Cuneiform is successful after I do Tools/Threshold (but BTW there's another bug there -- after I do Tools/Threshold and click Apply, the Tools/Threshold window stays up; shouldn't it go away after the work is done?).
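For anyone wanting to try the same workaround outside the GUI, the following is a rough sketch that thresholds a saved page with ImageMagick and then passes it to cuneiform. It only approximates what Tools/Threshold does and has not been checked against the affected Fedora build. The function name `ocr_with_threshold`, the 50% cut-off, the `eng` language, and the default output name `out.txt` are all assumptions chosen for illustration.

```shell
# Hypothetical wrapper: threshold an image to black and white, then OCR it.
# Usage: ocr_with_threshold page.png [result.txt]
ocr_with_threshold() {
    in=$1
    out=${2:-out.txt}   # assumed default output name

    # Fail early with distinct codes so callers can tell what went wrong.
    [ -f "$in" ] || { echo "no such file: $in" >&2; return 2; }
    command -v convert >/dev/null 2>&1 || { echo "ImageMagick convert not found" >&2; return 3; }
    command -v cuneiform >/dev/null 2>&1 || { echo "cuneiform not found" >&2; return 3; }

    tmpdir=$(mktemp -d) || return 1
    tmp="$tmpdir/page.bmp"

    # Reduce the page to pure black and white (the 50% cut-off is an assumption),
    # then run cuneiform on the bilevel copy.
    convert "$in" -threshold 50% -type bilevel "$tmp" &&
        cuneiform -l eng -f text -o "$out" "$tmp"
    status=$?

    rm -rf "$tmpdir"
    return $status
}
```

Running `ocr_with_threshold ~/Desktop/foo.png` would write the recognised text to `out.txt` if both tools are installed; a missing input file returns 2 and a missing tool returns 3.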

     
  • Jeffrey Ratcliffe

    As far as Cuneiform is concerned, do you consider the bug closed?

    For the Threshold dialog, I see your point. The OCR dialog is hidden when you start the process. However the Scan dialog is not...

     
  • Jonathan Kamens

    Jonathan Kamens - 2013-03-04

    I don't think the cuneiform bug is closed because I shouldn't have to run Tools / Threshold to get cuneiform to work, and in any case there's no way for anyone who hasn't read this bug to know that will fix the issue.

     
  • Jeffrey Ratcliffe

    But it's not gscan2pdf's fault that your cuneiform build can't deal with 8-bit depth images. It works fine here.

    I'm using cuneiform 1.1.0, too, so I wonder whether Fedora 18 is building it against imagemagick properly.

     
  • Jonathan Kamens

    Jonathan Kamens - 2013-03-05

    Fair enough. If you tell me what distribution and architecture you're using on which it works, I'll file a bug for it with the Fedora folks and then you can consider the issue closed in gscan2pdf.

     
  • Jeffrey Ratcliffe

    gentoo amd64

     
  • Jeffrey Ratcliffe

    • status: open --> closed-works-for-me
     
