
Multithreading with Tess4j

2014-02-02
2014-02-07
  • Tuan Nguyen

    Tuan Nguyen - 2014-02-02

    Hi Quan,

    I tried the sample code below. It works with one thread, but when I
    create three threads that parse three different images at the same time,
    I get an error. I had a look at the Tesseract.java class and it is a
    singleton. How can I use Tess4J from multiple threads?

    Thank you very much for reading the question. Hey, I'm also Vietnamese, I
    can talk in both English and Vietnamese :)

    Error:

    Core dump and,

    Error in pixGetWidth: pix not defined

    If you would like to submit a bug report, please visit:

    Error in boxCreate: w and h not both >= 0

    http://bugreport.sun.com/bugreport/crash.jsp

    Error in pixClipRectangle: pixs not defined

    The code:

    import java.io.File;

    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    public class TesseractExample {

        public static void main(String[] args) {
            File imageFile = new File("eurotext.tif");
            // JNA Interface Mapping: getInstance() returns one shared singleton
            Tesseract instance = Tesseract.getInstance();
            // Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping

            try {
                String result = instance.doOCR(imageFile);
                System.out.println(result);
            } catch (TesseractException e) {
                System.err.println(e.getMessage());
            }
        }
    }

     
  • Quan Nguyen

    Quan Nguyen - 2014-02-02

    Hi Tuan,

    Can you try Tesseract1 instead?

    Thanks.

    Quan
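
A minimal sketch of the one-engine-per-thread pattern that this suggestion points to, since `Tesseract1` is created with `new` instead of being shared through a singleton. The `OcrEngine` interface and the `PerThreadOcr` class are hypothetical names introduced here so the sketch compiles without Tess4J on the classpath. It assumes that separate `Tesseract1` instances can safely run in parallel, which the thread reports as working but which is worth confirming for your Tess4J version.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

class PerThreadOcr {

    /** Hypothetical stand-in for an OCR engine, e.g. a wrapper around Tesseract1.doOCR(File). */
    interface OcrEngine {
        String recognize(File image) throws Exception;
    }

    /**
     * Runs OCR on every image using a fixed thread pool. Each worker thread
     * lazily creates its own engine, so no engine is shared between threads.
     * Results are returned in the same order as the input list.
     */
    static List<String> recognizeAll(List<File> images,
                                     Supplier<OcrEngine> engineFactory,
                                     int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ThreadLocal<OcrEngine> perThread = ThreadLocal.withInitial(engineFactory);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (File image : images) {
                Callable<String> task = () -> perThread.get().recognize(image);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // rethrows OCR failures as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // With Tess4J on the classpath, the factory could look like this (assumption):
    //   () -> { Tesseract1 engine = new Tesseract1(); return engine::doOCR; }
}
```

The factory is called at most once per worker thread, so a pool of three threads creates at most three engines regardless of how many images are queued.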

     
  • Tuan Nguyen

    Tuan Nguyen - 2014-02-02

    Perfect, it works :). Thank you very much Quan!!!

    Hey Quan, one last question. Do you have any suggestions for getting the best OCR results? I'm thinking of image preprocessing and how to use Tess4J to get the most out of it, or anything else you think I should look into.

    Thank you,

     
  • Quan Nguyen

    Quan Nguyen - 2014-02-07

    Tesseract and Tess4J do little image preprocessing themselves; Leptonica has many routines that you can use to improve your images. Common image enhancements include deskew, dewarp, denoise, despeckle, and threshold/binarize. The Tesseract Forum and StackOverflow have many discussion topics on image processing techniques, so you may want to search those sites.
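
As an illustration of the threshold/binarize step mentioned above, here is a minimal sketch that uses only the JDK's `BufferedImage`, not Leptonica. The class name `Binarize` and the fixed global threshold are assumptions made for the example; adaptive methods such as Otsu's usually cope better with uneven lighting.

```java
import java.awt.image.BufferedImage;

class Binarize {

    /**
     * Converts an image to pure black and white with a fixed global threshold.
     * Pixels whose luma is at or above the threshold become white, the rest black.
     */
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of the ITU-R BT.601 luma weights
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

The resulting image could then be handed to the OCR engine in place of the original, for example through a `doOCR` overload that accepts a `BufferedImage` if your Tess4J version provides one.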

     
