#4 Add the ability to return confidence level

None
closed
None
5
2013-09-22
2013-06-06
No

Would be nice if Tess4J can also return:

Discussion

  • Dmitry Katsubo

    Dmitry Katsubo - 2013-06-07

    Indeed they are available in TessAPI, but the handle is deleted in doOCR(). What should the flow be, then? I expected something like:

    Tesseract tess = Tesseract.getInstance();
    String result = tess.doOCR(...);
    int confidence = tess.getMeanTextConf(); // ?
    
     
  • Quan Nguyen

    Quan Nguyen - 2013-06-07

    The more general Tesseract class is not final, so you can certainly extend it to expose more functionality provided by the lower level TessAPI interface.

     
  • Dmitry Katsubo

    Dmitry Katsubo - 2013-06-10

    Extending Tesseract does not help much, as the whole method Tesseract.doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) would still have to be copy-pasted. It would be nice if, for example, the initialization block were extracted into a separate function:

    protected TessAPI.TessBaseAPI prepareTessAPI(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) {
        TessAPI api = TessAPI.INSTANCE;
        TessAPI.TessBaseAPI handle = api.TessBaseAPICreate();
        ...
        api.TessBaseAPISetRectangle(handle, rect.x, rect.y, rect.width, rect.height);
        return handle;
    }
    

    plus maybe another helper:

    protected String getOCRText(TessAPI.TessBaseAPI handle) {
        TessAPI api = TessAPI.INSTANCE;
        Pointer utf8Text = hocr ? api.TessBaseAPIGetHOCRText(handle, pageNum - 1) : api.TessBaseAPIGetUTF8Text(handle);
        String text = utf8Text.getString(0);
        api.TessDeleteText(utf8Text);
        return text;
    }
    

    Basically, the above-mentioned doOCR() is now decomposed:

    public String doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) throws TesseractException {
        TessAPI.TessBaseAPI handle = prepareTessAPI(xsize, ysize, buf, rect, bpp);
        String text = getOCRText(handle);
        TessAPI.INSTANCE.TessBaseAPIDelete(handle);
        return text;
    }
    

    Now extending Tesseract makes sense. If you have another scenario in mind, please share a complete example.
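
    To illustrate why the decomposition helps, here is a minimal, self-contained sketch of a subclass that reads a confidence value before the handle is released. Everything in it is a stand-in: OcrEngine, BaseOcr, ConfidenceOcr, OcrResult and FakeEngine are hypothetical names, not Tess4J or TessAPI types, and the handle is modelled as a plain long.

```java
// Hypothetical stand-in for the native engine; NOT the real TessAPI interface.
interface OcrEngine {
    long create();                  // returns an opaque handle
    String getText(long handle);    // recognized text
    int meanTextConf(long handle);  // mean confidence, assumed to be 0..100
    void delete(long handle);       // releases the handle
}

// Hypothetical base class mirroring the proposed decomposition.
class BaseOcr {
    protected final OcrEngine api;

    BaseOcr(OcrEngine api) { this.api = api; }

    protected long prepare() { return api.create(); }

    protected String getOCRText(long handle) { return api.getText(handle); }

    public String doOCR() {
        long handle = prepare();
        try {
            return getOCRText(handle);
        } finally {
            api.delete(handle); // always release the handle
        }
    }
}

// Simple value holder for text plus confidence.
final class OcrResult {
    final String text;
    final int confidence;

    OcrResult(String text, int confidence) {
        this.text = text;
        this.confidence = confidence;
    }
}

// The subclass reuses the helpers and queries confidence while the handle is alive.
class ConfidenceOcr extends BaseOcr {
    ConfidenceOcr(OcrEngine api) { super(api); }

    public OcrResult doOCRWithConfidence() {
        long handle = prepare();
        try {
            String text = getOCRText(handle);
            int confidence = api.meanTextConf(handle); // must happen before delete()
            return new OcrResult(text, confidence);
        } finally {
            api.delete(handle);
        }
    }
}

// Fake engine used only to exercise the sketch without native libraries.
class FakeEngine implements OcrEngine {
    int liveHandles = 0;

    public long create() { liveHandles++; return 1L; }
    public String getText(long handle) { return "hello"; }
    public int meanTextConf(long handle) { return 87; }
    public void delete(long handle) { liveHandles--; }
}
```

    With the helpers protected, the subclass does not need to copy the initialization code; it only adds the one extra query between text extraction and cleanup.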

    Another note: I think that the expression

    if (rect != null && !rect.equals(EMPTY_RECTANGLE)) {
    

    would be better written as:

    if (rect != null && !rect.isEmpty()) {
    

    and then one doesn't need the static EMPTY_RECTANGLE.
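
    A small sketch comparing the two checks, using only java.awt.Rectangle. The class and method names (RectangleCheck, hasRegionByEquals, hasRegionByIsEmpty) are made up for illustration. Note that the two conditions are not strictly equivalent: isEmpty() is true for any rectangle with a non-positive width or height, whereas equals() matches only the exact (0, 0, 0, 0) rectangle.

```java
import java.awt.Rectangle;

class RectangleCheck {
    // Constant that the equals-based check depends on.
    static final Rectangle EMPTY_RECTANGLE = new Rectangle();

    // Existing style: only the exact (0, 0, 0, 0) rectangle means "no region".
    static boolean hasRegionByEquals(Rectangle rect) {
        return rect != null && !rect.equals(EMPTY_RECTANGLE);
    }

    // Suggested style: any rectangle with width <= 0 or height <= 0 means "no region".
    static boolean hasRegionByIsEmpty(Rectangle rect) {
        return rect != null && !rect.isEmpty();
    }
}
```

    The checks agree for null, for new Rectangle(), and for ordinary rectangles. They differ for a degenerate rectangle such as new Rectangle(10, 10, 0, 0), which the equals-based check would treat as a region and the isEmpty-based check would not.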

     
  • Quan Nguyen

    Quan Nguyen - 2013-06-15

    doOCR is a simple method that encapsulates Tesseract engine initialization, processing of a single image, and shutdown. It is not efficient if you process multiple images. Sure, you can override it with a more efficient algorithm in which the engine is initialized once, processes all the images, and finally shuts down to release the resources it used.
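
    The init-once, process-many, shut-down-once pattern described above could look roughly like the sketch below. BatchEngine, BatchOcr and CountingEngine are hypothetical names used only for illustration; images are represented by plain strings so that the example runs without native libraries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical engine abstraction; not part of Tess4J.
interface BatchEngine {
    void init();                         // expensive start-up
    String recognize(String imageName);  // OCR for one image
    void dispose();                      // releases engine resources
}

class BatchOcr {
    // Initializes the engine once, processes every image, then disposes once.
    static List<String> recognizeAll(BatchEngine engine, List<String> imageNames) {
        List<String> results = new ArrayList<>();
        engine.init();
        try {
            for (String imageName : imageNames) {
                results.add(engine.recognize(imageName));
            }
        } finally {
            engine.dispose(); // runs even if one image fails
        }
        return results;
    }
}

// Fake engine that counts lifecycle calls, to show they happen once per batch.
class CountingEngine implements BatchEngine {
    int inits = 0;
    int disposes = 0;

    public void init() { inits++; }
    public String recognize(String imageName) { return "text of " + imageName; }
    public void dispose() { disposes++; }
}
```

    The try/finally block is a design choice of this sketch: it keeps the shutdown step from being skipped when recognition of one image throws.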

    Due to my personal work, it could be some time before I can get back on this. You're welcome to submit a patch. Thanks.

     
  • Dmitry Katsubo

    Dmitry Katsubo - 2013-06-20

    I am attaching my first attempt for your review. From my perspective it is a step for the better because:

    • Batch image processing is now faster, as init() / dispose() are called only once.
    • A class that extends Tesseract1 can easily implement another OCR strategy, as all the needed functions are now separate blocks.

    Notes:

    • I think that all occurrences of IIOImage can be replaced by RenderedImage with no impact, as IIOImage is used only as a wrapper for RenderedImage. ImageIO is also forced to read thumbnails, which are not used.
    • Using System.err in a library is bad form: when the library runs inside an application server, you don't know where the output is logged (if it is logged at all). A logger is the way out. Or rethrow the exception.
    • Having two approaches, Tesseract1 and Tesseract, makes no sense to me. Keeping only one simplifies development and reduces code duplication. Extension (Tesseract1) or aggregation (Tesseract): you need to choose one. I personally think that extension (Tesseract1) is more natural with respect to the handle.
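
    As an illustration of the System.err note, the sketch below shows the two alternatives mentioned: log through a logger, or wrap the exception and pass it on to the caller. It uses java.util.logging only because it ships with the JDK; the thread does not say which logging framework is meant. PageNumberParser and OcrInputException are hypothetical names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical checked exception for passing a failure on to the caller.
class OcrInputException extends Exception {
    OcrInputException(String message, Throwable cause) {
        super(message, cause);
    }
}

class PageNumberParser {
    private static final Logger LOGGER =
            Logger.getLogger(PageNumberParser.class.getName());

    // Option 1: log through the logger (not System.err) and fall back to a default.
    static int parseOrDefault(String raw, int fallback) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            LOGGER.log(Level.WARNING, "Invalid page number: " + raw, e);
            return fallback;
        }
    }

    // Option 2: do not log here; wrap the cause and let the caller decide.
    static int parseOrThrow(String raw) throws OcrInputException {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new OcrInputException("Invalid page number: " + raw, e);
        }
    }
}
```

    With a logger, the hosting application controls where the message ends up through its own logging configuration, which is not possible with direct writes to System.err.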
     
  • Quan Nguyen

    Quan Nguyen - 2013-09-07

    Dmitry,

    It turns out that the Tesseract class cannot be extended because of its private constructor. Tesseract1 is the extensible one here, as the necessary elements are exposed to inheriting classes.
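
    The effect of a private constructor on subclassing can be shown with a small, generic Java sketch. SingletonOcr, ExtensibleOcr and UpperCaseOcr are hypothetical classes that only mimic the two designs; they are not the actual Tess4J sources.

```java
// Mimics a singleton-style class: the only constructor is private.
class SingletonOcr {
    private static final SingletonOcr INSTANCE = new SingletonOcr();

    private SingletonOcr() { }

    static SingletonOcr getInstance() { return INSTANCE; }

    String doOCR() { return "text"; }
}

// A separate top-level subclass cannot call the private constructor:
// class BrokenSub extends SingletonOcr { }  // does not compile

// Mimics an extensible class: accessible constructor and a protected hook.
class ExtensibleOcr {
    protected String recognize() { return "text"; }

    public String doOCR() { return recognize(); }
}

// A subclass can replace the protected step without copying doOCR().
class UpperCaseOcr extends ExtensibleOcr {
    @Override
    protected String recognize() {
        return super.recognize().toUpperCase();
    }
}
```

    A class does not need the final modifier to be closed for extension from other files; a private constructor is enough, because every subclass constructor must invoke a superclass constructor.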

    I incorporated many of your suggestions, including logging, into the code baseline. Tesseract is maintained because the alternative direct mapping method that Tesseract1 is based on was until recently still an experimental feature for JNA.

    Please help test the changes. Version 1.2 will be released soon. Thanks.

    Quan

     
  • Anonymous - 2013-09-09

    Could you upload .jar and .source.jar somewhere (e.g. Maven snapshots)? I will test against binaries that you will create when you make a release.

     
  • Quan Nguyen

    Quan Nguyen - 2013-09-09

    1.2-Beta attached.

     
  • Quan Nguyen

    Quan Nguyen - 2013-09-22

    Fixed with release of v1.2. Special thanks to Dmitry Katsubo for the software patch, testing, and valuable suggestions.

     
  • Quan Nguyen

    Quan Nguyen - 2013-09-22
    • status: open --> closed
     

