Tess4J / Bugs / #4 Add the ability to return confidence level

Quan Nguyen - 2013-06-07

They are supported.

http://tess4j.sourceforge.net/docs/docs-1.1/

Dmitry Katsubo - 2013-06-07

Indeed, they are available in TessAPI, but the handle is deleted in doOCR(). What should the flow be, then? I expected something like:

Tesseract tess = Tesseract.getInstance();
String result = tess.doOCR(...);
int confidence = tess.getMeanTextConf(); // ?

Quan Nguyen - 2013-06-07

The more general Tesseract class is not final, so you can certainly extend it to expose more of the functionality provided by the lower-level TessAPI interface.

Extending Tesseract does not help much, as the whole method Tesseract.doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) still has to be copy-pasted. It would be nice if, for example, the initialization block were extracted into a separate function:

protected TessAPI.TessBaseAPI prepareTessAPI(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) {
    TessAPI api = TessAPI.INSTANCE;
    TessAPI.TessBaseAPI handle = api.TessBaseAPICreate();
    ...
    api.TessBaseAPISetRectangle(handle, rect.x, rect.y, rect.width, rect.height);
    return handle;
}

plus maybe another helper:

protected String getOCRText(TessAPI.TessBaseAPI handle) {
    TessAPI api = TessAPI.INSTANCE;
    Pointer utf8Text = hocr ? api.TessBaseAPIGetHOCRText(handle, pageNum - 1) : api.TessBaseAPIGetUTF8Text(handle);
    String text = utf8Text.getString(0);
    api.TessDeleteText(utf8Text);
    return text;
}

Basically, the above-mentioned doOCR() is then decomposed into:

public String doOCR(int xsize, int ysize, ByteBuffer buf, Rectangle rect, int bpp) throws TesseractException {
    TessAPI.TessBaseAPI handle = prepareTessAPI(xsize, ysize, buf, rect, bpp);
    String text = getOCRText(handle);
    TessAPI.INSTANCE.TessBaseAPIDelete(handle);
    return text;
}

Now extending Tesseract makes sense. If you have another scenario in mind, please share a complete example.
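To illustrate why the decomposition helps, here is a minimal sketch of the idea. It does not use the Tess4J API: FakeEngine is a hypothetical stand-in for the native handle, and prepare() and getText() merely mirror the proposed prepareTessAPI() and getOCRText() hooks. The subclass reuses both hooks and reads the confidence while the handle is still alive. The try/finally is an addition of this sketch, so the handle is released even if recognition throws.

```java
// Hypothetical stand-in for the native engine handle; NOT the Tess4J API.
class FakeEngine {
    private boolean open = true;

    String text() { requireOpen(); return "hello"; }

    int meanConfidence() { requireOpen(); return 87; }

    void delete() { open = false; }

    private void requireOpen() {
        if (!open) throw new IllegalStateException("handle already deleted");
    }
}

class BaseOcr {
    // Mirrors the proposed prepareTessAPI(): creates and configures the handle.
    protected FakeEngine prepare() { return new FakeEngine(); }

    // Mirrors the proposed getOCRText(): extracts the text from a live handle.
    protected String getText(FakeEngine handle) { return handle.text(); }

    public String doOCR() {
        FakeEngine handle = prepare();
        try {
            return getText(handle);
        } finally {
            handle.delete(); // always release the handle
        }
    }
}

class ConfidenceOcr extends BaseOcr {
    private int lastConfidence = -1; // -1 means "no OCR run yet"

    @Override
    public String doOCR() {
        FakeEngine handle = prepare();
        try {
            String text = getText(handle);
            // Read the confidence before the handle is deleted.
            lastConfidence = handle.meanConfidence();
            return text;
        } finally {
            handle.delete();
        }
    }

    public int getLastConfidence() { return lastConfidence; }
}
```

A caller would run doOCR() on a ConfidenceOcr instance and then ask it for getLastConfidence(), without copying the initialization code.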

Another note: I think that the expression

if (rect != null && !rect.equals(EMPTY_RECTANGLE)) {

would be better written as:

if (rect != null && !rect.isEmpty()) {

and then one doesn't need the static EMPTY_RECTANGLE.
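The two checks are not strictly equivalent, which the following sketch shows. It assumes EMPTY_RECTANGLE is a default new Rectangle(), that is (0, 0, 0, 0); the class and method names are made up for illustration. java.awt.Rectangle.isEmpty() is true whenever the width or the height is not positive, so it also rejects zero-sized rectangles that have a non-zero origin.

```java
import java.awt.Rectangle;

class RectangleCheckDemo {
    // Assumption: the library constant is a default (0, 0, 0, 0) rectangle.
    static final Rectangle EMPTY_RECTANGLE = new Rectangle();

    // Original style: rejects only a rectangle equal to (0, 0, 0, 0).
    static boolean usableByEquals(Rectangle rect) {
        return rect != null && !rect.equals(EMPTY_RECTANGLE);
    }

    // Suggested style: accepts only rectangles with positive width and height.
    static boolean usableByIsEmpty(Rectangle rect) {
        return rect != null && !rect.isEmpty();
    }

    public static void main(String[] args) {
        Rectangle zeroSizeWithOffset = new Rectangle(10, 20, 0, 0);
        System.out.println(usableByEquals(zeroSizeWithOffset));  // prints "true"
        System.out.println(usableByIsEmpty(zeroSizeWithOffset)); // prints "false"
    }
}
```

For the zero-sized rectangle with an offset, the isEmpty() variant skips the rectangle, which is arguably the safer behavior.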

Quan Nguyen - 2013-06-15

doOCR is a simple method that encapsulates Tesseract engine initialization, processing of a single image, and then shutdown. It is not efficient if you process multiple images. Sure, you can override it with a more efficient algorithm in which the engine is initialized once, processes or manipulates all the images, and finally shuts down to release used resources.
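The "initialize once, process all images, shut down" flow could look roughly like the sketch below. OcrSession and BatchOcr are hypothetical names and do not exist in Tess4J. The session only simulates the engine, and images are represented by their names, so that the lifecycle is the only thing on display.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an OCR engine session; not part of Tess4J.
class OcrSession implements AutoCloseable {
    static int initCount = 0;       // how many times the "engine" was started
    private boolean closed = false;

    OcrSession() { initCount++; }   // expensive initialization would happen here

    // Pretends to recognize one image; a real session would call the engine.
    String recognize(String imageName) {
        if (closed) throw new IllegalStateException("session is closed");
        return "text of " + imageName;
    }

    @Override
    public void close() { closed = true; } // native resources would be released here
}

class BatchOcr {
    // Starts the engine once, processes every image, and always shuts down.
    static List<String> recognizeAll(List<String> imageNames) {
        List<String> results = new ArrayList<>();
        try (OcrSession session = new OcrSession()) {
            for (String name : imageNames) {
                results.add(session.recognize(name));
            }
        }
        return results;
    }
}
```

Using try-with-resources guarantees the shutdown step even when one image fails, which a hand-written init/dispose pair can easily miss.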

Due to my personal work, it could be some time before I can get back on this. You're welcome to submit a patch. Thanks.

Dmitry Katsubo - 2013-06-20

I am attaching my first attempt for your review. From my perspective, it is a step forward because:

Batch image processing is now faster, as init() / dispose() are called only once.

A class that extends Tesseract1 can easily implement another OCR strategy, as all the needed functions are now separate blocks.

Notes:

I think that all occurrences of IIOImage can be replaced by RenderedImage with no impact, as IIOImage is used as a wrapper for RenderedImage. ImageIO is forced to read thumbnails, which are not used.

Using System.err in a library is bad form: if the library runs inside an application server, you don't know where the output is logged (if it is logged at all). So a logger is the way out. Or rethrow the exception.

Having two approaches, Tesseract1 and Tesseract, makes no sense to me. Keeping only one simplifies development and reduces code duplication. Extension (Tesseract1) or aggregation (Tesseract): you need to choose one. I personally think that extension (Tesseract1) is more natural with respect to the handle.
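As a small illustration of the logging note above, the sketch below reports a failure through java.util.logging instead of printing to System.err. The class and the parsePageOrDefault helper are hypothetical and are not taken from the patch. The point is only that the host application can route, filter, or silence a logger, which it cannot easily do with a direct write to System.err.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class OcrLoggingDemo {
    private static final Logger LOGGER =
            Logger.getLogger(OcrLoggingDemo.class.getName());

    // Hypothetical helper: parses a page number and logs a bad value
    // through the logger rather than writing to System.err.
    static int parsePageOrDefault(String value, int fallback) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            LOGGER.log(Level.WARNING, "Invalid page number: " + value, e);
            return fallback;
        }
    }
}
```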

1.patch

Quan Nguyen - 2013-09-07

Dmitry,

It turns out that the Tesseract class cannot be extended due to its private constructor. Tesseract1 is the extensible one here, as the necessary elements are exposed to inheriting classes.

I incorporated many of your suggestions, including logging, into the code baseline. Tesseract is maintained because the alternative direct mapping method that Tesseract1 is based on was until recently still an experimental feature for JNA.

Please help test the changes. Version 1.2 will be released soon. Thanks.

Quan

Anonymous - 2013-09-09

Could you upload .jar and .source.jar somewhere (e.g. Maven snapshots)? I will test against binaries that you will create when you make a release.

Quan Nguyen - 2013-09-09

1.2-Beta attached.

Tess4J-1.2-Beta-src.zip

Quan Nguyen - 2013-09-22

Fixed with release of v1.2. Special thanks to Dmitry Katsubo for the software patch, testing, and valuable suggestions.

Quan Nguyen - 2013-09-22

status: open --> closed

Anonymous - 2021-05-03

I know this issue is years old, but I'm wondering what the current 'best' way to get the confidences is. Like others, I am also confused by the difference between Tesseract vs Tesseract1 and TessAPI vs TessAPI1.

I see what you said about doOCR() being intended for a single image because it shuts down after processing. What is the best way to process multiple images? Is there any documentation on the best way to do this (as well as on getting the confidences)?

thank you

Peter Kronenberg - 2021-05-03

I just entered that last post, but I wasn't logged in.

Quan Nguyen - 2021-05-03

Documentation: http://tess4j.sourceforge.net/docs/docs-4.4/

You can pass a List<IIOImage> to the doOCR method. There are other methods in the Tesseract class that return confidence values.

JNA Direct Mapping: https://github.com/java-native-access/jna/blob/master/www/DirectMapping.md

Peter Kronenberg - 2021-05-04

I see TessBaseAPIAllWordConfidences, which says that it returns the same number of values as that returned by GetUTF8. But TessBaseAPIGetUTF8Text returns a single string, not an array. Can you provide an example? I've read the Javadoc, but it's not always clear without an example.
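One possible reading of that Javadoc, sketched below, is that the confidences line up with the whitespace-delimited words of the returned string. The sketch assumes the confidence values have already been copied into a Java int[] that ends with -1, as the Tesseract C API comments describe. The class and method names are hypothetical, and no Tess4J call is made.

```java
import java.util.ArrayList;
import java.util.List;

class WordConfidencePairing {
    // Pairs each whitespace-delimited word of the text with the confidence
    // at the same index. Assumption: `confidences` ends with -1.
    static List<String> pair(String text, int[] confidences) {
        String trimmed = text.trim();
        String[] words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        List<String> pairs = new ArrayList<>();
        for (int i = 0;
             i < words.length && i < confidences.length && confidences[i] != -1;
             i++) {
            pairs.add(words[i] + "=" + confidences[i]);
        }
        return pairs;
    }
}
```

For example, pair("Hello world", new int[] {95, 80, -1}) yields the two entries "Hello=95" and "world=80".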

Is there an efficient way to process multiple images, but one at a time, without sending them all in as an array?

TessBaseAPIAllWordConfidences() doesn't seem to work with doOCR(), because doOCR() closes everything down instead of leaving it open for the TessBaseAPIAllWordConfidences() call.

Last edit: Peter Kronenberg 2021-05-04

Quan Nguyen - 2021-05-05

Please continue the discussion either in the Discussion section or on the GitHub site rather than on this old, closed ticket.

Thanks.
