#218 Update to recent version of tesseract

Milestone: 4.X
Status: open
Owner: nobody
Labels: None
Updated: 2024-06-06
Created: 2023-01-24
Creator: Zdenko
Private: No

Can you please update to the recent leptonica[1] (1.83) and tesseract (5.3.0)[2]? There are a lot of fixes, improvements, and speed-ups.
Also, please consider a minimalist leptonica build (only the libraries that are really needed); see e.g. the first part of [3].

[1] https://github.com/DanBloomberg/leptonica/releases/tag/1.83.0
[2] https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.0
[3] https://bucket401.blogspot.com/2021/03/building-tesserocr-on-ms-windows-64bit.html

Discussion

  • Gabriel Lambert - 2023-02-10

    I second this

     
  • John Smith - 2023-03-20

    It would be so great.

     
  • John Smith - 2023-03-24

    So, Capture2Text version 4.6.3 ships leptonica 1.74.4 and tesseract 4.00, which are represented by "pvt.cppan.demo.danbloomberg.leptonica-1.74.4.dll" and "tesseract400.dll".
    Will it work if we replace those DLLs with a build made by following these instructions? https://bucket401.blogspot.com/2021/03/building-tesserocr-on-ms-windows-64bit.html

     
  • Zdenko - 2023-03-24

    No, replacing the DLLs will not work. You have to recompile Capture2Text against tesseract (and its dependencies).

     
  • Setsumi - 2024-06-06

    Updated tesseract here: https://github.com/setsumi/Capture2TextPlus#capture2textplus
    I haven't noticed any improvements. The trained data is still the same.

     
  • Zdenko - 2024-06-06

    Capture2Text uses a tesseract 4.0 build from 2017 (7 years ago at the moment!). Check how many commits were made: https://github.com/tesseract-ocr/tesseract/commits/main/. Are none of them relevant to you? E.g. speed improvements?

    The official traineddata did not change, but you can do custom fine-tuning for your case.

     
