Update to recent version of tesseract
Quickly OCR part of the screen and save resulting text to clipboard
Could you please update to recent Leptonica (1.83) [1] and Tesseract (5.3.0) [2]? There are a lot of fixes, improvements, and speedups.
Also, please consider a minimal Leptonica build (with only the libraries that are really needed); see e.g. the first part of [3].
[1] https://github.com/DanBloomberg/leptonica/releases/tag/1.83.0
[2] https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.0
[3] https://bucket401.blogspot.com/2021/03/building-tesserocr-on-ms-windows-64bit.html
I second this
It would be so great.
So, Capture2Text 4.6.3 ships Leptonica 1.74.4 and Tesseract 4.00, which are provided by "pvt.cppan.demo.danbloomberg.leptonica-1.74.4.dll" and "tesseract400.dll".
Will it work if we replace those DLLs with a build made by following these instructions? https://bucket401.blogspot.com/2021/03/building-tesserocr-on-ms-windows-64bit.html
No, replacing the DLLs will not work. You have to recompile Capture2Text against the new Tesseract (and its dependencies).
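As a rough illustration of why a drop-in swap is unlikely to work, here is a minimal sketch that compares the version an application was built against with a candidate replacement. The helper names `parse_version` and `same_major` are made up for this example, and the rule itself is only a heuristic: a differing major version strongly suggests API/ABI breakage, but a matching major version does not guarantee that a swapped DLL will load or behave correctly.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '5.3.0' into (5, 3, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def same_major(built_against: str, replacement: str) -> bool:
    """Heuristic only (an assumption of this sketch, not a Tesseract rule):
    report whether both versions share the same major number."""
    return parse_version(built_against)[0] == parse_version(replacement)[0]


if __name__ == "__main__":
    # Version numbers taken from this thread.
    pairs = [
        ("tesseract", "4.00", "5.3.0"),
        ("leptonica", "1.74.4", "1.83.0"),
    ]
    for lib, old, new in pairs:
        if same_major(old, new):
            verdict = "same major version (still no guarantee of binary compatibility)"
        else:
            verdict = "major version differs, expect to recompile"
        print(f"{lib}: {old} -> {new}: {verdict}")
```

Either way, the safe route is the one described above: rebuild Capture2Text against the new libraries.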
Updated Tesseract here: https://github.com/setsumi/Capture2TextPlus#capture2textplus
Haven't noticed any improvements. Trained data is still the same.
Capture2Text uses a Tesseract 4.0 build from 2017 (7 years ago at this point!). Check how many commits have been made since: https://github.com/tesseract-ocr/tesseract/commits/main/. Are none of them relevant to you, e.g. the speed improvements?
The official traineddata has not changed, but you can do custom fine-tuning for your use case.