VietOCR v6.19.0 & VietOCR.NET v6.16.0 Releases
Hi, I am currently having an issue in Tesseract mode: it just reports "Error-1" with no additional information about why the error occurs. Is there a way to get a log or diagnose this without touching the source code? And, by the way, how can I resolve this error?
VietOCR v6.17.0 & VietOCR.NET v6.15.0 Releases
Everything is clear, thanks for the information... and for the great program!
The scanning function only works in 32-bit Linux. We don't know how to generate a jsane binary suitable for 64-bit Linux, which is much more prevalent nowadays. In the next release, we may hide the function in the user interface to avoid confusing users. https://github.com/nguyenq/VietOCR3/issues/6
OK, I solved the problem by installing the latest version of Leptonica on the Linux system (leptonica-1.85.0.tar.gz), which I compiled exactly according to this tutorial: https://stackoverflow.com/questions/29626463/tess4j-on-ubuntu-linux-unsatisfiedlinkerror P.S. If only my Canon Lide 220 scanner worked in the program, it would be perfect. Unfortunately, I get this message: Cannot invoke "uk.co.mmscomputing.device.Scanner.addListener(uk.co.mmscomputing.device.Scanner.addListener)" because "this.this$0.scanner"...
The program started correctly in Linux. The scanned image file loaded correctly, but when I click the OCR processing icon I get these error messages: 1- Error looking up function 'returnErrorFloat1': /lib/x86_64-linux-gnu/liblept.so.5: undefined symbol: returnErrorFloat1 2- Could not initialize class net.sourceforge.lept4j.Leptonica1 Maybe it's a bug related to the latest version of Tesseract, v5.5, which I compiled from GitHub and then used to replace the files in Linux? Regards
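An "undefined symbol" message like the one above usually means the installed shared library does not export a function the caller expects. A minimal sketch for checking that from Python follows. The helper name `exports_symbol` and its `loader` parameter are made up for illustration, and the path and symbol in the comment are simply copied from the error message; this is a diagnostic idea, not something VietOCR itself provides.

```python
import ctypes


def exports_symbol(library_path, symbol, loader=ctypes.CDLL):
    """Return True if the shared library at library_path exports `symbol`.

    `loader` is a hypothetical hook so the check can be exercised without
    a real library; by default the library is loaded with ctypes.CDLL.
    """
    lib = loader(library_path)  # ctypes.CDLL raises OSError if loading fails
    try:
        # ctypes looks the symbol up lazily, on attribute access.
        getattr(lib, symbol)
    except AttributeError:
        return False
    return True


# Example call, using the path and symbol from the error message:
# exports_symbol("/lib/x86_64-linux-gnu/liblept.so.5", "returnErrorFloat1")
```

If the call returns False for the library that is actually being loaded, the installed Leptonica build and the version expected by the Java wrapper probably do not match.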
VietOCR v6.15.0 & VietOCR.NET v6.14.0 Releases
Hello, this is Mahmoud Abdel Aleem. I saw your contributions on GitHub about Tesseract and benefited greatly from them. Thank you for your useful contributions. I would like your help with the following: 1- I have a set of digital images of book covers, 10 images in Arabic, that I want to convert to text using Tesseract. 2- The conversion model, ara.traineddata in Tesseract's tessdata folder, is inaccurate and does not recognize most of the words. 3- I created a model, ara1.traineddata, using jTessBoxEditor...
jTessBoxEditor v2.6.0 & jTessBoxEditorFX 2.6.0 Releases
VietOCR v6.14.0 & VietOCR.NET v6.13.0 Releases
Thanks a lot for the guidance. I will try and update
It's because the program does not know where Tesseract's tessdata directory is. You can define the TESSDATA_PREFIX environment variable so that it contains the path to the directory. https://stackoverflow.com/questions/65597552/how-exactly-to-set-up-and-use-environment-variables-on-a-mac
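To verify that the variable is set to something usable, a small check like the following may help. It is only a sketch: the function name `find_tessdata` is made up, and it assumes that, depending on the Tesseract version, TESSDATA_PREFIX may point either at the tessdata directory itself or at its parent, so it tries both.

```python
import os
from pathlib import Path


def find_tessdata(env=None):
    """Return the directory holding *.traineddata files, or None.

    Assumption: TESSDATA_PREFIX points either at tessdata itself or at
    its parent directory, so both candidates are checked.
    """
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return None  # variable is unset or empty
    base = Path(prefix)
    for candidate in (base, base / "tessdata"):
        if candidate.is_dir() and any(candidate.glob("*.traineddata")):
            return candidate
    return None


if __name__ == "__main__":
    print(find_tessdata() or "TESSDATA_PREFIX is unset or has no language data")
```

Note that a variable exported in a terminal is not necessarily visible to an application started from the desktop, so the result can differ depending on how the program is launched.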
VietOCR.NET v6.13.0 Release
Dear Nguyenq, I need advice on installing VietOCR on macOS Sonoma. Whenever I open VietOCR I get the message: "tessdata folder is not found. Please install language packs and/or set TESSDATA_PREFIX environment variable to parent directory of tessdata." Please help.
The LSTM or WORDSTR box files that the program generates are meant for Tesseract 4.x training, not for training in this program itself. Support for generating these box files was added to the program at the request of a Tesseract developer. You can still edit the box files before using them in the training process. https://tesseract-ocr.github.io/tessdoc/tess4/TrainingTesseract-4.00.html
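For anyone unsure which kind of box file they have, here is a rough sketch of how the three kinds could be told apart. It relies on the formats described in the Tesseract training documentation: WordStr files have lines starting with "WordStr", and LSTM box files mark the end of each text line with an entry that begins with a tab character. The function `box_file_kind` is made up for illustration and is not how jTessBoxEditor itself decides.

```python
def box_file_kind(lines):
    """Classify box file lines as 'wordstr', 'lstm', or 'char'.

    Heuristic only (an assumption, not jTessBoxEditor's own logic):
    - any line starting with 'WordStr ' -> 'wordstr'
    - otherwise, any line starting with a tab (end-of-line marker) -> 'lstm'
    - otherwise -> 'char' (plain per-character boxes)
    """
    has_wordstr = False
    has_tab_marker = False
    for line in lines:
        if line.startswith("WordStr "):
            has_wordstr = True
        elif line.startswith("\t"):
            has_tab_marker = True
    if has_wordstr:
        return "wordstr"
    return "lstm" if has_tab_marker else "char"


if __name__ == "__main__":
    sample = ["WordStr 52 3264 1199 3299 0 #The quick brown fox",
              "\t 1201 3264 1205 3299 0"]
    print(box_file_kind(sample))
```

Under this heuristic, deleting or keeping the tab entries can change how a file is classified, so it is worth looking at the raw file in a text editor before and after editing.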
Hello everyone! I have created a box file with the latest jTessBoxEditor and deleted all empty spaces. When I try to train with the "Trainer", I get the following error message: "cannot train with LSTM or WORDSTR box files. Training for Tesseract 4.0x is not supported". What is the issue? Does jTessBoxEditor create only WORDSTR or LSTM box files? ADDENDUM: If I do not edit the box file (delete blank characters), it is not recognized as WORDSTR or LSTM. Best regards!
VietOCR v6.13.1 & VietOCR.NET v6.11.1 Releases
VietOCR v6.13.0 & VietOCR.NET v6.11.0 Releases
jTessBoxEditor v2.5.0 & jTessBoxEditorFX 2.5.0 Releases
VietOCR v6.12.0 & VietOCR.NET v6.10.0 Releases
Hello all, I am using VietOCR 6.10.0 with Tesseract 5.3.2/3 support to extract text from an English/Sanskrit IAST image file (attached). I have a proper trained data set that I downloaded (also attached) and put in tessdata. It is a page with two columns, and it is extracted quite well except for a few issues. The main issue I face is that it misses a line (the line underlined in blue in the image). I have attached the output file too. Someone suggested that I use different PSMs (as opposed to the default PSM 3).....
results
traineddata
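One way to compare page segmentation modes outside the GUI is to run the tesseract command-line tool once per mode and compare the output files. The sketch below only builds and runs the commands. The file names, the language code "eng", and the chosen modes are placeholders rather than the poster's actual setup, and it assumes a Tesseract 4 or 5 binary is on the PATH.

```python
import subprocess


def psm_commands(image, out_base, lang, modes=(3, 4, 6)):
    """Build one tesseract command line per page segmentation mode.

    Each run writes to its own output base, e.g. out-psm4 (tesseract
    appends .txt). All arguments here are placeholders for illustration.
    """
    return [
        ["tesseract", image, f"{out_base}-psm{mode}", "-l", lang, "--psm", str(mode)]
        for mode in modes
    ]


def run_all(commands):
    """Run each command in turn; raises CalledProcessError on failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in psm_commands("page.png", "out", "eng"):
        print(" ".join(cmd))
```

Comparing the resulting text files side by side should show whether the missing line appears under any of the modes.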
VietOCR v6.10.0 & VietOCR.NET v6.9.0 Releases
I have installed jTessBoxEditor, but I am not able to see a Devanagari font there. Where do you get a Devanagari font? Please tell me; I am new to jTessBoxEditor. When I use a box file in jTessBoxEditor, it does not recognize even a single word of Hindi.
We tested it with Oracle Java 20.0.2 on Windows. Can you try it with Oracle JRE? Thanks.
I have jTessBoxEditor running. I clicked File, Open and opened an image file. Now what? Is there anyone out there willing to give very simple instructions to walk me through the software? Much appreciated.
Hi, I'm not able to choose the output formats for bulk processing. The dropdown opens and I see the choices, but nothing happens when I click any of them. When I run the batch process this way, I get a .box and a .unlv file per image (saved in the output directory). Also, the image magnification buttons are greyed out and not accessible. Choosing the segmented regions options doesn't draw any boxes in the left pane. Running VietOCR 6.9 on Ubuntu 22. java -version: openjdk version "11.0.20" 2023-07-18...
VietOCR v6.9.0 & VietOCR.NET v6.8.0 Releases
VietOCR v6.8.0 & VietOCR.NET v6.7.0 Releases