I use jTessBoxEditor to open the box file for Chinese, but I can't read the characters it shows for each box in jTessBoxEditor. It just shows garbage, like rectangles and English letters. If that is the case, how can I edit the characters that Tesseract recognized incorrectly?
Hi, I am also training for Chinese and would like to train Tesseract so that it can recognize 10 pt SimSun from a screenshot (clean image). SimSun has the property that the "serifs" (well, there really aren't serifs, but the stroke-width variations, etc.) go away when the font is small enough. Can I just train with the image and box file from your software, or do I need to somehow enlarge the image? I cannot use a larger point size, because then it becomes a serifed font.
The default font probably cannot display Chinese characters. You will need to change it, via the Font dialog, to a compatible font.
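If you want to confirm that the box file itself is fine and only the display font is the problem, you can look at the code points stored in it. This is just a rough sketch, assuming the usual six-field box layout (symbol, left, bottom, right, top, page) and UTF-8 encoding; the helper names `parse_box_line` and `describe` and the sample lines are made up for illustration.

```python
import unicodedata


def parse_box_line(line):
    """Split one box-file line into (symbol, left, bottom, right, top, page).

    Assumes the six-field layout: symbol, four coordinates, page number.
    """
    parts = line.rstrip("\n").split(" ")
    symbol = parts[0]
    left, bottom, right, top, page = (int(p) for p in parts[1:6])
    return symbol, left, bottom, right, top, page


def describe(symbol):
    """Return the code point and Unicode name of each character in symbol."""
    return ", ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in symbol
    )


# Hypothetical sample lines; for a real file, read it with
# open(path, encoding="utf-8") and loop over its lines instead.
sample = "中 12 34 56 78 0\n文 60 34 104 78 0\n"

for line in sample.splitlines():
    symbol, *coords = parse_box_line(line)
    print(describe(symbol), coords)
```

If the names printed are CJK ideographs, the box file holds real Chinese characters and the rectangles come from the font, so switching the font in the editor should be enough.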
denniel, were you ever able to use jTessBoxEditor to train for Chinese?