Capture2Text / Tickets / #38 Allow usage of own dictionary

Pavol Brilla - 2018-02-02

OK, so I made my eng.user-words file, put it in tessdata, and loaded a custom Tesseract config. That all works, but the capture box seems to have disappeared (I can still capture through the shortcut, but I don't see the overlay box).

Giacomo Cocchella - 2018-10-19

More or less, I have the same request. In my case, I need to recognize keywords and nicknames that are not tied to any language. Capture2Text is quite good at recognizing those words, but it sometimes gets them wrong. For example, if the word is Tany65, it sometimes decodes it as Tanyb5. That's just one example. Since these words have no meaning, a dictionary of all the nicknames and keywords to be recognized would probably help Capture2Text improve its accuracy. In my case, I have to detect roughly 250 such words. The best option would be to create a new OCR language; otherwise, a dictionary. Do you think that's possible?

Giacomo Cocchella - 2018-10-21

While searching for a way to match recognized words to keywords, I found that one option could be the Levenshtein algorithm: https://en.wikipedia.org/wiki/Levenshtein_distance
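
A rough sketch of how that matching might look, applied to the OCR text after capture. This is not something Capture2Text does itself. The function names, the keyword "Mark12", and the max_distance threshold of 2 are all made up for illustration; only Tany65 and Tanyb5 come from the example above.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        previous = current
    return previous[-1]


def closest_keyword(ocr_text, keywords, max_distance=2):
    """Return the keyword nearest to ocr_text, or None if none is close enough.

    max_distance is an assumed cut-off; tune it for your own word list.
    """
    if not keywords:
        return None
    best = min(keywords, key=lambda k: levenshtein(ocr_text, k))
    return best if levenshtein(ocr_text, best) <= max_distance else None


# Hypothetical keyword list; a real one would hold the ~250 nicknames.
keywords = ["Tany65", "Mark12"]
print(closest_keyword("Tanyb5", keywords))
```

With a short list this brute-force comparison is cheap. The threshold matters: set it too high and unrelated text gets snapped to a keyword, too low and real OCR mistakes slip through.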
