GOCR for other languages

  • RKVS Raman

    RKVS Raman - 2004-11-13


    Is there a method through which GOCR can be trained for Indian languages?


    • marcel

      marcel - 2004-11-29

      Hi Raman!
      Although I'm new to gocr I can give you one hint. In the directory '/gocr-0.39/bin' you will find a script called 'create_db'. This script creates the glyph database with the character templates. It's a simple bash script that creates the glyphs from TeX files, so if your LaTeX version supports Indian glyphs, simply try to customise this script. Then it could work with Indian languages...

      But the main fact is that gocr does not work with a trained db in any way (as kNN algorithms do in other programs); it works with the template method.
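
      To make the distinction concrete, here is a toy sketch of the general template-matching idea: each glyph is compared pixel by pixel against stored templates and the closest one wins. This is not GOCR's code; the 3x3 bitmaps and the names TEMPLATES, pixel_distance and match_glyph are hypothetical and exist only for illustration.

```python
# Toy template matching. All names and bitmaps here are hypothetical
# illustrations and are not taken from GOCR.
TEMPLATES = {
    "I": ("010",
          "010",
          "010"),
    "L": ("100",
          "100",
          "111"),
    "T": ("111",
          "010",
          "010"),
}


def pixel_distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def match_glyph(bitmap, templates=TEMPLATES):
    """Return the label of the template with the fewest differing pixels."""
    return min(templates, key=lambda label: pixel_distance(bitmap, templates[label]))


# An "L" with one pixel missing from its bottom row.
noisy_l = ("100",
           "100",
           "110")
print(match_glyph(noisy_l))
```

      Nothing is learned from examples here; supporting a new script would mean supplying new templates, which is the role the create_db script plays in the hint above.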

      I hope that, even as a 'noob', I was able to help a little bit.. ;-)

      Best regards,

    • Emanoil Kotsev

      Emanoil Kotsev - 2006-11-17

      Unfortunately, as mentioned by Joerg, the code for the db is bad, and because of the j/gocr concept I don't think it will work even with a correctly created db for another language. In my case, Cyrillic, some letters look like Latin ones, and that completely messes up the whole thing.
      I tried to 'sed' the output, but the results were very poor :-( because only the Latin encoding is supported. I ended up with a text containing a mix of Latin-encoded and Cyrillic-encoded characters.
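
      For anyone who wants to try the same kind of post-processing, here is a minimal sketch of the idea in Python rather than sed. It is not the command I used; the function name and the lookalike table are assumptions for illustration, and it only makes sense when the whole text is known to be Cyrillic, since it cannot tell a real Latin letter from a misrecognised Cyrillic one.

```python
# Hypothetical post-processing sketch: replace Latin letters that have a
# visually identical Cyrillic counterpart. The table below is an assumption
# and is deliberately limited to close lookalikes.
LATIN_LOOKALIKES = "ABEKMHOPCTXaeopcxy"
CYRILLIC_LOOKALIKES = (
    "\u0410\u0412\u0415\u041a\u041c\u041d\u041e\u0420\u0421\u0422\u0425"  # upper case
    "\u0430\u0435\u043e\u0440\u0441\u0445\u0443"                          # lower case
)

LOOKALIKE_TABLE = str.maketrans(LATIN_LOOKALIKES, CYRILLIC_LOOKALIKES)


def latin_lookalikes_to_cyrillic(text):
    """Map Latin lookalike letters to Cyrillic; leave everything else alone."""
    return text.translate(LOOKALIKE_TABLE)


# "CTOP" typed with Latin letters becomes the same-looking Cyrillic word.
fixed = latin_lookalikes_to_cyrillic("CTOP")
print([hex(ord(ch)) for ch in fixed])
```

      Letters without a lookalike, such as Latin 'D' or 'z', pass through unchanged, so text with both scripts in it still comes out mixed.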

      I think this information will help you save time.

