This project shares the training sources and traineddata files for the Devanagari script, for use with Tesseract OCR.
Please note that Tesseract 4.0.0-alpha with the LSTM engine gives better results for Hindi and other Indian languages.
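As a rough illustration of how a traineddata file from this project might be used, the sketch below builds a tesseract command line without running it. It assumes the tesseract executable is on the PATH. The helper name build_tesseract_command, the image name page.png, the output base page_out and the ./tessdata directory are all hypothetical. The language name passed to -l is the traineddata file name without its extension. The --oem 1 option (LSTM only) applies to Tesseract 4.x and assumes an LSTM-capable hin.traineddata is installed.

```python
import shlex


def build_tesseract_command(image_path, output_base, lang,
                            tessdata_dir=None, lstm_only=False):
    """Return the argument list for one tesseract run (nothing is executed)."""
    # Basic form: tesseract <image> <output base> -l <language>
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if tessdata_dir is not None:
        # Point tesseract at a folder holding the .traineddata files.
        cmd += ["--tessdata-dir", tessdata_dir]
    if lstm_only:
        # Tesseract 4.x only: select the LSTM engine.
        cmd += ["--oem", "1"]
    return cmd


# Hypothetical file names, for illustration only.
legacy_cmd = build_tesseract_command(
    "page.png", "page_out", "san95", tessdata_dir="./tessdata")
lstm_cmd = build_tesseract_command(
    "page.png", "page_out", "hin", lstm_only=True)

print(shlex.join(legacy_cmd))
print(shlex.join(lstm_cmd))
```

To actually run OCR, the returned list can be passed to subprocess.run(cmd, check=True). The recognized text is then written to the output base with a .txt extension.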
See some OCR evaluation reports at:
Hindi and Sanskrit
https://shreeshrii.github.io/tess4eval/
Kannada
Newer files
https://github.com/Shreeshrii/
Older files - Archival Purposes only
https://sourceforge.net/projects/tesseracthindi/files/
Font file
http://www.omkarananda-ashram.org/Sanskrit/sanskrit2003.zip
Font details
http://www.sanskritweb.net/itrans/index.html#SANS2003
Font file
https://sites.google.com/site/bayaryn/siddhanta.ttf?attredirects=0
http://www.sanskritweb.net/cakram/chandas.ttf
http://www.sanskritweb.net/cakram/uttara.ttf
Font details
http://svayambhava.blogspot.in/p/siddhanta-devanagariunicode-open-type.html
http://www.sanskritweb.net/cakram/
Font details
http://www.sanskritweb.net/itrans/#S99FONTS
http://www.sanskritweb.net/itrans/s99fonts.pdf
Font file
http://bombay.indology.info/software/fonts/devanagari/nakula.ttf
http://bombay.indology.info/software/fonts/devanagari/sahadeva.ttf
Font details
http://bombay.indology.info/software/fonts/devanagari/
Font file
http://software.sil.org/downloads/annapurna/AnnapurnaSIL-1.201.zip
Font details
http://software.sil.org/annapurna/
Font file
https://fedorahosted.org/releases/l/o/lohit/lohit-devanagari-ttf-2.95.2.tar.gz
Font details
https://fedorahosted.org/lohit/
Font installation details
http://cikitsa.blogspot.in/2014/12/gnu-freefont-fonts-and-xelatex.html
Font details
https://www.gnu.org/software/freefont/index.html
https://savannah.gnu.org/bugs/?group=freefont
The following software packages and utilities were used for this project:
Windows installer of tesseract-ocr 3.05 dev
http://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-3.05.00dev.exe
https://github.com/UB-Mannheim/tesseract/wiki
Sanskrit traineddata produced by training with text2image, from https://github.com/Shreeshrii/imagessan
https://github.com/Shreeshrii/imagessan/blob/master/tessdata/san95.traineddata
https://github.com/Shreeshrii/imagessan/blob/master/tessdata/san21.traineddata
Sanskrit traineddata from Google for tesseract-ocr 3.04
https://github.com/tesseract-ocr/tessdata/blob/master/san.traineddata
Windows installer of tesseract-ocr 3.02.02 (including English language data)
https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-setup-3.02.02.exe/download
Hindi language data for Tesseract 3.02
https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.hin.tar.gz/download
https://github.com/Shreeshrii/ocr-evaluation-tools
https://gitorious.org/ancient-greek-training-for-tesseract/ocr-evaluation-tools/archive-tarball/master
https://sourceforge.net/projects/vietocr/files/vietocr.net/
https://sourceforge.net/projects/vietocr/files/vietocr/