From: G K. <ind...@gm...> - 2012-10-17 17:37:12
On Wed, Oct 17, 2012 at 10:56 PM, Raj Mathur (राज माथुर) <ra...@li...> wrote:
> On Wednesday 17 Oct 2012, Raj Mathur (राज माथुर) wrote:
>> On Wednesday 17 Oct 2012, Ravishankar Shrivastava (रवि-रतलामी) wrote:
>> > On 10/17/2012 6:01 PM, Raj Mathur (राज माथुर) wrote:
>> > > Hi,
>> > >
>> > > I have a spreadsheet made out of data captured in the field that
>> > > contains what is purported to be Devnagari, but in some weird
>> > > encoding.
>> > >
>> > > Sample:
>> > >> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh
>> > >> eh.kk] fuoklh & nkekrkykc
>> >
>> > The data is in most popular old 8bit font - Krutidev010 Hindi.
>> > It can be converted to unicode through converters like this -
>> >
>> > सवजी पिता जौदा मीणा, लक्ष्मण पिता जौदा मीणा, धन्ना पिता वक्सी
>> > मीणा,
>> >
>> > निवासी - दामातालाब
>> >
>> >
>> > Then Unicode Hindi can easily be converted to Phonetic-Roman.
>> >
>> >
>> > You can find one such converter (Krutidev>Unicode) online here:
>> > http://raviratlami.blogspot.com/2010/10/blog-post_22.html
>>
>> Thanks for the info about Krutidev010. I need to do bulk conversion
>> of multiple spreadsheets, so will look for a command-line FOSS tool
>> to do this. Well, even an algorithm is fine -- I can create a tool
>> around it. Any pointers appreciated.
>
> Ah, http://shantiniketanvidyapeeth.com/News/UniKrutidev+Converter.htm
> has the complete encoder/decoder in JS. No copyright or licence, so I'm
> going to assume it's public domain and adapt into a GPL Perl script.
>
> Seems to be straight translation with some exceptions handled through
> code.
>

This could be of help: it is basically a web proxy gateway that converts
legacy font-encoded sites to Unicode and displays them.
http://sourceforge.net/projects/unigateway/

The conversion code (in PHP) is here:
http://unigateway.svn.sourceforge.net/viewvc/unigateway/trunk/unigateway/Encoder/fonts/

Karunakar
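
For illustration only, the "straight translation with some exceptions handled
through code" approach described in the quoted message could be sketched
roughly as below. This is not the code of any of the converters linked in this
thread. The table (MAPPING) is an assumption derived solely from the sample
line and its Unicode rendering quoted above, so it covers only those glyphs and
is nowhere near a complete Krutidev010 table. The names MAPPING and
krutidev_to_unicode are hypothetical. The one exception handled here is the
short-i matra, which the legacy font places before its consonant while Unicode
stores it after.

```python
import re

# Partial Krutidev010 -> Unicode table. ASSUMPTION: derived only from the
# sample in this thread; a real converter needs the full glyph table.
MAPPING = {
    # multi-character sequences (must be matched before single characters)
    ".k": "ण", "/k": "ध", "kS": "ौ",
    # half forms (consonant + virama)
    "{": "क्ष्", "D": "क्", "U": "न्",
    # full consonants
    "l": "स", "o": "व", "t": "ज", "i": "प", "r": "त",
    "n": "द", "e": "म", "y": "ल", "u": "न", "c": "ब",
    # vowel signs (matras)
    "k": "ा", "h": "ी", "f": "ि",
    # punctuation
    "]": ",", "&": "-",
}

# Longest keys first, so that "kS" wins over "k" and ".k" over "k".
_TOKEN = re.compile(
    "|".join(re.escape(key) for key in sorted(MAPPING, key=len, reverse=True))
)

_CONSONANT = "[\u0915-\u0939]"  # Devanagari KA..HA
_VIRAMA = "\u094d"
_I_MATRA = "\u093f"

# Short-i matra followed by optional half consonants and one full consonant.
_REORDER = re.compile(
    _I_MATRA + "((?:" + _CONSONANT + _VIRAMA + ")*" + _CONSONANT + ")"
)


def krutidev_to_unicode(text):
    """Convert Krutidev010-encoded text to Unicode (partial table only)."""
    # Step 1: straight table translation; unknown characters pass through.
    translated = _TOKEN.sub(lambda m: MAPPING[m.group(0)], text)
    # Step 2: the exception. Move the short-i matra after its consonant.
    return _REORDER.sub(r"\1" + _I_MATRA, translated)


if __name__ == "__main__":
    sample = ("loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] "
              "/kUuk firk oDlh eh.kk] fuoklh & nkekrkykc")
    print(krutidev_to_unicode(sample))
```

For bulk conversion, the same function could be applied to each cell after
exporting the spreadsheets to CSV. Other exceptions that a full converter has
to deal with (for example the reph, which is also positioned differently in
the legacy font) are not handled in this sketch.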