From: Raj M. ( र. म. ) <ra...@li...> - 2012-10-17 12:31:33
Hi,

I have a spreadsheet made out of data captured in the field that contains what is purported to be Devnagari, but in some weird encoding. Sample:

> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh eh.kk]
> fuoklh & nkekrkykc

Any idea what this is, and whether I can (a) convert to Unicode and (b) transliterate to Roman character set?

For the record, this is some development-related data being used for a study.

Regards,

-- Raj
--
Raj Mathur                          || ra...@ka...          || GPG:
http://otheronepercent.blogspot.com || http://kandalaya.org || CC68
It is the mind that moves           || http://schizoid.in   || D17F
From: Raj M. ( र. म. ) <ra...@li...> - 2012-10-17 13:07:52
On Wednesday 17 Oct 2012, Raj Mathur (राज माथुर) wrote:
> I have a spreadsheet made out of data captured in the field that
> contains what is purported to be Devnagari, but in some weird
> encoding.
>
> Sample:
> > loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh
> > eh.kk] fuoklh & nkekrkykc
>
> Any idea what this is, and whether I can (a) convert to Unicode and
> (b) transliterate to Roman character set?
>
> For the record, this is some development-related data being used for
> a study.

On digging further, I find that the data is viewable when the following fonts are installed on a Winduhs computer:

Devanagari MT Bold Devanagari MT Devanagari Sangam MN Devanagari Sangam MN Bold DevLys 010 Kruti Dev 010 Krishna Bold Italic Krishna Condensed Krishna Wide Krishna Bold Krishna Italic Krishna Krishna Thin Krishna Kruti Dev 010 Shusha02 Shusha05 Shusha

Now I don't know exactly which of these fonts is used to render the text, whether these fonts are Unicode or not, and whether I can use them in Linux. The objective is to get them into Unicode, and eventually transliterate into Roman.

Any help appreciated.

Regards,

-- Raj
From: Ravishankar S. (रवि-रतलामी) <rav...@gm...> - 2012-10-17 13:52:58
On 10/17/2012 6:01 PM, Raj Mathur (राज माथुर) wrote:
> Hi,
>
> I have a spreadsheet made out of data captured in the field that
> contains what is purported to be Devnagari, but in some weird encoding.
> Sample:
>
>> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh eh.kk]
>> fuoklh & nkekrkykc

The data is in the most popular old 8-bit font, Krutidev010 Hindi. It can be converted to Unicode through converters, like this:

सवजी पिता जौदा मीणा, लक्ष्मण पिता जौदा मीणा, धन्ना पिता वक्सी मीणा,
निवासी - दामातालाब

The Unicode Hindi can then easily be converted to Phonetic-Roman.

You can find one such converter (Krutidev>Unicode) online here:
http://raviratlami.blogspot.com/2010/10/blog-post_22.html

Regards,
Ravi

> Any idea what this is, and whether I can (a) convert to Unicode and (b)
> transliterate to Roman character set?
>
> For the record, this is some development-related data being used for a
> study.
>
> Regards,
>
> -- Raj
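For the second step (Unicode Hindi to Roman), here is a minimal sketch of how such a transliteration could work. Nothing in the thread names a romanization scheme, so the ITRANS-like ASCII spellings below are an assumption, as is the function name `devanagari_to_roman`. The table covers only the letters that occur in the sample, and the sketch does no schwa deletion, so word-final consonants keep their inherent "a".

```python
# Consonants seen in the sample, mapped to an assumed ITRANS-like spelling.
CONSONANTS = {
    "क": "k", "ज": "j", "ण": "N", "त": "t", "द": "d", "ध": "dh",
    "न": "n", "प": "p", "ब": "b", "म": "m", "ल": "l", "व": "v",
    "ष": "Sh", "स": "s",
}
# Dependent vowel signs (matras) seen in the sample.
MATRAS = {
    "\u093e": "aa",  # AA matra
    "\u093f": "i",   # short I matra
    "\u0940": "ii",  # II matra
    "\u094c": "au",  # AU matra
}
VIRAMA = "\u094d"    # suppresses the inherent vowel


def devanagari_to_roman(text):
    """Naive transliteration: consonant + matra, virama, or inherent 'a'."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == VIRAMA:
                i += 1                    # half consonant: no vowel
            elif nxt in MATRAS:
                out.append(MATRAS[nxt])   # explicit vowel sign
                i += 1
            else:
                out.append("a")           # inherent vowel
        else:
            out.append(ch)                # spaces, punctuation, unmapped
        i += 1
    return "".join(out)


print(devanagari_to_roman("पिता"))
```

Independent vowels, anusvara, nukta forms and the remaining consonants would all need entries before this is usable on real data.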
From: Raj M. ( र. म. ) <ra...@li...> - 2012-10-17 17:19:50
On Wednesday 17 Oct 2012, Ravishankar Shrivastava (रवि-रतलामी) wrote:
> On 10/17/2012 6:01 PM, Raj Mathur (राज माथुर) wrote:
> > Hi,
> >
> > I have a spreadsheet made out of data captured in the field that
> > contains what is purported to be Devnagari, but in some weird
> > encoding.
> >
> > Sample:
> >> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh
> >> eh.kk] fuoklh & nkekrkykc
>
> The data is in most popular old 8bit font - Krutidev010 Hindi.
> It can be converted to unicode through converters like this -
>
> सवजी पिता जौदा मीणा, लक्ष्मण पिता जौदा मीणा, धन्ना पिता वक्सी
> मीणा,
>
> निवासी - दामातालाब
>
> Then Unicode Hindi can easily be converted to Phonetic-Roman.
>
> You can find one such converter (Krutidev>Unicode) online here:
> http://raviratlami.blogspot.com/2010/10/blog-post_22.html

Thanks for the info about Krutidev010. I need to do bulk conversion of multiple spreadsheets, so will look for a command-line FOSS tool to do this. Well, even an algorithm is fine -- I can create a tool around it. Any pointers appreciated.

Regards,

-- Raj
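For the bulk part, a minimal sketch of the plumbing, assuming the spreadsheets are first exported to CSV (the thread does not say which format they are in). The helper name `convert_csv` is hypothetical, and `str.upper` is only a stand-in for whatever Krutidev-to-Unicode function ends up being used.

```python
import csv
import io


def convert_csv(src, dst, convert):
    """Copy CSV rows from src to dst, applying convert() to every cell."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([convert(cell) for cell in row])


# Demonstration on in-memory data with a stand-in converter.
src = io.StringIO("a,b\nc,d\n")
dst = io.StringIO()
convert_csv(src, dst, str.upper)
print(dst.getvalue())
```

For real files, the same function can be called with handles opened via `open(path, newline="", encoding="utf-8")`, once per exported sheet.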
From: Raj M. ( र. म. ) <ra...@li...> - 2012-10-17 17:26:40
On Wednesday 17 Oct 2012, Raj Mathur (राज माथुर) wrote:
> On Wednesday 17 Oct 2012, Ravishankar Shrivastava (रवि-रतलामी) wrote:
> > On 10/17/2012 6:01 PM, Raj Mathur (राज माथुर) wrote:
> > > Hi,
> > >
> > > I have a spreadsheet made out of data captured in the field that
> > > contains what is purported to be Devnagari, but in some weird
> > > encoding.
> > >
> > > Sample:
> > >> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh
> > >> eh.kk] fuoklh & nkekrkykc
> >
> > The data is in most popular old 8bit font - Krutidev010 Hindi.
> > It can be converted to unicode through converters like this -
> >
> > सवजी पिता जौदा मीणा, लक्ष्मण पिता जौदा मीणा, धन्ना पिता वक्सी
> > मीणा,
> >
> > निवासी - दामातालाब
> >
> > Then Unicode Hindi can easily be converted to Phonetic-Roman.
> >
> > You can find one such converter (Krutidev>Unicode) online here:
> > http://raviratlami.blogspot.com/2010/10/blog-post_22.html
>
> Thanks for the info about Krutidev010. I need to do bulk conversion
> of multiple spreadsheets, so will look for a command-line FOSS tool
> to do this. Well, even an algorithm is fine -- I can create a tool
> around it. Any pointers appreciated.

Ah, http://shantiniketanvidyapeeth.com/News/UniKrutidev+Converter.htm has the complete encoder/decoder in JS. No copyright or licence, so I'm going to assume it's public domain and adapt into a GPL Perl script.

Seems to be straight translation with some exceptions handled through code.

Regards,

-- Raj
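To illustrate "straight translation with some exceptions", here is a minimal sketch. It is not the JS converter's code: the table is partial and inferred only by lining up the sample in this thread with the converted text Ravi posted, and the names `KRUTIDEV_TO_UNICODE` and `krutidev_to_unicode` are hypothetical. The one exception handled is the short-i matra, which Kruti Dev stores before its consonant (the `f` in `firk`) while Unicode stores it after.

```python
import re

# Partial Kruti Dev 010 -> Unicode table, inferred only from the sample in
# this thread; a real converter needs the complete table.
KRUTIDEV_TO_UNICODE = {
    "/k": "ध", ".k": "ण",
    "kS": "\u094c",                      # AU matra
    "{": "क्ष्", "D": "क्", "U": "न्",     # half forms end in a virama
    "c": "ब", "e": "म", "i": "प", "l": "स", "n": "द", "o": "व",
    "r": "त", "t": "ज", "u": "न", "y": "ल",
    "k": "\u093e",                       # AA matra
    "h": "\u0940",                       # II matra
    "f": "\u093f",                       # short I matra, typed before its consonant
    "]": ",", "&": "-",
}

I_MATRA = "\u093f"
VIRAMA = "\u094d"
CONSONANT = "[\u0915-\u0939]"
# A short-i matra followed by a consonant cluster (zero or more half
# consonants, then a full consonant).
I_BEFORE_CLUSTER = re.compile(
    f"{I_MATRA}((?:{CONSONANT}{VIRAMA})*{CONSONANT})"
)


def krutidev_to_unicode(text):
    """Longest-match table substitution, then move short-i after its cluster."""
    keys = sorted(KRUTIDEV_TO_UNICODE, key=len, reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        for key in keys:
            if text.startswith(key, pos):
                out.append(KRUTIDEV_TO_UNICODE[key])
                pos += len(key)
                break
        else:
            out.append(text[pos])        # unmapped character: pass through
            pos += 1
    converted = "".join(out)
    return I_BEFORE_CLUSTER.sub(lambda m: m.group(1) + I_MATRA, converted)


sample = ("loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] "
          "/kUuk firk oDlh eh.kk] fuoklh & nkekrkykc")
print(krutidev_to_unicode(sample))
```

Two-character keys are tried before single characters so that `.k` and `/k` are not split. Other exceptions that a full converter has to deal with, such as reph placement, are deliberately left out here.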
From: G K. <ind...@gm...> - 2012-10-17 17:37:12
On Wed, Oct 17, 2012 at 10:56 PM, Raj Mathur (राज माथुर) <ra...@li...> wrote:
> On Wednesday 17 Oct 2012, Raj Mathur (राज माथुर) wrote:
>> On Wednesday 17 Oct 2012, Ravishankar Shrivastava (रवि-रतलामी) wrote:
>> > On 10/17/2012 6:01 PM, Raj Mathur (राज माथुर) wrote:
>> > > Hi,
>> > >
>> > > I have a spreadsheet made out of data captured in the field that
>> > > contains what is purported to be Devnagari, but in some weird
>> > > encoding.
>> > >
>> > > Sample:
>> > >> loth firk tkSnk eh.kk] y{e.k firk tkSnk eh.kk] /kUuk firk oDlh
>> > >> eh.kk] fuoklh & nkekrkykc
>> >
>> > The data is in most popular old 8bit font - Krutidev010 Hindi.
>> > It can be converted to unicode through converters like this -
>> >
>> > सवजी पिता जौदा मीणा, लक्ष्मण पिता जौदा मीणा, धन्ना पिता वक्सी
>> > मीणा,
>> >
>> > निवासी - दामातालाब
>> >
>> > Then Unicode Hindi can easily be converted to Phonetic-Roman.
>> >
>> > You can find one such converter (Krutidev>Unicode) online here:
>> > http://raviratlami.blogspot.com/2010/10/blog-post_22.html
>>
>> Thanks for the info about Krutidev010. I need to do bulk conversion
>> of multiple spreadsheets, so will look for a command-line FOSS tool
>> to do this. Well, even an algorithm is fine -- I can create a tool
>> around it. Any pointers appreciated.
>
> Ah, http://shantiniketanvidyapeeth.com/News/UniKrutidev+Converter.htm
> has the complete encoder/decoder in JS. No copyright or licence, so I'm
> going to assume it's public domain and adapt into a GPL Perl script.
>
> Seems to be straight translation with some exceptions handled through
> code.

This could be of help. It is basically a web proxy gateway that converts sites in legacy font encodings to Unicode and displays them:
http://sourceforge.net/projects/unigateway/

The conversion code (in PHP) is here:
http://unigateway.svn.sourceforge.net/viewvc/unigateway/trunk/unigateway/Encoder/fonts/

Karunakar
From: Raj M. ( र. म. ) <ra...@li...> - 2012-10-17 17:48:03
On Wednesday 17 Oct 2012, G Karunakar wrote:
> On Wed, Oct 17, 2012 at 10:56 PM, Raj Mathur (राज माथुर)
> <ra...@li...> wrote:
> >> Thanks for the info about Krutidev010. I need to do bulk
> >> conversion of multiple spreadsheets, so will look for a
> >> command-line FOSS tool to do this. Well, even an algorithm is
> >> fine -- I can create a tool around it. Any pointers appreciated.
> >
> > Ah,
> > http://shantiniketanvidyapeeth.com/News/UniKrutidev+Converter.htm
> > has the complete encoder/decoder in JS. No copyright or licence,
> > so I'm going to assume it's public domain and adapt into a GPL
> > Perl script.
> >
> > Seems to be straight translation with some exceptions handled
> > through code.
>
> This could be of help, basically a web proxy gateway to convert
> legacy font encoded sites to unicode and display them.
> http://sourceforge.net/projects/unigateway/
> conversion codes (in PHP) are here.
> http://unigateway.svn.sourceforge.net/viewvc/unigateway/trunk/unigateway/Encoder/fonts/

Great, I'll use that as a fallback in case the JS converter conversion to Perl doesn't work (properly). The unigateway script seems to be doing a more complex conversion to something called "Padma", involving multiple classes and $deity knows what else. Will take time to parse and understand!

Of the two evils (JS and PHP), I guess JS is the lesser :)

Regards,

-- Raj
From: Guntupalli K. <kar...@in...> - 2012-10-17 17:59:09
On Wed, 17 Oct 2012 23:17:39 +0530 Raj Mathur (राज माथुर) wrote:
> On Wednesday 17 Oct 2012, G Karunakar wrote:
> > On Wed, Oct 17, 2012 at 10:56 PM, Raj Mathur (राज माथुर)
> > <ra...@li...> wrote:
> > >> Thanks for the info about Krutidev010. I need to do bulk
> > >> conversion of multiple spreadsheets, so will look for a
> > >> command-line FOSS tool to do this. Well, even an algorithm is
> > >> fine -- I can create a tool around it. Any pointers
> > >> appreciated.
> > >
> > > Ah,
> > > http://shantiniketanvidyapeeth.com/News/UniKrutidev+Converter.htm
> > > has the complete encoder/decoder in JS. No copyright or
> > > licence, so I'm going to assume it's public domain and adapt
> > > into a GPL Perl script.
> > >
> > > Seems to be straight translation with some exceptions handled
> > > through code.
> >
> > This could be of help, basically a web proxy gateway to convert
> > legacy font encoded sites to unicode and display them.
> > http://sourceforge.net/projects/unigateway/
> > conversion codes (in PHP) are here.
> > http://unigateway.svn.sourceforge.net/viewvc/unigateway/trunk/unigateway/Encoder/fonts/
>
> Great, I'll use that as a fallback in case the JS converter
> conversion to Perl doesn't work (properly). The unigateway script
> seems to be doing a more complex conversion to something called
> "Padma", involving multiple classes and $deity knows what else.
> Will take time to parse and understand!

Yes, Padma is a Firefox extension for the same purpose; unigateway is based on it.
http://padma.mozdev.org/

The Padma code is JS again.

Karunakar