Re: [Indic-computing-devel] javascript indic renderer and community portals
From: Krishnamurthy N. <kn...@ya...> - 2003-05-12 12:32:58
Hi Suryaprakash, Arun and others,

A good, generic transliteration library for the Indian languages/scripts is what is needed, IMHO. As anyone who has studied the structure of Indian languages and scripts would appreciate, there is a clear structure in the phonetic input, in how the input syllables are encoded using a script, and in how they are mapped to a series of simple or composite glyphs.

In a web-based client-server model of app development, unlike with Latin scripts/languages, it would be more appropriate to do input processing, transliteration and glyph composition for rendering on the server side. So using JavaScript may not be that feasible for this. At the same time, if all processing is on the server side, the application can't really be that interactive (for example, showing the display to the user for each syllable typed in). Some intermediate solution would be needed. Also, using PHP may be a better idea than JavaScript or JSP. What do you folks think?

A couple of weeks back, I made an enhancement to my transliteration library (translib, under the indic-computing project on SourceForge). It now takes a 'word' in an Indian language encoded as a sequence of Unicode characters (in UTF-8 format), maps it (using a user-defined lookup mapping file) to the most appropriate Roman phonetic input, and then applies the transliteration rules for that language+script by looking up the transliteration rule file. The output, as before, is a sequence of symbolic glyph names that correspond to glyph indices in a given font file. This can be fed to a font reading and rendering library such as freetype2 for final display (Koshy wrote a Python script to do this using gozer, but he is now replacing gozer with a Python wrapper to ft2). I tried out this utf8-to-final-glyph rendering for Hindi+Devanagari with very minimal mapping done and did some preliminary testing, and it's OK.
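
The three stages described above (Unicode word, Roman phonetic input, symbolic glyph names) can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the two tables, the glyph names and the greedy longest-match lookup are all hypothetical and are not taken from translib, whose mapping and rule file formats are not shown in this thread.

```python
# Hypothetical tables; in a real setup these would be loaded from
# user-defined mapping and rule files, not hard-coded.

# Stage 1: Unicode sequence -> Roman phonetic input.
UNICODE_TO_ROMAN = {
    "\u0915": "ka",         # DEVANAGARI LETTER KA
    "\u0915\u093e": "kaa",  # KA + VOWEL SIGN AA
    "\u092e": "ma",         # DEVANAGARI LETTER MA
}

# Stage 2: Roman phonetic input -> symbolic glyph names (names are invented).
ROMAN_TO_GLYPHS = {
    "ka": ["ka"],
    "kaa": ["ka", "aa_matra"],
    "ma": ["ma"],
}


def longest_match(text, table):
    """Greedily split text into the longest keys of table; return their values."""
    max_len = max(len(key) for key in table)
    values = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            chunk = text[pos:pos + size]
            if chunk in table:
                values.append(table[chunk])
                pos += size
                break
        else:
            raise ValueError(f"no mapping for {text[pos]!r} at position {pos}")
    return values


def unicode_to_roman(word):
    """Map a Unicode word to Roman phonetic input using the lookup table."""
    return "".join(longest_match(word, UNICODE_TO_ROMAN))


def roman_to_glyphs(roman):
    """Map Roman phonetic input to a flat list of symbolic glyph names."""
    return [name for group in longest_match(roman, ROMAN_TO_GLYPHS) for name in group]


word = "\u0915\u093e\u092e"  # KA, VOWEL SIGN AA, MA
roman = unicode_to_roman(word)
print(roman, roman_to_glyphs(roman))
```

Note that the functions themselves know nothing about Devanagari; everything language-specific sits in the two tables, which is the property the design aims for. A final step, not shown, would resolve each glyph name to a glyph index in a particular font.
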
All the intelligence is in the user-defined mapping files, and the source code itself has no knowledge of any Indian language. Unicode is neither an input mapping scheme nor a glyph mapping scheme; it's just an encoding scheme, as all of us know. It has limitations, but with a sound transliteration library in place, UTF-8 can be used for storing Indian language content for further processing (search, sort, display, etc.).

cheers,
Nagarajan

--- Suryaprakash Kompalli <kom...@ce...> wrote:

> Hello,
>
> > What about JavaScript? When the user types in a text field using some transliteration scheme or Inscript keyboard layout, convert it to font
>
> For collecting data on Indic scripts, we had created an interface that uses Java. I had used ITRANS transliteration and the default Java fonts to come up with an input window that accepts transliteration in Latin script and outputs Devanagari in another window.
>
> I am not familiar with CMS or *nuke, but from what I gather, reusable Java classes could be a good way to look at it. We can develop classes to work with other languages.
>
> Right now, the tool might be a bit bulky to work with, since the package has a lot of image processing (IP) related code to go with it. But it's good to test offline. People can take a look at it here:
> www.cedar.buffalo.edu/ilt
>
> If it's useful, we can plan to make the input system independent of the IP part.
>
> > http://www.wandel.subnet.dk/hindi.html
>
> I tried the site out. It didn't need applets, but it didn't work on Netscape either: it gave me a message saying it needs Windows to work, but then behaved OK on Mozilla. :) :( What it is doing is this: it takes the Unicode value for the character, converts it into an int, and this is what we can do with the o/p -
>
> <html>
> ????
> </html>
>
> But since what is in the background is basically Unicode, we still need a *properly configured* browser to view the stuff. It came out properly on Mozilla but didn't show up on Netscape. Pretty nifty, but the addition of transliterated/keyboard-based input might be more welcome. The characters present in the interface are the ones from the Unicode table for Devanagari.
>
> Bye,
> Surya
>
> On 5 May 2003, Arun M wrote:
>
> > Hi Friends,
> >
> > We see a lot of community portals coming up these days based on CMS like *nuke. But none of these CMS support Indic. Also, we don't see any community discussion boards in Indian languages (please correct me).
> >
> > Some issues I see are:
> >
> > - There is no Indic (Unicode/non-Unicode) support in most platforms.
> > - Most browsers don't support Unicode.
> >
> > Any community portal we build should be based on font encoding systems, at least for some more time. One idea is to store the data in Unicode and then convert it to font encoding on the server side. A special proxy should help here.
> >
> > The second issue is entering data from the client side. This is a major problem. Most sites use Java applets for entering data in local languages. This may require a good amount of modification in the CMS we have now, like *nuke.
> >
> > Yesterday I made a prototype of this. A crude one. It works.
> >
> > Do you think this will be of use? If so, I will work on it and make the code better and more generic (I don't know JavaScript much; will have to learn it first).
> >
> > Arun.
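
The behaviour Surya describes for the wandel.subnet.dk page (taking each character's Unicode value as an integer and writing it into the HTML) looks like the use of HTML numeric character references. A minimal Python sketch of that idea follows; the function name is illustrative, and whether the page works exactly this way is an assumption, since its markup was lost in the quoted text.

```python
def to_numeric_references(text):
    """Replace every non-ASCII character with a decimal HTML character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


# DEVANAGARI LETTER A (U+0905) followed by DEVANAGARI LETTER MA (U+092E)
html = "<html>" + to_numeric_references("\u0905\u092e") + "</html>"
print(html)  # <html>&#2309;&#2350;</html>
```

The resulting page is pure ASCII, so it survives any transport encoding, but the browser still needs Unicode support and a suitable font to display the characters, which matches the Mozilla/Netscape difference reported above.
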