[Indic-computing-devel] From Kannada to keyboards... an Indian language enters the cyberage
From: sunil <su...@in...> - 2001-12-15 11:48:38
Dear Colleagues:

An article on Dr. Pavanaja and the work of KGP. Since they are keen on open source, should we collaborate with them for Kannada?

Thanks,
Sunil

--
Sunil Abraham
Team Leader - MAHITI
Info-tech for the Voluntary Sector
India Cares, Vijay Kiran
314/1, 7th Cross, Domlur
Bangalore - 560 071. Karnataka. India
Pager: +91 80 9624 279519
Ph/Fax: +91 80 5352003, 5350035
E-mail: su...@ma...
Web: http://www.mahiti.org

-----Original Message-----
From: Frederick Noronha <fr...@by...>
To: byt...@go...
Date: Sat, 15 Dec 2001 01:40:52 +0530 (IST)
Subject: From Kannada to keyboards... an Indian language enters the cyberage

**********************************************************************
FROM KANNADA TO KEYBOARDS: AN INDIAN LANGUAGE ENTERS THE CYBERAGE
**********************************************************************

By Frederick Noronha
fr...@by...

For Dr U.B. Pavanaja, an unlucky 1993 scooter accident turned out to be the proverbial blessing in disguise. For nine months as he lay immobilised in bed, the scientist learnt Visual Basic.

Lying prostrate on his bed, with a computer alongside, he then went on to write the first versions of what is now his 'Kannada Kali' software programme. This is a game that helps a child or new learner of the Kannada language of the Southern Indian state of Karnataka to shape his alphabets properly. "I did it lying on the bed with a computer by my side," he recalls with a smile.

Over the years, as he stepped up work on the issue of Indian regional language computing, the one-time scientist at India's prestigious atomic research centre has found his output increasingly relevant to the common man.

Currently he's at the helm of the Kannada Ganaka Parishat (or, Kannada Computer Association). This is a voluntary organisation formed by computer professionals, literary persons and others to promote the standardisation and usage of the Kannada language on computers. It's probably important not to underestimate the size of this task.
Kannada is the language of some 47 million people worldwide -- more than the number of Polish speakers in the world, and just below the number of Ukrainian speakers. Besides, the lessons learnt with Kannada could have important implications for other prominent Indian languages whose speakers number in millions. For instance, Hindi (496 million), Bengali (215 million), Urdu (106 million), Punjabi (96 million), Telugu and Tamil (75 million each), and Marathi (72 million).

"There is so much talk about computing for the common man. But the main problem that everyone seems to overlook is that the common man (specially in countries like India) speaks in languages other than English," as Dr Ubaradka Bellippady Pavanaja reminds us. (Both his first names are village-names, and in the South Indian style, are generally not spelt out in full.)

So, for the past many years, he's been sweating over this front. Some solutions are simple, why-didn't-we-think-of-it-earlier ways out. Others are attempts to do the groundwork and undertake standardisation that could have far-reaching implications for the future.

So far, the standardisation has already been done, both on a uniform keyboard for Kannada, and also for the glyphs and glyph-codes. (The latter refer to the component parts that, when joined together in varying combinations, make up each alphabet.)

There's a big difference between English and Indian languages over the display and storage of information in computers. In the case of English, there is a one-to-one correspondence between the display codes and the storage codes. But in the case of an Indian language, say Kannada, the letters are made up of combinations of consonants and vowels, for example a consonant-plus-consonant-plus-consonant-plus-vowel combination. Each of these component characters has a unique storage code in ISCII, or the Indian Script Code for Information Interchange.
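
The storage side of this can be illustrated with a minimal Python sketch. It rests on stated assumptions and is not KGP's software or the ISCII table itself: it uses Unicode code points rather than ISCII byte values, and the sample syllable and the helper names `storage_codes` and `count_consonants` are chosen here purely for illustration.

```python
import unicodedata

# Illustrative syllable (an assumption, not taken from the article): "strii",
# written as consonant + consonant + consonant + vowel. It is stored as a
# sequence of separate codes, with a virama sign joining the consonants
# into one conjunct.
SYLLABLE = "\u0CB8\u0CCD\u0CA4\u0CCD\u0CB0\u0CC0"


def storage_codes(text):
    """Return (code point, official character name) for each stored code."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def count_consonants(text):
    """Count stored codes that are letters (Unicode category 'Lo')."""
    return sum(1 for ch in text if unicodedata.category(ch) == "Lo")


if __name__ == "__main__":
    for code, name in storage_codes(SYLLABLE):
        print(code, name)
    print("stored codes:", len(SYLLABLE))
    print("consonants:", count_consonants(SYLLABLE))
```

On screen the six stored codes appear as a single written unit, which is why the number of storage codes and the number of display pieces need not match.
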
Display of these characters is accomplished by joining pieces of characters known as 'glyphs'. Codes for the storage characters and the display pieces (glyphs) are different. In addition, the number of characters which make up the letter (used for storage) and the number of display pieces which are used for the display of the letter simply don't have a one-to-one correspondence. An example: the Kannada language uses some 142 pieces to form all the possible combinations that can be obtained from the basic 49 Kannada alphabets.

In the past, Indian groups working on language-solutions -- like the Pune-based, government-backed C-DAC and Mithi, which specialises in local-language computing and is also from Pune -- have done similar work. But in earlier cases, everyone followed their own glyph sets. This meant data lacked 'portability'. Text composed on one computer could not be carried over to, or understood by, another computer which did not share the same software. This was a great handicap in a world where the ability of computers to 'talk to one another' has made them into the powerful tool they currently are.

"We feel the best solution is to have the storage in ISCII. Other solutions have attempted to tie up the user in their own software solutions," says Dr Pavanaja. He says that the Government of India's stand is that ISCII should have standardised glyph sets. "In our region, the Government of Karnataka has standardised glyph sets already. We have benchmark software too... to ensure that the software would work with any standard computer."

Admits Dr Pavanaja: "Standardisation is something that has to be imposed (for the sake of moving ahead together)."

At another level, the Kannada language has also pushed for what it calls the Kannada Standard Code for Language Processing. This is used for sorting, as per the Kannada order of alphabets. "Sorting is a very important job for computers.
Can you think of a single database operation without sorting and indexing?" asks Dr Pavanaja.

"For all these years, using computers for Kannada-work meant simply using it for typing, making books, printing invites and DTP (desktop publishing) work. It has now changed," points out Dr Pavanaja. Sorting and indexing in the regional language, he argues, has opened up new possibilities.

C-DAC (the Government of India-backed Centre for Development of Advanced Computing) earlier had solutions, but these, he says, were not particularly suitable for the Kannada language. This attempt evolved a national standard based on Hindi, whereas every language of India has its own specialities and requirements.

At another level, the Parishat has been working towards a standardised Unicode for Kannada. "KGP general secretary Srinatha Sastry and myself put together a document, and sent it to the Unicode Consortium. It was partly accepted," says Dr Pavanaja. He underlines the importance of uniformity for the Unicode character table and collation code for this regional language.

Incidentally, India's voting-member at the Unicode Consortium is the Indian government's Ministry of Information Technology (MIT). But lack of uniform interests among the various Indian languages used for computing means that sometimes not much can be done on this front.

In September 2000, Dr Pavanaja took part in a Unicode conference in California. "We explained the issues (involved in Kannada), and that was appreciated a lot. The MIT is waiting for all languages to come up with a decision. Only Kannada has done this much groundwork on Unicode. At least Kannada could be implemented on Unicode for now (instead of waiting for all Indian languages to finish their task)."

Besides, the Parishat has developed a free Kannada script software. This was released in October 2001 in Bangalore. "It has got SDK (the software development kit) as part of it.
But most importantly, it comes free (in terms of price)," stresses Dr Pavanaja. He suggests that this is important too in a price-sensitive region like India, where millions still live in poverty.

Using this, developers can write Kannada database applications. It could, therefore, have applications linked to phone directories, ration cards, banking, libraries and even road-transportation operations. This could have an immense impact on the large state of Karnataka, which has a population roughly that of South Africa, and over half the area of Germany in land-mass.

"Everyone needs good database applications. In Indian language computing, 90% of the uses are linked to DTP unfortunately. But in English, computers are overwhelmingly used for database applications," says he, stressing that the lack of applications also causes problems. Whether it's e-commerce, business transactions or public utilities and governance, all these sectors need good database applications, stresses Dr Pavanaja.

One of this team's solutions is called 'Kalitha'. It is a Kannada keyboard driver and font. "It also has a sorting engine, not just a sorting-facility. This is the first time that any Indian language had this facility," says Dr Pavanaja.

This group, led by Srinatha Sastry, has modified a Kannada keyboard-layout originated by K.P. Rao. It uses the 26 English-language keys for Kannada's 49 alphabets. "Even Bill Gates appreciated (the concept behind) such a layout for a keyboard," says Dr Pavanaja.

But just how does it work? The 'shift' (or 'caps') key comes to the rescue. "English has 26 alphabets multiplied by two (with each using the caps key). This makes a total of 52. In Kannada, we need only a total of 49. It works well with the 'shift' and 'unshift' key," says he. This layout has been accepted and notified by the Karnataka government.

In order to keep things simple for the typist and computer-operator, this keyboard makes things a "little more difficult" for the programmer.
But once that is taken care of, things become simple in actually using this solution.

Besides his technical work, this man's own story is also interesting. Dr Pavanaja, currently 42, is a PhD in chemistry. He was a scientist at the Bhabha Atomic Research Centre (BARC) in Bombay. "We used computers extensively, in lab-automation and we also experimented in connecting a lot of lab equipment to computers," he recalls.

Using computers "as a tool" for his scientific work for a while, he says he "got addicted". His own efforts took the chemical scientists closer to the computer in the early days of the PCs. "I soon became seen as a computer professional," he recalls of times in the mid-eighties, when the PC first began to make its appearance in the Indian scientific establishments.

In BARC, a group to promote the Kannada language often faced difficulties in publishing technical articles in its Kannada-language science magazine. That set him thinking. "While doing our magazine 'Belagu' (whose name loosely translates to 'Shine' or 'Reflect Light'), we decided to buy our own DTP package."

In 1995, a visit for advanced research to Taiwan revealed that computer professionals were heavily into computer use, but were overwhelmingly using Chinese. "If they could use their language, why not we?" thought Dr Pavanaja.

Soon, he became active on Internet 'news' groups like soc.culture.indian.karnataka and also set up websites. What happened afterwards is narrated in terms of the output achieved and listed above.

"When I was a scientist, I felt my doctorate had no use. I was hardly doing any (socially-relevant) work. Now, I don't feel guilty about that anymore," he says. He returned from Taiwan in 1996 and resigned from BARC in June 1997.

In 1998, his work made Kannada one of the first Indian languages to use dynamic fonts. He explains: "Earlier, if you wanted to browse a web-site, you needed the (same font used by the site) to be installed on your PC."
Obviously, this is a real dilemma in a region where there exist dozens or hundreds of non-standardised fonts for each language. This meant downloading the font. You needed to do it each time you used a different computer!

Dynamic fonts solve the problem by residing on the 'server', not on the 'client' (or user's computer). When you browse a site, you automatically pull the font info the first time you browse it. Also, it works with any operating system you're using, Dr Pavanaja points out. "In English, you don't have the problem of clashing glyphs. If you use a fancy font, you can still read it at least in Times or Arial...," he notes.

Pavanaja has also created a Kannada version of LOGO. "LOGO stands for 'logic-oriented, graphic-oriented' programming. It is a language for children. It uses very simple commands, like 'forward', 'backward', and so on. School children of the fifth to eighth standards (roughly 10 to 13 years of age) can use it effectively. I thought of Kannada-medium schools, and wanted something for them," says Dr Pavanaja.

Work done by this group could make Kannada the first Indian language to get onto a palm-top computing device, believes Dr Pavanaja.

"Much of the coding (for some of our projects) has been done by K.M.Harsha, a 22-year-old mechanical diploma holder from a village," he points out. This, says the scientist, only underlines the creativity of youngsters if given the chance. It challenges the myth that city-born children are more intelligent!

One of the KGP's dreams is to have Kannada working with the 'free' and 'open source' Linux operating system, which was largely built up by volunteers worldwide. "But that could take some time," concedes Dr Pavanaja. "We need to have keyboard drivers, fonts, a toolkit for software developers, a free office suite like Star Office, and even the complete Linux working in Kannada," he adds.

Getting legal copies of proprietary software would cost millions for a state the size of Karnataka.
"So far the KGP has been taking its funding from the government, semi-government institutions, corporate world and philanthrophy. We need to develop software and make it available freely (so as to make it affordable to the commonman in a country where millions still live in poverty). We don't sell anything," says Dr Pavanaja. Says Dr Pavanaja: "If you don't put Indian languages into the computer, all our tongues will get relegated to being just spoken languages in five to ten years time." Currently the editor of 'Vishva Kannada', which he terms the world's first Internet magazine in the Kannada language, Dr Pavanaja can be contacted at <pav...@vi...> This magazine's site can be visited on the World Wide Web at www.vishvakannada.com (ENDS) ------------------------------------------------------------------- Frederick Noronha * Freelance Journalist * Saligao 403511 Goa India fr...@by... * Phone +91-832-409490 Mobile 9822 12 24 36 ------------------------------------------------------------------- |