[Indic-computing-devel] ANARCHY OF KANNADA STANDARDS IN IT.

ANARCHY OF KANNADA STANDARDS IN IT.

Historically, standards were never forced or defined without a thorough evaluation of the pros and cons and without implementation on a trial basis. Standards evolve and become mandatory as a measure of quality assurance, building consensus among all concerned regarding norms for compliance and criteria for certification. How can the goals of standardisation be achieved by adopting controversial methods in evolving a standard glyph set / font / encoding for Kannada?

Since the topic of this mail is the code set, its usefulness and its interoperability, I take the example of the PC character set. How did this character set become acceptable to all PC manufacturers, OS developers, application developers etc., in spite of the provision to alter or change the character generator to define one's own choice of character set? The character set defined by IBM was accepted by the whole industry for the sake of interoperability. Interoperability is an important criterion for a standard to succeed. Is there interoperability in the standards of Kannada for computers?

Introduction

Often the complexities of Indian languages confound the so-called experts of the field, who influence officials to favour their idealism, get their efforts funded, and confuse end users who are already in a dilemma, say a typist, a government employee who is at the receiving end of the resulting half-baked potatoes.

You may be wondering what relevance this mail has to you. As you are interested in Indian Language computing, I thought it would be of some interest to you.

Is there any single product available for Indian languages that can claim to be a technological marvel? No. It is to be noted that every Indian language is now implemented just by hacking the fonts. Even the hacking of fonts is not done properly in many instances. I take the example of Kannada to go into the details of "How Kannada Language / Script / Glyph / Font / Code is being handled for standardisation". I have included Language, Script, Glyph, Font and Code just to throw some light on the ambiguities that the people concerned have introduced in standardising Kannada.

Indian Script code standard

All the Indian languages were encoded in ISCII (Indian Script Code for Information Interchange, often misinterpreted as Indian Standard Code for Information Interchange) based on the script principles of vowels and consonants. Hence, this standard has only signs, vowels, vowel-consonants (instead of a pure consonant, a consonant with an inherent vowel, which is the base for writing the script; the non-use of the pure consonant in Devanagari is the reason) and vowel signs (to differentiate them from the vowels in a string and to avoid auto-combining of consonant and vowel when a vowel comes in a non-initial position). This principle is also adopted in Unicode (which is supposed to be a character encoding) for Indian languages, based on the earlier version of ISCII.
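To make the principle concrete, here is a minimal sketch using the Kannada block of Unicode and Python's standard unicodedata module. It only prints the code points behind a syllable, an independent vowel and a pure consonant; the helper name describe is my own, and the sketch says nothing about ISCII byte values.

```python
import unicodedata

def describe(text):
    """Return 'U+XXXX NAME' for every code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# Syllable "ki": consonant letter KA followed by the dependent vowel sign I.
KI = "\u0C95\u0CBF"
# Independent vowel I, used when the vowel stands on its own.
I_VOWEL = "\u0C87"
# Pure consonant "k": KA followed by the virama, which suppresses the inherent vowel.
PURE_K = "\u0C95\u0CCD"

for label, text in [("ki", KI), ("i", I_VOWEL), ("k", PURE_K)]:
    print(label, describe(text))
```

The vowel sign and the independent vowel have separate code points, which is the distinction between vowels and vowel signs described above.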

Indian Script implementation

As the existing operating systems (except MS Windows 2000 and MS Windows XP) do not have the capability to handle these standards, enthusiastic developers found the trick of arranging the glyphs in some useful manner for these languages and handling the combinational complexities at the input level. As this trick was already being followed in MS DOS based DTP software before MS Windows became popular, the same trick was carried over to MS Windows for the sake of convertibility and ease of use. While all the available Indian language solutions are based on these hacks, NO technology exists on GUI based operating systems like MS Windows to handle ISCII with off-the-shelf application software such as office suites.

It is this annoying trick that the 'Specific Group' for Kannada is following in standardising and developing NUDI for Kannada, while blaming the developers. Why has this 'Specific Group' never attempted to invent a technology to handle Kannada efficiently using ISCII on computers for off-the-shelf applications? I leave it to you to guess.

Kannada standards

Let me focus on how ambiguously they interpret the language, which has resulted in today's anarchy. Leaving aside the complexities of script composition, Kannada has one special form of the consonant 'r', as in Karnataka when written in Kannada, which is the most commonly used form of 'r'. The so-called experts, instead of handling the complexity of 'r' in software, have introduced it as one of the symbols in the standard announced for the Kannada keyboard (reference Karnataka G.O sa am ka e 70 kaa 99 dated 4-2-1999). In this standard, issued by the secretariat of Kannada and Culture, 47 necessary Kannada characters and 4 symbols are listed for the modern Kannada language.
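As an illustration of handling 'r' in software rather than as a separate symbol, the sketch below spells Karnataka with ordinary Unicode Kannada code points, where the 'r' is stored as the letter RA followed by the virama, and software detects the sequence. I am assuming that the special form discussed above is the one that arises from RA plus virama before a consonant; the function names are hypothetical.

```python
RA, VIRAMA = "\u0CB0", "\u0CCD"

# "Karnataka" in Kannada: KA, RA, VIRAMA, NA, VOWEL SIGN AA, TTA, KA.
KARNATAKA = "\u0C95\u0CB0\u0CCD\u0CA8\u0CBE\u0C9F\u0C95"

def is_consonant(ch):
    """True for code points in the Kannada consonant range KA..HA."""
    return 0x0C95 <= ord(ch) <= 0x0CB9

def ra_conjunct_positions(text):
    """Return the indexes where RA + VIRAMA is followed by a consonant."""
    return [
        i
        for i in range(len(text) - 2)
        if text[i] == RA and text[i + 1] == VIRAMA and is_consonant(text[i + 2])
    ]

print(ra_conjunct_positions(KARNATAKA))
```

With this approach the stored text needs no extra symbol for 'r'; choosing the right shape is left to the input or rendering software.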

When the same Specific Group sent its recommendation for Kannada in Unicode to the Government of India through the Directorate of Information Technology of the Government of Karnataka, it left two diacritic marks (part of the above standard), which were recommended for composing Vedic text, but included a new set of additional characters.

Later on, when the Specific Group reached the peak of confusion, it came out with yet another, altogether different character set as the Kannada Standard Code for Language Processing (KSCLP). Strangely, the arguable special symbol for 'r' is left out of this character set. If the KSCLP code set is meant for language processing, then what else do the other encodings, ISCII, Unicode etc., do? Does it mean that the Group was not aware of the sorting problems when it submitted the recommendation? Why is the Government insisting on a sorting order as per ISCII when KGP (Kannada Ganaka Parishat) is allowed to sort based on KSCLP? Is it not malpractice to recommend two different standards based on altogether different principles and use them for one's own advantage? Are they not misleading the Kannada people, the people of Karnataka and the Government of Karnataka? If the character encoding (KSCLP) is the most suitable for Kannada language processing, why was the same not recommended for Unicode? Is it not a wonder?

Font Standards

Leaving aside the Kannada character set that the Specific Group handles/suggests/innovates, let me throw some light on their glyph code standards for script handling. To solve the problem of Kannada text portability and compatibility, the Government of Karnataka appointed a Committee, which included constituents of the Specific Group. The Specific Group submitted a report recommending a set of glyphs and glyph codes, which was announced as the standard on Nov 1, 2000. The Specific Group managed to get funding to develop a model software (an input handling software), and the same was developed and announced without even bothering about the minimum features recommended by the same Specific Group. Later on, when a difficulty arose in using their standard for e-governance projects, which require English to be part of the user's choice, they silently included a bi-lingual glyph encoding by making use of the code positions that had been spared because they are unusable by nature. I wonder how the Government is promoting a bi-lingual encoding based on codes that were spared while evolving the mono-lingual glyph standard. If the Specific Group decides as it wishes, then why does the Government appoint a committee to standardise glyphs and glyph codes for Kannada? Is it to approve by rubber stamp, or to elevate the Specific Group to the level of consultants to Microsoft?

Conversion

When a document created using the KGP recommended Unicode is converted to KSCLP, the conversion will be lossy, resulting in the loss of the diacritic marks recommended for Vedic texts, the 'ru' long vowel and the 'ru' long vowel consonants. Similarly, converting text created in the mono-lingual encoding into the bi-lingual encoding will result in loss of data.
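The loss can be sketched as follows. The actual KSCLP table is not reproduced in this mail, so the set of missing characters below is a hypothetical stand-in: I assume the 'ru' long vowel corresponds to the Unicode letter and vowel sign VOCALIC RR, and the function names are my own.

```python
# Hypothetical: characters assumed to have no position in the target code set.
MISSING_IN_TARGET = {
    "\u0CE0",  # KANNADA LETTER VOCALIC RR
    "\u0CC4",  # KANNADA VOWEL SIGN VOCALIC RR
}

def convert(text, missing=MISSING_IN_TARGET):
    """Return (converted_text, dropped): characters without a target code are lost."""
    kept, dropped = [], []
    for ch in text:
        if ch in missing:
            dropped.append(ch)
        else:
            kept.append(ch)
    return "".join(kept), dropped

def is_lossless(text, missing=MISSING_IN_TARGET):
    """True when every character of text survives the conversion."""
    return not convert(text, missing)[1]

# KA + VOWEL SIGN VOCALIC RR loses its vowel sign; KA + VOWEL SIGN I survives.
print(convert("\u0C95\u0CC4"))
print(is_lossless("\u0C95\u0CBF"))
```

Once a character has been dropped, no converter can restore it from the target text, which is what makes the conversion lossy rather than merely inconvenient.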

As the diacritic marks were not part of the recommended glyph set but were recommended as part of the keyboard standard, developers of any software written to the prevailing standard are allowed to accommodate the diacritic marks at their convenience. If such software is used, it will only result in non-standard, non-compatible fonts, and the text will be in a non-portable format. How can anybody write a converter utility for texts created with diacritic marks that are allowed in any vacant codes, in any order? This may be the reason why the Specific Group has not attempted to provide a conversion utility for its software NUDI.
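A small sketch of why a single converter cannot exist in that situation: if two developers place the marks in different vacant codes, the same stored code means different things. The two tables, the code values and the mark names below are invented purely for illustration.

```python
# Hypothetical glyph tables of two developers; codes and names are invented.
VENDOR_A = {0x80: "MARK_1", 0x81: "MARK_2"}
VENDOR_B = {0x80: "MARK_2", 0x9F: "MARK_1"}

def ambiguous_codes(*tables):
    """Return the codes that decode to different glyphs under different tables."""
    seen = {}
    for table in tables:
        for code, glyph in table.items():
            seen.setdefault(code, set()).add(glyph)
    return {code: glyphs for code, glyphs in seen.items() if len(glyphs) > 1}

print(ambiguous_codes(VENDOR_A, VENDOR_B))
```

A converter would need to know which developer's software produced each file, and that information is not carried in the text itself.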

Keyboard standard

It is often argued that the strength of the standardised keyboard lies in the layout being managed within the keys meant for English. The English keys are used as a reference for the user to remember the keys. However, the loading of the fingers is conveniently forgotten. In the standard Kannada keyboard, the left hand is loaded with 15 keys and the right hand with 11 keys (excluding the punctuation keys). Normally, when a keyboard layout is designed, the frequency of letters, and in turn of keystrokes, is considered. When language specific encoding standards and keyboard layouts are evolving worldwide, it is strange that the Specific Group managed to get attestation from the Government for its unscientific keyboard layout. It is alarming that the Group also tries to influence other language groups working on standardisation efforts.
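The kind of analysis I mean can be sketched as below. The split of the 26 English letter keys between the hands follows the usual touch-typing convention on a QWERTY keyboard, which gives the same 15 and 11 count; the keystroke counts in the example are made up, and real figures would have to come from a Kannada text corpus.

```python
# Usual touch-typing split of the QWERTY letter keys between the two hands.
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm")

def hand_load(keystroke_counts):
    """Return (left_share, right_share) of keystrokes, each between 0 and 1."""
    left = sum(n for key, n in keystroke_counts.items() if key in LEFT_HAND)
    right = sum(n for key, n in keystroke_counts.items() if key in RIGHT_HAND)
    total = left + right
    if total == 0:
        return 0.0, 0.0
    return left / total, right / total

print(len(LEFT_HAND), len(RIGHT_HAND))

# Made-up keystroke counts, only to show the calculation.
print(hand_load({"a": 3, "k": 1}))
```

Counting keys per hand is only the first step; weighting each key by how often it is struck shows the real load on each hand.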

NUDI

The Specific Group cleverly managed to get the support of the concerned authorities of the Government of Karnataka and made use of the Government machinery to fulfil its whimsical will. One amongst the Specific Group once blamed the developers for proprietary glyph encoding: "We feel the best solution is to have the storage in ISCII. Other solutions have attempted to tie up the user in their own software solutions". But in reality they have succeeded in announcing their proprietary set of glyphs as the standard and have not provided a solution for storage in ISCII for off-the-shelf applications. It amply proves their lip service.

You may be wondering who these friends are. They are none other than Dr. U.B Pavanaja, who holds Vishva Kannada Softech, Mr C. V Srinatha Sastry, General Secretary of Kannada Ganaka Parishat, and Mr. G. N Narasimha Murthy, Secretary of Kannada Ganaka Parishat. (I have not mentioned their attached institutions to maintain the dignity of the institutions.)

Now the KGP has tied up Government users by forcing them to use the proprietary non-standard bi-lingual encoding, implementing the e-governance projects with the blessings of the Kannada Development Authority and with the support of the Directorate of Information Technology, which is the controlling and monitoring body for the IT requirements of the Government of Karnataka.

Can anyone list the ten features that KGP has provided with NUDI, as claimed by the Specific Group? NUDI, purported to be the benchmark software, has been developed with non-standard fonts, such as English numerals and bi-lingual fonts, and with no conversion utilities.

While KGP sings for a standard for Kannada, what has prompted it to develop NUDI using a non-standard bi-lingual font?

When the Directorate of Information Technology penalises the shortlisted developers for not having followed the standard, what is the modus operandi behind promoting the non-standard, uncertified software NUDI?

Kannada Development

KGP has succeeded (by influencing the project handling agencies) in implementing e-governance projects in non-standard proprietary bi-lingual glyphs, confining all Government data flow to its wishes. Is it not a dirty trick?

KGP is increasingly using yet another proprietary encoding for its own internal implementations and passing on this invisible internal encoding in the name of an SDK, so as to crawl in and grab the entire application development in Kannada. This proprietary encoding is also being used in the new NLP projects which KGP has started developing with huge funds flowing in from the Government. With this initiative, KGP would build MRD of considerable size, which is necessary for NLP projects. Can anyone explain how MRD of this size is interoperable and how it is going to help develop Kannada on computers?

Kannada has become a victim of a jealous KGP.

I wish that Kannada, with its outstanding 2300 years of survival and very rich literary contribution, faces this challenge and exposes the erratic management of Kannada standards by KGP, to maintain its sustained growth and enthronement on digital media.

With my everlasting love and creed towards Kannada Kasthuri I have taken your precious time. I welcome your views on this subject.

N. ANBARASAN
email : ar...@bg... , phone : +91-080-3386167.