Re: [Indic-computing-users] [LangWorkingGroup] Language Handbook Format
From: Kalika B. <ka...@pi...> - 2002-09-30 11:46:13
I'd be happy to do the Background and Section 1 bits for any of the
languages that you'd like me to do.

tnx,
Kalika

On Mon, 30 Sep 2002, Keyur Shroff wrote:

> Hi,
>
> I am also attaching a format available with me. We can club both the
> formats.
>
> - Keyur
>
>
> --- "Tapan S. Parikh" <ta...@ya...> wrote:
> >
> > I'm attaching in plain text a possible format for the language handbook
> > of the indic-computing project.
> >
> > I'm working on a draft for the format of the Technology Map that I will
> > take up amongst the Technology Working Group soon. I'll circulate that
> > shortly.
> >
> > -- Tapan
> >
> >
> > Language Handbook Format Working Draft
> > Indic-Computing Project
> > 22-9-2002
> >
> > For each language:
> >
> > Background
> > -----------
> > Some background that would help put the language in appropriate
> > historical and social context.
> > - Language History
> > - Language Family (Etymology): Which languages is this language derived
> >   from and related to.
> > - Number of Speakers
> > - Geographic Distribution
> > - Variants / Dialects
> >
> > Section 1 Linguistic Analysis
> > --------------------------------
> > Some more in-depth background of the language from a linguistic
> > perspective, with a focus on issues relevant to computing, display and
> > text processing.
> > - List of Writing Systems : A list of different writing systems used to
> >   represent the language in text. For each writing system, one would try
> >   to include:
> >     - Graphemes: the basic graphemes used in the writing system,
> >       combination rules, and mapping to semantic characters.
> >     - Usage: Usage details (Is it still used? Where? For what purpose?
> >       By how many people in what contexts?)
> > - Basic Grammatical Info: Grammatical information about basic sentence
> >   structure and grammar rules.
> >
> > Section 2 Character Encoding
> > -----------------------------
> > - List of Encodings: A list of character encodings to store this
> >   language in digital format.
> >     - Size of a character: (in bits)
> >     - Code Point / Character Map: Map between code points and semantic
> >       characters.
> >     - Outstanding Issues: Issues with how this encoding represents the
> >       language. Types of issues could include the following:
> >         - Missing Chars
> >         - Missing Semantics
> >         - Missing Processing Rules
> >         - Redundant / Extra Chars
> >         - Erroneous Semantics
> >         - Erroneous Processing Rules
> >     - Writing Systems / Language Variants Supported: Which variants and
> >       different writing systems does this encoding support for the given
> >       language.
> >     - Who created the encoding?
> >     - Who is in charge of the encoding management and modification process?
> >     - Software / OS support - What software and OS's support this encoding
> >         - OS / Network (Linux, FreeBSD, Solaris, Windows, Novell, MacOS, etc.)
> >         - Databases
> >         - Programming Language Libraries and IDEs (C, C++, Java, Perl, Python,
> >           etc.)
> >         - Standards (Unicode, ISO 10646, XML, Linux Standards Base, etc.)
> >
> > Section 3 Fonts
> > ----------------
> > - List of fonts or font families available for this language. For each
> >   font,
> >     - What type of Font is it? (TTF, Type 1, X Window, OTF, other)
> >     - What is the availability?
> >     - Who is the creator of the font?
> >     - Who currently manages / develops / owns the font?
> >     - Is it Open Source?
> >     - What encodings are supported?
> >     - What is the glyph set?
> >     - Brief description of semantic character / glyph mapping
> >     - Brief description of positioning and substitution issues
> >
> >
> > Section 4 Input Methods
> > -------------------------
> > - List of Keyboard Layouts for a language
> >     - Keyboard Type - keyboard types (hardware) supported
> >     - Key - Char Mapping - Mapping between keys and code points
> >     - Usage Information - Information about how the layout is used in
> >       practice
> >         - Prevalence
> >         - Types of Users
> >     - Encodings Supported
> >
> > Section 5 Text Processing
> > --------------------------
> > Information about the language useful from a text processing (searching,
> > sorting, spelling, etc.) point of view.
> > - List of Sort Orders - Different ways the language can be sorted.
> > - Searching / Matching Semantics - What it means for one word to equal
> >   another.
> > - Word Roots
> > - Prefix / Suffix Rules
> > - Line Break Rules - When to break a line
> > - Hyphenation Rules
> >
> > Section 6 Typography and Display
> > --------------------------------
> > - Basics
> > - Ligatures
> > - Punctuation
> > - Justification
> > - Issues Related to Multi-Lingual Document Display
> >
> > Section 7 Locale Info
> > ----------------------
> > Locale-Specific Information would include info about the following:
> > - List of Possible Locales - List of locales the language could be
> >   applicable for. Could refer to a previously described locale.
> > - Time - Time Systems
> >     - Clock Time
> >     - Calendar
> > - Numeric System
> > - Measures
> > - Currency
> > - Salutations
> >
> > Section 8 XML / HTML Markup
> > ---------------------------
> > - XML: Issues related to including local language text in XML docs
> > - HTML: Issues related to including and displaying text in HTML docs
> >
> > Section 9 New Areas
> > --------------------
> > A list of people / projects working on each of the following for the
> > language:
> > - Text to Speech Support
> > - Voice Recognition
> > - OCR Support
> > - Natural Language Processing and Machine Translation
> >
> > Section 10 Language Resources
> > --------------------------------
> > Other important resources regarding the language:
> > - Local Language Software Available - Different types of software and
> >   systems that support the language in one way or another
> > - Organizations - Different organizations, people and institutions
> >   interested in the language, either from a computing perspective or not
> > - Dictionaries - On-line and Off-line dictionaries for the language
> > - Books, Articles, etc.
> > - Other Language Links and Resources

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kalika Bali                          Picopeta Simputers Pvt Ltd
Specialist - Language Technology     146 5th Cross
e-mail: ka...@pi...                  RMV Ext
phone: (080) 361 0567                Bangalore - 560080
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~