Re: [Indic-computing-users] [LangWorkingGroup] Language Handbook Format
From: Keyur S. <key...@ya...> - 2002-09-30 11:38:22
Hi,

I am also attaching a format that I have. We can combine the two formats.

- Keyur

--- "Tapan S. Parikh" <ta...@ya...> wrote:
>
> I'm attaching in plain text a possible format for the language handbook
> of the indic-computing project.
>
> I'm working on a draft for the format of the Technology Map that I will
> take up amongst the Technology Working Group soon. I'll circulate that
> shortly.
>
> -- Tapan
>
>
> Language Handbook Format Working Draft
> Indic-Computing Project
> 22-9-2002
>
> For each language:
>
> Background
> -----------
> Some background that would help put the language in appropriate
> historical and social context.
> - Language History
> - Language Family (Etymology): Which languages is this language derived
>   from and related to.
> - Number of Speakers
> - Geographic Distribution
> - Variants / Dialects
>
> Section 1 Linguistic Analysis
> --------------------------------
> Some more in-depth background of the language from a linguistic
> perspective, with a focus on issues relevant to computing, display and
> text processing.
> - List of Writing Systems: A list of different writing systems used to
>   represent the language in text. For each writing system, one would try
>   to include:
>   - Graphemes: the basic graphemes used in the writing system,
>     combination rules, and mapping to semantic characters.
>   - Usage: Usage details (Is it still used? Where? For what purpose?
>     By how many people in what contexts?)
> - Basic Grammatical Info: Grammatical information about basic sentence
>   structure and grammar rules.
>
> Section 2 Character Encoding
> -----------------------------
> - List of Encodings: A list of character encodings to store this
>   language in digital format.
>   - Size of a character: (in bits)
>   - Code Point / Character Map: Map between code points and semantic
>     characters.
>   - Outstanding Issues: Issues with how this encoding represents the
>     language. Types of issues could include the following:
>     - Missing Chars
>     - Missing Semantics
>     - Missing Processing Rules
>     - Redundant / Extra Chars
>     - Erroneous Semantics
>     - Erroneous Processing Rules
>   - Writing Systems / Language Variants Supported: Which variants and
>     different writing systems does this encoding support for the given
>     language.
>   - Who created the encoding?
>   - Who is in charge of the encoding management and modification process?
>   - Software / OS support - What software and OSes support this encoding
>     - OS / Network (Linux, FreeBSD, Solaris, Windows, Novell, MacOS, etc.)
>     - Databases
>     - Programming Language Libraries and IDEs (C, C++, Java, Perl, Python,
>       etc.)
>     - Standards (Unicode, ISO 10646, XML, Linux Standard Base, etc.)
>
> Section 3 Fonts
> ----------------
> - List of fonts or font families available for this language. For each
>   font,
>   - What type of Font is it? (TTF, Type 1, X Window, OTF, other)
>   - What is the availability?
>   - Who is the creator of the font?
>   - Who currently manages / develops / owns the font?
>   - Is it Open Source?
>   - What encodings are supported?
>   - What is the glyph set?
>   - Brief description of semantic character / glyph mapping
>   - Brief description of positioning and substitution issues
>
>
> Section 4 Input Methods
> -------------------------
> - List of Keyboard Layouts for a language
>   - Keyboard Type - keyboard types (hardware) supported
>   - Key - Char Mapping - Mapping between keys and code points
>   - Usage Information - Information about how the layout is used in
>     practice
>     - Prevalence
>     - Types of Users
>   - Encodings Supported
>
> Section 5 Text Processing
> --------------------------
> Information about the language useful from a text processing (searching,
> sorting, spelling, etc.) point of view.
> - List of Sort Orders - Different ways the language can be sorted.
> - Searching / Matching Semantics - What it means for one word to equal
>   another.
> - Word Roots
> - Prefix / Suffix Rules
> - Line Break Rules - When to break a line
> - Hyphenation Rules
>
> Section 6 Typography and Display
> --------------------------------
> - Basics
> - Ligatures
> - Punctuation
> - Justification
> - Issues Related to Multi-Lingual Document Display
>
> Section 7 Locale Info
> ----------------------
> Locale-Specific Information would include info about the following:
> - List of Possible Locales - List of locales the language could be
>   applicable for. Could refer to a previously described locale.
> - Time - Time Systems
> - Clock Time
> - Calendar
> - Numeric System
> - Measures
> - Currency
> - Salutations
>
> Section 8 XML / HTML Markup
> ---------------------------
> - XML: Issues related to including local language text in XML docs
> - HTML: Issues related to including and displaying text in HTML docs
>
> Section 9 New Areas
> --------------------
> A list of people / projects working on each of the following for the
> language:
> - Text to Speech Support
> - Voice Recognition
> - OCR Support
> - Natural Language Processing and Machine Translation
>
> Section 10 Language Resources
> --------------------------------
> Other important resources regarding the language:
> - Local Language Software Available - Different types of software and
>   systems that support the language in one way or another
> - Organizations - Different organizations, people and institutions
>   interested in the language, either from a computing perspective or not
> - Dictionaries - On-line and Off-line dictionaries for the language
> - Books, Articles, etc.
> - Other Language Links and Resources
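
To make the "Code Point / Character Map" and "Searching / Matching Semantics" entries in the outline a little more concrete, here is a minimal Python sketch. It is not part of either proposed format. The choice of Devanagari as the sample script, the helper names `char_map` and `same_word`, and the use of Unicode NFC normalization as the matching rule are all assumptions made for illustration; a handbook entry for a given language might well define matching differently.

```python
import unicodedata


def char_map(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def same_word(a, b):
    """Treat two strings as equal if their NFC-normalized forms match.

    Assumption: canonical equivalence is the matching rule. A handbook
    entry could specify something looser or stricter for its language.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two encodings of the same Devanagari letter (QA):
precomposed = "\u0958"       # one code point
decomposed = "\u0915\u093C"  # letter KA followed by sign NUKTA

if __name__ == "__main__":
    # Character map for the two-code-point form.
    for code_point, name in char_map(decomposed):
        print(code_point, name)
    # Raw comparison versus comparison after normalization.
    print(precomposed == decomposed)
    print(same_word(precomposed, decomposed))
```

The two strings render the same way but differ code point by code point, which is the kind of case the "what it means for one word to equal another" item would need to settle for each encoding.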