indic-computing-devel Mailing List for The Indic-Computing Project (Page 23)
Status: Alpha
Brought to you by:
jkoshy
From: <jk...@Fr...> - 2002-02-12 20:42:07

I'm trying to refine my understanding of the basic algorithms involved in Indic glyph rendering, for future inclusion in the Handbook.

My current understanding:

There seem to be two major issues when rendering Indic scripts, given a sequence of code points representing characters in some encoding like Unicode or ISCII:

(A) the presentation (i.e. visual) order of glyphs need not match the order of code points in the sequence.

(B) these scripts use a number of glyph shapes representing combinations of characters, so there isn't a 1-1 mapping of character encoding code points to glyphs.

(A) can come about because of the structure of the character encoding used. For example, Unicode follows the convention that the code point for a 'base character' precedes the code points for any modifiers. However, some Indic scripts may require that glyphs representing these modifiers (e.g. "vowel marks") be placed before the glyph for the 'base character'.

[Note: You could possibly think of a character encoding where text is encoded in "visual" order. Some transliteration schemes for Indian languages use such "visual" order encodings.]

(B) is a property of the script: most (all?) Indic scripts have special glyph shapes for double consonants, consonant+vowel combinations, etc.

So, our rendering process has to map:

    `M' code points -> `N' language glyph shapes

and in doing so we have to do glyph re-ordering (A) and composite glyph selection (B).

[Q: Are there any other issues to be taken care of when rendering Indic scripts?]

Some Indian language fonts are designed to contain "partial glyphs"; these fonts require a sequence of glyphs to be specified to render a full language glyph on screen (for example, Baraha (Kannada)). For such fonts, each of the `N' language glyph shapes selected above will need to be mapped further into `O' font-specific glyph indices.

My questions are:

- do we do reordering of glyphs (A) before looking for composite glyphs (B), or is it best done the other way round?
- do (A) and (B) have to be done multiple times?
- is there ONE algorithm that can handle correct glyph rendering for every Indic script, or are the glyph selection/re-ordering algorithms language specific?

Thanks in advance for answers; this discussion will form the basis of a section on Indic rendering in our Indic-Computing Handbook.

Regards,
Koshy <jk...@fr...>

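To make steps (A) and (B) concrete, here is a minimal C sketch for one Devanagari case. Everything in it is an illustrative assumption rather than an answer to the questions above: the function names, the private shape id `SHAPE_K_SSA`, and above all the choice to run (B) before (A). It only knows one conjunct (KA + VIRAMA + SSA) and one pre-base vowel sign (U+093F), and a real renderer would need per-script rules well beyond this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEVA_KA           0x0915u  /* DEVANAGARI LETTER KA */
#define DEVA_SSA          0x0937u  /* DEVANAGARI LETTER SSA */
#define DEVA_VOWEL_SIGN_I 0x093Fu  /* drawn to the left of its base */
#define DEVA_VIRAMA       0x094Du  /* DEVANAGARI SIGN VIRAMA */

/* Hypothetical id for the K.SSA conjunct shape (a Private Use Area value). */
#define SHAPE_K_SSA       0xE000u

/* (B) Replace KA + VIRAMA + SSA with one conjunct shape id, in place.
 * Returns the new length of the sequence. */
size_t select_conjuncts(uint16_t *s, size_t n)
{
    size_t in = 0, out = 0;
    assert(s != NULL || n == 0);
    while (in < n) {
        if (in + 2 < n && s[in] == DEVA_KA &&
            s[in + 1] == DEVA_VIRAMA && s[in + 2] == DEVA_SSA) {
            s[out++] = SHAPE_K_SSA;
            in += 3;
        } else {
            s[out++] = s[in++];
        }
    }
    return out;
}

/* (A) Swap each vowel sign I with the shape that precedes it, so that it
 * comes first in visual order. */
void reorder_prebase(uint16_t *s, size_t n)
{
    size_t i;
    assert(s != NULL || n == 0);
    for (i = 1; i < n; i++) {
        if (s[i] == DEVA_VOWEL_SIGN_I) {
            uint16_t base = s[i - 1];
            s[i - 1] = s[i];
            s[i] = base;
        }
    }
}

/* Assumed pipeline order: conjunct selection first, then reordering. */
size_t shape(uint16_t *s, size_t n)
{
    n = select_conjuncts(s, n);
    reorder_prebase(s, n);
    return n;
}
```

Calling `shape()` on the four code points 0x0915, 0x094D, 0x0937, 0x093F leaves two shapes: the vowel sign followed by the conjunct. Running (B) first is what lets (A) be a plain swap here; with the opposite order the vowel sign would have to skip over the whole consonant cluster.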
From: Tapan S. P. <ta...@ya...> - 2002-02-12 11:47:01

Keyur, Koshy,

First off, I don't think the transformation suggested by Keyur can be considered the "same as" UCS-2. For instance, in some ISO encodings, such as those for Cyrillic, Arabic, etc., the code point range 128..255 was used for the special characters required for each such script, which would get mapped to completely different (and unrelated) values in UCS-2. So Keyur, unless you are doing a table-based mapping for all of these encodings, which doesn't seem possible since you don't know the source encoding, you are probably doing the wrong thing. (Actually, even if you are doing things this way you would still likely end up with the wrong result, since the fonts for such encodings are indexed ISO-style.)

I think that is why Koshy suggested testing your server on those encodings, and I tend to think things would break (badly) too, since initial characters in the range 128..255 would be interpreted as multi-byte chars in UTF-8, causing all kinds of funky UCS-2 characters to be emitted (and eventually rendered, again, funkily). In short, it would be a mess. At least this is how things would happen in my imagination of how your code works. Please correct me if I am wrong.

Still, I must say, I have been following your discussion closely, and as time goes by I am more and more convinced by Keyur's line of reasoning. Remember, X was designed and developed at a time when we were dealing with an (ISO) Latin-only world, where 8-bit char codes most likely meant glyph codes as well, so the distinction was moot. At that point there was no idea of supporting complex scripts and issues such as ligatures, conjuncts, positioning, etc., nor even the idea of TrueType (not to mention OpenType) fonts and cmap tables.

As we move into the 21st century, it seems very much out of the X frame of view to have the client worrying about the specific font it will use and the corresponding mapping table, as well as positioning issues. To me this seems clearly within the domain of the X Server and its rendering engine. If the X protocol is vague on this issue, it is my opinion that the X protocol should change, not our approach to this problem.

So this becomes a deeper issue of modifying the X Server to support Unicode and OpenType (or TrueType) fonts, which must be going on elsewhere as well. Anyone know? For one place, see http://www.io.com/~kazushi/xtt/, a Japanese effort to make the X Server ttf-friendly, in order to support Japanese on the server side...

At the risk of striking an undesired tone, I venture to say we have lived under the shadow of a Latin-dominated computing world for long enough. It is time we make our voice heard, and make sure solutions to support our languages are not ad hoc, but rather made at the most appropriate level for the overall architecture...

--Tapan

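The concern about bytes in the range 128..255 can be checked with a few lines of C. The sketch below is not taken from IndiX or Xlib, and the function names are made up for illustration. It only looks at the structure of the byte sequence (lead byte plus continuation bytes) and does not reject every ill-formed case that a full UTF-8 validator would.

```c
#include <assert.h>
#include <stddef.h>

/* Expected length of a UTF-8 sequence, judged from its first byte.
 * Returns 0 if the byte cannot start a sequence. */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                  /* US-ASCII */
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;                                   /* 0x80..0xC1, 0xF5..0xFF */
}

/* Structural check only: every lead byte must be followed by the right
 * number of continuation bytes of the form 10xxxxxx. */
int utf8_well_formed(const unsigned char *s, size_t n)
{
    size_t i = 0;
    assert(s != NULL || n == 0);
    while (i < n) {
        int len = utf8_seq_len(s[i]);
        int k;
        if (len == 0 || (size_t)len > n - i) return 0;
        for (k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
        }
        i += (size_t)len;
    }
    return 1;
}
```

For the ISO 8859-1 string "café" (bytes 63 61 66 E9), the final byte 0xE9 looks like the start of a three-byte UTF-8 sequence with nothing after it, so `utf8_well_formed()` returns 0. The same word encoded as UTF-8 (63 61 66 C3 A9) passes. A library that assumes UTF-8 input therefore has to either reject or mis-decode text from the 8-bit ISO encodings.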
From: Keyur S. <key...@ya...> - 2002-02-12 10:02:13

Hello,

I don't understand what is happening. Now sending it again. :(

Hello,

It seems that my earlier mail was not sent in full. So sending it again. Sorry for the inconvenience.

--- Joseph Koshy <jk...@Fr...> wrote:
> Dear Keyur,
>
> ks> When Xlib converts an 8-bit string into a 16-bit string, it
> ks> sends the MSB first. This is the same as Little-Endian UCS-2.
>
> The X11 Protocol definition predates Unicode. This isn't
> Little-Endian UCS-2; it's just a 2-byte encoding of the 8-bit glyph
> indices.

I don't say that "it is" UCS-2. I say that it is the "same as" UCS-2 (or compatible with UCS-2).

> ks> How can client have knowledge about the glyph indices?
>
> That is what the encoding field in the long name of X fonts is for.
> For Latin fonts this will be `-iso8859-1', meaning the font is encoded
> compatibly with the ISO8859-1 character encoding.

As you said, this is really a character encoding, not a font encoding. Some distinction should be made between "character" and "glyph".

-----------
According to the Unicode standard (see glossary), a character is

(1) The smallest component of written language that has semantic value; refers to the abstract meaning and/or shape, rather than a specific shape (see also glyph), though in code tables some form of visual representation is essential for the reader's understanding.
(2) Synonym for abstract character.
(3) Loosely, the basic unit of encoding for the Unicode character encoding, a 16-bit unit of textual representation.
(4) Synonym for code value.
(5) The English name for the ideographic written elements of Chinese origin.

Abstract character: A unit of information used for the organization, control, or representation of textual data. (See also character (1, 2))

And glyph has been defined as

(1) An abstract form that represents one or more glyph images.
(2) A synonym for glyph image. In displaying Unicode character data, one or more glyphs may be selected to depict a particular character. These glyphs are selected by a rendering engine during composition and layout processing.
---------------

As can be seen from the above definitions, a client passes "something" that has semantic value, that is, "characters". One or more glyphs may be selected to display a particular character. So the client is in no position to decide which glyph indices are to be used for a character. It is entirely at the discretion of the font designer to select the _proper_ glyph(s) for a character. We can't say that a particular glyph should be used for a character.

> You can have fonts that are not indexed by character codes and fonts
> that follow different encoding schemes, e.g. hp-roman8.

Can you give me a few font formats used in the X Window System which don't use a mapping table? Even in the case of a different encoding like hp-roman8, or a font coding like ISFOC, there should be a mapping from these encoding values to the glyph codes. In the case of ISFOC, the font's glyph encoding matches the ISFOC encoding.

> The client has to select the correct glyph indices in the X text
> drawing calls appropriately.

In the X Window System the client doesn't have direct access to font resources when fonts are loaded by the font library interactively with the X server. Also, all the font resources and security data are kept by the X server. Clients can only send a request to the X server to display a character string or to get the extents of a character string.

> If the font's glyph encoding matches the character encoding, then an X
> client can just send over the numeric values of 'characters' unchanged
> and the correct glyphs will get selected automatically. This is what
> you are seeing when you put a "printf()" in "XQueryTextExtents()".

I object to the word "automatically". The glyphs are not selected automatically; they are displayed properly because the glyph codes and character codes match. It is also possible that a font designer decides to use two glyphs "/" and "\" for the character "X". In that case it is the job of the mapping tables to do things properly. The client will only request that the glyph for character "X" be drawn. It will not send indices for "/" and "\".

> TrueType fonts do have a 'cmap' that maps from character codes to
> internal glyph indices. This happens to work in X because the X client
> is assuming a font encoding (like iso8859-1) when sending over the
> glyph indices and the font's 'cmap' is set up to map the same character
> encoding to its internal layout.

So you are coming to the point. As you said, TrueType fonts do have a 'cmap' table that maps from character codes to internal glyph indices. It means that clients have to pass character codes to such fonts. And clients do pass character codes.

My stand becomes clearer if you take the example of XDrawString16 or XQueryTextExtents16. In these functions we use the XChar2b structure to pass character codes (e.g. Unicode). A font may have as many as 500 glyphs. But we pass values like the ones below.

----
XChar2b str[10];
str[0].byte1 = 0x09; str[0].byte2 = 0x15;   /* U+0915 DEVANAGARI LETTER KA */
str[1].byte1 = 0x09; str[1].byte2 = 0x30;   /* U+0930 DEVANAGARI LETTER RA */
XDrawString16(dpy, drawable, gc, x, y, str, 2);
----

Clearly, we are passing the Unicode values U+0915 and U+0930, which are the Unicode characters "Devanagari Ka" and "Devanagari Ra" respectively. The glyphs for these characters may be at positions 156 and 183 respectively. We are not passing the values "156" or "183".

> Such "remapping" by TrueType fonts is out of the scope of the X
> protocol.
>
> ks> Please read the first sentence in
> ks> X Protocol Specification, Glossary, pp 37
> ks> "This request returns the logical extents of the
> ks> specified string of characters in the specified font".
> ks>           ^^^^^^^^^^^^^^^^^^^^
>
> Agreed, this is poorly worded. You need to read the formal
> definitions of FONT, STRING8 and STRING16 to put the definition in
> context. See also the protocol descriptions for PolyText{8,16} and
> ImageText{8,16}.

OK. Here are the definitions.

-------
FONT (Page 154)
A font is a matrix of glyphs (typically characters). The protocol does no translation or interpretation of character sets. The client simply indicates values used to index glyph array. A font contains additional metric information to determine interglyph and interline spacing.
-------

Here "values used to index" doesn't necessarily mean glyph codes. "Character codes" are also values used to index the glyph array, through some mapping table.

---------
(Page 3)
STRING8  -> LISTofCARD8
STRING16 -> LISTofCHAR2B
CHAR2B   -> [byte1, byte2: CARD8]
BYTE     -> 8-bit value
CARD8    -> 8-bit unsigned integer
CARD16   -> 16-bit unsigned integer
---------

Nowhere do they say anything about glyph indices. In fact, the Protocol doesn't explicitly describe the "values" used in the protocol at all. That freedom was left to the implementation. The X Window System is not merely the X Protocol; it includes the X library, the X Protocol, the X server, and now font renderers. It is entirely up to the implementation to decide what these "values" mean. And the developers have decided to pass "character codes" as the values in the X Protocol.

Regards,
Keyur

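The character-code-to-glyph mapping argued for above can be sketched in C as a table lookup. This is only an illustration under stated assumptions: `Char2b` mirrors the two-byte layout of the CHAR2B type quoted above but is declared locally, the `CmapEntry` table and function names are invented, and the glyph positions 156 and 183 are the hypothetical ones from the message, not values from any real font. A real TrueType 'cmap' is a binary table with several subtable formats, which this does not attempt to parse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the protocol's CHAR2B: [byte1, byte2: CARD8]. */
typedef struct { unsigned char byte1, byte2; } Char2b;

/* One row of a simplified character-to-glyph map, sorted by code. */
typedef struct { uint16_t code; uint16_t glyph; } CmapEntry;

/* Hypothetical font: glyph positions taken from the example above. */
static const CmapEntry demo_cmap[] = {
    { 0x0915, 156 },   /* DEVANAGARI LETTER KA */
    { 0x0930, 183 },   /* DEVANAGARI LETTER RA */
};
#define DEMO_CMAP_LEN (sizeof demo_cmap / sizeof demo_cmap[0])

/* Binary search for a character code; returns glyph 0 (the "missing
 * glyph" slot in TrueType fonts) when the code is not mapped. */
uint16_t cmap_lookup(const CmapEntry *table, size_t n, uint16_t code)
{
    size_t lo = 0, hi = n;
    assert(table != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].code == code) return table[mid].glyph;
        if (table[mid].code < code) lo = mid + 1;
        else hi = mid;
    }
    return 0;
}

/* Treat the two bytes as one 16-bit value, byte1 being the high byte. */
uint16_t glyph_for_char2b(Char2b c)
{
    uint16_t code = (uint16_t)((c.byte1 << 8) | c.byte2);
    return cmap_lookup(demo_cmap, DEMO_CMAP_LEN, code);
}
```

With this table, the pair (0x09, 0x15) resolves to glyph 156 and (0x09, 0x30) to glyph 183, while an unmapped pair such as (0x00, 0x41) falls back to glyph 0. Whether such a lookup belongs to the client, to the X server, or to the font library behind it is exactly the point under dispute in this thread; the sketch takes no side.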
From: <jk...@Fr...> - 2002-02-11 11:11:36

Dear Keyur,

ks> When Xlib converts an 8-bit string into a 16-bit string, it
ks> sends the MSB first. This is the same as Little-Endian UCS-2.

The X11 Protocol definition predates Unicode. This isn't Little-Endian UCS-2; it's just a 2-byte encoding of the 8-bit glyph indices.

ks> If you are still not happy with my explanation, then put a
ks> 'printf' statement in the function XQueryTextExtents and see
ks> the values passed in the request. :)

ks> How can client have knowledge about the glyph indices?

That is what the encoding field in the long name of X fonts is for. For Latin fonts this will be `-iso8859-1', meaning the font is encoded compatibly with the ISO8859-1 character encoding. There can be other encodings: Big5 (Chinese), iso8859-8 (latin+hebrew) or iso8859-5 (latin+cyrillic). You can have fonts that are not indexed by character codes and fonts that follow different encoding schemes, e.g. hp-roman8. The client has to select the correct glyph indices in the X text drawing calls appropriately.

If the font's glyph encoding matches the character encoding, then an X client can just send over the numeric values of 'characters' unchanged and the correct glyphs will get selected automatically. This is what you are seeing when you put a "printf()" in "XQueryTextExtents()".

TrueType fonts do have a 'cmap' that maps from character codes to internal glyph indices. This happens to work in X because the X client is assuming a font encoding (like iso8859-1) when sending over the glyph indices and the font's 'cmap' is set up to map the same character encoding to its internal layout.

Such "remapping" by TrueType fonts is out of the scope of the X protocol.

ks> Please read the first sentence in
ks> X Protocol Specification, Glossary, pp 37
ks> "This request returns the logical extents of the
ks> specified string of characters in the specified font".
ks>           ^^^^^^^^^^^^^^^^^^^^

Agreed, this is poorly worded. You need to read the formal definitions of FONT, STRING8 and STRING16 to put the definition in context. See also the protocol descriptions for PolyText{8,16} and ImageText{8,16}.

ks> Unfortunately, I don't have test suite installed on my
ks> system. It is not there in xc/test :( I am also not able to
ks> locate it on XFree86 site. Will you please tell me where
ks> can I get it from?

It is part of the XFree86 repository, available under the directory "test/", a sibling of the directory "xc/". It can be retrieved in the usual ways (Anon-CVS checkout, CVSup mirroring etc).

Anyone changing the X11 library or the X server really should be running the test suite to check for breakages. Do be sure to run the test suite from a remote (unmodified) system as well as locally.

Regards,
Koshy <jk...@fr...>

From: Keyur S. <key...@ya...> - 2002-02-11 08:04:37

Dear Joseph,

--- Joseph Koshy <jk...@Fr...> wrote:
> ks> Err! Please carefully see the source code of xc/lib/X11/QuTextExt.c
>
> I don't see anything specific to Unicode or UCS-2 in this file.

When Xlib converts an 8-bit string into a 16-bit string, it sends the MSB first. This is the same as Little-Endian UCS-2.

> There is only one [QueryTextExtents] protocol request in the X11
> protocol. This request is used for both the `XQueryTextExtents16()'
> and `XQueryTextExtents()'. It expects 2 byte glyph indices.

These are not glyph indices. These are character codes which are passed in the request. The X server passes them to the appropriate font library, which then maps these character codes to glyph codes and does the further processing.

"The client simply indicates values used to index the glyph array."

In this sentence, 'values used to index the glyph array' means 'character codes', which are used to index the glyph array through some mapping table in the font (e.g. the cmap table in a TrueType font).

> For "linear" (single byte) glyph indices, the X library makes the MSB
> of each 2 byte glyph index to be zero (i.e. linear encodings are
> treated as row 0 of a 2-D glyph matrix). All this is explained in the
> X Protocol specification. [See Page 37, QueryTextExtents]

Please read the first sentence in X Protocol Specification, Glossary, pp 37

"This request returns the logical extents of the
specified string of characters in the specified font".
          ^^^^^^^^^^^^^^^^^^^^

Let me explain this through an example. The client passes a string of characters, e.g. "Hello World", to XQueryTextExtents. Xlib will convert it into a 16-bit string before sending it to the X server in a 'QueryTextExtents' request. At this point no conversion from these character codes to glyph codes is done. On the server side, the proper font renderer (font library) is chosen (see xc/lib/font). This font library then gets the glyph ids and other glyph information (glyph metrics etc.) for this character string using a mapping table stored in the font. The font library then passes this information back to the X server, which processes the request further and finally either sends a reply/error/event or fulfills the request (as in the case of XDrawString).

In the mapping table of the font, the character code is not necessarily the same as the glyph code (glyph id). For example, the character 'A', which has character code 65, may be at glyph position 10 and thus have glyph code 10. In the font table, there is a mapping from character code 65 to glyph code 10.

If you are still not happy with my explanation, then put a 'printf' statement in the function XQueryTextExtents and see the values passed in the request. :)

> ks> In my view, it is compatible with the X Window System protocol.
>
> You seem to have ignored the part of the X protocol specification
> (that I had quoted in my review) that explicitly states that the X
> protocol DOES NOT deal with character codes and that the clients just
> use indices into the glyph array.

As I have explained earlier, you have misinterpreted the sentence. How can the client have knowledge about the glyph indices? The client always passes a character string to the Xlib routine.

> If you want to see the effect of your changes on protocol compliance,
> you could:
>
> (a) run the X protocol test suite. In particular,

Unfortunately, I don't have the test suite installed on my system. It is not there in xc/test :( I am also not able to locate it on the XFree86 site. Will you please tell me where I can get it from?

> You are probably the best placed in our group of developers to talk
> about the technology behind Indic script rendering. I am looking
> forward to learning from your experience. Do you have tutorial or
> writeup on Indic rendering that you could share with this group?

Sure. Working as a group, we shall definitely arrive at some solution. I'll be happy to share my experience with this group. I would also like to comment on various design issues that you explained in one of your earlier mails.

There are some documents on Indic rendering (not written by me). I'll send you pointers. I'll also give you the document written by us.

We have also developed a series of printing tools that can produce a high quality PS file using outlines. They use OpenType fonts and support the UTF-8, ISCII, and UCS-2 (Little-Endian and Big-Endian) encodings. I am looking forward to your feedback on these tools. I'll also register all our projects on SourceForge.

Regards,
Keyur

From: Tapan S. P. <ta...@ya...> - 2002-02-11 07:46:29
http://www.newsbytes.com/news/02/174182.html
From: <jk...@Fr...> - 2002-02-09 05:42:47

Dear Keyur,

Welcome!

ks> Err! Please carefully see the source code of xc/lib/X11/QuTextExt.c
ks> in original XFree86. It also first converts the string into UCS-2
ks> before sending request to the X Server. The only difference between
ks> the conversion is that, originally X Server pads an extra byte to
ks> each element of the string to make it UCS-2. We assume incoming
ks> sequence into UTF-8 and convert it into UCS-2. The changes made in
ks> IndiX was earlier breaking relationship with other foreign languages
ks> like French, German (all with iso-8859-* encoding). But now I am
ks> taking care of this also.

I don't see anything specific to Unicode or UCS-2 in this file.

http://cvsweb.xfree86.org/cvsweb/xc/lib/X11/QuTextExt.c?rev=1.4&content-type=text/x-cvsweb-markup

There is only one [QueryTextExtents] protocol request in the X11 protocol. This request is used for both the `XQueryTextExtents16()' and `XQueryTextExtents()'. It expects 2 byte glyph indices.

For "linear" (single byte) glyph indices, the X library makes the MSB of each 2 byte glyph index to be zero (i.e. linear encodings are treated as row 0 of a 2-D glyph matrix). All this is explained in the X Protocol specification. [See Page 37, QueryTextExtents]

Compare:

http://cvsweb.xfree86.org/cvsweb/xc/lib/X11/QuTextE16.c?rev=1.4&content-type=text/x-cvsweb-markup

ks> X11 text drawing calls accepts character codes and send them to the
ks> X Server along with other data in the form of a request. We have not
ks> changed this semantic. This character codes are then used by the
ks> subsequent font library to get the glyph codes.

ks> In my view, it is compatible with the X Window System protocol.

You seem to have ignored the part of the X protocol specification (that I had quoted in my review) that explicitly states that the X protocol DOES NOT deal with character codes and that the clients just use indices into the glyph array.

  ``Font: A font is a matrix of glyphs (typically characters). The
    protocol does no translations or interpretation of character sets.
    The client simply indicates values used to index the glyph array.
    A font contains additional metric information to determine
    interglyph and interline spacing.''

  X Protocol Specification, Glossary, pp 154.

If you want to see the effect of your changes on protocol compliance, you could:

(a) run the X protocol test suite. In particular,

      /tset/CH06/drwimgstr16/Test{all}
      ... and others in this section ...
      /tset/XPROTO/imgtxt16/Test{all}
      /tset/XPROTO/plytxt16/Test{all}
      /tset/XPROTO/qrytxtextn/Test{all}

(b) attempt to view Cyrillic, Korean or Japanese text (i.e. character encodings whose code points fall outside the US-ASCII range)

You are probably the best placed in our group of developers to talk about the technology behind Indic script rendering. I am looking forward to learning from your experience. Do you have a tutorial or writeup on Indic rendering that you could share with this group?

Regards,
Koshy <jk...@fr...>

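The "row 0" widening of single-byte strings described above can be sketched in a few lines of C. This is not the code from QuTextExt.c; the `Char2b` type and the function name are stand-ins written for illustration, modelled on the protocol's CHAR2B layout of two 8-bit fields.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the protocol's CHAR2B: [byte1, byte2: CARD8]. */
typedef struct { unsigned char byte1, byte2; } Char2b;

/* Widen an 8-bit string to 2-byte elements. byte1 (the row, i.e. the
 * most significant byte) is always zero; the original value goes into
 * byte2 unchanged and is never looked up or translated. */
size_t widen_string8(const unsigned char *in, size_t n, Char2b *out)
{
    size_t i;
    assert((in != NULL && out != NULL) || n == 0);
    for (i = 0; i < n; i++) {
        out[i].byte1 = 0;
        out[i].byte2 = in[i];
    }
    return n;
}
```

The byte 0xE9 becomes the pair (0x00, 0xE9) whatever the font's encoding is: under iso8859-1 that index selects "é", under another 8-bit encoding it selects something else. Since byte1 goes on the wire first, this is most-significant-byte-first order, which is the big-endian convention rather than the little-endian one.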
From: Keyur S. <key...@ya...> - 2002-02-08 09:22:03
|
Hi,

I have just joined the list. Let me first say thanks to Mr. Joseph Koshy for evaluating the IndiX system. It's really good to have someone who can independently evaluate our work. Thanks a lot, Mr. Koshy! I am also thankful to Mr. Tapan Parikh for informing me about this mailing list. There are certain points I want to comment on.

--- Joseph Koshy wrote:
> [[ This is a 28MB download. I wonder why they didn't (also) put up a "diff" wrt. XFree86 sources instead. ]]

Agreed. I was too lazy to put a 'diff' on the website :(. I'll do it.

> IMO, the plus points of this work are:

At this moment, I don't want to say anything about the plus points of IndiX. I would rather discuss the negative points so that I can improve the design and the system :)

> The NCST changes are, unfortunately, intrusive, and break the semantics of the X Window system protocol.
> The problem with the NCST design arises from confusion over character codes and the glyph indices used in X11 text drawing calls.

I have some objection to this point. X11 text drawing calls accept character codes and send them to the X server along with other data in the form of a request. We have not changed this semantic. These character codes are then used by the font library to get the glyph codes.

> In the NCST work however, all text strings fed into X11 text calls are assumed to be UNICODE character streams encoded in UTF-8 format.

True. We have made this assumption. There are certain reasons behind it, but we'll discuss them later on.

> The NCST system cannot be considered an implementation of the X Window System protocol. Applications using the NCST X library will not work correctly on other X servers and applications compiled on other systems will not work correctly on the NCST X server.

Again, this is not true. I have been downloading binary RPMs and using them on my machine, where IndiX is installed. You can also take applications compiled on the IndiX system and use them without any problem on your machine.

> Compatibility of clients using the NCST X11 library with `stock' X servers is broken because of a change to XQueryTextExtents(): in the NCST system, the text string sent over to the X server is assumed to be in UTF-8 format and is first converted to UCS-2 by their X11 library. Thus the bytes (in UCS-2 format) that get sent out will be quite different from what the client passed in. The NCST X server will deal correctly with this UCS-2 encoded data, but stock X servers will not.

Err! Please look carefully at the source code of xc/lib/X11/QuTextExt.c in the original XFree86. It also first converts the string into UCS-2 before sending the request to the X server. The only difference in the conversion is that the original code pads an extra byte onto each element of the string to make it UCS-2, whereas we assume the incoming sequence is UTF-8 and convert it into UCS-2. The changes made in IndiX earlier broke compatibility with other languages such as French and German (all with iso-8859-* encodings), but I am now taking care of this as well.

> Nice system, nice code; unfortunately not compatible with the X Window System protocol.

In my view, it is compatible with the X Window System protocol.

Thanks, Keyur |
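The two conversions being compared in this message can be put side by side. The following is only a rough Java sketch, not the Xlib or IndiX code: the class and method names are hypothetical, and the "stock" behaviour is modelled purely as the thread describes it (one zero byte padded onto each input byte).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; neither method is taken from XFree86 or IndiX.
final class ExtentsBytes {
    // "Stock" conversion as described in the thread: every input byte becomes
    // one 16-bit value whose high byte is zero.
    static byte[] padToTwoBytes(byte[] in) {
        byte[] out = new byte[in.length * 2];
        for (int i = 0; i < in.length; i++) {
            out[2 * i] = 0;          // high byte
            out[2 * i + 1] = in[i];  // low byte is the original byte
        }
        return out;
    }

    // IndiX-style conversion as described in the thread: treat the input as
    // UTF-8 and emit big-endian 16-bit code units (equal to UCS-2 for
    // characters in the Basic Multilingual Plane).
    static byte[] utf8ToUcs2(byte[] in) {
        String decoded = new String(in, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_16BE);
    }
}
```

For pure ASCII input the two methods produce the same bytes. For the UTF-8 bytes E0 A4 95 (Devanagari KA) the first yields three 16-bit values and the second yields one, and a lone ISO-8859-1 byte such as E9 is not valid UTF-8 at all, which is one way to picture the iso-8859-* breakage mentioned above.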
From: <jk...@Fr...> - 2002-02-08 04:47:20
|
> He has some very compelling reasons for doing things the way he did, > which he has been fighting out with the GTK/Pango people on his own. > I tend to agree with his view to a certain extent. I'm sure many of us would be interested in knowing the rationale behind the NCST design -- in particular, breaking the X11 protocol was an extreme step which I'm curious about. Regards, Koshy <jk...@fr...> |
From: Guntupalli K. <kar...@fr...> - 2002-02-07 15:33:43
|
On Thu, 7 Feb 2002 19:24:13 +0530 "Tapan S. Parikh" <ta...@ya...> wrote: > Hi Guys, > > Actually I sat with the lead developer of the NCST project, Keyur > Schroff, who is a young, nice and knowledgeable guy, for over an > hour today. He seems to have done almost all of the work on the > port himself. > > Whatever the complaints, I must say his port is very robust, user / > programmer friendly, and renders beautifully. > I agree ( after having gone through the opentype code ). This part could be reused in Gtk/Pango, & Qt. Also it can be used, if not the desktop , in niche segments, where using KDE/GNOME for just indian lang support would be too heavy, eg. kiosks > He has some very compelling reasons for doing things the way he did, > which he has been fighting out with the GTK/Pango people on his own. > I tend to agree with his view to a certain extent. > Interested to hear, I too had couple of them. Regards, Karunakar |
From: Tapan S. P. <ta...@ya...> - 2002-02-07 13:52:41
|
Hi Guys,

Actually I sat with the lead developer of the NCST project, Keyur Schroff, who is a young, nice and knowledgeable guy, for over an hour today. He seems to have done almost all of the work on the port himself. Whatever the complaints, I must say his port is very robust, user / programmer friendly, and renders beautifully. He has some very compelling reasons for doing things the way he did, which he has been fighting out with the GTK/Pango people on his own. I tend to agree with his view to a certain extent. I'll let him explain that; I told him to join the list. He may also be posting the project onto SourceForge, which would be great...

Here is my reply to the earlier query:

> Please explain why you have decided to store data in ISCII format in the
> RDBMS. Why not Unicode? I agree with you on Binary fields for we would
> not be able to sort, search and index.

Good point on storing data directly in Unicode. The reason is that we have been using the CDAC GIST controls, much to our continual chagrin. Unfortunately Mithi came out with their line too late for our project (development has been going on for one year and is due to deploy within the month). But we had such a bad time with CDAC that we did pick up Mithi fonts and web APIs very late in the development, and they have worked great. Actually though, Unicode may not even have solved the problem, because many RDBMSs don't support Unicode yet either...

Best, Tapan |
From: sunil <su...@in...> - 2002-02-07 13:31:30
|
-----Original Message----- From: Guntupalli Karunakar <kar...@fr...> > Is (1) intended to be done remotely, I mean something like a web > based content development ? Yes, exactly! Now you know where we are coming from. So we are not sure what platform or browser the client is using. > > 1. http://www.hexidec.com/ekit.php. > This seems to be a good one. This is using Java 2. But it is very comprehensive work including internationalization. I guess this is where we will start. Also thanks for the links on 1. Java Input Method Engine from National University of Singapore. This is an alternative starting point if we want to be Java 1.0 compliant. What do other members on this list feel. Should we go with Java 1.2 or should we be backward compatible with older browser with older JVMs. 2. Java Input Method Framework from Sun.com. 3. http://ibm.nsysu.edu.tw/INET98/5f/5f_2.htm -> this is giving me a 404 Another starting point would be the 'Java Input Method Engine' developed by the > we can automate the translation process a bit. Does running your app, > on top of indix or iitm X work ? Thanks for all the details on the translation project. You have my every good wish. I can imagine how difficult the task would be. W.r.t to our app I did not try it yet on indix or iitm X. Since I am focussing on a web interface I guess I will focus on the Java applet solution within the browser. However we will continue to test X-interfaces. I cannot promise to do any direct work in this area because we have no X competence. I don't understand most of Koshy's evaluation mails. But the importance of this work does not escape me especially for users at a village level using applications such as Open Office. -----Original Message----- From: "Tapan S. 
Parikh" <ta...@ya...> > In reality if I wanted to be as clean as possible I should store all > ISCII text in binary fields, since the DBMS / Driver doesnt seem like it is meant Thanks for explaining the problem you faced with JDBC drivers for MS Access and MS SQL Server. I guess I will be requiring a similar work arounds shortly. Please explain why you have decided to store data in ISCII format in the RDBMS. Why not Unicode? I agree with you on Binary fields for we would not be able to sort, search and index. We are using Mithi.com's India Interactive [Software Development Toolkit] for one of our projects. We will be storing UNICODE data in MySQL. Will report back later with our experience with search, sort and index - for Hindi, Gurmukhi and Telegu. -----Original Message----- > From: ni...@vi... > We are now entering into designing phase, The major problem we phased > during our r&d is that a JApplet can only run with java1.2 compatible > web browser. I guess that is OK. But it means a huge download for both Mozilla and MSIE. > While technically it is possible to use the Swing components in any > Java 1.1-based browser, realistically is another story. The JAR file of > Swing classes is approximately 2 MB, which would be required to be > downloaded every time a user needed to run the applet. Unless you > required your users to manually install the classes locally, this isn't > usually a realistic option. The best alternative is to have your users > install the Java Plug-in through html code. The white paper from the 'National University of Singapore' says that AWT widgets are peer components and rely on the host platforms widget functionality. So should we be going with Swing since they are independent of the host platform. But what can we do about Nitin's concern w.r.t the download? Whether we choose to go with Plugins or Classes. If the big download is non-negotiable then I guess we will have to live with it. 
Let me run you through the architecture of the Input method:

- 2 dimensional lookup array of Strings [to map user keystrokes to corresponding character codes]. Or alternatively Tree based implementation of the same
- Java bitmap fonts or Unicode fonts [Using fonts installed on the Host platform]
- Widgets Package
- Font Interface Package

Thanks, Sunil |
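The keystroke lookup in the first item could look roughly like the Java sketch below. It is only an assumption about how such a table might be used: a flat map stands in for either the 2-dimensional array or the tree, the key sequences ("k", "kh", "g") are a made-up romanised layout, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a keystroke-to-character lookup with longest match.
final class KeyMapSketch {
    private static final Map<String, String> TABLE = new HashMap<>();
    private static final int MAX_KEY_LEN = 2;  // longest key sequence in TABLE

    static {
        // Made-up key sequences mapped to Devanagari code points.
        TABLE.put("k", "\u0915");   // KA
        TABLE.put("kh", "\u0916");  // KHA
        TABLE.put("g", "\u0917");   // GA
    }

    // Converts typed keys to characters, preferring the longest key sequence
    // that matches at each position; unmapped keys pass through unchanged.
    static String convert(String keys) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < keys.length()) {
            boolean matched = false;
            int longest = Math.min(MAX_KEY_LEN, keys.length() - i);
            for (int len = longest; len >= 1; len--) {
                String hit = TABLE.get(keys.substring(i, i + len));
                if (hit != null) {
                    out.append(hit);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(keys.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

A tree (trie) would give the same longest-match behaviour without building substrings, which may matter more on the older JVMs discussed above.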
From: Guntupalli K. <kar...@fr...> - 2002-02-06 13:05:11
|
On Tue, 29 Jan 2002 16:33:40 +0530 "sunil" <su...@in...> wrote:

> Questions and replies to Koshy, Harsha and Karunakar
>
> On Koshy's <ko...@fr...> mail:-
> I agree with your division of the project into (1) a widget or
> framework that allows WYSIWYG creation of HTML pages including
> tables. (2) an input method that can handle indian languages

Is (1) intended to be done remotely, I mean something like web-based content development?

> My options are...
> 1. http://www.hexidec.com/ekit.php.

This seems to be a good one.

> On Karunakar's <kar...@fr...>:-
> > Their work is mainly in foll.
> > ISCII plugin - view ISCII based web pages,
> > using locally installed fonts , supports many popular fonts ,
> > including CDAC ISFOC & bharatbhasha's shusha fonts.
>
> Does this completely address 'display' of local language content for
> Mozilla/MSIE on Window and *nix.

To a large extent. I have put the Linux version at http://www.indlinux.org/downloads/iscii_plugin_linux.tar.gz. In the *.rc files the font names need a little change: DV TTYogesh DV1 TTYogesh /.netscape/plugin-iscii/dvngri.map ALLOWED_PUNCTUATION_MARKS = ! ' ` ( ) / : ; DV TTYogesh becomes dv_ttyogesh. This is to get the font name properly on Linux.

> > Anusaaraka -
> > Machine transalation for indian languages. Right now only indlang
> > to indlang supported, English to hindi in development. English -
> > indian lang dictionaries in ISCII format.
>
> Is this translation or transliteration? Sorry to ask a dumb
> question. What is time frame on this project.

Translation. Transliteration, AFAIK, is writing a language in a script other than the native one, like ITRANS / JTRANS, which write Hindi using the Latin script. They have built up a big database of Indian language words. Based on rules, Anusaaraka processes the input text, searches the database for suitable translations of the words, and gives output in a readable format in the target language.
Contact Vineet Chaitanya <vc at iiit.net> & Amba Kulkarni <amba at iiit.net> for more details about their plans. Also I have put up the files I had at http://www.indlinux.org/downloads/ANU.tar.gz > > Can you give us more details of what you are are doing. Also can to > tell us why you are doing this? Are you converting the dictionaries > for 'Anusaaraka'? We are newbie Java and Indic-computing people. > Where to we start with the work done by IIT. You must know by now > what our end goal is. > I want to use the dictionary for GNOME/KDE translations. Once I have a basic set of words translated from the GNOME glossary (which has the most commonly used terms) and the English-Hindi dictionary, we can automate the translation process a bit. Does running your app on top of IndiX or the IITM X work? Regards, Karunakar |
From: Tapan S. P. <ta...@ya...> - 2002-02-06 11:07:18
|
Sorry, I guess I was a few days late in responding to Sunil's questions, so the context was lost. A few days ago I had posted some Java code that worked around some bugs in the JDBC driver and/or underlying DB support for storing ISCII text in varchar or text fields in SQL Server or Access. The basic problem was that when the driver read characters in the range 128..255 (i.e. ISCII text), for some reason the high-order bit became sign-extended all the way across, so that all ISCII chars were being represented as Unicode chars in the range #FF80 - #FFFF. So my code basically clips out the extra #FF, then converts the ISCII representation to the standard Unicode codes for Devanagari. In reality, if I wanted to be as clean as possible, I should store all ISCII text in binary fields, since the DBMS / driver doesn't seem to be meant to support non-ASCII text encodings. But that would take away a lot of necessary searching, sorting and indexing facilities... So the take-home point is that this is an example of the kinds of difficulties one could have when storing and retrieving Indic text in standard DBs, and that is another possible thing we should take a look at... Does anyone have any experience of whether or not these kinds of problems happen with other DBs (MySQL, PostgreSQL, Oracle, etc.)? --Tapan ----- Original Message ----- From: "Joseph Koshy" <jk...@Fr...> To: "Tapan S. Parikh" <ta...@ya...> Sent: Wednesday, February 06, 2002 4:23 PM Subject: Re: [Indic-computing-devel] Notes > > > Dear Tapan, > > > But really my main concern was to bring peoples attention to the > > kinds of hassles that may come with using non-ascii char > > representations in standard dbms packages and middleware, which is > > another hurdle we will have to overcome... > > What kinds of hassles? Could you provide some context for your email > to the list please? > > Regards, > Koshy > <jk...@fr...> |
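The workaround described in this message (the posted code itself is not part of this page) might be sketched in Java as follows. Everything here is an assumption made for illustration: the class and method names are hypothetical, and the mapping table holds only three ISCII Devanagari entries; a real table would cover the whole ISCII range and should be checked against the ISCII standard.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "clip the extra FF, then map ISCII to Unicode" idea.
final class IsciiClip {
    // Illustrative subset only: ISCII byte value -> Devanagari code point.
    private static final Map<Integer, Character> ISCII_TO_DEVANAGARI = new HashMap<>();

    static {
        ISCII_TO_DEVANAGARI.put(0xA4, '\u0905');  // letter A
        ISCII_TO_DEVANAGARI.put(0xA5, '\u0906');  // letter AA
        ISCII_TO_DEVANAGARI.put(0xB3, '\u0915');  // letter KA
    }

    // A byte in 0x80..0xFF that was sign-extended arrives as 0xFF80..0xFFFF.
    // Masking with 0xFF recovers the byte. This assumes the field holds only
    // ISCII text, so no genuine character in that range is expected.
    static int clip(char c) {
        return (c >= 0xFF80) ? (c & 0xFF) : c;
    }

    // Clips each char and maps it through the table; values not in the
    // table are passed through after clipping.
    static String decode(String fromDriver) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fromDriver.length(); i++) {
            int code = clip(fromDriver.charAt(i));
            Character mapped = ISCII_TO_DEVANAGARI.get(code);
            out.append(mapped != null ? mapped.charValue() : (char) code);
        }
        return out.toString();
    }
}
```

Plain ASCII text passes through `decode` unchanged, so the same call could be applied to every string read from an affected column.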
From: Tapan S. P. <ta...@ya...> - 2002-02-06 10:06:40
|
Hi Sunil, Actually my code is only really meant for a very specific situation (i.e. using JSP w/ SQL Server or Access w/ Merant drivers). I'm not sure if the same bug pops up w/ other DBs and other JDBC drivers. I wouldn't be surprised if it did; in that case this hack might be more generally applicable. But really my main concern was to bring people's attention to the kinds of hassles that may come with using non-ASCII char representations in standard DBMS packages and middleware, which is another hurdle we will have to overcome... Regards, --Tapan |
From: <jk...@Fr...> - 2002-02-05 10:58:44
|
Dear Sastry,

> I sincerely appreciate your efforts in trying to support Indian
> Languages on Linux.

Thank you for the appreciation of the team's work.

> After having shared the keen insights you have
> gained by looking at both IndLinux(IITM) and Indix(NCST), what are
> your final suggestions/recommendations?

I'm still looking at alternatives. Some notes:

The X protocol supports a very simple model of text rendering, namely:

- an X font is a collection of glyphs, indexed by one of two schemes: linear and 2-D.
- text drawing calls specify a starting X,Y coordinate on a drawable entity and a list of glyph indices.

The X server `draws' glyphs by placing the bits of the selected glyphs `next' to each other.

Indic scripts generally have a large number of glyph shapes, representing combinations and conjuncts of the component `characters'. The mapping between `character codes' and glyphs is complex, and is also language and character encoding specific. The NCST and IITM folks have tried to implement this mapping in the X server and X library respectively, but have sacrificed X protocol compatibility in the process.

The following broad approaches to rendering Indic text seem possible (without breaking anything, that is):

A) Render glyphs entirely on the X client side, sending the final bitmap across.

[pros]
+ will work on any X server
[cons]
- each client needs to 'know' the gory details of indian language text processing
- clients need to 'know' about specific font technologies (Type1/TTF/OpenType/...) and font encodings
- additional network traffic compared to sending just glyph indices
- no opportunity for the server to cache glyphs and fonts
- doesn't allow the X server to use hardware knowledge effectively (e.g. sub-pixel positioning of glyphs on LCD displays)
- changes the programming model at the client

B) Have the client do all the `character code' processing needed (re-ordering of glyphs, selection of glyphs for composite characters, etc.) and have it send over a list of (font specific) glyph indices.

[pros]
+ will work on any X server
+ will allow the X server to use hardware features for text rendering
+ will allow for caching of font glyphs in the X server
+ low network load
+ allows clients to be independent of the font technology used by the X server
+ client programming model does not change drastically
[cons]
- clients need to 'know' the gory details of indian language text processing (how to reorder glyphs, how to select composite glyphs, etc.)
- clients need to 'know' the font encodings used by the fonts served by the X server

GTK+Pango appears to partially(?) follow this model.

C) An X server extension specially for Indic glyph drawing. In this approach we add new drawing requests that allow text in the form of 'character codes' to be sent to the X server and have the X server process these characters appropriately, according to the language and the character encoding.

[pros]
+ low network load
+ clients can be unaware of underlying font technology
+ the X server can use hardware features for text rendering
+ {?} clients can perhaps be written to deal with 'characters', and not glyphs
+ {?} clients can perhaps be language and encoding independent
[cons]
- text metrics need a round-trip request/response from the server; this could result in dramatic slowdowns
- programming model is incompatible with the regular X model; clients require a rewrite of their text processing portions

Of these alternatives, I prefer (B). X servers in general are very good at rendering text glyphs, this being one of the most important areas of performance optimization. Option (A), i.e. client side rendering, is not able to take advantage of hardware speed-ups, and also burdens the client with a number of dependencies. The situation is not so bad that an option like Option (C) is required.

As mentioned earlier, GTK+Pango appears to follow model (B). However, it is not generic enough for my taste; even if GTK+Pango could be sped up (it currently draws strings one glyph at a time), the GTK+Pango indic algorithms appear to be specific to UNICODE. I would personally prefer a table-driven approach to reordering and character composition (like that of the Graphite system from SIL).

In summary, adding language specific reordering and aggregation rules to the client while leaving actual glyph rendering to the X server seems a promising alternative today.

> When can we expect a complete release that would support at least
> Hindi completely without breaking the compatibility with the X
> Window System protocol?

How much design/coding are you willing to do? :).

Regards, Koshy <jk...@fr...> |
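A table-driven reordering step of the kind preferred under option (B) might look like the Java sketch below. It is a deliberately simplified illustration and is neither Pango's nor Graphite's algorithm: the class name is hypothetical, the table lists only four pre-base vowel signs, the sign is moved in front of the single preceding character rather than the whole consonant cluster, and the result is still a sequence of characters that would have to be mapped to font-specific glyph indices before being sent to the X server.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: reorder logical-order text into visual order using a table.
final class VisualOrder {
    // Table of vowel signs that are drawn before their base consonant.
    // Illustrative subset: Devanagari sign I, Tamil signs E, EE and AI.
    private static final Set<Character> PRE_BASE = new HashSet<>(
            Arrays.asList('\u093F', '\u0BC6', '\u0BC7', '\u0BC8'));

    // Moves each pre-base vowel sign in front of the character before it.
    // A real shaper would move it in front of the entire consonant cluster.
    static String toVisualOrder(String logical) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < logical.length(); i++) {
            char c = logical.charAt(i);
            if (PRE_BASE.contains(c) && out.length() > 0) {
                out.insert(out.length() - 1, c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the script-specific knowledge lives in the table rather than in the loop, supporting another script or encoding would in principle mean adding table entries, which is the appeal of the table-driven approach mentioned above.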
From: Guntupalli K. <kar...@fr...> - 2002-02-05 10:17:19
|
On Mon, 4 Feb 2002 20:58:38 +0530 "Sastry Ramachandrula" <rs...@mg...> wrote: > Dear Koshy, > > I sincerely appreciate your efforts in trying to support Indian > Languages on Linux. After having shared the keen insights you have > gained by looking at both IndLinux(IITM) and Indix(NCST), what are > your final suggestions/recommendations? > > When can we expect a complete release that would support atleast > Hindi completely without breaking the compatibility with the X > Window System protocol? > Maybe something like this is needed: http://www.x.org/contrib/i18n/ Some lead developers (on KDE/GNOME, even X) have given the opinion that complex text services should be carried out at higher layers (e.g. toolkits). The Xft and Xrender mechanisms address the font issues, but not issues like text reordering, cluster formation, glyph selection etc., which is what Pango does. Some discussions have been going on regarding modifying X text support, mainly to accommodate complex texts and new font mechanisms. A team from Sun made the following posting: http://XFree86.Org/pipermail/i18n/2001-December/002727.html One font guru's opinion: http://XFree86.Org/pipermail/fonts/2001-December/001210.html FreeType's plans in reference to the above: http://www.freetype.org/pipermail/devel/2001-December/002740.html Regards, Karunakar |
From: Sastry R. <rs...@mg...> - 2002-02-04 15:40:31
|
Dear Koshy, I sincerely appreciate your efforts in trying to support Indian Languages on Linux. After having shared the keen insights you have gained by looking at both IndLinux(IITM) and Indix(NCST), what are your final suggestions/recommendations? When can we expect a complete release that would support atleast Hindi completely without breaking the compatibility with the X Window System protocol? Warm Regards Sastry |
From: <jk...@Fr...> - 2002-02-04 11:12:02
|
NCST has, at [http://rohini.ncst.ernet.in/indix/], a modified XFree86 distribution that attempts to provide support for Indic glyph rendering over X11. [[ This is a 28MB download. I wonder why they didn't (also) put up a "diff" wrt. XFree86 sources instead. ]] IMO, the plus points of this work are: - there is source code for an Indic shaping engine available for study, for tamil, gujarati, and devanagari [xc/lib/indic/]. This has been cleanly separated into a library that can be used in other contexts. - the changes seem to have been done by someone with an understanding of the way the X Window System is built. Indic language support gets "turned on" if the system is compiled with -DINDIC_SUPPORT, and this can be controlled system-wide using the standard X configuration mechanism. - the changes are complete, in that most of the places that needed to be changed, have been changed. - the code was a pleasure to read. Thank you for putting your work up, NCST! The NCST changes are, unfortunately, intrusive, and break the semantics of the X Window system protocol. The problem with the NCST design arises from confusion over character codes and the glyph indices used in X11 text drawing calls. In the X11 design, the X protocol does NOT assign any semantic meaning to the glyph indices that are passed to a PolyText or ImageText request. From the X protocol specification: ``Font: A font is a matrix of glyphs (typically characters). The protocol does no translations or interpretation of character sets. The client simply indicates values used to index the glyph array. A font contains additional metric information to determine interglyph and interline spacing.'' X Protocol Specification, Glossary, pp 154. In the NCST work however, all text strings fed into X11 text calls are assumed to be UNICODE character streams encoded in UTF-8 format. They then have a complicated sequence of checks and conversions that happen when rendering text on-screen. 
The NCST system cannot be considered an implementation of the X Window System protocol. Applications using the NCST X library will not work correctly on other X servers and applications compiled on other systems will not work correctly on the NCST X server. >> The details: Clients call `XDrawString()' and `XDrawImageString()' as usual. The X server interprets the strings passed as ranges of UNICODE characters encoded according to UTF-8. Fonts are associated with the Unicode ranges that they cover. The X server, when servicing a text drawing request, checks whether the requested characters fall in the ``Indic'' character ranges. If so, special processing is triggered, including grouping characters into ``syllables'', re-ordering of glyphs etc. The NCST X server maintains a global list of ``Indic Fonts'' pointed to by `pFirstIndicFont', which seem to be used as fall-back fonts. There are a number of changes in the device independent code (e.g:- changes in "dix.c", "main.c", new globals `FontPtr devanagariFont' etc). Compatibility of clients using the NCST X11 library with `stock' X servers is broken because of a change to XQueryTextExtents(): in the NCST system, the text string sent over to the X server is assumed to be in UTF-8 format and is first converted to UCS-2 by their X11 library. Thus the bytes (in UCS-2 format) that get sent out will be quite different from what the client passed in. The NCST X server will deal correctly with this UCS-2 encoded data, but stock X servers will not. "xc/lib/indic/" contains sources for an OpenType capable rendering engine based on FreeType. I have not reviewed this code, since I do not at this point have a clear idea of how OpenType rendering works. >> Summary Nice system, nice code; unfortunately not compatible with the X Window System protocol. Regards, Koshy <jk...@fr...> |
From: Guntupalli K. <kar...@fr...> - 2002-02-04 07:56:22
|
On Mon, 04 Feb 2002 12:32:09 +0530 "sunil" <su...@in...> wrote: > Dear Team, > > Please see Nitin Thora's mail below. Nitin Thora is a teacher at > VIIT. The team at VIIT have commenced design for the Java applet. I > will post the design to the mailing list for comments and review. We > need to ensure that the applet is truly multi-platform and > multi-lingual. > A few links would probably be helpful for above Java Input method Engine , a bit old one but would still be of some help http://www7.scu.edu.au/programme/fullpapers/1915/com1915.htm JIMEPlug is a Netscape Composer plugin. It allows you to input multilingual text (e.g. Chinese/Japanese/Korean characters) in your web documents when you are using Netscape Composer 4.0. http://www.irdu.nus.edu.sg/jime/jimeplug/ Web Internationalization and Java Keyboard Input Method. (This is a good one one but using old 1.1 api, link not working at present , I will mail it if I find it on my m/c) http://ibm.nsysu.edu.tw/INET98/5f/5f_2.htm Java Input Method Framework (v1.2) http://java.sun.com/products/jdk/1.2/docs/guide/intl/spec.html Same as above (but latest edition 1.4) http://java.sun.com/j2se/1.4/docs/guide/imf/index.html ( You will need to use this ) Regards, Karunakar |
From: sunil <su...@in...> - 2002-02-04 07:03:31
|
Dear Team, Please see Nitin Thora's mail below. Nitin Thora is a teacher at VIIT. The team at VIIT have commenced design for the Java applet. I will post the design to the mailing list for comments and review. We need to ensure that the applet is truly multi-platform and multi-lingual. Nitin: Please join indic computing on source-forge.net Koshy: Please add golisoda to the team of developers + Have you got the Kannada vowelfonts...is the EPS format Ok. Shall we proceed with all the other alphabets. Sunil -- Sunil Abraham Team Leader - MAHITI Info-tech for the Voluntary Sector India Cares, Vijay Kiran 314/1, 7th Cross, Domlur Bangalore - 560 071. Karnataka. India Pager: +91 80 9624 279519 Ph/Fax: +91 80 5352003, 5350035 E-mail: su...@ma... Web: http://www.mahiti.org -----Original Message----- From: ni...@vi... To: <su...@ma...> Date: Mon, 4 Feb 2002 11:13:59 +0530 Subject: From VIIT > Dear Sunil, > > Hello! Sorry for the delay in writing to you, we were busy in the > project assignments to the students. We have started working on the > applet project and basic framework has been complited. We have decided > the api's and classes that are to be used for building the editor and > now we are deciding the core design part. I will consult to you if we > will stuck some where. > > Vasant sir called me today regarding the contents, he said that Dr. > Surendran is having some contents and now he will be talking to Goje > sir regarding next event. > > As a result of training on Zope, we are glad to inform you that we > have undertaken one project for ERP of Vidya prathisthan and one group > is working on that. > > I will be regular contact with you for RTFEditor project and I will > give you regular feedback for that. At this moment two students and > myself are working on that and I hope with in a couple of week we will > be in a position to show some good results. 
> > thanks and regards, > > Nitin Thora > > ___________________________________ > V I I T, http://www.viitindia.org > |
From: Guntupalli K. <kar...@fr...> - 2002-02-01 12:10:11
|
Begin forwarded message: Date: Thu, 31 Jan 2002 09:04:19 +0100 From: Christof Pintaske - Sun Germany - ham02 - Hamburg <Chr...@su...> To: de...@l1... Newsgroups: openoffice.l10n.dev Subject: [l10n-dev] Changes for Hindi Hi all, we've received changes to support the Hindi language. I would like to invite everybody interested in CTL language support to have a look at the modules and discuss about it. My main question is how much other languages can benefit from the solution or how much it would hinder them. Of course I'll like to hear about whether the implementation is appropriate. you'll find the submitted files at http://l10n.openoffice.org/servlets/ProjectDocumentList you'll find the original files at http://graphics.openoffice.org/source/browse/graphics/svx/source/editeng/impedit2.cxx http://gsl.openoffice.org/source/browse/gsl/vcl/source/gdi/outdev3.cxx http://gsl.openoffice.org/source/browse/gsl/vcl/win/source/gdi/salgdi3.cxx http://gsl.openoffice.org/source/browse/gsl/vcl/inc/salgdi.hxx http://sw.openoffice.org/source/browse/sw/sw/source/core/crsr/crsrsh.cxx http://util.openoffice.org/source/browse/util/tools/source/generic/toolsin.cxx and last not least, thanx for submitting ! best regards Christof --------------------------------------------------------------------- To unsubscribe, e-mail: dev...@l1... For additional commands, e-mail: dev...@l1... |
From: Arun S. <ar...@sh...> - 2002-02-01 04:18:56
|
On Thu, Jan 31, 2002 at 02:39:58AM -0800, Joseph Koshy wrote: > - if the glyph index being passed in is greater than 0xA0 and less > than 0xFF, the routine tries to use the ``indian font'', if not it > uses the ``normal font''. A number of XSetFont() calls can thus > result ``behind the scenes'' when XDrawString() is invoked. > This sounds very similar to indiX. http://rohini.ncst.ernet.in/indix/ (click on Technical details). -Arun |