Re: [Indic-computing-devel] Re: NCST Indix Examined
From: <jk...@Fr...> - 2002-02-20 07:17:33
Arun,

as> That is for efficiency reasons. man XTextExtents. XQueryTextExtents
as> (something more powerful than that) was the proposed new mechanism.

And even *that* isn't being used in the Athena widget set.

Folks, please read the code before offering suggestions.  It would
help to keep the signal-to-noise ratio reasonable.

as> The proposed new algorithm:
as>
as> FindPosition(textpos, startx, pixel_width)
as>    // Make a single request to the X Server - this doesn't exist in
as>    // the X protocol yet
as>    nchars = XComputeWidth(textbuf[textpos:end-of-line], startx, pixel_width,
as>                           // other args font etc)
as>
as>    // everything starting from textpos to textpos + nchars is "selected"

Well, you've just changed your X client.  I thought you were going to
describe an algorithm that would allow X clients to work unchanged in
the presence of arbitrary glyph reordering, substitution and
positioning by the X server.

as> I just said it was inconsistent in the use of character codes vs glyph
as> codes - not that it was ambiguous or in error. This seems to be a
as> consequence of it being designed at a time, when the distinction between
as> the two was not as important as it is today.

The distinction between characters and glyphs is important even for
Latin scripts.  Consider ligatures and diacritical marks: some Latin
encodings have separate character codes for the diacritical marks, so
a "c" and a "cedilla" (two code points) together can have a different
glyph in these languages.  Similarly "f", "f" and "i" combine to form
a distinct glyph "ffi".

The X protocol was explicitly designed NOT to support these kinds of
transformations.

as> And you yourself (along with others on this list) accepted that certain
as> references were ambiguous. What's all the fuss about then ? :)

One place in the X protocol specification uses the phrase 'string of
characters'.
Now the word 'character' has (today) become an overloaded term, with
meanings ranging from the visual representation (the letterform), the
'abstract' character itself, the code point assigned to the character
in a given encoding, a specific glyph in a font, etc.  The exact
meaning is usually clear from the context.

Nowhere does the X11 protocol specification say that 'character codes'
are to be used in text drawing requests.  In fact, it EXPLICITLY
states that the semantics of character `codes' are NOT to be honored
by the X server.

If you change this, you'll end up with some other "protocol", not the
X protocol.  This new graphics "protocol" is however:

  a. inconsistent
     i.  how do you map a screen coordinate back to a position in the
         text stream if you are doing complex text rendering?
  b. incomplete
     i.  how do you specify text in a different character encoding?
     ii. how do you access glyphs in a font that do not correspond to
         a `character'?
  c. suffers from new problems
     i.  If you are indexing fonts using character codes, how do you
         use fonts that do not contain glyphs of 'letters'?  You don't
         want glyph combining and reordering happening for the glyphs
         in a symbol font, for example.

  ...etc...

as> need to come up with the pros and cons of each approach. I've given
as> several tangible advantages of implementing it on the X
as> server. Perhaps you could articulate your thoughts on why you think
as> it should be done in a client side library ?

Implementing Indic script support in the X server alone, without
changing clients, appears to be infeasible.

However, you don't need to change the X server to support Indic
scripts.  Here is one way it could work:

>> Client side Indic Rendering I

In a client side rendering model, the client transforms:

    `M' code-points -> `N' PolyText protocol requests

The client then draws glyphs on screen using the standard
PolyText/ImageText requests.
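
A minimal sketch of this transformation, written in Python rather than
against Xlib so that it runs without an X server.  Everything named
here is an assumption made for illustration: the font table, its glyph
indices and advance widths, the font name "toy-devanagari", and the
functions `shape`, `polytext_item` and `hit_test`.  The only real
script behaviour it relies on is that the Devanagari vowel sign I
(U+093F) is drawn to the left of the consonant it follows.  Because
the client does the reordering itself, it can record the source index
of every glyph and so map a pointer coordinate back into the text.

```python
from dataclasses import dataclass

# Hypothetical font encoding: code point -> (glyph index, advance in pixels).
# A real Indic font needs many more glyphs (conjuncts, half forms, ...).
TOY_FONT = {
    "\u0915": (1, 12),  # DEVANAGARI LETTER KA
    "\u0924": (2, 11),  # DEVANAGARI LETTER TA
    "\u093F": (3, 6),   # DEVANAGARI VOWEL SIGN I (drawn left of its consonant)
}
PRE_BASE = {"\u093F"}   # signs that are drawn before the preceding consonant


@dataclass
class PlacedGlyph:
    glyph: int       # index into the (hypothetical) font
    x: int           # left edge in pixels
    width: int       # advance width in pixels
    text_index: int  # position of the source code point in the text


def shape(text, start_x=0):
    """Reorder and position glyphs on the client; result is in visual order."""
    visual = []
    for i, ch in enumerate(text):
        if ch in PRE_BASE and visual:
            visual.insert(len(visual) - 1, i)  # move before the previous glyph
        else:
            visual.append(i)
    placed, x = [], start_x
    for i in visual:
        glyph, width = TOY_FONT[text[i]]
        placed.append(PlacedGlyph(glyph, x, width, i))
        x += width
    return placed


def polytext_item(font, placed, y):
    """Bundle one run as a [font, x/y-position, glyph-list] tuple."""
    return (font, placed[0].x, y, [g.glyph for g in placed])


def hit_test(placed, x):
    """Map a screen x-coordinate back to an index in the text, or None."""
    for g in placed:
        if g.x <= x < g.x + g.width:
            return g.text_index
    return None


text = "\u0915\u093F\u0924"  # KA, VOWEL SIGN I, TA in logical order
run = shape(text)
print(polytext_item("toy-devanagari", run, y=20))
print(hit_test(run, 2), hit_test(run, 10), hit_test(run, 20))
```

With the toy table above, the run goes out as glyphs [3, 1, 2] (vowel
sign first), and x = 2, 10, 20 map back to text positions 1, 0, 2.  A
real client would hand each tuple to a PolyText request instead of
printing it.
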
In this model, the client does the necessary glyph substitution,
reordering and positioning, using whatever algorithm it chooses that
is appropriate for the script it is processing.  The end result of the
transformation is a set of [font, x/y-position, glyph-lists] tuples
that would go out as protocol requests.

Further, in this model, the client has all the information required to
map an [x,y] screen coordinate returned in an X event back to a
position in the 'text' stream (since it did all the reordering,
positioning and glyph substitution).

  o this is efficient in terms of network bandwidth (glyph indices are
    sent over)
  o it doesn't break anything; you are still using the X11 protocol :)
  o it will work on every X server in the world; no need for
    extensions.
  o the X server is still doing the rendering of glyphs onto the
    screen and can apply the usual caching/pre-rendering optimizations
    done for text.
  o you can support multiple encodings (KSCLP, TSCII, UNICODE,
    whatever)
  o you can support multiple algorithms for Indic rendering

The downside:

Client side rendering requires fonts to be coded to a well-known font
encoding scheme, since the client has to transform character
code-points to lists of glyph indices and their positions.

Question to the list:

  What font encoding standards are available for Indic scripts?  How
  complete are they?  Do they cover every letterform (graphical shape)
  used by a language's writing system?

>> Client side Indic Rendering II

Another way of getting Indic rendering to work without any X server
modifications would be to have the client render glyphs onto a bitmap
and send this "final" bitmap across.
I.e., the client transforms

    `M' code points -> 1 bitmap

This doesn't have the dependency on "well-known" font encodings (in
fact the font need not be present at the X server at all) but has at
least three drawbacks:

  o sending a bitmap over is costlier than sending over glyph indices
  o the client has to do the text rendering itself, adding to its
    complexity and to the complexity of administration
  o the X server can't optimize its use of the glyphs of a font

The other characteristics are like those of ``Client Side Indic
Rendering I''.

Regards,
Koshy
<jk...@fr...>