[Indic-computing-devel] FW: Displaying the languages of the Indian subcontinent. (derives from Re: P
From: Andy W. <And...@bt...> - 2003-03-05 13:45:34
I am forwarding this from the 'Unicode list':

[from William Overington]

Firstly, I mention that I am not a linguist and do not write to make a linguistic comment at all.

As some readers of this mailing list may know, I am very interested in interactive television, in particular the DVB-MHP (Digital Video Broadcasting - Multimedia Home Platform) system, which uses Unicode.

According to the specification for the DVB-MHP system, which can be downloaded from the http://www.mhp.org website, fonts for the DVB-MHP system, which can be broadcast, are to be in the PFR0 system, Portable Font Resource version 0. Some time ago I obtained some details of that system and looked through them, but did not follow all of the details. As the system seems to date from the early 1990s, it seems entirely possible that the PFR0 system does not support the mechanism which allows a font to substitute a particular glyph for a sequence such as the U+0985 U+09CD U+09AF which Michael mentioned in his reply to Andy, quoted above. [A Bengali Unicode sequence]

It would therefore seem that the DVB-MHP interactive television system, which is a system for worldwide use, may come up against considerable rendering problems when it comes to making broadcasts using the languages of the Indian subcontinent.

I am seeking to resolve that problem by devising an infrastructural tool to program round it: received Unicode text would be preprocessed in the television receiver before it is passed to the font, so that facilities for quality typography for the languages of the Indian subcontinent exist with the DVB-MHP platform.

Is this a problem particular just to interactive television or is it a wider problem?

I made a suggestion for a eutocode typography file in the following web page.
http://www.users.globalnet.co.uk/~ngo/ast03300.htm

Whether that use of some of the code points of the Private Use Area by a user community were adopted in some scenarios (for example with PFR0 fonts in interactive broadcasting), or whether the glyphs were numbered in some other sequence within a font, I am putting forward for discussion the question of whether it might be useful to produce a list of ligatures for the languages of the Indian subcontinent, such that each ligature has an index number in an ordered sequence from 1 upwards. Those numbers could then be a standard way of accessing glyphs within fonts or within systems such as a eutocode typography file.

It may be that any particular application of such a list would add an offset constant to the list number during processing, for example hexadecimal EC00 for a eutocode typography file, or maybe 500 for an advanced format font. The idea would be that the glyph for a particular ligature, for, say, Tamil, would always be at position XYZ relative to the start of the list.

This would mean that substitution tables for rendering from a Unicode sequence to a displayable glyph could become portable rather than font specific, so there might, in time, be a great saving of duplicated effort in having such a numbered list of ligature glyphs.

I emphasise that I am not in any way suggesting using Private Use Area codes for (italics) interchange (/italics) of text in these languages. I am simply suggesting that the production of fonts and other software systems that carry out glyph substitution for particular Unicode sequences could be made a more portable process if such a list were to exist.

Is there interest in such a numbered list of ligature characters being produced?
As I say, I am not a linguist so I could not carry out the task myself. For some of the readers of this mailing list, though, the task might be fairly straightforward, while necessarily taking a substantial amount of effort, if there is interest in such a list being produced. Once done, the list would have long term usefulness.

Spaces for the numbering could perhaps be allocated in the same order as the various languages of the Indian subcontinent are encoded within the Unicode Standard. Clearly expert guidance is needed as to how many ligatures exist for any particular language.

The list would also be a useful index for glyphs in a "glyph library" of designs.

I was interested to read in a recent thread in this forum of the founding of the International Font Technology Association (IFTA) and wonder whether that organization would be an appropriate body to produce such a list, if there should be interest in its production.

I would be pleased to know the views of people within this group as to whether such a list would be of advantage to typographers and others involved in computerized typography.

William Overington
3 March 2003
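
For readers who find the numbered-list idea easier to follow in code, here is a minimal sketch of how a portable substitution table with an application-specific offset might work. Everything in it is an assumption made for illustration: the names LIGATURE_INDEX, glyph_number and preprocess are hypothetical, the index numbers 1 and 2 are placeholders rather than any agreed list, and the second table entry is an extra Bengali conjunct added only so that the table has more than one row. Only the sequence U+0985 U+09CD U+09AF and the offset hexadecimal EC00 come from the message.

```python
# Hypothetical portable ligature list: Unicode sequence -> list number (1 upwards).
# The numbering is a placeholder for illustration, not an agreed standard.
LIGATURE_INDEX = {
    "\u0985\u09CD\u09AF": 1,  # the Bengali sequence quoted in the message
    "\u0995\u09CD\u09B7": 2,  # KA + VIRAMA + SSA, an illustrative extra entry
}

# Offset suggested in the message for a eutocode typography file (Private Use Area).
EUTOCODE_OFFSET = 0xEC00


def glyph_number(sequence, offset, table=LIGATURE_INDEX):
    """Return offset + list number for a listed sequence, or None if unlisted."""
    index = table.get(sequence)
    return None if index is None else offset + index


def preprocess(text, offset=EUTOCODE_OFFSET, table=LIGATURE_INDEX):
    """Replace each listed sequence in text with the single character
    chr(offset + list number), trying longer sequences first.
    Characters that start no listed sequence pass through unchanged."""
    sequences = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for seq in sequences:
            if text.startswith(seq, i):
                out.append(chr(offset + table[seq]))
                i += len(seq)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because the table maps sequences to list numbers rather than to font-specific glyph identifiers, the same table could in principle serve a receiver that uses the EC00 offset and a font tool that uses some other offset such as 500, with only the offset constant changing. The output of preprocess is meant for local display only, in keeping with the point above that Private Use Area codes are not proposed for interchange.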