[Indic-computing-devel] FW: Displaying the languages of the Indian subcontinent. (derives from Re: Please see my latest proposal)

I am forwarding this from the Unicode list:

[from William Overington]

Firstly, I mention that I am not a linguist and do not write to make a
linguistic comment at all.

As some readers of this mailing list may know, I am very interested in
interactive television, in particular the DVB-MHP (Digital Video
Broadcasting - Multimedia Home Platform) system, which uses Unicode.

Now, from the specification for the DVB-MHP system, which can be
downloaded from the http://www.mhp.org website, it appears that fonts
for the DVB-MHP system, which can be broadcast, are to be in the PFR0
system, Portable Font Resource version 0.  Some time ago I obtained
some details of that system and looked through them, but did not follow
all of the details.  Yet, as the system seems to date from the early
1990s, it seems entirely possible that the PFR0 system does not support
the mechanism which allows a font to substitute a particular glyph for a
sequence such as the U+0985 U+09CD U+09AF which Michael mentioned in his
reply to Andy, quoted above. [A Bengali Unicode sequence]

It would therefore seem that the DVB-MHP interactive television system,
which is a system for worldwide use, may come up against considerable
rendering problems when it comes to making broadcasts using the
languages of the Indian subcontinent.  I am seeking to resolve that
problem by devising an infrastructural tool to program round the problem
by preprocessing received Unicode text in the television receiver before
it is passed to the font, so that facilities for quality typography for
the languages of the Indian subcontinent exist with the DVB-MHP
platform.
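
[Editorial sketch, not part of the original message.]  The preprocessing
step described above could look roughly like the following.  It is only
a minimal illustration under stated assumptions: the table maps the one
sequence cited in the message to a single Private Use Area code point,
and both that code point (U+EC01) and the names SUBSTITUTIONS and
preprocess are hypothetical, not taken from any DVB-MHP or PFR0
specification.

```python
# Hypothetical substitution table: Unicode sequence -> one glyph code.
# The sequence is the one cited in the message; U+EC01 is an invented
# Private Use Area slot chosen only for illustration.
SUBSTITUTIONS = {
    "\u0985\u09CD\u09AF": "\uEC01",
}


def preprocess(text, table=SUBSTITUTIONS):
    """Replace known sequences with a single code, longest match first."""
    max_len = max((len(seq) for seq in table), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so longer ligatures win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in table:
                out.append(table[chunk])
                i += length
                break
        else:
            # No sequence starts here; pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Text that contains no listed sequence passes through unchanged, so a
receiver could apply the step to all received text before handing it to
the font.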

Is this a problem particular just to interactive television or is it a
wider problem?

I made a suggestion for a eutocode typography file in the following web
page.

http://www.users.globalnet.co.uk/~ngo/ast03300.htm

Whether such glyphs were accessed through code points of the Private Use
Area agreed by a user community (for example with PFR0 fonts in
interactive broadcasting) or were numbered in some other sequence within
a font, I am putting forward for discussion the question as to whether
it might be useful to produce a list of ligatures for the languages of
the Indian subcontinent, such that each ligature has an index number in
an ordered sequence from 1 upwards.  Those index numbers could then be a
standard way of accessing glyphs within fonts or within systems such as
a eutocode typography file.  Any particular application of such a list
might add an offset constant to the list number during processing, for
example hexadecimal EC00 for a eutocode typography file, or maybe 500
for an advanced format font, yet the idea would be that the glyph for a
particular ligature of, say, Tamil would always be at position XYZ
relative to the start of the list.  This would mean that substitution
tables for rendering from a Unicode sequence to a displayable glyph
could become portable rather than font specific, so there might, in
time, be a great saving of duplicated effort.
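
[Editorial sketch, not part of the original message.]  The index-plus-
offset idea can be illustrated as follows.  The two offsets are the
examples given in the message; the list itself does not exist, so the
single entry and its index number 1 are invented for illustration, as
are the names LIGATURE_INDEX, glyph_code and build_table.

```python
# Hypothetical shared list: each ligature's Unicode sequence is given an
# index number counted from 1.  The entry and its index are invented.
LIGATURE_INDEX = {
    "\u0985\u09CD\u09AF": 1,
}

# Offsets mentioned in the message as examples.
EUTOCODE_OFFSET = 0xEC00   # eutocode typography file
FONT_OFFSET = 500          # an advanced format font


def glyph_code(sequence, offset, index=LIGATURE_INDEX):
    """Return the application-specific code: list index plus offset."""
    return index[sequence] + offset


def build_table(offset, index=LIGATURE_INDEX):
    """Derive an application's substitution table from the shared list."""
    return {seq: number + offset for seq, number in index.items()}
```

With these assumptions, glyph_code("\u0985\u09CD\u09AF", EUTOCODE_OFFSET)
gives hexadecimal EC01 and the same sequence with FONT_OFFSET gives 501,
so only the offset differs between applications while the shared list
stays the same.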

I emphasise that I am not in any way suggesting using Private Use Area
codes for *interchange* of text in these languages; I
am simply suggesting that there seems to be the possibility that the
process of producing fonts and other software systems for the carrying
out of the task of glyph substitution for particular Unicode sequences
could be made a more portable process if such a list were to exist.

Is there interest in such a list of ligature characters in a numbered
list being produced?  As I say, I am not a linguist so I could not carry
out the task, yet perhaps the task might be fairly straightforward,
though necessarily taking a substantial amount of effort, for some of
the readers of this mailing list, if there is interest in such a list
being produced. Once done, the list would have long term usefulness.
Spaces for the numbering could perhaps be allocated in the same order as
the various languages of the Indian subcontinent are encoded within the
Unicode Standard.  Clearly expert guidance is needed as to how many
ligatures exist for any particular language.

The list would also be a useful index for glyphs in a "glyph library" of
designs.

I was interested to read in a recent thread in this forum of the
founding of the International Font Technology Association (IFTA) and
wonder whether that organization would be an appropriate body to produce
such a list, if there should be interest in the production of such a
list.

I would be pleased to know the views of people within this group as to
whether such a list would be of advantage to typographers and others
involved in computerized typography.

William Overington

3 March 2003
