Re: [Indlinux-group] Use OpenType as an Intelligent font (not as a Digital font) for Indic


On 24 July 2011 16:12, Arjuna Rao Chavala <arj...@gm...> wrote:

> Digital font also has context sensitive substitutions, in my
> understanding. Of course, your approach seems to imply a simple
> implementation.
>
Consider Devanagari Ka Halant Pa. In an OpenType font the GSUB tables are
in terms of Glyph IDs (GIDs). In the digital font approach, the whole
syllable is analysed. Pa is deemed to be the base character, and Ka Halant
is prebase and is to be displayed as a half form. The character string is
first converted to a string of GIDs (using the character map, or cmap, table
of the font) and then tagged with Feature flags such as HALF. So the
feature-tagged string given to the font will look like HALF KaGid HalantGid
/HALF PaGid. If the half-form substitution table in the font has no context
and is tagged with the HALF flag, then you only need to give the font
HALF KaGid HalantGid /HALF, and it will be converted to HalfKaGid. Then
HalfKaGid and PaGid can be rendered to display the form of Ka Halant Pa from
this font.

If the half-form substitution table in the font has a post-context of
another consonant, then the input to the font will have to be HALF KaGid
HalantGid /HALF PaGid.
Nothing prevents the addition of context to the substitution table, but then
you pay the price of passing the context character too.
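
The difference between the two cases above can be sketched in a few lines
of Python. This is only an illustration, not IndiX-II or OpenType code: the
glyph names stand in for numeric GIDs, and apply_half, its arguments and the
CONSONANTS set are hypothetical names made up for this example.

```python
# Hypothetical glyph names stand in for numeric GIDs.
CONSONANTS = {"KaGid", "PaGid", "ShaGid"}


def apply_half(glyphs, tagged, need_post_context):
    """Apply a single HALF rule (KaGid HalantGid -> HalfKaGid).

    glyphs            -- list of glyph names given to the "font"
    tagged            -- set of indices carrying the HALF feature flag
    need_post_context -- True if the rule requires a following consonant
    """
    out = []
    i = 0
    while i < len(glyphs):
        is_pair = glyphs[i:i + 2] == ["KaGid", "HalantGid"]
        in_feature = i in tagged and (i + 1) in tagged
        context_ok = (not need_post_context) or (
            i + 2 < len(glyphs) and glyphs[i + 2] in CONSONANTS
        )
        if is_pair and in_feature and context_ok:
            out.append("HalfKaGid")
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


# Rule without context: the tagged pair alone is enough.
print(apply_half(["KaGid", "HalantGid"], {0, 1}, need_post_context=False))
# Rule with post-context: PaGid has to be passed along as well.
print(apply_half(["KaGid", "HalantGid", "PaGid"], {0, 1}, need_post_context=True))
```

With the context-dependent rule, passing only the tagged pair leaves the
glyphs unchanged, which is the price mentioned above.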

The earlier versions of OpenType Indic fonts from Microsoft (Mangal for
example) used to have contexts along with Features. Context dependent
substitutions are needed for Intelligent fonts and with those early
Microsoft fonts, it was possible to exercise the font as an Intelligent
font. With Intelligent fonts, feature tags are immaterial. So we use a
Feature flag that represents all the features. Concrete representation may
be a feature with value 0. Let us call this Feature ALL. Then the input
string to the font is given as ALL KaGid HalantGid PaGid /ALL. You do not
worry about which Feature flag to set and on which GIDs each has to be set!

>
> Currently, most development work  on indic pango is being contributed
> by Behdad.  If the existing fonts can work without and a sample
> implementation can be done that integrates with Pango or QT,  your
> proposal can  gain traction.
>
If the existing fonts have context-sensitive substitutions and the
substitutions are in proper order then they will work as Intelligent fonts
without Feature flags (or with the default ALL feature flag) as the example
of Ka Halant Pa shows.

The order of the substitution tables is important too. Consider Ka
Halant Sha. If the font has an akhand table KaGid HalantGid ShaGid ->
KShaGid then this table has to precede the Half substitution tables that
transform KaGid HalantGid -> HalfKaGid.

If the font has an akhand substitution table with the entry HalfKaGid ShaGid
-> KShaGid then the akhand table should come after the Half substitutions.
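
A small Python sketch of why the order matters. It is a toy model only: the
tables are context-free here to keep it short, the glyph names stand in for
GIDs, and substitute and shape are hypothetical helpers, not part of any
real library.

```python
def substitute(glyphs, pattern, replacement):
    """Replace every occurrence of `pattern` (a tuple) with one glyph."""
    out, i, n = [], 0, len(pattern)
    while i < len(glyphs):
        if tuple(glyphs[i:i + n]) == pattern:
            out.append(replacement)
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(glyphs, tables):
    """Run the substitution tables over the glyph list in the given order."""
    for pattern, replacement in tables:
        glyphs = substitute(glyphs, pattern, replacement)
    return glyphs


AKHAND = (("KaGid", "HalantGid", "ShaGid"), "KShaGid")
HALF = (("KaGid", "HalantGid"), "HalfKaGid")
AKHAND_FROM_HALF = (("HalfKaGid", "ShaGid"), "KShaGid")

syllable = ["KaGid", "HalantGid", "ShaGid"]
print(shape(syllable, [AKHAND, HALF]))            # akhand first: conjunct formed
print(shape(syllable, [HALF, AKHAND]))            # wrong order: half form wins
print(shape(syllable, [HALF, AKHAND_FROM_HALF]))  # akhand defined on the half form
```

In the second call the Half table consumes KaGid HalantGid before the akhand
table sees them, so the conjunct is never formed.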

IndiX-II does not standardize the ordering of substitution tables in the
Intelligent font. There could be many valid orderings. However, we have
designed and implemented one ordering, and the fonts are available under GPL
in binary and VOLT source form from http://www.cdacmumbai.in/projects/indix/

The substitution tables in the IndiX-II fonts are tagged with the Feature
flags defined by Microsoft at that time (around 2005). This was done in the
hope that these fonts would be able to work with Microsoft shaping software
as well. Microsoft Typography decided that ordering tables within the font
is a highly error-prone process and moved the ordering out of the font and
onto the Feature flags. So in the Ka Halant Sha example, Microsoft would say
that the AKHND feature has to be set on the GIDs first and the font
exercised, and then, if it is not applicable, the HALF feature is set and
the font is tried again.
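
The feature-ordered approach described above can be sketched as follows.
This is a simplified model, not Microsoft's shaping engine: a real shaper
sets features on particular glyphs, while here each feature is simply
applied to the whole syllable. FONT_LOOKUPS, SHAPER_FEATURE_ORDER and
shape_by_feature are hypothetical names, and the feature labels follow the
spelling used in this mail.

```python
def substitute(glyphs, pattern, replacement):
    """Replace every occurrence of `pattern` (a tuple) with one glyph."""
    out, i, n = [], 0, len(pattern)
    while i < len(glyphs):
        if tuple(glyphs[i:i + n]) == pattern:
            out.append(replacement)
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out


# The font may list its tables in any order; each is tagged with a feature.
FONT_LOOKUPS = {
    "HALF": [(("KaGid", "HalantGid"), "HalfKaGid")],
    "AKHND": [(("KaGid", "HalantGid", "ShaGid"), "KShaGid")],
}

# The ordering lives in the shaping software, not in the font.
SHAPER_FEATURE_ORDER = ["AKHND", "HALF"]


def shape_by_feature(glyphs):
    """Exercise the font once per feature, in the shaper's fixed order."""
    for feature in SHAPER_FEATURE_ORDER:
        for pattern, replacement in FONT_LOOKUPS.get(feature, []):
            glyphs = substitute(glyphs, pattern, replacement)
    return glyphs


print(shape_by_feature(["KaGid", "HalantGid", "ShaGid"]))
print(shape_by_feature(["KaGid", "HalantGid", "PaGid"]))
```

Even though HALF is listed first in the font's dictionary, the akhand
conjunct is still formed, because the shaper decides the order.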

So if the existing font does not have context-sensitive substitutions, or if
the order of its substitution tables is incorrect, then the font cannot
be used as an Intelligent font.

The IndiX-II live CD has a tutorial on how to use the IndiX-II libraries to
convert from a sequence of characters to the corresponding sequence of
Glyph IDs. The IndiX-II interfaces follow the ICU approach with three
changes:

1) The basic IndiX-II routines work on a syllable and not a string. Of course
higher level interfaces that work on character strings can be built on top
of the syllable-oriented interface.

2) IndiX-II has a routine that gives the visual syllable of characters
from the logical one. For example, in Devanagari (Ka IMatra) is a logical
syllable and (IMatra Ka) is the visual syllable.

3) Positioning information (for marks like eMatra, reph, uMatra) is given by
spacer glyphs in IndiX-II. So the output of IndiX-II is a stream of glyphs,
unlike the string of glyphs with associated positioning information for each
glyph given by ICU.
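
As an illustration of point 2, a logical-to-visual conversion for a single
syllable could look like the sketch below. This is not the IndiX-II routine;
visual_syllable and PREBASE_MATRAS are hypothetical names, and only the
IMatra case from the example above is handled.

```python
# Characters that are written before the base although they follow it logically.
PREBASE_MATRAS = {"IMatra"}


def visual_syllable(logical):
    """Return the visual order of one syllable given in logical order."""
    pre = [c for c in logical if c in PREBASE_MATRAS]
    rest = [c for c in logical if c not in PREBASE_MATRAS]
    return pre + rest


print(visual_syllable(["Ka", "IMatra"]))
print(visual_syllable(["Ka", "Halant", "Pa", "IMatra"]))
```

Working on one syllable at a time keeps the reordering simple, since the
matra can only move to the front of its own syllable.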

> Hope this will  also help  in transforming CDAC in releasing font
> related resources under  public domain/Copyleft  license  modes
>

The software and font resources developed under the publicly funded DIT
project have been available under GPL, in source and binary form, for both
the software and the fonts since 2005. I hope all centres of C-DAC too
release their script-related resources in the same way. This will also help
the Free Software community to use those fonts and adapt them to the
Intelligent font model.

Let me suggest another improvement too. OpenType fonts are too heavy and not
needed for low resolution devices like mobile phones. For this domain, it is
better to use a font technology like INSFOC (a sort of 8-bit font encoding
scheme developed by DIT and C-DAC). Note that we do not suggest that we
replace Unicode with these font encodings. What we say is that the text
should always be in Unicode. But instead of the GIDs being dug up from an
OpenType font through a complex process, the GIDs can be generated from the
Unicode to INSFOC code tables. INSFOC fonts in which the corresponding glyphs
have the INSFOC codes as their GIDs can easily be prepared.

This approach was studied under the Janabhaarati project under the name
Setu. Siji Sunny and others have extended the scheme to Gujarati. They have
plans to extend it to all the 9 Indic scripts. Again this is an
Intelligent processing approach with substitution tables standardized and
taken out of the fonts. This approach will also simplify the use of existing
fonts like TrueType fonts that do not have any substitution or positioning
tables. With this Setu approach, the GIDs of the glyphs will have to be set
to the ones defined in the Setu standard.

Cheers
vinod kumar
-- 
पृथिवी सस्यशालिनी
the earth be green