indic-computing-devel Mailing List for The Indic-Computing Project (Page 4)

Status: Alpha

Brought to you by: jkoshy

indic-computing-devel — Discussing the development of tools for Indian language information processing

[Indic-computing-devel] handbook build on RH - some questions

From: Alok K. <alo...@so...> - 2003-11-16 04:59:32

Hi,
I'm in the process of building the handbook on Red Hat, and eventually
creating the RPMs for others to be able to do that.

Now, when I run make using the handbook makefile, which is
doc/en*/books/handbook/Makefile, I get
Makefile:137: *** missing separator. Stop.

So before I start on the chase I'd like to know if this build requires
a special make or a special anything else. The make I have is
make-3.79.1-14. gmake also gives the same error.

I suspect this comes at the first occurrence of a .include statement in 
the makefile.
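
For readers hitting the same error: `.include` is BSD make syntax, and the tool list in the website-improvements thread in this archive names "bsd make" as part of the toolchain, so GNU make stopping with "missing separator" at the first `.include` line is a plausible symptom. A minimal Python sketch for checking that suspicion follows; the function name, the directive list, and the `example.mk` file name are illustrative assumptions, not part of the handbook build.

```python
import re

# Dot-prefixed directives understood by BSD make but not by GNU make.
# The list is illustrative, not exhaustive.
BSD_DIRECTIVE = re.compile(
    r"^\.\s*(include|ifdef|ifndef|if|elif|else|endif|for|endfor|undef)\b"
)


def find_bsd_directives(makefile_text):
    """Return (line_number, line) pairs for lines that look like BSD make directives."""
    hits = []
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        if BSD_DIRECTIVE.match(line):
            hits.append((number, line))
    return hits


# Hypothetical makefile fragment, used only to exercise the function.
sample = """DOC?= book
.PHONY: all
.include "example.mk"
all:
\techo building
"""

for number, line in find_bsd_directives(sample):
    print(f"line {number}: {line}")
```

If the reported line number (137 here) shows up among the hits, the makefile most likely needs a BSD make, which on Linux is commonly packaged as `pmake` or `bmake`.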

Regards
Alok


-- 
http://9211.blogspot.com
Can't see Hindi? http://geocities.com/alkuma/seehindi.html
http://groups.yahoo.com/group/linux-bangalore-hindi/
Discuss devanagari at http://groups.yahoo.com/group/devanaagarii/
alok_kumar ऍट softhome डॉट net

[Indic-computing-devel] Unicode Collation Algorithm, version 4.0.0

From: Dr. U.B. P. <pav...@vi...> - 2003-11-04 13:36:12

------- Forwarded message follows -------
Date sent:      Mon, 3 Nov 2003 10:12:49 -0500
From:           Rick McGowan <ri...@un...>
Subject:        Unicode Collation Algorithm, version 4.0.0

We are pleased to announce the release of the 4.0.0 version of
Unicode Technical Standard #10: The Unicode Collation Algorithm
(UCA), which specifies a default sorting order and comparison
mechanism for all Unicode characters.

Major changes in this release include:
- The version of the UCA is now being synchronized with versions
  of the Unicode Standard, so that the repertoire of characters
  will be the same.
- An extensive new introduction has been added. It discusses
  important concepts that were formerly in Section 5.17 of Unicode
  3.0, but has been completely reworked for clarity and coverage.
- The Scope section has been recast and is now at the end of
  the introduction.
- The location of data files has been changed to
  http://www.unicode.org/Public/UCA/
------- End of forwarded message -------
-----------------------------------------------------
Dr. U.B. Pavanaja
Editor, Vishva Kannada
World's first Internet magazine in Kannada
http://www.vishvakannada.com/

Note: I don't worry about pselling mixtakes

[Indic-computing-devel] GNU/Linux debian port for doctoolchain

From: Cherry G. M. <ch...@sd...> - 2003-11-03 12:42:07

Hello list,

This is to announce a pre-alpha release of a Debian repository
containing a set of updated tools, and dependencies for other available
ones, required to build the indic-computing documentation.

Testing and BUG reports are most welcome.

You may download this package from

http://prdownloads.sourceforge.net/indic-computing/doc-toolchain-debian-0.5-i386.tgz?download

Best Regards,

Cherry.

--
ch...@sd...
Homepage - http://cherry.freeshell.org

Re: [Indic-computing-devel] [RFC] Docs for toolchain port to Debian.

From: Cherry G. M. <ch...@sd...> - 2003-10-22 13:13:43

On Tue, 21 Oct 2003, [iso-8859-1] Joseph Koshy wrote:

> We need to work on simplifying the procedure to build and install
> from sources.

Yep, something like doing a make build to get everything nice and
packaged. I should begin working on it pretty soon....

Besides that, Mahiti Infotech, Bangalore, has agreed that I can continue
using their bandwidth for indic development. If you're reading this:
Thank you Sunil!!

Best,

Cherry.

--
ch...@sd...
Homepage - http://cherry.freeshell.org

Re: [Indic-computing-devel] [RFC] Docs for toolchain port to Debian.

From: <a_j...@ya...> - 2003-10-21 14:32:20

> http://cherry.freeshell.org/indic-deb-doc-alpha.txt

> Comments/Critisism welcome on the indic-computing-devel list.

Thanks.

The "Binary Installation" part has been incorporated into our
"Documentation Build" article, modulo some incomplete URLs.

We need to work on simplifying the procedure to build and install
from sources.



=====
Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/
Founder/Manager/Programmer/Peon, The Indic-Computing Project
                                 http://indic-computing.sf.net


Re: [Indic-computing-devel] [RFC] indic-computing website improvements.

From: Cherry G. M. <ch...@sd...> - 2003-10-19 19:25:56

Dear Koshy, list,

I had deliberately kept from replying earlier to see how active this list
is. Rather heartening to see that there are exactly two enthusiastic
volunteers :-), one a veteran, the other a newbie ;-)

On Fri, 10 Oct 2003, [iso-8859-1] Joseph Koshy wrote:
[...]
> > critical-mass in terms of volunteers for site maintenance and content
> > management, we could think about diversifying.
>
> We are using different tools for different requirements.
>
> DocBook is being used for the 'heavy-duty' documentation that needs
[...]
> However, there seems to be only one person working on the website
> (me :)) and I find EtText more of a hindrance than a benefit.

Yes, EtText doesn't seem to be much help other than to distract one from
the content. It brings us back to WYSIWYG, rather than WYMIWYG.
I learnt it the hard way while trying to fix up my own private website
with WebMake.

> >  Can we cut down on the number of tools used for documentation
> > maintenance ? As of now, a prospective documenter has to know 6
[...]
>
> If you are writing documentation, today you need to know only
> two tools, any text editor ('vi' or 'emacs' say) for editing
> and 'make' to build everything.  You don't really need to know
> how the rest of the toolchain works -- the intent is that 'every
> thing just works'.  For documentation, we are using one SGML "format"
> namely, DocBook.
>

Hmmm, that's making the presumption that someone on the list (which boils
down to A. Joseph Koshy) will take care of the SGML. Sounds very
altruistic, but it doesn't address the basic issue of a steep learning
curve. Still, I'm sure it's the best ad-hoc solution, considering the
circumstances.....

> The website could conceivably be written using DocBook too (for
[...]
> In the end I chose plain HTML for the website's content and 'WebMake'
> to stitch the pieces of the site's content into a coherent website.
>

Yes, I think that does make sense.

> Coming to casual contributors, these folks aren't affected by the
> internals of the toolchain (or even the specifics of HTML or DocBook)
> since they can contribute using plain-text too.
>

Thanks Koshy! We need more sincere volunteers like you. I'll try and learn
the hard pieces, so that I could help with the internals as well, but
that'll take a while. I'm sure you've held the fort for too long for that
to make a significant difference. So there, that's my offer ( like the GPL
"Without even the guarantee of Merchantability or fitness for any
particular purpose........" ;-) )

> Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/
> Founder/Manager/Programmer/Peon, The Indic-Computing Project
                             ^^^^

I can see why now!

Best,

Cherry.

--
ch...@sd...
Homepage - http://cherry.freeshell.org

Re: [Indic-computing-devel] [RFC] Docs for toolchain port to Debian.

From: <a_j...@ya...> - 2003-10-10 09:50:02

> I've put up a draft article at :
> 
> http://cherry.freeshell.org/indic-deb-doc-alpha.txt

Thanks, will incorporate this into our documentation sources
and send the diff back to you for review.

Good work, Cherry!


=====
Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/
Founder/Manager/Programmer/Peon, The Indic-Computing Project
                                 http://indic-computing.sf.net


Re: [Indic-computing-devel] [RFC] indic-computing website improvements.

From: <a_j...@ya...> - 2003-10-10 09:48:34

> 1) From the perspective of a new volunteer for website maintenance:
>
>    The order of prerequisite reading is not specified.
>    Here is a suggested order:

[snip]

Suggestion taken.

>    It would be very helpful to finish section "3.1, Directory
> Structure" of the documentation design goals document.

Sure.  I guess this could be added now.

> 2) Documentation layout.
> 
>    Three formats are in use, with no apparent logical distribution in
> directories. They are, sgml, html, and ettext. It would be nice to
> make a policy decision about which format to stick to ( even if it
> makes the site slightly uglier. ) Once the project has obtained
> critical-mass in terms of volunteers for site maintenance and content
> management, we could think about diversifying.

We are using different tools for different requirements.

DocBook is being used for the 'heavy-duty' documentation that needs
"indic" stuff in it.  HTML is used only for the website.  EtText could
be dispensed with since it is used only for a very small part of the
website -- I had initially thought that being able to write "pseudo
plain text" would be of help to people willing to work on the website.
However, there seems to be only one person working on the website
(me :)) and I find EtText more of a hindrance than a benefit.

>   Can we cut down on the number of tools used for documentation
> maintenance ? As of now, a prospective documenter has to know 6
> tools namely: html, docbook sgml, ettext, webmake, bsd make, python.
> A shallower, smaller learning curve can bring in more volunteers at
> this stage. For now this is a substantial requirement for a casual
> contributor.

If you are writing documentation, today you need to know only
two tools, any text editor ('vi' or 'emacs' say) for editing 
and 'make' to build everything.  You don't really need to know
how the rest of the toolchain works -- the intent is that 'every
thing just works'.  For documentation, we are using one SGML "format"
namely, DocBook.  

The website could conceivably be written using DocBook too (for 
an example, see DocBook author Norman Walsh's home page).  I seriously
considered this when designing the infrastructure.  The plus point
of this approach was that we integrate our 'website' with the rest 
of the standalone documentation being written, but the drawback
was that DocBook isn't a good DTD for describing website content (IMO).

In the end I chose plain HTML for the website's content and 'WebMake'
to stitch the pieces of the site's content into a coherent website.

Coming to casual contributors, these folks aren't affected by the
internals of the toolchain (or even the specifics of HTML or DocBook)
since they can contribute using plain-text too.

> put down the excellent work of current volunteers, but merely as a
> means to elicit discussion.

It's been a week+ of silence since your post, so I thought I'd chip in :).


=====
Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/
Founder/Manager/Programmer/Peon, The Indic-Computing Project
                                 http://indic-computing.sf.net


[Indic-computing-devel] [RFC] Docs for toolchain port to Debian.

From: Cherry G. M. <ch...@sd...> - 2003-09-30 12:02:08

Hello list,


I've put up a draft article at :

http://cherry.freeshell.org/indic-deb-doc-alpha.txt

Pls excuse me if the linebreaks mess up with links or lynx browsers.


Comments/Criticism welcome on the indic-computing-devel list.

Thanks,

Cherry.
--
ch...@sd...
Homepage - http://cherry.freeshell.org

[Indic-computing-devel] [RFC] indic-computing website improvements.

From: Cherry G. M. <ch...@fr...> - 2003-09-30 08:37:54

Hello list,

This email is intended to be a critique of the shortcomings of the
current indic-computing documentation system. It is intended to add
constructive criticism and elicit discussion.


1) From the perspective of a new volunteer for website maintenance:

   The order of prerequisite reading is not specified.

   Here is a suggested order:

-   The Docbook guide
-   WebMake homepage documentation
-   EtText homepage documentation
-   indic-computing doc-build New Volunteer's guide.

   It would be very helpful to finish section "3.1, Directory
Structure" of the documentation design goals document.


2) Documentation layout.

   Three formats are in use, with no apparent logical distribution in
directories. They are sgml, html, and ettext. It would be nice to
make a policy decision about which format to stick to ( even if it
makes the site slightly uglier. ) Once the project has obtained
critical-mass in terms of volunteers for site maintenance and content
management, we could think about diversifying.

3) Tool usage.

  Can we cut down on the number of tools used for documentation
maintenance ? As of now, a prospective documenter has to know 6
tools namely: html, docbook sgml, ettext, webmake, bsd make, python.
A shallower, smaller learning curve can bring in more volunteers at
this stage. For now this is a substantial requirement for a casual
contributor.


Please note that the above observations/suggestions are not meant to
put down the excellent work of current volunteers, but merely as a
means to elicit discussion.

Comments welcome on the devel list.

Best,

Cherry

ch...@sd...
http://cherry.freeshell.org/

[Indic-computing-devel] Re: [Kannada] Re: Problem with freetype rendering of an Indic opentype font

From: Dr. U.B. P. <pav...@vi...> - 2003-09-10 04:07:13

> On Monday 08 September 2003 01:39, Owen Taylor wrote:
> > On Sun, 2003-09-07 at 17:51, Lars Knoll wrote:
> > > On Sunday 07 September 2003 23:37, Arun Sharma wrote:
> > > > On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote:
> > > > > > I see a problem with the freetype rendering of this unicode string:
> > > > > >
> > > > > > "0xcb7 0xccd"
> > > > > >
> > > > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows
> > > > > > XP).  In fact, the problem manifests itself with all other
> > > > > > consonants too, including 0xcb7.
> > > > >
> > > > > I don't have time to investigate but it seems to me that some
> > > > > contextual shaping is happening.  If this is true it is not a problem
> > > > > of FreeType 2 (which doesn't support OpenType directly for the
> > > > > moment) but rather a problem one level higher where Indic scripts are
> > > > > handled, probably within the Pango library.  Owen?
> > >
> > > I did some investigations and the problem seems to lie within the
> > > freetype 1 open type code used by Qt and pango. The font uses chain
> > > substitutions to achieve correct rendering of this combination and these
> > > > seem to work incorrectly. Up to now I didn't find time to dig into this
> > > in more detail.
> >
> > I just recently applied a patch to Pango for Chain Context substitutions
> > that apparently was needed for Kannada -
> >
> > http://people.redhat.com/otaylor/opentype-patches/pango-26-chain-format3
> >
> > (from Kailash C. Chowksey.) Could that fix the problem here? (That patch
> > is in Pango-1.2.5)
> 
> No, the problem was actually unrelated to the chaining substitutions.
> 
> > http://bugzilla.gnome.org/show_bug.cgi?id=118592 also comes up for
> > tunga.ttf, though we haven't come up with a final patch for that.
> 
> I have debugged the problem today, and found that the problem lies somewhere 
> else than I first guessed. tunga.ttf has a ligature for the character 
> combination quoted above. The lookup table for the ligature has a LookupFlag 
> of 0x0100, so it should skip marks that do not have a MarkAttachmentClass of 
> 0x01. The 0xccd character however is a mark according to the gdef table, but 
> has no MarkAttachmentClass defined.
> 
> Thus TT_GDEF_Get_Glyph_Property() returns 0x8 for the glyph, and the 
> Check_Property() call fails Lookup_LigatureSubst. Seems like we have to 
> ignore the high byte of the LookupFlag if the glyph has no 
> MarkAttachmentClass (the specs are not really clear about this IMO).
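
A minimal sketch of the rule proposed in the quoted text, written in Python rather than the FreeType C code. The glyph property bits are assumed to mirror the 0x8 (mark) value quoted in the message, and the flag layout follows the usual OpenType LookupFlag definition; the function name and arguments are hypothetical, and this is not the attached patch.

```python
# LookupFlag layout (OpenType): the low byte holds the ignore bits,
# the high byte holds the MarkAttachmentType filter.
IGNORE_BASE_GLYPHS = 0x0002
IGNORE_LIGATURES = 0x0004
IGNORE_MARKS = 0x0008
MARK_ATTACHMENT_TYPE = 0xFF00

# Glyph property bits, assumed to mirror the ignore bits (0x8 = mark).
BASE_GLYPH = 0x0002
LIGATURE = 0x0004
MARK = 0x0008


def lookup_skips_glyph(lookup_flag, glyph_property, mark_attach_class):
    """Return True if a lookup with this flag should skip the glyph.

    mark_attach_class is 0 when GDEF assigns the mark no attachment class.
    """
    # Low byte: skip whole glyph categories.
    ignore_bits = lookup_flag & (IGNORE_BASE_GLYPHS | IGNORE_LIGATURES | IGNORE_MARKS)
    if ignore_bits & glyph_property:
        return True
    # High byte: filter marks by attachment class.
    wanted_class = (lookup_flag & MARK_ATTACHMENT_TYPE) >> 8
    if wanted_class and glyph_property == MARK:
        if mark_attach_class == 0:
            # Relaxed rule from the message: a mark without a class is
            # not filtered by the high byte.
            return False
        return mark_attach_class != wanted_class
    return False
```

With the values from the message, `lookup_skips_glyph(0x0100, MARK, 0)` is false, so the 0xccd glyph stays visible to the ligature lookup. A strict reading that compares class 0 against the requested class 1 would skip it instead.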

What is meant by MarkAttachmentClass? When we design the 
Opentype font, normally we use Microsoft VOLT and define the 
glyphs there. As you correctly pointed out, 0xccd is defined as 
Mark. This will define the glyph as Mark in the GDEF table. How 
to add the MarkAttachmentClass? Which tool are you using to dump 
the embedded tables of the OTF? Is it TTX? 

BTW, I did not develop Tunga.ttf :-)
 
> I've attached a patch that fixes the problem and gives correct shaping of the 
> above glyph combination. 
> 
> Cheers,
> Lars


Rgds,
Pavanaja
-----------------------------------------------------
Dr. U.B. Pavanaja
Editor, Vishva Kannada
World's first Internet magazine in Kannada
http://www.vishvakannada.com/

Note: I don't worry about pselling mixtakes

[Indic-computing-devel] Unicode Workshop - Sept 24-26, 2003, New Delhi

From: Dutta A. <dab...@in...> - 2003-09-09 09:37:37

Attachments: Unicode Workshop Programme .zip

Hello Everyone,

A Unicode Workshop will be held in Delhi between Sept 24 and 26.

Anyone who wants to participate should send a mail to co...@ma... for
verification and registration.

Please include the following information -
Name:
Affiliation and Organization:
Postal address:
Phone number:
Whether you subscribe to in...@un... (Yes/No):
Interest:
Language(s) on which you have expertise (Please state if this includes
development experience also):
What follow up activities you are interested in:
Names of References (for verification):

                             Unicode Workshop

                     0900  hrs, September 24-26, 2003

              The Oak, The Park, Parliament Street, New Delhi

A detailed programme is attached for your reference. (If detached by the
list-server please contact me/MAIT for it)

(See attached file: Unicode Workshop Programme .zip)

Please come prepared. Study up on the material from the book (sections
download-able in pdf from :
http://www.unicode.org/versions/Unicode4.0.0/bookmarks.html )
Useful and considered inputs are very rare to come by !

Regards,
Abhijit
____________________________
http://www.ibm.com/software/globalization

[Indic-computing-devel] Re: [Devel] Re: [Kannada] Re: Problem with freetype rendering of an Indic opentype font

From: Owen T. <ot...@re...> - 2003-09-07 23:42:00

On Sun, 2003-09-07 at 17:51, Lars Knoll wrote:
> On Sunday 07 September 2003 23:37, Arun Sharma wrote:
> > On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote:
> > > > I see a problem with the freetype rendering of this unicode string:
> > > >
> > > > "0xcb7 0xccd"
> > > >
> > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows
> > > > XP).  In fact, the problem manifests itself with all other
> > > > consonants too, including 0xcb7.
> > >
> > > I don't have time to investigate but it seems to me that some
> > > contextual shaping is happening.  If this is true it is not a problem
> > > of FreeType 2 (which doesn't support OpenType directly for the moment)
> > > but rather a problem one level higher where Indic scripts are handled,
> > > probably within the Pango library.  Owen?
> 
> I did some investigations and the problem seems to lie within the freetype 1 
> open type code used by Qt and pango. The font uses chain substitutions to 
> achieve correct rendering of this combination and these seem to work 
> incorrectly. Up to now I didn't find time to dig into this in more detail.

I just recently applied a patch to Pango for Chain Context substitutions
that apparently was needed for Kannada - 

http://people.redhat.com/otaylor/opentype-patches/pango-26-chain-format3

(from Kailash C. Chowksey.) Could that fix the problem here? (That patch
is in Pango-1.2.5)

http://bugzilla.gnome.org/show_bug.cgi?id=118592 also comes up for
tunga.ttf, though we haven't come up with a final patch for that.

Regards,
						Owen

[Indic-computing-devel] Re: [Kannada] Re: Problem with freetype rendering of an Indic opentype font

From: Lars K. <la...@tr...> - 2003-09-07 21:55:14

On Sunday 07 September 2003 23:37, Arun Sharma wrote:
> On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote:
> > > I see a problem with the freetype rendering of this unicode string:
> > >
> > > "0xcb7 0xccd"
> > >
> > > Using Microsoft tunga.ttf (A kannada font that ships with Windows
> > > XP).  In fact, the problem manifests itself with all other
> > > consonants too, including 0xcb7.
> >
> > I don't have time to investigate but it seems to me that some
> > contextual shaping is happening.  If this is true it is not a problem
> > of FreeType 2 (which doesn't support OpenType directly for the moment)
> > but rather a problem one level higher where Indic scripts are handled,
> > probably within the Pango library.  Owen?

I did some investigations and the problem seems to lie within the freetype 1 
open type code used by Qt and pango. The font uses chain substitutions to 
achieve correct rendering of this combination and these seem to work 
incorrectly. Up to now I didn't find time to dig into this in more detail.

> ok, I did some more investigation. Here's a partial dump of the font.
> According to this rule, glyph 0x62 should've been substituted by glyph
> 0xb0. But for some reason, it's not happening.
>
> Does this ring a bell ?

The lookup table you quote below is actually a subtable of some chain 
substitution table as far as I can tell. 

>    <Lookup> <!-- 36 -->
>       <LookupType>SINGLE</LookupType>
[...]

I'll try to have a closer look in the next days if time allows.

Cheers,
Lars

[Indic-computing-devel] Re: [Kannada] Re: Problem with freetype rendering of an Indic opentype font

From: Arun S. <ar...@sh...> - 2003-09-07 21:35:33

On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote:
> > I see a problem with the freetype rendering of this unicode string:
> > 
> > "0xcb7 0xccd"
> > 
> > Using Microsoft tunga.ttf (A kannada font that ships with Windows
> > XP).  In fact, the problem manifests itself with all other
> > consonants too, including 0xcb7.
> 
> I don't have time to investigate but it seems to me that some
> contextual shaping is happening.  If this is true it is not a problem
> of FreeType 2 (which doesn't support OpenType directly for the moment)
> but rather a problem one level higher where Indic scripts are handled,
> probably within the Pango library.  Owen?

ok, I did some more investigation. Here's a partial dump of the font.
According to this rule, glyph 0x62 should've been substituted by glyph
0xb0. But for some reason, it's not happening.

Does this ring a bell ?

	-Arun

   <Lookup> <!-- 36 -->
      <LookupType>SINGLE</LookupType>
      <Subtable>
         <SubstFormat>2</SubstFormat>
         <Coverage>
            <CoverageFormat>2</CoverageFormat>
            <Glyph>  42 -   64</Glyph>
            <Glyph>  75 -   75</Glyph>
            <Glyph> 10d -  10e</Glyph>
         </Coverage>
         <GlyphCount>38</GlyphCount>
         <Substitute>0x90</Substitute> <!-- 0 -->
         <Substitute>0x91</Substitute> <!-- 1 -->
         <Substitute>0x92</Substitute> <!-- 2 -->
         <Substitute>0x93</Substitute> <!-- 3 -->
         <Substitute>0x94</Substitute> <!-- 4 -->
         <Substitute>0x95</Substitute> <!-- 5 -->
         <Substitute>0x96</Substitute> <!-- 6 -->
         <Substitute>0x97</Substitute> <!-- 7 -->
         <Substitute>0x98</Substitute> <!-- 8 -->
         <Substitute>0x99</Substitute> <!-- 9 -->
         <Substitute>0x9a</Substitute> <!-- 10 -->
         <Substitute>0x9b</Substitute> <!-- 11 -->
         <Substitute>0x9c</Substitute> <!-- 12 -->
         <Substitute>0x9d</Substitute> <!-- 13 -->
         <Substitute>0x9e</Substitute> <!-- 14 -->
         <Substitute>0x9f</Substitute> <!-- 15 -->
         <Substitute>0xa0</Substitute> <!-- 16 -->
         <Substitute>0xa1</Substitute> <!-- 17 -->
         <Substitute>0xa2</Substitute> <!-- 18 -->
         <Substitute>0xa3</Substitute> <!-- 19 -->
         <Substitute>0xa4</Substitute> <!-- 20 -->
         <Substitute>0xa5</Substitute> <!-- 21 -->
         <Substitute>0xa6</Substitute> <!-- 22 -->
         <Substitute>0xa7</Substitute> <!-- 23 -->
         <Substitute>0xa8</Substitute> <!-- 24 -->
         <Substitute>0xa9</Substitute> <!-- 25 -->
         <Substitute>0xaa</Substitute> <!-- 26 -->
         <Substitute>0xab</Substitute> <!-- 27 -->
         <Substitute>0xac</Substitute> <!-- 28 -->
         <Substitute>0xad</Substitute> <!-- 29 -->
         <Substitute>0xae</Substitute> <!-- 30 -->
         <Substitute>0xaf</Substitute> <!-- 31 -->
         <Substitute>0xb0</Substitute> <!-- 32 -->
         <Substitute>0xb1</Substitute> <!-- 33 -->
         <Substitute>0xb2</Substitute> <!-- 34 -->
         <Substitute>0xb3</Substitute> <!-- 35 -->
         <Substitute>0x10f</Substitute> <!-- 36 -->
         <Substitute>0x110</Substitute> <!-- 37 -->
      </Subtable>
   </Lookup>
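
The reading of the dump can be checked with a short Python sketch of a format 2 single substitution driven by a range-based coverage table. The ranges and substitutes are copied from the dump, on the assumption that its glyph numbers are hexadecimal; the function names are hypothetical and this is not FreeType or Pango code.

```python
# Coverage ranges and substitutes copied from the dump (hex glyph IDs).
COVERAGE_RANGES = [(0x42, 0x64), (0x75, 0x75), (0x10D, 0x10E)]
SUBSTITUTES = list(range(0x90, 0xB4)) + [0x10F, 0x110]  # 38 entries


def coverage_index(glyph):
    """Return the coverage index of glyph, or None if it is not covered."""
    index = 0
    for start, end in COVERAGE_RANGES:
        if start <= glyph <= end:
            return index + (glyph - start)
        index += end - start + 1
    return None


def substitute(glyph):
    """Apply the single substitution; uncovered glyphs are returned unchanged."""
    index = coverage_index(glyph)
    return glyph if index is None else SUBSTITUTES[index]


print(hex(substitute(0x62)))
```

Glyph 0x62 falls in the first range at coverage index 32, and entry 32 of the substitute list is 0xb0, which matches the reading in the message.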

Re: [Indic-computing-devel] Re: [Kannada] A few kannada bug fixes to qt-3.2.1

From: Arun S. <ar...@sh...> - 2003-09-05 16:18:48

Krishnamurthy Nagarajan wrote:

> Hi Arun,
> 
> The fundamental issue is that the glyph composition
> and rendering logic is 'hardcoded' in C code of each
> application, that too for a 'given' font. 

Agree that the logic is hardcoded in the libraries (pango and qt), 
without an extensible plugin. But I'm not sure that the logic is font 
specific. It should work with any opentype font for a given script, as 
far as I can see it.

> A more
> broad-based approach would be to provide generic X
> input methods for Indian languages.

Given the difficulty of getting my parents to learn the inscript 
keyboard, I think this is definitely important. If you read the 
archives, I've not been a fan of implementing logic in client side 
libraries or even client side fonts in general.

So any approach with a X server side solution with X protocol extensions 
is what I like the best. But the client side does need modifications to 
understand Indic syllables for proper text editing.

But in the short term, qt and pango are the way people are going to do 
Indic computing. So I thought I'd fix up a few bugs during the weekend :)

> Tamil), with all the complexity in the rules (given in
> a text file to the translib library) and zero C code
> that is specific to any language/script, I am
> convinced that 
> any language/script specific peculiarities (like the
> reph case in Kannada, split vowel signs in Devanagari
> and Tamil, multiple representations for the same input
> etc etc) can be handled without any special coding at
> the app level (gnome, qt or whatever).

Interesting. Thanks for the pointer. Will take a look at it this week.

	-Arun

[Indic-computing-devel] Re: Problem with freetype rendering of an Indic opentype font

From: Arun S. <ar...@sh...> - 2003-09-05 16:08:24

Werner LEMBERG wrote:
> 
> I don't have time to investigate but it seems to me that some
> contextual shaping is happening.  If this is true it is not a problem
> of FreeType 2 (which doesn't support OpenType directly for the moment)
> but rather a problem one level higher where Indic scripts are handled,
> probably within the Pango library.  Owen?

Here's what the qt guys had to say about this:

> I did some tests, and it seem the problem is buried somewhere in the
> open type code used by both gtk and Qt.

I've personally verified that "0xcb7 0xccd" is not reordered by the 
shaping engines.

	-Arun
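
For reference, the two code points in the test string can be identified with Python's standard `unicodedata` module. This is a small illustrative sketch; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(code_points):
    """Return (code point label, Unicode name, general category) for each code point."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


for row in describe([0x0CB7, 0x0CCD]):
    print(row)
```

The first code point is the consonant KANNADA LETTER SSA and the second is KANNADA SIGN VIRAMA, a nonspacing mark (category Mn), which is why the GDEF mark handling matters for this pair.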

Re: [Indic-computing-devel] Re: [Kannada] A few kannada bug fixes to qt-3.2.1

From: Krishnamurthy N. <kn...@ya...> - 2003-09-05 12:07:50

Hi Arun,

The fundamental issue is that the glyph composition
and rendering logic is 'hardcoded' in C code of each
application, that too for a 'given' font. A more
broad-based approach would be to provide generic X
input methods for Indian languages. Pls take a look at
the infrastructure projects under indic-computing on
sourceforge that we have done/are doing (font
annotation, generic transliteration library, study of
various scripts and languages to develop a generic X
input method framework and so on). Having developed
and tested out the generic rule-based framework for
four Indian languages (Hindi, Telugu, Kannada and
Tamil), with all the complexity in the rules (given in
a text file to the translib library) and zero C code
that is specific to any language/script, I am
convinced that any language/script specific
peculiarities (like the reph case in Kannada, split
vowel signs in Devanagari and Tamil, multiple
representations for the same input etc etc) can be
handled without any special coding at the app level
(gnome, qt or whatever).

Pls visit the indic-computing projects, especially the
infrastructure projects and give your comments and
contribute to this base work. Thanks.

cheers,
Nagarajan
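
[Editor's note: the rule-based idea described above can be illustrated with a small table-driven rewriter. This is only a sketch under stated assumptions: the rule format, the names RuleTable, sample_kannada_rules and apply_rules, and the three sample rules are all hypothetical and are not taken from translib, whose actual rule-file format is not shown in this thread. The point is that the matching loop contains no script-specific logic; everything language-specific lives in the table, which a real system would load from a text file.]

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical rule table: Latin input sequence -> output text.
// A real system would load these pairs from a per-language text file.
using RuleTable = std::map<std::string, std::string>;

// Three illustrative Kannada rules (assumed for this sketch only).
RuleTable sample_kannada_rules() {
    return {
        {"ka", "\u0C95"},        // KA with its inherent vowel
        {"ki", "\u0C95\u0CBF"},  // KA + vowel sign I
        {"k",  "\u0C95\u0CCD"},  // KA + virama (no vowel)
    };
}

// Greedy longest-match rewrite. Nothing here depends on the script;
// input that matches no rule is copied through unchanged.
std::string apply_rules(const RuleTable& rules, const std::string& input) {
    std::size_t longest = 0;
    for (const auto& rule : rules) {
        longest = std::max(longest, rule.first.size());
    }

    std::string out;
    std::size_t pos = 0;
    while (pos < input.size()) {
        bool matched = false;
        const std::size_t max_len = std::min(longest, input.size() - pos);
        for (std::size_t len = max_len; len > 0; --len) {
            const auto it = rules.find(input.substr(pos, len));
            if (it != rules.end()) {
                out += it->second;
                pos += len;
                matched = true;
                break;
            }
        }
        if (!matched) {
            out += input[pos];
            ++pos;
        }
    }
    return out;
}
```

[With the sample table, apply_rules(sample_kannada_rules(), "kaki") yields KA followed by KA + vowel sign I. Supporting another script would mean supplying a different table, not changing the loop.]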

--- Arun Sharma <ar...@sh...> wrote:
> On Sun, Aug 31, 2003 at 02:17:52AM -0700, Arun Sharma wrote:
> 
> Issue (b) below has been fixed. (a) Still remains. A screenshot to
> demonstrate the problem with (a):
> 
> http://www.sharma-home.net/~adsharma/misc/wrong.jpg
> 
> Updated patch attached. Some of this may be applicable to Telugu too.
> 
> 	-Arun
> 
> > Remaining issues:
> > 
> > a) Halant/Virama rendering broken. This seems to be freetype specific,
> > since I see the same behavior with gnome/gedit.
> > b) This comment in qscriptengine_x11.cpp:
> > 
> > // * In Kannada and Telugu, the base consonant cannot be
> > //   farther than 3 consonants from the end of the syllable.
> > 
> > is not strictly correct. For cases such as "Lakshmi" - "kshmi" is one
> > syllable. Gnome handles this correctly. But I couldn't figure out how to
> > change the qt code to fix that.
> > 
> > - if (skipped == 2 && (script == QFont::Kannada || script == QFont::Telugu)) {
> > + if (skipped == 4 && (script == QFont::Kannada || script == QFont::Telugu)) {
> > 
> > doesn't do it. Any patches will be highly appreciated.
> > ---
>


[Indic-computing-devel] Re: [Kannada] Problem with freetype rendering of an Indic opentype font

From: Arun S. <ar...@sh...> - 2003-09-02 00:34:42

Dr. U.B. Pavanaja wrote:
>> Here are the screenshots to demonstrate the wrong and the right rendering:
>> 
>> http://www.sharma-home.net/~adsharma/misc/wrong.jpg
>> http://www.sharma-home.net/~adsharma/misc/right.jpg
> 
> The second link is broken.

Fixed, thanks!
	
	-Arun

[Indic-computing-devel] Re: [Kannada] Problem with freetype rendering of an Indic opentype font

From: Dr. U.B. P. <pav...@vi...> - 2003-09-01 18:41:49

> I see a problem with the freetype rendering of this Unicode string:
> 
> "0xcb7 0xccd"
> 
> Using Microsoft tunga.ttf (a Kannada font that ships with Windows XP).
> In fact, the problem manifests itself with all other consonants too,
> not just 0xcb7.
> 
> Here are the screenshots to demonstrate the wrong and the right rendering:
> 
> http://www.sharma-home.net/~adsharma/misc/wrong.jpg
> http://www.sharma-home.net/~adsharma/misc/right.jpg

The second link is broken.

-Pavanaja
-----------------------------------------------------
Dr. U.B. Pavanaja
Editor, Vishva Kannada
World's first Internet magazine in Kannada
http://www.vishvakannada.com/

Note: I don't worry about pselling mixtakes

[Indic-computing-devel] Problem with freetype rendering of an Indic opentype font

From: Arun S. <ar...@sh...> - 2003-09-01 18:32:38

I see a problem with the freetype rendering of this Unicode string:

"0xcb7 0xccd"

Using Microsoft tunga.ttf (a Kannada font that ships with Windows XP).
In fact, the problem manifests itself with all other consonants too,
not just 0xcb7.

Here are the screenshots to demonstrate the wrong and the right rendering:

http://www.sharma-home.net/~adsharma/misc/wrong.jpg
http://www.sharma-home.net/~adsharma/misc/right.jpg

The problem is seen with both gnome and kde on Linux with Redhat Beta
(Severn) - freetype-2.1.4-4.0.

Can one of you tell me if this is a broken font or a bug in freetype ?

	-Arun
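
[Editor's note: for readers who want to reproduce the test case, the two code points are U+0CB7 (KANNADA LETTER SSA) and U+0CCD (KANNADA SIGN VIRAMA), i.e. a consonant followed by the halant. The sketch below encodes them as UTF-8 so the string can be written to a file and opened in any of the applications mentioned. The names kSsaVirama and to_utf8 are made up for this illustration and come from no library; the encoder does not reject surrogates or out-of-range values.]

```cpp
#include <string>

// The reported sequence: KANNADA LETTER SSA followed by KANNADA SIGN VIRAMA.
const std::u32string kSsaVirama = U"\u0CB7\u0CCD";

// Minimal UTF-8 encoder for a sequence of Unicode code points.
std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

[to_utf8(kSsaVirama) produces the six bytes E0 B2 B7 E0 B3 8D.]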

[Indic-computing-devel] Re: [Kannada] A few kannada bug fixes to qt-3.2.1

From: Arun S. <ar...@sh...> - 2003-09-01 18:01:41

Attachments: qt-3.2.1-kannada.patch

On Sun, Aug 31, 2003 at 02:17:52AM -0700, Arun Sharma wrote:

Issue (b) below has been fixed. (a) Still remains. A screenshot to
demonstrate the problem with (a):

http://www.sharma-home.net/~adsharma/misc/wrong.jpg

Updated patch attached. Some of this may be applicable to Telugu too. 

	-Arun

> Remaining issues:
> 
> a) Halant/Virama rendering broken. This seems to be freetype specific,
> since I see the same behavior with gnome/gedit.
> b) This comment in qscriptengine_x11.cpp:
> 
> // * In Kannada and Telugu, the base consonant cannot be
> //   farther than 3 consonants from the end of the syllable.
> 
> is not strictly correct. For cases such as "Lakshmi" - "kshmi" is one
> syllable. Gnome handles this correctly. But I couldn't figure out how to
> change the qt code to fix that.
> 
> - if (skipped == 2 && (script == QFont::Kannada || script == QFont::Telugu)) {	
> + if (skipped == 4 && (script == QFont::Kannada || script == QFont::Telugu)) {	
> 
> doesn't do it. Any patches will be highly appreciated.

[Indic-computing-devel] Re: [Kannada] A few kannada bug fixes to qt-3.2.1

From: Arun S. <ar...@sh...> - 2003-08-31 15:51:03

Arun Sharma wrote:
> // * In Kannada and Telugu, the base consonant cannot be
> //   farther than 3 consonants from the end of the syllable.

Actually the comment is correct.

> 
> is not strictly correct. For cases such as "Lakshmi" - "kshmi" is one
> syllable. Gnome handles this correctly. But I couldn't figure out how to
> change the qt code to fix that.
> 
> - if (skipped == 2 && (script == QFont::Kannada || script == QFont::Telugu)) {	
> + if (skipped == 4 && (script == QFont::Kannada || script == QFont::Telugu)) {	
> 
> doesn't do it. Any patches will be highly appreciated.

The problem is not with the above line. It's somewhere else.

	-Arun
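
[Editor's note: a quick way to see why the quoted comment holds for "kshmi" is to count the consonants in the cluster. The sketch below assumes one plausible Kannada spelling of the syllable (KA, virama, SSA, virama, MA, vowel sign I) and treats U+0C95..U+0CB9 as the consonant range, which also covers two unassigned code points. The names kKshmi, is_kannada_consonant and count_consonants are hypothetical; this is not how qscriptengine_x11.cpp finds the base consonant, which the thread does not show.]

```cpp
#include <cstddef>
#include <string>

// "kshmi" as assumed for this sketch: KA, VIRAMA, SSA, VIRAMA, MA, VOWEL SIGN I.
const std::u32string kKshmi = U"\u0C95\u0CCD\u0CB7\u0CCD\u0CAE\u0CBF";

// Kannada consonants KA..HA occupy U+0C95..U+0CB9 in the Unicode code chart.
bool is_kannada_consonant(char32_t cp) {
    return cp >= 0x0C95 && cp <= 0x0CB9;
}

// Count the consonants in one syllable.
std::size_t count_consonants(const std::u32string& syllable) {
    std::size_t count = 0;
    for (char32_t cp : syllable) {
        if (is_kannada_consonant(cp)) {
            ++count;
        }
    }
    return count;
}
```

[count_consonants(kKshmi) is 3, so even if the first consonant is taken as the base, it lies within three consonants of the end of the syllable.]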

[Indic-computing-devel] Re: [Kannada] A few kannada bug fixes to qt-3.2.1

From: Arun S. <ar...@sh...> - 2003-08-31 15:46:59

Dr. U.B. Pavanaja wrote:
>> c) Kannada has arkavattu and hence HasReph should be true
> 
> Does the user have any control over when to choose arkavattu 
> (reph) and when not to? Since arkavattu does not originally 
> belong to Kannada, we should have the liberty to use it or not.

There isn't one right now, but I agree that there should be a knob for 
such things. Two things come to mind:

- environment variable
- qtconfig

	-Arun
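
[Editor's note: the environment-variable option could look roughly like the sketch below. The variable name QT_KANNADA_REPH and the functions parse_switch and kannada_reph_enabled are invented for illustration; neither Qt nor this thread defines them. The assumed convention is that an unset or empty variable keeps the default, "0" turns the feature off, and any other value turns it on.]

```cpp
#include <cstdlib>
#include <cstring>

// Interpret an on/off setting: null or empty keeps the fallback,
// "0" means off, any other value means on.
bool parse_switch(const char* value, bool fallback) {
    if (value == nullptr || *value == '\0') {
        return fallback;
    }
    return std::strcmp(value, "0") != 0;
}

// Hypothetical knob: whether Kannada text should use arkavattu (reph).
// Defaults to true, matching the HasReph change proposed in the patch.
bool kannada_reph_enabled() {
    return parse_switch(std::getenv("QT_KANNADA_REPH"), true);
}
```

[A qtconfig setting would be friendlier to end users; the environment variable is simply the smaller change to prototype.]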

[Indic-computing-devel] Major Enhancements to the Unicode Standard: Enabling International Domain Names, Expanding Worldwide Accessibility, and Reducing the Digital Divide

From: <kut...@ya...> - 2003-08-28 07:04:14

Hello Everyone,

I hope all of us will use this major step forward and
start building some real Indic web-applications.

Regards,
Abhijit Dutta

----------------------------------------


Major Enhancements to the Unicode Standard:
Enabling International Domain Names, Expanding Worldwide Accessibility,
and Reducing the Digital Divide

Mountain View, CA, August 27, 2003 -- The Unicode® Consortium and
Addison-Wesley announce publication of Version 4.0 of the Unicode
Standard. Unicode is the fundamental specification for the
representation of text, at the core of all modern software, programming
languages, and standards, including Windows, Java, C#, Perl, XML, HTML,
DB2, Oracle, and many others.


Unicode is also central to the new internationalized domain names, which
allow everyone in the world to have URLs in their own languages. This is
yet another case where Unicode opens the door to more of the world's
different cultures, helping to break down the digital divide.

Version 4.0 strengthens Unicode support for worldwide communication,
software availability, and publishing. The text has been extensively
rewritten, and incorporates specifications that were previously only
available as separate documents. The clarified specification of
conformance requirements incorporates the most highly developed
character encoding model in existence, encompassing the wide variety of
types of characters needed by the world's languages, and permitting
compatibility with all modern computer architectures.

Record-breaking character content

Version 4.0 encodes over 96,000 characters, twice as many as Version
3.0, and includes two record-breaking collections of encoded characters.
The largest encoded character collection for Chinese characters in the
history of computing has doubled in size yet again to encompass over
2000 years of Chinese, Japanese, Korean, and Vietnamese literary usage,
including all the main classical dictionaries of these languages.
Version 4.0 also encodes the largest set of characters for mathematical
and technical publishing in existence. The character repertoires of
Version 4.0 and International Standard ISO/IEC 10646 are fully
synchronized.

Reducing the digital divide

To meet the needs of all linguistic communities, the Unicode Standard
and associated standards are continually being extended, not only in
terms of the addition of characters, but also in specifying *how* those
characters work, such as:

- how text sorts or matches in different languages
- how text behaves for East Asian languages (e.g. vertically) or in
  Middle Eastern languages (from right to left)
- how text should upper- or lowercase
- how text breaks into lines or words
- how text behaves in Regular Expressions (a key tool used in a vast
  number of web servers)

Small linguistic communities all over the world have the opportunity to
get mainstream software working right out of the box, instead of waiting
years for special adaptations that may never come.

For more information on the scripts encoded in the Unicode Standard, see
http://www.unicode.org/versions/Unicode4.0.0/

Version 4.0 is published by Addison-Wesley (ISBN 0-321-18578-1), and is
available from the Unicode Consortium or through the book trade. The
text and code charts of Version 4.0 are also available on the
Consortium's Web site www.unicode.org.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop,
extend and promote use of the Unicode Standard, which specifies the
representation of text in modern software products and standards.

Members of the Consortium are a broad spectrum of corporations and
organizations in the computer and information technology industry. Full
members are: Adobe Systems, Apple Computer, Basis Technology, Government
of India (Ministry of Information Technology), Government of Pakistan
(National Language Authority), HP, IBM, Justsystem, Microsoft, Oracle,
PeopleSoft, RLG, SAP, Sun Microsystems, and Sybase.

Membership in the Unicode Consortium is open to organizations and
individuals anywhere in the world who support the Unicode Standard and
wish to assist in its extension and implementation.

For additional information on Unicode, contact the Unicode Consortium,
650-693-3921

For more information on The Unicode Standard, Version 4.0, see:
http://www.awprofessional.com/titles/0321185781


3 messages have been excluded from this view by a project administrator.
