indic-computing-devel Mailing List for The Indic-Computing Project (Page 4)
Status: Alpha
Brought to you by:
jkoshy
From: Alok K. <alo...@so...> - 2003-11-16 04:59:32
|
Hi, I'm in the process of building the handbook on Red Hat, and eventually creating the RPMs for others to be able to do that. Now, when I run make using the handbook makefile, which is doc/en*/books/handbook/Makefile, I get Makefile:137: *** missing separator. Stop. So before I start on the chase I'd like to know if this build requires a special make or a special anything else. The make I have is make-3.79.1-14. gmake also gives the same error. I suspect this comes at the first occurrence of a .include statement in the makefile. Regards Alok -- http://9211.blogspot.com Can't see Hindi? http://geocities.com/alkuma/seehindi.html http://groups.yahoo.com/group/linux-bangalore-hindi/ Discuss devanagari at http://groups.yahoo.com/group/devanaagarii/ alok_kumar ऍट softhome डॉट net |
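A note on the suspicion above: `.include` is a BSD make directive, and GNU make does not recognise dot-prefixed directives, so a line starting with `.include` is one plausible cause of a "missing separator" error. The following is a minimal sketch, not part of the handbook toolchain, that lists the lines of a makefile that look like BSD make directives. The function name `find_bsd_directives`, the directive list, and the sample makefile are illustrative assumptions.

```python
import re

# Dot-prefixed directives used by BSD make (illustrative list, not exhaustive).
BSD_DIRECTIVE = re.compile(
    r"^\.\s*(include|sinclude|ifdef|ifndef|if|elif|else|endif|for|endfor|undef)\b"
)

def find_bsd_directives(makefile_text):
    """Return (line_number, line) pairs for lines that look like BSD make directives."""
    hits = []
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        if BSD_DIRECTIVE.match(line):
            hits.append((number, line))
    return hits

# Hypothetical makefile content, only for demonstration.
sample = 'DOC?= book\n.include "../Makefile.inc"\nall:\n\techo building\n.if defined(FOO)\n.endif\n'

for number, line in find_bsd_directives(sample):
    print(number, line)
```

If such lines turn up at or near the line number in the error message, the makefile was most likely written for BSD make rather than GNU make.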
From: Dr. U.B. P. <pav...@vi...> - 2003-11-04 13:36:12
|
------- Forwarded message follows -------
Date sent: Mon, 3 Nov 2003 10:12:49 -0500
From: Rick McGowan <ri...@un...>
Subject: Unicode Collation Algorithm, version 4.0.0

We are pleased to announce the release of the 4.0.0 version of Unicode Technical Standard #10: The Unicode Collation Algorithm (UCA), which specifies a default sorting order and comparison mechanism for all Unicode characters.

Major changes in this release include:
- The version of the UCA is now being synchronized with versions of the Unicode Standard, so that the repertoire of characters will be the same.
- An extensive new introduction has been added. It discusses important concepts that were formerly in Section 5.17 of Unicode 3.0, but has been completely reworked for clarity and coverage.
- The Scope section has been recast and is now at the end of the introduction.
- The location of data files has been changed to http://www.unicode.org/Public/UCA/
------- End of forwarded message -------
-----------------------------------------------------
Dr. U.B. Pavanaja
Editor, Vishva Kannada
World's first Internet magazine in Kannada
http://www.vishvakannada.com/

Note: I don't worry about pselling mixtakes |
From: Cherry G. M. <ch...@sd...> - 2003-11-03 12:42:07
|
Hello list, This is to announce a pre-alpha release of a Debian repository containing a set of updated tools, and dependencies for other available ones, required to build the indic-computing documentation. Testing and bug reports are most welcome. You may download this package from http://prdownloads.sourceforge.net/indic-computing/doc-toolchain-debian-0.5-i386.tgz?download Best Regards, Cherry. -- ch...@sd... Homepage - http://cherry.freeshell.org |
From: Cherry G. M. <ch...@sd...> - 2003-10-22 13:13:43
|
On Tue, 21 Oct 2003, Joseph Koshy wrote: > We need to work on simplifying the procedure to build and install > from sources. Yep, something like doing a make build to get everything nice and packaged. I should begin working on it pretty soon.... Besides that, Mahiti Infotech, Bangalore, has agreed that I can continue using their bandwidth for indic development. If you're reading this: Thank you Sunil!! Best, Cherry. -- ch...@sd... Homepage - http://cherry.freeshell.org |
From: <a_j...@ya...> - 2003-10-21 14:32:20
|
> http://cherry.freeshell.org/indic-deb-doc-alpha.txt > Comments/Criticism welcome on the indic-computing-devel list. Thanks. The "Binary Installation" part has been incorporated into our "Documentation Build" article, modulo some incomplete URLs. We need to work on simplifying the procedure to build and install from sources. ===== Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/ Founder/Manager/Programmer/Peon, The Indic-Computing Project http://indic-computing.sf.net ________________________________________________________________________ Yahoo! India Matrimony: Find your partner online. Go to http://yahoo.shaadi.com |
From: Cherry G. M. <ch...@sd...> - 2003-10-19 19:25:56
|
Dear Koshy, list, I had deliberately kept from replying earlier to see how active this list is. Rather heartening to see that there are exactly two enthusiastic volunteers :-), one a veteran, the other a newbie ;-) On Fri, 10 Oct 2003, Joseph Koshy wrote: [...] > > critical-mass in terms of volunteers for site maintenance and content > > management, we could think about diversifying. > > We are using different tools for different requirements. > > DocBook is being used for the 'heavy-duty' documentation that needs [...] > However, there seems to be only one person working on the website > (me :)) and I find EtText more of a hindrance than a benefit. Yes, EtText doesn't seem to be much help other than to distract one from the content. It brings us back to WYSIWYG, rather than WYMIWYG. I learnt it the hard way while trying to fix up my own private website with WebMake. > > Can we cut down on the number of tools used for documentation > > maintenance ? As of now, a prospective documenter has to know 6 [...] > > If you are writing documentation, today you need to know only > two tools, any text editor ('vi' or 'emacs' say) for editing > and 'make' to build everything. You don't really need to know > how the rest of the toolchain works -- the intent is that 'every > thing just works'. For documentation, we are using one SGML "format" > namely, DocBook. > Hmmm, that's making the presumption that someone on the list (which boils down to A. Joseph Koshy) will take care of the SGML. Sounds very altruistic, however doesn't address the basic issue of a steep learning curve. However, I'm sure it's the best ad-hoc solution, considering the circumstances..... > The website could conceivably be written using DocBook too (for [...] > In the end I chose plain HTML for the website's content and 'WebMake' > to stitch the pieces of the site's content into a coherent website. > Yes, I think that does make sense.
> Coming to casual contributors, these folks aren't affected by the > internals of the toolchain (or even the specifics of HTML or DocBook) > since they can contribute using plain-text too. > Thanks Koshy! We need more sincere volunteers like you. I'll try and learn the hard pieces, so that I could help with the internals as well, but that'll take a while. I'm sure you've held the fort for too long for that to make a significant difference. So there, that's my offer ( like the GPL "Without even the guarantee of Merchantability or fitness for any particular purpose........" ;-) ) > Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/ > Founder/Manager/Programmer/Peon, The Indic-Computing Project ^^^^ I can see why now! Best, Cherry. -- ch...@sd... Homepage - http://cherry.freeshell.org |
From: <a_j...@ya...> - 2003-10-10 09:50:02
|
> I've put up a draft article at : > > http://cherry.freeshell.org/indic-deb-doc-alpha.txt Thanks, will incorporate this into our documentation sources and send the diff back to you for review. Good work, Cherry! ===== Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/ Founder/Manager/Programmer/Peon, The Indic-Computing Project http://indic-computing.sf.net |
From: <a_j...@ya...> - 2003-10-10 09:48:34
|
> 1) From the perspective of a new volunteer for website maintenance: > > The order of prerequisite reading is not specified. > Here is a suggested order: [snip] Suggestion taken. > It would be very helpful to finish section "3.1, Directory > Structure" of the documentation design goals document. Sure. I guess this could be added now. > 2) Documentation layout. > > Three formats are in use, with no apparent logical distribution in > directories. They are, sgml, html, and ettext. It would be nice to > make a policy decision about which format to stick to ( even if it > makes the site slightly uglier. ) Once the project has obtained > critical-mass in terms of volunteers for site maintenance and content > management, we could think about diversifying. We are using different tools for different requirements. DocBook is being used for the 'heavy-duty' documentation that needs "indic" stuff in it. HTML is used only for the website. EtText could be dispensed with since it is used only for a very small part of the website -- I had initially thought that being able to write "pseudo plain text" would be of help to people willing to work on the website. However, there seems to be only one person working on the website (me :)) and I find EtText more of a hindrance than a benefit. > Can we cut down on the number of tools used for documentation > maintenance ? As of now, a prospective documenter has to know 6 > tools namely: html, docbook sgml, ettext, webmake, bsd make, python. > A shallower, smaller learning curve can bring in more volunteers at > this stage. For now this is a substantial requirement for a casual > contributor. If you are writing documentation, today you need to know only two tools, any text editor ('vi' or 'emacs' say) for editing and 'make' to build everything. You don't really need to know how the rest of the toolchain works -- the intent is that 'every thing just works'. For documentation, we are using one SGML "format" namely, DocBook.
The website could conceivably be written using DocBook too (for an example, see DocBook author Norman Walsh's home page). I seriously considered this when designing the infrastructure. The plus point of this approach was that we integrate our 'website' with the rest of the standalone documentation being written, but the drawback was that DocBook isn't a good DTD for describing website content (IMO). In the end I chose plain HTML for the website's content and 'WebMake' to stitch the pieces of the site's content into a coherent website. Coming to casual contributors, these folks aren't affected by the internals of the toolchain (or even the specifics of HTML or DocBook) since they can contribute using plain-text too. > put down the excellent work of current volunteers, but merely as a > means to elicit discussion. It's been a week+ of silence since your post, so I thought I'd chip in :). ===== Joseph Koshy, FreeBSD Developer, http://people.freebsd.org/~jkoshy/ Founder/Manager/Programmer/Peon, The Indic-Computing Project http://indic-computing.sf.net |
From: Cherry G. M. <ch...@sd...> - 2003-09-30 12:02:08
|
Hello list, I've put up a draft article at: http://cherry.freeshell.org/indic-deb-doc-alpha.txt Please excuse me if the linebreaks mess up with links or lynx browsers. Comments/Criticism welcome on the indic-computing-devel list. Thanks, Cherry. -- ch...@sd... Homepage - http://cherry.freeshell.org |
From: Cherry G. M. <ch...@fr...> - 2003-09-30 08:37:54
|
Hello list, This email is intended to be a critique of the shortcomings of the current indic-computing documentation system. It is intended to add constructive criticism and elicit discussion. 1) From the perspective of a new volunteer for website maintenance: The order of prerequisite reading is not specified. Here is a suggested order: - The DocBook guide - WebMake homepage documentation - EtText homepage documentation - indic-computing doc-build New Volunteer's guide. It would be very helpful to finish section "3.1, Directory Structure" of the documentation design goals document. 2) Documentation layout. Three formats are in use, with no apparent logical distribution in directories. They are, sgml, html, and ettext. It would be nice to make a policy decision about which format to stick to ( even if it makes the site slightly uglier. ) Once the project has obtained critical-mass in terms of volunteers for site maintenance and content management, we could think about diversifying. 3) Tool usage. Can we cut down on the number of tools used for documentation maintenance ? As of now, a prospective documenter has to know 6 tools namely: html, docbook sgml, ettext, webmake, bsd make, python. A shallower, smaller learning curve can bring in more volunteers at this stage. For now this is a substantial requirement for a casual contributor. Please note that the above observations/suggestions are not meant to put down the excellent work of current volunteers, but merely as a means to elicit discussion. Comments welcome on the devel list. Best, Cherry ch...@sd... http://cherry.freeshell.org/ |
From: Dr. U.B. P. <pav...@vi...> - 2003-09-10 04:07:13
|
> On Monday 08 September 2003 01:39, Owen Taylor wrote: > > On Sun, 2003-09-07 at 17:51, Lars Knoll wrote: > > > On Sunday 07 September 2003 23:37, Arun Sharma wrote: > > > > On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote: > > > > > > I see a problem with the freetype rendering of this unicode string: > > > > > > > > > > > > "0xcb7 0xccd" > > > > > > > > > > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows > > > > > > XP). In fact, the problem manifests itself with all other > > > > > > consonants too, including 0xcb7. > > > > > > > > > > I don't have time to investigate but it seems to me that some > > > > > contextual shaping is happening. If this is true it is not a problem > > > > > of FreeType 2 (which doesn't support OpenType directly for the > > > > > moment) but rather a problem one level higher where Indic scripts are > > > > > handled, probably within the Pango library. Owen? > > > > > > I did some investigations and the problem seems to lie within the > > > freetype 1 open type code used by Qt and pango. The font uses chain > > > substitutions to achieve correct rendering of this combination and these > > > seem to work incorrectly. Up to now I didn't find time to dig into this > > > in more detail. > > > > I just recently applied a patch to Pango for Chain Context substitutions > > that apparently was needed for Kannada - > > > > http://people.redhat.com/otaylor/opentype-patches/pango-26-chain-format3 > > > > (from Kailash C. Chowksey.) Could that fix the problem here? (That patch > > is in Pango-1.2.5) > > No, the problem was actually unrelated to the chaining substitutions. > > > http://bugzilla.gnome.org/show_bug.cgi?id=118592 also comes up for > > tunga.ttf, though we haven't come up with a final patch for that. > > I have debugged the problem today, and found that the problem lies somewhere > else than I first guessed. tunga.ttf has a ligature for the character > combination quoted above.
The lookup table for the ligature has a LookupFlag > of 0x0100, so it should skip marks that do not have a MarkAttachmentClass of > 0x01. The 0xccd character however is a mark according to the gdef table, but > has no MarkAttachmentClass defined. > > Thus TT_GDEF_Get_Glyph_Property() returns 0x8 for the glyph, and the > Check_Property() call fails Lookup_LigatureSubst. Seems like we have to > ignore the high byte of the LookupFlag if the glyph has no > MarkAttachmentClass (the specs are not really clear about this IMO). What is meant by MarkAttachmentClass? When we design the OpenType font, normally we use Microsoft VOLT and define the glyphs there. As you correctly pointed out, 0xccd is defined as Mark. This will define the glyph as Mark in the GDEF table. How to add the MarkAttachmentClass? Which tool are you using to dump the embedded tables of the OTF? Is it TTX? BTW, I did not develop Tunga.ttf :-) > I've attached a patch that fixes the problem and gives correct shaping of the > above glyph combination. > > Cheers, > Lars Rgds, Pavanaja ----------------------------------------------------- Dr. U.B. Pavanaja Editor, Vishva Kannada World's first Internet magazine in Kannada http://www.vishvakannada.com/ Note: I don't worry about pselling mixtakes |
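On the question of what the MarkAttachmentClass means: in OpenType, the high byte of a lookup's LookupFlag names a mark attachment type, and the lookup is meant to skip marks whose class in the GDEF mark attachment class definition differs from it. The sketch below is a simplified model of that rule and of the workaround Lars describes. It is not the FreeType 1, Pango, or Qt code; the function name `skips_mark` and the `lenient` switch are illustrative assumptions.

```python
IGNORE_MARKS = 0x0008            # LookupFlag bit: skip every mark glyph
MARK_ATTACH_TYPE_MASK = 0xFF00   # LookupFlag high byte: mark attachment type filter

def skips_mark(lookup_flag, mark_attach_class, lenient=False):
    """Decide whether a lookup skips a given mark glyph.

    mark_attach_class is the glyph's mark attachment class from GDEF,
    or 0 when the font assigns none (the case reported for 0xccd).
    lenient=True models the workaround from the thread: a mark without
    an attachment class is not filtered by the high byte of the flag.
    """
    if lookup_flag & IGNORE_MARKS:
        return True
    wanted = (lookup_flag & MARK_ATTACH_TYPE_MASK) >> 8
    if wanted == 0:
        return False
    if lenient and mark_attach_class == 0:
        return False
    return mark_attach_class != wanted

print(skips_mark(0x0100, 0))                # strict reading of the flag
print(skips_mark(0x0100, 0, lenient=True))  # workaround discussed above
print(skips_mark(0x0100, 1))                # mark that carries class 1
```

Under the strict reading, the virama glyph is filtered out, so a ligature lookup that needs it cannot match, which is consistent with the failure described in the quoted message.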
From: Dutta A. <dab...@in...> - 2003-09-09 09:37:37
|
Hello Everyone,

A Unicode Workshop will be held in Delhi between Sept 24 and 26. Anyone who wants to participate should send a mail to co...@ma... for verification and registration. Please include the following information:
- Name:
- Affiliation and Organization:
- Postal address:
- Phone number:
- Whether you subscribe to in...@un... (Yes/No):
- Interest:
- Language(s) on which you have expertise (Please state if this includes development experience also):
- What follow up activities you are interested in:
- Names of References (for verification):

Unicode Workshop
0900 hrs, September 24-26, 2003
The Oak, The Park, Parliament Street, New Delhi

A detailed programme is attached for your reference. (If detached by the list-server please contact me/MAIT for it) (See attached file: Unicode Workshop Programme .zip)

Please come prepared. Study up on the material from the book (sections downloadable in pdf from: http://www.unicode.org/versions/Unicode4.0.0/bookmarks.html ). Useful and considered inputs are very rare to come by!

Regards, Abhijit ____________________________ http://www.ibm.com/software/globalization |
From: Owen T. <ot...@re...> - 2003-09-07 23:42:00
|
On Sun, 2003-09-07 at 17:51, Lars Knoll wrote: > On Sunday 07 September 2003 23:37, Arun Sharma wrote: > > On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote: > > > > I see a problem with the freetype rendering of this unicode string: > > > > > > > > "0xcb7 0xccd" > > > > > > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows > > > > XP). In fact, the problem manifests itself with all other > > > > consonants too, including 0xcb7. > > > > > > I don't have time to investigate but it seems to me that some > > > contextual shaping is happening. If this is true it is not a problem > > > of FreeType 2 (which doesn't support OpenType directly for the moment) > > > but rather a problem one level higher where Indic scripts are handled, > > > probably within the Pango library. Owen? > > I did some investigations and the problem seems to lie within the freetype 1 > open type code used by Qt and pango. The font uses chain substitutions to > achieve correct rendering of this combination and these seem to work > incorrectly. Up to now I didn't find time to dig into this in more detail. I just recently applied a patch to Pango for Chain Context substitutions that apparently was needed for Kannada - http://people.redhat.com/otaylor/opentype-patches/pango-26-chain-format3 (from Kailash C. Chowksey.) Could that fix the problem here? (That patch is in Pango-1.2.5) http://bugzilla.gnome.org/show_bug.cgi?id=118592 also comes up for tunga.ttf, though we haven't come up with a final patch for that. Regards, Owen |
From: Lars K. <la...@tr...> - 2003-09-07 21:55:14
|
On Sunday 07 September 2003 23:37, Arun Sharma wrote: > On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote: > > > I see a problem with the freetype rendering of this unicode string: > > > > > > "0xcb7 0xccd" > > > > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows > > > XP). In fact, the problem manifests itself with all other > > > consonants too, including 0xcb7. > > > > I don't have time to investigate but it seems to me that some > > contextual shaping is happening. If this is true it is not a problem > > of FreeType 2 (which doesn't support OpenType directly for the moment) > > but rather a problem one level higher where Indic scripts are handled, > > probably within the Pango library. Owen? I did some investigations and the problem seems to lie within the freetype 1 open type code used by Qt and pango. The font uses chain substitutions to achieve correct rendering of this combination and these seem to work incorrectly. Up to now I didn't find time to dig into this in more detail. > ok, I did some more investigation. Here's a partial dump of the font. > According to this rule, glyph 0x62 should've been substituted by glyph > 0xb0. But for some reason, it's not happening. > > Does this ring a bell ? The lookup table you quote below is actually a subtable of some chain substitution table as far as I can tell. > <Lookup> <!-- 36 --> > <LookupType>SINGLE</LookupType> [...] I'll try to have a closer look in the next days if time allows. Cheers, Lars |
From: Arun S. <ar...@sh...> - 2003-09-07 21:35:33
|
On Thu, Sep 04, 2003 at 10:24:10PM +0200, Werner LEMBERG wrote: > > I see a problem with the freetype rendering of this unicode string: > > > > "0xcb7 0xccd" > > > > Using Microsoft tunga.ttf (A kannada font that ships with Windows > > XP). In fact, the problem manifests itself with all other > > consonants too, including 0xcb7. > > I don't have time to investigate but it seems to me that some > contextual shaping is happening. If this is true it is not a problem > of FreeType 2 (which doesn't support OpenType directly for the moment) > but rather a problem one level higher where Indic scripts are handled, > probably within the Pango library. Owen? ok, I did some more investigation. Here's a partial dump of the font. According to this rule, glyph 0x62 should've been substituted by glyph 0xb0. But for some reason, it's not happening. Does this ring a bell ? -Arun <Lookup> <!-- 36 --> <LookupType>SINGLE</LookupType> <Subtable> <SubstFormat>2</SubstFormat> <Coverage> <CoverageFormat>2</CoverageFormat> <Glyph> 42 - 64</Glyph> <Glyph> 75 - 75</Glyph> <Glyph> 10d - 10e</Glyph> </Coverage> <GlyphCount>38</GlyphCount> <Substitute>0x90</Substitute> <!-- 0 --> <Substitute>0x91</Substitute> <!-- 1 --> <Substitute>0x92</Substitute> <!-- 2 --> <Substitute>0x93</Substitute> <!-- 3 --> <Substitute>0x94</Substitute> <!-- 4 --> <Substitute>0x95</Substitute> <!-- 5 --> <Substitute>0x96</Substitute> <!-- 6 --> <Substitute>0x97</Substitute> <!-- 7 --> <Substitute>0x98</Substitute> <!-- 8 --> <Substitute>0x99</Substitute> <!-- 9 --> <Substitute>0x9a</Substitute> <!-- 10 --> <Substitute>0x9b</Substitute> <!-- 11 --> <Substitute>0x9c</Substitute> <!-- 12 --> <Substitute>0x9d</Substitute> <!-- 13 --> <Substitute>0x9e</Substitute> <!-- 14 --> <Substitute>0x9f</Substitute> <!-- 15 --> <Substitute>0xa0</Substitute> <!-- 16 --> <Substitute>0xa1</Substitute> <!-- 17 --> <Substitute>0xa2</Substitute> <!-- 18 --> <Substitute>0xa3</Substitute> <!-- 19 --> <Substitute>0xa4</Substitute> <!-- 20 --> 
<Substitute>0xa5</Substitute> <!-- 21 --> <Substitute>0xa6</Substitute> <!-- 22 --> <Substitute>0xa7</Substitute> <!-- 23 --> <Substitute>0xa8</Substitute> <!-- 24 --> <Substitute>0xa9</Substitute> <!-- 25 --> <Substitute>0xaa</Substitute> <!-- 26 --> <Substitute>0xab</Substitute> <!-- 27 --> <Substitute>0xac</Substitute> <!-- 28 --> <Substitute>0xad</Substitute> <!-- 29 --> <Substitute>0xae</Substitute> <!-- 30 --> <Substitute>0xaf</Substitute> <!-- 31 --> <Substitute>0xb0</Substitute> <!-- 32 --> <Substitute>0xb1</Substitute> <!-- 33 --> <Substitute>0xb2</Substitute> <!-- 34 --> <Substitute>0xb3</Substitute> <!-- 35 --> <Substitute>0x10f</Substitute> <!-- 36 --> <Substitute>0x110</Substitute> <!-- 37 --> </Subtable> </Lookup> |
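The claim that glyph 0x62 should map to glyph 0xb0 can be checked by hand from the dump: the coverage table assigns each covered glyph an index, and that index selects the entry in the substitute array. Below is a small sketch of that arithmetic. It assumes the glyph ranges in the dump are hexadecimal and that coverage indices run consecutively across the three ranges; the names `coverage_index` and `substitute` are illustrative, not taken from any library.

```python
# Coverage ranges and substitutes transcribed from the dump (hex glyph IDs).
COVERAGE_RANGES = [(0x42, 0x64), (0x75, 0x75), (0x10D, 0x10E)]
SUBSTITUTES = list(range(0x90, 0xB4)) + [0x10F, 0x110]  # 36 + 2 = 38 entries

def coverage_index(glyph):
    """Return the coverage index of a glyph, or None if it is not covered."""
    index = 0
    for start, end in COVERAGE_RANGES:
        if start <= glyph <= end:
            return index + (glyph - start)
        index += end - start + 1
    return None

def substitute(glyph):
    """Apply the single substitution; uncovered glyphs pass through unchanged."""
    index = coverage_index(glyph)
    return glyph if index is None else SUBSTITUTES[index]

print(len(SUBSTITUTES))         # matches GlyphCount in the dump
print(coverage_index(0x62))     # 0x62 - 0x42
print(hex(substitute(0x62)))
```

With these assumptions the table has 38 entries, glyph 0x62 has coverage index 32, and entry 32 is 0xb0, which agrees with the reading in the message above.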
From: Arun S. <ar...@sh...> - 2003-09-05 16:18:48
|
Krishnamurthy Nagarajan wrote: > Hi Arun, > > The fundamental issue is that the glyph composition > and rendering logic is 'hardcoded' in C code of each > application, that too for a 'given' font. Agree that the logic is hardcoded in the libraries (pango and qt), without an extensible plugin. But I'm not sure that the logic is font specific. It should work with any opentype font for a given script, as far as I can see it. > A more > broad-based approach would be to provide generic X > input methods for Indian languages. Given the difficulty of getting my parents to learn the inscript keyboard, I think this is definitely important. If you read the archives, I've not been a fan of implementing logic in client side libraries or even client side fonts in general. So any approach with an X server side solution with X protocol extensions is what I like the best. But the client side does need modifications to understand Indic syllables for proper text editing. But in the short term, qt and pango are the way people are going to do Indic computing. So I thought I'd fix up a few bugs during the weekend :) > Tamil), with all the complexity in the rules (given in > a text file to the translib library) and zero C code > that is specific to any language/script, I am > convinced that > any language/script specific peculiarities (like the > reph case in Kannada, split vowel signs in Devanagari > and Tamil, multiple representations for the same input > etc etc) can be handled without any special coding at > the app level (gnome, qt or whatever). Interesting. Thanks for the pointer. Will take a look at it this week. -Arun |
From: Arun S. <ar...@sh...> - 2003-09-05 16:08:24
|
Werner LEMBERG wrote: > > I don't have time to investigate but it seems to me that some > contextual shaping is happening. If this is true it is not a problem > of FreeType 2 (which doesn't support OpenType directly for the moment) > but rather a problem one level higher where Indic scripts are handled, > probably within the Pango library. Owen? Here's what the qt guys had to say about this: > I did some tests, and it seem the problem is buried somewhere in the > open type code used by both gtk and Qt. I've personally verified that "0xcb7 0xccd" is not reordered by the shaping engines. -Arun |
From: Krishnamurthy N. <kn...@ya...> - 2003-09-05 12:07:50
|
Hi Arun, The fundamental issue is that the glyph composition and rendering logic is 'hardcoded' in C code of each application, that too for a 'given' font. A more broad-based approach would be to provide generic X input methods for Indian languages. Please take a look at the infrastructure projects under indic-computing on sourceforge that we have done/are doing (font annotation, generic transliteration library, study of various scripts and languages to develop a generic X input method framework and so on). Having developed and tested out the generic rule-based framework for four Indian languages (Hindi, Telugu, Kannada and Tamil), with all the complexity in the rules (given in a text file to the translib library) and zero C code that is specific to any language/script, I am convinced that any language/script specific peculiarities (like the reph case in Kannada, split vowel signs in Devanagari and Tamil, multiple representations for the same input etc etc) can be handled without any special coding at the app level (gnome, qt or whatever). Please visit the indic-computing projects, especially the infrastructure projects and give your comments and contribute to this base work. Thanks. cheers, Nagarajan --- Arun Sharma <ar...@sh...> wrote: > On Sun, Aug 31, 2003 at 02:17:52AM -0700, Arun > Sharma wrote: > > Issue (b) below has been fixed. (a) Still remains. A > screenshot to > demonstrate the problem with (a): > > http://www.sharma-home.net/~adsharma/misc/wrong.jpg > > Updated patch attached. Some of this may be > applicable to Telugu too. > > -Arun > > > Remaining issues: > > > > a) Halant/Virama rendering broken. This seems to > be freetype specific, > > since I see the same behavior with gnome/gedit. > > b) This comment in qscriptengine_x11.cpp: > > > > // * In Kannada and Telugu, the base consonant > cannot be > > // farther than 3 consonants from the end of the > syllable. > > > > is not strictly correct. For cases such as > "Lakshmi" - "kshmi" is one > > syllable.
Gnome handles this correctly. But I > couldn't figure out how to > > change the qt code to fix that. > > > > - if (skipped == 2 && (script == QFont::Kannada || > script == QFont::Telugu)) { > > + if (skipped == 4 && (script == QFont::Kannada || > script == QFont::Telugu)) { > > > > doesn't do it. Any patches will be highly > appreciated. > > --- > __________________________________ Do you Yahoo!? Yahoo! SiteBuilder - Free, easy-to-use web site design software http://sitebuilder.yahoo.com |
From: Arun S. <ar...@sh...> - 2003-09-02 00:34:42
|
Dr. U.B. Pavanaja wrote: >> Here are the screenshots to demonstrate the wrong and the right rendering: >> >> http://www.sharma-home.net/~adsharma/misc/wrong.jpg >> http://www.sharma-home.net/~adsharma/misc/right.jpg > > Second link is broken one. Fixed, thanks! -Arun |
From: Dr. U.B. P. <pav...@vi...> - 2003-09-01 18:41:49
|
> I see a problem with the freetype rendering of this unicode string: > > "0xcb7 0xccd" > > Using Microsoft tunga.ttf (A kannada font that ships with Windows XP). > In fact, the problem manifests itself with all other consonants too, > including 0xcb7. > > Here are the screenshots to demonstrate the wrong and the right rendering: > > http://www.sharma-home.net/~adsharma/misc/wrong.jpg > http://www.sharma-home.net/~adsharma/misc/right.jpg Second link is broken one. -Pavanaja ----------------------------------------------------- Dr. U.B. Pavanaja Editor, Vishva Kannada World's first Internet magazine in Kannada http://www.vishvakannada.com/ Note: I don't worry about pselling mixtakes |
From: Arun S. <ar...@sh...> - 2003-09-01 18:32:38
|
I see a problem with the FreeType rendering of this Unicode string:

"0xcb7 0xccd"

using Microsoft tunga.ttf (a Kannada font that ships with Windows XP). In fact, the problem manifests itself with all other consonants too, not only 0xcb7.

Here are the screenshots to demonstrate the wrong and the right rendering:

http://www.sharma-home.net/~adsharma/misc/wrong.jpg
http://www.sharma-home.net/~adsharma/misc/right.jpg

The problem is seen with both GNOME and KDE on Linux with Red Hat Beta (Severn), freetype-2.1.4-4.0.

Can one of you tell me whether this is a broken font or a bug in FreeType?

-Arun |
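For readers who want to reproduce the test string, the two code points can be identified with Python's standard `unicodedata` module. This is only a way to name the characters involved; it says nothing about whether the font or the renderer is at fault.

```python
import unicodedata

# The two code points from the report: a consonant followed by the virama (halant).
text = "\u0cb7\u0ccd"

names = [unicodedata.name(ch) for ch in text]
for ch, name in zip(text, names):
    print(f"U+{ord(ch):04X} {name}")
```

The sequence is a Kannada consonant with its inherent vowel suppressed, which matches the halant/virama rendering issue described elsewhere in the thread.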
From: Arun S. <ar...@sh...> - 2003-09-01 18:01:41
|
On Sun, Aug 31, 2003 at 02:17:52AM -0700, Arun Sharma wrote:

Issue (b) below has been fixed. (a) Still remains. A screenshot to demonstrate the problem with (a):

http://www.sharma-home.net/~adsharma/misc/wrong.jpg

Updated patch attached. Some of this may be applicable to Telugu too.

-Arun

> Remaining issues:
>
> a) Halant/Virama rendering broken. This seems to be freetype specific,
> since I see the same behavior with gnome/gedit.
> b) This comment in qscriptengine_x11.cpp:
>
> // * In Kannada and Telugu, the base consonant cannot be
> // farther than 3 consonants from the end of the syllable.
>
> is not strictly correct. For cases such as "Lakshmi" - "kshmi" is one
> syllable. Gnome handles this correctly. But I couldn't figure out how to
> change the qt code to fix that.
>
> - if (skipped == 2 && (script == QFont::Kannada || script == QFont::Telugu)) {
> + if (skipped == 4 && (script == QFont::Kannada || script == QFont::Telugu)) {
>
> doesn't do it. Any patches will be highly appreciated. |
From: Arun S. <ar...@sh...> - 2003-08-31 15:51:03
|
Arun Sharma wrote:

> // * In Kannada and Telugu, the base consonant cannot be
> // farther than 3 consonants from the end of the syllable.

Actually the comment is correct.

> is not strictly correct. For cases such as "Lakshmi" - "kshmi" is one
> syllable. Gnome handles this correctly. But I couldn't figure out how to
> change the qt code to fix that.
>
> - if (skipped == 2 && (script == QFont::Kannada || script == QFont::Telugu)) {
> + if (skipped == 4 && (script == QFont::Kannada || script == QFont::Telugu)) {
>
> doesn't do it. Any patches will be highly appreciated.

The problem is not with the above line. It's somewhere else.

-Arun |
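The "kshmi" case can be checked by counting the consonants in the cluster. The sketch below is not the Qt or GNOME shaping code; it assumes, as a simplification, that the base consonant is the first consonant of the cluster and that Kannada consonants are the code points U+0C95 through U+0CB9. The helper names are made up for this example.

```python
def is_kannada_consonant(ch):
    # Assumption for this sketch: KA (U+0C95) through HA (U+0CB9).
    return "\u0c95" <= ch <= "\u0cb9"


def consonants_in_cluster(syllable):
    """Return the consonant letters of a syllable, in logical order."""
    return [ch for ch in syllable if is_kannada_consonant(ch)]


# "kshmi" from "Lakshmi": KA, virama, SSA, virama, MA, vowel sign I.
kshmi = "\u0c95\u0ccd\u0cb7\u0ccd\u0cae\u0cbf"

cluster = consonants_in_cluster(kshmi)
# Simplifying assumption: the first consonant is the base.
consonants_after_base = len(cluster) - 1

print(len(cluster), consonants_after_base)
```

Under that assumption the cluster holds three consonants with two following the base, so the syllable stays within the limit of 3 stated in the comment however the distance is counted.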
From: Arun S. <ar...@sh...> - 2003-08-31 15:46:59
|
Dr. U.B. Pavanaja wrote:

>> c) Kannada has arkavattu and hence HasReph should be true
>
> Is there any control with the user when to choose arkavattu
> (reph) and when not to use it? Since arkavattu originally does not
> belong to Kannada, we should have the liberty to use it or not.

There isn't one right now, but I agree that there should be a knob for such things. Two things come to mind:

- environment variable
- qtconfig

-Arun |
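The environment-variable option could look roughly like the sketch below. The variable name `INDIC_KANNADA_REPH` and the function are hypothetical and exist only for this illustration; no such setting is known to exist in Qt or GNOME. The default of enabled follows the suggestion above that HasReph should be true for Kannada.

```python
import os

# Hypothetical variable name, invented for this sketch only.
REPH_ENV_VAR = "INDIC_KANNADA_REPH"


def reph_enabled(environ=os.environ, default=True):
    """Return whether arkavattu (reph) forms should be used.

    An unset variable falls back to `default`; values such as
    "0", "no", "off" or "false" switch the feature off.
    """
    value = environ.get(REPH_ENV_VAR)
    if value is None:
        return default
    return value.strip().lower() not in ("0", "no", "off", "false")


print(reph_enabled({}))
print(reph_enabled({REPH_ENV_VAR: "0"}))
```

An environment variable is easy to add but applies per process; a qtconfig-style setting would be more discoverable for end users.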
From: <kut...@ya...> - 2003-08-28 07:04:14
|
Hello Everyone,

I hope all of us will use this major step forward and start building some real Indic web-applications.

Regards,
Abhijit Dutta

----------------------------------------

Major Enhancements to the Unicode Standard: Enabling International Domain Names, Expanding Worldwide Accessibility, and Reducing the Digital Divide

Mountain View, CA, August 27, 2003 -- The Unicode® Consortium and Addison-Wesley announce publication of Version 4.0 of the Unicode Standard. Unicode is the fundamental specification for the representation of text, at the core of all modern software, programming languages, and standards, including Windows, Java, C#, Perl, XML, HTML, DB2, Oracle, and many others. Unicode is also central to the new internationalized domain names, which allow everyone in the world to have URLs in their own languages. This is yet another case where Unicode opens the door to more of the world's different cultures, helping to break down the digital divide.

Version 4.0 strengthens Unicode support for worldwide communication, software availability, and publishing. The text has been extensively rewritten, and incorporates specifications that were previously only available as separate documents. The clarified specification of conformance requirements incorporates the most highly developed character encoding model in existence, encompassing the wide variety of types of characters needed by the world's languages, and permitting compatibility with all modern computer architectures.

Record-breaking character content

Version 4.0 encodes over 96,000 characters, twice as many as Version 3.0, and includes two record-breaking collections of encoded characters. The largest encoded character collection for Chinese characters in the history of computing has doubled in size yet again to encompass over 2000 years of Chinese, Japanese, Korean, and Vietnamese literary usage, including all the main classical dictionaries of these languages. Version 4.0 also encodes the largest set of characters for mathematical and technical publishing in existence. The character repertoires of Version 4.0 and International Standard ISO/IEC 10646 are fully synchronized.

Reducing the digital divide

To meet the needs of all linguistic communities, the Unicode Standard and associated standards are continually being extended, not only in terms of the addition of characters, but also in specifying *how* those characters work, such as:

- how text sorts or matches in different languages
- how text behaves for East Asian languages (e.g. vertically) or in Middle Eastern languages (from right to left)
- how text should upper- or lowercase
- how text breaks into lines or words
- how text behaves in Regular Expressions (a key tool used in a vast number of web servers)

Small linguistic communities all over the world have the opportunity to get mainstream software working right out of the box, instead of waiting years for special adaptations that may never come.

For more information on the scripts encoded in the Unicode Standard, see http://www.unicode.org/versions/Unicode4.0.0/

Version 4.0 is published by Addison-Wesley (ISBN 0-321-18578-1), and is available from the Unicode Consortium or through the book trade. The text and code charts of Version 4.0 are also available on the Consortium's Web site www.unicode.org.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard, which specifies the representation of text in modern software products and standards. Members of the Consortium are a broad spectrum of corporations and organizations in the computer and information technology industry. Full members are: Adobe Systems, Apple Computer, Basis Technology, Government of India (Ministry of Information Technology), Government of Pakistan (National Language Authority), HP, IBM, Justsystem, Microsoft, Oracle, PeopleSoft, RLG, SAP, Sun Microsystems, and Sybase.

Membership in the Unicode Consortium is open to organizations and individuals anywhere in the world who support the Unicode Standard and wish to assist in its extension and implementation.

For additional information on Unicode, contact the Unicode Consortium, 650-693-3921

For more information on The Unicode Standard, Version 4.0, see: http://www.awprofessional.com/titles/0321185781 |