indic-computing-users Mailing List for The Indic-Computing Project (Page 23)
Status: Alpha
Brought to you by:
jkoshy
From: FN <fr...@by...> - 2003-02-22 19:01:17
|
---------- Forwarded message ----------

Hi Hasin Raj and all the list members,

I am helping to promote the 'Asian Open Source Centre' - www.asiaosc.org

This site aims to form contacts between Asian open source people to encourage collaboration on things like localization, producing open fonts, etc.

I have added your site to the information about Bangladesh, at the collaborative 'wiki': http://www.asiaosc.org/enwiki/page/Bangladesh.html

Currently, the Wiki doesn't have much information about open source activities / contacts within Bangladesh. If you have time, it would be great if you could register and then add more information to the Bangladesh page.

Thanks

Imran William Smith, Malaysia
---
www.mimos.my - Mimos Berhad
www.asiaosc.org - Asian Open Source Centre

hasin_raj wrote:
>Guys,
>
>I have uploaded my site today at www.banglaosc.cjb.net . It is a site
>relating to the Bangla open source community. I have uploaded some special
>source code to it.
>
>Currently available with source code:
>Anti Redlof Source Code with Real Time Scan
>bangla Menu ActiveX
>SysDlgShow ActiveX
>TransCTL ActiveX
>
>Soon I will upload English to Bangla Translator and Standard Emailer
>without SMTP.
>
>Stay tuned
>Take a look at my site http://www.banglaosc.cjb.net. Please send me
>comments. This site is still under construction but you can download
>easily
|
From: Tapan S. P. <ta...@ya...> - 2003-02-22 15:57:38
|
Begin forwarded message:

Date: Sat, 22 Feb 2003 21:55:55 +0800
From: Will Smith <wi...@wi...>
To: byt...@ya...
Subject: Re: [bytesforall_readers] Bengali Open Source Community

Hi Hasin Raj and all the list members,

I am helping to promote the 'Asian Open Source Centre' - www.asiaosc.org

This site aims to form contacts between Asian open source people to encourage collaboration on things like localization, producing open fonts, etc.

I have added your site to the information about Bangladesh, at the collaborative 'wiki': http://www.asiaosc.org/enwiki/page/Bangladesh.html

Currently, the Wiki doesn't have much information about open source activities / contacts within Bangladesh. If you have time, it would be great if you could register and then add more information to the Bangladesh page.

Thanks

Imran William Smith, Malaysia
---
www.mimos.my - Mimos Berhad
www.asiaosc.org - Asian Open Source Centre

hasin_raj wrote:
>Guys,
>
>I have uploaded my site today at www.banglaosc.cjb.net . It is a site
>relating to the Bangla open source community. I have uploaded some special
>source code to it.
>
>Currently available with source code:
>Anti Redlof Source Code with Real Time Scan
>bangla Menu ActiveX
>SysDlgShow ActiveX
>TransCTL ActiveX
>
>Soon I will upload English to Bangla Translator and Standard Emailer
>without SMTP.
>
>Stay tuned
>Take a look at my site http://www.banglaosc.cjb.net. Please send me
>comments. This site is still under construction but you can download
>easily
|
From: Vijay P. S. A. <vi...@ek...> - 2003-02-22 11:43:40
|
Hi!

There are a few updates to share with the community since the last announcement on the OTF workshop. We have been trying to resolve some of the problems, concerns and issues raised by members of the list, which are valid and required action. There has been an off-line debate on these topics; some issues were resolved and some support has come about, hence this announcement.

Some members have also recommended that all debate on workshop issues take place on the lists only. So here it is for open debate among all members. There is also a strong appeal to members to take part, particularly those who wish to develop open source tools for OTF development, developers of tools and applications, and those who suggest alternate standards in view of the limitations of Unicode for Indian languages, as discussed at the Indic-Computing workshop in September 2002 and many times since on the Indic mailing lists.

It has been suggested that this debate be limited to the "users" list so that it does not clutter the "devel" and "standards" lists. This time, however, it is being posted on all lists so that all interested members can join the users list (if they are not already subscribed there).

Some of the major issues raised by members are:

1) Why OTF: open standards vs. proprietary tools for development, or simply the debate "How open is OTF"
2) Unicode/OTF: how relevant for Indian languages, concerns and alternatives
3) Registration fee issue
4) Travel support & honorarium for speakers
5) Sponsorship

On the sponsorship front we have some good news: we have received some sponsorship that lets us offer some concessions to participants. SARAI, New Delhi (thanks to Ravikant) has agreed to cover the cost of travel of the speakers up to a total of Rs. 25,000 (III AC, train only; this might not cover all speakers, so those who can afford to travel at their own cost should please do so).

We are still waiting for confirmation from CHIPS, who have agreed to sponsor logistics and food, but we can assume that it will come, as has been assured by Aman Grewal (thanks to him) [if it does not, an alternative would be suggested #]

ANNOUNCEMENTS:

In view of the new developments and the concerns of the members we are happy to announce:

1) The workshop is renamed "Indic-Font workshop", and its agenda will include the issues raised by Koshy, Hemamurthy, Nagarajan & others, with a debate on the relevance of OTF for Indian languages and on alternatives. The workshop would functionally focus on OTF related issues and at the same time bring up other issues and alternatives as well. If there are not enough volunteers to take up and present relevant topics such as open-source rendering technologies like Graphite, or how Omega handles typesetting of complex scripts, then this would be an OpenType-only workshop and the existing agenda would stand. (Once confirmed, these sessions could be added to the existing agenda.)

The existing topics besides OTF that are already part of the agenda are:

a. Sunil Abraham, Mahiti, Bangalore, would arrange for an IPR (cyber laws) lawyer, Lawrence Liang, who would discuss the issue of OTF openness after understanding the Adobe & Microsoft agreement on release of the fonts.
b. Development of OTF from TTF (Akruti) & other fonts experiences: Karunakar, Nagarjuna, Ravi Pande, Raj Kumar, Sayamindu???
c. Open/free software tools for OTF development: M Arun ?????

Suggested topics (VOLUNTEERS, SPEAKERS & SESSION LEADERS NEEDED; please also suggest additional relevant topics):

a. Alternate standards best for Indian languages, RKK ??? - Hemamurthy
b. Generic transliteration rules framework for Indian languages (input sequence & in context with display rendering) / Nagarajan
c. Open-source rendering technologies like Graphite
d. How Omega handles typesetting of complex scripts
e. Nepal case study; Nepali Font Standardisation Project: Amar Gurung???
f. Any more ?????????

We need a volunteer here to coordinate the open source sessions along with the OTF sessions which Dr. Pavanaja is coordinating.

2) We are dropping the registration fee announced earlier and will have a nominal registration fee of, say, Rs. 200.00, which participants can pay at the time of registration. The amount collected would go towards logistics, CD, reading material, etc.

3) It is agreed that speakers would voluntarily share their experiences; they shall not be provided any honorarium, in view of the concern raised by members.

4) All participants would have to travel at their own cost.

# If CHIPS support does not come, we would request all participants to pay for their food and mattress/pillow rent (if they are staying in hostels). Those who are making their own staying arrangements are welcome to do so. We shall arrange for food in the mess on confirmation, and participants can pay directly.

Lastly, we hope that members will come forward to volunteer for the sessions. If the workshop dates need to be extended to accommodate some of the open source topics (after confirmation of speakers), that could also be done. Till then the present agenda and dates hold.

We have already started getting applications from participants; those who are yet to submit can please write to me at <vi...@ek...>. We are also in the process of putting up an online application form, which shall be announced shortly.

We hope to move ahead from this workshop towards regional language focused workshops soon, and we request all members to keep thinking about them as well.

regards
vijay

--
Vijay Pratap Singh Aditya
ekgaon technologies
email: vi...@ek...
website: http://www.ekgaon.com

Revised workshop Agenda:
---------------------------------

Indic-Computing Consortium announces
-------------------------------------
First National Indic-Font Workshop

Date: 28th to 30th March 2003
Venue: PESIT, Bangalore

Sponsors:
PESIT, Bangalore
Vishwa Kannada, Bangalore
DeepRoot Linux Pvt. Ltd, Bangalore
Chattisgarh Infotech Promotion Society, Raipur
Sarai, New Delhi
Indlinux.org, Mumbai
ekgaon technologies pvt. ltd, Madurai

Indic-Computing Consortium:

The Indic-Computing Consortium is an initiative of software developers, businesses and academic institutions to help evolve appropriate standards, resources and technologies for the Indic-Computing community. It is designed as a national-level participatory organisation that serves as a common forum for discussion, information exchange and advocacy on behalf of all parties interested in the development of Indian Language Computing. The consortium aims to make true access to computing possible for Indian people by enabling support in local languages.

A framework is being built for the creation of a hierarchy of participatory consortia, which would facilitate broad regional and local participation in the standardization and development process by a variety of stakeholders with differing areas of expertise and specialization. The aim is for these consortia to be participatory and inclusive, so that they properly represent the viewpoint of local developers, users and other stakeholders.

In step two, the Indic-Computing Consortium would encourage & support the formation of state-level consortia for each regional language, which could include participants from the following key member groups:

- Developers: Software developers and managers developing local-language tools
- Technologists: Academics and other experts in encoding and representation issues
- Users / Practitioners: Government agencies, publishers, NGOs and other major users of local-language software
- Linguistic Groups: Academics and other experts on the linguistic features of a language and its script

Working closely with the State Government, this state-level consortium would serve as the representative body for deciding standards and other technical questions for computing in a given regional language. The major roles to be carried out by the state-level consortium would be as follows:

- Discuss various technical, linguistic and practical issues related to computing in the regional language
- Serve as a capacity-building and educational resource for small regional software developers and users
- Publish documents, tools & other materials helpful for local-language computing and development
- Represent the regional language at National consortium meetings
- Represent the regional language at International Standardization consortiums and proceedings such as Unicode and ISO

In September 2002 Indic-Computing organised its first National workshop, which aimed at identifying the various problems faced by the developer communities and the issues related to standardization, technical support, policy and tools. To know more about the consortium, workshops and other initiatives, visit us at http://www.indic-computing.sourceforge.net

The workshop:

One of the working groups formed at the first Indic-Computing workshop was for the development of OTF & issues related to language standardization and representation in international consortia. One of the action points on the group's agenda was to hold an OTF training workshop for developing OTF fonts for the major Indian languages. Dr. U B Pavanaja undertook to hold and coordinate this workshop, Mr. Abhas Abhinav proposed to coordinate logistics & sponsorship, and Mr. G. Nagarjuna proposed to help coordinate with Akruti to make fonts available for use in developing OTF. This group succeeded in its tasks, making this workshop possible: Akruti released some free fonts to be used for conversion to OTF, which were taken up by some of the language groups (for more details on the language groups and the major issues they deal with, please go through the proceedings of the first workshop at our website).

Some of the major concerns raised by the language community were:

i. Developing good-looking fonts
ii. Development of open source tools for rendering and hinting of OTF fonts (currently OTF development uses proprietary tools)
iii. Finding font developers for all Indian languages and coordinating the group
iv. Making fonts available to be converted to OTF

This workshop seeks to address some of these issues and the others listed in the workshop program below. This is the first national workshop on the subject. While it shall focus on OTF, it shall equally share the concerns of the open standards community and would have talks, demos and training sessions on open source tools, as well as on various alternative technologies for Indian fonts.

We propose to take up more regional language font workshops in future to give focused attention to each language. These workshops would be held in different parts of India and would include training programs and technology demonstrations. We invite volunteers who would undertake to hold these workshops and provide coordination & logistics support. The Indic-Computing Consortium would provide the necessary technical support and capacity building to these regional groups.

Why OpenType?
OpenType is an extension to TrueType and uses Unicode as the standard for character encoding. It provides additional tables for defining a rich set of mappings between characters and glyphs, and allows for a large glyph set and even glyph variants. Applications can make use of all the features of the OpenType format through an application-independent, preferably system-level, library with an API. For Indic script processing, OpenType tables like GSUB (glyph substitution) and GPOS (glyph positioning) let the font designer define rules for which conjuncts or combinations are made available. The application programmer is relieved of the burden of knowing all the linguistic details. OpenType also more or less makes the concept of a glyph standard or font encoding standard redundant, giving font vendors the freedom to follow their own glyph sets without really affecting the application. To summarize, OpenType provides a lot of benefits to Indic computing and renders redundant some of the issues faced in it.

There has been an ongoing debate on whether OTF is right for Indian languages. The debate is relevant and contextual. One of the perspectives, penned by G Karunakar in support of OpenType, is here for participants to explore. However, the debate goes on and we invite all to participate in it on the Indic-computing mailing list. Please go through the attachment why_otf.txt for more on OTF fonts.

Who can participate & pre-requisites:

- Any developer/company/organization with an interest in language technology and interested in learning the development of OTF, understanding Unicode, various technologies for Indian languages and related issues.
- Pre-requisites:
  - Understand how to make a font
  - Knowledge of Unicode
  - Have a font that it/he/she can use for lab time
  - A willingness to keep trying until it/he/she understands
- To get comprehensive information on the pre-requisites mentioned above please go through the following links:
  - Creating and supporting OpenType fonts for Indic scripts http://www.microsoft.com/typography/otfntdev/indicot/default.htm
  - Building OTF http://www.microsoft.com/typography/otfntdev/intro.htm
  - Details about VOLT http://www.microsoft.com/typography/developers/volt/default.htm
  - Unicode FAQ about Indic http://www.unicode.org/faq/indic.html
  - Unicode code charts http://www.unicode.org/charts/

(More links would be added subsequently on various development work on open source font technologies)

How to participate:

i. The workshop is by registration only; the last date of registration is 20th March 2003.
ii. Participants interested in the workshop should send their applications to Vijay Pratap Singh Aditya <vi...@ek...>. An online application form would also be made available shortly.
iii. Application format:
  - Name
  - Organisation
  - Communication Address
  - Whether participating in an individual capacity or representing your organization
  - In either case please write in 200 words your interests and what you expect from the workshop
  - If there is more than one participant from your organization/group please provide the numbers and communication address only.
iv. The registration fee of Rs. 200.00 is to be paid in cash by the participants on the first day.
v. Participants please note that all registered members would get literature, a CD with important font development software, and lodging and boarding for three days (also dinner for the night before the workshop starts).

Contacts:

Participants are advised to contact the following for their queries:

Dr. U B Pavanaja: <pav...@vi...>
Technical clarification, sessions of the workshops, required preparation, fonts etc

Mr. Abhas Abinav: <ab...@de...>
Logistics, Lodging & Boarding, Venue etc

Mr. Vijay Pratap Singh Aditya: <vi...@ek...>
Workshop registration application, coordination, any other issue not covered above

Instructions for Participants:

1) The participants are advised to go through the pre-requisites and equip themselves with the necessary knowledge of fonts & the Unicode standard.
2) The staying arrangements are from the evening before the day the workshop starts. Dinner would also be provided that night, only for participants reaching before 9.00 PM.
3) The lodging & boarding provided by PESIT is in hostels & mess, and is modest by all means. Participants who wish to opt out of this arrangement can make their own staying arrangements. Please notify us of the same. It is also expected that lunch would be taken at the mess by all the participants, irrespective of their place of stay, as there is not much time available between the sessions.
4) The staying arrangement in the hostel is till the evening of the 3rd day; it is expected that the participants would vacate the rooms by evening.
5) PESIT is on the outskirts of Bangalore; participants are expected to make their own arrangements for local travel, as nothing could be provided by the organizers.

Venue & how to reach:

PESIT (PES Institute of Technology)
100 Feet Ring Road, Banashankri IIIrd Stage, (Off. Mysore Road)
Bangalore
Phone: (080) 672 0007

For participants arriving at the Airport: take a prepaid taxi to the above address; the place is known to the prepaid stand.

Directions from Airport:
Airport Road--->Domlur signal--->Inner ring road--->Koramangala--->BTM Layout--->Banneraghatta Road--->100 Feet (outer) ring road--->PESIT on 100 Feet (outer) Ring Road

For participants arriving at the Railway station & Bus station: it is best to take a prepaid taxi to the above address; the place is known to the prepaid stand.
A Bus is also available.

Directions from Railway Station:
Station--->Chamrajpate--->Ashram--->Hanumant Nagar--->Hosakerehalli--->PESIT 100 Feet Ring Road

Workshop program: (to be revised after confirmation of more speakers)
-------------------------

Day-1:
09:00 - 09:45 Registration
09:45 - 10:15 Welcome & Inauguration - J Koshy / Director PESIT
10:15 - 10:45 Overview - Pavanaja
10:45 - 11:00 Tea break
11:00 - 11:30 Planning glyph repertoire - Pavanaja
11:30 - 13:00 Introduction to Indian scripts, Character set, tools & Glyph Design & current trends in fonts - Ravi Pande
13:00 - 14:00 Lunch break
14:00 - 15:00 Testing glyphs, Generating Fonts, Font format - Ravi Pande
15:00 - 15:15 Tea break
15:15 - 17:15 Lab: Define glyphs and fill out repertoire
17:15 - Tea & informal interaction

Day-2:
09:30 - 10:15 Encoding - J Koshy
10:15 - 10:45 Introduction to OTF - Pavanaja
10:45 - 11:00 Tea break
11:00 - 11:30 Open Type Tables - Pavanaja
11:30 - 13:00 Introduction to VOLT - Pavanaja
13:00 - 14:00 Lunch break
14:00 Lab: Open type tables

Day-3:
09:30 - 10:15 Testing - Pavanaja
10:15 - 10:45 Hinting - Pavanaja
10:45 - 11:00 Tea break
11:00 - 11:30 Digitally signing the font - Pavanaja
11:30 - 13:00 Fonts on Linux - Karunakar
13:00 - 14:00 Lunch break
14:00 Lab: Open type tables

Special presentations proposed:

a. Sunil Abraham, Mahiti, Bangalore, would arrange for an IPR (cyber laws) lawyer, Lawrence Liang, who would discuss the issue of OTF openness after understanding the Adobe & Microsoft agreement on release of the fonts.
b. Development of OTF from TTF (Akruti) & other fonts experiences: Karunakar, Nagarjuna, Ravi Pande, Raj Kumar, Sayamindu???
c. Open/free software tools for OTF development: M Arun ?????
|
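The "Why OpenType?" part of the announcement above says that GSUB lets the font designer, rather than the application, decide which conjuncts are formed. The following is only a toy sketch of that idea in Python, not the real OpenType table format or any shaping library's API: the glyph names (`ka`, `virama`, `k_ssa`, ...), the `CMAP` and `LIGATURES` tables and the `shape` function are all hypothetical and chosen for illustration.

```python
# Toy model of GSUB-style ligature substitution for a few Devanagari letters.
# Assumption: glyph names and rule tables are made up for this sketch; a real
# font stores these rules in binary GSUB lookups read by a shaping library.

# Character-to-glyph mapping (the role played by a font's cmap table).
CMAP = {
    "\u0915": "ka",      # DEVANAGARI LETTER KA
    "\u0937": "ssa",     # DEVANAGARI LETTER SSA
    "\u0924": "ta",      # DEVANAGARI LETTER TA
    "\u0930": "ra",      # DEVANAGARI LETTER RA
    "\u094D": "virama",  # DEVANAGARI SIGN VIRAMA
}

# Font-supplied substitution rules: glyph sequence -> conjunct glyph.
LIGATURES = {
    ("ka", "virama", "ssa"): "k_ssa",
    ("ta", "virama", "ra"): "t_ra",
}


def to_glyphs(text):
    """Map each character to its default glyph name."""
    return [CMAP[ch] for ch in text]


def apply_ligatures(glyphs, rules=LIGATURES):
    """Replace glyph runs with ligature glyphs, trying longer rules first."""
    longest = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(glyphs):
        for n in range(longest, 1, -1):
            key = tuple(glyphs[i:i + n])
            if key in rules:
                out.append(rules[key])
                i += n
                break
        else:
            # No rule matched at this position: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


def shape(text):
    """Application-side call: no script knowledge needed, only the font's rules."""
    return apply_ligatures(to_glyphs(text))


if __name__ == "__main__":
    print(shape("\u0915\u094D\u0937"))  # ['k_ssa']
```

The point of the sketch is the division of labour: `shape` contains no linguistic knowledge, so a font vendor could change `LIGATURES` (or use a different glyph set entirely) without the application changing.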
From: Vijay P. S. A. <vi...@ek...> - 2003-02-22 11:43:37
|
Hi,

An interesting application for the workshop, worth sharing to explore common ideas...

vijay

--
Vijay Pratap Singh Aditya
ekgaon technologies
email: vi...@ek...
website: http://www.ekgaon.com

-------- Original Message --------
Subject: Application for participation to the First national OTF Workshop
Date: Fri, 21 Feb 2003 16:31:32 +0545
From: "Amar Gurung" <ama...@in...>
To: <vi...@ek...>

Name: Amar Gurung (Director), Nepali Font Standardization Project
Organization: Madan Puraskar Pustakalaya [MPP]
Email: ama...@in... (www.mpp.org.np)
Participation: Representing the organization

What we expect from the workshop

The Madan Puraskar Pustakalaya is a non-profit institution of 45 years standing, a respected repository of Nepali language and literature. It manages the largest archive in the Nepali language, and supports the use of Nepali for the social, economic and cultural advancement of all Nepalis.

The Pustakalaya, in accordance with this goal, has been involved in a project for Nepali Font Standardization, which means making the many Nepali fonts compatible by deciding on one coding standard. In its first (and current) phase, the Pustakalaya project has achieved the following:

1. A Unicode-compatible Nepali font and corresponding keyboard driver.
2. A customizable sorting utility for the Nepali language.
3. A typing tutor software to be distributed freely.
4. A Nepali font converter utility.

For full details of the project please refer to our website.

The completion of the first phase of work by the Madan Puraskar Pustakalaya lays the groundwork for the extensive usage of computing in the Nepali language. The standardization exercise will generate social dividends when software developers are spurred to develop applications in the vast area that opens up, from governance to administration, social and development work, accounting, communications, education and so on.

However, we need to keep up with the times and learn from other similar experiences. We believe the workshop you are organizing is the right platform for us to learn new things as well as share our experience with others. Therefore I request you to accept our application to participate in the workshop. A colleague from the technical team (Pawan Chitrakar) and I would like to attend your workshop.

Hope to hear from you soon regarding the above.

Thank you,
Amar Gurung
|
From: Sayamindu D. <unm...@So...> - 2003-02-19 17:46:22
|
-----Forwarded Message-----
From: Greg Ferguson <gf...@sg...>
To: la <ann...@en...>, cc...@tr..., mo...@su..., ve...@ta...
Subject: updates (Linux-Complete-Backup-and-Recovery-HOWTO, Secure-CVS-Pserver, Tamil-Linux-HOWTO)
Date: 18 Feb 2003 16:54:48 -0500

[....snipped]

Tamil Linux HOWTO
V. Venkataramanan <ve...@ta...>
v1.0 2003-02-14
Document will help set up a working Tamil Linux environment.
* NEW http://tldp.org/HOWTO/Tamil-Linux-HOWTO/

______________________________________
for more info look at http://tldp.org/

--
Sayamindu Dasgupta [ http://www.peacefulaction.org/sayamindu/ ]
=========================================
Speak out on social and cultural issues at PeacefulAction.Org
http://www.peacefulaction.org
*****************************************
Due to circumstances beyond your control, you are master of your fate and captain of your soul.
|
From: Swapnil K. <swa...@ya...> - 2003-02-19 17:20:56
|
Note: forwarded message attached. |
From: Guntupalli K. <kar...@fr...> - 2003-02-17 11:53:30
|
Forwarding this mail to the Indic computing list, where OCR issues are more relevant.

Karunakar

Begin forwarded message:

Date: Sat, 15 Feb 2003 09:48:10 -0800 (PST)
From: Swapnil Khedekar <swa...@ya...>
To: ind...@li...
Subject: Re: [Indlinux-group] ocr help

Hi,

I am Swapnil Khedekar, working as a Research Assistant at CEDAR, Buffalo NY. I saw you referencing our research center on this list. Currently we are working on developing new page-segmentation techniques & character separation & recognition techniques for Devanagari script. Surya and I are working on developing a truthing application for training Devanagari OCR systems. It will be made public in March in Hyderabad.

Apart from us, there is a lot of work going on in this area under RMK Sinha at IIT Kanpur & under BB Chaudhari at the Indian Statistical Institute, Kolkata. Many small projects are also going on in some universities & companies.

One commercial Devanagari OCR is called "Chitrankan", by C-DAC. I worked on this project when I was in Pune. It was our effort to make it as professional & advanced as Finereader.

I think more work is needed on recognition algorithms, designing a new set of classifiers which can be extended to other similar Indian scripts. Also, very little work has been done on Devanagari handwriting recognition.

That's all I have to say. I am regularly reading this list. It is really good to see you all working. I have little knowledge of Linux internals. But still, if I can be of any use to you, it will be my pleasure.

Swapnil

--- Guntupalli Karunakar <kar...@fr...> wrote:
> On Thu, 13 Feb 2003 11:19:15 +0000 (GMT)
> SWAPNIL HAJARE <dre...@ya...> wrote:
>
> > Hallo all,
> > I am trying to develop an ocr for devanagari. I want
> > to use "pattern matching" for it. I've heard that IITM
> > have used the same for malayalam. Can anyone plz help in
> > this matter.
>
> These provide some info on OCR for devanagari.
>
> http://tdil.mit.gov.in/humis/ach-humis.htm
> http://www.cedar.buffalo.edu/ILT/
>
> HTH
>
> Regards,
> Karunakar

--
Hating people is like burning down your house to get rid of a rat - Anon
---------------------------------------------------
* Indian Linux project, www.indlinux.org *
* Indic-Computing project, indic-computing.sf.net *
---------------------------------------------------
|
From: Sunil A. <su...@ma...> - 2003-02-17 09:58:21
|
Dear All,

> Isa Seow <is...@ap...> informs us that the Asia-Pacific Development
> Information Programme (APDIP) has launched the International Open Source
> Network (IOSN), which will serve as a Centre of Excellence on Open Source
> technologies and applications.

> For more information, please see http://www.apdip.net/iosn/default.asp

According to netcraft.com
http://uptime.netcraft.com/up/graph/?mode_u=off&mode_w=on&site=http%3A%2F%2Fwww.apdip.net&submit=Examine

Operating System and Web Server for www.apdip.net
The site www.apdip.net is running Microsoft-IIS/5.0 on Windows 2000.

Another case of 'blind leading the blind'. The consolation is that they are moving in the right direction. So who will bell this cat?

Thanks,

Sunil

--
Sunil Abraham, CEO
MAHITI Infotech Pvt. Ltd.
'Reducing the cost and complexity of ICTs'
314/1, 7th Cross, Domlur
Bangalore - 560 071 Karnataka, INDIA
Ph/Fax: +91 80 4150580. Mobile: 98441 01150
su...@ma... http://www.mahiti.org
|
From: FN <fr...@by...> - 2003-02-17 06:44:04
|
Open Source Network
-------------------

Isa Seow <is...@ap...> informs us that the Asia-Pacific Development Information Programme (APDIP) has launched the International Open Source Network (IOSN), which will serve as a Centre of Excellence on Open Source technologies and applications. It will aid countries in sharing information on Open Source (OS), assist with the development of needed toolkits and resource materials, support "localization" efforts and, generally, help facilitate and co-ordinate OS programmes and initiatives through networking.

For more information, please see http://www.apdip.net/iosn/default.asp

--
Frederick
Freelance Journo, Goa India
http://linuxinindia.pitas.com
http://www.bytesforall.org
http://opennews.indianissues.org
--
Frederick Noronha * Freelance Journalist * Goa * India
832.409490 / 409783
Writing with a difference... on what makes *the* difference
|
From: Tapan S. P. <ta...@ya...> - 2003-02-16 20:54:10
|
Stumbled across this piece of history from the early days of 8-bit ASCII. See, they were thinking about us even then! (I don't know why I am procrastinating on Google Groups instead of doing my work... ;) Date: 18 Nov 1982 1603-PST From: Pierre MacKay <MACKAY at WASHINGTON> Subject: Range of ASCII, alias ISO 646-1973 To: LES at SU-AI cc: Furuta at WASHINGTON, Binding at WASHINGTON, Your 8 bit ASCII message of 10 Nov 1982, found its way to me by a somewhat roundabout route, since I am not on the WorkS list, and, given the size of my mail file as it is, I am hesitant to get there. You underestimate the range of even 7-bit ASCII. In conjunction with the appropriate escape sequences from ISO 2022-1973, alias (for all practical purposes) ANSI X3.41-1974, the good old 7-bit table speaks several languages. For instance: Greek---ISO 5428-1980 (I haven't actually seen this yet). Japanese---National standard C6220-1969 (katakana only, of course, and this, in the form JISCII, is a true 8-bit code, with ASCII residing in columns 0..7 and katakana in columns 10..13). Russian---GOST 13052-67, a dreadful aberration set up for the use of SO and SI coding, with the Cyrillic alphabet scrambled to match the visually similar Latin letters. Why even a Commissar would want to do that to his own language is beyond me, but it is AUTHORITATIVE, under the circumstances. The Arabic case is chaos. There is no reason why a good, efficient Arabic script coding table cannot be included in a 7-bit range. I am working with one now, but it is rather my own invention. It resembles some of the work done by ISO TC-46 and similar work done at the Library of Congress. There was a fine suggestion put forward at Riyadh, Saudi Arabia, about two and a half years ago, but it came to nothing, and a dreadful Moroccan notion, cobbled up out of a set of linotype matrices now has a certain currency, in that it has been registered, whatever that means, as Number 59, dated June 1, 1982 with ISO. 
It includes 4 ISO 2022 escape sequences to identify G0, G1, G2, and G3 graphic sets, but does not say what is to be done with all these alternatives. ECMA has plunged into the same waters with an entirely different proposal, which may even be worse. They all seem to assume that all Arabic ligature forms must be shown in the coding table, rather as if Don Knuth's TeX were to require the elimination of the open and close brace character positions so that you could code the double-f ligatures directly. The implications of microprocessor technology have not yet got through. Urdu, Pashto and Sindhi would probably overload a 7-bit table, since you are really dealing with two incompatible alphabets mashed into one in those cases. Malay and Chinese-Turkish (as seen on the lower right corner of PRC banknotes) will fit. Persian, of course will fit easily, as will Ottoman Turkish, a language for which I have a bizarre atavistic affection. Western Europe and Hungary have national versions of ISO 646 to account for heavily used diacriticals. I don't know about Czech, which is a bit overloaded. Modern Turkish is a nice problem too. I believe the Sanskrit-derived Indian languages would fit, and the Tamil family would certainly fit in a 7-bit table. Chinese, and Japanese Kanji would not. The Japanese use a manageable subset of Chinese ideographs, and have already established a multi-bit code. One proposal for Chinese uses the 94 cells available in the Graphic area of ISO 646 in a three level code. There are 94 books of 94 pages each of 94 characters each, or 94 to the third power possible characters. That should suffice even for Chinese. --Pierre MacKay |
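The arithmetic behind the three-level scheme mentioned at the end of MacKay's message can be checked with a short sketch. Only the figure of 94 graphic cells and the three levels ("books", "pages", "characters") come from the message; the function names and the 0-based numbering below are assumptions made for illustration and are not part of ISO 646 or ISO 2022.

```python
# Illustrative sketch only: names and 0-based numbering are assumptions,
# not taken from ISO 646 / ISO 2022.
CELLS = 94    # graphic cells per level, as stated in the message
LEVELS = 3    # "books" of "pages" of "characters"


def capacity(cells: int = CELLS, levels: int = LEVELS) -> int:
    """Number of distinct characters addressable with `levels` cells."""
    return cells ** levels


def to_book_page_char(index: int) -> tuple:
    """Split a 0-based character index into (book, page, char)."""
    if not 0 <= index < capacity():
        raise ValueError("index outside the three-level code space")
    book, rest = divmod(index, CELLS * CELLS)
    page, char = divmod(rest, CELLS)
    return book, page, char


def from_book_page_char(book: int, page: int, char: int) -> int:
    """Inverse of to_book_page_char."""
    return (book * CELLS + page) * CELLS + char


if __name__ == "__main__":
    print(capacity())                 # 94 to the third power
    print(to_book_page_char(12345))   # which book, page and cell
```

Each of the three numbers stays within 0..93, so each fits in one of the 94 graphic positions of a 7-bit table.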
From: Niranjan R. <nir...@ma...> - 2003-02-15 11:09:04
|
Sunil Abraham wrote: >>Translation of software packages is time consuming and it would >>be better if we don't tie the release of the distribution to such > > > I agree. This can be a huge effort. Let us aim for small victories at > the start. Hopefully we should be able to contribute 'Hindi' and > 'Kannada' Plone.org / Zope.org within the deadline. Right. Translation of software packages is not the proposed target of this project. The suggestion was for translation of the introductory materials, and possibly some books and articles which explain Free and Open Source concepts and ideas. These translations would be needed and are a good starting point irrespective of this particular project. Perhaps someone is already translating, or has already translated, the FSF India introductory materials, books by RMS, or articles, books, speeches, presentations etc. by relevant persons into Indian languages, as well as interviews with relevant people. Some suggestions: Siva Vaidhyanathan <http://homepages.nyu.edu/~sv24/Bio.html> review of his books - The Anarchist in the Library <http://homepages.nyu.edu/~sv24/> - Copyrights and Copywrongs: The Rise of Intellectual Property and How It Threatens Creativity (New York: New York University Press, 2001) Rishab Ghosh of FLOSS study: http://www.infonomics.nl/FLOSS/report/index.htm Lawrence Lessig <http://lessig.org> Eben Moglen <http://emoglen.law.columbia.edu/> Michael Hart of Project Gutenberg http://promo.net/pg/ RMS, Linus Torvalds, Bruce Perens etc. etc. |
From: Sunil A. <su...@ma...> - 2003-02-15 10:28:44
|
> Translation of software packages is time consuming and it would > be better if we don't tie the release of the distribution to such I agree. This can be a huge effort. Let us aim for small victories at the start. Hopefully we should be able to contribute 'Hindi' and 'Kannada' Plone.org / Zope.org within the deadline. Thanks, Sunil On Sun, 2003-02-16 at 00:11, Meyarivan wrote: > > > > > IMHO, I think someone would be doing India a *great* favour if they could > > translate GNUWin into *any* Indian language. But who would that *someone* > > be? Of course, this is apart from Ravikant's ideas, which could also be > > gone ahead with if Sarai sees potential for it. > > > > It is probably a good idea to start with the following: > > * Software package compilation > > either original compilation or borrowed from > OpenCD, GNUWin etc > > the time required to test the software is > considerable .. > > > * Documentation > > starting with simple documentation on the > purpose of each software in at least 2-3 > Indian languages (on paper) > > > Translation of software packages is time consuming and it would > be better if we don't tie the release of the distribution to such > efforts and continue with what's available right now. As the translated > versions become available (through efforts from indic-computing, sarai..) > we can incorporate them. > > > -- Sunil Abraham, CEO MAHITI Infotech Pvt. Ltd. 'Reducing the cost and complexity of ICTs' 314/1, 7th Cross, Domlur Bangalore - 560 071 Karnataka, INDIA Ph/Fax: +91 80 4150580. Mobile: 98441 01150 su...@ma... http://www.mahiti.org |
From: Meyarivan <ma...@sa...> - 2003-02-15 09:23:34
|
> > IMHO, I think someone would be doing India a *great* favour if they could > translate GNUWin into *any* Indian language. But who would that *someone* > be? Of course, this is apart from Ravikant's ideas, which could also be > gone ahead with if Sarai sees potential for it. > It is probably a good idea to start with the following: * Software package compilation either original compilation or borrowed from OpenCD, GNUWin etc the time required to test the software is considerable .. * Documentation starting with simple documentation on the purpose of each software in at least 2-3 Indian languages (on paper) Translation of software packages is time consuming and it would be better if we don't tie the release of the distribution to such efforts and continue with what's available right now. As the translated versions become available (through efforts from indic-computing, sarai..) we can incorporate them. |
From: FN <fr...@by...> - 2003-02-15 05:21:46
|
Very interesting. Could anyone on these lists take it forward? FN On Fri, 14 Feb 2003, Bill Kendrick wrote: > > Hi there! I saw your comment on GNUWinII about translations to Indian > languages. > > I'm interested in having my Open Source application (which happens to be > part of GNUWinII, as well), "Tux Paint" translated to Indian languages, > if possible. > > http://www.newbreedsoftware.com/tuxpaint/ > > > Do you know anyone who could help? > > So far, it's been translated into almost 25 other languages, > including Japanese, Chinese, Korean and Greek. (They've used UTF-8 encoding > so far. I think Unicode shouldn't be impossible.) > > > Thanks in advance! > > -- Freelance Journo, Goa India http://linuxinindia.pitas.com http://www.bytesforall.org http://opennews.indianissues.org -- Frederick Noronha * Freelance Journalist * Goa * India 832.409490 / 409783 Writing with a difference... on what makes *the* difference |
From: FN <fr...@by...> - 2003-02-15 03:56:30
|
It's great that someone is starting somewhere! Over the past year, I've found Sarai to be useful partners to work with when it came to promoting/supporting Free Software. Incidentally, Niranjan sent me a set of 10 CDs (varied distros, including GNUWin). I went through GNUWin and was absolutely amazed by the range of software in that collection. Also, it does not overlook the 'ethical' and 'philosophical' side of Free Software, in favour of the technical advantages. IMHO, I think someone would be doing India a *great* favour if they could translate GNUWin into *any* Indian language. But who would that *someone* be? Of course, this is apart from Ravikant's ideas, which could also be gone ahead with if Sarai sees potential for it. But translating GNUWin (even if only into Hindi or Hindi/Tamil for a start) would be a great initiative. Wonder if someone from the IndicComputing network could also help? There's a fair bit of translating work to be done there. FN On Fri, 14 Feb 2003, ravikant wrote: > > Dear Niranjan, > > Sarai will be happy to take up the project, independent of funding, etc. We > feel we have the infrastructure and the network to make and distribute 5,000 > cds to vendors and users, primarily in and around Delhi, but other areas as > well. This does not exclude the possibility of any other group or individual > taking up the project and applying for the funding. The cds will be given out > free in both senses of the term. > > What we may need is some voluntary contributions from translators in a couple > of South Indian languages, say Tamil and Kannada. We have the resources to > translate manuals, etc. in Hindi and Bangla. Alternatively, Sarai takes up > the distribution responsibilities for the north and somebody else, say FSF, > for the South? > > I agree with niranjan that we need to move as swiftly as we can. We are > looking to a deadline in April to complete the selection of packages, burning > cds and translating manuals. 
> > Feedback, suggestions and voluntary contributions are welcome. > > cheers > ravikant > www.sarai.net > > > > >> Now what is the next step. Who will be the Manager for this effort > > >> ? (Some one should take the responsibility of follow-up with other > > >> or the effort will start dying). > > > > I hope by this time you have had someone to take care of or manage this > > project. If not then we can safely forget about it, but if yes then a > > faster and more vigorous activity is needed. > > > > I have actually talked to some possible donors, who are quite positive > > to the idea, and could possibly contribute between 1 and 2 lakh Rupees. > > > > So if this project is to be seen as promoting FLOSS as well as FSF, then > > act fast. > > > > Since I am in Helsinki, and see my part of the deal as basically getting > > some funding for you guys, please do not wait for my responses and go > > ahead with whatever organizational method you have to realise this project. > > > > As said earlier, if you would like to pursue this, do make a preliminary > > budget and a plan of action, describing in detail who will do what, and > > how much will be spent on what. Once I have that paper, I will come back > > with comments to make any changes if necessary and talk to the donors. > > > > FN wrote: > > > Going a little off-track, could such support be sought also/instead > > > for Prof Nagarajuna's long-awaited FLOSS-for-engineering-students > > > distro? (This is a distro of useful FLOSS tools to be widely > > > replicated and shared among engineering students in South Asia. You > > > could imagine the impact this could have on the shaping of the > > > thinking of a future generation). FN > > > > The donors can also be convinced about this > > FLOSS-for-engineering-students project, if it is presented properly. 
> > > > However I would still think the first one is an easy start, and if you > > manage to make enough money with it, you can then invest that money > > further on the new project. If this project is successful, I could even > > think of involving you guys (of course against a reasonable payment) in > > a similar project which we are considering here in Finland. > > > > Just a note: here in Scandinavia, people are usually very thorough about > > projects. If you do make a proposal, you have to try to think of all > > possible questions and answer those *before someone asks for an answer*. > > So make no presumptions, and clarify everything on paper. > > > > regards and good luck > > > > niranjan > > -- Freelance Journo, Goa India http://linuxinindia.pitas.com http://www.bytesforall.org http://opennews.indianissues.org -- Frederick Noronha * Freelance Journalist * Goa * India 832.409490 / 409783 Writing with a difference... on what makes *the* difference |
From: FN <fr...@by...> - 2003-02-15 03:55:32
|
------------------------------ Date: Thu, 13 Feb 2003 14:39:12 +0530 From: baiju m <ba...@ep...> Subject: [fsug-calicut] Mozilla 1.3 Beta Released Hi, Mozilla 1.3 Beta Released : http://www.mozilla.org/releases/mozilla1.3b/ Now almost all Malayalam webpages can be viewed just by copying the fonts to the $HOME/.fonts directory (and running fc-cache once). When I tested this with the alpha release, Manorama showed some problems; I had to select Latin-1 encoding from the View menu. All other websites, like keralakaumudi.com, deshabhimani.com, weblokam.com, deepika.com and mangalam.com, rendered properly. Again, mathrubhumi's display is not perfect. I have created some free fonts with the font-encodings of these fonts; they will be available from http://www.keralaindustry.org/malayalam (wait for a few days). If you try to view deepika.com and mangalam.com concurrently using their karthika fonts, it will again cause problems. In fact these two are different encodings of the same font family, so the free fonts will incorporate the font-encodings of both in one font, so that you can view both sites using the same font. Regards, Baiju M _____________________________________________________________________ |
From: Krishnamurthy N. <kn...@ya...> - 2003-02-13 11:04:20
|
Hi all, My comments on OpenType * This being an extension of TrueType, M$ and Adobe hold patents on the format. So, though the format is published, it's not open. It may be possible, in that case, to develop open source tools to create/edit OpenType font files (such as extending pfaedit), but then M$ and Adobe always have the last say and may have hidden extensions that only their tools would be able to handle. * The rendering algorithms right now are embedded in commercial tools such as M$'s Inscribe. So, they may even patent the rendering algorithms, thus blocking any open source rendering tools. * Now about the great advantages of the GSUB, GPOS tables - something fundamental is being missed out by all developers here : display rendering is closely tied with (keyboard) input and since Indian languages are phonetic, the mapping of input to appropriate choice of glyphs, their relative Moreover, the contextual positioning of various glyphs, especially the mathras and dependent vowel signs (more so with 'split vowel signs') is very much dependent on the input context of relevant letters. Subtle contexts such as when a 'm' should be mapped to an anuswara can't be specified with these GSUB table rules at all. Though the documentation of GSUB tables says "The text-processing client uses the GSUB data to manage glyph substitution actions", it's quite a bit of burden on 'each' of the applications to figure out what glyph substitution to use when! Of the six types of glyph substitution, the one that is touted as the most powerful, "contextual substitution", works on the context of surrounding glyphs and not the input context! And that's not what one looks for. All in all, every app has to bother about looking up the GSUB and GPOS tables to figure out various substitutions. So, no common library that will do the job (like pango/gtk). 
By contrast, the generic transliteration rules framework for Indian languages that I developed and presented in the Sep 2002 Indic-computing workshop intelligently ties together the input sequence & context with display rendering, with very sophisticated facilities for context specification (both input and display), glyph reordering and so on. The C library, which is independent of any of the input languages/scripts, is available on SourceForge, along with sample rule files for four of the Indian languages (Hindi, Telugu, Tamil and Kannada). About this upcoming OTF workshop/seminar : it's not free and not related to open source projects. So, why is it being advertised on the indic-computing list, which is for open source development? cheers, Nagarajan |
From: Vijay P. S. A. <vi...@ek...> - 2003-02-12 06:10:20
|
Hi, I came across a recent article (although not this one) about the work on "Vaachak", a text-to-speech software for Indian languages launched by this company. Their website is www.prologixsoft.com and they are based in Lucknow. Their contact email id is co...@pr.... The software is also supported on Linux. I have tested their online speech files (.wav files) and it's really amazing. Might help the speech group people. best vijay ------------------ When your PC converses with you By Royden D'Souza, Times News Network The Times of India, New Delhi, India, December 7, 2002 Imagine villagers walking up to a PCO, dialling a number and having their e-mail read out to them in their language; visually challenged people surfing the Internet, and having web-page content read out to them in their mother-tongue; users phoning up a directory assistance, speaking in any of the 18 official Indian languages, or having your child taught daily by his/her favourite soft-toy. Vaachak, a revolutionary text to speech software, created by Prologix Software Solutions Pvt Ltd of Lucknow, is the first ever software to translate Indian text in 18 Indian languages and even Romanised Hindi to clear, quality speech. Vibhu Agarwal, president of Prologix, is obviously excited with Microsoft India expected to incorporate Vaachak in its latest version of Internet Explorer. “The software’s synthesiser translates some 18 official Indian languages into clear electronic utterances in the desired Indian language, using the Indian Standard Code for Information Interchange (ISCII). What the user hears is clear and pleasant speech in a male or female voice." "So, from having your e-mail read out to you through a sort of an e-Dak Seva, to having web-page content read out, to making your presentations more spiffy, Vaachak opens up an amazing world of options,” said Vibhu. 
“Vaachak has a key role to play in e-governance, too, by enabling the transmission of important information and creating an administrative link with the inner-most reaches of a state, through the huge telephone network across India.” There are about 38 million telephone connections in the country, but only about 2 million Internet connections. The government has so far sought to take the information revolution to rural India through IT Kiosks. However, this has been based on the premise that villagers are literate, know English, and have the time and money to access the Internet at such kiosks. |
From: FN <fr...@by...> - 2003-02-11 07:36:51
|
Some feedback from the Tamil computing world! FN ---------- Forwarded message ---------- Vanakkam Aravindh! Of course I have no doubt about Padhami's quality. I went to Higginbothams one day, and saw it on display. It is really cool.........and I am definitely going to buy it, but not at this moment. I do not know about other Tamil word processors, but definitely this is a cool one, no doubt. Previously I was able to see a download version on your site, but the downloadable file link is not there nowadays. Anbudan, krupa --- In tam...@ya..., "aravindh@c..." <aravindh@c...> wrote: > Anbulla Kirupa! > > In my opinion Padhami is the best choice. Try a free trial version for > 15 days. The cost is not much. > > Think it over and write back to me if you need a trial version of > Padhami (free) > > Anbudan > Aravindan > > > > Original Message: > ----------------- > From: S. Krupa Shankar shankarkrupa@y... > Date: Wed, 05 Feb 2003 17:44:39 -0000 > To: tam...@ya... > Subject: [tamil-programers] Tamil word processing! > > > Vanakkam! > > Is there any Tamil software that checks sandhi pizhai (of course not > any of those padhami, kamban, etc., rather something free)... > > If not, could anyone of you tell me about a link to a good tamizh > site that teaches the Sandhi part (otru migudhal) of tamizh grammar? > > Nandri. > > Anbudan, > krupa |
From: Keyur S. <key...@ya...> - 2003-02-11 05:14:04
|
--- FN <fr...@by...> wrote: > Could anyone help my friend Nitya Jacob in finding suitable translation > facilities? What is the state of machine translation work being > undertaken > by IIIT-Hyderabad, if one recalls right? FN There are a few attempts at machine translation in India. Visit the following link for more details. http://www.tdil.mit.gov.in/mat/ach-mat.htm - Keyur |
From: FN <fr...@by...> - 2003-02-10 18:48:52
|
Could anyone help my friend Nitya Jacob in finding suitable translation facilities? What is the state of machine translation work being undertaken by IIIT-Hyderabad, if one recalls right? FN PS: Nitya, must apologise for the hurried end to the phone call. I was at the police station paying a fine for wrong parking... ;-) > Anyway, we need information on 2 things - > 1. Who are the people in the country working on translation software and > Indian language fonts that are either affordable or cheap? OneWorld is > willing to put some money behind open source/OGL work so that it becomes > available to the community provided there are people doing this. We are > looking at a few languages, maybe 6, to begin with. Pls send me a list of > people or organisations you think are credible and worth approaching. > > 2. Using the fonts they develop or something else that is immediately > available, we will be translating material from Itrainonline and other > sources that is of use to Indian organisations. This will feature on the > OW support centre, at the moment empty. > > Given this, we need to know about translators immediately! > > Thanks > > Nitya Jacob > Regional coordinator > OneWorld South Asia > 1st floor > C1/22 Safdarjung Development Area > New Delhi - 110016 > India. > t: +91-11-26612008, 26532430 > f: extn 30 > http://www.oneworld.net/southasia -- Freelance Journo, Goa India http://linuxinindia.pitas.com http://www.bytesforall.org http://opennews.indianissues.org -- Frederick Noronha * Freelance Journalist * Goa * India 832.409490 / 409783 Writing with a difference... on what makes *the* difference |
From: FN <fr...@by...> - 2003-02-10 09:25:58
|
Just fyi... ---------- Forwarded message ---------- Microsoft's first Hindi IT lab launched in Uttaranchal (6th Feb., 03) Dehradun: Veteran Congress leader and CM Shri ND Tewari of the newly born state of Uttaranchal has taken the lead over other CMs by cashing in on Bill Gates's visit last year, when Tewariji had the privilege of meeting the richest man on the globe at a breakfast meeting in Delhi. As Gates promised the CM that Microsoft would have a presence in the state soon, 6th February, 03 became an important day for the IT community of the state, when Microsoft launched two important projects here. The CM inaugurated the first Hindi Lab of the country, and project Shiksha was also launched here on Thursday; both projects are a JV of the Uttaranchal Govt. and Microsoft. The MOU regarding establishing the first IT Hindi lab in Doon Valley was signed between the CM and Group Manager of Microsoft Shri Shailnder Kumar. Hopefully, this will bring an IT-savvy image for Mr. CM as well as this Himalayan state. By Anil Jaggi Dehradun |
From: Vijay P. S. A. <vi...@ek...> - 2003-02-10 04:25:44
|
> > >Today's Topics: > > 1. why_OTF : Can txt file be in txt format. (jitendra) > >Why OpenType? > >Background >---------------- > >Any language is written using a script. A script (a writing system) is a collection of graphical shapes/symbols that has evolved over time; each usually represents a distinct sound/idea/thing. These basic shapes are also called letters/alphabets/characters; 'character' is the term widely used in computer terminology. The graphical shape of a character is, in digital typography terms, called a glyph, and a font is a collection of glyphs with a similar style. Glyph data is either a bitmap of the shape or a set of points drawing the outline of the glyph. > >The collection of characters for a script is called the character set. To use the script on computers, which do all processing in bits and bytes, each character is assigned a numeric code (called a code point or character code), giving us a character encoding. The codes are usually 7-bit, 8-bit or 16-bit, and are usually decided upon by the countries where the language is mainly used and accepted by standards bodies like ISO, Unicode or BIS (Indian). For some characters in Devanagari: क ख ग घ अ आ इ ई > >Issues with TrueType >---------------------------- > >A TrueType font stores font data in tables, one of which is the CMAP table (character to glyph mapping). For 8-bit fonts this table is 8-bit and therefore can hold a maximum of 256 glyphs. Of these 256 slots, only about 190-200 are actually available for glyphs, as the rest are occupied by control codes. > >This is a problem for Indic scripts because, apart from the basic character set (around 60-100 unique characters, i.e. consonants, vowels, vowel matras, punctuation/marks etc.), they have a large number of consonant-vowel combinations and conjuncts (2 or more consonants combining), usually 200-1000 or more. E.g. for Devanagari we have > >35 consonants and 16 vowels, so theoretically we would have >35*35 + 35*16 = 1225 + 560 = 1785 > >1785 glyphs cannot be put in an 8-bit TTF font. 
This number can be reduced by studying the roots of the script and making a few compromises on the conjuncts to be used. An obvious solution is to break down the script elements into glyph parts, which can be combined to give the required shapes. > >One evident simplification is that glyphs for consonant-vowel combinations are not needed; they can be built from just a basic set of consonants and vowel matras (so 35+16+16 = 67). Most consonant conjuncts are half forms, so by just having half form. > > i) Large glyph set needs to be reduced to 190-200 > ii) A mapping function for converting characters to glyphs is needed. > Here we will have 1-to-1, 1-to-many, many-to-1 mappings. > iii) Appropriate rendering features to give proper positioning. > >With an 8-bit TTF, after making a few compromises in the number of conjuncts, the glyph set can be brought into the 190-200 range. But now glyphs are not accessed by character codes but by glyph codes, i.e. a font encoding in which such-and-such a glyph is given such-and-such a code, which can vary from font to font. The complexity therefore increases in step (ii). > >Step (i) is the domain of the font designer, but (ii) & (iii) have to be done by the application developer. He has to write a library or API to take care of (ii) & (iii). This library will be used wherever there is a need for script processing & display. It could be used universally in many applications if the font encoding were fixed; many fonts would then work with the same library. > But unfortunately the current situation is that there is no standardized font encoding, with font vendors having different encodings & application developers adopting different approaches to script processing. > >Step (i) could be avoided by having a 16-bit table for character to glyph mapping. Then there is scope for a large glyph set. This is what a Unicode font gives. It provides a basic range for the script and a private area in which one can put one's own stuff. 
But there is still no standard access mechanism that could simplify (ii) & (iii). Script processing logic & language details still have to be known by the application developer, and much of the font-specific detail has to be hard coded into the application or the library, as said above. > >Also, even if (i)-(iii) could be achieved in some way, it doesn't relieve the programmer of the burden of knowing language processing details, nor does it ease the job of the font designer or give him the flexibility to show his creativity. There is also an interdependence between font designer & programmer: with Unicode fonts the job of the font designer can be made a little easier, but not enough to make him & the programmer independent. > >All this can be made easy if we have >- A font access mechanism which depends not on the font encoding but on the character encoding >- Provision for a large glyph set >- Some way to keep mapping information within the font. >- Script processing available as a library with a simple API to access such fonts & do text rendering. > >And this is all that the OpenType format provides, and more. OpenType is an extension to TrueType, and uses Unicode as the standard for character encoding. It also provides additional tables for defining a rich set of mappings between characters and glyphs. It also provides for a large glyph set and even glyph variants. All the features provided by the OpenType format can be made use of by having an application-independent, preferably system-level library with an API usable by applications. > >For Indic script processing, OpenType tables like GSUB (glyph substitution) and GPOS (glyph positioning) let the font designer define his own rules on which conjuncts or combinations are made available. The application programmer is relieved of the burden of knowing all the linguistic details. OpenType also more or less makes the concept of a glyph standard or font encoding standard redundant, again giving font vendors the freedom to follow their own glyph sets without really affecting the application. 
>To summarize, OpenType provides a lot of benefits to Indic computing and also renders redundant some issues faced in Indic computing.
|
From: Andy W. <And...@bt...> - 2003-02-09 21:18:48
|
Jitendra wrote:
> Nice to hear about the OTF meet in Bangalore.
> The attachment (in the announcement of the OTF meet) of Karunakar's
> 'why_otf.txt' was not legible. Can the same be put up once
> again. Jitendra
>

Here you are (see below).

Andy

Why OpenType?

Background
----------------
Any language is written using a script. A script, or writing system, is a collection of graphical shapes/symbols that evolved over time, each usually representing a distinct sound, idea or thing. These basic shapes are also called letters, alphabets or characters; 'character' is the term widely used in computer terminology. The graphical shape of a character is, in digital typography terms, called a glyph, and a font is a collection of glyphs with a similar style. Glyph data is either a bitmap of the shape or a set of points that draw the outline of the glyph.

The collection of characters for a script is called the character set. To use the script on computers, which do all processing in bits and bytes, each character is assigned a numeric code (called a code point or character code), giving us a character encoding. The codes are usually 7-bit, 8-bit or 16-bit, and are usually decided upon by the countries where the language is mainly used and accepted by standards bodies like ISO, the Unicode Consortium or BIS (India).

For some characters in Devanagari:
? ? ? ? ? ? ? ?

Issues with TrueType
----------------------------
A TrueType font stores font data in tables, one of which is the CMAP table (character-to-glyph mapping). For 8-bit fonts this table is 8-bit and therefore can hold a maximum of 256 glyphs. Of these 256 slots only about 190-200 are actually available for glyphs, as the rest are occupied by control codes.
This is a problem for Indic scripts: apart from the basic character set (around 60-100 unique characters, i.e. consonants, vowels, vowel matras, punctuation/marks etc.), they have a large number of consonant-vowel combinations and conjuncts (2 or more consonants combining), usually 200-1000 or more.
E.g. for Devanagari we have 35 consonants and 16 vowels, so theoretically we would have
35*35 + 35*16 = 1225 + 560 = 1785
glyphs (two-consonant conjuncts plus consonant-vowel combinations). 1785 glyphs cannot be put in an 8-bit TTF font. This number can be reduced by studying the roots of the script and making a few compromises on the conjuncts to be used. An obvious solution is to break down the script elements into glyph parts, which can be combined to give the required shapes.

One evident simplification is that glyphs for consonant-vowel combinations are not needed; they can be composed from a basic set of consonants and vowel matras (so 35+16+16 = 67). Most consonant conjuncts use half forms, so having just the half forms of the consonants covers most conjuncts.

This leaves three steps:
 i) Large glyph set needs to be reduced to 190-200
 ii) A mapping function for converting characters to glyphs is needed.
     Here we will have 1-to-1, 1-to-many, many-to-1 mappings.
 iii) Appropriate rendering features to give proper positioning.

With an 8-bit TTF, after making a few compromises in the number of conjuncts, the glyph set can be brought into the 190-200 range. But now glyphs are accessed not by character codes but by glyph codes, or font encoding (such-and-such glyph is given such-and-such code), which can vary from font to font. This increases the complexity of step (ii).

Step (i) is the domain of the font designer, but (ii) and (iii) have to be done by the application developer, who has to write a library or API to take care of them. This library will be used wherever there is a need for script processing and display. It could be used universally across many applications if the font encoding were fixed, so that many fonts would work with the same library.
But unfortunately the current situation is that there is no standardized font encoding: font vendors have different encodings, and application developers adopt different approaches to script processing.

Step (i) could be avoided by having a 16-bit table for character-to-glyph mapping, which leaves scope for a large glyph set. This is what a Unicode font gives.
It provides the basic range for the script and a private use area in which one can put one's own glyphs. But there is still no standard access mechanism that could simplify (ii) and (iii). Script processing logic and language details still have to be known by the application developer, and much of the font-specific detail still has to be hard-coded into the application or the library, as said above.

Also, even if (i)-(iii) could be achieved in some way, it doesn't relieve the programmer of the burden of knowing language processing details, nor does it ease the job of the font designer or give him the flexibility to show his creativity. There is also an interdependence between font designer and programmer: with Unicode fonts the font designer's job becomes a little easier, but not enough to make him and the programmer independent.

All this can be made easy if we have
- A font access mechanism which depends not on the font encoding but on the character encoding
- Provision for a large glyph set
- Some way to keep mapping information within the font
- Script processing available as a library with a simple API to access such fonts and do text rendering

And this is all that the OpenType format provides, and more. OpenType is an extension to TrueType, and uses Unicode as the standard for character encoding. It provides additional tables for defining a rich set of mappings between characters and glyphs. It also provides for a large glyph set and even glyph variants. All the features of the OpenType format can be used through an application-independent, preferably system-level, library with an API usable by applications.

For Indic script processing, OpenType tables like GSUB (glyph substitution) and GPOS (glyph positioning) let the font designer define his own rules for which conjuncts or combinations are made available. The application programmer is relieved of the burden of knowing all the linguistic part.
Also, OpenType more or less makes the concept of a glyph standard or font encoding standard redundant, again giving font vendors the freedom to follow their own glyph sets without really affecting the application. To summarize, OpenType provides a lot of benefits to Indic computing and also renders redundant some issues faced in Indic computing. |
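As a rough illustration of the why_otf.txt text above, here is a minimal Python sketch of the glyph-count arithmetic and of the kind of character-to-glyph mapping described in step (ii). The counts 35 and 16 come from the essay (note that 35+16+16 works out to 67). Everything else is an assumption made for illustration: the glyph names, the rule table and the function names are hypothetical, and the greedy longest-match lookup is only a stand-in for what a real GSUB-driven shaping library does. Positioning and reordering (step iii) are not attempted.

```python
# Counts quoted in the essay for Devanagari.
CONSONANTS = 35
VOWELS = 16
MATRAS = 16        # assumption: one dependent vowel sign (matra) per vowel
SLOTS_8BIT = 256   # size of an 8-bit character-to-glyph table


def naive_glyph_count(consonants: int, vowels: int) -> int:
    """One precomposed glyph per two-consonant conjunct and per consonant-vowel pair."""
    return consonants * consonants + consonants * vowels


def decomposed_glyph_count(consonants: int, vowels: int, matras: int) -> int:
    """Base set only: consonants, independent vowels and matras."""
    return consonants + vowels + matras


# Hypothetical mapping rules keyed by Unicode character sequences.
# Glyph names are made up; a real font defines its own in its tables.
KA, SSA, VIRAMA, SIGN_O = "\u0915", "\u0937", "\u094d", "\u094b"
RULES = {
    (KA, VIRAMA, SSA): ["ksha"],           # many-to-1: conjunct ligature
    (KA, VIRAMA): ["ka_half"],             # many-to-1: half form
    (KA,): ["ka"],                         # 1-to-1
    (SSA,): ["ssa"],                       # 1-to-1
    (SIGN_O,): ["aa_matra", "e_matra"],    # 1-to-many: matra built from parts
}


def to_glyphs(text: str, rules=RULES) -> list:
    """Map characters to glyph names, always trying the longest rule first."""
    longest = max(len(key) for key in rules)
    glyphs = []
    i = 0
    while i < len(text):
        for n in range(min(longest, len(text) - i), 0, -1):
            key = tuple(text[i:i + n])
            if key in rules:
                glyphs.extend(rules[key])
                i += n
                break
        else:
            # No rule matched this character: emit the missing-glyph name.
            glyphs.append(".notdef")
            i += 1
    return glyphs
```

With these toy rules, `naive_glyph_count(CONSONANTS, VOWELS)` exceeds `SLOTS_8BIT` while `decomposed_glyph_count(CONSONANTS, VOWELS, MATRAS)` fits comfortably, and `to_glyphs(KA + VIRAMA + SSA)` picks the ligature rule ahead of the half-form rule. The point of the essay is that with OpenType such rules live inside the font rather than in application code like this.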
From: jitendra <jit...@vs...> - 2003-02-09 07:15:45
|
Nice to hear about the OTF meet in Bangalore. The attachment (in the announcement of the OTF meet) of Karunakar's 'why_otf.txt' was not legible. Can the same be put up once again? Jitendra |