From: Guntupalli K. <kar...@in...> - 2006-04-24 10:58:33
Some suggestions by Vivek Rai.

Begin forwarded message:

Date: Mon, 24 Apr 2006 11:25:35 +0100
From: "Vivek Rai" <viv...@gm...>
To: ind...@li...
Subject: Re: [Indlinux-hindi] Google Summer of Code

> On tuesday evening (25th April) we will have an IRC meet to discuss
> more about this. Timing 9pm-11pm.
>
> Comments, suggestions welcome,
>

In case I am not able to attend (timezone differences, though I will try my best), I can think of the following ideas:

1) Open source OCR (optical character recognition) for Indian languages on Linux

If we get this working, people and organizations will be able to quickly generate a huge amount of online Hindi content by scanning printed matter. I had met some people at Kashi Nagari Prachini sabha, and they were scanning printed Hindi books as images.

I was wondering if someone could have a look at gocr (http://jocr.sourceforge.net/index.html) or ocrad (http://www.gnu.org/software/ocrad/ocrad.html) and see how we can add Devanagari character recognition to them. I don't know whether these already support Hindi (the ocrad examples show UTF-8 support in Japanese). If they do, it would be a matter of testing them with diverse content (e.g. scanned documents in some popular Hindi typefaces) and tweaking them as needed.

This could sound like a very complex task, but remember, we are NOT INVENTING OCR FROM SCRATCH. The library is already there; it might be a matter of defining some character attributes, font mappings etc. for Devanagari.

There is apparently some Hindi OCR software distributed along with the C-DAC CD. I don't know many details.

2) Some mechanism to ease translations of OpenOffice/Mozilla

Have some utility scripts which take .po files as input and generate an OpenOffice or Mozilla Firefox language pack. The idea is to make translation (and its maintenance) easier for applications that don't use the common po file format.
There are already tools such as oo2po, po2oo, moz2po and po2moz for these conversions; we just need to build wrappers around them so that they become easy for translators to use.

3) Enable pango support in at least one "lightweight" WM (such as IceWM)

For machines with low RAM, running KDE/GNOME from a live CD is not a feasible option. We need at least one lightweight WM which supports pango. (Note: xfce also needs more RAM, like KDE/GNOME.)

--
*************************************
* Work: http://www.indlinux.org     *
* Blog: http://cartoonsoft.com/blog *
*************************************
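
For idea 2 above, a wrapper around po2oo and po2moz might start out roughly like the sketch below. It only builds the command line and does not run anything. The function names, the target names "openoffice" and "mozilla", and the file names in the usage note are made up for illustration, and the -i/-o/-t/-l flags are an assumption that should be checked against the --help output of the installed tools.

```python
import shlex

# Converter commands named in the message above (po2oo, po2moz).
# The keys are hypothetical labels chosen for this sketch.
CONVERTERS = {"openoffice": "po2oo", "mozilla": "po2moz"}


def build_command(target, po_dir, template, lang, output):
    """Return the argument list for one PO-to-language-pack conversion."""
    if target not in CONVERTERS:
        raise ValueError(
            f"unknown target {target!r}; expected one of {sorted(CONVERTERS)}"
        )
    # Assumed flags: -i input, -o output, -t template, -l language/locale.
    # Verify them with `po2oo --help` / `po2moz --help` before relying on this.
    return [CONVERTERS[target], "-i", po_dir, "-o", output,
            "-t", template, "-l", lang]


def describe(target, po_dir, template, lang, output):
    """Return the same command as one shell-quoted string, for a dry run."""
    return shlex.join(build_command(target, po_dir, template, lang, output))
```

A translator-facing script could print describe("openoffice", "hi", "en-US.sdf", "hi", "hi.sdf") first as a dry run, and only then hand the list from build_command to subprocess.run. Keeping the command construction separate from execution makes the wrapper easy to test without the tools installed.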