From: Rajesh R. <raj...@ya...> - 2012-10-26 04:39:03
Dear Ratlami jee,

Awesome work Ratlami jee! Thanks a lot!

>________________________________
>From: "Ravishankar Shrivastava (रवि-रतलामी)" <rav...@gm...>
>To: Localization <ind...@li...>
>Sent: Thursday, October 25, 2012 6:14 PM
>Subject: [Indlinux-hindi] About Hindi spellcheck wordlist
>
>Hi,
>Now that most FLOSS applications support Hindi spell-check, either
>through add-ins (LibreOffice, Firefox etc.) or through aspell/hunspell
>(Gedit etc.), a comprehensive collection of correct Hindi words is
>essential.
>
>The wordlists currently available are not adequate: they contain many
>wrong words and are therefore almost useless for providing a usable
>spellcheck on any document.
>
>In the recent past, there was a proposal to financially support a
>project to collect 2,00,000 manually checked Hindi words and prepare
>plugins etc., but that project was held up for one reason or another.
>
>Recently, during Hindi week, a Hindi word list of nearly 1,60,000 words
>(Beta version, with about 90% correct words; download link:
>http://raviratlami.blogspot.in/2012/10/blog-post.html) was released for
>use in FLOSS Hindi spellcheck. For this wordlist, linguist Shri Arvind
>Kumar (author of Hindi's one and only thesaurus) generously contributed
>his error-free collection of Hindi wordlist data without charging any
>fee. But this word list alone is not sufficient in many ways, as Hindi
>words can have many forms. For example, the forms of मुस्कान (smile) are
>listed below (and the list is still incomplete):
>

Please say many, many thanks (bahut-bahut shukriya) to Arvind jee! The Open Source community will always be grateful to Arvind jee!

>मुस्करा
>मुस्कराएगी
>मुस्कराकर
>मुस्कराता
>मुस्कराती
>मुस्कराते
>मुस्कराना
>मुस्कराने
>मुस्कराया
>मुस्कराहट
>मुस्काया
>मुस्काए
>मुस्काता
>मुस्काती
>मुस्कान
>मुस्काना
>मुस्कानि
>मुस्कुरा
>मुस्कुराए
>मुस्कुराकर
>मुस्कुराता
>मुस्कुराते
>मुस्कुराने
>मुस्कुराहट
>

I can only :-)...but for the time being it is a great contribution!

>Further, a while ago, we had initiated a web-based effort (distributing
>chunks of the word list to some individuals through Google Docs) to
>manually spellcheck the collected wordlist. But that initiative also
>failed.
>
>So the only option now remaining is this: if we can get some kind of
>financial support, we can manually spellcheck the collected data and
>add the various forms of Hindi words by hiring commercial Hindi
>translators and reviewers. There is a group called the Hindi-Translator
>group (http://groups.google.com/group/hindianuvaadak/topics) and,
>according to the group, the minimum rate for such work is Re. 1.00 per
>word. So, if we can raise this kind of fund (about 2 lakh) through
>sponsorship or through some projects etc., then maybe we can collect a
>comprehensive, correct Hindi wordlist (about 2 lakh most-used words) for
>use in our spellcheck programs, and this data can then be further used
>for standardization of Hindi spellcheck (using only standard Hindi words).
>
>I am writing this since some of my friends have shown keen interest in
>contributing to raising funds for this long-pending work.
>
>And I am writing this in my tooti-footi (broken) English to reach a
>wider audience, so forgive me :)
>

I am not afraid of tooti-footi (broken) English :-)...you have already taken care of keeping Hindi correct :-)

Regards,
Rajesh