[Indic-computing-devel] Re: [Indic-computing-users] Speech technology group: Indic -Computing
From: Hema A M. <he...@la...> - 2002-10-09 11:03:32
Prof. Yegnanarayana of IIT Madras has Doordarshan News data for 8 different
Indian languages. There are between 20 and 33 bulletins in each language.
Tamil, Hindi and Telugu have been labelled using C*V. Prof. Yegnanarayana's
e-mail id is: ye...@ii.... Students and faculty at IITM have free access to
this database, and he is planning to make it available to others too. My
suggestion is that you write or talk to him about these databases.

We have been using this database to derive prosody rules for TTS and
Continuous Speech Recognition.

-hema

On Wed, 9 Oct 2002, Kalika Bali wrote:

> Please,
> it's KALIKA, not any of the other variants used.
>
> I don't know if Dr. Dash remembers, but I did mail him a month and a half
> ago asking if he was aware of any speech corpus in Indian languages that
> I could use, and he told me the same thing.
>
> I did go to the TDIL site, but they only mentioned some machine-readable
> (I presumed text) corpora being maintained by CIIL, Mysore. Nothing at all
> on speech corpora.
>
> Dr. Dash, do you know what the IBM speech recogniser is using to train
> their models?
>
> thanks,
> Kalika
>
> On Wed, 9 Oct 2002, Niladri Sekhar Dash wrote:
>
> > Dear Tapan S. Parikh,
> >
> > Thanks for your mail. Sorry I am late in replying. To my knowledge,
> > speech corpora in Indian languages are very rare. Only a few groups
> > have developed them, and these are not properly designed following the
> > design principles applied in the LOB or Swedish speech corpora.
> >
> > Some information regarding Indian speech corpora can be found in the
> > News Letter of MIT, Govt. of India at: http://tdil.mit.gov.in
> >
> > With best wishes and regards,
> >
> > Sincerely,
> >
> > Niladri
> > =======
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > % DR. NILADRI SEKHAR DASH                               %
> > % MA(CU,Ind),NLP(IITK,Ind),PhD(CU,Ind)                  %
> > % Linguist (Corpus Linguistics and Language Technology) %
> > % Consultant (TDIL: MIT, Govt. of India)                %
> > % Consultant (SCiLaHLT: ASI@IT&C, European Commission)  %
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > % Office Address:                                       %
> > % ===============                                       %
> > % Computer Vision and Pattern Recognition Unit          %
> > % Indian Statistical Institute                          %
> > % 203, Barrakpore Trunk Road                            %
> > % Kolkata 700108, West Bengal, INDIA                    %
> > % ======================================================%
> > % Telegram: STATISTICA                                  %
> > % Phone: (91)(33)578-1832/577-8085/577-2088             %
> > % Extn.: 2850/2852/2858                                 %
> > % Direct line: (91)(33)578-1832                         %
> > % Residential Phone: (91)(33)477-3337                   %
> > % FAX: (91)(33)5776680/5773035                          %
> > % Email: N.S.Dash<ni...@is...>                          %
> > % Email: N.S.Dash<nil...@ho...>                         %
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > On Sat, 5 Oct 2002, Tapan S. Parikh wrote:
> >
> > > Dr. Dash et al,
> > >
> > > Yes, that is great, welcome to the mailing list!
> > >
> > > btw I think Vijay meant Kavita Bali who attended the conference. She
> > > is working with Picopeta Simputers and is researching indic language
> > > speech corpuses (corpii?).
> > >
> > > We look forward to interacting with you all in the future!
> > >
> > > -- Tapan
> > >
> > > -- Mailing List Admin
> > > -- Indic-Computing Project
> > >
> > > On Sat, 05 Oct 2002 14:24:23 +0530
> > > Vijay Pratap Singh Aditya <vi...@ek...> wrote:
> > >
> > > > Hi,
> > > >
> > > > This is for the attention of those particularly interested in
> > > > working on speech technology for Indian languages and the
> > > > development of corpora. Let me introduce to the group Prof. Niladri
> > > > Sekhar Dash of ISI Kolkata, who is working on corpus generation
> > > > (both written and spoken) in Indian languages, language processing,
> > > > lexical resource generation and lexicography.
> > > >
> > > > This was one of the gaps identified at the workshop, where we had
> > > > only Kokila (& Ravikant) to represent linguists. I have been in
> > > > touch with Prof. Dash since the workshop, and he has offered to be
> > > > part of the group and take up some responsibility in his domain
> > > > areas, besides bringing in other colleagues from ISI to contribute
> > > > to Indic-Computing.
> > > >
> > > > I have suggested that Prof. Dash take the initiative in this area
> > > > and help develop this group. I would suggest that members
> > > > interested in corpus building get in touch with Prof. Dash. I am
> > > > including the mails exchanged with him below.
> > > >
> > > > best
> > > >
> > > > vijay
> > > >
> > > > Cc: Prof. Dash, may I request you to kindly join the Indic-Computing
> > > > mailing lists? You can go to http://indic-computing.sourceforge.net/
> > > > to subscribe.
> > > >
> > > > Niladri Sekhar Dash wrote:
> > > >
> > > > >Dear Vijay Pratap Singh Aditya,
> > > > >
> > > > >Nice to hear from you. I will be happy to join the group, and
> > > > >consider myself fortunate that I can serve with others in the team
> > > > >for the common cause of our nation.
> > > > >
> > > > >Today, I had a discussion regarding your conference minutes with my
> > > > >HoD, Prof. B.B. Chaudhuri, who lamented that he did not have prior
> > > > >information about the conference. Otherwise, he might have joined
> > > > >it. Meanwhile, he had a meeting with Pat Hall and Durgesh Rao in
> > > > >Berlin, where he got some feedback about the conference.
> > > > >
> > > > >In fact, we have been doing research in many areas of language
> > > > >technology (e.g., OCR, spell-checker, morphological processor,
> > > > >speech synthesis, information retrieval, document processing, font
> > > > >designing, MRDs, etc.) for Bangla and other languages. All these
> > > > >people, along with their work, can probably join Indic Computing to
> > > > >achieve better results. If your group likes, I can approach them to
> > > > >think about it.
> > > > >
> > > > >With best wishes,
> > > > >
> > > > >Niladri
> > > > >======
> > > > >
> > > > >On Fri, 4 Oct 2002, Vijay Pratap Singh Aditya wrote:
> > > > >
> > > > >>Dear Dr. Dash,
> > > > >>
> > > > >>Thanks for your mail and support. We would be very happy if you
> > > > >>join the Indic-Computing effort; the community which has come
> > > > >>together is looking forward to guidance from senior people like
> > > > >>you. Your support is much required, and as the areas in which you
> > > > >>say you could contribute were among the major areas of concern
> > > > >>amongst the participants at the workshop, I think people would
> > > > >>look forward to all possible help.
> > > > >>
> > > > >>Allow me to post your mail to the Indic computing mailing list. I
> > > > >>hope people who are working in localisation and developing lexical
> > > > >>resources will get back to you for your inputs on the work that
> > > > >>they are doing.
> > > > >>
> > > > >><<2. Regarding Handbook writing, I can inform you that I have
> > > > >>started writing a book on corpus generation in Indian languages.
> > > > >>It is near completion. This can be a good resource for the people
> > > > >>(old and new) related with this field.>>
> > > > >>
> > > > >>The Handbook that you are writing could also be merged with the
> > > > >>one we are already planning, so that developers and people working
> > > > >>in the area can have access to all the possible resources. This is
> > > > >>an excellent direction taken by you; we, on the other hand, are
> > > > >>trying to begin in this direction for the technical handbook.
> > > > >>
> > > > >><<3. If needed I can probably initiate efforts for generation of
> > > > >>speech corpus for Indian languages which can be further
> > > > >>transliterated, transcribed and annotated following a national
> > > > >>standard - to be designed by the experts. This area is still
> > > > >>unexplored and needs strong initiation.>>
> > > > >>
> > > > >>I think this would be perfect. I think we can get all the people
> > > > >>working in this area to organise themselves. Could I also suggest
> > > > >>a small workshop of developers and experts in this area, like the
> > > > >>one we suggested for OTF?
> > > > >>
> > > > >><<4. Regarding coordination and interaction with linguists
> > > > >>communities, I can ask some of my teachers, colleagues and
> > > > >>students to join in the team to work together.>>
> > > > >>
> > > > >>I think if you can coordinate this effort and if we can form a
> > > > >>team, it would be wonderful. I would in fact request you to join
> > > > >>the Indic computing list and offer your help. I shall also post
> > > > >>(subject to your confirmation) your mail to the list.
> > > > >>
> > > > >><<Kindly, let me know your views on these issues. Thanks.
> > > > >>
> > > > >>With best wishes,>>
> > > > >>
> > > > >>I think we have a lot of value to add in terms of standardization
> > > > >>and developing resources for enabling computing. Your help would
> > > > >>be very useful. I look forward to your confirmation so that I can
> > > > >>introduce your mail to the group and then start the process of
> > > > >>group formation in this critical area.
> > > > >>
> > > > >>Regards
> > > > >>
> > > > >>vijay
> > > > >>
> > > > >>Niladri Sekhar Dash wrote:
> > > > >>
> > > > >>>Dear Vijay Pratap Singh Aditya,
> > > > >>>
> > > > >>>Thanks for your mail. Let me inform you that I went through the
> > > > >>>"Action points Indic Computing Workshop September 2002" and found
> > > > >>>some areas where I can participate. These are as follows:
> > > > >>>
> > > > >>>1. My area of work is corpus (both written and spoken) generation
> > > > >>>in Indian languages, language processing, lexical resource
> > > > >>>generation and lexicography. I find some of these to be useful
> > > > >>>for the Indic Computing group.
> > > > >>>
> > > > >>>2. Regarding Handbook writing, I can inform you that I have
> > > > >>>started writing a book on corpus generation in Indian languages.
> > > > >>>It is near completion. This can be a good resource for the people
> > > > >>>(old and new) related with this field.
> > > > >>>
> > > > >>>3. If needed I can probably initiate efforts for generation of
> > > > >>>speech corpus for Indian languages which can be further
> > > > >>>transliterated, transcribed and annotated following a national
> > > > >>>standard - to be designed by the experts. This area is still
> > > > >>>unexplored and needs strong initiation.
> > > > >>>
> > > > >>>4. Regarding coordination and interaction with linguists
> > > > >>>communities, I can ask some of my teachers, colleagues and
> > > > >>>students to join in the team to work together.
> > > > >>>
> > > > >>>Kindly, let me know your views on these issues. Thanks.
> > > > >>>
> > > > >>>With best wishes,
> > > > >>>
> > > > >>>Niladri,
> > > > >>>==========