CMU Sphinx / Forums / Sphinx4 Help: To transcribe lectures( Eng-Indian Accent)

vijay k - 2010-08-12

Hi all,

I am working on automatically transcribing lectures that are in
English (Indian accent). Since Sphinx is the best open-source speech
recognizer, I have decided to use it. But I am new to Sphinx 4. Could anyone
please help me with how to proceed? What are the requirements for it? How can
I improve accuracy?

Thanks in advance.

Nasir Hussain - 2010-08-14

Hey Vijay,

I am working on automatically transcribing lectures which are in
English (Indian accent)

For that you have to create an acoustic model in your own voice (Indian accent).

Since the sphinx is the best open source speech recognition I have decided
to use

You are absolutely right, my friend. :)

But I am new to sphinx 4. please any one help how to proceed

Just go through the site below. It contains all the necessary details for
installation and usage.
http://cmusphinx.sourceforge.net/sphinx4/

https://sourceforge.net/projects/cmusphinx/files/
Download the latest Sphinx4 source release, which is Beta 4, and set it up in
any IDE that suits you (Eclipse, for example).
Then run the demos in it, and check out the Transcriber demo. ;)

vijay k - 2010-08-14

Thanks for your reply.
Can you please tell me the steps to create an acoustic model?

Nasir Hussain - 2010-08-14

Hey Vijay,

can you please tell me the steps to create acoustic model?

Use SphinxTrain to train your acoustic model. You can easily download it from
https://sourceforge.net/projects/cmusphinx/files/

Check this forum topic for assistance:
https://sourceforge.net/projects/cmusphinx/forums/forum/5471/topic/3755260

Also read this on using the trained model in Sphinx4:
http://cmusphinx.sourceforge.net/sphinx4/doc/UsingSphinxTrainModels.html

-Paul

vijay k - 2010-08-25

Hi Nasir,

Can you please answer the following questions?

How much time will it take to create acoustic models, and what is the
complexity level?

What is the accuracy level, and how can it be improved?

What about the dictionary and language models? Do new ones need to be created,
or are the existing ones enough?

How much training data is required, and how much of it has to be manually
transcribed?

Can it handle out-of-vocabulary words?

Thanks in advance

-vijay

Nasir Hussain - 2010-08-25

Hey Vijay,

Such questions can be answered more accurately by Nickolay, though.
He is awesome, and he helps. :)
Anyway...

Can you please answer my following questions.

Yeah, sure. :)

how much time it will take to create acoustic models? and what is the
complexity level?

How long acoustic model creation takes depends on how many sound files you are
using. For example, I created an acoustic model of 60 UK English words and it
took me about 30 seconds. If you have a lot of words, then it will take time
accordingly. :)
And it's not that complex. :P

what is the accuracy level?and how to improve accuracy?

Hmm, the accuracy level is good, and you can increase your accuracy by changing
your config file settings. If you need my settings, just let me know. :)
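
To tell whether a config change actually helps, it is useful to measure accuracy the same way each time. Below is a minimal sketch in Python (it is not part of Sphinx4; the function name `word_error_rate` and the sample sentences are made up for illustration) that computes the word error rate between a manual transcript and the recognizer output using word-level edit distance.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of five reference words.
reference = "today we discuss speech recognition"
hypothesis = "today we discuss peach recognition"
print(word_error_rate(reference, hypothesis))  # 0.2
```

Running the same set of test recordings before and after each settings change, and comparing the two numbers, gives a fairer picture than listening to a few examples.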

what about dictionary and language models? new ones need to be created or
existing ones are enough?

See, you need a dictionary while creating the acoustic model, so after creating
the model, use that same dictionary in your app. :)
Language models can easily be created by just providing a corpus file here:
http://www.speech.cs.cmu.edu/tools/lmtool-new.html
(Remember, a language model created at that link is a US English language
model.) In order to create your own language model, you have to build it with
the cmuclmtk tool. :)
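
A corpus file is plain text with one sentence or phrase per line. Below is a minimal Python sketch, assuming the lecture transcripts are available as ordinary strings (the function name `to_corpus_lines` and the sample text are hypothetical), that normalizes raw text into that shape. The `add_markers` flag wraps each sentence in `<s>` and `</s>`, which is the form the CMUSphinx language-model tutorial uses for cmuclmtk input; treat that part as an assumption and check it against the tutorial. The sketch lower-cases everything; whichever case you choose has to match the case used in your dictionary.

```python
import re


def to_corpus_lines(text, add_markers=False):
    """Split raw text into lower-case sentences, one per line, without punctuation."""
    lines = []
    # Splitting on sentence-ending punctuation is only a rough heuristic.
    for sentence in re.split(r"[.!?]+", text):
        # Keep only letters, digits, apostrophes and spaces.
        cleaned = re.sub(r"[^A-Za-z0-9' ]+", " ", sentence)
        words = cleaned.lower().split()
        if not words:
            continue
        line = " ".join(words)
        if add_markers:
            line = "<s> " + line + " </s>"
        lines.append(line)
    return lines


# Made-up sample text for illustration.
sample = "Today we discuss sorting. First, what is a stable sort?"
for line in to_corpus_lines(sample, add_markers=True):
    print(line)
```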

how much training data is required and manually transcribed?

It all depends on your needs. It can be anything from 60 words to 60,000. :)

can it handle out of vocabulary words?

Hell Ya...:P

-Nasir aka Paul

vijay k - 2010-08-26

Thanks a lot for the help.

How do I create the dictionary for Indian English? Do we need to write phoneme
sequences for each word manually?

Nasir Hussain - 2010-08-26

Hey Vijay,

how to create the dictionary for indian english?

I would recommend you create it using lmtool, which I recommended
earlier. See, the Indian English part comes with the acoustic
model: when you record your voice and create the acoustic model, the accent
gets recorded. :) Just try to use more words in your acoustic model to get
good accuracy. :)

do we need to write phoneme sequences for each word manually?

Yeah, in some cases we need to, but not in all. The pronunciations that we get
from lmtool are sufficient, in my opinion.
But some need changing, as you and I both know Indian English. :P
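
To find out which words may need a hand-written pronunciation, one option is to compare the corpus vocabulary against the dictionary. Below is a minimal Python sketch, assuming the dictionary is in the usual CMUdict-style text format (the word, whitespace, then its phonemes, with alternate pronunciations written as `WORD(2)` and comment lines starting with `;;;`). The sample entries and the function names are made up for illustration.

```python
import re


def load_dictionary_words(dict_lines):
    """Collect the words from dictionary lines such as 'HELLO HH AH L OW'."""
    words = set()
    for line in dict_lines:
        if line.startswith(";;;"):
            continue  # comment line
        parts = line.split()
        if not parts:
            continue
        # Strip the alternate-pronunciation suffix, e.g. 'READ(2)' -> 'READ'.
        words.add(re.sub(r"\(\d+\)$", "", parts[0]).upper())
    return words


def missing_words(corpus_lines, dictionary_words):
    """Return corpus words with no dictionary entry, sorted alphabetically."""
    missing = set()
    for line in corpus_lines:
        for word in line.upper().split():
            if word not in dictionary_words:
                missing.add(word)
    return sorted(missing)


# Made-up sample data for illustration.
dict_lines = ["HELLO HH AH L OW", "READ R IY D", "READ(2) R EH D"]
corpus_lines = ["hello chennai", "read hello"]
print(missing_words(corpus_lines, load_dictionary_words(dict_lines)))  # ['CHENNAI']
```

The words this reports are the ones to add to the dictionary by hand; words that are present but pronounced differently in Indian English still have to be spotted by ear.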

-Nasir

sriram - 2015-09-25

Hi Vijay and Nasir,

Do you have any working database for training the Indian English accent?
Please share it if you do.

Thanks and Regards
Sriram.

akshat dashore - 2018-09-13

Hi all,
I am a beginner and am searching for pretrained Indian English and Hindi models for speech to text. Kindly share them with me if they are available; otherwise, please tell me how I could proceed. I also tried this one: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Indian%20English/cmusphinx-en-in-5.2.tar.gz/download

but it does not work at all.
Thanks a lot in advance.
Waiting for your valuable reply.

- sheharyar masood - 2018-09-13
  
  Hi Akshat,
  The link you downloaded the CMUSphinx model from is good. Here is a link for further reading:
  https://cmusphinx.github.io/wiki/tutoriallm/
  