Hi, we are trying to create a model for Hindi speech recognition.
We have put the data and log file in the following location:
ftp://123.176.44.99/hindi_asr/
Credentials: anandftp (username), password (password)
Here are the steps we have followed:
We built the language model using http://www.speech.cs.cmu.edu/tools/lmtool.html. We submitted our corpus text (ftp://123.176.44.99/hindi_asr/an4_test/etc/latest_dictionary.txt) and got back the language model (ftp://123.176.44.99/hindi_asr/an4_test/etc/an4_test.lm.DMP). We transliterated this corpus text from Hindi into English (is that OK?).
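(For reference, lmtool expects the corpus as plain text with one sentence per line; a minimal sketch, reusing the transliterated sentence quoted later in this thread:)
horee kandhon par laathee rakh kar ghar se nikala to dhaniya dvaar par khadee use der tak dekhatee rahee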
Using the above language model, we created the acoustic model as per the tutorial. The training data is in ftp://123.176.44.99/hindi_asr/an4_test/wav/an4_clstk/. We have around 2,400 audio files along with their transcriptions.
After training, we tested using "sphinxtrain -s decode run". In this testing phase we supplied all of the fileids (all 2,400-odd of them), along with their transcriptions, from the training folder itself.
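(For anyone reproducing this, the standard SphinxTrain layout for those files looks roughly as follows; the hindi_0010 id is taken from the command further down, and the transcript line is illustrative:)
etc/an4_test.fileids:
an4_clstk/hindi/hindi_0010
etc/an4_test.transcription:
<s> horee kandhon par laathee rakh kar ghar se nikala to dhaniya dvaar par khadee use der tak dekhatee rahee </s> (hindi_0010)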
At the end we got the following message:
MODULE: DECODE Decoding using models previously trained
Decoding 2404 segments starting at 0 (part 1 of 1)
0%
Aligning results to find error rate
SENTENCE ERROR: 55.9% (1345/2404) WORD ERROR RATE: 7.2% (3044/42406)
We are using the same files from the training data for testing in the decoding phase. Is that OK, or should we use different audio for testing?
After this, we tried to recognize one of the audio files (one we had used in testing) with the language model, acoustic model, and dictionary created in the above steps.
command:
pocketsphinx_continuous -infile wav/an4_clstk/hindi/hindi_0010.wav -hmm model_parameters/an4.ci_cont_flatinitial/ -lm etc/an4.lm.DMP -dict etc/an4.dic
But the accuracy is very poor and we are not getting a single word recognized.
While running the test phase, the sentence error rate and word error rate were low.
Please guide us as to where we are going wrong.
Thanks
Anand
> This corpus text we have transliterated from Hindi into English (is it OK?)

It is OK, but not necessary.

> We are using the same files from the training data for testing in the decoding phase. Is that OK?

It is not recommended to use the same audio for testing; you need to split your data into train and test sets. Those should not intersect.

> While running the test phase, the sentence error rate and word error rate are low.

The final model is in an4.cd_cont_200, and it is very small due to the small data size. You need a larger dataset and the most recent pocketsphinx; the result would be [...] which is about accurate.
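(A minimal way to make such a split, assuming the usual line-aligned etc/ fileids and transcription files; the 90/10 ratio and file names here are just an example:)
# hold out roughly 10% of the utterances for testing
# (ideally shuffle or hold out whole speakers first; this takes the tail for brevity)
N=$(wc -l < etc/an4.fileids)
T=$((N / 10))
head -n $((N - T)) etc/an4.fileids       > etc/an4_train.fileids
tail -n $T         etc/an4.fileids       > etc/an4_test.fileids
head -n $((N - T)) etc/an4.transcription > etc/an4_train.transcription
tail -n $T         etc/an4.transcription > etc/an4_test.transcription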
Hi Nickolay,
Thanks for the prompt reply, but when we use it on our system (we are using pocketsphinx-5prealpha) with the following command:
pocketsphinx_continuous -infile wav/an4_clstk/hindi/hindi_0010.wav -hmm model_parameters/an4.cd_cont_200/ -lm etc/an4.lm.DMP -dict etc/an4.dic
we get the following result: OONT DIYE
I agree that our dataset is small, but in the decoding (test) phase it is able to decode properly, whereas running it through the command above gives the result shown.
In the an4.align file, the decoded result we observe is: horee kandhon par laathee rakh kar ghar se nikala to dhaniya dvaar par khadee use der tak dekhatee rahee
We are confused: are we doing something wrong? Are you using the same model that we mentioned in our link, or a different one?
You need to compile the latest sphinxbase and pocketsphinx from GitHub.
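(For reference, that build is roughly the following, assuming an autotools environment:)
git clone https://github.com/cmusphinx/sphinxbase.git
cd sphinxbase && ./autogen.sh && make && sudo make install
cd ..
git clone https://github.com/cmusphinx/pocketsphinx.git
cd pocketsphinx && ./autogen.sh && make && sudo make install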
Hi Nickolay, we set -cmninit 71 and gave it longer sentences, and we are able to get good results. Thank you.
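(In other words, the fix amounts to adding -cmninit to the decoding command from before:)
pocketsphinx_continuous -infile wav/an4_clstk/hindi/hindi_0010.wav -hmm model_parameters/an4.cd_cont_200/ -lm etc/an4.lm.DMP -dict etc/an4.dic -cmninit 71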
I also see you used English phonemes for your dictionary. That is not a good idea; it is better to use the Hindi phoneset from here:
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Hindi/cmusphinx-hi-5.2.tar.gz
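(To make the phoneme point concrete: with English CMUdict phones, a dictionary entry for a Hindi word can only be an approximation. A hypothetical entry such as
GHAR G AA R
loses the aspiration of "gh" entirely, whereas a Hindi phoneset can model such sounds directly.)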
Hi Anand,
We are struggling with the same issue with the latest cmusphinx4-5prealpha build. I see you have been successful in transcribing Hindi audio. I'm thinking it's something to do with the way we have set up our corpus. Can you share the FTP details so we can see how you have formed the corpus to get these results?
Thanks,
Arun