CMU Sphinx / Forums / Help: Improve accuracy for pocketsphinx results

Gowtham - 2017-02-01

Hi,

I'm currently using pocketsphinx to convert a speech audio file to text, with the cmudict-5prealpha.dict dictionary and the en-us.lm.bin language model.

I'm using this command:
pocketsphinx_continuous \
  -lm /en-us.lm.bin \
  -fwdflat no \
  -remove_dc yes \
  -bestpath no \
  -dict /language/en_us_nostress/cmudict-5prealpha.dict \
  -infile /wav/confidential.wav > test_confFlat.txt

I'm getting an accuracy of only 46.15%, but I need higher accuracy. Do I have to change the language model and dictionary, or do I need to add more options to the command? Can you please suggest better commands, or a better language model and dictionary? Thank you.

Last edit: Gowtham 2017-02-01
- Arseniy Gorin - 2017-02-01
  
  It depends a lot on your data.
  
  First, check http://cmusphinx.sourceforge.net/wiki/tutorialtuning
  Second, it depends on your signal quality (noise, reverb, etc.).
  Third, it depends on the language (you may need to adapt the language model if the speech style is uncommon).
  
  In any case, it is hard to say without looking at your data.
  

Gowtham - 2017-02-01

Thanks Arseniy,

As you suggested, I'm now working through the adaptation process described at http://cmusphinx.sourceforge.net/wiki/tutorialadapt

During adaptation, when I run this command:

sphinx_fe -samprate 16000 -argfile /usr/local/share/pocketsphinx/model/en-us/en-us/feat.params -c /testc2.fileids -ei wav -eo mfc -mswav yes

it shows this error:

INFO: sphinx_fe.c(970): Processing all remaining utterances at position 0
INFO: sphinx_fe.c(790): Converting confidential2.wav to confidential2.mfc
ERROR: "sphinx_fe.c", line 119: Failed to open confidential2.wav: No such file or directory

But the wav directory does contain a file named confidential2.wav. I can't figure out why this error is occurring. Please help me resolve it.
- Arseniy Gorin - 2017-02-01
  
  You should check testc2.fileids.
  You should also check the dictionary.
  
  In fact, even though you are doing adaptation, it is highly recommended that you go through http://cmusphinx.sourceforge.net/wiki/tutorialam to better understand the file structure.
  

Gowtham - 2017-02-01

After consulting the link you provided, I changed my file paths and the contents of testc2.fileids.

But the same "No such file or directory" error still occurs. Once this works, I will continue with the adaptation process and check again with the adapted files.

I would be grateful if you could help me with this.

- Arseniy Gorin - 2017-02-01
  
  It cannot find the audio file in the place testc2.fileids specifies.
  We can only check what happens if you provide the complete training directory with files, audio and logs.
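
A quick way to test this before rerunning sphinx_fe is to resolve every entry of the fileids list yourself. The sketch below is only an illustration: the helper name `missing_audio` is hypothetical, and it assumes each audio path is built as `<audio_dir>/<file_id>.<ext>`, with the current directory used when no input directory is given. Verify that rule against your sphinx_fe version.

```python
from pathlib import Path


def missing_audio(fileids_path, audio_dir=".", ext="wav"):
    """Return the audio paths named in a fileids list that do not exist.

    Assumption: each entry is resolved as <audio_dir>/<file_id>.<ext>.
    """
    missing = []
    for line in Path(fileids_path).read_text().splitlines():
        file_id = line.strip()
        if not file_id:
            continue  # skip blank lines
        candidate = Path(audio_dir) / f"{file_id}.{ext}"
        if not candidate.is_file():
            missing.append(str(candidate))
    return missing
```

An empty list means every entry resolves. If `confidential2.wav` is reported as missing while it sits in a `wav` subdirectory, either the entry needs the `wav/` prefix or the command needs to be pointed at that directory.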
  

Gowtham - 2017-02-02

Thanks Arseniy,

I have now completed the adaptation process, but the accuracy percentage has not changed.

I don't know what the exact problem may be. Can you please suggest another way to improve accuracy?

Last edit: Gowtham 2017-02-02


Gowtham - 2017-02-03

Hi,

I have attached to this comment my audio file, the original text file (speech_radio_sample1.txt), which was converted through an online site, and the text file produced by pocketsphinx (testSpeech.hyp).

I used the command below:

pocketsphinx_batch -adcin yes -hmm /en-us-adapt -lm /en-us.lm.bin -fwdflat no -remove_dc yes -remove_silence no -round_filters no -nwpen 1e-10 -pl_pip 10 -bestpath no -dict /en_us_nostress/cmudict-5prealpha.dict -ctl test.fileids -cepdir /wav -cepext .wav -hyp testSpeech.hyp

and got the following accuracy:

Words: 412 Correct: 249 Errors: 227 Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
Insertions: 64 Deletions: 18 Substitutions: 145
TOTAL Words: 412 Correct: 249 Errors: 227
TOTAL Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
TOTAL Insertions: 64 Deletions: 18 Substitutions: 145

I'm not able to improve accuracy even after changing many command-line options. Is there any other way to improve it?
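
For readers interpreting these figures, the sketch below shows one way the percentages can be derived from the raw counts. The function name `alignment_scores` is hypothetical and the formulas are an assumption about how the scoring tool works, but they do reproduce the numbers reported above. Note that insertions count as errors without reducing the number of correct words, which is why "Percent correct" and "Accuracy" differ.

```python
def alignment_scores(words, insertions, deletions, substitutions):
    """Derive scoring percentages from word-alignment counts (assumed formulas)."""
    correct = words - deletions - substitutions
    errors = insertions + deletions + substitutions
    percent_correct = 100.0 * correct / words
    error_rate = 100.0 * errors / words
    accuracy = 100.0 - error_rate  # insertions are penalized here
    return correct, errors, percent_correct, error_rate, accuracy


# Counts taken from the post above.
correct, errors, pc, err, acc = alignment_scores(412, 64, 18, 145)
print(f"Correct: {correct} Errors: {errors}")
print(f"Percent correct = {pc:.2f}% Error = {err:.2f}% Accuracy = {acc:.2f}%")
```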

Please check the attachments I have provided and suggest better approaches.

Thanks in advance.

Last edit: Gowtham 2017-02-03

speech_radio_sample1.txt

speech_radio_sample1_1_.wav

testSpeech.hyp
- Nickolay V. Shmyrev - 2017-02-03
  
  And what do you use for reference in alignment? If you use speech_radio_sample1.txt, you need to remove punctuation and convert it to lowercase first.
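
A minimal sketch of that normalization step, assuming the reference is plain English text. The function name `normalize_transcript` is hypothetical, and keeping apostrophes is a choice made here because dictionary entries such as "don't" contain them. Numbers and abbreviations would still need to be spelled out by hand.

```python
import string


def normalize_transcript(text):
    """Lowercase text and replace ASCII punctuation (except apostrophes) with spaces."""
    drop = string.punctuation.replace("'", "")
    table = str.maketrans(drop, " " * len(drop))
    cleaned = text.lower().translate(table)
    # Collapse the runs of whitespace left behind by removed punctuation.
    return " ".join(cleaned.split())
```

Applying this to the reference transcript before alignment gives the scorer words in the same form as the recognizer output.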
  

Gowtham - 2017-02-06

I think you misunderstood what I said. speech_radio_sample1.txt is the original text file, which I used only for comparison and for calculating accuracy. The output file is testSpeech.hyp, which I have attached here.

Please suggest better ideas to improve accuracy.

Thank You.

testSpeech.hyp

- Nickolay V. Shmyrev - 2017-02-06
  
  I think you didn't read what I wrote to you. Let me repeat in bold: **you need to remove punctuation from speech_radio_sample1.txt and convert it to lowercase**
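
To see why this matters for the score, here is a small word-level edit distance sketch. The function `word_errors` and the sample sentences are invented for illustration and are not taken from the attachments. Any reference word that differs from the hypothesis only by case or attached punctuation is counted as an error.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word insertions, deletions and substitutions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


hyp = "good morning this is the news"
print(word_errors("Good morning, this is the news.", hyp))  # unnormalized reference
print(word_errors("good morning this is the news", hyp))    # normalized reference
```

In this toy case the recognizer output is perfect, yet the unnormalized reference makes half of the six words count as errors.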
  

Pooja Withanage - 2018-06-18

Hi,
I used pocketsphinx in my Android application, where I need to recognize local street names, junctions, distances, etc. I therefore used my own dictionary file and language model file with the default acoustic model provided by CMU Sphinx.

Sometimes it recognizes those words, but the accuracy is not satisfactory.

Can you please suggest how to improve the accuracy?

Thank You.
