
Improve accuracy for pocketsphinx results

Help
Gowtham
2017-02-01
2018-06-18
  • Gowtham

    Gowtham - 2017-02-01

    Hi,

    I'm using pocketsphinx to convert a speech audio file to text, with the cmudict-5prealpha.dict dictionary and the en-us.lm.bin language model.

    I run this command:

    pocketsphinx_continuous \
        -lm /en-us.lm.bin \
        -fwdflat no \
        -remove_dc yes \
        -bestpath no \
        -dict /language/en_us_nostress/cmudict-5prealpha.dict \
        -infile /wav/confidential.wav > test_confFlat.txt

    I'm getting an accuracy of only 46.15%, but I need more. Do I have to change the language model and dictionary, or do I need to add more options to the command?
    
    Can you please suggest better command options, or a better language model and dictionary?
    
    Thank You.
    
     

    Last edit: Gowtham 2017-02-01
    • Arseniy Gorin

      Arseniy Gorin - 2017-02-01

      It depends a lot on your data.

      First, check http://cmusphinx.sourceforge.net/wiki/tutorialtuning
      Second, it depends on your signal quality (noise, reverb, etc.)
      Third, it depends on the language (you may need to adapt the language model if the speech style is not common)

      In any case, it is hard to say without looking at your data.

       
  • Gowtham

    Gowtham - 2017-02-01

    Thanks Arseniy,

    As you suggested, I'm now working on the adaptation process, following http://cmusphinx.sourceforge.net/wiki/tutorialadapt

    During adaptation I ran this command:

    sphinx_fe -samprate 16000 -argfile /usr/local/share/pocketsphinx/model/en-us/en-us/feat.params -c /testc2.fileids -ei wav -eo mfc -mswav yes
    

    It shows this error:

    INFO: sphinx_fe.c(970): Processing all remaining utterances at position 0
    INFO: sphinx_fe.c(790): Converting confidential2.wav to confidential2.mfc
    ERROR: "sphinx_fe.c", line 119: Failed to open confidential2.wav: No such file or directory
    

    But the wav directory does contain a file named confidential2.wav, and I can't figure out why this error occurs. Please help me resolve it.

     
    • Arseniy Gorin

      Arseniy Gorin - 2017-02-01

      You should check testc2.fileids.
      You should also check the dictionary.

      In fact, even though you are only doing adaptation, it is highly recommended that you go through http://cmusphinx.sourceforge.net/wiki/tutorialam to better understand the file structure.

       
  • Gowtham

    Gowtham - 2017-02-01

    After going through the link you provided, I changed my file paths and the contents of testc2.fileids.

    But the same "No such file or directory" error still occurs. Once this works, I will continue with the adaptation process and test again with the adapted files.

    I would be grateful for your help with this.

     
    • Arseniy Gorin

      Arseniy Gorin - 2017-02-01

      It cannot find the audio file at the location that testc2.fileids specifies.
      We can only check what is happening if you provide the complete training directory with the files, audio and logs.
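
The check described above can be sketched in Python. This is only an illustration, not part of the CMUSphinx tools: the helper name `missing_audio` is hypothetical, and it assumes that each line of the fileids list is a file ID without extension that is looked up as `<input_dir>/<file_id>.<ext>`. The sphinx_fe command shown earlier passes no `-di` input directory, so the IDs are presumably resolved relative to the directory the command is run from.

```python
from pathlib import Path


def missing_audio(fileids_path, input_dir=".", ext="wav"):
    """Return the audio paths that a fileids list points to but that do not exist.

    Assumption: each non-empty line is a file ID without extension,
    resolved as <input_dir>/<file_id>.<ext>.
    """
    missing = []
    for line in Path(fileids_path).read_text().splitlines():
        file_id = line.strip()
        if not file_id:
            continue  # skip blank lines
        candidate = Path(input_dir) / f"{file_id}.{ext}"
        if not candidate.is_file():
            missing.append(str(candidate))
    return missing
```

Calling `missing_audio("testc2.fileids", input_dir="wav")` would list every entry that does not resolve to an existing file; an empty list means all entries were found under that directory.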

       
  • Gowtham

    Gowtham - 2017-02-02

    Thanks Arseniy,

    I have now completed the adaptation process, but the accuracy has not changed.

    I don't know what exactly the problem is. Can you please suggest any other way to improve accuracy?

     

    Last edit: Gowtham 2017-02-02
  • Gowtham

    Gowtham - 2017-02-03

    Hi,

    I have attached to this comment my audio file, the original text file (speech_radio_sample1.txt), which was transcribed with an online site, and the text file produced by pocketsphinx (testSpeech.hyp).

    I used the command below:

     pocketsphinx_batch -adcin yes -hmm /en-us-adapt -lm /en-us.lm.bin -fwdflat no -remove_dc yes -remove_silence no -round_filters no -nwpen 1e-10 -pl_pip 10 -bestpath no -dict /en_us_nostress/cmudict-5prealpha.dict -ctl test.fileids -cepdir /wav  -cepext .wav -hyp testSpeech.hyp
    
    And I got this accuracy:
    
      Words: 412 Correct: 249 Errors: 227 Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
    Insertions: 64 Deletions: 18 Substitutions: 145
    TOTAL Words: 412 Correct: 249 Errors: 227
    TOTAL Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
    TOTAL Insertions: 64 Deletions: 18 Substitutions: 145
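
For reference, the figures in this report are consistent with the usual word error arithmetic, sketched below. This is a reading of the numbers, not the scoring tool's actual source; the function name `alignment_scores` is hypothetical. Insertions count as errors but do not reduce the number of correct words, which is why "Percent correct" is higher than "Accuracy".

```python
def alignment_scores(words, insertions, deletions, substitutions):
    """Recompute summary scores from alignment counts.

    Assumption: errors = insertions + deletions + substitutions,
    and all percentages are relative to the reference word count.
    """
    correct = words - deletions - substitutions
    errors = insertions + deletions + substitutions
    percent_correct = 100.0 * correct / words
    error_rate = 100.0 * errors / words
    accuracy = 100.0 - error_rate
    return correct, errors, percent_correct, error_rate, accuracy


# Counts taken from the report above.
correct, errors, pc, err, acc = alignment_scores(412, 64, 18, 145)
print(f"Correct: {correct} Errors: {errors}")
print(f"Percent correct = {pc:.2f}% Error = {err:.2f}% Accuracy = {acc:.2f}%")
# → Correct: 249 Errors: 227
# → Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
```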
    

    I'm not able to improve the accuracy even after changing many command-line options. Is there any other way to improve it?

    Please check the attachments I have provided and suggest better approaches.

    Thanks in advance.

     

    Last edit: Gowtham 2017-02-03
    • Nickolay V. Shmyrev

      And what do you use as the reference in alignment? If you use speech_radio_sample1.txt, you need to remove punctuation and convert it to lowercase first.

       
  • Gowtham

    Gowtham - 2017-02-06

    I think you misunderstood what I said. speech_radio_sample1.txt is the original text file, which I used only for comparison and for calculating accuracy. The output file is testSpeech.hyp, which I have attached.

    Please suggest better ways to improve accuracy.

    Thank You.

     
    • Nickolay V. Shmyrev

      I think you didn't read what I wrote to you. Let me repeat in bold: **you need to remove punctuation from speech_radio_sample1.txt and convert it to lowercase**
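
A minimal Python sketch of that normalization step follows. The function name `normalize_transcript` is hypothetical, and keeping apostrophes (so that words such as "it's" stay intact) is an assumption about what the dictionary expects. Numbers written as digits would probably still need to be spelled out by hand to match the recognizer output.

```python
import re


def normalize_transcript(text):
    """Lowercase a reference transcript and strip punctuation.

    Assumption: letters, digits and apostrophes are kept; every other
    character is replaced by a space, and runs of whitespace are collapsed.
    """
    text = text.lower()
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    return " ".join(text.split())


print(normalize_transcript("Hello, World! It's 9 o'clock."))
# → hello world it's 9 o'clock
```

Running the reference text through such a step before alignment keeps differences in case and punctuation from being counted as substitutions.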

       
  • Pooja Withanage

    Pooja Withanage - 2018-06-18

    Hi,
    I use pocketsphinx in my Android application, where I need to recognize local street names, junctions, distances, etc. For that I use my own dictionary file and language model file with the default acoustic model provided by CMU Sphinx.

    Sometimes it recognizes those words, but the accuracy is not satisfactory.

    Can you please suggest how to improve the accuracy?

    Thank You.

     
