I'm currently using pocketsphinx to convert a speech audio file to text, with the cmudict-5prealpha.dict dictionary and the en-us.lm.bin language model.
I'm using this command:
pocketsphinx_continuous -lm /en-us.lm.bin -fwdflat no -remove_dc yes -bestpath no -dict /language/en_us_nostress/cmudict-5prealpha.dict -infile /wav/confidential.wav > test_confFlat.txt
I'm getting an accuracy of only 46.15%, but I need more. Do I have to change the language model and dictionary, or do I need to add more options to the command? Can you please suggest better commands or better language models and dictionaries? Thank you.
Last edit: Gowtham 2017-02-01
It depends a lot on your data.
First, check http://cmusphinx.sourceforge.net/wiki/tutorialtuning
Second, it depends on your signal quality (noise, reverb, etc.).
Third, it depends on the language (you may need to adapt the language model if the speech style is not common).
In any case, it is hard to say without looking at your data.
Thanks Arseniy,
As you said, I'm now working on the adaptation process, following this link: http://cmusphinx.sourceforge.net/wiki/tutorialadapt
In the adaptation process, while using this command
It shows an error like:
But the directory wav contains the file named confidential2.wav. I can't figure out why this issue is occurring. Please help me get rid of it.
You should check testc2.fileids.
You should also check the dictionary.
In fact, even though you are doing adaptation, it is highly recommended that you go through http://cmusphinx.sourceforge.net/wiki/tutorialam to better understand the file structure.
After referring to the link you provided, I have made changes to my file paths and to the testc2.fileids content.
But the same error still occurs: No such file or directory. If this works, I will continue with my adaptation process and check again with the adapted files.
I would be grateful if you could help me with this process.
It cannot find the audio file in the place testc2.fileids specifies.
We can only check what happens if you provide the complete training directory with files, audio and logs.
Thanks Arseniy,
I have now completed my adaptation process, but there is still no change in the accuracy percentage.
I don't know what the exact problem may be. Can you please suggest any better alternative way to improve accuracy?
Last edit: Gowtham 2017-02-02
Hi,
I have attached my audio file, the original text file (speech_radio_sample1.txt) converted through an online site, and the text file converted with pocketsphinx (testSpeech.hyp) to this comment.
And while using the below command:
pocketsphinx_batch -adcin yes -hmm /en-us-adapt -lm /en-us.lm.bin -fwdflat no -remove_dc yes -remove_silence no -round_filters no -nwpen 1e-10 -pl_pip 10 -bestpath no -dict /en_us_nostress/cmudict-5prealpha.dict -ctl test.fileids -cepdir /wav -cepext .wav -hyp testSpeech.hyp
And I got accuracy as
Words: 412 Correct: 249 Errors: 227 Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
Insertions: 64 Deletions: 18 Substitutions: 145
TOTAL Words: 412 Correct: 249 Errors: 227
TOTAL Percent correct = 60.44% Error = 55.10% Accuracy = 44.90%
TOTAL Insertions: 64 Deletions: 18 Substitutions: 145
I'm not able to improve accuracy even after changing many command-line options. Is there any other way to improve accuracy?
Please check the attachments I have provided and suggest better ways.
Thanks in advance.
Last edit: Gowtham 2017-02-03
And what do you use as the reference in the alignment? If you use speech_radio_sample1.txt, you need to remove punctuation and convert it to lowercase first.
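For context on why the reference matters: the figures in the alignment report quoted earlier in this thread are consistent with the usual word-level relations between reference length, substitutions, deletions and insertions. The sketch below only reproduces that arithmetic. The function name `alignment_metrics` is made up for illustration and is not part of any CMU Sphinx tool.

```python
def alignment_metrics(ref_words, substitutions, deletions, insertions):
    """Recompute the summary figures of a word alignment report from its counts.

    Hypothetical helper for illustration; not a pocketsphinx API.
    """
    # Reference words that were neither substituted nor deleted.
    correct = ref_words - substitutions - deletions
    # Every substitution, deletion and insertion counts as one error.
    errors = substitutions + deletions + insertions
    error_rate = 100.0 * errors / ref_words
    return {
        "correct": correct,
        "errors": errors,
        "percent_correct": round(100.0 * correct / ref_words, 2),
        "error": round(error_rate, 2),
        # Accuracy is 100% minus the error rate, so insertions lower it too.
        "accuracy": round(100.0 - error_rate, 2),
    }
```

Calling `alignment_metrics(412, 145, 18, 64)` with the counts from the report gives 249 correct words, 227 errors, and the same three percentages. If the scoring compares word tokens literally, a reference token such as `Hello,` would not match a hypothesis token `hello`, so an unnormalized reference would add substitutions that have nothing to do with the recognizer.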
I think you misunderstood what I said. speech_radio_sample1.txt is the original text file, which I used only for comparison and for calculating accuracy. The output file is testSpeech.hyp. I have attached the output file to this comment.
Please suggest better ideas to improve accuracy.
Thank you.
I think you didn't read what I wrote to you. Let me repeat in bold: **you need to remove punctuation from speech_radio_sample1.txt and convert it to lowercase**
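A minimal sketch of that normalization step, assuming the reference is plain English text. The function name `normalize_reference` is hypothetical, and keeping apostrophes inside words is an assumption made here so that contractions stay single tokens. Adjust it if your dictionary spells them differently.

```python
import string


def normalize_reference(text: str) -> str:
    """Lowercase the text and replace punctuation with spaces.

    Hypothetical helper for illustration; not part of pocketsphinx.
    """
    # Assumption: keep apostrophes so contractions such as "it's" stay one word.
    drop = string.punctuation.replace("'", "")
    table = str.maketrans(drop, " " * len(drop))
    words = text.lower().translate(table).split()
    # Strip apostrophes left at word edges (for example, single quotes).
    words = [w.strip("'") for w in words]
    return " ".join(w for w in words if w)
```

For example, the contents of speech_radio_sample1.txt could be read into a string, passed through `normalize_reference`, and written to a new file that is then used as the reference for scoring. Numbers written as digits are left untouched by this sketch and may still need to be spelled out by hand.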
Hi,
I used pocketsphinx in my Android application, where I need to recognize local street names, junctions, distances, etc. Therefore I used my own dictionary file and language model file with the default acoustic model provided by CMU Sphinx.
Sometimes it recognizes those words, but the accuracy is not satisfactory.
Can you please suggest how to improve the accuracy?
Thank You.
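One basic check for a setup with a custom dictionary and language model, echoing the earlier advice in this thread to check the dictionary, is to confirm that every word you expect to recognize has a pronunciation entry. The sketch below assumes the usual CMU dictionary layout (one entry per line, the word first, alternate pronunciations written as `word(2)`). The helper names and the word list are hypothetical.

```python
def load_dictionary_words(dict_lines):
    """Collect the set of words that have at least one pronunciation entry."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        # Skip blank lines and ";;;" comment lines.
        if not parts or parts[0].startswith(";;;"):
            continue
        word = parts[0]
        # Alternate pronunciations look like "word(2)"; drop the suffix.
        if word.endswith(")") and "(" in word:
            word = word[: word.index("(")]
        words.add(word.lower())
    return words


def missing_words(vocabulary, dictionary_words):
    """Return vocabulary words that have no dictionary entry, sorted."""
    return sorted({w.lower() for w in vocabulary} - dictionary_words)
```

In practice `dict_lines` would come from opening the dictionary file, and `vocabulary` from the list of street names and other words used to build the language model. A word reported by `missing_words` has no pronunciation, so the decoder generally cannot recognize it until an entry is added.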