I have built a new app for detecting words using pocketsphinx, and I need to adjust the dictionary for it. I created a text file with my keywords, uploaded it to http://www.speech.cs.cmu.edu/tools/lmtool-new.html, and also created the dictionary. I then converted the .lm file to a .dmp file using the command "pocketsphinx_continuous -inmic yes -lm 8521.lm -dict 8521.dic" and used the hmm.en-us-semi model. I set my threshold as 'HELLO /1e-1/'. It easily detects a live human voice but cannot detect a voice over the telephone, so I switched to the 8 kHz model as suggested in the forum, but I got errors about the mdef file and the dmp file. I followed these steps several times and got the same error each time: "ERROR: "acmod.c",Folder '/path/hmm/en-us-8khz' does not contain acoustic model definition 'mdef'", followed by "error-: new_Decoder returned -1".
Can anybody help me fix this error and create a dictionary for telephone word detection? Are any other changes required?
Last edit: Radhika Patel 2016-02-04
It is better to use
http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Acoustic%20Model/cmusphinx-en-us-ptm-8khz-5.2.tar.gz/download
It is easy to fix: just make sure that the model is properly placed in the folder and that you correctly specify the path to it. No other changes are required.
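As a quick way to confirm that the folder really contains what the decoder is looking for, something like the following could be run before creating the recognizer. This is only a sketch: `ModelDirCheck` and `missingFiles` are made-up names, the path is the placeholder from the error message, and only `mdef` is confirmed by the error text. The other file names are an assumption about a typical model folder and should be compared against the unpacked model.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which expected files are absent from a model folder. */
class ModelDirCheck {

    /** Returns the names from required that are not regular files inside modelDir. */
    static List<String> missingFiles(File modelDir, String... required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!new File(modelDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder path copied from the error message; pass the real folder as an argument.
        File modelDir = new File(args.length > 0 ? args[0] : "/path/hmm/en-us-8khz");
        // "mdef" comes from the error text; the remaining names are an assumed typical layout.
        List<String> missing = missingFiles(modelDir, "mdef", "means", "variances", "transition_matrices");
        System.out.println(missing.isEmpty() ? "Model folder looks complete" : "Missing: " + missing);
    }
}
```

If `mdef` shows up as missing, the archive was probably unpacked into a nested folder or the path points one level too high or too low.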
Thank you for your quick response...
But now I am facing another problem: how do I list the language model and dictionary in 'assets.list'? Can you please provide the syntax or a screenshot? Currently I have written 'models/hmm/en-us-8khz/README', but I get the error 'E/error-: sync/models/hmm/en-us-8khz/README.md5'. I don't know how to set up the files in assets. One more thing: in the gram file, how should I specify the threshold value for a particular word?
Please reply as soon as possible.
http://cmusphinx.sourceforge.net/wiki/tutorialandroid#including_resource_files
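The error above names a missing `README.md5`, which suggests that each listed asset is expected to have a companion checksum file. The linked tutorial section is the authoritative description of how those files are produced. Purely as an illustration of what such a checksum is, and assuming it is a hex-encoded MD5 digest of the file contents (an assumption, not something stated in this thread), it can be computed like this. `AssetChecksum` and `md5Hex` are made-up names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: computes a hex MD5 digest of some bytes. */
class AssetChecksum {

    /** Returns the lowercase hex MD5 digest of data. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm on Java platforms, so this is not expected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Sample input only; a real asset would be read from disk instead.
        byte[] sample = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Hex(sample));
    }
}
```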
It's working, thank you :)
Can you please tell me how to set a threshold value to get high accuracy in word detection? When there is background noise, random words made up from the dictionary are decoded.
Hi Radhika,
Find your answer here
Thank you, it works in 80% of cases. When there is background noise, random words made up from the dictionary are decoded.
Can you please tell me which model is best for telephone audio detection in CMU Sphinx?
The best model for telephony is mentioned in Post 2. To address accuracy issues you need to send your config and the audio that you try to decode.
Thank you David
Below is my digit.gram file:
HELLO /1-e5/
HI /1-e1/
I HELP YOU /1-e20/
MY NAME IS /1-e20/
GOOD MORNING /1-e20/
GOOD AFTERNOON /1-e35/
GOOD EVENING /1-e25/
All of these phrases work fine except Good Afternoon, which is not detected. Can you please tell me the threshold value for Good Afternoon?
The format for float numbers is 1e-35, not 1-e35. No wonder your gram file does not work.
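A mistake like this is easy to catch before the file reaches the decoder. The sketch below checks each line of a keyword file, assuming the shape used in this thread (a phrase, whitespace, then a threshold between two slashes) and assuming that a threshold must parse as a positive floating-point number. `KeywordLineCheck` and `isValid` are made-up names, and this is not how pocketsphinx itself parses the file.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: sanity-checks lines of a keyword list before loading it. */
class KeywordLineCheck {

    // Assumed line shape: a phrase, whitespace, then a threshold between two slashes.
    private static final Pattern LINE = Pattern.compile("^(\\S.*?)\\s+/([^/]+)/\\s*$");

    /** Returns true if the line has the assumed shape and a positive numeric threshold. */
    static boolean isValid(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return false;
        }
        try {
            // "1e-35" parses as a number; "1-e35" throws NumberFormatException.
            return Double.parseDouble(m.group(2).trim()) > 0.0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] lines = {"GOOD AFTERNOON /1e-35/", "GOOD AFTERNOON /1-e35/"};
        for (String line : lines) {
            System.out.println(line + " -> " + (isValid(line) ? "ok" : "bad threshold or format"));
        }
    }
}
```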
Thank you Nickolay, it's working.
Can I tune my threshold value at runtime? For example, if I set the threshold "HELLO /1-e5/" in my gram file, can I increase or decrease it at runtime based on user input? Is it possible?
1e-5, not 1-e5. At runtime you can remove the current search and add a new one.
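One way to prepare for that swap is to regenerate the keyword file with the new thresholds and then register it again as a new search. The sketch below covers only the file-writing half and uses nothing from pocketsphinx. `KeywordListBuilder` and its methods are made-up names, and thresholds are passed as the exponent in `1e<exponent>` to keep the output in the format given above. How the new file is then loaded (for example by stopping the recognizer and calling the same `addKeywordSearch` the app already uses) is an assumption to verify against the library.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: rebuilds a keyword list file with updated thresholds. */
class KeywordListBuilder {

    /** Formats one entry, e.g. ("HELLO", -5) becomes "HELLO /1e-5/". */
    static String line(String phrase, int exponent) {
        return phrase + " /1e" + exponent + "/";
    }

    /** Renders all entries, one per line, in insertion order. */
    static String render(Map<String, Integer> exponents) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : exponents.entrySet()) {
            sb.append(line(e.getKey(), e.getValue())).append('\n');
        }
        return sb.toString();
    }

    /** Writes the rendered list to target, replacing any previous content. */
    static void write(File target, Map<String, Integer> exponents) throws IOException {
        Files.write(target.toPath(), render(exponents).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> exponents = new LinkedHashMap<>();
        exponents.put("HELLO", -5);
        exponents.put("HI", -1);
        exponents.put("HELLO", -8); // value changed at runtime, e.g. from user input
        File gram = File.createTempFile("digits", ".gram");
        write(gram, exponents);
        System.out.print(render(exponents));
    }
}
```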
HELLO /1e-5/
HI /1e-1/
I HELP YOU /1e-20/
MY NAME IS /1e-20/
GOOD MORNING /1e-20/
GOOD AFTERNOON /1e-30/
GOOD EVENING /1e-25/
How do I tune these 7 key phrases?
Tutorial says:
Threshold must be tuned to balance between false alarms and missed detections, the best way to tune threshold is to use a prerecorded audio file.
http://cmusphinx.sourceforge.net/wiki/tutoriallm
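To make that balance concrete, a tuning run over a prerecorded file can be scored by comparing the phrases actually spoken in the recording with the phrases the decoder reported. The sketch below is one simple way to count both kinds of error. It ignores timing and only compares how often each phrase occurs, which is a simplification, and `DetectionScore` and `score` are made-up names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: counts missed detections and false alarms for one test recording. */
class DetectionScore {

    private static Map<String, Integer> counts(List<String> phrases) {
        Map<String, Integer> c = new HashMap<>();
        for (String p : phrases) {
            c.merge(p, 1, Integer::sum);
        }
        return c;
    }

    /** Returns {misses, falseAlarms}, comparing per-phrase occurrence counts. */
    static int[] score(List<String> expected, List<String> detected) {
        Map<String, Integer> e = counts(expected);
        Map<String, Integer> d = counts(detected);
        Set<String> all = new HashSet<>(e.keySet());
        all.addAll(d.keySet());
        int misses = 0;
        int falseAlarms = 0;
        for (String p : all) {
            int diff = d.getOrDefault(p, 0) - e.getOrDefault(p, 0);
            if (diff > 0) {
                falseAlarms += diff; // reported more often than spoken
            } else {
                misses -= diff;      // spoken more often than reported
            }
        }
        return new int[] {misses, falseAlarms};
    }

    public static void main(String[] args) {
        // Illustrative data only; real lists come from a transcript and a decoder log.
        List<String> expected = Arrays.asList("HELLO", "HI", "GOOD MORNING");
        List<String> detected = Arrays.asList("HELLO", "HELLO", "GOOD MORNING");
        int[] s = score(expected, detected);
        System.out.println("misses=" + s[0] + " falseAlarms=" + s[1]);
    }
}
```

Repeating this for several candidate thresholds per phrase, on the same recording, shows which value gives an acceptable trade-off.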
Thank you Nickolay
I read the suggested tutorial. Now I have a question about building the language model: I have the word "hello" in my .dic file, and there are 2 pronunciation options available for it:
HELLO HH AH L OW
HELLO(2) HH EH L OW
If I want to add a 3rd pronunciation for hello, is it possible to add it to the .dic file? In general, is the .dic file editable by us or not?
Another question: which model is better for telephony use, 'en-us-ptm' or 'cmusphinx-en-us-ptm-8khz-5.2'? I got the same result with both, so can you tell me which one is best?
Can anybody help me as soon as possible?
Last edit: Radhika Patel 2016-02-18
It is editable.
For telephony you need to use 8khz.
Last edit: Nickolay V. Shmyrev 2016-02-19
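Since the dictionary is a plain text file, an extra pronunciation can be appended by hand or by a small script. The sketch below follows the numbering visible in the two existing entries, where the second variant is written as `HELLO(2)`. `DictionaryVariant` and `nextVariantLine` are made-up names, and the phone string `HH AA L OW` is a placeholder for illustration, not a recommended pronunciation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds the next numbered pronunciation entry for a word. */
class DictionaryVariant {

    /** Returns the line to append, e.g. "HELLO(3) ..." if two HELLO entries already exist. */
    static String nextVariantLine(List<String> dictLines, String word, String phones) {
        int existing = 0;
        for (String line : dictLines) {
            String head = line.trim().split("\\s+")[0];
            // Strip a trailing "(n)" so HELLO and HELLO(2) count as the same word.
            String base = head.replaceAll("\\(\\d+\\)$", "");
            if (base.equals(word)) {
                existing++;
            }
        }
        return existing == 0
                ? word + " " + phones
                : word + "(" + (existing + 1) + ") " + phones;
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("HELLO HH AH L OW", "HELLO(2) HH EH L OW");
        // Placeholder phones; replace with the pronunciation actually needed.
        System.out.println(nextVariantLine(dict, "HELLO", "HH AA L OW"));
    }
}
```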
Thank you,
I am using the cmusphinx-en-us-ptm-8khz-5.2 model and my thresholds are:
HELLO /1e-5/
HI /1e-1/
I HELP YOU /1e-20/
MY NAME IS /1e-20/
WELCOME /1e-15/
GOOD MORNING /1e-25/
GOOD AFTERNOON /1e-30/
GOOD EVENING /1e-28/
This works fine with all voices except Google Translate's voice, so can you please suggest a solution for this?
You need to provide data files, command line, logs, results in order to get help on accuracy issues.
I attached my custom dictionary and gram file along with my recognizer setup:
    private void setupRecognizer(File assetsDir) throws IOException {
        Log.e("setupRecognizer", "setupRecognizer");
        recognizer = defaultSetup()
                .setAcousticModel(new File(assetsDir, "cmusphinx-en-us-ptm-8khz-5.2"))
                .setDictionary(new File(assetsDir, "6199.dic"))
                .setBoolean("-allphone_ci", true)
                .getRecognizer();
        recognizer.addListener(this);
        File digitsGrammar = new File(assetsDir, "digits.gram");
        recognizer.addKeywordSearch(DIGITS_SEARCH, digitsGrammar);
    }
If I increase my threshold value, it also increases false alarms. This works fine with all voices except Google Translate's voice, so can you please suggest a solution for this?
Last edit: Radhika Patel 2016-02-23
You need to share the audio file
I have attached a video file showing how I test the app.
In testing the app, my voice is picked up for most of the key phrases, yet when I got my client to speak into it, it didn't pick up any of the key phrases. Can you please tell me why this happens and what the solution is?
Last edit: Radhika Patel 2016-02-23
You need to provide audio file recordings to let me reproduce your problems.
This is my audio recording. Please give me a suggestion as soon as possible.
Last edit: Radhika Patel 2016-02-29
Hi Nickolay, did you get a chance to reproduce our problems?