Hi,
ERROR: "dict.c", line 195: Line 3: Phone '(2)' is mising in the acoustic model; word 'INTERFACE' ignored
Please help me.
I am providing a link to my logcat output, dictionary and acoustic model. Kindly let me know if anything else is required.
https://drive.google.com/open?id=0Bx7zZtnMgeIlaG1EU1VHUm1yMU0
Awaiting your response.
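You should drop the space before (2) in your 5041.dic. The variant marker must be attached directly to the word: INTERFACE(2), not INTERFACE (2). With the space, (2) is read as the first phone of the pronunciation, which is what the error is complaining about.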
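To see why the stray space produces that error, here is a minimal Python sketch. It is not the actual pocketsphinx parser, only an illustration of the "first token is the word, everything after it is a phone" rule. The phone set is a small hypothetical subset and the pronunciation is assumed from the stock CMU dictionary, so check both against your own model files.

```python
# Hypothetical subset of the phone set, for illustration only.
# The real list comes from the acoustic model you are using.
PHONES = {"IH", "N", "T", "ER", "F", "EY", "S", "P", "IY", "CH"}


def unknown_phones(line, phones=PHONES):
    """Return the tokens of a dictionary line that are not valid phones.

    The first whitespace-separated token is treated as the word
    (optionally with a variant marker such as "(2)" attached to it);
    every remaining token is treated as a phone.
    """
    parts = line.split()
    if not parts:
        return []
    pronunciation = parts[1:]
    return [p for p in pronunciation if p not in phones]


# Assumed pronunciation (T omitted), as in the stock CMU dictionary.
bad_line = "INTERFACE (2) IH N ER F EY S"   # space before (2)
good_line = "INTERFACE(2) IH N ER F EY S"   # marker attached to the word

print(unknown_phones(bad_line))   # ['(2)']
print(unknown_phones(good_line))  # []
```

The same check is handy when adding your own pronunciations later: anything it reports is a symbol the acoustic model does not know.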
I have changed it. Thank you. :)
It is still not recognizing. There is no error in logcat, but the phrase is not recognized. Should I change the threshold value? What would be the correct threshold value for this?
Reducing the threshold leads to better spotting but more false alarms.
You should try values from 1e-1 to 1e-100.
The right value also depends on the microphone and can be affected by accent, noise, etc.
If you are not a native English speaker, you can also check whether more alternative pronunciations should be added to the dictionary.
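A rough way to walk through that range is to step the exponent instead of guessing single values. The Python sketch below only generates candidates; the helper names are made up for this example, and the `phrase /1e-N/` line layout is the keyword list format as I understand it from the pocketsphinx keyword spotting documentation, so verify it against your version before relying on it.

```python
def candidate_exponents(first=1, last=100, step=10):
    """Exponents k for candidate thresholds 1e-k, from 1e-1 down to 1e-100."""
    exps = list(range(first, last + 1, step))
    if exps[-1] != last:
        exps.append(last)  # make sure the end of the range is included
    return exps


def kws_line(phrase, exponent):
    """Format one keyword list entry: the phrase, then its threshold in slashes."""
    return f"{phrase} /1e-{exponent}/"


for k in candidate_exponents():
    print(kws_line("speech interface", k))
```

Test each candidate with the same microphone and in the same room, note where the phrase starts being spotted reliably, then narrow the step around that exponent.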
Thank you for the fast reply. It means a lot. :)
I have tried varying the threshold value. It is somewhat better, but it recognizes only some voices.
My locale is Indian English. How can I add more alternative pronunciations? Do you mean I should extend the dictionary as in the tutorial?
If I add the pronunciations, do I need to train the acoustic model as well?
Is there any way to add all kinds of English accents to the dictionary and acoustic model?
Please guide me on how to proceed further.
Indian English is heavily accented. You likely need a special acoustic model (trained on a fairly large data set of Indian English). I think such a model is not freely available.
But as you need just one phrase, a hack that might help is to extend the dictionary a little by adding pronunciations that sound closer to your accent. It is really a hack and probably will not work well.
It is like in your example, where the word interface has 2 pronunciations (with T omitted and with T present). You can add more pronunciations, but you can only use sounds from the acoustic model phone set.
Thank you so much for taking the time to guide a noob like me :)
I'll try it out and get back to you if I need any further help.
Hi,
Can you please explain the significance of the threshold value? I read somewhere that it is used to reduce false detections. Is that right?
And previously you told me that reducing the threshold leads to better spotting but more false alarms. What exactly does "false alarms" mean?
Coming to my project:
The word "speech" is spotted well at threshold 30 and "interface" at 50 for me. But I couldn't figure out a threshold value for the phrase "speech interface". Could you kindly help me out?
Link to my dictionary and acoustic model:
https://drive.google.com/drive/u/0/folders/0Bx7zZtnMgeIlaG1EU1VHUm1yMU0
You usually need smaller thresholds for phrases.
A false alarm means that if you say something unrelated like "blah" instead of "speech interface", the system will still report that it detected "speech interface". This happens if the threshold is too low.
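To make the trade-off concrete, you can log a handful of test utterances for each threshold and count both kinds of error. This is a minimal Python sketch; the function name and the trial list are invented for illustration and are not measurements from any real model.

```python
def count_errors(trials):
    """Count misses and false alarms in a list of test utterances.

    Each trial is a pair (phrase_was_spoken, detector_fired).
    A miss is a spoken phrase that was not detected; a false alarm
    is a detection when something else (or nothing) was said.
    """
    misses = sum(1 for spoken, fired in trials if spoken and not fired)
    false_alarms = sum(1 for spoken, fired in trials if not spoken and fired)
    return misses, false_alarms


# Made-up example trials for one threshold setting (not real results).
example_trials = [
    (True, True),    # said "speech interface", detected: correct
    (True, False),   # said "speech interface", not detected: miss
    (False, True),   # said "blah", detected anyway: false alarm
    (False, False),  # said "blah", nothing detected: correct
]

misses, false_alarms = count_errors(example_trials)
print(misses, false_alarms)  # 1 1
```

Lowering the threshold should push misses down and false alarms up, so the value to keep is the one where both counts are acceptable for your application.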