What is keyword threshold?
Keyword threshold controls the trade-off between detections and false alarms. It's in the linear domain, so values range from the -beam value (1e-50) to 1.0. You can also use smaller thresholds if you widen the beam: with -beam 1e-200 you can use thresholds down to 1e-200.
With threshold 1e-200 you will get a lot of detections, and most of the words will be correctly detected, but you will have many false alarms. With threshold 1.0 you will have almost no false alarms, but many true matches might be missed.
The optimal threshold has to be selected on a test data set.
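As a rough sketch of what selecting on a test set could look like: assume you have already run the spotter over a labeled test set once per candidate threshold and counted hits and false alarms for each run. The class below is a hypothetical helper, not part of pocketsphinx, and the counts in main are placeholders for illustration, not measurements. It keeps the run with the most hits among those that stay within a false-alarm budget.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a pocketsphinx class): chooses a keyword threshold
// from counts measured by running the spotter on a labeled test set.
class ThresholdPicker {
    // Result of one test-set run at a given threshold.
    static final class Run {
        final double threshold;
        final int hits;
        final int falseAlarms;

        Run(double threshold, int hits, int falseAlarms) {
            this.threshold = threshold;
            this.hits = hits;
            this.falseAlarms = falseAlarms;
        }
    }

    // Returns the run with the most hits among those whose false alarms do not
    // exceed maxFalseAlarms, or null if no run fits the budget.
    static Run pick(List<Run> runs, int maxFalseAlarms) {
        Run best = null;
        for (Run r : runs) {
            if (r.falseAlarms > maxFalseAlarms) {
                continue;
            }
            if (best == null || r.hits > best.hits) {
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder counts for illustration only.
        List<Run> runs = Arrays.asList(
                new Run(1e-40, 48, 30),
                new Run(1e-20, 45, 9),
                new Run(1e-10, 40, 3),
                new Run(1.0, 12, 0));
        Run chosen = pick(runs, 10);
        System.out.println("chosen threshold: " + chosen.threshold);
    }
}
```

The false-alarm budget is only one possible selection rule. Depending on the application you might instead weight misses against false alarms.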
I work with digits (one, two, three, etc) only and get a lot of false alarms on anything between 1e-5f and 1e-20f. Any suggestions?
On a related note, what does addKeywordSearch(String, File) do?
You are welcome to provide the files to reproduce your problem. For short words like "one", the beam must be closer to 1e-5f. For spotting, it's recommended to use words of 3-4 syllables.
It enables search for multiple keywords listed in a file.
Thank you. What is String and File for (each)?
String is the search name. You pass this name to SpeechRecognizer#startListening(String) to activate the search. File is the file that lists the keywords.
Last edit: Nickolay V. Shmyrev 2014-09-10
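To make the two arguments more concrete, here is a minimal sketch that writes a keyword list file. It assumes the usual pocketsphinx keyword list layout of one phrase per line followed by its threshold between slashes. The class name, the search name "digits", the phrases and the threshold values are placeholders, and the recognizer calls are left as comments because they need the pocketsphinx-android library and a configured recognizer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a pocketsphinx class): builds a keyword list file.
class KeywordListWriter {
    // Formats one line per phrase as "phrase /threshold/".
    // Thresholds are kept as strings so they are written exactly as given.
    static String format(Map<String, String> thresholds) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : thresholds.entrySet()) {
            sb.append(e.getKey()).append(" /").append(e.getValue()).append("/\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder phrases and thresholds; tune each one on test data.
        Map<String, String> thresholds = new LinkedHashMap<>();
        thresholds.put("one", "1e-5");
        thresholds.put("hello world", "1e-20");

        Path file = Files.createTempFile("keywords", ".list");
        Files.write(file, format(thresholds).getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + file);

        // With a configured recognizer, the file and a search name of your
        // choice would then be used like this:
        // recognizer.addKeywordSearch("digits", file.toFile());
        // recognizer.startListening("digits");
    }
}
```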
Why would I use a file with keywords when I can load a custom grammar to achieve the same? Is there any benefit in using one approach over the other?
Grammar search and keyword search are different. Keyword search listens continuously.
Last edit: Nickolay V. Shmyrev 2014-09-10
So, is performance (i.e. speed, not accuracy) the only difference?
The differences between keyword spotting mode and grammar mode are the following:
1) Keyword spotting has an automatic garbage loop for out-of-grammar words, while in grammar mode you have to insert the garbage phone explicitly.
2) Keyword spotting mode has a per-phrase configurable activation threshold, so you can tune false alarms for every word. In grammar mode there is no such thing, even if you add the garbage loop; there are only common insertion penalties.
Last edit: Nickolay V. Shmyrev 2014-09-10