Hi,
I am trying to recognize keywords, but pocketsphinx also tries to recognize very faint noises and quiet speech.
Is there a way to set a minimum input volume threshold, so that pocketsphinx ignores low-volume sounds?
Last edit: kleone 2018-07-11
Try the -vad_threshold option.
I thought "-vad_threshold" was the "Threshold for decision between noise and silence frames. Log-ratio between signal level and noise level." As I understand it, the value of this option is relative (a ratio between signal level and noise level). So if the overall signal level is low, "-vad_threshold" treats the louder parts as the signal level. Did I get that right?
I also ran a test. I created a 10-minute, 16000 Hz, 16-bit test audio file whose sound level never exceeds -20 dB and used the following command:
I tried different "-vad_threshold" values (1e-10, 0.1, 1, 2, 3, 4, 5), but I always get false positives even though the sound level is really low (-20 dB).
(P.S. What I need now is to ignore all sounds that are quieter than -20 dB.)
Last edit: kleone 2018-07-12
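The question about whether a log-ratio threshold is relative can be illustrated with a toy sketch. This is not pocketsphinx's VAD code: the function names, the natural logarithm, the threshold value of 2.0, and the level numbers are all assumptions made for the illustration. It only shows that a ratio-based decision does not change when the signal and the noise floor are attenuated together.

```python
import math


def log_ratio(signal_level, noise_level):
    """Log-ratio between a frame level and an estimated noise level.

    Natural log is an assumption; the real implementation may differ.
    """
    return math.log(signal_level / noise_level)


def is_speech(signal_level, noise_level, threshold=2.0):
    """Hypothetical relative decision: speech if the log-ratio exceeds the threshold."""
    return log_ratio(signal_level, noise_level) > threshold


# A loud recording: frame level 8000 over a noise floor of 500 (arbitrary units).
loud_decision = is_speech(8000.0, 500.0)

# The same recording attenuated by a factor of 10 (20 dB quieter overall).
quiet_decision = is_speech(800.0, 50.0)

# Both decisions are identical because only the ratio matters.
print(loud_decision, quiet_decision)
```

If the option really behaves this way, lowering the overall volume of a file would not by itself make a relative threshold reject it, which would be consistent with the test result described above.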
You'd better share the audio file to get help on this issue. False positives are caused by a very short keyword, not by background sounds.
Nickolay, thanks for the answer. I've attached the test files.
Yes, I understand this, but what I need is to disable recognition when the sound level is below -20 dB, so that pocketsphinx has no chance to recognize quiet sounds.
Last edit: kleone 2018-07-17
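One way to get an absolute cut-off like the one requested above is to measure the level of each audio chunk yourself and only hand louder chunks to the recognizer. The sketch below is a minimal version of that idea and makes several assumptions: the audio is raw 16-bit signed mono PCM in the machine's native byte order, "-20 dB" is read as -20 dBFS measured as RMS (not peak), and the helper names `level_dbfs` and `gate_chunks` are made up for this example. It does not use any pocketsphinx option or API.

```python
import array
import math

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def level_dbfs(pcm_bytes):
    """RMS level of a raw 16-bit PCM chunk in dB relative to full scale.

    Assumes native byte order and an even number of bytes.
    """
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20.0 * math.log10(rms / FULL_SCALE)


def gate_chunks(chunks, min_dbfs=-20.0):
    """Yield only the chunks whose RMS level is at or above min_dbfs."""
    for chunk in chunks:
        if level_dbfs(chunk) >= min_dbfs:
            yield chunk


# Synthetic 10 ms chunks at 16 kHz (160 samples each).
loud = array.array("h", [16384] * 160).tobytes()   # about -6 dBFS
quiet = array.array("h", [1000] * 160).tobytes()   # about -30 dBFS
kept = list(gate_chunks([loud, quiet]))
print(len(kept))
```

Only the chunks that pass the gate would then be fed to the recognizer. Dropping chunks shortens the stream, so replacing rejected chunks with silence of the same length may be a better fit if timing matters; which works better here is untested.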