Hi guys, I'm working on a speech recognition project, using the pocketsphinx library on Python 3 with LiveSpeech for continuous recognition from a mic. With a custom dictionary containing only the words I need and a custom LM built from my corpus.txt, recognition works well, but it also recognizes random words as words in my dict. For example, I say "I might move on" (words not in my dict) and it recognizes "Light On" (words in my dict). I've also tried a keyphrases.list, but that made things worse: when I say "Light On" it recognizes "Light One Light One Light Four Light Off Light Ten Light Eight" and so on.
I think the LM works better than a keyphrase list in my case, but I would like to know how to set a recognition threshold, so that actions are triggered only when the recognized words have an accuracy > 85% or similar.
I've attached my corpus, dict, lm and keyphrases.
Can someone please help me?
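To make the request concrete, the kind of gate I have in mind could look like the sketch below. It is only an illustration: `act_if_confident` is a hypothetical helper, and the `confidence` value is assumed to be a number between 0 and 1 that the caller gets from the recognizer in some way. Whether pocketsphinx can give a meaningful value of this kind in LM mode with such a small vocabulary is exactly the open question.

```python
from typing import Callable


def act_if_confident(
    hypothesis: str,
    confidence: float,
    action: Callable[[str], None],
    threshold: float = 0.85,
) -> bool:
    """Run `action` on the hypothesis only if confidence reaches the threshold.

    `confidence` is assumed to be in [0, 1]; how it is obtained from the
    recognizer is outside this sketch.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= threshold:
        action(hypothesis)
        return True
    # Below the threshold: ignore the hypothesis and do nothing.
    return False
```

For example, `act_if_confident("LIGHT ON", 0.6, print)` would print nothing and return `False`, while a confidence of 0.9 would trigger the action.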
If you want to reject other words, you need keyword spotting mode, not an LM. You need to tune the thresholds in the keyphrase.list file for reliable detection, but I doubt you will be able to do it with such similar keywords.
For more accurate recognition you might try vosk-api:
https://github.com/alphacep/vosk-api
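For the threshold tuning mentioned above, a keyphrase list file has one phrase per line followed by its detection threshold between slashes. A minimal sketch for generating such a file while experimenting is shown below. The helper name `format_keyphrase_list` and the threshold values are placeholders, not recommended settings; each threshold has to be tuned per phrase on your own recordings.

```python
from typing import Mapping


def format_keyphrase_list(thresholds: Mapping[str, float]) -> str:
    """Render phrases and thresholds in the 'phrase /threshold/' line format."""
    lines = []
    for phrase, threshold in thresholds.items():
        # Scientific notation keeps very small thresholds readable.
        lines.append(f"{phrase} /{threshold:.0e}/")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values only; the phrases must use words from your dictionary.
    content = format_keyphrase_list({"light on": 1e-20, "light off": 1e-30})
    with open("keyphrase.list", "w", encoding="utf-8") as f:
        f.write(content)
```

Regenerating the file with different values and replaying the same test recordings makes it easier to see which phrases trigger falsely and which are missed.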
Yes, I have read on various sites that keyphrase recognition is better for a small set of phrases, but in my case the results are terrible! When I say "Light On" it recognizes and prints "Light On Off One Light Two Three Light Ten Light Off Light On", whereas with the LM it recognizes only "Light On". So this is a problem.