CMU Sphinx / Forums / Help: How can I keep it from recognizing a word that is not in the dictionary?

stevenyslin - 2016-06-13

Hello,

When I use "pocketsphinx_continuous" to recognize what I say,
it may output a word from my_db.dic even when I say a word that is not in it.

For example, my my_db.dic is as follows:

APPLE AE P AH L
BANANA B AH N AE N AH
CAT K AE T

But when I say "DOG", it recognizes a word from my_db.dic,
such as "CAT" or another word in my_db.dic.
How can I make it reject a word that I speak but that is not in the dictionary?

Thanks for your help

- Nickolay V. Shmyrev - 2016-06-13
  
  You can use keyword spotting mode
  

stevenyslin - 2016-06-14

Hi Nickolay,
Thank you for your response,

Let me clarify my question.
Suppose I have 20 words in my_db.dic,
e.g. "apple", "banana", "cat", "dog", "like", ...
I found that pocketsphinx easily substitutes a word I spoke
with one of the words in my_db.dic.

For example, when I say "coat" (which is NOT in the dictionary),
pocketsphinx may recognize it as "cat" (which is in the dictionary).

What I wonder is whether there is any other threshold (besides vad_threshold/kws_threshold),
or anything else that I could adjust,
so that words that are NOT in my dictionary are simply dropped
instead of being recognized as a word in my_db.dic.

Keyword spotting mode does not fit my situation well,
since I will be listening for all 20 words as voice command input,
so spotting mode would still need a list of 20 words,
and kws_threshold does not seem to have a significant effect.

Thanks for your help again.

- Nickolay V. Shmyrev - 2016-06-14
  
  We only have keyword spotting mode for word verification; we do not have any other algorithms implemented.
  
  The threshold should work fine for you; you just need to tune it properly on a set of examples, as described in our tutorial http://cmusphinx.sourceforge.net/wiki/tutoriallm
  
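Tuning the threshold on a set of examples usually means running the same test set with several candidate thresholds and comparing the error counts. A minimal sketch in Python, assuming the keyphrase list format shown later in this thread (one keyphrase per line, followed by its threshold between slashes). The function names, the file naming scheme, and the exponent values are illustrative placeholders and are not part of pocketsphinx:

```python
def format_keyphrase_list(keyphrases, exponent):
    """Return keyphrase-list text that gives every phrase the threshold 1e<exponent>."""
    sign = "+" if exponent >= 0 else "-"
    threshold = f"1e{sign}{abs(exponent)}"
    return "".join(f"{phrase} /{threshold}/\n" for phrase in keyphrases)


def write_sweep(keyphrases, exponents, prefix="keyphrase_list_sweep"):
    """Write one keyphrase list file per candidate exponent and return the file names."""
    paths = []
    for exponent in exponents:
        path = f"{prefix}_{exponent}.txt"  # hypothetical naming scheme
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(format_keyphrase_list(keyphrases, exponent))
        paths.append(path)
    return paths


# Preview one candidate list without touching the file system.
print(format_keyphrase_list(["apple", "banana", "cat"], -10), end="")
```

Each generated file can then be passed to -kws in a separate run, and the insertion, deletion, and substitution counts of the runs compared to pick a threshold.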

stevenyslin - 2016-06-15

Hi Nickolay,
Thank you for your response,
I followed your suggestion and set up a keyphrase list,
but it does not seem to help.

my command：

$ pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl <my_db>_test.fileids -lm <my_db>.lm.DMP -dict <my_db>.dict -hmm en-us -kws keyphrase_list -jsgf <my_db>.gram -hyp myvoice.hyp
$ perl word_align.pl <my_db>_test.transcription myvoice.hyp

1. The best result is as follows:

apple banana (arctic_0001)
apple banana (arctic_0001)
Words: 2 Correct: 2 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%
Insertions: 0 Deletions: 0 Substitutions: 0

2. The system recognizes noise as a keyword:

apple *** cat (arctic_0002)
apple cat cat (arctic_0002)
Words: 2 Correct: 2 Errors: 1 Percent correct = 100.00% Error = 50.00% Accuracy = 50.00%
Insertions: 1 Deletions: 0 Substitutions: 0

3. The system recognizes another word as a keyword:

*coat* (arctic_0003)
*cat* (arctic_0003)
Words: 1 Correct: 0 Errors: 1 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 0 Substitutions: 1

My settings are as follows:

my keyphrase_list:

apple /1e-2/
banana /1e-5/
cat /1e-2/
dog /1e-2/
like /1e-2/

my voice.gram:

#JSGF V1.0;
/**
 * JSGF Grammar for Hello World example
 */
grammar voice;
public <nasty> = ((apple* | banana* | cat* | dog* | like*)+)+;

My question:
Is there any way to fix Result 2 (the system recognizes noise as a keyword) and Result 3 (the system recognizes another word as a keyword), or is there a problem with my settings?

Thanks for your help again.
- Nickolay V. Shmyrev - 2016-06-15
  
  When you use both -jsgf and -kws on the command line, the system switches to grammar mode and the -kws argument is ignored. You need to supply only -kws.
  

stevenyslin - 2016-06-15

Hi Nickolay,
Thank you so much for your quick response.
However, when I use keyphrase_list version 1 and version 2,
the results are the same!

Q1:
Is there a problem with my settings, or is there something I missed?

Q2:
Is there any way to fix Result 2 (the system recognizes noise as a keyword) and Result 3 (the system recognizes another word as a keyword), so that these recognition errors are avoided?

Thanks for your help again.

My modified command is as follows:

$ pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl <my_db>_test.fileids -lm <my_db>.lm.DMP -dict <my_db>.dict -hmm en-us -kws keyphrase_list -hyp myvoice.hyp
$ perl word_align.pl <my_db>_test.transcription myvoice.hyp

my keyphrase_list version 1:

apple /1e-50/
banana /1e-50/
cat /1e-50/
dog /1e-50/
like /1e-50/

my keyphrase_list version 2:

apple /1e-2/
banana /1e-5/
cat /1e-2/
dog /1e-2/
like /1e-2/

my final result:

TOTAL Words: 116 Correct: 112 Errors: 9
TOTAL Percent correct = 96.55% Error = 7.76% Accuracy = 92.24%
TOTAL Insertions: 5 Deletions: 0 Substitutions: 4
- Nickolay V. Shmyrev - 2016-06-15
  
  You have too many insertions. You need to try higher threshold values then, like 1e+2 or even 1e+10. It is hard to reproduce your problem since you didn't provide the data.
  
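The summary figures quoted above can be cross-checked from the raw counts. A minimal sketch in Python, assuming the usual definitions (errors = insertions + deletions + substitutions, error rate = errors / reference words, accuracy = 100% minus the error rate, percent correct = correct words / reference words). The function name is illustrative and is not part of word_align.pl:

```python
def alignment_summary(words, insertions, deletions, substitutions):
    """Derive the summary percentages from raw alignment counts."""
    correct = words - deletions - substitutions
    errors = insertions + deletions + substitutions
    error_rate = 100.0 * errors / words
    return {
        "correct": correct,
        "errors": errors,
        "percent_correct": round(100.0 * correct / words, 2),
        "error": round(error_rate, 2),
        "accuracy": round(100.0 - error_rate, 2),
    }


# Totals reported in this thread: 116 words, 5 insertions, 0 deletions, 4 substitutions.
summary = alignment_summary(words=116, insertions=5, deletions=0, substitutions=4)
print(summary)
```

With the totals from this thread the function reproduces 112 correct words, 9 errors, 96.55% correct, 7.76% error, and 92.24% accuracy. Insertions count as errors without reducing the number of correct words, so the 5 insertions account for more than half of the 9 errors.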

Hi Nickolay,
Thank you for your response,

Our use case is Chinese speech recognition, so all of our data is in Chinese.

I tried higher threshold values, but the result is still the same.
My data is in the attached file:
https://drive.google.com/file/d/0BzvKcj5lMO_4Y0JLVUd1bVZDQ3M/view?usp=sharing

Explanation:

etc                      : my settings
tdt_sc_8k                : our HMM model
wav                      : the test data I use to measure the accuracy rate
20160616_key_vX.hyp      : my hyp files
keyphrase_list_vX        : my keyphrase list settings
result_20160616_key_vX   : my results

my command：

$ pocketsphinx_batch -adcin yes -cepdir wav -cepext .wav -ctl etc/voiceadapt_test.fileids -lm etc/voiceadapt.lm.DMP -dict etc/voiceadapt.dict -hmm tdt_sc_8k -kws keyphrase_list_v1 -hyp 20160616_key_v1.hyp
$ perl word_align.pl etc/voiceadapt_test.transcription 20160616_key_v1.hyp 2>&1|tee result_20160616_key_v1

My keyphrase list settings are as follows:

keyphrase_list_v1：

今天天氣狀況  /1e-10/
命令  /1e-1/
查詢  /1e-3/
氣溫  /1e-3/
語音助手    /1e-5/
路況  /1e-3/
開機  /1e-3/
關機  /1e-3/

keyphrase_list_v2：

今天天氣狀況  /1e+20/
命令  /1e+5/
查詢  /1e+5/
氣溫  /1e+5/
語音助手    /1e+10/
路況  /1e+5/
開機  /1e+5/
關機  /1e+5/

keyphrase_list_v3：

今天天氣狀況  /1e+200/
命令  /1e+50/
查詢  /1e+50/
氣溫  /1e+50/
語音助手    /1e+100/
路況  /1e+50/
開機  /1e+50/
關機  /1e+50/

Thanks for your help again.

Last edit: stevenyslin 2016-06-16

stevenyslin - 2016-06-17

Has anyone else run into this problem?
The keyphrase_list does not seem to have any effect.

Thanks a lot for everyone's help

- Nickolay V. Shmyrev - 2016-06-17
  
  It does not work because you use -lm on your command line. The options -lm, -jsgf, and -kws conflict with each other.
  
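Because the three search options are mutually exclusive, a wrapper script can refuse to build a command that mixes them. A minimal sketch in Python that uses only the pocketsphinx_batch flags appearing in this thread. The function name and its parameters are hypothetical helpers, and the paths are the ones from the command posted above:

```python
SEARCH_FLAGS = ("-lm", "-jsgf", "-kws")


def build_batch_command(hmm, dictionary, ctl, hyp, **search):
    """Build a pocketsphinx_batch argument list with exactly one search option.

    `search` must contain exactly one of lm=..., jsgf=..., or kws=...
    """
    chosen = {f"-{name}": value for name, value in search.items()}
    unknown = set(chosen) - set(SEARCH_FLAGS)
    if unknown:
        raise ValueError(f"unknown search option(s): {sorted(unknown)}")
    if len(chosen) != 1:
        raise ValueError("pass exactly one of lm, jsgf, or kws")
    flag, value = next(iter(chosen.items()))
    return [
        "pocketsphinx_batch",
        "-adcin", "yes", "-cepdir", "wav", "-cepext", ".wav",
        "-ctl", ctl, "-dict", dictionary, "-hmm", hmm,
        flag, value, "-hyp", hyp,
    ]


command = build_batch_command(
    hmm="tdt_sc_8k",
    dictionary="etc/voiceadapt.dict",
    ctl="etc/voiceadapt_test.fileids",
    hyp="20160616_key_v1.hyp",
    kws="keyphrase_list_v1",
)
print(" ".join(command))
```

The printed command is the one posted above with -lm left out, so that -kws is the only search option. The argument list can be handed to subprocess.run as is.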

stevenyslin - 2016-06-20

Hi Nickolay,
Thank you for your response,

I have two questions about this.

Q1:
Of the options -lm, -jsgf, and -kws, which gives the highest accuracy when we do not know how noisy the environment will be?

Q2:
If we use -lm, can these problems be solved?

The system recognizes noise as a keyword:

氣溫 *** 命令 (arctic_0044)
氣溫 命令 命令 (arctic_0044)
Words: 2 Correct: 2 Errors: 1 Percent correct = 100.00% Error = 50.00% Accuracy = 50.00%
Insertions: 1 Deletions: 0 Substitutions: 0

The system recognizes another phrase as a keyword:

*幫我發個信* (arctic_0064)
*開機* (arctic_0064)
Words: 1 Correct: 0 Errors: 1 Percent correct = 0.00% Error = 100.00% Accuracy = 0.00%
Insertions: 0 Deletions: 0 Substitutions: 1

"幫我發個信" ("send a message for me") is not in our dictionary.
"開機" ("power on") is in our dictionary.

Thanks for your help again.

stevenyslin - 2016-06-21

Does anyone have any idea about this?
Thanks a lot for everyone's help

- Nickolay V. Shmyrev - 2016-06-21
  
  For options -lm -jsgf and -kws, which have highest accuracy if we are not sure of the noisy environment?
  
  For continuous listening, only kws works.
  
  If we use -lm, this problem can be solve? The system recognize the noise as the keyword. The system recognize the other word as the keyword：
  
  No
  

stevenyslin - 2016-06-22

Ok, I got it, thank you so much.
