As we know, Sphinx2 and PocketSphinx include the script "sphinx2-simple"; running it starts a program that recognizes words spoken into the user's microphone. In Sphinx3 the files are different. Can anyone tell me how to use Sphinx3 to recognize speech from the microphone?
Thank you very much.
Check scripts/sphinx3-simple
Or just execute sphinx3_livedecode <arguments file>
Hope this helps.
Thank you for your reply. I tried it, but it did not recognize any words.
The screen displays:
"starting recording", press Enter to end recording
When I press the Enter key, it displays ".." "...". When I press Enter again, the program exits.
Can you tell me your email? Thank you again.
Try saying something like "one two three" after pressing the Enter key. At least it works for me, with limited accuracy. If it doesn't work for you, check with some other recording program that your speech actually reaches the system. Also, by default it seems to listen on the first sound device (/dev/dsp). If you have more than one sound device and the microphone is connected to the second one (/dev/dsp1), you might have to hack the source.
Dear sir, thanks for your help. Could I ask you another question? Maybe the problem is my Linux system's sound device. I found that after I run sphinx3_livedecode and press Enter the second time, the program ends and produces the sound file "Out.raw". Do you see this file on your system? I think it is a recording function. Also, what is your screen output after you speak?
This is my email: "jack-cui-jarod@163.com"
Thanks again, you are a kind man.
Hello,
This is what I get after pressing ENTER:
press ENTER to start recording
press ENTER to finish recording
ad_oss.c 255: can't set input gain/recording level for this device.
....
.... (lots of them)
Backtrace( 1061022Z192556)
FV: 1061022Z192556> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv: 1061022Z192556> <sil> 0 108 3276723 -74100 3202623 3511827
fv: 1061022Z192556> ONE 109 140 -939000 -148286 -1087286 -449778
fv: 1061022Z192556> <sil> 141 157 -62537 -74100 -136637 79260
fv: 1061022Z192556> Q 158 187 -1773374 -148286 -1921660 -1043005
fv: 1061022Z192556> <sil> 188 204 -710048 -74100 -784148 -378463
fv: 1061022Z192556> AREA 205 240 -1138601 -148286 -1286887 -449418
fv: 1061022Z192556> <sil> 241 330 5068937 -74100 4994837 5414515
FV: 1061022Z192556> TOTAL 3722100 -741258
FWDVIT: ONE Q AREA (* 1061022Z192556)
FWDXCT: * 1061022Z192556 S 6542531 T 3723665 A 3722100 L 1565 0 3276723 -8101 <sil> 109 -939000 -16344 ONE 141 -62537 -8101 <sil> 158 -1773374 -16344 Q 188 -710048 -8101 <sil> 205 -1138601 -16344 AREA 241 5068937 -8101 <sil> 331 0 -16344 </s> 331
INFO: stat.c(154): 331 frm; 71 cdsen/fr, 144 cisen/fr, 227 cdgau/fr, 246 cigau/fr, Sen 0.22, CPU 0.22 Clk [Ovrhd 0.17 CPU 0.17 Clk]; 206 hmm/fr, 1 wd/fr, Search: 0.01 CPU 0.01 Clk (* 1061022Z192556)
INFO: fast_algo_struct.c(398): HMMHist0..0: 331(100)
INFO: lm.c(944): 0 tg(), 0 tgcache, 0 bo; 0 fills, 0 in mem (0.0%)
INFO: lm.c(948): 1182 bg(), 1173 bo; 1 fills, 1 in mem (50.0%)
Hypothesis:
ONE Q AREA
I said "one two three" but the decoder recognized "ONE Q AREA", which is OK as my English is not very good ;)
Yes, I get the out.raw file; it is the recording of what I just said. You can open it with a sound editor (import as raw, mono, 16 bit, 16kHz). If there isn't any visible or audible waveform, it means that the sound doesn't reach the decoder.
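If no sound editor is at hand, the same check can be roughly sketched in Python. This is only an illustration, not part of Sphinx3: it assumes the file holds headerless signed 16-bit little-endian mono samples at 16 kHz (the byte order is my assumption, typical for x86 machines), the function names are made up for this example, and the silence threshold of 100 is an arbitrary guess.

```python
import array
import math
import sys
import wave

SAMPLE_RATE = 16000      # Hz, matching the import settings mentioned above
SILENCE_PEAK = 100       # arbitrary threshold (assumption), out of 32768


def load_raw_pcm(data):
    """Interpret bytes as signed 16-bit little-endian mono samples."""
    samples = array.array("h")
    # Drop a trailing odd byte, if any, so the length is a whole number of samples.
    samples.frombytes(data[: len(data) - len(data) % 2])
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def level_stats(samples):
    """Return (peak amplitude, RMS amplitude, duration in seconds)."""
    if not samples:
        return 0, 0.0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms, len(samples) / SAMPLE_RATE


def write_wav(samples, path):
    """Save the samples as a WAV file that any audio player can open."""
    out = array.array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()   # WAV data is little-endian
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(out.tobytes())


def check_recording(raw_path, wav_path=None):
    """Print level statistics for a raw recording and optionally convert it."""
    with open(raw_path, "rb") as f:
        samples = load_raw_pcm(f.read())
    peak, rms, seconds = level_stats(samples)
    print(f"{seconds:.2f} s, peak {peak}, RMS {rms:.1f}")
    if peak < SILENCE_PEAK:
        print("Looks silent: the microphone signal probably never arrived.")
    if wav_path:
        write_wav(samples, wav_path)
```

Calling check_recording("out.raw", "out.wav") in the directory where the decoder was run prints the levels and writes a playable copy. A peak near zero would suggest a mixer or device problem rather than a decoder problem.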