I am new to SphinxII. I am trying to get the system to decode audio files that I recorded in my own voice, each saying the same thing: "go forward 10 meters".
I ran:
./configure
make clean all
make test
make install
It looks like ./sphinx2-test produces good results at the BESTPATH line:
INFO: search.c(2568): 783 candidate words for entering last phone (1/fr)
SFrm EFrm AScr/Frm AScr PathScr BSDiff LatDen PhPerp Word (Bestpath) (goforward)
------------------------------------------------------------------------
63 76 349174 4888443 17438237 -141750 8 1.19 GO
86 117 290375 9292029 26999968 -167232 2 0.89 FORWARD
125 143 282989 5376800 32448889 -153653 5 1.05 TEN
148 194 231355 10873712 43394722 -159068 3 1.11 METERS
INFO: searchlat.c(939): BESTPATH: GO FORWARD TEN METERS (goforward -97741494)
I recorded my own file at 16000 Hz, mono, 16-bit, signed.
The amplitude ranges from -0.3 to +0.8 and the clip is approximately 4 seconds long. The file name is mygoforward.16k. This file plays well using both Linux play and GoldWave:
/usr/bin/play ./mygoforward.16k -f s -r 16000 -w -t raw
The file size is 127926 bytes, which by my calculation works out right for a 4-second clip (16000 samples per second x 2 bytes per sample x 4 seconds = 128000 bytes).
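The size check above can be sketched in Python as follows. This is only an illustration, assuming the file is headerless PCM with the parameters stated in the post (16000 Hz, mono, 16-bit); the function name raw_pcm_duration is made up for this example and is not part of Sphinx.

```python
def raw_pcm_duration(num_bytes, sample_rate=16000, bytes_per_sample=2, channels=1):
    """Return (num_samples, seconds) for headerless PCM data of the given size."""
    frame_bytes = bytes_per_sample * channels   # bytes per sample across all channels
    num_samples = num_bytes // frame_bytes      # ignore any trailing partial sample
    return num_samples, num_samples / sample_rate

samples, seconds = raw_pcm_duration(127926)
print(samples, round(seconds, 3))  # 63963 samples, about 3.998 seconds
```

For a real file, the byte count could be taken from os.path.getsize("mygoforward.16k") instead of being typed in by hand.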
I changed turtle.ctl to use this new file, mygoforward.
Then I ran:
./configure
make clean all;make test
The new file name mygoforward seems to be recognized, but the decoder does not recognize the same words uttered in my own voice.
INFO: uttproc.c(1382): Samples histogram (mygoforward) (4/8/16/30/32K): 96.3%(61589) 2.9%(1858) 0.8%(503) 0.0%(13) 0.0%(0); max: 26083
5.138 = AGC MAX
SFrm Efrm AScr/Frm AScr LScr BSDiff LatDen PhPerp Word (FWDTREE) (mygoforward)
---------------------------------------------------------------------
0 48 -190493 -9334195 0 -180307 2 1.00 <s>
49 83 -210710 -7374863 -52986 -183413 2 0.89 SIL
84 117 -224194 -7622609 -52986 -150829 5 1.22 SIL
118 147 -239993 -7199818 -52986 -145032 11 1.29 SIL
148 164 -225240 -3829088 -52986 -190644 5 1.13 SIL
165 192 -218652 -6122267 -52986 -192475 3 1.27 SIL
193 238 -220551 -10145381 -52986 -161658 5 1.77 SIL
239 259 -213408 -4481585 -52986 -160426 5 1.96 SIL
260 279 -211849 -4236995 -52986 -174694 3 1.46 SIL
280 369 -178657 -16079189 -52986 -175035 1 0.47 SIL
370 397 -188428 -5276006 -212437 -172178 1 0.79 </s>
INFO: search.c(2643): FWDTREE: (mygoforward -82391307 (A=-81701996 L=-689311))
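One way to sanity-check the recording independently of the decoder is to recompute a samples histogram like the one in the log above directly from the raw file. The sketch below is a rough approximation, not the Sphinx code: the bucket boundaries are my assumption about what the "(4/8/16/30/32K)" label means, the little-endian byte order is also an assumption, and amplitude_histogram is a hypothetical helper name.

```python
import struct

# Assumed upper bounds for the "(4/8/16/30/32K)" buckets; not taken from the Sphinx source.
BOUNDS = (4096, 8192, 16384, 30720, 32768)

def amplitude_histogram(raw_bytes, little_endian=True):
    """Count 16-bit signed samples per absolute-amplitude bucket; return (counts, peak)."""
    fmt = "<h" if little_endian else ">h"
    usable = len(raw_bytes) // 2 * 2            # drop a trailing odd byte, if any
    counts = [0] * len(BOUNDS)
    peak = 0
    for (sample,) in struct.iter_unpack(fmt, raw_bytes[:usable]):
        mag = abs(sample)
        peak = max(peak, mag)
        for i, bound in enumerate(BOUNDS):
            # The last bucket also catches the extreme value 32768.
            if mag < bound or i == len(BOUNDS) - 1:
                counts[i] += 1
                break
    return counts, peak

# Small synthetic check with five hand-picked samples.
demo = struct.pack("<5h", 100, -5000, 12000, -26083, 0)
print(amplitude_histogram(demo))
```

To try it on the recording, the bytes could be read with open("mygoforward.16k", "rb").read(). If the assumptions hold, the counts should total 63963 samples, which is the sum of the counts in the log line and half of the 127926-byte file size.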
I thought Sphinx was speaker-independent?
Do I need to train it?
Thanks
bentckao@hotmail.com
I had overlooked the Open Source Tutorial, which explains how to set up the tutorial and train.
I was able to finish the tutorial on AN1.
Now my challenge is to convert my own wav files into the various model files for recognition.