Anonymous
-
2002-01-17
Dear all,
I want to use Sphinx-2.0.4 to recognize a pre-recorded voice recording that is about 3 minutes long. Here is the shell script that calls sphinx2-continuous to do this:
#########################################
#!/bin/sh
S2BATCH=sphinx2-continuous
HMM=/usr/local/share/sphinx2/model/hmm/6k
TASK=/[mydir]/model
CTLFILE=/[mydir]/model/control.ctl
echo " "
echo "sphinx2-icrt"
echo "Run CMU Sphinx2 in Batch mode to decode an example utterance."
echo " "
$S2BATCH -verbose 9 -adcin TRUE -adcext 16k \
    -ctlfn ${CTLFILE} -ctloffset 0 -ctlcount 100000000 -datadir ${TASK} \
    -agcmax TRUE -langwt 6.5 -fwdflatlw 8.5 -rescorelw 9.5 -ugwt 0.5 \
    -fillpen 1e-10 -silpen 0.005 -inspen 0.65 -top 1 -topsenfrm 3 \
    -topsenthresh -70000 -beam 2e-06 -npbeam 2e-06 -lpbeam 2e-05 \
    -lponlybeam 0.0005 -nwbeam 0.0005 -fwdflat FALSE -fwdflatbeam 1e-08 \
    -fwdflatnwbeam 0.0003 -bestpath TRUE -kbdumpdir ${TASK} \
    -lmfn ${TASK}/LanguageModel -dictfn ${TASK}/DICT \
    -noisedict ${HMM}/noisedict -phnfn ${HMM}/phone -mapfn ${HMM}/map \
    -hmmdir ${HMM} -hmmdirlist ${HMM} -8bsen TRUE -sendumpfn ${HMM}/sendump \
    -cbdir ${HMM} -matchsegfn /[mydir]/model/matchfn
##############################################
However, I get an error message when I run this script:
###############################################
INFO: search.c(1857): search.c(1858): startword= <s> (id= 236)
INFO: uttproc.c(897): Batchmode
ERROR: "uttproc.c", line 996: Utterance too long; truncating to about 6000 frames
ERROR: "uttproc.c", line 860: uttproc_end called when not in IDLE state
###############################################
Can I change some parameters in the script to solve this problem, or do I have to split my recording into several smaller segments?
Thanks in advance!
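For reference, the "split into smaller segments" option could be sketched as below. This is only a rough sketch under stated assumptions, not a tested recipe for Sphinx-2: it assumes the recording is headerless 16 kHz, 16-bit mono PCM (suggested by `-adcext 16k`), and that the control file lists one utterance name per line without the extension. `split_raw`, the `segNNN` names and the output directory are hypothetical. If a frame is 10 ms (also an assumption), 6000 frames is roughly 60 seconds, so a 3-minute file is well over the limit in the log.

```shell
#!/bin/sh
# Sketch: cut a headerless 16 kHz, 16-bit mono recording into fixed-length
# chunks and write a control file naming them (one name per line).
# Assumed: raw PCM input and the control-file layout; all names are hypothetical.
split_raw() {
    src=$1; secs=$2; outdir=$3
    bytes_per_sec=32000                       # 16000 samples/s * 2 bytes/sample
    chunk=$((secs * bytes_per_sec))
    size=$(($(wc -c < "$src")))               # arithmetic strips any padding
    mkdir -p "$outdir"
    : > "$outdir/control.ctl"                 # start with an empty control file
    i=0
    offset=0
    while [ "$offset" -lt "$size" ]; do
        name=$(printf 'seg%03d' "$i")
        # With bs=chunk, skip=i selects the i-th chunk-sized block of the input.
        dd if="$src" of="$outdir/$name.16k" bs="$chunk" skip="$i" count=1 2>/dev/null
        echo "$name" >> "$outdir/control.ctl"
        i=$((i + 1))
        offset=$((offset + chunk))
    done
}
```

A call such as `split_raw long.16k 30 segments` (hypothetical file names) would produce 30-second pieces. Fixed-length cuts can land in the middle of a word, so words at the chunk boundaries may be misrecognized.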
Hi,
1st: the documentation says that only utterances of max. 30 sec are possible (http://www.speech.cs.cmu.edu/sphinx/doc/sphinx2.html)
2nd: I don't think that this can be solved by only changing the parameters. Maybe I'm wrong, but I guess there has to be some silence filtering: after silence detection, each detected part can be recognized separately. IMO, this can only be done through the C interface.
Please correct me if I'm wrong.
Ralf
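The silence-detection idea can also be approximated outside the decoder, before the files are handed to the script. The sketch below is a crude energy threshold, not what Sphinx-2 does internally: it prints byte offsets in the middle of each long enough pause, where the recording could be cut. It assumes headerless 16 kHz, 16-bit mono samples in the machine's native byte order. `find_cuts`, `thresh` and `min_sil` are hypothetical names, and the default values are guesses that would need tuning per recording.

```shell
#!/bin/sh
# Sketch: print candidate cut points (byte offsets) for a raw 16-bit mono file.
# A frame counts as silent when its mean absolute amplitude is below thresh;
# a cut is placed in the middle of every pause of at least min_sil frames.
# Pauses at the very start or end of the file produce no cut.
find_cuts() {
    src=$1; thresh=${2:-500}; min_sil=${3:-30}    # min_sil in 10 ms frames
    od -An -v -td2 "$src" | awk -v thresh="$thresh" -v min_sil="$min_sil" '
    BEGIN { flen = 160 }                          # samples per 10 ms at 16 kHz
    {
        for (k = 1; k <= NF; k++) {
            v = $k; if (v < 0) v = -v
            sum += v; n++
            if (n == flen) { frame_done(sum / flen); sum = 0; n = 0 }
        }
    }
    function frame_done(level) {
        if (level < thresh) {
            if (run == 0) start = f               # first frame of this pause
            run++
        } else {
            if (run >= min_sil && start > 0)
                print int(start + run / 2) * flen * 2
            run = 0
        }
        f++
    }'
}
```

Each printed offset is a multiple of the frame size in bytes, so it can be used with `dd` to extract the pieces between consecutive cuts. Pieces longer than the utterance limit would still need a further split.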