I've been trying to use Sphinx3 for the last two months. I've managed to learn
quite a bit of it, but am still very stuck in some places. I'm a novice and my
doubts are probably very trivial but I would really appreciate it if some of
the more experienced users could help me out.
1. I successfully used sphinx_livedecode and sphinx_livepretend. However, I cannot figure out what the command line syntax for s3decode is. The script says usage is <part> <npart>. However, I cannot figure out what exactly part, npart and exptid stand for. Control, I'm guessing, is the ctl file.
2. I built a system using the TIMIT database. However, the accuracy wasn't very good, so I tried to increase it by extracting more features. I extracted 39 features and trained the system. However, on running slave.pl I get the error:
FATAL_ERROR: "kbcore.c", line 633: Feature streamlen(39) != mgau streamlen(117)
I know I'm probably missing some field in the sphinx_decode.cfg file.
I really hope someone can help out with this; I'm really stuck.
Thank you so much in advance.
Hello
The s3decode.pl script is designed to run in a parallel environment, so it
can process a decoding task in parts. It can split the whole list into n parts
and process one specific part, dumping the output for a later merge. If you
want to use it, you can just run it on the whole set: for that, use part 1 and
npart 1. Other arguments are optional.
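To make the part/npart idea concrete, here is a minimal Python sketch of how a control list could be divided between parallel jobs. The function name `select_part` and the exact rounding are assumptions for illustration; s3decode.pl may split the list differently.

```python
def select_part(utterances, part, npart):
    """Return the utterances handled by job `part` out of `npart` (1-based).

    Hypothetical helper: illustrates the idea only, not s3decode.pl's own logic.
    """
    if not 1 <= part <= npart:
        raise ValueError("part must be between 1 and npart")
    total = len(utterances)
    # Integer arithmetic keeps the parts contiguous and non-overlapping.
    start = (part - 1) * total // npart
    end = part * total // npart
    return utterances[start:end]


# A made-up control list with ten utterance names.
ctl = ["utt%02d" % i for i in range(1, 11)]

print(select_part(ctl, 1, 1))  # part 1 of 1: the whole list
print(select_part(ctl, 2, 3))  # the middle third
```

With part 1 and npart 1 the slice covers every utterance, which is why those values run the whole set in a single job.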
If you want to run the sphinx3_decode binary, you need to run it as:
sphinx3_decode -hmm <hmm> -lm <lm> -dict <dict> -ctl <control_file with one name per line> -cepdir <directory for feature files> -hyp <output file>
Or, to process wavs:
sphinx3_decode -hmm <hmm> -lm <lm> -dict <dict> -ctl <control_file with one name per line> -cepdir <directory for feature files> -hyp <output file> -adcin yes -cepext .wav
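If you drive the decoder from a script, the two command lines above can be assembled as an argument list. This is only a sketch: the function name `build_decode_args` and all of the paths are hypothetical, and only the flags quoted in this thread are used.

```python
def build_decode_args(hmm, lm, dictionary, ctl, cepdir, hyp, from_wav=False):
    """Build an argument list for sphinx3_decode from the flags quoted above."""
    args = [
        "sphinx3_decode",
        "-hmm", hmm,
        "-lm", lm,
        "-dict", dictionary,
        "-ctl", ctl,        # control file with one utterance name per line
        "-cepdir", cepdir,  # directory holding the feature (or wav) files
        "-hyp", hyp,        # output hypothesis file
    ]
    if from_wav:
        # Extra flags from the second command: read audio instead of cepstra.
        args += ["-adcin", "yes", "-cepext", ".wav"]
    return args


# Placeholder paths, not real files.
cmd = build_decode_args("model/hmm", "model/timit.lm", "model/timit.dic",
                        "etc/test.ctl", "feat", "result/test.hyp",
                        from_wav=True)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)` once the placeholder paths point at real files.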
Regarding the streamlen error: it looks like you enabled cepwin features with
an LDA transform but didn't pass the transform file to the decoder. You need
to pass it properly. If you don't understand this part yet, I recommend
reverting to the default settings instead.
Thank you so much. I'll give these a shot.
You're right, I do not understand the second part yet. I wanted to extract more
features and changed CFG_VECTOR_LENGTH to 39 in sphinx_decode.cfg.
I also updated ncep to 39 in feat.params.
However, these changes did not give me a 39-dimensional feature vector. So I
checked the make_feats.pl file, and it appeared that the default parameters
were being used to create the features. So I updated -ncep to 39 in
make_feats.pl, which gave me a 39-dimensional feature vector.
I trained the system and got the error that I stated above. Is there some
fundamental error in what I did?
You are confusing feature vector length with cepstrum length.
The cepstrum is stored in MFC files and contains only the mel-frequency
cepstral coefficients. Its typical dimension is 13. The feature vector is a
combination of the cepstrum, the first cepstrum derivatives and the second
cepstrum derivatives, as specified by the feature vector type 1s_c_d_dd. The
size of the feature vector is 39 (13 * 3 with derivatives).
If you change the vector length with -ceplen, you change the length of the
final feature vector.
If you change the cepstrum length with -ncep, you change the length of the
cepstrum.
If you wanted a cepstrum length of 39 (pretty useless; it should be less than
20), you would also need to set the vector length to 39 with -ceplen. You need
to edit the decoder script to do both.
-ncep configures the cepstrum length
-ceplen together with -feat controls the feature vector length
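The arithmetic behind these numbers can be written out as a short Python sketch. The function name is made up for illustration, and only the 1s_c_d_dd type discussed here is handled.

```python
def feature_vector_length(ceplen, feat="1s_c_d_dd"):
    """Length of the final feature vector for a given cepstrum vector length.

    Illustrative only: 1s_c_d_dd stacks the cepstrum, its deltas and its
    delta-deltas, so the result is three times ceplen.
    """
    if feat != "1s_c_d_dd":
        raise ValueError("this sketch only covers 1s_c_d_dd")
    return 3 * ceplen


print(feature_vector_length(13))  # default setup: 39
print(feature_vector_length(39))  # 39-coefficient cepstra: 117
```

The second value matches the mgau streamlen(117) in the error quoted earlier, which would be consistent with a model trained on 39-coefficient cepstra being decoded with the default length of 13. That is a reading of the numbers, not something confirmed in the thread.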
That makes so much more sense. Thank you so much. I'm sure this will clear
everything up. Thank you :D
Hey,
I just tried the decoder and it works perfectly.
I have one last question. So -ceplen and -feat together control the length of
the feature vector.
If I want a feature vector length of 39, what changes should I make to
sphinx_train.cfg?
If -feat is "1s_c_d_dd", is that all that needs to be updated, or should I
update CFG_VECTOR_LENGTH as well?
I am using RunAll.pl to train, so I guess the arguments to the trainer have to
be passed through the config file.
I'm a little confused, because I updated -feat to 1s_c_d_dd and
CFG_VECTOR_LENGTH to 39 and got:
FATAL_ERROR: "corpus.c", line 1754: Expected mfcc vector len of 39, got 26 (1157)
So which fields should I update to get a feature vector of length 39?
I'm really sorry if you're having to repeat yourself, but this would be a lot
of help.
You shouldn't change anything. Default values are:
ncep (cepstrum length) 13
ceplen (vector length) 13
feature 1s_c_d_dd (cepstrum, delta and delta-delta)
full feature vector length 39 (3 * 13)
If you change the vector length (you should not do that), you also need to
change the cepstrum length (the -ncep option in make_feats.pl) and re-extract
the MFC files (cepstrum).
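The numbers in the corpus.c message fit a simple divisibility check, sketched below in Python. This is an inference from the message itself (1157 values, remainder 26 when divided by 39), not a description of the actual code in corpus.c, and the function name is hypothetical.

```python
def check_vector_length(n_floats, veclen):
    """Check whether a file of n_floats values splits into whole vectors.

    Returns (ok, remainder, n_frames); n_frames is None when the split fails.
    """
    remainder = n_floats % veclen
    if remainder != 0:
        return False, remainder, None
    return True, 0, n_floats // veclen


# Values taken from the error message: 1157 floats, expected length 39.
print(check_vector_length(1157, 39))  # (False, 26, None)
# The same file read with the default cepstrum length of 13.
print(check_vector_length(1157, 13))  # (True, 0, 89)
```

Under that reading, the file still holds 13-coefficient cepstra (89 frames), so the trainer rejects it when told to expect 39 values per frame, which is why the MFC files have to be re-extracted after changing -ncep.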
Got it. Thanks :)