Hello,
I am getting poor performance running pocketsphinx with a CD model. The RT is
about 0.7. The same model gives me an RT of about 0.1 with sphinx4. The accuracy
in both cases is equal and quite acceptable. The basic beam parameters in the
two configurations are the same. When I try a PTM model with pocketsphinx, I get
excellent speed, about 0.03, but poor accuracy. My previous experience was that
even with CD models, pocketsphinx was faster than sphinx4 at the same accuracy.
What could be the problem now?
Did you play with the beams, or are these the default settings?
I have tried both. With the default settings, performance was worse than with
the settings recommended in the performance tuning guide. I am now using the
following parameters for pocketsphinx:
configs_.setFloat("-beam", 1.000000e-80);
configs_.setFloat("-wbeam", 1.000000e-40);
configs_.setInt("-ds", 2);
configs_.setInt("-topn", 2);
configs_.setInt("-maxwpf", 20);
configs_.setInt("-maxhmmpf", 3000);
For sphinx4 I am using:
<property name="absoluteBeamWidth" value="3000"/>
<property name="relativeBeamWidth" value="1E-80"/>
<property name="absoluteWordBeamWidth" value="20"/>
<property name="relativeWordBeamWidth" value="1E-40"/>
<property name="wordInsertionProbability" value="1E-1"/>
<property name="languageWeight" value="9.5"/>
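A small sanity check can confirm that the two configurations really use the same beams. The sketch below assumes the pairing -beam/relativeBeamWidth, -wbeam/relativeWordBeamWidth, -maxhmmpf/absoluteBeamWidth and -maxwpf/absoluteWordBeamWidth, which is inferred from the matching values above rather than taken from either decoder's documentation. BeamParity and its method are hypothetical helpers, not part of pocketsphinx or sphinx4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: compares beam settings across the two decoders. */
final class BeamParity {
    // Assumed correspondence (pocketsphinx flag, sphinx4 property),
    // inferred from the matching values quoted in this thread.
    static final String[][] PAIRS = {
        {"-beam", "relativeBeamWidth"},
        {"-wbeam", "relativeWordBeamWidth"},
        {"-maxhmmpf", "absoluteBeamWidth"},
        {"-maxwpf", "absoluteWordBeamWidth"},
    };

    /** Returns one description per pair whose values differ or are missing. */
    static List<String> mismatches(Map<String, Double> pocketsphinx,
                                   Map<String, Double> sphinx4) {
        List<String> out = new ArrayList<>();
        for (String[] pair : PAIRS) {
            Double a = pocketsphinx.get(pair[0]);
            Double b = sphinx4.get(pair[1]);
            if (a == null || b == null || Double.compare(a, b) != 0) {
                out.add(pair[0] + "=" + a + " vs " + pair[1] + "=" + b);
            }
        }
        return out;
    }
}
```

An empty result means the four paired settings agree; it says nothing about options that exist in only one decoder, such as -ds or -topn.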
The model dictionary has 16562 words.
The CD model was trained on 44 hours with LDA and has 32 densities and 4000 senones.
I am testing on 880 utterances totalling more than 1 hour and get the following
results:
sphinx4:
Sentence accuracy: 73.43927355278093 Word accuracy: 86.75188843695526
RT: 0.42059461110005825
pocketsphinx:
Sentence accuracy: 67.4233825198638 Word accuracy: 82.78560250391236
RT: 0.851102596851953
As you can see pocketsphinx gives me worse accuracy and performance...
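Assuming RT here means the real-time factor (decoding time divided by audio duration, so lower is faster), it can be computed as in the sketch below. RealTimeFactor is a hypothetical helper, and the numbers in the usage note are made up for illustration.

```java
/** Hypothetical helper: real-time factor of a decoding run. */
final class RealTimeFactor {
    /**
     * @param decodeSeconds wall-clock time spent decoding
     * @param audioSeconds  duration of the decoded audio
     * @return decodeSeconds / audioSeconds; values below 1.0 are faster than real time
     */
    static double compute(double decodeSeconds, double audioSeconds) {
        if (audioSeconds <= 0.0) {
            throw new IllegalArgumentException("audio duration must be positive");
        }
        return decodeSeconds / audioSeconds;
    }
}
```

For instance, RealTimeFactor.compute(51.0, 60.0) gives roughly 0.85, meaning 51 seconds of decoding for 60 seconds of audio.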
Maybe you can go back through earlier revisions and find out where it changed.
Unfortunately, I don't remember the point where it changed. That was with a
different model and a different phoneme set.
> As you can see pocketsphinx gives me worse accuracy and performance...
Pocketsphinx usually does worse on badly trained models. For example, it doesn't
care about NaNs in the variances. On a good model, the two decoders should be
comparable with proper settings.
> Now I am using for pocketsphinx the next parameters:
It would be interesting to check the accuracy at the various stages of
processing. What does fwdtree alone give? Try with "-fwdflat no -bestpath no".
"-ds" is not a good option since it skips frames. Sphinx4 has growSkipInterval
for that, which I suppose you don't use.
There are other issues, such as phoneme lookahead, which should also have
similar settings in both decoders.
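To illustrate why -ds can cost accuracy: as a simplified model, with a downsampling ratio of ds the acoustic scores are computed only on every ds-th frame and reused for the frames in between. The sketch below only shows which frame's scores each frame would use under that assumption. FrameSkip is a hypothetical helper, and the real pocketsphinx implementation may differ in detail.

```java
/** Hypothetical helper: models score reuse under frame downsampling. */
final class FrameSkip {
    /**
     * For each frame index t, returns the index of the most recent frame
     * on which scores were actually computed (assumed: frames where t % ds == 0).
     */
    static int[] scoredFrameFor(int numFrames, int ds) {
        if (ds < 1) {
            throw new IllegalArgumentException("ds must be at least 1");
        }
        int[] source = new int[numFrames];
        for (int t = 0; t < numFrames; t++) {
            source[t] = t - (t % ds);
        }
        return source;
    }
}
```

With ds = 2, every odd frame reuses the scores of the frame before it, so half of the acoustic evidence is approximated; with ds = 1 every frame is scored on its own.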
> It would be interesting to check the accuracy at the various stages of
> processing. What does fwdtree alone give? Try with "-fwdflat no -bestpath no".
In the tests I used -fwdtree yes -fwdflat no -bestpath no -compallsen no.
> There are other issues like phoneme lookahead which also should have similar
> settings in both decoders.
In pocketsphinx I haven't set -pl_window, so it was at its default, 0. Which
config option in sphinx4 is responsible for phoneme lookahead? I will repeat the
tests with the -ds option disabled.
It's better to enable fwdflat and bestpath in the end; they increase accuracy
and speed up the decoding.
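The stage-by-stage comparison suggested in this thread could be driven by a flag list like the one below. It is only a sketch: PassStages is a hypothetical helper that builds argument arrays, and it assumes the usual yes/no values for -fwdtree, -fwdflat and -bestpath. Running the decoder and scoring each stage is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: flag sets for measuring each decoding pass in turn. */
final class PassStages {
    private static String yn(boolean on) {
        return on ? "yes" : "no";
    }

    /** Builds the three pass flags as an argument array. */
    static String[] flags(boolean fwdtree, boolean fwdflat, boolean bestpath) {
        return new String[] {
            "-fwdtree", yn(fwdtree),
            "-fwdflat", yn(fwdflat),
            "-bestpath", yn(bestpath),
        };
    }

    /** Returns the stages in order: fwdtree only, plus fwdflat, plus bestpath. */
    static List<String[]> stages() {
        List<String[]> out = new ArrayList<>();
        out.add(flags(true, false, false));
        out.add(flags(true, true, false));
        out.add(flags(true, true, true));
        return out;
    }
}
```

Measuring accuracy and RT after each stage shows which pass contributes the difference, before settling on the full configuration with all passes enabled.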