Hi,
I use pocketsphinx nBest in FSG configuration. Results seem
consistent in the Pocketsphinx 0.6.1 and 0.7 releases, but starting from
some nightly build (post version 0.7) something seems to have
changed. Observations are as follows (for nightly builds):
a) All nBest hypotheses return the same string (in some cases the score
values differ, but the strings are still the same).
b) Nightly builds seem to run slower and print far more start_nodes
and end_nodes on the screen per utterance than versions 0.6.1 and
0.7 (functions find_start_node and find_end_node).
c) I did a quick check of the nBest code and it looks functionally
similar in all these versions. This suggests that the DAG itself is being
created differently in the nightlies.
What could be the reason?
Note: The most recent Pocketsphinx nightly build I tried is from 10 Oct 2011.
Regards
There were changes in trunk in the FSG search. The purpose was to enforce the
FSG structure in the DAG. Previously it wasn't enforced. See the following revisions:
The core change is this:
 static ps_latnode_t *
-new_node(ps_lattice_t *dag, fsg_model_t *fsg, int sf, int ef, int32 wid, int32 ascr)
+find_node(ps_lattice_t *dag, fsg_model_t *fsg, int sf, int32 wid, int32 node_id)
 {
     ps_latnode_t *node;
     for (node = dag->nodes; node; node = node->next)
-        if (node->sf == sf && node->wid == wid)
+        if ((node->sf == sf) && (node->wid == wid) && (node->node_id == node_id))
             break;
+    return node;
Basically, we no longer match lattice nodes that belong to different FSG nodes.
Previously, if the words were the same we matched them. For that reason the grammar
word1 sil word2 sil word3
could be recognized as
word1 sil word3
skipping word2: the first sil and the second sil are different FSG nodes, but
the old code matched them as one and the same lattice node.
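To illustrate the difference, here is a minimal standalone sketch, not the actual pocketsphinx code. The toy_node_t type and both function names are made up for this example, and reading node_id as "the FSG state this word belongs to" is an assumption based on the diff above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a lattice node. */
typedef struct toy_node {
    int sf;                 /* start frame */
    int wid;                /* word ID */
    int node_id;            /* FSG state (assumed meaning) */
    struct toy_node *next;  /* next node in the lattice's node list */
} toy_node_t;

/* Old behaviour: a node matches on start frame and word ID only. */
static toy_node_t *
find_node_old(toy_node_t *nodes, int sf, int wid)
{
    toy_node_t *node;
    for (node = nodes; node; node = node->next)
        if (node->sf == sf && node->wid == wid)
            break;
    return node;
}

/* New behaviour: the FSG state must match as well. */
static toy_node_t *
find_node_new(toy_node_t *nodes, int sf, int wid, int node_id)
{
    toy_node_t *node;
    for (node = nodes; node; node = node->next)
        if (node->sf == sf && node->wid == wid && node->node_id == node_id)
            break;
    return node;
}
```

With two sil entries that start in the same frame but sit in different FSG states, find_node_old returns the first one for both lookups, so their links end up on a single lattice node. find_node_new keeps them apart.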
For the nbest issue, I really recommend implementing a function that strips
identical entries from the n-best list so that the returned results are all
different. That problem has been around for a long time and we need to solve it finally.
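A rough sketch of such a function, working on a plain array of hypothesis strings. The name strip_identical_nbest is hypothetical and this is not part of the pocketsphinx API. It assumes the caller has already collected the strings in rank order (for example from the n-best iterator) and that no entry is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Remove duplicate hypothesis strings in place, keeping the first
 * (best-ranked) occurrence of each string. The unique entries are
 * packed at the front of hyps[] in their original order.
 * Returns the number of unique entries.
 * Uses O(n^2) string comparisons, which is fine for short n-best lists.
 */
static size_t
strip_identical_nbest(const char **hyps, size_t n)
{
    size_t kept = 0;
    size_t i, j;

    assert(hyps != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        /* Look for hyps[i] among the entries kept so far. */
        for (j = 0; j < kept; ++j)
            if (strcmp(hyps[j], hyps[i]) == 0)
                break;
        if (j == kept)          /* not seen before: keep it */
            hyps[kept++] = hyps[i];
    }
    return kept;
}
```

Only the pointers are moved, so the strings themselves must stay valid for as long as the list is used. For large N, a hash set of strings would avoid the quadratic comparison cost.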
Hi NS,
Thanks for the information. All the changes seem to be in the DAG creation
process (within ps_get_hyp).
Q1. So is it fair to assume that up to history entry creation (the steps before
ps_get_hyp) the results will be the same for both (version 0.7 and the nightlies)?
Q2. This change is ultimately meant to give better WER for FSG-based decoding.
Correct?
Regards,
Q1. It should be so.
Q2. It's to make the recognizer return hypotheses that match the specified grammar.
It can make WER better or worse, depending on the grammar.