
nBest results differ between PS0.7 and nightly

Help
creative64
2011-10-14
2012-09-22
  • creative64

    creative64 - 2011-10-14

    Hi,

    I use pocketsphinx nBest in FSG configuration. Results seem
    consistent in Pocketsphinx versions 0.6.1 and 0.7 release, but starting from
    some nightly build (post version 0.7), something seems to have
    changed. Observations are as follows (for nightly builds):

    a) All nBest hypotheses return the same string (in some cases the score
    values look different but the strings are still the same).

    b) Nightly builds seem to run slower and are printing way more start_nodes
    and end_nodes on the screen for utterances compared to version 0.6.1 and
    0.7 (Functions find_start_node and find_end_node).

    c) I did a quick check of the nBest code and it looks functionally
    similar in all these versions, which suggests the DAG itself is being
    created differently in the nightlies.

    What could be the reason ?

    Note: The most recent Pocketsphinx nightly build I tried is from 10 Oct 2011.

    Regards

     
  • Nickolay V. Shmyrev

    Hello

    There were changes in trunk in FSG search. The purpose was to enforce FSG
    structure in DAG. Previously it wasn't enforced. See the following revisions:

    r10982 | nshmyrev | 2011-05-26 03:29:13 -0400 (Thu, 26 May 2011) | 3 lines
    
    Prefer final hypothesis if it has the same score
    
    
    ------------------------------------------------------------------------
    r10978 | nshmyrev | 2011-05-24 03:03:37 -0400 (Tue, 24 May 2011) | 3 lines
    
    Keep fsg lattice aligned with fsg grammar structure
    
     
  • Nickolay V. Shmyrev

    The core change is this:

     static ps_latnode_t *
    -new_node(ps_lattice_t *dag, fsg_model_t *fsg, int sf, int ef, int32 wid, int32 ascr)
    +find_node(ps_lattice_t *dag, fsg_model_t *fsg, int sf, int32 wid, int32 node_id)
     {
         ps_latnode_t *node;
    
         for (node = dag->nodes; node; node = node->next)
    -        if (node->sf == sf && node->wid == wid)
    +        if ((node->sf == sf) && (node->wid == wid) && (node->node_id == node_id))
                 break;
    +    return node;
    
    Basically, we no longer match lattice nodes that belong to different FSG
    nodes. Previously, nodes were matched whenever the word and start frame
    were the same. For that reason the grammar

    word1 sil word2 sil word3

    could be recognized as

    word1 sil word3

    skipping word2, because the first sil and the second sil were merged into
    one lattice node even though they are different FSG nodes.
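
To make the effect of the extra node_id comparison concrete, here is a minimal standalone sketch. It is not the PocketSphinx source: latnode_t is a hypothetical, cut-down stand-in for ps_latnode_t, and find_node_old / find_node_new are illustrative names for the lookup before and after the change. It assumes two alternative alignments put both sil words at the same start frame, which is the case where the old lookup would merge them.

```c
#include <stddef.h>

/* Simplified stand-in for ps_latnode_t (hypothetical; only the fields
 * that appear in the diff above). */
typedef struct latnode_s {
    int sf;                  /* start frame */
    int wid;                 /* word ID */
    int node_id;             /* FSG state this lattice node belongs to */
    struct latnode_s *next;
} latnode_t;

/* Pre-change lookup: a node matches on start frame and word ID alone. */
latnode_t *
find_node_old(latnode_t *head, int sf, int wid)
{
    latnode_t *node;

    for (node = head; node; node = node->next)
        if (node->sf == sf && node->wid == wid)
            break;
    return node;             /* NULL if nothing matched */
}

/* Post-change lookup: the FSG state must match as well. */
latnode_t *
find_node_new(latnode_t *head, int sf, int wid, int node_id)
{
    latnode_t *node;

    for (node = head; node; node = node->next)
        if (node->sf == sf && node->wid == wid && node->node_id == node_id)
            break;
    return node;             /* NULL if nothing matched */
}
```

With two sil nodes at the same start frame but in different FSG states, find_node_old always returns the first one, so links meant for the second sil end up attached to the first. find_node_new keeps the two apart, which is why the lattice now follows the grammar structure.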

    For the nbest issue, I really recommend implementing a function to strip
    identical entries from the nbest list so that all nbest results are
    different. That problem has stood for a long time and we need to solve it
    finally.
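
One possible shape for such a filter is sketched below. It is only an illustration: unique_hyps is a hypothetical helper, not part of PocketSphinx, and it works on plain C strings that the caller has already collected (for example, copies of the strings obtained while walking the n-best iterator with ps_nbest_hyp; that wiring is left out here, and copying is assumed to be necessary because iterator-owned strings may not outlive an iteration step).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the pointers in hyps[0..n) into out[],
 * keeping only the first occurrence of each distinct string.
 * The original (best-first) order is preserved. out[] must have room
 * for n entries. Returns the number of entries written to out[].
 * The comparison is O(n^2), which is acceptable for short n-best lists. */
size_t
unique_hyps(const char *const *hyps, size_t n, const char **out)
{
    size_t kept = 0;
    size_t i, j;

    for (i = 0; i < n; ++i) {
        if (hyps[i] == NULL)
            continue;                    /* skip empty hypotheses */
        for (j = 0; j < kept; ++j)
            if (strcmp(out[j], hyps[i]) == 0)
                break;                   /* already seen this string */
        if (j == kept)
            out[kept++] = hyps[i];
    }
    return kept;
}
```

Called on the list { "a b", "a b", "a c", "a b" }, this keeps "a b" and "a c". Note that it only compares strings, so hypotheses that differ in score but not in text collapse to the first (best-ranked) entry.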

     
  • creative64

    creative64 - 2011-10-16

    Hi NS,

    Thanks for the information. All the changes seem to be in the DAG creation
    process (within ps_get_hyp).

    Q1. So is it fair to assume that up to history entry creation (steps before
    ps_get_hyp) results will be same for both (versions 0.7 and nightlies) ?

    Q2. This change ultimately is to get better WER for FSG based decoding.
    Correct ?

    Regards,

     
  • Nickolay V. Shmyrev

    Q1. So is it fair to assume that up to history entry creation (steps before
    ps_get_hyp) results will be same for both (versions 0.7 and nightlies) ?

    Should be so

    Q2. This change ultimately is to get better WER for FSG based decoding.
    Correct ?

    It's to make the recognizer return hypotheses that match the specified
    grammar. It can make WER better or worse; it depends on the grammar.

     
