Problems with ps_get_prob()

Help
IvanBembe
2015-04-13
2016-03-27
  • IvanBembe

    IvanBembe - 2015-04-13

    Hi all,

    I'm trying to obtain a confidence score that tells me how "good" the final hypothesis is, i.e. how closely it matches the words that were actually spoken (the &score that I pass to ps_get_hyp() for the partial hypothesis doesn't tell me this). I want a rule such as: if the confidence score of a final hypothesis is below 0.5 (50%), discard it. I use a C++/CLI wrapper around pocketsphinx.

    I read the FAQ of the official wiki which explains as follows:

    Garbage Models - requires you to train a special model. There is no public model with garbage phones which can reject OOV words now. There are models with fillers, but they reject only specific sounds (breath, laugh, um). They can't reject OOV words.

    I implemented it and it works remarkably well.

    Generic Word Model - same as above, requires you to train a special model. There are no public models yet.

    I'm working with a grammar, so I skipped this step.

    Confidence Scores - a confidence score (ps_get_prob) can be reliably calculated only for a large vocabulary (> 100 words). It doesn't work with a small grammar. There are approaches with phone-based confidence, and one of them was implemented in sphinx2, but pocketsphinx doesn't support them. Confidence scoring also requires three-pass recognition (both fwdflat and bestpath enabled).
    So for now the recommendation for rejection with a small grammar is: train your own model (and make it public). For a large language model (> 100 words) use the confidence score.

    My grammar contains exactly 100 rules, with one word per rule.

    I'm trying to use it, and I always get 0 when I call ps_get_prob(ps), both with right words (inside the grammar) and with wrong words (outside the grammar). I read pocketsphinx.h, which says:

    note Unless the -bestpath option is enabled, this function will
    always return zero (corresponding to a posterior probability of 1.0).

    config = cmd_ln_init(NULL, ps_args(), TRUE,
                         "-hmm", hmmPath,
                         "-dict", dictPath,
                         "-mmap", "no",
                         "-logfn", logPath,
                         "-kws_threshold", "1e-40",
                         "-fwdflat", "yes",
                         "-bestpath", "yes",
                         NULL);

    Both -fwdflat and -bestpath appear as enabled in the log. Are they activated correctly?

    Even if -bestpath is enabled, it will also return zero when called on a partial result.

    I'm calling ps_get_prob() once the speech has ended, so I assume I'm not calling it on a partial hypothesis:

    // Fires once, on the transition from speech to silence.
    if (isCurrentInSpeech == 0 && isPreviousInSpeech == 1)
    {
        OnResultFinalizedBySilence(previousHyp);
        // Log-domain posterior of the hypothesis; 0 corresponds to probability 1.0.
        int32 getprob = ps_get_prob(ps);
        System::Console::WriteLine("ps_get_prob: " + getprob);
        System::String^ restart = RestartProcessing();
        System::Console::WriteLine(restart);
    }
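
    The condition above is an edge detector on the in-speech flag. A minimal, self-contained sketch of that logic follows; utt_event, classify_transition() and count_finished_utterances() are hypothetical names, and nothing here calls pocketsphinx. One thing worth checking, given the header's remark about partial results, is whether the utterance has already been closed with ps_end_utt() at the point marked in the comment. The post does not show whether RestartProcessing() does this, or in which order, so treat that as an assumption to verify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event type describing a change of the in-speech flag. */
typedef enum { UTT_NONE, UTT_STARTED, UTT_ENDED } utt_event;

/* Classify the change between two consecutive in-speech flags. */
utt_event classify_transition(int previous_in_speech, int current_in_speech)
{
    if (!previous_in_speech && current_in_speech)
        return UTT_STARTED;   /* silence -> speech */
    if (previous_in_speech && !current_in_speech)
        return UTT_ENDED;     /* speech -> silence */
    return UTT_NONE;          /* no change */
}

/* Count how many utterances finish in a sequence of in-speech flags,
 * assuming the stream starts in silence. */
size_t count_finished_utterances(const int *in_speech, size_t n)
{
    assert(in_speech != NULL || n == 0);
    size_t finished = 0;
    int previous = 0;
    for (size_t i = 0; i < n; ++i) {
        if (classify_transition(previous, in_speech[i]) == UTT_ENDED) {
            /* With a real decoder, this is where the utterance would be
             * ended before the final hypothesis and score are read. */
            ++finished;
        }
        previous = in_speech[i];
    }
    return finished;
}
```

    For the flag sequence 0, 1, 1, 0, 0, 1, 0 this counts two finished utterances, one per speech-to-silence edge.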

    Where is my error? I have spent many hours trying to find it, without success. :(

    P.S.: In LM mode I always get 0 too (the model contains 2491 1-grams, 7448 2-grams and 10140 3-grams).

    Thanks so much for everything.

     

    Last edit: IvanBembe 2015-04-13
  • IvanBembe

    IvanBembe - 2015-04-13

    Here is the C++/CLI wrapper:

    Thanks.

     
  • IvanBembe

    IvanBembe - 2015-04-13

    And here is the log

     
  • IvanBembe

    IvanBembe - 2015-04-13

    Here is my grammar (it is a Spanish grammar), which contains 100 rules with one word per rule. The dictionary contains more than 2000 words.

     

    Last edit: IvanBembe 2015-04-13
    • Nickolay V. Shmyrev

      Confidence estimation is not implemented in FSG mode. In language model mode it should work.

       
      • virginia

        virginia - 2016-03-26

        Is it implemented in jsgf mode? Thanks.

         
        • Nickolay V. Shmyrev

          JSGF and FSG modes are the same.

           
  • IvanBembe

    IvanBembe - 2015-04-13

    And what about older versions of pocketsphinx, e.g. 0.8?

    Thanks, Nickolay.

     
    • Nickolay V. Shmyrev

      No, it is not implemented. It is not a trivial algorithm. If you need keyword detection, please use keyword spotting mode, it should fit your task.

       
  • IvanBembe

    IvanBembe - 2015-04-14

    In language model mode I always get 0 too. Where could my error be?

     
    • Nickolay V. Shmyrev

      No idea. You can try to reproduce your issue with C code and share the code and data needed to reproduce it.

       
