CMU Sphinx / Forums / Help: Word confidence

Mike Medved - 2010-01-29

Hi -

Given the latest updates to pocketsphinx, let's say that I've trained my
acoustic model with phrases like:

PAGE FORWARD
I NEED HELP
NEXT STEP

And then I have a test file that might have the following words in it:

PAGE FORWARD NEXT STEP BLAH BLAH NEXT STEP

Is there a way to get a score (or confidence) for each word as it is processed?
I'm using the batch file processor right now. Basically, I want to throw out
the BLAH BLAH.

Thanks!
M

Nickolay V. Shmyrev - 2010-02-01

Dear Mike

You can access the confidence score through the ps_get_prob API call. Please
note that confidence estimation for partial results is not implemented yet, as
the Doxygen documentation states.

John Derkacz - 2010-06-10

Hi,
This is John. I've half taken over the project Mike has been working on.
With updated pocketsphinx, I've been playing around with ps_get_prob, and it
didn't take long for me to realize there seems to be a lot of other setup
required.

Currently, I added
prob = ps_get_prob(ps, uttid);
to the write_hypseg() method, just to see what kind of results I might get.
This may not be the correct place to have this, but any guess is as good as
any other since ps_get_prob isn't used otherwise. Regardless, it looks to me
to be impossible for ps_get_prob to return a non-zero value; I thought there
may be some setup change I can make to fix this. Below is a description of
what I have found.

First, I noticed that ps_get_prob() seemed to always return 0, so I looked
further (stepped in). In ngram_search_prob() the only way to return a non-zero
is if ngs->bestpath and ngs->done are true, AND if some other searches aren't
NULL. If all that happens, it returns search->post, which seemed to also be 0.

Then it seemed that ngs->done was always false. I looked for where in the code
it might be set to true, and found that ngram_search_finish() was where that
might happen. I searched the code for ps_search_finish() and found two
calls to it within ps_end_utt(). I later found out that these each end up
calling two different functions (crazy, right?), but one of them does call
ngram_search_finish(). Unfortunately, I never observed it being called while
ngs->fwdtree or ngs->fwdflat were false; in fact there isn't a line anywhere
in the solution that is "ngs->fwdtree = FALSE." Same thing for ngs->fwdflat.
So ngram_search_finish() can never reach the line "ngs->done = TRUE." (So, I
guess it never finishes?).

There may even be issues after that with regard to how search->post is
changed, but I wanted to try and get this resolved before looking into it.

If this isn't really implemented, then is there another way to obtain
confidence scores in the meantime?

Welcome, John

As for your issue, the API documentation states this pretty clearly:

/** 
 * Get posterior probability. 
 * 
 * @note Unless the -bestpath option is enabled, this function will 
 * always return zero (corresponding to a posterior probability of 
 * 1.0).  Even if -bestpath is enabled, it will also return zero when 
 * called on a partial result.  Ongoing research into effective 
 * confidence annotation for partial hypotheses may result in these 
 * restrictions being lifted in future versions. 
 * 
 * @param ps Decoder. 
 * @param out_uttid Output: utterance ID for this utterance. 
 * @return Posterior probability of the best hypothesis. 
 */

Unfortunately, I can't tell from your description whether you are satisfying
all of the conditions mentioned: doing three-stage decoding with
fwdtree/fwdflat/bestpath and finishing the decoding with ps_end_utt.

John Derkacz - 2010-06-11

fwdtree, fwdflat, and bestpath are all enabled. Each segment is finished with
ps_end_utt. I understand that we can't get a hypothesis for a partial result,
but what about a hypothesis for the complete result?

My previous post was describing why it seems to me that it will never return
non-zero, at least under my current conditions.
If I were to disable fwdtree or fwdflat, would ps_end_utt make it to the line
ngs->done = TRUE? Would that still return zero?
I will try this.

John Derkacz - 2010-06-11

By disabling fwdflat, I was able to see something returned.
So now I want to know whether write_hypseg is the right place for me to be
looking for the probability.
Should I be getting values like:
-2940
-26356
-1132
-6309
If yes, what do they mean?

Nickolay V. Shmyrev - 2010-06-11

> By disabling fwdflat, I was able to see something returned.

Hm, it shouldn't be so. What pocketsphinx/sphinxbase version are you talking
about exactly?

> Should I be getting values like: -2940 -26356 -1132 -6309 If yes, what do they mean?

It's a probability on a log scale. You can convert it back with:

logmath_exp(ps_get_logmath(ps), prob)
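
To make the conversion concrete, here is a minimal standalone sketch in C that does the same arithmetic without linking against sphinxbase. It assumes the decoder's log base is 1.0001, which I believe is the default for the -logbase option; the thread does not say which base was in use. The names log_to_linear, ASSUMED_LOGBASE and print_reported_values are made up for illustration. With the real library, the logmath_exp call above does this for you.

```c
#include <stdio.h>

/* ASSUMPTION: log base of the decoder (believed to be the -logbase default). */
#define ASSUMED_LOGBASE 1.0001

/* Hypothetical helper: turn an integer log-scale score into a linear
 * probability, i.e. base raised to the power `score`. Uses
 * exponentiation by squaring so that no math library is needed. */
double log_to_linear(int score, double base)
{
    unsigned int n = (score < 0) ? 0u - (unsigned int)score
                                 : (unsigned int)score;
    double result = 1.0;
    double factor = base;

    while (n > 0) {
        if (n & 1u)
            result *= factor;
        factor *= factor;
        n >>= 1;
    }
    /* Negative scores mean probabilities below 1.0. */
    return (score < 0) ? 1.0 / result : result;
}

/* Print the four values reported in this thread next to their
 * linear equivalents under the assumed base. */
void print_reported_values(void)
{
    static const int scores[] = { -2940, -26356, -1132, -6309 };
    size_t i;

    for (i = 0; i < sizeof(scores) / sizeof(scores[0]); i++)
        printf("%7d -> %.4f\n", scores[i],
               log_to_linear(scores[i], ASSUMED_LOGBASE));
}
```

Under that assumption, the four values John reported come out at roughly 0.75, 0.07, 0.89 and 0.53, and a score of 0 maps to exactly 1.0, which agrees with the note in the doc comment quoted earlier in the thread.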

John Derkacz - 2010-06-11

We have PocketSphinx and SphinxBase from Subversion, revisions 10104 and
10103 respectively (May 16, 2010).

Nickolay V. Shmyrev - 2010-06-11

Are you using FSG search or an n-gram language model?

John Derkacz - 2010-06-11

ngram language model.

You are probably mixing in older versions or something like that. You can
check pocketsphinx/test/unit/test_ps_simple, for example; even with fwdflat
it should return proper scores:

<s> (0:45) P(w|o) = 0.999900 ascr = -4135665 lscr = 0 lback = 1
go (46:63) P(w|o) = 0.924494 ascr = -1653760 lscr = -663485 lback = 2
forward (64:124) P(w|o) = 0.999500 ascr = -5897969 lscr = -300096 lback = 2
and(2) (125:159) P(w|o) = 0.477694 ascr = -4316161 lscr = -244468 lback = 2
users (160:212) P(w|o) = 0.480280 ascr = -5405696 lscr = -623887 lback = 2
<sil> (213:273) P(w|o) = 1.000000 ascr = -5405696 lscr = -290768 lback = 1

John Derkacz - 2010-06-14

How could I be mixing with older versions? I did a subversion checkout to an
empty folder on my computer.
How do I set up this test using test_ps_simple?
We have been using pocketsphinx_batch.exe. Would I need to record "go forward
and users" and create a -ctl file?

Nickolay V. Shmyrev - 2010-06-15

This test is a binary, which you can compile using source files. You don't
need to record anything, since the test comes with all the required data.

Your issue might be related to the Windows platform you are running on, though
that needs to be checked.

Nickolay V. Shmyrev - 2010-06-15

> How could I be mixing with older versions?

You could forget to replace sphinxbase.dll with the newly compiled version,
for example.

John Derkacz - 2010-06-15

I do update sphinxbase.dll if I update sphinxbase. While debugging, I run
pocketsphinx_batch with a -argfile that passes the equivalent decode
parameters and any additional parameters I am testing out. If I choose to keep
new parameters, I alter the parameters in the "runtool" section of psdecode.pl
accordingly.

I am able to get word confidence scores now with -fwdflat disabled. I haven't
run the test yet, but I'm not going to worry about this quite yet. Right now I
am more concerned with "rejecting" low probabilities. I will be trying
something like this:

/* Check Probability */
// get probability. Value is on logarithmic scale
prob = ps_get_prob(ps, &uttid);
// convert the probability to a usable "percentage" value
prob = 100*(logmath_exp(ps_get_logmath(ps), prob));
// reject if the word is below the desired probability
if (prob < 25) {
    /* word rejection code here */
}
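
Note that, per the doc comment quoted earlier, ps_get_prob returns the posterior of the best hypothesis as a whole, so the check above accepts or rejects an entire utterance. For the original goal of dropping individual words (the BLAH BLAH case), the same threshold idea can be applied word by word. The following is a minimal standalone sketch of that filtering step only. The struct word_conf and both function names are hypothetical, and it assumes the per-word probabilities have already been obtained and converted to the 0..1 range (for example, the P(w|o) values printed by test_ps_simple).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for one decoded word and its posterior
 * probability, already converted from the log scale to 0..1. */
struct word_conf {
    const char *word;
    double prob;
};

/* Copy the words whose probability is at least `threshold` into `out`
 * (which must have room for n entries) and return how many were kept. */
size_t keep_confident_words(const struct word_conf *in, size_t n,
                            double threshold, struct word_conf *out)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++) {
        if (in[i].prob >= threshold)
            out[kept++] = in[i];
    }
    return kept;
}

/* Print the surviving words on one line, separated by spaces. */
void print_words(const struct word_conf *words, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        printf("%s%s", i ? " " : "", words[i].word);
    printf("\n");
}
```

The threshold is passed in rather than hard-coded because the right cutoff depends on the models and the data; 0.25 corresponds to the 25 percent used in the snippet above.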

Nickolay V. Shmyrev - 2010-06-18

Sorry, I haven't had a chance to set up Windows development again; it's
painful to reboot.

Do you have any news on this issue? Did you manage to run the test?

John Derkacz - 2010-06-24

I haven't run the test yet. It is still unclear to me what I need to do in
order to run that test.

> This test is a binary, which you can compile using source files.

So, do I make a new project using all the files? Which files?
Or do I change something in my decode.cfg file?
Or do I run decode\slave.pl with some additional -parameter?

Lately, finding a way to have -fwdflat enabled hasn't been a top priority for
me. Instead, I have been testing which probability threshold works best for
turning away improper phrases without turning away correct ones.
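
One way to make this kind of threshold tuning systematic is to collect confidence values for utterances whose correct answer is known, then count both kinds of error at each candidate threshold. The sketch below is only an illustration of that bookkeeping. The struct and function names are hypothetical, nothing here comes from pocketsphinx, and picking the candidate with the fewest total errors is just one possible criterion.

```c
#include <stddef.h>

/* Hypothetical labeled sample: the confidence reported for an utterance
 * and whether the phrase really should have been accepted. */
struct labeled_score {
    double prob;   /* linear confidence, 0..1 */
    int is_valid;  /* 1 = should be accepted, 0 = should be rejected */
};

/* Error counts for one threshold. */
struct threshold_errors {
    size_t false_accepts;  /* improper phrases that passed */
    size_t false_rejects;  /* correct phrases that were turned away */
};

/* Count both error types when accepting everything with prob >= threshold. */
struct threshold_errors count_errors(const struct labeled_score *s, size_t n,
                                     double threshold)
{
    struct threshold_errors e = { 0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        int accepted = s[i].prob >= threshold;
        if (accepted && !s[i].is_valid)
            e.false_accepts++;
        else if (!accepted && s[i].is_valid)
            e.false_rejects++;
    }
    return e;
}

/* Return the candidate threshold with the fewest total errors.
 * Ties go to the earliest candidate; returns 0.0 if there are none. */
double pick_threshold(const struct labeled_score *s, size_t n,
                      const double *candidates, size_t n_candidates)
{
    size_t i, best = 0, best_total = 0;

    if (n_candidates == 0)
        return 0.0;
    for (i = 0; i < n_candidates; i++) {
        struct threshold_errors e = count_errors(s, n, candidates[i]);
        size_t total = e.false_accepts + e.false_rejects;
        if (i == 0 || total < best_total) {
            best = i;
            best_total = total;
        }
    }
    return candidates[best];
}
```

If false accepts and false rejects are not equally costly in the application, the two counts can be weighted differently before comparing candidates.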

John Derkacz - 2010-06-24

This issue is resolved. I didn't run the test, but I re-enabled -fwdflat and
continued to get probability results. The problem I was seeing had more to do
with the point in the decode process at which I was trying to read the
probability.

Amin Yazdani - 2011-05-16

Hey guys,

I was wondering if there is any way to get alternative results for each word,
along with their probabilities.

For example, the 5-best alternatives for each word with their probabilities,
acoustic model scores, language model scores, etc.

Any thoughts?

amytop - 2012-02-21

nshmyrev: I'm new to Sphinx. Can you tell me how you got this output,
please?

<s> (0:45) P(w|o) = 0.999900 ascr = -4135665 lscr = 0 lback = 1
go (46:63) P(w|o) = 0.924494 ascr = -1653760 lscr = -663485 lback = 2
forward (64:124) P(w|o) = 0.999500 ascr = -5897969 lscr = -300096 lback = 2
and(2) (125:159) P(w|o) = 0.477694 ascr = -4316161 lscr = -244468 lback = 2
users (160:212) P(w|o) = 0.480280 ascr = -5405696 lscr = -623887 lback = 2
<sil> (213:273) P(w|o) = 1.000000 ascr = -5405696 lscr = -290768 lback = 1

Thank you so much!

Nickolay V. Shmyrev - 2012-02-22

> nshmyrev: I'm new to Sphinx. Can you tell me how you got this output,
> please?

I'm running the test named test_ps_simple

amytop - 2012-02-22

I got it; the problems are solved. Thank you very much!
