Hi -
Given the latest updates to pocketsphinx, let's say that I've trained my
acoustic model with phrases like:
PAGE FORWARD
I NEED HELP
NEXT STEP
And then I have a test file that might have the following words in it:
PAGE FORWARD NEXT STEP BLAH BLAH NEXT STEP
Is there a way to get a score (or confidence) for each word as it is processed?
I'm using the batch file processor right now. I want to basically throw out
the BLAH BLAH.
Thanks!
M
Dear Mike
You can access the confidence score through the ps_get_prob API call. Please
note that confidence estimation for partial results is not implemented yet, as
the doxygen documentation states.
Hi,
This is John. I've half taken over the project Mike has been working on.
With updated pocketsphinx, I've been playing around with ps_get_prob, and it
didn't take long for me to realize there seems to be a lot of other setup
required.
Currently, I added
prob = ps_get_prob(ps, uttid);
to the write_hypseg() method, just to see what kind of results I might get.
This may not be the correct place to have this, but any guess is as good as
any other since ps_get_prob isn't used otherwise. Regardless, it looks to me
to be impossible for ps_get_prob to return a non-zero value; I thought there
may be some setup change I can make to fix this. Below is a description of
what I have found.
First, I noticed that ps_get_prob() seemed to always return 0, so I looked
further (stepped in). In ngram_search_prob() the only way to return a non-zero
value is if ngs->bestpath and ngs->done are true, AND if some other searches
aren't NULL. If all that happens, it returns search->post, which seemed to also
be 0. Then it seemed that ngs->done was always false. I looked for where in the
code it might be changed to true, and found that ngram_search_finish() was
where that might happen. I searched the code for ps_search_finish() and found two
calls to it within ps_end_utt(). I later found out that these each end up
calling two different functions (crazy, right?), but one of them does call
ngram_search_finish(). Unfortunately, I never observed it being called while
ngs->fwdtree or ngs->fwdflat were false; in fact there isn't a line anywhere
in the solution that is "ngs->fwdtree = FALSE." Same thing for ngs->fwdflat.
So ngram_search_finish() can never reach the line "ngs->done = TRUE." (So, I
guess it never finishes?).
There may even be issues after that with regard to how search->post is
changed, but I wanted to try and get this resolved before looking into it.
If this isn't really implemented, then is there another way to obtain
confidence scores in the meantime?
Welcome, John
As for your issue, the API documentation states the same pretty clearly:
Unfortunately I can't tell from your description whether you are trying to
satisfy all the mentioned conditions: doing three-stage decoding with
fwdtree/fwdflat/bestpath and finishing the decoding with ps_end_utt.
fwdtree, fwdflat, and bestpath are all enabled. Each segment is finished with
ps_end_utt. I understand that we can't get a hypothesis for a partial result,
but what about a hypothesis for the complete result?
My previous post was describing why it seems to me that it will never return
non-zero, at least under my current conditions.
If I were to disable fwdtree or fwdflat, would ps_end_utt make it to the line
ngs->done = TRUE? Would that still return zero?
I will try this.
By disabling fwdflat, I was able to see something returned.
So now I want to know whether this is the right place for me to be looking for
the probability (within write_hypseg).
Should I be getting values like:
-2940
-26356
-1132
-6309
If yes, what do they mean?
Hm, it shouldn't be so. What pocketsphinx/sphinxbase version are you talking
about exactly?
It's probability in log scale. You convert it back with
We have PocketSphinx and SphinxBase from Subversion, revisions 10104 and
10103 respectively (May 16, 2010).
Are you using FSG search or an n-gram language model?
ngram language model.
You might be mixing in older versions or something like that. You can check
pocketsphinx/test/unit/test_ps_simple, for example; even with fwdflat
it should return proper scores:
How could I be mixing with older versions? I did a subversion checkout to an
empty folder on my computer.
How do I set up this test using test_ps_simple?
We have been using pocketsphinx_batch.exe. Would I need to record "go forward
and users" and create a -ctl file?
This test is a binary, which you can compile from the source files. You don't
need to record anything, since the test comes with all the required data.
Your issue might be related to the Windows platform you are running on, though
that needs checking.
For example, you might have forgotten to replace sphinxbase.dll with the newly
compiled version.
I do update sphinxbase.dll if I update sphinxbase. While debugging, I run
pocketsphinx_batch with a -argfile that passes the equivalent decode
parameters and any additional parameters I am testing out. If I choose to keep
new parameters, I alter the parameters in the "runtool" section of psdecode.pl
accordingly.
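I am able to get word confidence scores now with -fwdflat disabled. I haven't
run the test yet, but I'm not going to worry about this quite yet. Right now I
am more concerned with "rejecting" low probabilities. I will be trying
something like this:
    /* Check probability */
    // get probability. Value is on logarithmic scale
    prob = ps_get_prob(ps, &uttid);
    // convert the probability to a usable "percentage" value
    prob = 100 * (logmath_exp(ps_get_logmath(ps), prob));
    // reject if the word is below the desired probability
    if (prob < 25) {
        /* word rejection code here */
    }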
Sorry, I haven't had a chance to set up Windows development again; it's painful
to reboot.
Do you have any news on this issue? Did you succeed in running the test?
I haven't run the test yet. It is still unclear to me what I need to do in
order to run that test.
So, do I make a new project using all the files? Which files?
Or do I change something in my decode.cfg file?
Or do I run decode\slave.pl with some additional -parameter?
Lately, finding a way to have -fwdflat enabled hasn't been a top priority for
me. Instead I have been testing which probability threshold works best for
turning away improper phrases while not turning away correct phrases.
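One way to make that kind of tuning systematic is to score a set of utterances whose correctness is already known and count the mistakes at each candidate cutoff. The sketch below is only an illustration of that idea: sample_t, count_errors, and best_threshold are hypothetical names, the confidences are assumed to be percentages in the range 0 to 100, and false accepts and false rejects are weighted equally, which may not match the real cost of each kind of error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labelled sample: a confidence percentage and whether
 * the recognized phrase was actually correct. */
typedef struct {
    double confidence;  /* 0..100 */
    int    is_correct;  /* 1 if the phrase was right, 0 otherwise */
} sample_t;

typedef struct {
    size_t false_accepts;  /* wrong phrases that passed the cutoff */
    size_t false_rejects;  /* correct phrases that were turned away */
} error_counts_t;

/* Count both kinds of error for one cutoff. */
error_counts_t count_errors(const sample_t *s, size_t n, double threshold)
{
    error_counts_t e = {0, 0};
    for (size_t i = 0; i < n; ++i) {
        int accepted = s[i].confidence >= threshold;
        if (accepted && !s[i].is_correct)
            e.false_accepts++;
        if (!accepted && s[i].is_correct)
            e.false_rejects++;
    }
    return e;
}

/* Try every integer cutoff from 0 to 100 and return the one with the
 * fewest total errors; the lowest cutoff wins on ties. */
int best_threshold(const sample_t *s, size_t n)
{
    assert(s != NULL || n == 0);
    int best = 0;
    size_t best_err = (size_t)-1;
    for (int t = 0; t <= 100; ++t) {
        error_counts_t e = count_errors(s, n, (double)t);
        size_t total = e.false_accepts + e.false_rejects;
        if (total < best_err) {
            best_err = total;
            best = t;
        }
    }
    return best;
}
```

Looking at the two counts separately, rather than only at their sum, shows how much each cutoff trades wrongly accepted phrases against wrongly rejected ones.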
This issue is resolved. I didn't run the test, but I re-enabled -fwdflat and
continued to get probability results. The problem I was seeing had more to do
with the point in the decode process at which I was trying to read the
probability.
Hey guys,
I was wondering if there is any way to get alternative results for each word,
along with their probabilities: for example, the 5 best alternatives for each
word with their probabilities, acoustic model scores, language model scores,
etc.
Any thoughts?
nshmyrev: I'm new to Sphinx. Can you tell me how you get these outputs,
please?
Thank you so much! I'm running the test named test_ps_simple.
I got it, the problems are solved. Thank you very much!