How can I get confidence score?

Help
CAO.Z.H
2007-10-24
2012-09-22
  • CAO.Z.H

    CAO.Z.H - 2007-10-24

    When I use Sphinx to recognize a sentence, how can I get a confidence score for every word?
    Is it the acoustic score ( AScr(UnNorm) )?
    And what does the value of AScr(UnNorm) mean?

     
    • David Huggins-Daines

      However I should mention that's done at an utterance level.

      For word-level confidence scoring, word posterior probabilities are the generally accepted way to do things, although they don't work particularly well. There is code for this in Sphinx3, but I am not sure that it is correct, because it doesn't actually give you probabilities, just some magic numbers, and it also contains a lot of mysterious scaling factors. The guy who wrote it was using its output as input to a neural network classifier, so he didn't really care whether it was correct as long as it gave good results.

      See the code in sphinx3/src/libs3decoder/libconfidence if you dare. Or read the original paper: http://citeseer.ist.psu.edu/wessel98using.html

      The most effective way to get actual word posterior probabilities is to dump out HTK format lattices (-outlatfmt htk -outlatdir .) from Sphinx3, then run SRILM's rescoring tool on them. SRILM is not free software but it's freely available for research purposes: http://www.speech.sri.com/projects/srilm/
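
To make "word posterior probability" concrete, here is a minimal sketch of the forward-backward computation on a toy lattice. It is not the Sphinx3 libconfidence code and not what SRILM does internally; the lattice, the scores, and the function names are made up for illustration, and node ids are assumed to be integers in topological order.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def edge_posteriors(edges, start, end):
    """Forward-backward over a word lattice.

    edges: list of (from_node, to_node, word, log_score).
    Assumes integer node ids in topological order (from_node < to_node).
    Returns a list of (word, from_node, to_node, posterior).
    """
    nodes = sorted({n for u, v, _, _ in edges for n in (u, v)})
    alpha = {n: -math.inf for n in nodes}  # log score of all paths start -> n
    beta = {n: -math.inf for n in nodes}   # log score of all paths n -> end
    alpha[start] = 0.0
    beta[end] = 0.0

    # Forward pass: accumulate scores of partial paths reaching each node.
    for n in nodes:
        incoming = [alpha[u] + s for u, v, _, s in edges
                    if v == n and alpha[u] > -math.inf]
        if incoming:
            alpha[n] = logsumexp(incoming)

    # Backward pass: accumulate scores of partial paths leaving each node.
    for n in reversed(nodes):
        outgoing = [s + beta[v] for u, v, _, s in edges
                    if u == n and beta[v] > -math.inf]
        if outgoing:
            beta[n] = logsumexp(outgoing)

    total = alpha[end]  # log score of all complete paths
    return [(w, u, v, math.exp(alpha[u] + s + beta[v] - total))
            for u, v, w, s in edges]

# Hypothetical lattice with two competing paths: "how can" and "who can".
toy_lattice = [
    (0, 1, "how", math.log(0.6)),
    (0, 2, "who", math.log(0.4)),
    (1, 3, "can", math.log(0.5)),
    (2, 3, "can", math.log(0.25)),
]
posteriors = edge_posteriors(toy_lattice, start=0, end=3)
for word, u, v, p in posteriors:
    print(f"{word} ({u}->{v}): {p:.2f}")
```

Each edge posterior is the share of the total path score that passes through that edge, so the posteriors of competing edges over the same time span sum to 1. In a real lattice the edge score would combine the acoustic and language model scores, usually with a scaling factor on the acoustic score.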

       
    • Nickolay V. Shmyrev

      About confidence, see

      https://sourceforge.net/forum/forum.php?thread_id=1847237&forum_id=5470

      AScr is just the score from the acoustic model: the probability of the observation sequence given our HMM. What else could it be?

       
    • CAO.Z.H

      CAO.Z.H - 2007-10-25

      Thanks!
      Yes, AScr is the probability of the observation sequence in the log domain. As I understand it, a probability must be less than 1, so AScr should be negative.
      But when I use Sphinx3, the AScr output is sometimes a positive number, and I don't know why.

       
      • David Huggins-Daines

        Hi,

        This is a "feature" of Sphinx3. The observation probabilities that are used in search are not really probabilities; they are Gaussian densities. While a Gaussian density integrates to 1.0, its value at any single point can be greater than 1.0 (in fact, as the variance approaches zero, the density value at the mean approaches infinity).

        Sphinx2 always normalizes Gaussian densities so that they appear to be probabilities. Sphinx3, for some reason, does not. Therefore the acoustic score can be positive.
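
A small numeric check of the point above, as a sketch only: it evaluates a one-dimensional Gaussian log density at its mean for a wide and a narrow variance. The function name and the variance values are made up for illustration and are not taken from the Sphinx3 code.

```python
import math

def gaussian_log_density(x, mean, variance):
    """Log of the 1-D Gaussian density N(x; mean, variance)."""
    return -0.5 * (math.log(2 * math.pi * variance) + (x - mean) ** 2 / variance)

# Density at the mean for a wide and a narrow Gaussian.
wide = gaussian_log_density(0.0, mean=0.0, variance=1.0)
narrow = gaussian_log_density(0.0, mean=0.0, variance=0.01)

print(f"variance 1.0:  log density = {wide:.3f}")    # negative, density < 1
print(f"variance 0.01: log density = {narrow:.3f}")  # positive, density > 1
```

With variance 0.01 the density at the mean is about 3.99, so its log is positive. Summed over many frames and feature dimensions, such terms can make an unnormalized acoustic score positive.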

         
    • CAO.Z.H

      CAO.Z.H - 2007-10-25

      Thank you all.

      But where can I find documentation for this? For example, some formulas.

      Thanks again.

       
      • Nickolay V. Shmyrev

        > But where can I find documentation for this? For example, some formulas.

        There are a lot of articles and books on HMMs, and on Sphinx in particular. For example, fast GMM search is described here:

        http://www.cs.cmu.edu/~jsherwan/pubs/icslp2004.pdf

        but it's often hard to establish the relationship between the formulae and the code and combine them in a single document :(

         
    • Stefanie Tellex

      Stefanie Tellex - 2007-10-26

      In Sphinx4, I ended up grabbing all the confidence features I could find (from the acoustic model, the language model, etc.) and throwing them into a classifier that I trained on transcribed data. Basically what they describe in this paper: http://citeseer.ist.psu.edu/hazen00recognition.html

      I just played around with classifiers in weka, and we're getting about 85% accurate accept/reject decisions. The most useful feature is the span of the parse of the utterance. If a lot of it parsed, it was probably correct.

      Stefanie
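
As a rough sketch of the "features into a classifier" idea, here is a tiny logistic regression trained on made-up data. It is not the Weka setup described above and not the method from the paper; the two features (fraction of the utterance that parsed, and a normalized recognizer score), the training values, and the function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias

def accept(weights, bias, x, threshold=0.5):
    """Accept the hypothesis if the predicted probability of 'correct' is high enough."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) >= threshold

# Hypothetical features per utterance: (parse coverage, normalized recognizer score).
features = [(0.9, 0.8), (0.8, 0.6), (1.0, 0.7), (0.7, 0.9),
            (0.2, 0.4), (0.3, 0.2), (0.1, 0.5), (0.4, 0.3)]
# 1 = transcription matched the reference, 0 = it did not (made-up labels).
correct = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(features, correct)
print(accept(weights, bias, (0.95, 0.8)))  # high parse coverage
print(accept(weights, bias, (0.15, 0.3)))  # low parse coverage
```

In practice the features would come from the decoder and the parser, the labels from transcribed data, and accuracy would be measured on held-out utterances rather than on the training set.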

       
      • David Huggins-Daines

        Yes, that is the best approach. The probabilities the recognizer gives you (even posterior probabilities) are not very reliable for confidence scoring, even with smart thresholding. This is pretty much the same approach that the Ravenclaw/Olympus dialog framework here at CMU uses: there is a confidence agent called Helios which integrates dialog, parsing, and ASR information to make accept/reject decisions. See: http://reports-archive.adm.cs.cmu.edu/anon/2002/CMU-CS-02-190.ps

         
