
Posterior in sphinx 3

Sylvain
2008-04-03
2012-09-22
  • Sylvain

    Sylvain - 2008-04-03

    Hi,

    Sphinx 3 outputs an acoustic score and a 'language model' score. As far as I understand, the acoustic score is the log posterior, or at least, it only depends on the acoustic model and the list of recognized phonemes. However, the same recording, used with different FSG grammars, but giving the same recognized sentence, gives different acoustic scores (the segmentation is also the same). Is it a bug? How is the acoustic score computed?

    In addition, I don't understand the meaning of the scaling factor.

    Thanks a lot for any help!

    Sylvain

     
    • Nickolay V. Shmyrev

      > it only depends on the acoustic model and the list of recognized phonemes.

      No, that is not correct. The key idea here is rescoring, which is directly related to the scaling factor. Scaling is used to make the search faster and more stable: better-scoring units get a higher weight, so they are preferred during the rest of the search.
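
As an illustration only: one common way to scale scores during a search is to subtract, in each frame, the best log score among the units that are currently active. The thread does not confirm that Sphinx 3 does exactly this, so treat the sketch below as an assumption; the function names, unit labels, and numbers are all hypothetical. Under that assumption, the scaled score of the same path changes when a different grammar activates different competing units, while the scaled score plus the accumulated scaling factors stays the same.

```python
def scale_frames(frames):
    """Subtract each frame's best log score from every unit active in that frame.

    frames: list of dicts mapping a unit name to its log acoustic score.
    Returns (scaled_frames, scaling_factors), one scaling factor per frame.
    """
    scaled, factors = [], []
    for frame in frames:
        best = max(frame.values())  # best score among the active units
        factors.append(best)
        scaled.append({unit: score - best for unit, score in frame.items()})
    return scaled, factors


def path_score(frames, path):
    """Sum the scores of the units on the path, one unit per frame."""
    return sum(frame[unit] for frame, unit in zip(frames, path))


# Hypothetical log scores: same audio, same recognized path, two grammars.
# The larger grammar keeps an extra competing unit "d" active in each frame.
path = ["a", "b"]
small_grammar = [{"a": -10, "c": -14}, {"b": -20, "c": -25}]
large_grammar = [{"a": -10, "c": -14, "d": -8}, {"b": -20, "c": -25, "d": -17}]

for name, frames in (("small", small_grammar), ("large", large_grammar)):
    scaled, factors = scale_frames(frames)
    print(name, path_score(scaled, path), sum(factors))
# prints:
# small 0 -30
# large -5 -25
```

In this toy setup the scaled path score differs between the two grammars (0 versus -5), but adding the summed scaling factors back recovers the same unscaled value (-30) in both cases.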

      For a more precise description of the algorithm, read the articles on Sphinx 3, probably starting with Ravi's thesis:

      http://citeseer.ist.psu.edu/ravishankar96efficient.html

       
