HMM and AM score

Vamsi
2016-11-18
2016-11-29
  • Vamsi

    Vamsi - 2016-11-18

    I am trying to understand exactly which HMM operations are used to generate the AM score. An HMM supports three types of operations: evaluation (computing the likelihood that an observation sequence was generated by an HMM), classification (predicting the hidden state sequence that generated a given observation sequence), and training (estimating the HMM that best matches the observation sequences). Is evaluation alone used to generate the AM score, given that the goal is to find the string of HMMs that makes up an utterance?

    If yes, what is the basis for the multiple combinations of concatenated HMMs evaluated during scoring with the Token Passing algorithm? The way I see it, every word in the dictionary is mapped to a finite set of pronunciations, and each pronunciation results in one triphone HMM string. So is each of these triphone HMM strings evaluated to generate the AM score along a path?

    I have deliberately left out the role of the LM score in this question, as I am focusing only on the AM scoring aspects.

     
    • Arseniy Gorin

      Arseniy Gorin - 2016-11-23

      Your question is quite fundamental, and the full answer is too long for this forum.

      In short, you are partly right: likelihood computation is a core part of the speech recognition decoder.
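
      To make "likelihood computation" concrete, here is a minimal sketch of HMM evaluation with the forward algorithm in the log domain. It is a toy illustration, not the code of any particular decoder: the function names and the list-based model layout are assumptions made for this example, and `log_emit` stands in for the per-frame acoustic scores that a real system would get from its acoustic model.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(log_init, log_trans, log_emit):
    """Return log P(observations | HMM), summed over all state sequences.

    log_init[s]     : log P(start in state s)
    log_trans[s][t] : log P(next state is t | current state is s)
    log_emit[k][s]  : log P(frame k | state s), the per-frame acoustic score
    """
    n_states = len(log_init)
    # Initialisation: start in each state and emit the first frame.
    alpha = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    # Recursion: sum over all predecessor states, then emit the next frame.
    for frame in log_emit[1:]:
        alpha = [
            log_sum_exp([alpha[s] + log_trans[s][t] for s in range(n_states)])
            + frame[t]
            for t in range(n_states)
        ]
    # Termination: sum over all states at the final frame.
    return log_sum_exp(alpha)
```

      The forward algorithm sums over every state sequence. Replacing that sum with a max gives the score of the single best state sequence instead.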

      The next part of your question is a little more complex. In practice you do not evaluate HMMs separately for each word or pronunciation. You build a graph, either in advance or dynamically, in both cases using the language model to create the word search space and the lexicon to go down to the phoneme level. If you scored each possible word sequence separately, the cost would grow exponentially. Also, in practice you usually use the Viterbi algorithm to evaluate the score of each possible path and cut some paths early if their score is too low (the beam search heuristic).
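
      The Viterbi search with beam pruning mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not an actual decoder: the names `viterbi_beam` and `prune`, the dict-based graph layout, and the default beam width are all hypothetical, and the graph here is a plain state graph rather than one compiled from a language model and lexicon. Each state keeps only its best token (score plus path), and tokens that fall too far below the current best are dropped.

```python
import math


def prune(tokens, beam):
    """Keep only tokens whose score is within `beam` of the best score."""
    if not tokens:
        return tokens
    best = max(score for score, _ in tokens.values())
    return {s: tok for s, tok in tokens.items() if tok[0] >= best - beam}


def viterbi_beam(log_init, log_trans, log_emit, beam=10.0):
    """Return (best_log_score, best_state_path) for a sequence of frames.

    log_init  : dict state -> log P(start in state)
    log_trans : dict state -> dict next_state -> log transition probability
    log_emit  : list (one entry per frame) of dict state -> log P(frame | state)
    beam      : log-score margin below the best token that is still kept
    """
    # One token per state: (accumulated log score, path of states so far).
    tokens = {
        s: (lp + log_emit[0].get(s, -math.inf), [s])
        for s, lp in log_init.items()
    }
    tokens = prune(tokens, beam)
    for frame in log_emit[1:]:
        new_tokens = {}
        for s, (score, path) in tokens.items():
            for t, lp in log_trans.get(s, {}).items():
                cand = score + lp + frame.get(t, -math.inf)
                # Viterbi step: keep only the best token arriving in state t.
                if t not in new_tokens or cand > new_tokens[t][0]:
                    new_tokens[t] = (cand, path + [t])
        tokens = prune(new_tokens, beam)
    if not tokens:
        raise ValueError("no surviving path; the graph has a dead end")
    return max(tokens.values(), key=lambda tok: tok[0])
```

      A narrow beam makes the search faster but can discard the path that would have won in the end, which is why beam search is a heuristic rather than an exact method.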

      I'd suggest checking "Efficient Algorithms for Speech Recognition" by Ravishankar or the somewhat more recent "Speech and Language Processing" by Jurafsky and Martin.

       
  • Vamsi

    Vamsi - 2016-11-29

    Thank you, Arseniy. That helped a lot!

     
