The final mdef file constructed after tree pruning contains a single senone sequence for multiple triphone combinations. How is the correct triphone (for a sseq) selected during recognition?
1) Using the LM? or
2) Some other source of information?
Thanks.
You probably have a misconception about the speech recognition process. A textbook on speech recognition will explain what is going on; in particular, you need to understand how the lextree works.
In short, the recognizer doesn't care about triphones. It only cares about senones and word labels, which are set up during startup. Triphone information is completely omitted from the search. The recognizer knows which senone sequence corresponds to which word label. This mapping is stored in a special structure called the lextree.
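To make that concrete, here is a rough toy sketch in Python. It is not the decoder's actual code or file format: the triphone table, the senone numbers, the two-word dictionary and all function names are made up for illustration. It only shows the idea that triphones are resolved to senone sequence ids once at startup, and that the tree used afterwards stores nothing but those ids and word labels.

```python
from dataclasses import dataclass, field

# Hypothetical toy model definition: (base, left, right) -> tied senone ids.
# Two different triphones of AE share one senone sequence, as after pruning.
TOY_MDEF = {
    ("AE", "K", "T"): (10, 11, 12),
    ("AE", "G", "T"): (10, 11, 12),
    ("K", "SIL", "AE"): (20, 21, 22),
    ("G", "SIL", "AE"): (30, 31, 32),
    ("T", "AE", "SIL"): (40, 41, 42),
}

# Hypothetical pronunciation dictionary.
TOY_DICT = {"cat": ["K", "AE", "T"], "gat": ["G", "AE", "T"]}


def assign_ssids(mdef):
    """Give every distinct senone sequence one id (ssid)."""
    seq_to_id = {}
    tri_to_ssid = {}
    for tri, senones in mdef.items():
        tri_to_ssid[tri] = seq_to_id.setdefault(senones, len(seq_to_id))
    return tri_to_ssid, seq_to_id


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # ssid -> Node
    words: list = field(default_factory=list)     # word labels ending here


def build_lextree(lexicon, tri_to_ssid):
    """Startup step: triphones are looked up here and then forgotten."""
    root = Node()
    for word, phones in lexicon.items():
        padded = ["SIL"] + phones + ["SIL"]  # assume silence on both sides
        node = root
        for i in range(1, len(padded) - 1):
            tri = (padded[i], padded[i - 1], padded[i + 1])
            node = node.children.setdefault(tri_to_ssid[tri], Node())
        node.words.append(word)
    return root


def lookup(root, ssid_path):
    """Search-time view: follow ssids only and return the word labels."""
    node = root
    for ssid in ssid_path:
        node = node.children.get(ssid)
        if node is None:
            return []
    return node.words


tri_to_ssid, seq_to_id = assign_ssids(TOY_MDEF)
root = build_lextree(TOY_DICT, tri_to_ssid)

# Both AE triphones collapse to the same ssid.
print(tri_to_ssid[("AE", "K", "T")], tri_to_ssid[("AE", "G", "T")])
# The word is identified by its path of ssids, not by triphone names.
print(lookup(root, [tri_to_ssid[("K", "SIL", "AE")],
                    tri_to_ssid[("AE", "K", "T")],
                    tri_to_ssid[("T", "AE", "SIL")]]))
```

In this toy, "cat" and "gat" share the middle ssid but are still told apart by the rest of the path, so nothing ever has to pick "the correct triphone". A real decoder is considerably more involved than this.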