Hi,
I’d like to find resources and documentation about how Sphinx works. I know that it uses MFCCs as features, a GMM-HMM as the acoustic model, an n-gram language model, and a phonetic dictionary that maps words to phones, but I can’t find any documentation describing the detailed implementation and overall architecture of these components.
Thanks in advance.