From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-06-26 16:38:07
Hi all,

The standard Kaldi HMM topology doesn't seem to have state skipping, e.g., HMM states 0 and 1 don't go to state 3 directly. Would this introduce a limitation that a phone must be pronounced for at least 3 frames (30 ms)?

The reason for asking is that we have seen poor decoding accuracy on very fast speech. Our analysis shows a rather high phone error rate. Some phones in the fast speech segments were definitely pronounced in less than 30 ms. gmm-align seems to point to this as well: the smallest phone alignment window it produces is 30 ms.

We will probably experiment with introducing skip transitions into the HMM topology. Before we start, any heads-ups? Potential pointers or ideas? Or am I missing something entirely?

--
Thanks
Ben Jiang
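The 3-frame minimum raised in the question can be checked with a small sketch. It is plain Python, not Kaldi's topology file format or API: each topology is a hypothetical dict from an emitting state to the states it may move to, state 3 is taken to be the non-emitting final state, and the 10 ms frame shift is the one implied by "3 frames (30ms)". The skip transitions in `with_skip` are an illustrative guess at what such a topology might look like, not a recommended design.

```python
from collections import deque

FRAME_SHIFT_MS = 10  # implied by the post: 3 frames = 30 ms


def min_frames(transitions, start, final):
    """Return the fewest emitting states visited on any path from start to final.

    transitions maps each emitting state to the states it can move to.
    The final state is non-emitting and has no entry of its own.
    """
    # Breadth-first search: every emitting state entered consumes one frame.
    queue = deque([(start, 1)])
    seen = {start}
    while queue:
        state, frames = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt == final:
                return frames
            if nxt not in seen:  # self-loops and revisits never shorten a path
                seen.add(nxt)
                queue.append((nxt, frames + 1))
    raise ValueError("final state is unreachable")


# Left-to-right topology without skips: self-loop plus a move to the next state.
no_skip = {0: [0, 1], 1: [1, 2], 2: [2, 3]}

# Hypothetical variant with skip transitions 0 -> 2 and 1 -> 3.
with_skip = {0: [0, 1, 2], 1: [1, 2, 3], 2: [2, 3]}

for name, topo in (("no skip", no_skip), ("with skip", with_skip)):
    frames = min_frames(topo, start=0, final=3)
    print(f"{name}: {frames} frames = {frames * FRAME_SHIFT_MS} ms minimum")
```

Under these assumptions the no-skip topology cannot align a phone to fewer than 3 frames, while the variant with skips allows 2.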