[Kaldi-users] HMM Topology skipping question


Hi all,

The HMM topology in the standard Kaldi recipes doesn't seem to allow state
skipping, e.g., HMM states 0 and 1 don't transition directly to state 3.
Does this impose the limitation that a phone must last at least 3 frames
(30 ms), i.e., that it takes 3 frames to transition out?
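
As a rough sanity check of the 3-frame figure, here is a minimal Python
sketch (not Kaldi code) that finds the shortest path through a topology.
It assumes emitting states 0-2, a non-emitting final state 3, and a 10 ms
frame shift; the skip variant is hypothetical and only illustrates the
kind of change being asked about.

```python
from collections import deque

FRAME_SHIFT_MS = 10  # assumed frame shift (3 frames = 30 ms)

def min_frames(transitions, start, final):
    """Fewest emitting states visited on any path from start to final.

    Each emitting state visited consumes one frame; the final state is
    assumed to be non-emitting. Returns None if final is unreachable.
    """
    queue = deque([(start, 1)])  # (state, frames consumed so far)
    seen = {start}
    while queue:
        state, frames = queue.popleft()
        for nxt in transitions.get(state, ()):
            if nxt == final:
                return frames
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, frames + 1))
    return None

# 3-state left-to-right topology without skips: self-loop or next state.
no_skip = {0: [0, 1], 1: [1, 2], 2: [2, 3]}

# Hypothetical variant where states 0 and 1 may also jump straight to 3.
with_skip = {0: [0, 1, 3], 1: [1, 2, 3], 2: [2, 3]}

for name, topo in [("no_skip", no_skip), ("with_skip", with_skip)]:
    frames = min_frames(topo, start=0, final=3)
    print(name, frames, "frames =", frames * FRAME_SHIFT_MS, "ms")
```

Under these assumptions the no-skip topology needs at least 3 frames per
phone, while the skip variant can leave after a single frame.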

The reason for asking is that we have seen poor decoding accuracy on
very fast speech.  In the fast speech segments, phones were definitely
pronounced in less than 30 ms, and this results in a very high number of
phone errors.  Separate gmm-align experiments on the same segments point
to the same cause: the shortest phone alignment produced by gmm-align is
30 ms.

We will probably experiment with introducing skip transitions into the HMM
topology.  Before we start, are there any heads-ups or particular reasons
why this may not be a good idea?  Or am I missing something entirely?

--
Thanks

Ben Jiang