[Kaldi-users] Kaldi's recipe with "sp" and "sil" model

Hi all, Daniel,

I've recently been working with the Kaldi toolkit and have run into a
problem with the sp model.

In a typical HTK system, 'sil' and 'sp' are conceptually different, even in
the number of states: 'sil' tends to have 3-5 states, whereas 'sp' has 1.

In Kaldi, if I understand it correctly, the recipes introduce a single
concept called 'optional silence', which can occur at the beginning of an
utterance, between words, or at the end of an utterance.  This 'optional
silence' plays the role of both 'sp' and 'sil' in HTK.

Is it possible to train a Kaldi model with separate sp and sil models? I
ask mainly because I have another system that assumes sp and sil are
different, and I want Kaldi to provide the acoustic model for that system.

One solution I can think of is to modify the default L.fst structure in
Kaldi (the make_lexicon_fst.pl script), for example by substituting sil
with sp on the word-end-phone branching arc (the sil-state/loop-state
branching), but I'm not sure whether this breaks any assumption in other
parts of the recipe.
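
To make the idea concrete, here is a rough sketch of the topology I have in
mind, written as a small Python function that emits arcs in OpenFst text
format. This is not Kaldi's actual script: the function name
lexicon_fst_arcs, the state numbering, and the 0.5 silence probability are
my own assumptions for illustration, and disambiguation symbols are left
out. Optional 'sil' appears only at the start of the utterance, and the
word-end branch goes to an 'sp' state instead of the sil state.

```python
import math


def lexicon_fst_arcs(lexicon, sil_prob=0.5, sil="sil", sp="sp"):
    """Return OpenFst text-format lines for a toy lexicon FST.

    Hypothetical sketch, not Kaldi's make_lexicon_fst script.
    lexicon is a list of (word, [phone, ...]) pairs.
    """
    start, loop, sil_state, sp_state = 0, 1, 2, 3
    sil_cost = -math.log(sil_prob)          # cost of taking the silence branch
    nosil_cost = -math.log(1.0 - sil_prob)  # cost of skipping silence

    lines = [
        # Utterance start: go straight to the loop state, or via 'sil'.
        f"{start}\t{loop}\t<eps>\t<eps>\t{nosil_cost:.6f}",
        f"{start}\t{sil_state}\t<eps>\t<eps>\t{sil_cost:.6f}",
        f"{sil_state}\t{loop}\t{sil}\t<eps>",
        # Word-end short pause: emit 'sp', then return to the loop state.
        f"{sp_state}\t{loop}\t{sp}\t<eps>",
    ]

    next_state = 4
    for word, phones in lexicon:
        src = loop
        for i, phone in enumerate(phones):
            # The word label is output on the first phone arc only.
            olabel = word if i == 0 else "<eps>"
            if i < len(phones) - 1:
                dst = next_state
                next_state += 1
                lines.append(f"{src}\t{dst}\t{phone}\t{olabel}")
                src = dst
            else:
                # Word-end branching: back to the loop state with no pause,
                # or to the sp state (this is where 'sil' is swapped for 'sp').
                lines.append(f"{src}\t{loop}\t{phone}\t{olabel}\t{nosil_cost:.6f}")
                lines.append(f"{src}\t{sp_state}\t{phone}\t{olabel}\t{sil_cost:.6f}")

    lines.append(f"{loop}\t0")  # the loop state is final, with zero cost
    return lines


if __name__ == "__main__":
    toy_lexicon = [("hi", ["hh", "ay"]), ("a", ["ah"])]
    print("\n".join(lexicon_fst_arcs(toy_lexicon)))
```

The printed arcs could then be compiled with fstcompile given matching
phone and word symbol tables. Note that in this sketch a pause at the very
end of the utterance would also be 'sp', which may or may not be what the
other system expects.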

Best regards,
Jiayu