Re: [Kaldi-developers] Baum-Welch training for HMMs

> Personally I suspect that you would not get any improvement from doing E-M as
> opposed to Viterbi-- the posteriors tend to be pretty peaky anyway.

I understand, but my intuition is precisely that the posteriors are not so
peaky for my particular application. Of course, I need to check this
experimentally to see whether or not EM is worth it.

> If you are concerned about the randomness of initialization you could always
> duplicate your training examples several times, so several different random
> paths will be taken.

AFAIK the current implementation of Viterbi is deterministic, so even if I
copy the training samples, I would always get the same path. Anyway, I
think it is a good suggestion to try before implementing EM, or for
validating it: instead of doing the Viterbi alignment, I could randomly
sample a path with probability proportional to its likelihood and do the
data replication trick. With many replicas of each training sample, this
should be similar to EM.
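
As a rough illustration of the sampling idea, here is a toy sketch in plain
Python (not Kaldi code). It assumes a small dense HMM; the names sample_path,
init, trans and emit are hypothetical, and no scaling is done, so it would
underflow on long utterances. It runs a forward pass and then samples the
path backwards, which draws each path with probability proportional to its
likelihood.

```python
import random

def sample_path(init, trans, emit, rng):
    """Sample a state path with probability proportional to its likelihood.

    init[s]     : P(state_0 = s)
    trans[r][s] : P(state_t = s | state_{t-1} = r)
    emit[t][s]  : likelihood of frame t under state s
    rng         : a random.Random instance
    """
    T, S = len(emit), len(init)
    # Forward pass: alpha[t][s] sums the likelihood of all partial paths
    # that end in state s at frame t.
    alpha = [[init[s] * emit[0][s] for s in range(S)]]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append([emit[t][s] * sum(prev[r] * trans[r][s] for r in range(S))
                      for s in range(S)])
    # Backward sampling: draw the last state from alpha[T-1], then each
    # earlier state r with weight alpha[t][r] * trans[r][next_state].
    path = [rng.choices(range(S), weights=alpha[-1])[0]]
    for t in range(T - 2, -1, -1):
        nxt = path[-1]
        weights = [alpha[t][r] * trans[r][nxt] for r in range(S)]
        path.append(rng.choices(range(S), weights=weights)[0])
    path.reverse()
    return path
```

Calling it once per replica of an utterance, with a differently seeded rng
each time, would give the several different random paths mentioned above.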

> Also, E-M training would be at least ten times slower- probably closer to 100
> times slower depending what tricks like pruning you know how to implement.

I am aware of this. The first thing I want to implement after vanilla EM is
beam search within EM, as HTK does. In any case, the data replication
trick would also increase the computational cost.
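
To make the pruning idea concrete, a minimal sketch in plain Python follows.
It is an assumption about how such a beam could look, not HTK's actual
implementation; the function name prune and the convention that beam is a
log-likelihood width are hypothetical.

```python
import math

def prune(alpha_t, beam):
    """Zero every forward score more than `beam` (in log domain) below the best.

    alpha_t : forward likelihoods of all states at one frame
    beam    : beam width as a natural-log likelihood difference
    """
    best = max(alpha_t)
    if best <= 0.0:
        # Nothing is reachable at this frame, so there is nothing to prune.
        return list(alpha_t)
    threshold = best * math.exp(-beam)
    return [a if a >= threshold else 0.0 for a in alpha_t]
```

States whose score is zeroed need not be expanded at the next frame, which is
where the saving would come from.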

> If you really did want to do E-M training, the way to do this would probably be
> to implement, instead of Viterbi, some kind of forward-backward algorithm that
> would directly output posteriors over transition-ids.

This is what I intended to do. In fact, I have already sketched a Forward
algorithm that does this. It needs more work (debugging, beam search, ...)
but it seems to work on toy examples.

However, I am not so sure how to implement the Backward algorithm, since I
must traverse the edges in the FST backwards (to do the backward pass in
O(T * (V + E))), and OpenFST does not support this AFAIK. Also, I am not
sure if simply transposing the FST would work, since I would have many
initial states... Any suggestion on that?
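
To make the question concrete, here is a toy forward-backward sketch in plain
Python over a hypothetical graph representation (a flat list of emitting arcs,
no epsilons, no scaling); it does not use OpenFST, and arc_posteriors and its
arguments are made-up names. In this formulation the backward score of a state
is accumulated from that state's outgoing arcs, and the final weights simply
initialise the last frame, so each frame costs one sweep over the arc list.

```python
def arc_posteriors(num_states, start, finals, arcs, emit):
    """Forward-backward over a toy graph with emitting arcs only.

    arcs   : list of (src, dst, label, weight) tuples
    finals : dict mapping a final state to its final weight
    emit   : emit[t][label] = likelihood of frame t for that label
    Returns post[t][k], the posterior of taking arc k at frame t.
    """
    T = len(emit)
    # Forward: alpha[t][q] = likelihood of frames 0..t-1 on paths start -> q.
    alpha = [[0.0] * num_states for _ in range(T + 1)]
    alpha[0][start] = 1.0
    for t in range(T):
        for src, dst, label, w in arcs:
            alpha[t + 1][dst] += alpha[t][src] * w * emit[t][label]
    # Backward: beta[t][q] = likelihood of frames t..T-1 on paths q -> final.
    beta = [[0.0] * num_states for _ in range(T + 1)]
    for q, w in finals.items():
        beta[T][q] = w
    for t in range(T - 1, -1, -1):
        for src, dst, label, w in arcs:
            beta[t][src] += w * emit[t][label] * beta[t + 1][dst]
    total = beta[0][start]  # total likelihood of the utterance
    return [[alpha[t][src] * w * emit[t][label] * beta[t + 1][dst] / total
             for src, dst, label, w in arcs] for t in range(T)]
```

If the labels were transition-ids, each row of the result would be a
posterior distribution over the arcs taken at that frame.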

> This would create a difficulty for converting alignments though (this happens
> when bootstrapping later systems, e.g. starting tri2 from tri1).  You would
> probably have to just do Viterbi for that one stage.

I am not sure what you mean. Could you elaborate, or point me to
a recipe where you had to overcome this difficulty?

Many thanks for your help and advice,

Joan Puigcerver.