From: Joan P. <joa...@gm...> - 2015-04-01 21:19:08
> Personally I suspect that you would not get any improvement from doing E-M as
> opposed to Viterbi-- the posteriors tend to be pretty peaky anyway.

I understand, but my intuition is precisely that the posteriors are not so
peaky for my particular application. Of course, I need to verify this
experimentally and check whether or not EM is worth it.

> If you are concerned about the randomness of initialization you could always
> duplicate your training examples several times, so several different random
> paths will be taken.

AFAIK the current implementation of Viterbi is deterministic, so even if I
copy the training samples, I would always get the same path. Anyway, I think
it is a good suggestion to try before implementing EM, or for validating it:
instead of doing the Viterbi alignment, I could randomly sample a path with
probability proportional to its likelihood and do the data replication trick.
With many replicas of each training sample, this should be similar to EM.

> Also, E-M training would be at least ten times slower- probably closer to 100
> times slower depending what tricks like pruning you know how to implement.

I am aware of this. The first thing I wanted to implement after vanilla EM is
beam search in EM, as HTK does. Anyhow, the data replication trick would also
increase the computational cost.

> If you really did want to do E-M training, the way to do this would probably
> be to implement, instead of Viterbi, some kind of forward-backward algorithm
> that would directly output posteriors over transition-ids.

This is what I intended to do. In fact, I already sketched a Forward algorithm
that does this. It needs more work (debugging, beam search, ...) but it seems
to work with toy examples. However, I am not so sure how to implement the
Backward algorithm, since I must traverse the edges in the FST backwards (to do
the backward pass in O(T * (V + E))), and OpenFST does not support this AFAIK.
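One observation that may help with the backward pass: under the simplifying assumptions that every arc consumes exactly one frame (no epsilon arcs) and that no pruning is applied, the backward recursion only needs, for each arc, the beta value of its destination state at the next frame. A plain loop over all arcs therefore costs O(E) per frame without reversing the graph. The sketch below illustrates this in Python on a toy arc list. The function name `transition_posteriors`, the `(src, dst, tid, weight)` arc tuples and the toy topology are hypothetical and do not correspond to Kaldi or OpenFST code. Plain probabilities are used for readability; a real implementation would presumably work in the log domain.

```python
def transition_posteriors(arcs, start, finals, emis):
    """Forward-backward over an arc list (illustrative sketch only).

    arcs:   list of (src, dst, tid, weight); every arc consumes one frame
            (assumption: no epsilon arcs).
    start:  index of the single initial state.
    finals: dict mapping final state -> final probability.
    emis:   emis[t][tid] is the likelihood of frame t under label tid.
    Returns (post, total), where post[t][tid] is the posterior of tid at
    frame t and total is the likelihood summed over all complete paths.
    """
    T = len(emis)
    n = 1 + max(max(s, d) for s, d, _, _ in arcs)

    # Forward pass: push probability mass from src to dst.
    alpha = [[0.0] * n for _ in range(T + 1)]
    alpha[0][start] = 1.0
    for t in range(T):
        for src, dst, tid, w in arcs:
            alpha[t + 1][dst] += alpha[t][src] * w * emis[t][tid]

    # Backward pass: same arc loop, but pull mass from dst back to src.
    # No transposed graph is needed because each arc knows both endpoints.
    beta = [[0.0] * n for _ in range(T + 1)]
    for s, p in finals.items():
        beta[T][s] = p
    for t in range(T - 1, -1, -1):
        for src, dst, tid, w in arcs:
            beta[t][src] += w * emis[t][tid] * beta[t + 1][dst]

    total = beta[0][start]

    # Arc posteriors, accumulated per label and frame.
    post = [dict() for _ in range(T)]
    for t in range(T):
        for src, dst, tid, w in arcs:
            p = alpha[t][src] * w * emis[t][tid] * beta[t + 1][dst] / total
            if p > 0.0:
                post[t][tid] = post[t].get(tid, 0.0) + p
    return post, total


# Toy left-to-right graph (hypothetical): state 0 loops or moves to state 1.
ARCS = [(0, 0, 1, 0.5), (0, 1, 2, 0.5), (1, 1, 3, 0.5)]
FINALS = {1: 0.5}
EMIS = [{1: 1.0, 2: 1.0, 3: 1.0}] * 3  # three frames, flat likelihoods

post, total = transition_posteriors(ARCS, 0, FINALS, EMIS)
```

With this toy data there are three equally likely complete paths, so the posteriors at each frame sum to one and label 1 gets posterior 2/3 at the first frame. Multiple final states are handled by the initialisation of `beta[T]`, so no extra super-final state is needed in this setting. Pruning and epsilon arcs would require additional care that this sketch does not attempt.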
Also, I am not sure if simply transposing the FST would work, since I would
have many initial states... Any suggestion on that?

> This would create a difficulty for converting alignments though (this happens
> when bootstrapping later systems, e.g. starting tri2 from tri1). You would
> probably have to just do Viterbi for that one stage.

I am not sure what you mean. Could you expand on your explanation or point to a
recipe where you had to overcome this difficulty?

Many thanks for your help and advice,

Joan Puigcerver.
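The path-sampling idea discussed in this message (drawing an alignment with probability proportional to its likelihood instead of taking the Viterbi path) could be sketched as forward filtering followed by backward sampling. This is only an illustration under textbook HMM assumptions: every arc consumes one frame, there are no epsilon arcs, and plain probabilities are used instead of log-probabilities. The names `sample_alignment`, `ARCS`, `FINALS` and `EMIS` are hypothetical, not Kaldi or OpenFST identifiers.

```python
import random


def sample_alignment(arcs, start, finals, emis, rng):
    """Draw one label sequence with probability proportional to its likelihood.

    arcs:   list of (src, dst, tid, weight); each arc consumes one frame.
    start:  index of the single initial state.
    finals: dict mapping final state -> final probability.
    emis:   emis[t][tid] is the likelihood of frame t under label tid.
    rng:    a random.Random instance, so that runs are reproducible.
    """
    T = len(emis)
    n = 1 + max(max(s, d) for s, d, _, _ in arcs)

    # Forward pass: alpha[t][s] is the total mass of partial paths that
    # reach state s after consuming t frames.
    alpha = [[0.0] * n for _ in range(T + 1)]
    alpha[0][start] = 1.0
    for t in range(T):
        for src, dst, tid, w in arcs:
            alpha[t + 1][dst] += alpha[t][src] * w * emis[t][tid]

    # Pick the final state in proportion to alpha * final probability.
    states = list(finals)
    weights = [alpha[T][s] * finals[s] for s in states]
    cur = rng.choices(states, weights=weights)[0]

    # Walk backwards, choosing each incoming arc in proportion to the
    # forward mass that flows through it.
    path = []
    for t in range(T - 1, -1, -1):
        cands = [a for a in arcs if a[1] == cur]
        wts = [alpha[t][src] * w * emis[t][tid] for src, _, tid, w in cands]
        src, _, tid, _ = rng.choices(cands, weights=wts)[0]
        path.append(tid)
        cur = src
    path.reverse()
    return path


# Toy left-to-right graph (hypothetical): state 0 loops or moves to state 1.
ARCS = [(0, 0, 1, 0.5), (0, 1, 2, 0.5), (1, 1, 3, 0.5)]
FINALS = {1: 0.5}
EMIS = [{1: 1.0, 2: 1.0, 3: 1.0}] * 3  # three frames, flat likelihoods

example = sample_alignment(ARCS, 0, FINALS, EMIS, random.Random(0))
```

Calling `sample_alignment` once per replica of a training utterance gives a different alignment for each copy, which is what the data replication trick needs. The candidate-arc search here is a linear scan for brevity; an index of incoming arcs per state would make the backward walk cheaper.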