From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-07-01 05:34:18
Lowering the beam will make it faster and somewhat less accurate; that's the main effect.

Dan

On Mon, Jul 1, 2013 at 1:32 AM, Nathan Dunn <nd...@ca...> wrote:
>
> This seemed to do the trick.
>
> Would you also recommend lowering the beam as well?
>
> Thanks,
>
> Nathan
>
>
> On Jun 30, 2013, at 9:44 PM, Daniel Povey wrote:
>
> You could try gmm-latgen-faster, which should be faster.
> You can set the lattice beam using e.g. --lattice-beam=5.0
> That should stop this error.
> Dan
>
>
> On Mon, Jul 1, 2013 at 12:40 AM, Nathan Dunn <nd...@ca...> wrote:
>
> Dan,
>
> Thanks for the quick reply. I am using an updated resource management script.
> The beam is 20 and the acoustic scale is 0.1. I'm not sure how I would
> specify the lattice beam. I am on the stable branch.
>
> gmm-latgen-simple --beam=20.0 --acoustic-scale=0.1 --word-symbol-table=$lang/words.txt \
>   $srcdir/final.mdl $graphdir/HCLG.fst "$feats" "ark:|gzip -c > $dir/lat.gz" \
>   ark,t:$dir/test.tra ark,t:$dir/test.ali \
>   2> $dir/decode.log || exit 1;
>
> I generated the language model using the CMU toolkit (not sure if that is
> best practice). For decoding it's a little unusual.
>
> The language model is (currently) a set of the same 3 canonical passages.
> Each reader (100) reads the same 3 passages. We use these to build the
> language model, which is not ideal as it would be better to use the correct
> transcript (we are currently in the process of getting this), though it
> would surprise me if it failed because of this (and it works surprisingly
> well when it doesn't fail).
>
> So, after reading the script and taking your advice, I'm going to try
> reducing the beam to 13 and the acoustic scale to 0.07 (though I think it
> tries to fit this anyway). However, it gives me a similar failure.
> So, here are a few observations:
>
> 1 - it looks like I am getting this from poor-quality audio (the reader
> maxes out the input gain)
> 2 - should I use a method other than gmm-latgen-simple? It looked like
> there were quite a few other options
> 3 - are there other good parameters you would recommend?
>
> Thanks,
>
> Nathan
>
>
> On Jun 30, 2013, at 12:37 PM, Daniel Povey wrote:
>
> In case you are not on the list or did not get the reply.
> Please cc the list if you reply.
>
> ---------- Forwarded message ----------
> From: Daniel Povey <dp...@gm...>
> Date: Sun, Jun 30, 2013 at 3:37 PM
> Subject: Re: [Kaldi-users] issues with decoding (self loops) in tri2a
> for 1-minute speech segments
> To: kal...@li...
>
> Can you describe the language model you use for the decoding phase?
> I'd like to know in order to understand what scenarios this is most
> likely to happen in.
> What values did you use for "beam" and "lattice-beam"? Typically you
> should be able to solve these problems by reducing "lattice-beam".
>
> Dan
>
> On Sun, Jun 30, 2013 at 3:26 PM, Mailing list used for User
> Communication and Updates <kal...@li...> wrote:
>
> Hello,
>
> I'm trying to decode long (1-minute) speech segments (multiple sentences)
> having trained on short (<15-second) corpora. For some reason I get
> random failures like the following (longer logs below):
>
> WARNING (gmm-latgen-simple:Close():kaldi-io.cc:444) Pipe compute-cmvn-stats
> scp:data/test_childspeech/feats.scp ark:- | apply-cmvn --norm-vars=false
> ark:- scp:data/test_childspeech/feats.scp ark:- | add-deltas ark:- ark:- |
> had nonzero return status 36096
> ERROR (gmm-latgen-simple:EpsilonClosure():fstext/determinize-lattice-pruned-inl.h:664)
> Lattice determinization aborted since looped more than 500000 times during
> epsilon closure.
> I've had it decode 135 of these, but sometimes it will fail after 5-10 with
> this error.
>
> My training model is the LDC9763 child's speech corpus, of which I use
> about 4K sentences. If I test on a subset (100 not in the training set)
> I get great results (using the tri2a method).
>
> The language model I use for testing is the target speakers' target
> sentences (45 speakers speaking the same 3 sentences).
>
> Next, I was going to try breaking up the decoding into smaller segments
> (those seem to work) as well as using a more accurate language model for
> the decoding phase.
>
> Any ideas?
>
> Thanks,
>
> Nathan
>
>
> LOG (gmm-latgen-simple:RebuildRepository():fstext/determinize-lattice-pruned-inl.h:285)
> Rebuilding repository.
> [the same LOG line repeated several more times]
>
> Nathan Dunn, Ph.D.
> Scientific Programmer
> College of Arts and Science IT
> 541-221-2418
> nd...@ca...

_______________________________________________
Kaldi-users mailing list
Kal...@li...
https://lists.sourceforge.net/lists/listinfo/kaldi-users
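[Editor's note: putting Dan's two suggestions together, the decode step from Nathan's script could be rewritten roughly as below. This is a sketch only: the variables ($lang, $srcdir, $graphdir, $feats, $dir) and positional outputs are taken from the original gmm-latgen-simple invocation in the thread, and the specific beam values (13.0 and 5.0) are the ones discussed above, not verified settings.]

```shell
# Sketch: gmm-latgen-faster in place of gmm-latgen-simple, with an explicit
# lattice beam to keep lattice determinization bounded. Assumes $lang, $srcdir,
# $graphdir, $feats, and $dir are set as in the original script.
gmm-latgen-faster --beam=13.0 --lattice-beam=5.0 --acoustic-scale=0.1 \
  --word-symbol-table=$lang/words.txt \
  $srcdir/final.mdl $graphdir/HCLG.fst "$feats" "ark:|gzip -c > $dir/lat.gz" \
  ark,t:$dir/test.tra ark,t:$dir/test.ali \
  2> $dir/decode.log || exit 1;
```

[The transcription (test.tra) and alignment (test.ali) outputs are optional trailing arguments for gmm-latgen-faster, as they are for gmm-latgen-simple, so the rest of the script should not need to change.]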