From: Eamonn K. <Eam...@cs...> - 2014-03-26 15:55:07
On 26/03/14 15:48, Daniel Povey wrote:
>> Is it possible to perform recognition confined to a small grammar,
>> given that you have trained on a large grammar that includes the small
>> grammar as a subset?
>>
>> I ask because I attempted to follow the recipe of
>> http://vpanayotov.blogspot.ie/2012/06/kaldi-decoding-graph-construction.html
>> to do this, but to no avail.
>>
>> Then I attempted to take egs/voxforge/s5/run.sh, strip out the training
>> section, and change the corpus.txt file to obtain the small grammar.
>> The idea was to generate L and G using the existing run script and then
>> combine them with Ha and C to get the reduced FST. It all compiles and
>> looks like it should work, but there must be a mismatch between the Ha
>> of the existing large-grammar model and the path through the model that
>> uses the smaller G. The recogniser responds to the speaker but produces
>> completely wrong results, and in many cases just produces the same word
>> output every time.
>
> This should work. I suspect you have done something like giving it the
> wrong sample-rate audio. The features are not comparable between
> different sample rates. Check the log-likelihoods you get on decoding
> (caution: they may or may not be printed out multiplied by the acoustic
> scale); if these are very different from your "matched" decoding, then
> likely the acoustics are wrong. Also look at the fMLLR
> objective-function improvement, if you're using fMLLR; if the acoustics
> are mismatched it will be very large, e.g. >5.

I used online-gmm-decode-faster, which is what I usually use. Given the same words, the large grammar recognises them some of the time; my whole point in then using the small grammar is to improve the accuracy. I don't tend to use online-wav-gmm-decode-faster, and I've found that if I use a sample rate of 44100 it gets flagged as an error anyway.
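For reference, what I have been trying corresponds roughly to the standard graph-building route: prepare a separate lang directory for the small grammar and run utils/mkgraph.sh against the already-trained model directory. A minimal sketch, assuming a voxforge-style setup; the directory names data/lang_test_small and exp/tri2b are placeholders for whatever your setup actually uses:

```shell
# Build a decoding graph for the small grammar against the existing
# acoustic model. The lang dir is prepared beforehand (utils/prepare_lang.sh
# plus your small-grammar G.fst); mkgraph.sh composes it with the model's
# H and C transducers:
utils/mkgraph.sh data/lang_test_small exp/tri2b exp/tri2b/graph_small

# The test audio must be at the sample rate the model was trained on
# (typically 16 kHz); e.g. resample 44.1 kHz recordings with sox:
sox input_44k.wav -r 16000 input_16k.wav
```

If the graph builds but decoding still produces garbage, that points back at an acoustic mismatch rather than the grammar.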
>> I've also attempted to see how I might use HCLG o G_s^- o G_l, where
>> G_s is the small grammar and G_l is the large grammar, but I see no
>> documentation on how this is actually performed using a script.
>
> This is implemented; it's called "biglm" in the code and scripts. There
> is an example in the WSJ scripts, egs/wsj/s5/.

Thanks, I'll have a look at it again. I was looking for it in egs/wsj/s5 because I had found an s3 version on Bitbucket. Maybe I just missed it.

> Dan
>
> _______________________________________________
> Kaldi-developers mailing list
> Kal...@li...
> <mailto:Kal...@li...>
> https://lists.sourceforge.net/lists/listinfo/kaldi-developers

-- 
Best Regards, Eamonn

+ + +
Eamonn Kenny B.A., M.Sc., Ph.D.
CNGL/Speech Communication Lab,   Tel: 00+353-1-8961797
Dept. of Computer Science,       Email: Eam...@sc...
F.34, O'Reilly Institute,        http://www.cs.tcd.ie/Eamonn.Kenny
Trinity College Dublin,          http://eamonnmkenny.wordpress.com
Dublin 2, Ireland.
+ + +
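P.S. For anyone finding this in the archives: the "biglm" route Dan mentions is wrapped in steps/decode_biglm.sh in egs/wsj/s5. It decodes with a graph built from one LM and composes in the other LM's G.fst on the fly. A sketch, assuming WSJ-style directory names (all of them placeholders; check the script's usage message in your checkout for the exact arguments):

```shell
# Decode using a graph built with the old (pruned) LM while rescoring
# on the fly with a bigger LM; arguments are: graph dir, old G.fst,
# new G.fst, data dir, output decode dir.
steps/decode_biglm.sh --nj 8 \
  exp/tri2b/graph_tgpr \
  data/lang_test_tgpr/G.fst \
  data/lang_test_tg/G.fst \
  data/test_eval92 \
  exp/tri2b/decode_biglm_eval92
```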