From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-07-29 22:42:20
There's no special thing we do for this. Just play with the #leaves and #Gaussians.

Dan

On Mon, Jul 29, 2013 at 6:38 PM, Nathan Dunn <nd...@ca...> wrote:
>
> Thanks.
>
>> It's unusual that the later stages of training are not better.
>> Normally you get a substantial improvement.
>
> I wonder if this is due to the very small amount of my training data.
>
> Is there a recommended recipe that I should follow for this type of data
> (20K in training data, decoding 1 min long passages)? I tried to use swbd,
> but ended up going back to using the settings that more closely matched
> resource management.
>
> Nathan
>
> On Jul 29, 2013, at 3:27 PM, Daniel Povey wrote:
>
>>> 1 - I have a training set of around 5K words, though I could bring that up
>>> to around 20K
>>
>> More language model training data will definitely help.
>>
>>> 2 - I am using the kaldi_lm, though I could use SRILM . . not sure if it
>>> would necessarily improve results
>>
>> Probably would make no difference-- more a usability issue.
>>
>>> 3 - I am decoding about 1 minute of text, though training data is in 10
>>> second epochs. I can mix some of the test data in if that would help.
>>
>> It's not considered good form to mix the test data in with training--
>> this will give you unrealistically good results.
>>
>>> 4 - When I am training deltas I use a very small # of leaves / gauss (100 /
>>> 1000) to get the best results. The best results are with tri1. Further
>>> training yields worse results.
>>
>> It's unusual that the later stages of training are not better.
>> Normally you get a substantial improvement.
>>
>> Dan
>>
>>> 5 - I use the same lexicon for the training and decoding (though a more
>>> restrictive language model for decoding).
>>>
>>> Any help / thoughts are appreciated.
>>>
>>> Thanks,
>>>
>>> Nathan
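
For reference, in the standard egs-style recipes the #leaves and #Gaussians discussed above are passed as the first two positional arguments to steps/train_deltas.sh. Below is a minimal sketch of sweeping a few settings; the directory names (data/train, data/lang, exp/mono_ali, data/lang_test, data/test) and the $train_cmd/$decode_cmd variables are assumptions from a typical RM-style run.sh, not details given in this thread:

  #!/usr/bin/env bash
  # Sketch: sweep the tree size (#leaves) and total #Gaussians for the
  # first triphone (delta) pass and decode each model for comparison.
  . ./cmd.sh    # assumed to define $train_cmd and $decode_cmd
  . ./path.sh

  for pair in "100 1000" "200 2000" "500 5000"; do
    num_leaves=$(echo "$pair" | awk '{print $1}')
    tot_gauss=$(echo "$pair" | awk '{print $2}')
    dir=exp/tri1_${num_leaves}_${tot_gauss}

    # #leaves and #Gaussians are the first two positional arguments.
    steps/train_deltas.sh --cmd "$train_cmd" \
      $num_leaves $tot_gauss data/train data/lang exp/mono_ali $dir

    # Build a decoding graph and decode so the settings can be compared.
    utils/mkgraph.sh data/lang_test $dir $dir/graph
    steps/decode.sh --nj 4 --cmd "$decode_cmd" \
      $dir/graph data/test $dir/decode
  done

With only ~20K words of training data, small values on the order of the 100/1000 mentioned above are plausible; the resulting scores in each exp/tri1_*/decode directory can then be compared to pick the best setting.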