From: Mailing l. u. f. U. C. a. U. <kal...@li...> - 2013-07-29 22:38:11
Thanks.

> It's unusual that the later stages of training are not better.
> Normally you get a substantial improvement.

I wonder if this is due to the very small amount of my training data. Is there a recommended recipe that I should follow for this type of data (20K words of training data, decoding 1-minute-long passages)? I tried to use swbd, but ended up going back to the settings that more closely matched Resource Management.

Nathan

On Jul 29, 2013, at 3:27 PM, Daniel Povey wrote:

>> 1 - I have a training set of around 5K words, though I could bring that up
>> to around 20K
>
> More language model training data will definitely help.
>
>> 2 - I am using kaldi_lm, though I could use SRILM ... not sure if it
>> would necessarily improve results
>
> Probably would make no difference -- more a usability issue.
>
>> 3 - I am decoding about 1 minute of text, though the training data is in 10
>> second epochs. I can mix some of the test data in if that would help.
>
> It's not considered good form to mix the test data in with training --
> this will give you unrealistically good results.
>
>> 4 - When I am training deltas I use a very small number of leaves / Gaussians
>> (100 / 1000) to get the best results. The best results are with tri1.
>> Further training yields worse results.
>
> It's unusual that the later stages of training are not better.
> Normally you get a substantial improvement.
>
> Dan
>
>> 5 - I use the same lexicon for training and decoding (though a more
>> restrictive language model for decoding).
>
>> Any help / thoughts are appreciated.
>>
>> Thanks,
>>
>> Nathan
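
For concreteness, a minimal sketch of the tri1 (delta) training step described in point 4, in the style of the Kaldi Resource Management recipe. The small leaf/Gaussian counts (100 / 1000) come from the thread; the data, lang, and alignment directory names, the job count, and the cmd.sh/path.sh setup are assumptions for illustration, not from the original message.

    #!/usr/bin/env bash
    # Illustrative sketch only: an RM-style tri1 step with deliberately
    # small model sizes for a ~20K-word training set. Directory names and
    # --nj are assumptions.
    . ./cmd.sh
    . ./path.sh

    # Align the training data with the monophone system first.
    steps/align_si.sh --nj 4 --cmd "$train_cmd" \
      data/train data/lang exp/mono exp/mono_ali

    # Train the first triphone (delta) system with 100 tree leaves and
    # 1000 total Gaussians, as described in point 4 above.
    steps/train_deltas.sh --cmd "$train_cmd" \
      100 1000 data/train data/lang exp/mono_ali exp/tri1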