From: Arif K. <ife...@gm...> - 2013-10-29 16:46:29
Thanks a lot Daniel. :)

Best regards,
Arif

On Tue, Oct 29, 2013 at 5:41 PM, Daniel Povey <dp...@gm...> wrote:
> There are no maximums, but generally it won't pay to make it more than
> 10,000 regardless of how much data you have.
> Dan
>
>
> On Tue, Oct 29, 2013 at 12:35 PM, Arif Khan <ife...@gm...> wrote:
> > Thank you very much Daniel for such a detailed answer.
> >
> > One more question in this regard:
> >
> > Do we have a maximum limit, especially for the No. of leaves, if we add
> > more corpus for training like WSJ than TIDIGITS or WSJ than Air traffic data?
> >
> > Best regards,
> > Arif
> >
> >
> > On Tue, Oct 29, 2013 at 5:06 PM, Daniel Povey <dp...@gm...> wrote:
> >>
> >> The number of leaves should decrease as you decrease the data, but
> >> less than proportionally (e.g. 1/10 the data -> 1/4 the number of
> >> leaves).
> >> The number of Gaussians per leaf, which is the ratio of tot-num-gauss
> >> to num-leaves, should decrease as you decrease the data, e.g. 1/10 the
> >> data -> maybe 1/2 the average number of Gaussians per leaf. The total
> >> number of Gaussians should decrease in a
> >> slightly-less-than-proportional way as you decrease the amount of
> >> data.
> >>
> >> Dan
> >>
> >>
> >> On Tue, Oct 29, 2013 at 12:04 PM, Arif Khan <ife...@gm...> wrote:
> >> > Thanks Daniel.
> >> >
> >> > Do we have any fixed rule, or is it just a matter of trying different
> >> > combinations and seeing the results?
> >> > Are the No. of leaves and No. of Gaussians dependent on each other, or
> >> > are they independent, meaning one can change either of them in any
> >> > proportion?
> >> >
> >> > Thank you very much.
> >> >
> >> > Best regards,
> >> > Arif
> >> >
> >> >
> >> > On Tue, Oct 29, 2013 at 4:39 PM, Daniel Povey <dp...@gm...> wrote:
> >> >>
> >> >> If you have only 1 hour of data, you will have to modify the arguments
> >> >> to the triphone-training script, i.e. the number of tree leaves and
> >> >> number of Gaussians should be reduced.
> >> >> Dan
> >> >>
> >> >>
> >> >> On Tue, Oct 29, 2013 at 11:37 AM, Arif Khan <ife...@gm...> wrote:
> >> >> > Thanks Daniel for your quick answer.
> >> >> >
> >> >> > I have another question. I trained a model using the WSJ S1 and S5
> >> >> > recipe, and got better results on the monophone system than on the
> >> >> > triphone one. I have about 1000 utterances, with length of ~1 hr,
> >> >> > and about 300 vocabulary size.
> >> >> >
> >> >> > In theory, the triphone system should perform better than the
> >> >> > monophone one. So I don't know if something is wrong in the tree
> >> >> > construction or if anything else could be fixed.
> >> >> >
> >> >> > Best regards,
> >> >> > Arif
> >> >> >
> >> >> >
> >> >> > On Tue, Oct 29, 2013 at 4:17 PM, Daniel Povey <dp...@gm...> wrote:
> >> >> >>
> >> >> >> Sorry, there is no code per se to get those kinds of stats, but you
> >> >> >> could perhaps convert the alignments into phone sequences and get
> >> >> >> the stats by hand (see ali-to-phones).
> >> >> >>
> >> >> >> Regarding the tree construction process, there is probably some
> >> >> >> documentation on kaldi.sf.net; if there is a particular aspect of
> >> >> >> that that is unclear, please let us know; but otherwise, I doubt
> >> >> >> anyone has time to respond to your question right now.
> >> >> >>
> >> >> >> Dan
> >> >> >>
> >> >> >>
> >> >> >> On Tue, Oct 29, 2013 at 10:42 AM, Arif Khan <ife...@gm...> wrote:
> >> >> >> > Dear Kaldi authors,
> >> >> >> >
> >> >> >> > I want to do some analysis of the training data. Basically I want
> >> >> >> > to find out the number of phones (monophone and triphone) that
> >> >> >> > appeared in the training set (relative frequency). Is there any
> >> >> >> > module/script available for it?
> >> >> >> >
> >> >> >> > Also, I want to find out about the tree construction process. I
> >> >> >> > know the basics from the wsj/s5 recipe, but some more details
> >> >> >> > would be helpful.
> >> >> >> >
> >> >> >> >
> >> >> >> > Best regards,
> >> >> >> > Arif
> >> >> >> >
> >> >> >> > _______________________________________________
> >> >> >> > Kaldi-developers mailing list
> >> >> >> > Kal...@li...
> >> >> >> > https://lists.sourceforge.net/lists/listinfo/kaldi-developers
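
Dan's rule of thumb in the thread (1/10 the data -> about 1/4 the leaves and about 1/2 the Gaussians per leaf, with no benefit beyond roughly 10,000 leaves) can be turned into a small calculator. The sketch below is only an illustration: it assumes the rule extends to other data ratios as a power law, which the thread does not state, and the function name and the reference configuration in the usage line are hypothetical, not values from any Kaldi recipe.

```python
import math

# Exponents chosen so that the two examples from the thread hold exactly.
# Extending them to arbitrary ratios as a power law is an assumption.
LEAF_EXPONENT = math.log(4) / math.log(10)  # 1/10 data -> 1/4 leaves
GPL_EXPONENT = math.log(2) / math.log(10)   # 1/10 data -> 1/2 Gaussians per leaf
MAX_LEAVES = 10000                          # "it won't pay to make it more than 10,000"


def scale_tree_params(ref_hours, ref_leaves, ref_gauss, new_hours):
    """Scale (num-leaves, tot-num-gauss) from a reference setup to new_hours of data.

    ref_hours, ref_leaves and ref_gauss describe a configuration that is
    already known to work; they are inputs, not Kaldi defaults.
    """
    ratio = new_hours / ref_hours
    leaves = round(ref_leaves * ratio ** LEAF_EXPONENT)
    leaves = min(MAX_LEAVES, max(1, leaves))
    gauss_per_leaf = (ref_gauss / ref_leaves) * ratio ** GPL_EXPONENT
    # Keep at least one Gaussian per leaf.
    tot_gauss = max(leaves, round(leaves * gauss_per_leaf))
    return leaves, tot_gauss


# Hypothetical reference: 10 hours trained with 2000 leaves and 10000 Gaussians,
# scaled down to 1 hour of data.
leaves, tot_gauss = scale_tree_params(10.0, 2000, 10000, 1.0)
print(leaves, tot_gauss)
```

Under these assumptions the total number of Gaussians scales with an exponent of about 0.9, which matches the "slightly-less-than-proportional" behaviour Dan describes. The results are starting points for tuning, not fixed values.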
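
For the phone statistics asked about in the first message, Dan's suggestion of converting alignments to phone sequences with ali-to-phones and counting "by hand" might look like the sketch below. It assumes the phone sequences have already been written out as text, one utterance per line in the form `utt-id phone phone ...` (for example via an `ark,t:` output specifier); the function names and the `<s>`/`</s>` padding symbols are hypothetical. Note that it counts triphone contexts as they occur in the phone sequence, not the tied states of the decision tree.

```python
from collections import Counter


def count_phones(lines):
    """Count monophones and (left, centre, right) triphone contexts.

    Each line is expected to look like "utt-id p1 p2 p3 ..." (assumed format).
    """
    mono = Counter()
    tri = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip empty lines and utterances without phones
        phones = fields[1:]
        mono.update(phones)
        # Pad with hypothetical boundary symbols so edge phones get a context.
        padded = ["<s>"] + phones + ["</s>"]
        tri.update(zip(padded, padded[1:], padded[2:]))
    return mono, tri


def relative_frequency(counter):
    """Convert raw counts into relative frequencies."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()} if total else {}


# Tiny made-up input in the assumed format (integer phone ids).
example = ["utt1 1 2 3", "utt2 2 3"]
mono, tri = count_phones(example)
print(relative_frequency(mono))
print(tri.most_common(3))
```

In practice the lines would come from a file or a pipe rather than a list, and the integer phone ids can be mapped back to symbols with the recipe's phone table.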