From: Daniel P. <dp...@gm...> - 2011-12-07 00:05:02
BTW, I just dug up some comments/notes I made for some others who wanted to do this. This is a diff against egs/wsj/s1/run.sh in kaldi-v1.0, but the general ideas should carry over to whatever version and data set you are using.

Dan

apps0:kaldi-v1.0: svn diff egs/wsj/s1/run.sh
Index: egs/wsj/s1/run.sh
===================================================================
--- egs/wsj/s1/run.sh (revision 1893)
+++ egs/wsj/s1/run.sh (working copy)
@@ -192,6 +192,20 @@
 steps/train_tri2a.sh || exit 1;

+# Command for Geoff + George:
+# note: tri2a uses 3500 utterances, about half the total.
+# tri3 (not run yet) uses them all.
+. path.sh
+ali-to-pdf exp/tri2a/final.mdl 'ark:gunzip -c exp/tri2a/cur?.ali.gz|' 'ark,t:|gzip -c > exp/tri2a/pdf_level_alignments.gz'
+
+# Command for Geoff + George to get high-dimensional
+# features on this same subset of data (example)
+scripts/filter_scp.pl exp/tri2a/train.scp data/train_wav.scp | \
+  ../../../src/featbin/compute-mfcc-feats --num-ceps=23 --verbose=2 \
+    --config=conf/mfcc.conf scp:- ark:- | \
+  ../../../src/featbin/add-deltas \
+    ark:- ark,t:- | head
+
 (scripts/mkgraph.sh data/G_tg_pruned.fst exp/tri2a/tree exp/tri2a/final.mdl exp/graph_tri2a_tg_pruned || exit 1;
  scripts/decode.sh exp/decode_tri2a_tgpr_eval92 exp/graph_tri2a_tg_pruned/HCLG.fst steps/decode_tri2a.sh data/eval_nov92.scp
  scripts/decode.sh exp/decode_tri2a_tgpr_eval93 exp/graph_tri2a_tg_pruned/HCLG.fst steps/decode_tri2a.sh data/eval_nov93.scp
@@ -217,6 +231,45 @@
  scripts/decode.sh exp/decode_tri3a_tgpr_uttdfmllr_eval92 exp/graph_tri3a_tg_pruned/HCLG.fst steps/decode_tri3a_diag_fmllr.sh data/eval_nov92.scp
 )&
+
+# Command for Geoff + George:
+. path.sh
+ali-to-pdf exp/tri3a/final.mdl 'ark:gunzip -c exp/tri3a/cur?.ali.gz|' 'ark,t:|gzip -c > exp/tri3a/pdf_level_alignments.gz'
+
+# Command for Geoff + George to get high-dimensional
+# features on this same subset of data (example)
+scripts/filter_scp.pl exp/tri3a/train.scp data/train_wav.scp | \
+  ../../../src/featbin/compute-mfcc-feats --num-ceps=23 --verbose=2 \
+    --config=conf/mfcc.conf scp:- ark:- | \
+  ../../../src/featbin/add-deltas \
+    ark:- ark,t:- | head
+
+# Command for George: decoding with scores obtained from a pipe.
+dir=exp/decode_tri3a_tgpr_eval92_pipe
+mkdir ${dir}
+scripts/split_scp.pl data/eval_nov92.scp ${dir}/{1,2,3,4,5,6,7,8}.scp
+. path.sh
+for n in 1 2 3 4 5 6 7 8; do
+  gmm-compute-likes exp/tri3a/final.mdl "ark:add-deltas --print-args=false scp:${dir}/$n.scp ark:- |" \
+    ark,t:- | \
+  decode-faster --beam=13.0 --max-active=7000 --acoustic-scale=0.0625 \
+    --word-symbol-table=data/words.txt exp/tri3a/final.mdl exp/graph_tri3a_tg_pruned/HCLG.fst \
+    ark:- ark,t:${dir}/$n.tra \
+    ark,t:${dir}/$n.ali 2> ${dir}/decode$n.log &
+done
+wait
+
+cat data/eval_nov92.txt | sed 's:<NOISE>::g' | sed 's:<SPOKEN_NOISE>::g' > $dir/test_trans.filt
+
+cat $dir/{1,2,3,4,5,6,7,8}.tra | \
+  scripts/int2sym.pl --ignore-first-field data/words.txt | \
+  sed 's:<s>::' | sed 's:</s>::' | sed 's:<UNK>::g' | \
+  compute-wer --text --mode=present ark:$dir/test_trans.filt ark,p:- >& $dir/wer
+# End command for George.
+
 # will delete:
 ## scripts/decode_queue_fmllr.sh exp/graph_tri3a_tg_pruned exp/tri3a/final.mdl exp/decode_tri3a_tg_pruned_fmllr &
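The "decoding with scores obtained from a pipe" recipe above works because decode-faster reads its acoustic scores through a text rspecifier (ark,t:-): a stream of per-utterance matrices, one row of scores per frame. Any external tool (Matlab, Python, etc.) can produce this stream. Below is a minimal sketch, assuming Kaldi's plain-text archive format for matrices ("utt-id  [", rows, closing "]"); write_text_archive is a hypothetical helper name and the utterance IDs and scores are made-up illustration data.

```python
import io
import sys

def write_text_archive(matrices, out):
    """Write {utterance-id: list of rows} as a Kaldi text matrix archive.

    Each matrix is printed as: <utt-id>  [ newline, one row per frame,
    with " ]" closing the last row -- the form decode-faster can read
    via an "ark,t:-" rspecifier (format assumed, not verified here).
    """
    for utt_id, rows in matrices.items():
        out.write(utt_id + "  [\n")
        for i, row in enumerate(rows):
            line = "  " + " ".join("%.6f" % v for v in row)
            # The last row closes the matrix with " ]".
            out.write(line + (" ]\n" if i == len(rows) - 1 else "\n"))

if __name__ == "__main__":
    # 2 frames x 2 pdf-ids of fake log-likelihood-like scores.
    scores = {"utt001": [[-1.5, -0.2], [-0.9, -2.1]]}
    write_text_archive(scores, sys.stdout)
```

In practice such a script would sit at the head of a pipe, e.g. `my_scores.py | decode-faster ...`, in place of gmm-compute-likes in the loop above.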
From: Troy L. <tro...@gm...> - 2011-12-06 08:05:11
Hi Dan,

Great! I have checked out that version. Thanks so much!

Regards,
Troy
From: Daniel P. <dp...@gm...> - 2011-12-06 07:45:04
That sounds correct. sandbox/karel is an alternative to trunk that you could use when checking it out: e.g. check out a different version of kaldi, replacing "trunk" with "sandbox/karel".

Dan
From: Troy L. <tro...@gm...> - 2011-12-06 07:42:26
Hi Karel,

Thanks so much for the detailed explanations. I'm currently using your TNet package for neural network training. It's the best tool I have ever used for training neural nets for speech recognition.

Since I don't have access to the directory sandbox/karel/ (which is not available in the trunk), from what I know so far the general steps for working with TNet and Kaldi should be as follows:
1) Align the training transcriptions with compile-train-graphs and gmm-align-compiled
2) Convert the alignments to pdf-id labels using ali-to-pdf
3) Train the neural net on the pdf-id labels using TNet
4) Decode the neural-net log-posteriors with decode-faster-mapped

Am I correct? Thanks!

Regards,
Troy
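Step 2 above, when ali-to-pdf is given a text wspecifier such as "ark,t:-", produces one line per utterance: the utterance-id followed by one integer pdf-id per frame. A small sketch of turning that output into per-frame training labels; parse_pdf_alignments is a hypothetical helper and the plain-text integer-vector archive format is an assumption based on Kaldi's text archives.

```python
def parse_pdf_alignments(lines):
    """Parse ali-to-pdf text output into {utt-id: [pdf-id per frame]}.

    Each input line is assumed to look like: "utt001 3 3 3 41 41 7",
    i.e. an utterance-id followed by one integer pdf-id per frame.
    """
    labels = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        utt_id = fields[0]
        labels[utt_id] = [int(f) for f in fields[1:]]
    return labels

if __name__ == "__main__":
    # Made-up example lines, as if read from the ali-to-pdf output.
    example = ["utt001 3 3 3 41 41 7", "utt002 0 0 12"]
    ali = parse_pdf_alignments(example)
    print(sorted(ali))  # utterance ids found
```

These per-frame pdf-id lists are exactly the targets a frame-level NN trainer (TNet in step 3) would need.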
From: Karel V. <ive...@fi...> - 2011-12-05 20:39:22
Hi Troy,

For an example of how neural-network output decoding works, see the script:
/kaldi/sandbox/karel/egs/rm/s2/steps/decode_nnet_tri2a_s3.sh

At the beginning a long feature-processing pipeline is built, which ends with nnet-forward producing log-posteriors (optionally divided by priors); these are then passed to decode-faster-mapped, which decodes a matrix of size "Nframes x Npdf" (decode-faster would decode a matrix of size "Nframes x Ntransition_id").

The trunk contains only a CPU implementation of neural-network training; the GPU version is in sandbox/karel/.

Karel
From: Daniel P. <dp...@gm...> - 2011-12-05 06:19:27
|
OK-- BTW, it should be best to get scores that are somehow comparable to HMM likelihoods-- e.g. divide by the prior of each state first, if you're using the log-posteriors of states. Dan On Sun, Dec 4, 2011 at 7:16 PM, Troy Lee <tro...@gm...> wrote: > Hi Guys, > > Thanks so much for all your suggestions! I would more prefer to pipe those > neural net scores to existing Kaldi decoders and I'm testing it. > > Regards, > Troy > > On Sun, Dec 4, 2011 at 5:38 AM, Daniel Povey <dp...@gm...> wrote: > >> BTW, the basic way I recommend to decode with neural nets is to use the >> neural net to produce scores for all clustered states (pdf-ids) [as a >> matrix for each utterance], and pipe these into "decode-faster". Probably >> the scripts I pointed to below use this approach. This method can be used >> for any type of neural net. Basically you can have a Matlab program print >> out, for each utterance, the utterance-id and then a matrix of scores, e.g. >> comparable to log-likelihoods, in Matlab format (one row per frame), and >> then pipe this into decode-faster. >> >> Dan >> >> >> On Sat, Dec 3, 2011 at 11:51 AM, Daniel Povey <dp...@gm...> wrote: >> >>> Also-- in sandbox/karel/egs/rm/s2, I think there are examples of how to >>> train and decode with neural nets. >>> This stuff has not been merged back into the trunk yet, AFAIK. >>> >>> Dan >>> >>> >>> On Sat, Dec 3, 2011 at 1:53 AM, Arnab Ghoshal <ar...@gm...> wrote: >>> >>>> Hi Troy, >>>> >>>> there is currently no support for decoding with NN, but that is pretty >>>> easy to add. The decoder works with a "decodable" interface that is >>>> defined in itf/decodable-itf.h. Any acoustic modeling class needs to >>>> provide its implementation of the DecodableInterface. You can see how >>>> the implementations are for few different acoustic models (regular >>>> diag GMMs, semi-cont models, and SGMMs) in the >>>> decoder/decodable-am-*.{h,cc} files. 
The main thing needed from the >>>> acoustic model is that it is able to provide a score (log likelihood) >>>> for a given feature vector and a state in the model. In practice, a >>>> decodable class in the decoder directory does not directly call the >>>> LogLikelihood function of the corresponding acoustic model class, but >>>> reimplements it to take advantage of caching. >>>> >>>> I am not sure if you can actually do acoustic modeling with the >>>> current neural network code in Kaldi. Karel, who wrote the the neural >>>> network code, can give you more details about the NN code. But if you >>>> have your favorite C++ implementation of, say, deep belief networks, >>>> that should be fairly straightforward to use with the kaldi decoder. >>>> >>>> -Arnab >>>> >>>> On Thu, Dec 1, 2011 at 6:54 AM, Troy Lee <tro...@gm...> >>>> wrote: >>>> > Hi, >>>> > >>>> > I'm new to the Kaldi package, and just saw there is a module in the >>>> source >>>> > code called "nnet", which probably deals with Neural Network (NN) >>>> stuff. I'm >>>> > thus wondering whether there is a direct support for decoding with >>>> > likelihoods generated by neural network acoustic models in the Kaldi >>>> > decoder? Otherwise, what would be the easiest way to do so? Thanks! >>>> > >>>> > Regards, >>>> > Troy >>>> > >>>> > >>>> ------------------------------------------------------------------------------ >>>> > All the data continuously generated in your IT infrastructure >>>> > contains a definitive record of customers, application performance, >>>> > security threats, fraudulent activity, and more. Splunk takes this >>>> > data and makes sense of it. IT sense. And common sense. >>>> > http://p.sf.net/sfu/splunk-novd2d >>>> > _______________________________________________ >>>> > Kaldi-developers mailing list >>>> > Kal...@li... 
>>>> > https://lists.sourceforge.net/lists/listinfo/kaldi-developers >>>> > >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> All the data continuously generated in your IT infrastructure >>>> contains a definitive record of customers, application performance, >>>> security threats, fraudulent activity, and more. Splunk takes this >>>> data and makes sense of it. IT sense. And common sense. >>>> http://p.sf.net/sfu/splunk-novd2d >>>> _______________________________________________ >>>> Kaldi-developers mailing list >>>> Kal...@li... >>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers >>>> >>> >>> >> > |
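[Editor's note] Dan's suggestion above, dividing the network's state posteriors by the state priors to get scores comparable to HMM likelihoods, can be sketched as follows. This is a minimal illustration, not Kaldi code; the function name and data layout are ours.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Turn one frame of NN log-posteriors log p(state | x) into pseudo
// log-likelihoods comparable to HMM scores: by Bayes' rule,
// log p(x | state) = log p(state | x) - log p(state) + log p(x),
// and the log p(x) term is a per-frame constant the decoder can ignore.
std::vector<double> PosteriorsToPseudoLoglikes(
    const std::vector<double> &log_posts,  // one entry per state
    const std::vector<double> &priors) {   // p(state), e.g. from alignment counts
  assert(log_posts.size() == priors.size());
  std::vector<double> out(log_posts.size());
  for (size_t s = 0; s < log_posts.size(); ++s)
    out[s] = log_posts[s] - std::log(priors[s]);
  return out;
}
```

In practice the priors are usually estimated from state occupation counts in the training alignments.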
From: Troy L. <tro...@gm...> - 2011-12-05 03:16:49
|
Hi Guys, Thanks so much for all your suggestions! I would more prefer to pipe those neural net scores to existing Kaldi decoders and I'm testing it. Regards, Troy On Sun, Dec 4, 2011 at 5:38 AM, Daniel Povey <dp...@gm...> wrote: > BTW, the basic way I recommend to decode with neural nets is to use the > neural net to produce scores for all clustered states (pdf-ids) [as a > matrix for each utterance], and pipe these into "decode-faster". Probably > the scripts I pointed to below use this approach. This method can be used > for any type of neural net. Basically you can have a Matlab program print > out, for each utterance, the utterance-id and then a matrix of scores, e.g. > comparable to log-likelihoods, in Matlab format (one row per frame), and > then pipe this into decode-faster. > > Dan > > > On Sat, Dec 3, 2011 at 11:51 AM, Daniel Povey <dp...@gm...> wrote: > >> Also-- in sandbox/karel/egs/rm/s2, I think there are examples of how to >> train and decode with neural nets. >> This stuff has not been merged back into the trunk yet, AFAIK. >> >> Dan >> >> >> On Sat, Dec 3, 2011 at 1:53 AM, Arnab Ghoshal <ar...@gm...> wrote: >> >>> Hi Troy, >>> >>> there is currently no support for decoding with NN, but that is pretty >>> easy to add. The decoder works with a "decodable" interface that is >>> defined in itf/decodable-itf.h. Any acoustic modeling class needs to >>> provide its implementation of the DecodableInterface. You can see how >>> the implementations are for few different acoustic models (regular >>> diag GMMs, semi-cont models, and SGMMs) in the >>> decoder/decodable-am-*.{h,cc} files. The main thing needed from the >>> acoustic model is that it is able to provide a score (log likelihood) >>> for a given feature vector and a state in the model. In practice, a >>> decodable class in the decoder directory does not directly call the >>> LogLikelihood function of the corresponding acoustic model class, but >>> reimplements it to take advantage of caching. 
>>> >>> I am not sure if you can actually do acoustic modeling with the >>> current neural network code in Kaldi. Karel, who wrote the the neural >>> network code, can give you more details about the NN code. But if you >>> have your favorite C++ implementation of, say, deep belief networks, >>> that should be fairly straightforward to use with the kaldi decoder. >>> >>> -Arnab >>> >>> On Thu, Dec 1, 2011 at 6:54 AM, Troy Lee <tro...@gm...> wrote: >>> > Hi, >>> > >>> > I'm new to the Kaldi package, and just saw there is a module in the >>> source >>> > code called "nnet", which probably deals with Neural Network (NN) >>> stuff. I'm >>> > thus wondering whether there is a direct support for decoding with >>> > likelihoods generated by neural network acoustic models in the Kaldi >>> > decoder? Otherwise, what would be the easiest way to do so? Thanks! >>> > >>> > Regards, >>> > Troy >>> > >>> > >>> ------------------------------------------------------------------------------ >>> > All the data continuously generated in your IT infrastructure >>> > contains a definitive record of customers, application performance, >>> > security threats, fraudulent activity, and more. Splunk takes this >>> > data and makes sense of it. IT sense. And common sense. >>> > http://p.sf.net/sfu/splunk-novd2d >>> > _______________________________________________ >>> > Kaldi-developers mailing list >>> > Kal...@li... >>> > https://lists.sourceforge.net/lists/listinfo/kaldi-developers >>> > >>> >>> >>> ------------------------------------------------------------------------------ >>> All the data continuously generated in your IT infrastructure >>> contains a definitive record of customers, application performance, >>> security threats, fraudulent activity, and more. Splunk takes this >>> data and makes sense of it. IT sense. And common sense. >>> http://p.sf.net/sfu/splunk-novd2d >>> _______________________________________________ >>> Kaldi-developers mailing list >>> Kal...@li... 
>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers >>> >> >> > |
From: Daniel P. <dp...@gm...> - 2011-12-03 21:38:48
|
BTW, the basic way I recommend to decode with neural nets is to use the neural net to produce scores for all clustered states (pdf-ids) [as a matrix for each utterance], and pipe these into "decode-faster". Probably the scripts I pointed to below use this approach. This method can be used for any type of neural net. Basically you can have a Matlab program print out, for each utterance, the utterance-id and then a matrix of scores, e.g. comparable to log-likelihoods, in Matlab format (one row per frame), and then pipe this into decode-faster. Dan On Sat, Dec 3, 2011 at 11:51 AM, Daniel Povey <dp...@gm...> wrote: > Also-- in sandbox/karel/egs/rm/s2, I think there are examples of how to > train and decode with neural nets. > This stuff has not been merged back into the trunk yet, AFAIK. > > Dan > > > On Sat, Dec 3, 2011 at 1:53 AM, Arnab Ghoshal <ar...@gm...> wrote: > >> Hi Troy, >> >> there is currently no support for decoding with NN, but that is pretty >> easy to add. The decoder works with a "decodable" interface that is >> defined in itf/decodable-itf.h. Any acoustic modeling class needs to >> provide its implementation of the DecodableInterface. You can see how >> the implementations are for few different acoustic models (regular >> diag GMMs, semi-cont models, and SGMMs) in the >> decoder/decodable-am-*.{h,cc} files. The main thing needed from the >> acoustic model is that it is able to provide a score (log likelihood) >> for a given feature vector and a state in the model. In practice, a >> decodable class in the decoder directory does not directly call the >> LogLikelihood function of the corresponding acoustic model class, but >> reimplements it to take advantage of caching. >> >> I am not sure if you can actually do acoustic modeling with the >> current neural network code in Kaldi. Karel, who wrote the the neural >> network code, can give you more details about the NN code. 
But if you >> have your favorite C++ implementation of, say, deep belief networks, >> that should be fairly straightforward to use with the kaldi decoder. >> >> -Arnab >> >> On Thu, Dec 1, 2011 at 6:54 AM, Troy Lee <tro...@gm...> wrote: >> > Hi, >> > >> > I'm new to the Kaldi package, and just saw there is a module in the >> source >> > code called "nnet", which probably deals with Neural Network (NN) >> stuff. I'm >> > thus wondering whether there is a direct support for decoding with >> > likelihoods generated by neural network acoustic models in the Kaldi >> > decoder? Otherwise, what would be the easiest way to do so? Thanks! >> > >> > Regards, >> > Troy >> > >> > >> ------------------------------------------------------------------------------ >> > All the data continuously generated in your IT infrastructure >> > contains a definitive record of customers, application performance, >> > security threats, fraudulent activity, and more. Splunk takes this >> > data and makes sense of it. IT sense. And common sense. >> > http://p.sf.net/sfu/splunk-novd2d >> > _______________________________________________ >> > Kaldi-developers mailing list >> > Kal...@li... >> > https://lists.sourceforge.net/lists/listinfo/kaldi-developers >> > >> >> >> ------------------------------------------------------------------------------ >> All the data continuously generated in your IT infrastructure >> contains a definitive record of customers, application performance, >> security threats, fraudulent activity, and more. Splunk takes this >> data and makes sense of it. IT sense. And common sense. >> http://p.sf.net/sfu/splunk-novd2d >> _______________________________________________ >> Kaldi-developers mailing list >> Kal...@li... >> https://lists.sourceforge.net/lists/listinfo/kaldi-developers >> > > |
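[Editor's note] The "print a matrix of scores per utterance and pipe it into decode-faster" idea Dan describes could be sketched like this. The text layout (utterance id followed by rows between brackets) is our assumption about Kaldi's text-mode matrix archives and should be checked against your Kaldi version.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Write one utterance's score matrix (one row per frame, one column per
// clustered state / pdf-id) as a text archive entry: the utterance id,
// then the rows between "[" and "]". Piping such entries into
// decode-faster is the approach outlined above; the exact layout here
// is an assumption to verify against your Kaldi version.
void WriteScoreMatrix(std::ostream &os, const std::string &utt_id,
                      const std::vector<std::vector<double> > &scores) {
  os << utt_id << " [";
  for (size_t r = 0; r < scores.size(); ++r) {
    os << "\n ";
    for (size_t c = 0; c < scores[r].size(); ++c)
      os << " " << scores[r][c];
  }
  os << " ]\n";
}
```

As Dan notes, a Matlab script printing the same layout works just as well; the key is one row per frame and one score per pdf-id.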
From: Daniel P. <dp...@gm...> - 2011-12-03 19:51:13
|
Also-- in sandbox/karel/egs/rm/s2, I think there are examples of how to train and decode with neural nets. This stuff has not been merged back into the trunk yet, AFAIK. Dan On Sat, Dec 3, 2011 at 1:53 AM, Arnab Ghoshal <ar...@gm...> wrote: > Hi Troy, > > there is currently no support for decoding with NN, but that is pretty > easy to add. The decoder works with a "decodable" interface that is > defined in itf/decodable-itf.h. Any acoustic modeling class needs to > provide its implementation of the DecodableInterface. You can see how > the implementations are for few different acoustic models (regular > diag GMMs, semi-cont models, and SGMMs) in the > decoder/decodable-am-*.{h,cc} files. The main thing needed from the > acoustic model is that it is able to provide a score (log likelihood) > for a given feature vector and a state in the model. In practice, a > decodable class in the decoder directory does not directly call the > LogLikelihood function of the corresponding acoustic model class, but > reimplements it to take advantage of caching. > > I am not sure if you can actually do acoustic modeling with the > current neural network code in Kaldi. Karel, who wrote the the neural > network code, can give you more details about the NN code. But if you > have your favorite C++ implementation of, say, deep belief networks, > that should be fairly straightforward to use with the kaldi decoder. > > -Arnab > > On Thu, Dec 1, 2011 at 6:54 AM, Troy Lee <tro...@gm...> wrote: > > Hi, > > > > I'm new to the Kaldi package, and just saw there is a module in the > source > > code called "nnet", which probably deals with Neural Network (NN) stuff. > I'm > > thus wondering whether there is a direct support for decoding with > > likelihoods generated by neural network acoustic models in the Kaldi > > decoder? Otherwise, what would be the easiest way to do so? Thanks! 
> > > > Regards, > > Troy > > > > > ------------------------------------------------------------------------------ > > All the data continuously generated in your IT infrastructure > > contains a definitive record of customers, application performance, > > security threats, fraudulent activity, and more. Splunk takes this > > data and makes sense of it. IT sense. And common sense. > > http://p.sf.net/sfu/splunk-novd2d > > _______________________________________________ > > Kaldi-developers mailing list > > Kal...@li... > > https://lists.sourceforge.net/lists/listinfo/kaldi-developers > > > > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure > contains a definitive record of customers, application performance, > security threats, fraudulent activity, and more. Splunk takes this > data and makes sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-novd2d > _______________________________________________ > Kaldi-developers mailing list > Kal...@li... > https://lists.sourceforge.net/lists/listinfo/kaldi-developers > |
From: Arnab G. <ar...@gm...> - 2011-12-03 09:53:51
|
Hi Troy, there is currently no support for decoding with NN, but that is pretty easy to add. The decoder works with a "decodable" interface that is defined in itf/decodable-itf.h. Any acoustic modeling class needs to provide its implementation of the DecodableInterface. You can see how the implementations are for a few different acoustic models (regular diag GMMs, semi-cont models, and SGMMs) in the decoder/decodable-am-*.{h,cc} files. The main thing needed from the acoustic model is that it is able to provide a score (log likelihood) for a given feature vector and a state in the model. In practice, a decodable class in the decoder directory does not directly call the LogLikelihood function of the corresponding acoustic model class, but reimplements it to take advantage of caching. I am not sure if you can actually do acoustic modeling with the current neural network code in Kaldi. Karel, who wrote the neural network code, can give you more details about the NN code. But if you have your favorite C++ implementation of, say, deep belief networks, that should be fairly straightforward to use with the Kaldi decoder. -Arnab On Thu, Dec 1, 2011 at 6:54 AM, Troy Lee <tro...@gm...> wrote: > Hi, > > I'm new to the Kaldi package, and just saw there is a module in the source > code called "nnet", which probably deals with Neural Network (NN) stuff. I'm > thus wondering whether there is a direct support for decoding with > likelihoods generated by neural network acoustic models in the Kaldi > decoder? Otherwise, what would be the easiest way to do so? Thanks! > > Regards, > Troy > > ------------------------------------------------------------------------------ > All the data continuously generated in your IT infrastructure > contains a definitive record of customers, application performance, > security threats, fraudulent activity, and more. Splunk takes this > data and makes sense of it. IT sense. And common sense. 
> http://p.sf.net/sfu/splunk-novd2d > _______________________________________________ > Kaldi-developers mailing list > Kal...@li... > https://lists.sourceforge.net/lists/listinfo/kaldi-developers > |
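[Editor's note] A minimal sketch of the kind of decodable wrapper Arnab describes. The class itself is hypothetical; the method names mirror the interface as described in the thread, and the exact signatures should be checked against itf/decodable-itf.h in your checkout.

```cpp
#include <cassert>
#include <vector>

// Hypothetical decodable object serving precomputed NN scores to the
// decoder, in the spirit of the DecodableInterface described above.
class DecodableNnetScores {
 public:
  // scores: one row per frame, one column per state index; assumed to be
  // log-scaled and already divided by the state priors.
  explicit DecodableNnetScores(const std::vector<std::vector<float> > &scores)
      : scores_(scores) {}

  float LogLikelihood(int frame, int state_index) const {
    return scores_[frame][state_index];
  }
  bool IsLastFrame(int frame) const {
    return frame == static_cast<int>(scores_.size()) - 1;
  }
  int NumIndices() const {
    return scores_.empty() ? 0 : static_cast<int>(scores_[0].size());
  }

 private:
  std::vector<std::vector<float> > scores_;  // everything precomputed here
};
```

Because all scores are precomputed, the caching Arnab mentions is trivial in this sketch; a real implementation would call the acoustic model's LogLikelihood lazily and memoize results per frame.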
From: Troy L. <tro...@gm...> - 2011-12-01 06:55:11
|
Hi, I'm new to the Kaldi package, and just saw there is a module in the source code called "nnet", which probably deals with Neural Network (NN) stuff. I'm thus wondering whether there is a direct support for decoding with likelihoods generated by neural network acoustic models in the Kaldi decoder? Otherwise, what would be the easiest way to do so? Thanks! Regards, Troy |
From: Joey W. <joe...@gm...> - 2011-11-02 06:46:08
|
Hi, there, Firstly thanks very much for the great work of Kaldi. I have gone through the WSJ recipe using the WSJCAM0 corpus which is British accent version of WSJ. Everything went well except I had a problem of the rescoring lattices generated by the bigram with trigram, *scripts/decode.sh exp/decode_tri2a_bg_${set} exp/graph_tri2a_bg/HCLG.fst steps/decode_tri2a.sh data/si_${set}.scp* *scripts/decode.sh exp/decode_tri2a_bg_latgen_${set} exp/graph_tri2a_bg/HCLG.fst steps/decode_tri2a_latgen.sh data/si_${set}.scp* The above two steps have no problem for the bigram decoding and lattice generation. However, after lattice generation, the rescoring script: *scripts/latrescore.sh exp/decode_tri2a_bg_latgen_${set} data/G_bg.fst data/G_tg.fst data/si_${set}.kaldi exp/decode_tri2a_bg_rescore_tg_${set}* has no output. Examining the log file *remove_old_lm.log*, I have *lattice-lmrescore --lm-scale=-1.0 'ark:gunzip -c exp/decode_tri2a_bg_latgen_dt5a/*.lats.gz|' 'fstproject --project_output=true data/G_bg.fst |' ark:-* *WARNING (lattice-lmrescore:CheckMemoryUsage():fstext/determinize-lattice-inl.h:449) Failure in determinize-lattice: size exceeds maximum -1 bytes; (repo,arcs,elems) = (3040,32,72), after rebuilding, repo size was 3040* *WARNING (lattice-lmrescore:main():lattice-lmrescore.cc:131) Empty lattice for key c31c0201 (incompatible LM?)* * * The log file seems to tell that the generated lattices are too large. In fact, in the original steps/decode_tri3a_latgen.sh script, --max_arcs option is used in gmm-latgen-simple which does not have this option, therefore, I removed this option and did the lattice generation. Do you think this causes the above described problem? What about the incompatible LM? Thanks in advance. -- Best regards, Wang Guangsen ******************************* Wang Guangsen, Joe School of Computing National University of Singapore email: joe...@gm... |
From: Arnab G. <ar...@gm...> - 2011-10-06 21:29:14
|
Implementing MCE sounds like a good idea. On Thu, Oct 6, 2011 at 7:53 PM, Chao Weng <cw...@gm...> wrote: > Hi Arnab, > > Thanks for your information and suggestions. I really appreciate it. > > Currently I already finished running the receipts for RM and WSJ. And > now I'm diving into the codes and trying to understanding some > essential implementation. As I mentioned earlier, if no one is working > on MCE part, I could fill this hole. Or if you have some work want me > to do, please feel free to let me know. I will try my best to help. > > Bests, > Chao > > On Thu, Oct 6, 2011 at 12:53 PM, Arnab Ghoshal <ar...@gm...> wrote: >> On Thu, Sep 29, 2011 at 8:11 PM, Chao Weng <cw...@gm...> wrote: >>> >>> Now I'm looking into the code of Kaldi, and trying to figure out the >>> discriminative training extension for Kaldi. Is there any possibility >>> I can cooperate with the team and make some contributions. >> >> Hi Chao, it will be nice to have you collaborate on the discriminative >> training code. Maybe you already mentioned this in earlier email, >> which I missed, but can you explain what particular things you plan to >> do. If you don't have very clear plans that is OK as well. Our goal is >> to implement most state of the art techniques, and so pretty much what >> you do will fit in the general plan. >> >> The discriminative training code in sandbox/discrim is right now >> fairly messy and it will be cleared up in the next few weeks. >> Currently there is code for running FB on lattices, some bits of MPE >> code, EBW estimation code, and code for fMPE/fMMI type discriminative >> features. >> >> I think the best place to start will be to familiarize yourself with >> the toolkit. You can go through the tutorial >> http://kaldi.sourceforge.net/tutorial.html and run the recipes under >> trunk/egs. >> >> Let us know if you have any questions. >> >> Best, -Arnab >> > |
From: Chao W. <cw...@gm...> - 2011-10-06 17:54:04
|
Hi Arnab, Thanks for your information and suggestions. I really appreciate it. Currently I already finished running the receipts for RM and WSJ. And now I'm diving into the codes and trying to understanding some essential implementation. As I mentioned earlier, if no one is working on MCE part, I could fill this hole. Or if you have some work want me to do, please feel free to let me know. I will try my best to help. Bests, Chao On Thu, Oct 6, 2011 at 12:53 PM, Arnab Ghoshal <ar...@gm...> wrote: > On Thu, Sep 29, 2011 at 8:11 PM, Chao Weng <cw...@gm...> wrote: >> >> Now I'm looking into the code of Kaldi, and trying to figure out the >> discriminative training extension for Kaldi. Is there any possibility >> I can cooperate with the team and make some contributions. > > Hi Chao, it will be nice to have you collaborate on the discriminative > training code. Maybe you already mentioned this in earlier email, > which I missed, but can you explain what particular things you plan to > do. If you don't have very clear plans that is OK as well. Our goal is > to implement most state of the art techniques, and so pretty much what > you do will fit in the general plan. > > The discriminative training code in sandbox/discrim is right now > fairly messy and it will be cleared up in the next few weeks. > Currently there is code for running FB on lattices, some bits of MPE > code, EBW estimation code, and code for fMPE/fMMI type discriminative > features. > > I think the best place to start will be to familiarize yourself with > the toolkit. You can go through the tutorial > http://kaldi.sourceforge.net/tutorial.html and run the recipes under > trunk/egs. > > Let us know if you have any questions. > > Best, -Arnab > |
From: Arnab G. <ar...@gm...> - 2011-10-06 16:54:07
|
On Thu, Sep 29, 2011 at 8:11 PM, Chao Weng <cw...@gm...> wrote: > > Now I'm looking into the code of Kaldi, and trying to figure out the > discriminative training extension for Kaldi. Is there any possibility > I can cooperate with the team and make some contributions. Hi Chao, it will be nice to have you collaborate on the discriminative training code. Maybe you already mentioned this in earlier email, which I missed, but can you explain what particular things you plan to do. If you don't have very clear plans that is OK as well. Our goal is to implement most state of the art techniques, and so pretty much what you do will fit in the general plan. The discriminative training code in sandbox/discrim is right now fairly messy and it will be cleared up in the next few weeks. Currently there is code for running FB on lattices, some bits of MPE code, EBW estimation code, and code for fMPE/fMMI type discriminative features. I think the best place to start will be to familiarize yourself with the toolkit. You can go through the tutorial http://kaldi.sourceforge.net/tutorial.html and run the recipes under trunk/egs. Let us know if you have any questions. Best, -Arnab |
From: Daniel P. <dp...@gm...> - 2011-10-02 18:21:07
|
Hi, I see that this mail was sent a few days ago. I also sent an invite for you to the but10 list, which gets most of the traffic and sent an email to introduce you to the list [but you may not have got it if you did not accept the link immediately]. Let me know if you have any problems with the recipe. I myself have not gone deeply into the discriminative training code (i.e. further than MMI)-- this is on my to-do list but if you can sort everything out and get it working this would be a great help-- assuming, of course, that what you do is in line with the way we generally do things [Arnab or I could check]. Dan On Thu, Sep 29, 2011 at 11:11 AM, Chao Weng <cw...@gm...> wrote: > To whom it may concern, > > Please add me to the Kaldi's mail list, thanks. > > Now I'm looking into the code of Kaldi, and trying to figure out the > discriminative training extension for Kaldi. Is there any possibility > I can cooperate with the team and make some contributions. > > Bests, > Chao > > > ------------------------------------------------------------------------------ > All of the data generated in your IT infrastructure is seriously valuable. > Why? It contains a definitive record of application performance, security > threats, fraudulent activity, and more. Splunk takes this data and makes > sense of it. IT sense. And common sense. > http://p.sf.net/sfu/splunk-d2dcopy2 > _______________________________________________ > Kaldi-developers mailing list > Kal...@li... > https://lists.sourceforge.net/lists/listinfo/kaldi-developers > |
From: Chao W. <cw...@gm...> - 2011-09-29 18:11:37
|
To whom it may concern, Please add me to the Kaldi's mail list, thanks. Now I'm looking into the code of Kaldi, and trying to figure out the discriminative training extension for Kaldi. Is there any possibility I can cooperate with the team and make some contributions. Bests, Chao |
From: Karel V. <ve...@gm...> - 2011-06-09 11:45:46
|
Karel Vesely On 06/09/11 12:17, Mirko Hannemann wrote: > Mirko Hannemann > > ------------------------------------------------------------------------------ > EditLive Enterprise is the world's most technically advanced content > authoring tool. Experience the power of Track Changes, Inline Image > Editing and ensure content is compliant with Accessibility Checking. > http://p.sf.net/sfu/ephox-dev2dev > _______________________________________________ > Kaldi-developers mailing list > Kal...@li... > https://lists.sourceforge.net/lists/listinfo/kaldi-developers |
From: Mirko H. <mir...@go...> - 2011-06-09 10:17:32
|
Mirko Hannemann |
From: Daniel P. <dp...@gm...> - 2011-06-08 22:48:21
|
Replying-all with another test message. [ if anyone on this list (currently the Kaldi admins) doesn't want to be on it, ask Nagendra or work out how to remove yourself. ] Dan On Wed, Jun 8, 2011 at 3:16 PM, Nagendra Kumar Goel < nag...@go...> wrote: > > -- > Nagendra Kumar Goel |
From: Nagendra K. G. <nag...@go...> - 2011-06-08 22:41:55
|
-- Nagendra Kumar Goel |