From: Daniel P. <dp...@gm...> - 2012-09-02 15:46:43
|
BTW, the result, after you fix the code, will be a vector of all zeros, ones and twos -- is that what you want?

You can use the program ali-to-pdfs (or something like that) to get "pdf-ids", which describe the context-dependent state. These are generally suitable labels for training neural nets.

Dan |
From: Daniel P. <dp...@gm...> - 2012-09-02 15:44:31
|
I think it's just a bug in the code. You are doing

  int32 tid = split[i][0];

which means you get the first HMM-state in the alignment of each phone, which will always be zero. Instead you want to iterate over split[i] and get tid = split[i][j].

Dan |
From: xinglong g. <gao...@gm...> - 2012-09-01 14:20:33
|
AA aa
AE ae
AH ah
AO ao
AW aw
AX ax
AX-H ax-h
AXR axr
AY ay
B b
BCL bcl
CH ch
D d
DCL dcl
DH dh
DX dx
EH eh
EL el
EM em
EN en
ENG eng
EPI epi
ER er
EY ey
F f
G g
GCL gcl
HH hh
HV hv
IH ih
IX ix
IY iy
JH jh
K k
KCL kcl
L l
M m
N n
NG ng
NX nx
OW ow
OY oy
P p
PAU pau
PCL pcl
Q q
R r
S s
SH sh
T t
TCL tcl
TH th
UH uh
UW uw
UX ux
V v
W w
Y y
Z z
ZH zh |
From: Raymond B. <ray...@we...> - 2012-08-30 14:41:58
|
I'm trying to run a forced alignment of the TIMIT data (egs/timit/s3) on a triphone model I have trained on it (along the lines of egs/rm/s3), and I want to convert the transition ids to the HMM states to be written to disk. I need the triphone states as target values for a neural network.

For this purpose I have copied the file ali-to-phones.cc and adapted it to write out what I think should be the HMM states.
Excerpt:
"...
    for (; !reader.Done(); reader.Next()) {
      std::string key = reader.Key();
      const std::vector<int32> &alignment = reader.Value();

      std::vector<std::vector<int32> > split;
      SplitToPhones(trans_model, alignment, &split);

      if (!write_lengths) {
        std::vector<int32> states;
        for (size_t i = 0; i < split.size(); i++) {
          KALDI_ASSERT(split[i].size() > 0);
          int32 tid = split[i][0];
          int32 hmmstate = trans_model.TransitionIdToHmmState(tid);  // <-- !!
          int32 num_repeats = split[i].size();
          KALDI_ASSERT(num_repeats!=0);
          if (per_frame)
            for(int32 j = 0; j < num_repeats; j++)
              states.push_back(hmmstate);
          else
            states.push_back(hmmstate);
        }
        states_writer.Write(key, states);
      } else {
        ... // omitted, as not needed
      }
...
"

The outcome, however, is always a vector of zeros for all utterances! This is also true if I try to convert to PdfClass instead.

What am I missing here?

Thanks in advance!

Ray |
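The excerpt reads only split[i][0] for each phone, and the replies in this thread suggest iterating over split[i][j] instead. A minimal, self-contained sketch of that loop is below. It does not link against Kaldi: `ToyTransitionIdToHmmState` is a made-up stand-in for `trans_model.TransitionIdToHmmState()`, the function name `PerFrameHmmStates` is hypothetical, and only the per-frame case is covered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for trans_model.TransitionIdToHmmState(): in this toy
// model, transition-ids 1..3 map to HMM-states 0..2 of one phone and
// transition-ids 4..6 to HMM-states 0..2 of another.
int32_t ToyTransitionIdToHmmState(int32_t tid) { return (tid - 1) % 3; }

// `split` holds one vector of transition-ids per phone, in the shape that
// SplitToPhones() fills in the excerpt. Returns one HMM-state per frame.
std::vector<int32_t> PerFrameHmmStates(
    const std::vector<std::vector<int32_t> > &split) {
  std::vector<int32_t> states;
  for (std::size_t i = 0; i < split.size(); i++) {
    for (std::size_t j = 0; j < split[i].size(); j++) {
      int32_t tid = split[i][j];  // the excerpt always read split[i][0]
      states.push_back(ToyTransitionIdToHmmState(tid));
    }
  }
  return states;
}
```

With split = {{1, 1, 2, 3}, {4, 5, 5, 6}} this gives 0 0 1 2 0 1 1 2, whereas reading only split[i][0] gives 0 for every frame in this toy model.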
From: Daniel P. <dp...@gm...> - 2012-08-26 00:00:12
|
I think it must be something outside that code snippet that's wrong. I would recommend either running it in gdb or under valgrind, or pasting more code.

Dan |
From: Xavier A. <xan...@gm...> - 2012-08-25 23:54:22
|
Hi,
I have been trying to debug this issue for a few hours now, but I cannot find what I am doing wrong.
In my program I am calling the following:

  DiagGmm tmpgmm;
  tmpgmm.Resize(1, dim);

This is exactly like in gmm-init-model-flat.cc, except that I intend to just initialize the model afterwards, nothing else.
I am constantly getting a segmentation fault when the weights_ vector is resized: the program finds it not initialized to NULL and tries to free() it before resizing it. I do not understand why weights_ would not be initialized to NULL, as that seems to be done in the constructor.

Any help will be appreciated.

Xavier Anguera |
From: Daniel P. <dp...@gm...> - 2012-08-20 09:26:09
|
I think splitting the highest-count Gaussians should be better in general, because it helps keep the counts relatively constant across the Gaussians, which should be good for speaker-ID.

Dan |
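To make the two strategies discussed here concrete, the sketch below contrasts splitting only the highest-weight Gaussian with splitting every Gaussian once. It is a one-dimensional toy under stated assumptions, not Kaldi's implementation: the `Gauss1D` struct, both function names, and the deterministic offset of `perturb` standard deviations are all made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy 1-D mixture component (illustrative only).
struct Gauss1D {
  double weight;
  double mean;
  double var;
};

// Splits the component with the largest weight into two. Each half keeps the
// variance, gets half the weight, and has its mean moved to
// mean -/+ perturb * stddev.
void SplitHighestWeight(std::vector<Gauss1D> *gmm, double perturb) {
  if (gmm->empty()) return;
  std::size_t best = 0;
  for (std::size_t i = 1; i < gmm->size(); i++)
    if ((*gmm)[i].weight > (*gmm)[best].weight) best = i;
  const double offset = perturb * std::sqrt((*gmm)[best].var);
  (*gmm)[best].weight *= 0.5;
  Gauss1D copy = (*gmm)[best];  // copied before the two means diverge
  (*gmm)[best].mean -= offset;
  copy.mean += offset;
  gmm->push_back(copy);
}

// Splits every component once, regardless of weight, doubling the mixture.
void SplitAll(std::vector<Gauss1D> *gmm, double perturb) {
  const std::size_t n = gmm->size();
  for (std::size_t i = 0; i < n; i++) {
    const double offset = perturb * std::sqrt((*gmm)[i].var);
    (*gmm)[i].weight *= 0.5;
    Gauss1D copy = (*gmm)[i];
    (*gmm)[i].mean -= offset;
    copy.mean += offset;
    gmm->push_back(copy);
  }
}
```

Repeatedly calling `SplitHighestWeight` between re-estimation passes tends to even out the weights, which is the effect argued for above; `SplitAll` keeps the relative weights of the original components unchanged.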
From: Xavier A. <xan...@gm...> - 2012-08-20 09:21:54
|
Hi,
Thanks for your answers.
I need to take a closer look at sgmmbin/init-ubm-flat.cc. I did look at the other files you propose and did not see exactly what I meant. I see that gmmbin/gmm-init-model-flat.cc is able to compute a single-Gaussian GMM given all the data, and that gmmbin/gmm-global-acc-stats and gmmbin/gmm-global-est can retrain the model (i.e. EM re-estimation), but I believe I saw that those only allow me to grow the model by splitting the N Gaussians with the highest weight. This is not the same as splitting all Gaussians uniformly (regardless of weights) or performing K-means plus splitting of all Gaussians.
I will try using the programs you propose, and if the results are not good enough I will try to implement the splitting myself (after looking at it more, I see that it should not be too difficult anyway).

Thanks

Xavier Anguera |
From: Arnab G. <ar...@gm...> - 2012-08-20 08:53:47
|
Or you could create something like sgmmbin/init-ubm-flat.cc with the option for creating both diag and full GMMs (you can look at gmmbin/gmm-init-model-flat.cc for how an HMM/GMM is initialized). You can then train it with gmmbin/gmm-global-acc-stats and gmmbin/gmm-global-est in the usual fashion. |
From: Daniel P. <dp...@gm...> - 2012-08-20 08:47:28
|
That's interesting.

In the past I've done this type of thing by clustering Gaussians of a speech-reco system, but if you want to start from scratch, you could write a program called gmm-global-init that would initialize a model, say with a single Gaussian, and train it, maybe on a small amount of data at first, and keep mixing up. The program gmm-global-est has an option to mix up the #Gaussians.

Dan |
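The "initialize with a single Gaussian" step can be sketched as fitting one diagonal-covariance Gaussian to all the training frames. The code below is only an illustration under assumptions: it uses plain std::vector rather than Kaldi's matrix classes, and the names `DiagGauss`, `FitSingleGaussian` and `var_floor` are hypothetical, not part of any existing program.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy diagonal-covariance Gaussian (illustrative only).
struct DiagGauss {
  std::vector<double> mean;
  std::vector<double> var;
};

// Fits one diagonal Gaussian to all frames: per-dimension mean and variance
// (E[x^2] - mean^2), with each variance floored at `var_floor`.
// Assumes every frame has the same dimension; returns an empty model if
// there are no frames.
DiagGauss FitSingleGaussian(const std::vector<std::vector<double> > &frames,
                            double var_floor) {
  DiagGauss g;
  if (frames.empty()) return g;
  const std::size_t dim = frames[0].size();
  g.mean.assign(dim, 0.0);
  g.var.assign(dim, 0.0);
  for (std::size_t t = 0; t < frames.size(); t++) {
    for (std::size_t d = 0; d < dim; d++) {
      g.mean[d] += frames[t][d];                 // accumulate sum of x
      g.var[d] += frames[t][d] * frames[t][d];   // accumulate sum of x^2
    }
  }
  const double n = static_cast<double>(frames.size());
  for (std::size_t d = 0; d < dim; d++) {
    g.mean[d] /= n;
    g.var[d] = g.var[d] / n - g.mean[d] * g.mean[d];
    if (g.var[d] < var_floor) g.var[d] = var_floor;
  }
  return g;
}
```

A model built this way would then be the starting point for the accumulate, re-estimate and mix-up iterations described above.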
From: Xavier A. <xan...@gm...> - 2012-08-19 16:10:13
|
Hi again,
I am now trying to train some models given my extracted features. I want to use these models for speaker-ID experiments. For this reason I was looking for some simple method to initialize the models given some training data (something like std-perturbed Gaussian splitting, or K-means equivalents), but I did not find anything straightforward in the main code or the examples.
Does anyone have an example I can work with, or any suggestion on how to implement it?

Thanks

Xavier Anguera |
From: Xavier A. <xan...@gm...> - 2012-08-19 16:06:12
|
Thanks a lot for your answer. I will primarily use the features within Kaldi, and I see now that even though a key is prepended, it is taken care of when I read the features back.

X. |
From: Arnab G. <ar...@gm...> - 2012-08-19 11:14:17
|
That happens because you are using TableWriter to write the HTK features. This is meant for writing Kaldi-style objects that are always stored as key-value pairs: in your case the key may be the utterance ID and the value the actual features. You can find more information at: http://kaldi.sourceforge.net/io.html

If you really need to write something that will be processed by a non-Kaldi tool that expects HTK-style feature files, you should directly open the file and write to it (you may want to use kaldi::Output). But note that the filenames cannot have "ark:" or "scp:" like qualifiers; the name will be treated as a literal.

-Arnab |
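As an illustration of "directly open the file and write to it", here is a standalone sketch that writes a keyless HTK-style parameter file to any std::ostream. It does not use Kaldi or kaldi::Output. The function names are hypothetical, and the layout it assumes (a 12-byte big-endian header of nSamples, sampPeriod in 100 ns units, sampSize in bytes and parmKind, followed by big-endian 32-bit floats) is the usual HTK format as I understand it, so check it against the HTK documentation before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// Writes the low `nbytes` bytes of `v` to the stream, most significant first.
void PutBigEndian(std::ostream &os, uint32_t v, int nbytes) {
  for (int shift = 8 * (nbytes - 1); shift >= 0; shift -= 8)
    os.put(static_cast<char>((v >> shift) & 0xFF));
}

// Writes a 12-byte header followed by the feature values, with no key in
// front. `frames` must be rectangular. Returns false on bad input or if the
// stream fails.
bool WriteHtkFeatures(std::ostream &os,
                      const std::vector<std::vector<float> > &frames,
                      uint32_t samp_period_100ns, uint16_t parm_kind) {
  if (frames.empty() || frames[0].empty()) return false;
  const std::size_t dim = frames[0].size();
  if (dim * sizeof(float) > 0xFFFF) return false;  // sampSize is 16 bits
  for (std::size_t t = 0; t < frames.size(); t++)
    if (frames[t].size() != dim) return false;

  PutBigEndian(os, static_cast<uint32_t>(frames.size()), 4);       // nSamples
  PutBigEndian(os, samp_period_100ns, 4);                          // sampPeriod
  PutBigEndian(os, static_cast<uint32_t>(dim * sizeof(float)), 2); // sampSize
  PutBigEndian(os, parm_kind, 2);                                  // parmKind

  for (std::size_t t = 0; t < frames.size(); t++) {
    for (std::size_t d = 0; d < dim; d++) {
      uint32_t bits;
      std::memcpy(&bits, &frames[t][d], sizeof(bits));  // reinterpret float
      PutBigEndian(os, bits, 4);
    }
  }
  return os.good();
}
```

To write a file, pass a `std::ofstream` opened with `std::ios::binary` and a plain filename (no "ark:" prefix); a 10 ms frame shift corresponds to a `samp_period_100ns` of 100000.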
From: Xavier A. <xan...@gm...> - 2012-08-18 23:17:02
|
Hi,
I have extracted the code below from an example file and I am trying to make it write an HTK-compliant features file.
I almost have it working, except that the file I get contains the HTK data (readable by HList, all correct) preceded by the key I pass to htk_writer (in my case the string "key"; I tried passing it "" but it complains).
Is there a way to avoid having the key written into the HTK file?

Thanks!

Xavier Anguera

  std::pair<Matrix<BaseFloat>, HtkHeader> p;
  p.first.Resize(m_feats.NumRows(), m_feats.NumCols());
  p.first.CopyFromMat(m_feats);
  HtkHeader header = {
    m_feats.NumRows(),
    (int)(m_mfcc_opts.frame_opts.frame_shift_ms * 10000),  // shift
    sizeof(float)*m_feats.NumCols(),
    011
  };

  p.second = header;

  string outputFile = "ark:";
  outputFile += fileName;
  TableWriter<HtkMatrixHolder> htk_writer(outputFile);
  htk_writer.Write("key", p); |
From: Daniel P. <dp...@gm...> - 2012-07-27 15:24:58
|
Thanks, Nelson -- yes, it's there. Search for fmmi in example scripts such as egs/wsj/s5/run.sh [it is called fMMI because it uses the MMI objective function].

Note: it's not checked in yet, but I got better results from using the "_indirect" script [see run.sh] with double the learning rate, at 0.02. fMPE isn't giving as much gain as it used to at IBM, though; I'm trying to find out why.

Dan |
From: Neil N. <nn...@in...> - 2012-07-27 13:11:39
|
Theban,

Google gives this page:

http://kaldi.sourceforge.net/fmpe-est_8cc.html

Neil |
From: Theban S. <the...@gm...> - 2012-07-26 19:45:17
|
Hi Kaldi team,
I was wondering if FMPE is implemented in the Kaldi toolkit.

Thanks for your time,

Theban |
From: Vassil P. <vas...@gm...> - 2012-07-25 11:38:30
|
Hi,

Did you try changing the parameters in cmd.sh? If you are running Grid Engine, the parameters of queue.pl should match your setup. If you don't have a cluster, you can run the scripts locally by commenting/uncommenting the relevant lines in cmd.sh so that run.pl is used instead of queue.pl.

Vassil

_______________________________________________
Kaldi-developers mailing list
Kal...@li...
https://lists.sourceforge.net/lists/listinfo/kaldi-developers |
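The reported error ("line 129: -l: command not found") is consistent with $decode_cmd being empty in the interactive shell, for example because cmd.sh was never sourced there. That is an inference from the log, not something confirmed in the thread. The sketch below reproduces the expansion and then shows what local (no cluster) settings might look like. The contents of cmd.sh differ between Kaldi versions, so the two export lines are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: why an empty $decode_cmd can lead to "-l: command not found".
decode_cmd=""                          # empty, as in a shell where cmd.sh was not sourced
cmd="$decode_cmd -l mem_free=10G"      # the string that --cmd would receive
set -- $cmd                            # word-split it, roughly as an unquoted $cmd expands
first_word=$1
echo "the shell would try to run: $first_word"

# Assumed local (no cluster) settings; in the recipe these would live in cmd.sh.
export train_cmd=run.pl
export decode_cmd=run.pl
echo "decode_cmd is now: $decode_cmd"
```

In the recipe directory, running `. ./cmd.sh` before invoking the command by hand should make $decode_cmd available. Note that `-l mem_free=10G` is a Grid Engine resource request, so it may need to be dropped when run.pl is used.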
From: Do Q. <doq...@is...> - 2012-07-23 06:45:23
|
Dear Kaldi team,

I am Do Quoc Truong, a member of the Augmented Human Communication Laboratory.
Website: http://ahclab.naist.jp/index_en.html

I am trying to run the Kaldi example 'kaldi-trunk/egs/wsj/s5', but I have run into a problem related to 'qsub'. The problem arises when I run the command 'local/wsj_train_rnnlms.sh --cmd "$decode_cmd -l mem_free=10G" data/local/rnnlm.h30.voc10k' from line 47 of 'run.sh'.

Below are the details of the error:

ahclab24 $ local/wsj_train_rnnlms.sh --cmd "$decode_cmd -l mem_free=10G" data/local/rnnlm.h30.voc10k
Not installing the rnnlm toolkit since it is already there.
Getting training data with OOV words replaced with <UNK> (train_nounk.gz)
Splitting data into train and validation sets.
Training RNNLM (note: this uses a lot of memory! Run it on a big machine.)
local/wsj_train_rnnlms.sh: line 129: -l: command not found

I do not have any idea about qsub and clustering, or how to fix this problem.

I hope you can help me figure out the problem. I look forward to hearing from you.

Yours sincerely,
Do Quoc Truong |
From: Nagendra G. <nag...@go...> - 2012-07-01 17:42:16
|
Tony,

Normally that's the only thing, but some other parameter tweaking may sometimes help.

Nagendra |
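As a rough illustration of what "other parameter tweaking" could mean for 8 kHz audio, the sketch below writes an mfcc.conf that sets the sample rate and also keeps the mel filterbank below the Nyquist frequency (4000 Hz). --sample-frequency comes from the thread. --low-freq and --high-freq are options of Kaldi's MFCC extraction, but the values chosen here are illustrative assumptions, not recommendations from the list.

```shell
#!/usr/bin/env bash
# Sketch: write an MFCC config for 8 kHz telephone speech into a scratch directory.
sample_rate=8000
nyquist=$((sample_rate / 2))           # highest representable frequency: 4000 Hz
high_freq=$((nyquist - 200))           # illustrative upper cutoff (assumption)

conf_dir=$(mktemp -d)                  # stand-in for the recipe's conf/ directory
cat > "$conf_dir/mfcc.conf" <<EOF
--sample-frequency=$sample_rate
--low-freq=40
--high-freq=$high_freq
EOF

cat "$conf_dir/mfcc.conf"
```

Whether such cutoffs help is an empirical question for the data at hand; the only setting the thread actually confirms is the sample frequency.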
From: Tony R. <to...@ca...> - 2012-07-01 17:13:26
|
Having got the standard example recipes to go, I'm now looking at a few tasks with my own data.

One of them is telephony. I've set --sample-frequency=8000 in conf/mfcc.conf; do I need to do anything else?

Tony
--
Dr A J Robinson, Founder and Director of Cantab Research Limited.
St Johns Innovation Centre, Cowley Road, Cambridge, CB4 0WS, UK.
Company reg no 05697423 (England and Wales), VAT reg no 925606030. |
From: Daniel P. <dp...@gm...> - 2012-06-29 13:15:31
|
I think the missing-feature approach would be quite difficult to do in the existing framework. To do it properly you'd have to redo a lot of code.

There was essentially a bug in gmm-init-model: it had no variance floor (the rest of the code does). It might work now; I've checked in a change.

People sometimes interpolate the F0 in missing regions. If you do LDA (so get deltas), this would cause zero variances anyway. It should generally not crash, though.

Dan |
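The interpolation Dan mentions can be sketched as follows, assuming the F0 track is stored as one value per line with "-1" marking unvoiced frames. Both the file layout and the function name interpolate_f0 are hypothetical; this is not a Kaldi tool. Gaps between voiced frames are filled linearly, and leading or trailing gaps copy the nearest voiced value.

```shell
#!/usr/bin/env bash
# Sketch: linearly interpolate unvoiced frames (marked -1) in a one-value-per-line F0 track.
interpolate_f0() {
  awk '
    { f0[NR] = $1 }
    END {
      n = NR
      last = 0                         # index of the most recent voiced frame (0 = none yet)
      for (i = 1; i <= n; i++) {
        if (f0[i] < 0) continue        # unvoiced frame: filled in once the next voiced one is found
        if (last == 0) {
          # leading gap: copy the first voiced value backwards
          for (j = 1; j < i; j++) f0[j] = f0[i]
        } else {
          # interior gap: straight line between the two voiced neighbours
          for (j = last + 1; j < i; j++)
            f0[j] = f0[last] + (f0[i] - f0[last]) * (j - last) / (i - last)
        }
        last = i
      }
      # trailing gap: hold the last voiced value
      if (last > 0) for (j = last + 1; j <= n; j++) f0[j] = f0[last]
      for (i = 1; i <= n; i++) print f0[i]
    }'
}

printf '%s\n' -1 100 -1 -1 130 -1 | interpolate_f0
```

For the sample input this prints 100, 100, 110, 120, 130, 130. The leading and trailing gaps are still constant, so a variance floor remains useful when deltas are computed over them.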
From: Mirko H. <mir...@go...> - 2012-06-28 22:30:08
|
Hi!

I was trying to append F0 (fundamental frequency) features to the MFCC/PLP features, and I realized that the F0 algorithm needs to output a "missing feature" value in regions that do not correspond to vowels (that is, regions without periodic excitation). In my case "-1" was used.

The resulting features are not well suited to the Gaussian model, and gmm-init-model crashed while computing Gconst, I think because of the zero variance in regions that contain only "-1". CMN/CVN normalization also makes little sense when many "-1" values are present in the signal.

Is there already anything in Kaldi that can be done about this? Should we introduce a "missing feature" value that prevents those frames from being used when computing means and variances?

Greetings,
Mirko |
From: Andrew R. <an...@cs...> - 2012-06-24 13:29:02
|
Quite right. I was stepping through run.sh line by line, and must have missed path.sh that time. (What a time sink!)

Thank you. (I'll update my version of the repo now to get s5.)

-Andrew |
From: Vassil P. <vas...@gm...> - 2012-06-24 08:41:32
|
Hi,

I don't have much experience with wsj/s3 myself, but as far as I can see the version in Kaldi's trunk sets this environment variable in egs/wsj/s3/path.sh. I think that if you "source" this script (. ./path.sh) before running the rest of the recipe, LC_ALL should already be set for you.

By the way, the currently recommended version of the WSJ recipe is "s5", which I think is already stable.

Vassil

On Sat, Jun 23, 2012 at 5:14 PM, Andrew Rosenberg <an...@cs...> wrote:
> Hi all,
>
> I've run into a problem with language model training in the wsj
> training recipe s3. During the LM training an error shows up that I'm
> not quite sure how to fix.
>
> Train LM 3gram min count
> Getting raw N-gram counts
> generating n grams
> discount_ngrams: for n-gram order 1, D=0.000000, tau=0.000000 phi=1.000000
> discount_ngrams: for n-gram order 2, D=0.000000, tau=0.000000 phi=1.000000
> discount_ngrams: for n-gram order 3, D=1.000000, tau=0.000000 phi=1.000000
> error: histories are not in sorted order, "?? ۰" > "? ??"
> merge_ngrams: merge_ngrams.cc:141: void process_line(char*): Assertion
> `comp > 0 || (comp == 0 &&
> entry.predicted.compare(stack.back().predicted) >= 0)' failed.
>
> Looking at the "error: ..." line in emacs rather than the console, to
> see what the '?' characters actually were, it was clear that the issue
> was with the sorting of the ngram tokens.
>
> The recipe runs error free until
> local/wsj_train_lms.sh
>
> Within this, the line that (first) gives the problem is
>
> train_lm.sh --arpa --lmtype 3gram-mincount $dir
>
> Digging deeper into train_lm.sh, the line that generates the error is
>
> gunzip -c $dir/train.gz | tail -n +$heldout_sents | \
> get_raw_ngrams 3 | sort | uniq -c | uniq_to_ngrams | \
> sort | discount_ngrams $subdir/config.get_ngrams | \
> sort | merge_ngrams | gzip -c > $subdir/ngrams.gz
>
> Suspecting this was a problem with how sort works, I tried sort
> -n to see if it would fix the issue, to no avail.
>
> Ultimately the fix was to "export LC_ALL=C" to ensure POSIX style sorting.
>
> This is clearly an environment problem, but I figured you guys would
> want to know about it. I'm doing this in bash on CentOS. (If more
> environment information would be useful, let me know.)
>
> Thanks very much for putting this tool together. I'm really enjoying
> getting to know it.
>
> Andrew |
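The effect of the LC_ALL=C fix can be seen on a tiny example. The assertion in the log compares strings with std::string::compare, which works on raw bytes, so the input has to be sorted in byte order as well. The sketch below uses a made-up four-word list (not data from the recipe) to show the order that sort produces under the C locale.

```shell
#!/usr/bin/env bash
# Sketch: byte-order (C locale) sorting, which a byte-wise string comparison expects.
words=$(printf '%s\n' b A a B)

# With LC_ALL=C, sort compares raw bytes, so uppercase ASCII precedes lowercase.
c_sorted=$(printf '%s\n' "$words" | LC_ALL=C sort | paste -sd' ' -)
echo "C locale order: $c_sorted"

# Setting it once for the whole session, as sourcing path.sh is said to do in the thread.
export LC_ALL=C
```

Under many UTF-8 locales the same input would come out in a case-insensitive dictionary order instead, which is the kind of mismatch that would trip the "histories are not in sorted order" check.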