From: Raymond W. M. N. <wm...@sh...> - 2015-03-08 00:12:24
Alright, thanks!

Raymond
From: Daniel P. <dp...@gm...> - 2015-03-07 23:51:43
You shouldn't need to subset those other archives; the programs should just
read from them the elements they need.

Dan
From: Raymond W. M. N. <wm...@sh...> - 2015-03-07 23:47:41
Thank you, Dan.

Is there a similar subset function for alignment and lattice archives? I tried

  subset-feats --include=foo/bar/random-utt-subset.iteration ark:- ark:-

with the alignment pipeline and it doesn't work (apparently the function
expects the second field to start with '[', which an alignment file does not
have).

It is the RAM issue, Tony, so I need to sample the files before sequence
training.

thank you
raymond

On 28 February 2015 at 19:48, Tony Robinson <to...@ca...> wrote:
> Just to check that it's RAM you run out of, not local disk. IIRC there was
> a change made to use /tmp quite a few months ago. Our large local disk
> isn't on /tmp, and once we fixed this it all worked great again.
>
> It's good to see the tedlium recipe being at the core of development.
> There are so many bright people out there; a completely free and
> state-of-the-art recipe can only get more people working on ASR, so we'll
> make faster progress.
>
> Tony
>
> > On 28 Feb 2015, at 19:31, "Raymond W. M. Ng" <wm...@sh...> wrote:
> >
> > Hi Kaldi,
> >
> > I am training a DNN with Karel's setup on a 160hr data set. When I get
> > to the sMBR sequence-discriminative training (steps/nnet/train_mpe.sh),
> > the memory usage explodes. The program only managed to process around
> > 2/7 of the training files before it crashed.
> >
> > There's no easy accumulation function for the DNN, but I assume I can
> > just put different training-file splits in consecutive iterations?
> >
> > I'd like to know if there's a resource out there already. I was
> > referring to the egs/tedlium recipe.
> >
> > thanks
> > raymond
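If one really does need to subset alignments or lattices by utterance id
(rather than relying on the binaries reading only the elements they need, as
Dan suggests), the usual trick is to filter on the utterance id with
utils/filter_scp.pl rather than subset-feats. A hedged sketch, with
hypothetical archive paths and an uttlist file containing one utterance id
per line:

  # Alignments: in text form each line starts with the utterance id,
  # so filter_scp.pl can select the wanted lines directly.
  copy-int-vector ark:exp/tri4_ali/ali.1.ark ark,t:- | \
    utils/filter_scp.pl uttlist | \
    copy-int-vector ark:- ark:ali.subset.ark

  # Lattices: write an scp index once, filter it, then copy the subset.
  lattice-copy ark:exp/tri4_denlats/lat.1.ark ark,scp:lat.full.ark,lat.full.scp
  utils/filter_scp.pl uttlist lat.full.scp | lattice-copy scp:- ark:lat.subset.ark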
From: Daniel P. <dp...@gm...> - 2015-03-06 08:26:05
I am forwarding this email thread to kaldi-developers as I think it will be
of interest to people.

Vimal found that we can improve sMBR-trained neural nets by recomputing the
priors after sMBR training -- setting them to the average posterior computed
by the neural net on randomly chosen training data. [This is a little bit
like ensuring that the Gaussian mixture weights sum to one in a generative
model, which is normally done in discriminative training even though in
principle the objective function would make sense without ensuring that they
sum to one.] Karel has checked in the change for the nnet1 setup, and I
believe he has also changed the script to make --one-silence-class true the
default.

"--one-silence-class true" tends to improve results by reducing the
insertion rate, as well as making more sense as an objective function.
Basically, the old objective function (standard MPE/sMBR/MPFE) had an
asymmetry w.r.t. insertions: insertions into silence regions were not
counted as errors. This never made sense, but was done because it had seemed
to work (that was in other toolkits, though, like HTK). Anyway,
--one-silence-class true makes the objective function more symmetric, and
also makes it so that all silence phones (silence, noise, etc.) or silence
pdfs are treated as a single class, so replacing silence with noise or vice
versa is not counted as an error. This makes sense because it's similar to
how we normally score the systems.

I'm hoping that in a couple of weeks we can check in the corresponding
changes to the discriminative training setup for nnet2. I'd like to test
them on a few setups first, though.

Dan

---------- Forwarded message ----------
From: Vesely Karel <ive...@fi...>
Date: Thu, Mar 5, 2015 at 8:24 AM
Subject: Re: Large improvements by adjusting priors
To: dp...@gm..., Vimal Manohar <vim...@gm...>

Okay, thanks, I just committed the updated sMBR script which estimates the
priors on the training data. It has fixed the problem of too many deletions
appearing after sMBR training (there are errors in the training transcripts,
so sMBR does not help much here):

%WER 78.4 | 2711 24825 | 24.8 47.8 27.5 3.2 78.4 99.6 | -1.103 | exp/dnn6b_butbn1_pretrain-dbn_dnn/decode_vllp.tune.seg1/scoring_lex_10/ctm.filt.sub.sys
%WER 78.4 | 2711 24825 | 24.4 46.6 29.0 2.8 78.4 99.6 | -1.208 | exp/dnn6b_butbn1_pretrain-dbn_dnn/decode_vllp.tune.seg1_PRIOR/scoring_lex_11/ctm.filt.sub.sys
=> no change for frame cross-entropy training

%WER 80.7 | 2711 24825 | 20.7 29.8 49.4 1.4 80.7 99.8 | -1.130 | exp/dnn6c_butbn1_pretrain-dbn_dnn_smbr/decode_vllp.tune.seg1/scoring_lex_9/ctm.filt.sub.sys
%WER 78.0 | 2711 24825 | 24.7 45.0 30.4 2.7 78.0 99.6 | -1.109 | exp/dnn6c_butbn1_pretrain-dbn_dnn_smbr/decode_vllp.tune.seg1_PRIOR/scoring_lex_11/ctm.filt.sub.sys
=> helpful with sMBR training

Also changed the defaults of the sMBR script as Dan suggested:
do_smbr=true
exclude_silphones=true
one_silence_class=true

Thanks,
Karel.

On 03/04/2015 10:50 PM, Daniel Povey wrote:
> Karel, also note that the --one-silence-class thing seems to have been
> helpful in quite a few scenarios. We should consider making this the
> default. Anyway, the original formulation never made sense; it was always
> a hack. --one-silence-class makes more sense.
> Dan
>
> On Wed, Mar 4, 2015 at 4:43 PM, Vimal Manohar <vim...@gm...> wrote:
>> Yes, that is correct. I found it helpful especially in the cases where
>> the epoch4 model was performing worse than epoch3, like when using a high
>> learning rate. But after recomputing priors (individually for both
>> models) at the end of sMBR training, the epoch4 model was much better
>> than epoch3.
>>
>> On 03/04, Daniel Povey wrote:
>>> Karel, I already do that at the end of my frame cross-entropy training.
>>> It was never clear that it made a big difference, but I felt it was the
>>> right way to do it. I think what Vimal was saying is that he did the
>>> same at the end of sMBR training, and it did make a difference.
>>> Dan
>>>
>>> On Wed, Mar 4, 2015 at 4:28 PM, Karel Veselý <ive...@fi...> wrote:
>>>> Wow, that sounds good. Just to check that I understand: instead of
>>>> taking relative frequencies from the pdf-alignment, you compute the
>>>> priors as the average DNN output on a subset of data at the end of
>>>> frame cross-entropy training, and then the priors are fixed during
>>>> the sMBR training... Did I get it correctly?
>>>> Thanks,
>>>> Karel.
>>>>
>>>> Dne 4. 3. 2015 v 18:43 Daniel Povey napsal(a):
>>>>>> I am getting large improvements by adjusting priors on my Fisher
>>>>>> setup and also on some of the Babel systems.
>>>>> Great news! So I guess it means you recompute the prior term based on
>>>>> the average posteriors on a subset of data, just like at the end of
>>>>> the cross-entropy training script. Cc'ing Karel for his info, as he
>>>>> might want to put this into his sMBR script.
>>>>>
>>>>>> On the baseline discriminative supervised system, Nnet2_SMBR, the
>>>>>> improvement is 0.4% over not adjusting priors.
>>>>> Cool. Since this is a minor change, we can check it into the sMBR
>>>>> training scripts quite soon.
>>>>>
>>>>>> On the sMBR multilingual-recipe semisupervised system Multilang2_SMBR,
>>>>>> the improvement is around 0.2%. On the lattice entropy stuff,
>>>>>> Multinnet2_NCE+SMBR, the improvement is again 0.2%. These are just
>>>>>> the improvements considering only the respective previous best
>>>>>> systems. Some of the other systems that were performing worse before
>>>>>> seem to have been worse only because of a mismatch of priors. Some of
>>>>>> the lattice entropy systems got around 1% improvement, bringing them
>>>>>> closer to the best lattice entropy system. Also, we had an issue
>>>>>> before of the unsupervised part of the neural net performing better
>>>>>> than the supervised part. This is mostly mitigated by adjusting
>>>>>> priors. I am testing the priors adjustment out on the Babel
>>>>>> languages. Also, I tried sMBR with one-silence-class in some Babel
>>>>>> languages; it gives around 1% improvement. It looks to be mostly due
>>>>>> to a decrease in insertions and substitutions, but a slight increase
>>>>>> in deletions. I am now trying to see its effect on the supervised
>>>>>> part of the lattice entropy semisupervised recipe.
>>>>> Cool!
>>>>>
>>>>>> Is there a way to extend one-silence-class to MMI or lattice entropy?
>>>>>> Can we merge arcs at a particular time that have silence pdfs, and
>>>>>> then pass the gradients to all the silence pdfs in the DNN output
>>>>>> layer?
>>>>> The one-silence-class thing is specific to MPE and sMBR; it's not
>>>>> applicable to MMI or cross-entropy.
>>>>> Dan
>>>>>
>>>>>> Regards,
>>>>>> --
>>>>>> Vimal Manohar
>>>>>> Doctoral Student
>>>>>> Electrical & Computer Engineering
>>>>>> Johns Hopkins University
>>>>>> Baltimore, MD

--
Karel Vesely, Brno University of Tec...@fi..., +420-54114-1300
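The recomputation Karel and Dan describe boils down to one pipeline: forward
a random subset of the training data through the net and average the softmax
outputs into a counts vector. A minimal nnet1-flavoured sketch, assuming
hypothetical paths (final.nnet, train_subset.scp) and omitting the feature
transform and CMVN a real setup would apply:

  # Posteriors for a random subset of training utterances; sum each
  # utterance's rows to a vector, then sum those into one counts file.
  nnet-forward final.nnet scp:train_subset.scp ark:- | \
    matrix-sum-rows ark:- ark:- | \
    vector-sum ark:- new_prior_counts.vec

  # The resulting vector plays the role of the alignment-derived counts,
  # e.g. what nnet-forward's --class-frame-counts divides out at decode time.

  # With the new script defaults spelled out as flags (flag names inferred
  # from the variables Karel lists, so treat this line as a sketch):
  steps/nnet/train_mpe.sh --do-smbr true --one-silence-class true ...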
From: Kirill K. <kir...@sm...> - 2015-03-02 22:21:41
Yes, pipelines in PS are not simple files. It is probably confusing how they
reused the syntax for a fairly different concept.

-kkm
From: Daniel P. <dp...@gm...> - 2015-03-02 21:51:29
Look for popen.

Dan
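The popen() call Dan points at lives in Kaldi's I/O layer: an extended
filename or specifier whose command ends (or begins) with '|' is executed
through the shell, which is exactly what Windows-built binaries can't do. A
small illustration of the idiom, with a hypothetical alignment path:

  # The rspecifier contains a shell command; kaldi-io hands it to popen(),
  # so the binary has to be able to run a POSIX shell.
  copy-int-vector "ark:gunzip -c exp/tri1_ali/ali.1.gz |" ark,t:- | head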
From: Kirill K. <kir...@sm...> - 2015-03-02 21:46:25
Thanks, this could really be an obstacle, I think. I could not find any
instances of "fork" or "exec*" in the sources, however. Do you even roughly
remember what I should be looking for?

-kkm
From: Jan T. <af...@ce...> - 2015-03-02 21:41:32
Not only does it impose overhead -- sometimes it somehow changes the data,
because it interprets the data as text or text objects. I remember using
PowerShell once or twice, and while it was an interesting experience, I
don't want to go back. There might be some way to switch that off; I didn't
figure it out at the time (it's been a few years). I ended up calling
cmd.exe ("DOS") with /c to execute the particular pipeline.

y.
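Jan's workaround sidesteps PowerShell's object pipeline by letting cmd.exe
wire up the pipe as raw bytes. Roughly, as typed at a PowerShell prompt (a
sketch only; the archive names are hypothetical and quoting details may
vary):

  # Run the whole Kaldi pipeline under cmd.exe so the data flows through
  # raw byte pipes instead of PowerShell's object stream.
  cmd.exe /c "copy-feats ark:feats.ark ark:- | feat-to-dim ark:- -"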
From: Daniel P. <dp...@gm...> - 2015-03-02 21:34:08
I remember why it's not possible to get the scripts to work when compiled on
Windows. The scripts depend on Kaldi opening commands from within C++
programs, and those commands need to be interpreted by bash. If you have
compiled the binaries on Windows, this won't work, because they don't "know
about" cygwin. Now, in principle you could try to convert the scripts to DOS
commands to make it work. But this would be a huge amount of work, and DOS
is too primitive a language. Basically I gave up on doing this when I
discovered that there is no documented way to escape double-quote characters
in DOS. (It seems, empirically, that you can sometimes use 3 quotes and
sometimes 4, but the distinction is very mysterious.) And some people told
me that PowerShell is the recommended way to do scripting now, but Kaldi is
based on piping raw data, not objects, and PowerShell treats raw data as a
stream of objects which happen to be characters, which probably imposes a
considerable overhead. Anyway, the standard utilities we need for speech
recognition evaluations, such as sph2pipe, sclite and so on, won't work on
Windows.

Dan
From: Kirill K. <kir...@sm...> - 2015-03-02 21:29:30
That's exactly what you needed to say to incite me to try to make it all
work. :)) Thanks for the scripts. They must be recent; I did not see them
just a few months ago!

-kkm
From: Daniel P. <dp...@gm...> - 2015-03-02 21:16:15
The librispeech scripts are available in librispeech/s5/.

I am warning you, you will regret trying to do it on Windows. But I would
still appreciate help with the build setup.

Dan
From: Kirill K. <kir...@sm...> - 2015-03-02 21:05:40
|
Thank you, Dan! I hope cygwin will take care of the script part, unless there are exotic uses like unix domain sockets, procfs etc.

-kkm

From: Daniel Povey [mailto:dp...@gm...]
Sent: 2015-03-02 1127
To: Kirill Katsnelson
Cc: kal...@li...
Subject: Re: [Kaldi-developers] Kaldi on Windows

> It would be great if you could improve the Windows build for us.
> However, Kaldi scripts depend on things like bash and won't work on
> Windows, so ultimately it will not make sense for you to train on
> Windows: the cost of your time will be more than the cost of a new
> machine. That being said, I would appreciate better Windows build
> scripts (the main use case is deployment of recognition on Windows).
> Dan
|
From: Kirill K. <kir...@sm...> - 2015-03-02 21:05:35
|
Compiling under cygwin is probably not reasonable for such a computation-intensive toolkit: no CUDA, probably no MKL, not even instruction-set optimization, as far as I understand.

I'll see what I can pull out of it once it compiles and tests OK. I know of the line-ending issues and hope to be able to handle them. The pipes I am not sure about; I never ran into those before, so that would be a more complex problem.

I do not have access to the WSJ corpus. Are there any pointers to using LibriSpeech instead? I remember reading your paper comparing the DNN results in wsj/s5 with LibriSpeech. I'll have to reread it, but I may as well ask you while we are communicating :) -- are Kaldi scripts available for it?

-kkm

From: Daniel Povey [mailto:dp...@gm...]
Sent: 2015-03-02 1253
To: Kirill Katsnelson
Cc: kal...@li...
Subject: Re: [Kaldi-developers] Kaldi on Windows

> If you are going to use cygwin, then it is best to just compile in
> cygwin. If you compile using Visual Studio, the binaries won't work
> correctly; I forget whether it relates to how newlines are handled, how
> pipes work, or some other reason. In any case, Visual Studio is quite
> buggy: some problems were reported recently on one of these lists.
> Dan
|
From: Jan T. <af...@ce...> - 2015-03-02 21:02:10
|
Hi, there was a discussion about getting Kaldi compiled under VS (various versions) two weeks ago or so. The practical experience is that getting the VS solution loaded (opened) is the least of the problems. You can find the discussion in the kaldi-{devel,users} mail archives on sf.net.
Y.

On Mon, Mar 2, 2015 at 3:53 PM, Daniel Povey <dp...@gm...> wrote:

> Hi,
> If you are going to use cygwin, then it is best to just compile in
> cygwin. If you compile using Visual Studio, the binaries won't work
> correctly; I forget whether it relates to how newlines are handled, how
> pipes work, or some other reason. In any case, Visual Studio is quite
> buggy: some problems were reported recently on one of these lists.
> Dan
|
From: Daniel P. <dp...@gm...> - 2015-03-02 20:53:12
|
Hi,
If you are going to use cygwin, then it is best to just compile in cygwin. If you compile using Visual Studio, the binaries won't work correctly; I forget whether it relates to how newlines are handled, how pipes work, or some other reason. In any case, Visual Studio is quite buggy: some problems were reported recently on one of these lists.
Dan

On Mon, Mar 2, 2015 at 3:49 PM, Kirill Katsnelson <kir...@sm...> wrote:

> Thank you, Dan! I hope cygwin will take care of the script part, unless
> there are exotic uses like unix domain sockets, procfs etc.
>
> -kkm
|
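For context on why newlines and pipes matter here: Kaldi's table I/O routinely reads and writes through shell pipelines embedded in its read/write specifiers, and the data crossing those pipes is binary. A minimal sketch of the kind of pipeline involved (the archive name is illustrative):

    # An rspecifier that shells out through gunzip; any CRLF text-mode
    # translation on the pipe would silently corrupt the binary stream.
    copy-feats "ark:gunzip -c foo/feats.ark.gz |" ark,t:- | head

A build whose stdio opens pipes in text mode, or whose process spawning differs, breaks exactly this pattern.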
From: Daniel P. <dp...@gm...> - 2015-03-02 19:27:33
|
I confirmed on both lists. Replying to kaldi-developers and just bcc'ing kaldi-users.

It would be great if you could improve the Windows build for us.

However, Kaldi scripts depend on things like bash and won't work on Windows. So ultimately it will not make sense for you to train on Windows: the cost of your time will be more than the cost of a new machine. That being said, I would appreciate better Windows build scripts (the main use case is deployment of recognition on Windows).

Dan

On Mon, Mar 2, 2015 at 2:05 PM, Kirill Katsnelson <kir...@sm...> wrote:

> Hi,
> I am trying to compile and use Kaldi on Windows. The perl script that
> comes in the distribution does not produce very useful results, since
> VS2013 has trouble opening (opening, not compiling!) the resulting
> monster solution of 600+ projects. [...]
>
> -kkm
|
From: Kirill K. <kir...@sm...> - 2015-03-02 19:21:47
|
Hi,

I am trying to compile and use Kaldi on Windows. The perl script that comes in the distribution does not produce very useful results, since VS2013 has trouble opening (opening, not compiling!) the resulting monster solution of 600+ projects. It also does not support CUDA.

I approached the problem from square one and am writing msbuild scripts with support for CUDA, MKL and the Intel C++ compiler.

Is there any interest in supporting a Kaldi build on Windows in the mainline distribution? For me, a practical consideration in this decision was the cost of building an extra machine with CUDA hardware and Intel software just for Kaldi, while my Windows machine already has all this.

Is kal...@li... a closed list? I sent a subscription request to both lists, but got a confirmation from the -users@ list only.

-kkm
|
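A minimal sketch of what invoking such MSBuild scripts might look like; the solution name kaldi.sln is hypothetical, while /m, /p:Configuration and /p:Platform are standard MSBuild switches:

    # parallel 64-bit release build of the whole solution
    msbuild kaldi.sln /m /p:Configuration=Release /p:Platform=x64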
From: Karel V. <ve...@gm...> - 2015-03-02 13:10:47
|
Hi all,
sorry to hear about the problems. What type of data do you suspect causes the high RAM consumption? The features are read by a SequentialMatrixReader, which does not cache data; the lattices are read via an scp file pointing to gzipped lattices, so those are also not cached in RAM. The only thing that gets cached in memory is the alignments, but those are usually not too large.

Btw, I just found a bug in steps/nnet/train_mpe.sh: the utterances get shuffled, but the unshuffled list is the one actually used... I will fix it soon, but this is totally unrelated to the memory consumption.

Best,
Karel.

On 28.2.2015 at 20:48, Tony Robinson wrote:
> Just to check that it's RAM you run out of, not local disk. IIRC there
> was a change made to use /tmp quite a few months ago. Our large local
> disk isn't on /tmp, and once we fixed this it all worked great again.
|
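A minimal sketch of the kind of one-line fix Karel describes, assuming the script shuffles its utterance list with utils/shuffle_list.pl and builds the feature pipeline from it; the variable and file names here are illustrative:

    # shuffle once per run (fixed seed for reproducibility) ...
    utils/shuffle_list.pl --srand $seed $dir/train.scp > $dir/train_shuffled.scp
    # ... and make sure the *shuffled* copy is the one the trainer reads
    feats_tr="ark:copy-feats scp:$dir/train_shuffled.scp ark:- |"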
From: Daniel P. <dp...@gm...> - 2015-03-01 19:44:54
|
Yes, it definitely would. There are various ways to do it. You could do ali-to-phones with --per-frame=true to get the phone labels and convert them yourself; or you could do

    gunzip -c ali.1.gz | ali-to-post ark:- ark:- | \
      weight-silence-post 0.0 1:2:3:4:5 final.mdl ark:- ark:- | \
      post-to-weights ark:- ark,t:-

which will give you per-frame weights that are zero for silence and one for speech; you can treat these as labels. The 1:2:3:4:5 should be the contents of your data/lang/phones/silphones.csl.
Dan

On Sun, Mar 1, 2015 at 7:20 AM, John Barnes <jcb...@gm...> wrote:

> Do frame-level labels from kaldi acoustic models include the silence
> phones (e.g. SIL, SPN, ...)?
>
> If so, would it be possible to take aligned ASR data, extract those
> frame labels, collapse all nonsilence phones to a single class, and
> train the VAD DNN using an external framework like Theano?
>
> John
|
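A minimal sketch of the first route Dan mentions, collapsing per-frame phone ids to speech/nonspeech labels; it assumes the silence phones are ids 1-5 as in his example, and the awk post-processing is illustrative:

    # one line per utterance: utt-id followed by a 0/1 label per frame
    ali-to-phones --per-frame=true final.mdl "ark:gunzip -c ali.1.gz |" ark,t:- | \
      awk '{ printf "%s", $1;
             for (n = 2; n <= NF; n++) printf " %d", ($n <= 5 ? 0 : 1);
             print "" }'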
From: John B. <jcb...@gm...> - 2015-03-01 12:20:27
|
Do frame-level labels from kaldi acoustic models include the silence phones (e.g. SIL, SPN, ...)?

If so, would it be possible to take aligned ASR data, extract those frame labels, collapse all nonsilence phones to a single class, and train the VAD DNN using an external framework like Theano?

John

On Saturday, February 28, 2015, Daniel Povey <dp...@gm...> wrote:

> Hi,
> I am cross-posting this to kaldi-developers as I think my reply might
> be of interest to people subscribed to that list. This is a good excuse
> to talk about the situation with Voice Activity Detection (VAD) more
> generally. [...]
|
From: Daniel P. <dp...@gm...> - 2015-03-01 00:06:23
|
Hi,
I am cross-posting this to kaldi-developers as I think my reply might be of interest to people subscribed to that list. This is a good excuse to talk about the situation with Voice Activity Detection (VAD) more generally.

There definitely does need to be some good voice activity detection in Kaldi at some point. Part of the reason it doesn't exist yet is that it's never been clear to me that there is a "right" way to do VAD, or even a right way to formulate it as a problem. For example: how many classes should there be (music? laughter?); what should be done about cross-talk and background speakers; and how does this all work in the online setting (e.g., is there a mechanism to reclassify previous speech as background if we get much louder speech?).

Formulating it as a multi-class (speech/nonspeech) problem with neural nets does seem to be one of the most natural ways to set it up. However, I think it would make more sense to do this at the frame level rather than the segment level. Some of the issues involved in setting this up are a little complicated; for instance, it might be necessary to change some of the command-line tools so they don't require the transition model and accept labels directly instead of alignments.

Right now I'm working on extending the online-nnet2 setup to use the decoder backtrace to classify frames as silence or nonsilence, and use this to exclude silence from the iVector estimation. This should at least remove the WER performance hit that we get from not having speech/silence detection in online decoding. In the past (e.g. for BABEL) we have done segmentation by doing a first pass of recognition with a fairly simple model and post-processing the output to create segments.

Dan

On Sat, Feb 28, 2015 at 8:05 AM, John Barnes <jcb...@gm...> wrote:

> I'm interested in training a DNN voice activity detection system using
> kaldi. I have a large corpus labeled at the segment level as speech and
> nonspeech. Are there any existing recipes to do this, or suggestions on
> how to modify a recipe to accomplish this task?
>
> Thanks
>
> John
|
From: Tony R. <to...@ca...> - 2015-02-28 19:48:29
|
Just to check that it's RAM you run out of, not local disk. IIRC there was a change made to use /tmp quite a few months ago. Our large local disk isn't on /tmp, and once we fixed this it all worked great again.

It's good to see the tedlium recipe being at the core of development. There are so many bright people out there that a completely free, state-of-the-art recipe can only get more people working on ASR, so we'll make faster progress.

Tony

Sent from my iPad

> On 28 Feb 2015, at 19:31, "Raymond W. M. Ng" <wm...@sh...> wrote:
>
> Hi Kaldi,
>
> I am training a DNN with Karel's setup on a 160hr data set. When I get
> to the sMBR sequence-discriminative training (steps/nnet/train_mpe.sh),
> the memory usage explodes. The program only manages to process around
> 2/7 of the training files before it crashes. [...]
|
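If local disk rather than RAM turns out to be the problem, one minimal workaround, assuming the scripts create their temporaries via mktemp (which honours TMPDIR), is to point scratch files at the large disk; the path here is illustrative:

    # send scratch files to the big local disk instead of a small /tmp
    export TMPDIR=/mnt/bigdisk/tmp
    mkdir -p "$TMPDIR"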
From: Daniel P. <dp...@gm...> - 2015-02-28 19:47:05
|
Karel's setup inherently uses quite a lot of memory because it stores all the data in memory. You could try splitting the data as you said, but I think it would be better if Karel added something to the script to support this kind of thing in the "proper" way. Alternately, try running the nnet2 setup in local/online/run_nnet2_ms.sh, which does not have this problem; you would have to set --num-jobs-initial 1 --num-jobs-final 1 if you only have one GPU.

If you do want to modify Karel's setup to use random subsets of the features, I suggest adding something of the following form to the end of the feature pipeline:

    subset-feats --include=foo/bar/random-utt-subset.iteration ark:- ark:-

where random-utt-subset.1, random-utt-subset.2 and so on are utterance lists computed in advance.
Dan

On Sat, Feb 28, 2015 at 2:31 PM, Raymond W. M. Ng <wm...@sh...> wrote:

> Hi Kaldi,
>
> I am training a DNN with Karel's setup on a 160hr data set. When I get
> to the sMBR sequence-discriminative training (steps/nnet/train_mpe.sh),
> the memory usage explodes. The program only manages to process around
> 2/7 of the training files before it crashes.
>
> There's no easy accumulation function for the DNN, but I assume I can
> just put different training-file splits in consecutive iterations?
>
> I'd like to know if there's a resource out there already. I was
> referring to the egs/tedlium recipe.
>
> thanks
> raymond
|
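A minimal sketch of precomputing those per-iteration utterance lists, assuming a data/train/feats.scp listing; shuf, the 10k subset size and the output path are illustrative:

    # one fresh random subset of utterance ids per training iteration
    for n in 1 2 3 4; do
      awk '{print $1}' data/train/feats.scp | shuf | head -n 10000 \
        > foo/bar/random-utt-subset.$n
    done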
From: Daniel P. <dp...@gm...> - 2015-02-28 19:42:08
|
Good news! I'm bcc'ing kaldi-developers on just this one message. My aim here is for a trickle of messages to get to kaldi-developers, enough to give people on the list a sense of the kinds of things happening in Kaldi without overwhelming them. Note to people on kaldi-developers: most of the list traffic right now is on the help forum, https://sourceforge.net/p/kaldi/discussion/1355348/, and you can click the envelope icon to subscribe if you are logged into sourceforge; but note, much of the traffic is user questions of varying degrees of cluelessness, with maybe 5-15 messages per day, so make your choice.
Dan

On Sat, Feb 28, 2015 at 1:43 PM, Jerry.Jiayu.DU <jer...@qq...> wrote:

> Hi Karel,
>
> For the past two nights I have been tuning the LSTM on the RM recipe,
> and now I am able to get a WER of 2.04%, with an LSTM that is 4 times
> smaller than the baseline DNN (LSTM: 1.8M parameters vs DNN: 7.2M
> parameters).
> ---
> %WER 2.04 [ 256 / 12533, 18 ins, 60 del, 178 sub ]
> exp/lstm4f_c512_r200_c512_r200_lr0.0001_mmt0.9_clip50/decode/wer_4_0.5
> ---
> If I remember right, you mentioned the fbank DNN baseline is about 2%
> something; if that's right, the LSTM result I got is now a reasonable
> and competitive one.
>
> I will submit a diff patch to the current LSTM recipe
> (local/nnet/run_lstm.sh) a couple of days later, because I currently
> have urgent business at hand -- just wait for my patch.
>
> Modifications I have made so far:
>
> 1) momentum: 0.7 -> 0.9
>
> 2) use a deep nnet.proto to init the network, with gradient clipping
> at 50:
> --------
> <NnetProto>
> <LstmProjectedStreams> <InputDim> 43 <OutputDim> 200 <CellDim> 512 <ParamScale> 0.010000 <ClipGradient> 50.000000
> <LstmProjectedStreams> <InputDim> 200 <OutputDim> 200 <CellDim> 512 <ParamScale> 0.010000 <ClipGradient> 50.000000
> <AffineTransform> <InputDim> 200 <OutputDim> 1479 <BiasMean> 0.0 <BiasRange> 0.0 <ParamStddev> 0.040000
> <Softmax> <InputDim> 1479 <OutputDim> 1479
> </NnetProto>
> --------
> Although it's a 2-layer LSTM, it is still far smaller than the baseline
> DNN.
>
> 3) it's better to re-shuffle the training data at the beginning of
> *each epoch*. In steps/nnet/train.sh, the original training-data
> pipeline
>
> feats_tr="ark:copy-feats scp:$dir/train.scp ark:- |"
>
> becomes
>
> feats_tr="ark:shuf $dir/train.scp | copy-feats scp:- ark:- |"
>
> This is due to the way the multi-stream feature buffer is filled in
> nnet-train-lstm.cc: without utterance re-shuffling at the beginning of
> each epoch, some frames would always land at the beginning of a batch
> (bptt20) across epochs, and their errors would be truncated throughout
> the whole training process.
>
> 4) halving factor 0.8 -> 0.5. This point is irrelevant to the WER
> improvement, because the best WER always occurs before halving, but it
> reduces the total number of epochs, so we don't need to wait too long.
>
> Please wait for my patch; I might have further modifications.
>
> best,
> Jerry
|
From: Raymond W. M. N. <wm...@sh...> - 2015-02-28 19:31:14
|
Hi Kaldi,

I am training a DNN with Karel's setup on a 160hr data set. When I get to the sMBR sequence-discriminative training (steps/nnet/train_mpe.sh), the memory usage explodes. The program only manages to process around 2/7 of the training files before it crashes.

There's no easy accumulation function for the DNN, but I assume I can just put different training-file splits in consecutive iterations?

I'd like to know if there's a resource out there already. I was referring to the egs/tedlium recipe.

thanks
raymond
|