From: Shi Hu <fin...@gm...> - 2012-11-05 10:15:44
|
Yes that was the problem. I found out about it shortly after asking the newsgroup :( & :)

Thanks though!

On Mon, Nov 5, 2012 at 1:55 AM, Arnab Ghoshal <ar...@gm...> wrote:
> On Sat, Nov 3, 2012 at 2:39 AM, Shi Hu <fin...@gm...> wrote:
> > steps/make_mfcc.sh eventually calls util/parse_options.sh. But
> > parse_options.sh fails at line 81. I am not sure why...
>
> Did you source cmd.sh to set the job submission commands?
|
From: Arnab G. <ar...@gm...> - 2012-11-05 10:03:06
|
That's odd! The only memory intensive part of UBM training is the clustering step (init-ubm) but to run out of 35G you will need a very very big model (not possible for WSJ). Gaussian selection uses fairly low resources. Could you send the exact error messages?

-Arnab

On Sun, Nov 4, 2012 at 7:57 AM, Shi Hu <fin...@gm...> wrote:
> Hello
>
> I run local/run_mmi_tri2b.sh (this is a step in run.sh for WSJ) on a single
> machine at Stanford clusters which has 35GB RAM, but I still run out of RAM
> and swap memory when steps/train_ubm.sh at line 99 is called (doing Gaussian
> selection).
>
> How do I solve this problem?
>
> Thanks!
> Shi
>
> _______________________________________________
> Kaldi-developers mailing list
> Kal...@li...
> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
|
From: Arnab G. <ar...@gm...> - 2012-11-05 09:57:21
|
In principle, yes. Most of the required components are there, but you have to write your own solution.

On Fri, Nov 2, 2012 at 9:33 PM, Talat Tüfekçi <tal...@gm...> wrote:
> Could I use kaldi for a dictation application ?
>
> Thanks in advance.
|
From: Arnab G. <ar...@gm...> - 2012-11-05 09:56:29
|
On Sat, Nov 3, 2012 at 2:39 AM, Shi Hu <fin...@gm...> wrote:
> steps/make_mfcc.sh eventually calls util/parse_options.sh. But
> parse_options.sh fails at line 81. I am not sure why...

Did you source cmd.sh to set the job submission commands?
|
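For readers who hit the same failure, the suggestion above can be turned into an early check. This is only a minimal sketch: it assumes that cmd.sh lives in the recipe directory and defines the train_cmd variable used by run.sh, and the helper name require_cmd_vars is hypothetical (it is not part of Kaldi).

```shell
#!/bin/sh
# Hypothetical helper, not part of Kaldi: fail early when a variable that
# cmd.sh is expected to define (for example train_cmd) is unset or empty.
require_cmd_vars() {
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: \$$name is not set; did you source cmd.sh?" >&2
      return 1
    fi
  done
  return 0
}

# Assumed usage near the top of a recipe script:
#   . ./cmd.sh
#   require_cmd_vars train_cmd || exit 1
```

With a guard like this, a missing cmd.sh shows up as a clear message before steps/make_mfcc.sh is called, rather than as a failure deep inside the option parsing.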
From: Shi Hu <fin...@gm...> - 2012-11-04 02:27:45
|
Hello

I ran local/run_mmi_tri2b.sh (this is a step in run.sh for WSJ) on a single machine in the Stanford clusters which has 35 GB of RAM, but I still run out of RAM and swap memory when steps/train_ubm.sh at line 99 is called (doing Gaussian selection).

How do I solve this problem?

Thanks!
Shi
|
From: Shi Hu <fin...@gm...> - 2012-11-02 21:10:00
|
Hello

I tried to run kaldi-stable/egs/wsj/s5/run.sh at line 62 (the mfcc section). I copy the code here (FYI, I didn't run the code from lines 42-56, as it is marked as not so necessary):

    mfccdir=mfcc
    for x in test_eval92 test_eval93 test_dev93 train_si284; do
      steps/make_mfcc.sh --cmd "$train_cmd" --nj 20 \
        data/$x exp/make_mfcc/$x $mfccdir || exit 1;
      steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir || exit 1;
    done

steps/make_mfcc.sh eventually calls util/parse_options.sh. But parse_options.sh fails at line 81. I am not sure why...

Thanks!
Shi
|
From: Talat T. <tal...@gm...> - 2012-11-02 16:03:41
|
Could I use Kaldi for a dictation application?

Thanks in advance.
|
From: Talat T. <tal...@gm...> - 2012-11-02 15:58:34
|
From: Shi Hu <fin...@gm...> - 2012-10-30 19:43:58
|
Oops! Just missed that. Thanks!

On Tue, Oct 30, 2012 at 12:42 PM, Daniel Povey <dp...@gm...> wrote:
> This will be OK, as is made clear in egs/wsj/README
> Dan
|
From: Daniel P. <dp...@gm...> - 2012-10-30 19:42:37
|
This will be OK, as is made clear in egs/wsj/README

Dan

On Tue, Oct 30, 2012 at 3:39 PM, Shi Hu <fin...@gm...> wrote:
> Ok...
> Regarding which catalog that we should use, the egs/wsj/s5/run.sh lists
> wsj0 and wsj1 as:
> wsj0=/export/corpora5/LDC/LDC93S6B
> wsj1=/export/corpora5/LDC/LDC94S13B
>
> But what we have is LDC93S6A and probably LDC94S13A (we are searching for
> the latter now. Do you think this will matter?
>
> Thanks!
> Shi
|
From: Shi Hu <fin...@gm...> - 2012-10-30 19:39:42
|
Ok...

Regarding which catalog we should use, egs/wsj/s5/run.sh lists wsj0 and wsj1 as:

    wsj0=/export/corpora5/LDC/LDC93S6B
    wsj1=/export/corpora5/LDC/LDC94S13B

But what we have is LDC93S6A and probably LDC94S13A (we are searching for the latter now). Do you think this will matter?

Thanks!
Shi

On Tue, Oct 30, 2012 at 12:30 PM, Daniel Povey <dp...@gm...> wrote:
> Things have moved on since then, the current scripts are s5, these will
> work the best.
> Karel (cc'd) may know whether the tandem setup in s2 will still work.
> Dan
|
From: Daniel P. <dp...@gm...> - 2012-10-30 19:30:38
|
Things have moved on since then, the current scripts are s5, these will work the best.
Karel (cc'd) may know whether the tandem setup in s2 will still work.

Dan

On Tue, Oct 30, 2012 at 3:29 PM, Shi Hu <fin...@gm...> wrote:
> Hello
>
> Thanks for the quick reply! We're indeed missing some CD's and are now
> searching for it.
>
> We are trying to build a tandem neural network on top of Kaldi. I saw in
> kaldi-trunk/egs/wsj/, there exists a s2 folder corresponding to tandem
> related stuff. Should we use that?
>
> Also, under kaldi-stable/egs/wsj, the README.txt said s3 is the default,
> however, the paper
> http://publications.idiap.ch/downloads/papers/2012/Povey_ASRU2011_2011.pdf says
> it uses egs/wsj/s1 in section IX. Which one should we use in order to match
> the benchmark?
>
> Thanks!
> Shi
|
From: Shi Hu <fin...@gm...> - 2012-10-30 19:29:27
|
Hello

Thanks for the quick reply! We're indeed missing some CDs and are now searching for them.

We are trying to build a tandem neural network on top of Kaldi. I saw that in kaldi-trunk/egs/wsj/ there is an s2 folder corresponding to tandem-related stuff. Should we use that?

Also, under kaldi-stable/egs/wsj, the README.txt says s3 is the default; however, the paper http://publications.idiap.ch/downloads/papers/2012/Povey_ASRU2011_2011.pdf says it uses egs/wsj/s1 in section IX. Which one should we use in order to match the benchmark?

Thanks!
Shi

On Tue, Oct 30, 2012 at 8:10 AM, Daniel Povey <dp...@gm...> wrote:
> Also, if it's just an issue at the top level, you could create a directory
> of your own and put soft links to those .dir directories with the more
> standard names (i.e. without the .dir). [actually I'm not sure if this
> will cause any commands using "find" to break due to not following soft
> links.]
> Dan
|
From: Daniel P. <dp...@gm...> - 2012-10-30 15:11:06
|
Also, if it's just an issue at the top level, you could create a directory of your own and put soft links to those .dir directories with the more standard names (i.e. without the .dir). [actually I'm not sure if this will cause any commands using "find" to break due to not following soft links.]

Dan

On Tue, Oct 30, 2012 at 8:17 AM, Arnab Ghoshal <ar...@gm...> wrote:
> Shi, this looks like a problem we have seen with WSJ where the disks
> are extracted differently at different sites. You should try to find
> the following index files in your distribution: tr_s_wv1.ndx (for
> SI-84 training set), si_tr_s.ndx (for SI-284), si_et_20.ndx (Nov'92
> 20K test set), si_et_05.ndx (for Nov'92 5K), etc. and write a
> different version of local/wsj_data_prep.sh (you can look at
> local/cstr_wsj_data_prep.sh which is one such specialized script for a
> different directory structure). -Arnab
|
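The soft-link idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested fix for the WSJ recipe: the helper name link_without_dir_suffix and both directory arguments are hypothetical, and the sketch only strips the .dir suffix as suggested. Whether any further renaming is needed to match the pattern expected by wsj_data_prep.sh is not settled in this thread.

```shell
#!/bin/sh
# Hypothetical helper: for every "<name>.dir" directory directly under $1,
# create a symlink called "<name>" inside $2 that points back to it.
link_without_dir_suffix() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  for d in "$src"/*.dir; do
    [ -d "$d" ] || continue            # the glob matched nothing, or not a directory
    name=$(basename "$d" .dir)         # strip the trailing ".dir"
    # Use an absolute target so the link resolves regardless of the caller's cwd.
    ln -s "$(cd "$d" && pwd)" "$dest/$name" || return 1
  done
  return 0
}

# Assumed usage (paths are placeholders):
#   link_without_dir_suffix /path/to/LDC93S6A /path/to/wsj0_links
```

On the concern about "find": by default find does not descend into symlinked directories, but the -L option makes it follow them, for example `find -L /path/to/wsj0_links -name '*.ndx'`. Scripts that call find without -L would need to be adjusted accordingly.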
From: Arnab G. <ar...@gm...> - 2012-10-30 12:17:47
|
Shi, this looks like a problem we have seen with WSJ where the disks are extracted differently at different sites. You should try to find the following index files in your distribution: tr_s_wv1.ndx (for SI-84 training set), si_tr_s.ndx (for SI-284), si_et_20.ndx (Nov'92 20K test set), si_et_05.ndx (for Nov'92 5K), etc. and write a different version of local/wsj_data_prep.sh (you can look at local/cstr_wsj_data_prep.sh which is one such specialized script for a different directory structure).

-Arnab

On Tue, Oct 30, 2012 at 1:41 PM, Shi Hu <fin...@gm...> wrote:
> Hello Kaldi group
>
> I am a student from Stanford University. We are doing a project with Kaldi
> and we want to run WSJ data to match the benchmark. The WSJ catalog version
> we have is LDC93S6A. The files in that catalog match directory name pattern
> ??_{?,??}_??.dir. However I could not run this as
> kaldi-stable/egs/wsj/s3/local/wsj_data_prep.sh line 78 requires directory
> names having pattern ??-{?,??}.? such as 11-13.1 and 13-34.1. We don't have
> those files, but is there an alternative way to run it?
>
> Thanks,
> Shi
|
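As a first step toward a site-specific data preparation script, the index files named above can be located with find. The following is a minimal sketch: the helper name find_wsj_indexes and the root directory argument are hypothetical, and the search assumes the file names appear in lower case exactly as written in the message.

```shell
#!/bin/sh
# Hypothetical helper: print the location of each index file mentioned in
# the thread under the directory $1, and report any that cannot be found.
find_wsj_indexes() {
  root=$1
  missing=0
  for ndx in tr_s_wv1.ndx si_tr_s.ndx si_et_20.ndx si_et_05.ndx; do
    hits=$(find "$root" -type f -name "$ndx")
    if [ -n "$hits" ]; then
      printf '%s\n' "$hits"
    else
      echo "missing: $ndx" >&2
      missing=1
    fi
  done
  # Non-zero exit status if at least one index file was not found.
  return "$missing"
}

# Assumed usage (path is a placeholder):
#   find_wsj_indexes /path/to/LDC93S6A
```

The paths printed by such a search show how the local extraction is laid out, which is the information a modified copy of local/wsj_data_prep.sh would need.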
From: Shi Hu <fin...@gm...> - 2012-10-30 08:11:17
|
Hello Kaldi group

I am a student from Stanford University. We are doing a project with Kaldi and we want to run WSJ data to match the benchmark. The WSJ catalog version we have is LDC93S6A. The files in that catalog match directory name pattern ??_{?,??}_??.dir. However I could not run this as kaldi-stable/egs/wsj/s3/local/wsj_data_prep.sh line 78 requires directory names having pattern ??-{?,??}.? such as 11-13.1 and 13-34.1. We don't have those files, but is there an alternative way to run it?

Thanks,
Shi
|
From: Daniel P. <dp...@gm...> - 2012-10-22 14:18:38
|
The executable issue should be fixed now.

Dan

On Mon, Oct 22, 2012 at 9:17 AM, Daniel Povey <dp...@gm...> wrote:
> I'll look into this and try to fix it.
> There should be a RESULTS file which says what the expected results are
> and how to see the error rates.
> Dan
|
From: Daniel P. <dp...@gm...> - 2012-10-22 13:17:40
|
I'll look into this and try to fix it.
There should be a RESULTS file which says what the expected results are and how to see the error rates.

Dan

On Mon, Oct 22, 2012 at 7:11 AM, Nagendra Goel <nag...@go...> wrote:
> I have not run the example myself but your steps look correct.
|
From: Nagendra G. <nag...@go...> - 2012-10-22 12:14:51
|
I have not run the example myself but your steps look correct.

On Oct 22, 2012 4:31 AM, "Don McCoy" <do...@gm...> wrote:
> I'm looking into Kaldi for research purposes and decided to run the yesno
> example first. I had a couple of problems related to permissions and
> missing links. I was able to get this to run without errors with the
> following commands:
>
> cd egs/yesno/s3
> chmod +x local/*.pl
>
> ln -s ../../wsj/s3/scripts scripts
> ln -s ../../wsj/s3/steps steps
>
> I'm running Ubuntu 11.10 (32-bit) and I originally tried this with the
> stable branch, but switched to trunk to see if this had been fixed but got
> the same errors. The above worked on trunk but would probably work on
> stable as well. Can someone confirm the above changes are correct?
>
> Also, how do I tell if the example ran correctly? Is there documentation
> describing the expected output somewhere?
>
> Regards,
> Don M
|
From: Don M. <do...@gm...> - 2012-10-21 18:43:17
|
I'm looking into Kaldi for research purposes and decided to run the yesno example first. I had a couple of problems related to permissions and missing links. I was able to get this to run without errors with the following commands:

    cd egs/yesno/s3
    chmod +x local/*.pl

    ln -s ../../wsj/s3/scripts scripts
    ln -s ../../wsj/s3/steps steps

I'm running Ubuntu 11.10 (32-bit) and I originally tried this with the stable branch, but switched to trunk to see if this had been fixed but got the same errors. The above worked on trunk but would probably work on stable as well. Can someone confirm the above changes are correct?

Also, how do I tell if the example ran correctly? Is there documentation describing the expected output somewhere?

Regards,
Don M
|
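For anyone repeating Don's workaround, here is a minimal sketch that wraps the same chmod and ln -s steps in a function that can be re-run without failing on links that already exist. The function name `fix_yesno_links` and the idea of passing the checkout root as an argument are assumptions for illustration; only the paths come from the message above.

```shell
#!/bin/sh
# Sketch only: fix_yesno_links is a hypothetical helper, not part of Kaldi.
# Usage: fix_yesno_links /path/to/kaldi/checkout
fix_yesno_links() {
    root=$1
    dir="$root/egs/yesno/s3"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }

    # Make the local Perl helpers executable (skipped if none exist).
    for f in "$dir"/local/*.pl; do
        [ -e "$f" ] && chmod +x "$f"
    done

    # Link the shared WSJ directories unless something is already in place.
    for name in scripts steps; do
        if [ ! -e "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
            ln -s "../../wsj/s3/$name" "$dir/$name"
        fi
    done
}
```

The link targets are relative, exactly as in the original commands, so they resolve from inside egs/yesno/s3 regardless of where the checkout lives.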
From: Daniel P. <dp...@gm...> - 2012-10-14 16:22:10
|
I fixed this, but please check that it works ASAP-- I didn't test it.

Dan

On Fri, Oct 12, 2012 at 12:50 PM, Arnab Ghoshal <ar...@gm...> wrote:
> Hi all,
>
> We found that the code for reading HTK features fails (it silently
> produces wrong output) when the features are compressed (_C in
> ParmKind) or have a CRC checksum (_K in ParmKind), and I guess also for
> discrete observations. I don't have time to fix this right now, so
> I put a TODO comment before the relevant function in case someone gets
> to this sooner.
>
> -Arnab
|
From: Arnab G. <ar...@gm...> - 2012-10-12 16:50:31
|
Hi all,

We found that the code for reading HTK features fails (it silently produces wrong output) when the features are compressed (_C in ParmKind) or have a CRC checksum (_K in ParmKind), and I guess also for discrete observations. I don't have time to fix this right now, so I put a TODO comment before the relevant function in case someone gets to this sooner.

-Arnab
|
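Until the reader is confirmed fixed, one way to spot affected files is to inspect the ParmKind field directly. The sketch below is not Kaldi code: `htk_parmkind_flags` is a hypothetical helper, and it assumes the standard 12-byte big-endian HTK header with the qualifier bits as documented for HTK (_C = 002000 octal, _K = 010000 octal, base kind 10 = DISCRETE).

```shell
#!/bin/sh
# Sketch only: htk_parmkind_flags is a hypothetical helper, not Kaldi code.
# Assumes a 12-byte big-endian HTK header in which bytes 10-11 (0-based)
# hold the 16-bit parmKind field.
htk_parmkind_flags() {
    # Read the two parmKind bytes as unsigned decimal values.
    set -- $(od -An -tu1 -j10 -N2 "$1")
    [ $# -eq 2 ] || { echo "header too short" >&2; return 1; }
    kind=$(( $1 * 256 + $2 ))

    flags=""
    # 1024 is 002000 octal (_C, compressed).
    [ $(( kind & 1024 )) -ne 0 ] && flags="$flags _C"
    # 4096 is 010000 octal (_K, CRC checksum appended).
    [ $(( kind & 4096 )) -ne 0 ] && flags="$flags _K"
    # The base kind is the low 6 bits (mask 63); 10 means DISCRETE.
    [ $(( kind & 63 )) -eq 10 ] && flags="$flags DISCRETE"
    echo "${flags# }"
}
```

An empty result means none of the three cases from the message applies to that file; anything else names the qualifiers that the reader was reported to mishandle.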
From: Daniel P. <dp...@gm...> - 2012-09-18 18:21:46
|
I doubt you can get cygwin to work in 64 bit. The issue here goes back to a limitation in the C++ standard library that seekg() takes an argument of size_t, and we can't work around it in Kaldi at the code level. Probably the easiest workaround is to use a larger number of jobs when doing the MFCC feature extraction, to avoid any given archive getting larger than 4GB.

Dan

On Sat, Sep 15, 2012 at 4:08 PM, Lee Baker <lb...@nc...> wrote:
> I am a student trying to come up to speed with both speech recognition
> and HMMs and the use of Kaldi.
>
> I was trying to run through the Kaldi tutorial using the LDC Switchboard data.
>
> When running
> steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir
> the step
> compute-cmvn-stats --spk2utt=ark:data/train/spk2utt scp:data/train/feats.scp ark,scp:/mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/cmvn_train.ark,/mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/cmvn_train.scp
>
> fails with the following message
> ERROR (compute-cmvn-stats:SplitFilename():kaldi-io.cc:504) Cannot get
> offset from filename
> /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/raw_mfcc_train.1.ark:4295001228
> (possibly you compiled in 32-bit and have a >32-bit byte offset into a
> file; you'll have to compile 64-bit.
> ERROR (compute-cmvn-stats:SplitFilename():kaldi-io.cc:504) Cannot get
> offset from filename
> /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/raw_mfcc_train.1.ark:4295001228
> (possibly you compiled in 32-bit and have a >32-bit byte offset into a
> file; you'll have to compile 64-bit.
>
> In digging through the install scripts, I couldn't see any references to
> compiling with 64-bit.
>
> I am running under cygwin.
> So
> 1) is there an FAQ that would cover these sorts of issues
> 2) could I get some guidance on how to get around this issue
>
> --
> Regards
> Lee Baker
|
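After re-running feature extraction with more jobs, it may help to confirm that no archive offset is still too large for a 32-bit build. The sketch below is a hypothetical helper, not a Kaldi tool. It assumes each line of the .scp file has the form "<utt-id> <archive-path>:<byte-offset>", as in the error message above, and that the limit is 2^32 - 1 = 4294967295, which the reported offset 4295001228 exceeds by 33933 bytes.

```shell
#!/bin/sh
# Sketch only: list_large_offsets is a hypothetical helper, not a Kaldi tool.
# Assumes .scp lines of the form "<utt-id> <archive-path>:<byte-offset>" and
# that a 32-bit build cannot address offsets above 2^32 - 1.
list_large_offsets() {
    awk '{
        n = split($2, parts, ":")    # the offset follows the last colon
        off = parts[n] + 0           # numeric copy used only for comparison
        if (n > 1 && off > 4294967295)
            print $1, parts[n]       # print the original digits unchanged
    }' "$1"
}
```

Calling `list_large_offsets data/train/feats.scp` should print nothing once every archive stays under the limit; any line it does print names an utterance whose features a 32-bit binary would fail to seek to.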
From: Lee B. <lb...@nc...> - 2012-09-15 20:09:06
|
I am a student trying to come up to speed with both speech recognition and HMMs and the use of Kaldi.

I was trying to run through the Kaldi tutorial using the LDC Switchboard data.

When running

    steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir

the step

    compute-cmvn-stats --spk2utt=ark:data/train/spk2utt scp:data/train/feats.scp ark,scp:/mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/cmvn_train.ark,/mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/cmvn_train.scp

fails with the following message

    ERROR (compute-cmvn-stats:SplitFilename():kaldi-io.cc:504) Cannot get offset from filename /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/raw_mfcc_train.1.ark:4295001228 (possibly you compiled in 32-bit and have a >32-bit byte offset into a file; you'll have to compile 64-bit.
    ERROR (compute-cmvn-stats:SplitFilename():kaldi-io.cc:504) Cannot get offset from filename /mnt/kaldi/egs/swbd_experiment/s5/mfcc_dir/raw_mfcc_train.1.ark:4295001228 (possibly you compiled in 32-bit and have a >32-bit byte offset into a file; you'll have to compile 64-bit.

In digging through the install scripts, I couldn't see any references to compiling with 64-bit.

I am running under cygwin.

So
1) is there an FAQ that would cover these sorts of issues
2) could I get some guidance on how to get around this issue

--
Regards
Lee Baker
|
From: Daniel P. <dp...@gm...> - 2012-09-02 15:49:40
|
I notice it says

    Not creating raw N-gram counts ngrams.gz and heldout_ngrams.gz since they already exist in /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/lm/biphone

I'm concerned that one of these files might be empty [after unzipping], as one of the programs is reporting zero words. Perhaps if you rm -r /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/lm and then do it again it might work-- perhaps you created those files at a time when your training data didn't exist or something like that.

Also there is nothing special about the kaldi_lm toolkit. If you use SRILM it will also produce an ARPA-format LM that Kaldi can use.

Dan

On Sat, Sep 1, 2012 at 10:20 AM, xinglong gao <gao...@gm...> wrote:
> Hello,
> Thank you very much in advance. I am using the TIMIT database for training
> and testing, and I have prepared the basic data for LM training:
> phones.txt, lexicon.txt and train_trans.txt, attached to this email.
> When I use kaldi_lm to train a biphone LM, something goes wrong, as shown
> below.
>
> This is the detailed log:
>
> Not installing the kaldi_lm toolkit since it is already there.
> Creating phones file, and monophone lexicon (mapping phones to itself).
> Creating biphone model
> Training biphone language model in folder
> /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/lm
> Creating directory
> /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/lm/biphone
> Not creating raw N-gram counts ngrams.gz and heldout_ngrams.gz since they
> already exist in /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/lm/biphone
> (remove them if you want them regenerated)
> Iteration 1/7 of optimizing discounting parameters
> discount_ngrams: for n-gram order 1, D=0.400000, tau=0.675000 phi=2.000000
> discount_ngrams: for n-gram order 2, D=0.600000, tau=0.675000 phi=2.000000
> discount_ngrams: for n-gram order 3, D=0.800000, tau=0.825000 phi=2.000000
> interpolate_ngrams: 60 words in wordslist
> Perplexity over 0.000000 words is nan
> Perplexity over 0.000000 words (excluding 0.000000 OOVs) is nan
>
> real 0m0.017s
> user 0m0.000s
> sys 0m0.024s
> discount_ngrams: for n-gram order 1, D=0.400000, tau=0.900000 phi=2.000000
> discount_ngrams: for n-gram order 2, D=0.600000, tau=0.900000 phi=2.000000
> discount_ngrams: for n-gram order 3, D=0.800000, tau=1.100000 phi=2.000000
> interpolate_ngrams: 60 words in wordslist
> Perplexity over 0.000000 words is nan
> Perplexity over 0.000000 words (excluding 0.000000 OOVs) is nan
> discount_ngrams: for n-gram order 1, D=0.400000, tau=1.215000 phi=2.000000
> discount_ngrams: for n-gram order 2, D=0.600000, tau=1.215000 phi=2.000000
> discount_ngrams: for n-gram order 3, D=0.800000, tau=1.485000 phi=2.000000
>
> real 0m0.019s
> user 0m0.000s
> sys 0m0.032s
> interpolate_ngrams: 60 words in wordslist
> Perplexity over 0.000000 words is nan
> Perplexity over 0.000000 words (excluding 0.000000 OOVs) is nan
>
> real 0m0.016s
> user 0m0.008s
> sys 0m0.020s
> Bad perplexities . at
> /data/gaoxinglong/kaldi/trunk/egs/timit/s3/local/kaldi_lm/optimize_alpha.pl line 30.
>
> I checked the perplexity and its value is "nan". I don't know what
> happened. I think the word_map may be wrong; is that true?
>
> Thanks,
> best regards!
>
> Xinglong Gao
|
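Before deleting the whole lm directory, Dan's suspicion that one of the count files is empty after unzipping can be checked directly. This is a rough sketch with hypothetical helper names (`gz_payload_bytes`, `report_empty_gz`); neither is part of kaldi_lm. It assumes gzip is on the PATH, and a file that is missing or unreadable is also reported as empty.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of kaldi_lm.
# Print the uncompressed size in bytes of a gzip file.
gz_payload_bytes() {
    gzip -dc "$1" 2>/dev/null | wc -c | tr -d ' '
}

# Name every argument that decompresses to zero bytes.
report_empty_gz() {
    for f in "$@"; do
        if [ "$(gz_payload_bytes "$f")" -eq 0 ]; then
            echo "empty: $f"
        fi
    done
}
```

With the paths from the log, the call would be `report_empty_gz local/lm/biphone/ngrams.gz local/lm/biphone/heldout_ngrams.gz`, run from egs/timit/s3. Any file it names is a candidate for removal so that the counts are regenerated.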