From: Daniel P. <dp...@gm...> - 2015-07-11 18:07:09
|
Yes, please normalize to UNIX.

Dan |
From: Kirill K. <kir...@sm...> - 2015-07-11 07:37:41
|
These 8 files in egs/ have mixed Unix/DOS line endings:

egs/babel/s5b/babel.html
egs/babel/s5c/babel.html
egs/gale_arabic/s5/conf/decode_dnn.config
egs/gale_arabic/s5/conf/fbank.conf
egs/gale_arabic/s5/conf/mfcc.conf
egs/gale_mandarin/s5/conf/decode_dnn.config
egs/gale_mandarin/s5/conf/fbank.conf
egs/gale_mandarin/s5/conf/mfcc.conf

I would like to normalize line endings to Unix style in these. Any objections? Git is going crazy on me as I go across systems.

-kkm |
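The normalization proposed above could be sketched roughly as follows. This is only an illustration, not the change that was actually committed: the function names are hypothetical, and it assumes that "normalize to Unix style" means replacing every CRLF pair with a single LF.

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Replace DOS (CRLF) line endings with Unix (LF) ones."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path: str) -> bool:
    """Rewrite the file at `path` in place; return True if it was changed."""
    # Binary mode keeps Python from translating line endings on its own.
    with open(path, "rb") as f:
        original = f.read()
    converted = normalize_line_endings(original)
    if converted == original:
        return False
    with open(path, "wb") as f:
        f.write(converted)
    return True
```

Calling `normalize_file` on each of the eight paths listed above would leave files that already use LF untouched, so running it twice does no harm.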
From: Kirill K. <kir...@sm...> - 2015-07-08 03:10:58
|
I cannot find a machine w/o .NET 3.5 where the program would fail, but I understand why this happens. Adding this under the name NewGuidCmd.exe.config will *likely* fix it meanwhile:

<configuration>
  <startup>
    <supportedRuntime version="v2.0.50727"/>
    <supportedRuntime version="v4.0"/>
  </startup>
</configuration>

Do you want such a quick fix? Obviously, any native code uuidgen will be many times faster, as CLR startup time is generally horrendous.

-kkm |
From: Jan T. <jt...@gm...> - 2015-07-08 00:54:38
|
Just a note to the list: we were able to resolve Mirko's troubles. The root cause was that he didn't have .NET 3.5 installed, and for some reason the binary we use for UUID generation (NewGuidCmd.exe) does not run without it. The solution was to generate the GUIDs using other tools: both cygwin and the Windows SDK (which is always installed when Visual Studio is installed) provide uuidgen, which seems to be a drop-in replacement. In the near future, I will look into how to avoid relying on a binary.

y. |
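As an illustration of generating a project GUID without NewGuidCmd.exe or uuidgen, here is a minimal sketch using Python's standard `uuid` module. It is not part of the Kaldi scripts; the function name is hypothetical, and the braced upper-case form is an assumption based on how GUIDs commonly appear in .vcxproj files.

```python
import uuid


def new_project_guid() -> str:
    """Return a random GUID in braced form, e.g. {XXXXXXXX-XXXX-...}."""
    # uuid4() yields a random UUID; str() gives the 36-character
    # hyphenated form, which is upper-cased and wrapped in braces here.
    return "{" + str(uuid.uuid4()).upper() + "}"
```

A string of this shape is 38 characters long, so an expression that takes 8 characters starting at index 1 always has enough input to work with.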
From: Daniel P. <dp...@gm...> - 2015-07-07 19:28:52
|
Everyone, I am adding a prototype glossary page here: http://kaldi.sourceforge.net/glossary.html Something that would be helpful is if people could help fill out the content, or at least come up with candidate terms that should be included. Send patches before committing, please. Dan ---------- Forwarded message ---------- From: Kaldi SVN repository <no...@co...> Date: Tue, Jul 7, 2015 at 12:13 PM Subject: [kaldi:code] New commit by danielpovey To: Kaldi SVN repository <no...@co...> Various minor fixes, plus adding (prototype) glossary page. By danielpovey on 07/07/2015 19:13 View Changes ________________________________ Sent from sourceforge.net because you indicated interest in https://sourceforge.net/p/kaldi/code/ To unsubscribe from further messages, please visit https://sourceforge.net/auth/subscriptions/ |
From: Kirill K. <kir...@sm...> - 2015-07-07 19:22:12
|
+kaldi-developers

.NET should not be a dependency. This is weird.

“When making the solution file, Windows 8.1 is complaining that the .NET3.5 feature is not installed” -- Can you explain the exact problem you are seeing?

C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\Microsoft.BuildSteps.Targets(182,5): The expression ""{}".Substring(1, 8)" cannot be evaluated. Index and length must refer to a location within the string. Parameter name: length

If you look at the .target file in question, the expression that fails is obviously $(ProjectGuid.Substring(1,8)). Somehow the project guid became just a "{}". It is expected to be either undefined or a real GUID string. If you look at the project file, there will be a property like <ProjectGuid>{}</ProjectGuid>. This is perhaps a problem in the perl script that generates the projects. This would be easy to fix for a single project, but the 40%... The perl script is likely to be fixed.

I have a set of unweird MSBuild-based build scripts, but this is a bit away from getting into the main repository.

-kkm

From: Mirko Hannemann [mailto:mir...@gm...]
Sent: 2015-07-07 0632
To: Jan Trmal; Kirill Katsnelson
Subject: Kaldi for Windows

Dear Jan and Kirill,

first of all thank you for your great efforts to make Kaldi compile with Visual Studio!

I tried to replicate the steps that are described in the windows/ directory, and was able to follow every step, including compiling OpenFst.

When making the solution file, Windows 8.1 is complaining that the .NET3.5 feature is not installed. I made several attempts and followed instructions from the internet, but the problem seems common and quite tricky.

So I thought I could still generate the solution files and load the projects in Visual Studio 2013. For about 60% of the projects this was successful; for 40% it failed with the following build/load errors (just two examples):

C:\cygwin64\home\Mirko\kaldi-win\kaldiwin_vs12_auto\kaldiwin\align-compiled-mapped\align-compiled-mapped.vcxproj : error : Unable to read the project file "align-compiled-mapped.vcxproj".
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\Microsoft.BuildSteps.Targets(182,5): The expression ""{}".Substring(1, 8)" cannot be evaluated. Index and length must refer to a location within the string. Parameter name: length

C:\cygwin64\home\Mirko\kaldi-win\kaldiwin_vs12_auto\kaldiwin\align-equal-compiled\align-equal-compiled.vcxproj : error : Unable to read the project file "align-equal-compiled.vcxproj".
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\Microsoft.BuildSteps.Targets(182,5): The expression ""{}".Substring(1, 8)" cannot be evaluated. Index and length must refer to a location within the string. Parameter name: length

So it seems that some of the generated solution files are corrupt.

Apart from that, is it really necessary to use .NET3.5, and no newer version like 4.0? For what purpose is the .NET used in Kaldi anyway?

Again, thank you very much for your efforts and best regards,
Mirko Hannemann |
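The failure described above comes from a `<ProjectGuid>{}</ProjectGuid>` property: the string "{}" has only two characters, so taking 8 characters from index 1 is out of range. A small check for that symptom might look like the sketch below. It is an illustration only, the names are hypothetical, and it assumes the property is written on a single line as shown in the message.

```python
import re

# Matches a ProjectGuid element whose value is just an empty pair of braces.
EMPTY_GUID = re.compile(r"<ProjectGuid>\s*\{\s*\}\s*</ProjectGuid>")


def has_empty_project_guid(vcxproj_text: str) -> bool:
    """Return True if the project text contains <ProjectGuid>{}</ProjectGuid>."""
    return EMPTY_GUID.search(vcxproj_text) is not None
```

Running this over the text of every generated .vcxproj file would single out the projects that Visual Studio refuses to load for this reason.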
From: Jan T. <jt...@gm...> - 2015-07-01 21:45:51
|
All right, unless someone objects by, say, tomorrow ~12 GMT, I will modify the scripts to respect this.

y. |
From: Daniel P. <dp...@gm...> - 2015-07-01 18:05:19
|
Those data are copyrighted and we aren't allowed to give them to you- you need to pay the LDC. But other setups such as Librispeech and Voxforge are free. |
From: Korbinian R. <kor...@gm...> - 2015-07-01 16:59:49
|
Hi all,

I think that
- the scoring script should report an error if it fails (and thus propagate it to the main script). After all, it implies/signals that something went wrong along the recipe.
- the decoding scripts should consistently honor/support the --skip-scoring option, but not do a `set -e` (which should be a deliberate user choice, e.g. as from the main recipe script). The --skip-scoring is used frequently in any sort of multi-stage decoding workflow.

Thanks for soliciting feedback on this :-)
Korbinian. |
From: Daniel P. <dp...@gm...> - 2015-07-01 16:30:07
|
Hm.
I notice that all the decoding scripts are like this, they don't check the return status.
I'm not sure what to do about this; cc'ing kaldi-developers to get more opinions.
There might be some setups where people want to decode but they haven't finished the scoring scripts, but on the other hand it would be more consistent to check the return status.

Dan

On Wed, Jul 1, 2015 at 8:47 AM, Jan Trmal <jt...@gm...> wrote:
> I think it should fail, preferably using "|| exit 1" as the wsj scripts are written in that way. If the user doesn't want the decode* scripts to fail because of (expected) failure in scoring he/she can call the decode* scripts with "--skip-scoring true". But I think we should wait for Dan to decide.
>
> BTW, I'm suggesting "|| exit 1" also because I tried using "set -e" a while ago and it definitely didn't work out of the box.
>
> y.
>
> On Wed, Jul 1, 2015 at 7:41 AM, Karel Veselý <ve...@gm...> wrote:
>>
>> Hi Dan, Yenda,
>> as I am polishing and running the AMI recipes for the workshop, I've noticed one thing.
>> If I decode with 'steps/decode.sh' and the scoring fails, the script 'steps/decode.sh' still finishes with error code 0.
>>
>> My question is, is this a bug or a feature? :)
>>
>> I would naturally expect that the error gets propagated, so that it stops the master script.
>>
>> I.e. the change would be:
>> => Option 1: adding '|| exit 1' in steps/decode.sh: "local/score.sh --cmd "$cmd" $scoring_opts $data $graphdir $dir || exit 1"
>> => Option 2: or, by adding 'set -e' somewhere into steps/decode.sh
>>
>> The scoring error is ignored in almost all the scripts: steps/decode*.sh
>>
>> Can you please comment?
>> Cheers!
>> Karel. |
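The behaviour under discussion can be modelled outside the shell. The sketch below is not Kaldi code: `decode`, `run_step`, and both child commands are hypothetical stand-ins. It only illustrates the difference between ignoring a failed scoring step and propagating its status (the effect of `|| exit 1`), with a skip-scoring switch as an escape hatch.

```python
import subprocess
import sys


def run_step(code: str) -> int:
    """Run a child Python process and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def decode(skip_scoring: bool = False, check_scoring: bool = True) -> int:
    """Stand-in for a decode script whose scoring step always fails."""
    status = run_step("import sys; sys.exit(0)")  # hypothetical decoding step
    if status != 0:
        return status
    if skip_scoring:
        return 0  # scoring never runs, so it cannot fail
    status = run_step("import sys; sys.exit(1)")  # hypothetical failing scorer
    if status != 0 and check_scoring:
        return 1  # propagate the failure, like `|| exit 1`
    return 0  # failure silently ignored, the behaviour Karel reported
```

With `check_scoring=False` the function returns 0 even though scoring failed, which is the situation described in the original question; with the default it returns 1, so a calling script can stop.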
From: 徐俊峰 <xuj...@16...> - 2015-07-01 06:37:29
|
Dear kaldi-developers:

Hello everybody, please let me introduce myself first. I'm a postgraduate at XIDIAN UNIVERSITY, and I am interested in ASR. I am doing some experiments on DNN adaptation based on i-vectors using kaldi, but I am confused about the experiment resources, and I would greatly appreciate your help.

I downloaded the kaldi tools from the kaldi official website, but in the dir "../kaldi-trunk/egs/wsj/s5/" there are just some scripts, with no data and not even a training set list. I have got the WSJ corpora from others. The training set is about 76 hours, but it may be incomplete. Would you send me the WSJ training set list, so that I have a chance to check my WSJ training set? The WSJ set in the kaldi scripts is "wsj0=/export/corpora5/LDC/LDC93S6B wsj1=/export/corpora5/LDC/LDC94S13B".

For consistency with the kaldi experiment setup, would you send me the test set and nist_lm? "test_eval92 test_eval93 test_dev93 test_eval92_5k test_eval93_5k test_dev93_5k dev_dt_05 dev_dt_20" are for testing, and the language model is "nist_lm".

Waiting for your reply. Thank you very much!

Xu |
From: Jan T. <jt...@gm...> - 2015-06-30 12:55:53
|
Thanks for noticing this, Christine. I fixed it.

y. |
From: Christine D. <chr...@re...> - 2015-06-30 12:32:26
|
Hi,

Does the arpa2fst source code really belong in the .gitignore file?

/src/bin/arpa2fst.cc

The binary is included on a different line (makes sense.)

Thanks,
Christine |
From: Charles C. <cha...@nv...> - 2015-06-23 14:53:51
|
Both HTK and SRILM toolkits document the ARPA file format (credit to Doug Paul of MIT). See for example the following documentation links:

SRILM: http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
HTK: http://www1.icsi.berkeley.edu/Speech/docs/HTKBook3.2/node213_mn.html

It probably dates back at least as far as the original WSJ Corpus (early 1990's) -- e.g., "The Design for the Wall Street Journal-based CSR Corpus" by Doug Paul and Janet Baker (of Dragon fame) published in "HLT '91 Proceedings of the workshop on Speech and Natural Language".

Perhaps there is a history buff on this mailing list who can provide the definitive answer...

-Charles

-----Original Message-----
From: Daniel Povey [mailto:dp...@gm...]
Sent: Monday, June 22, 2015 1:57 PM
To: Kirill Katsnelson; Guoguo Chen
Cc: kal...@li...
Subject: Re: [Kaldi-developers] spaces in ngram declarations in the \data\ section

I don't know if there is a formal definition of the ARPA format; things like this come up occasionally.
The easiest thing is to just allow the format
ngram= 12344
as well as
ngram = 12345,
and also print a warning for any lines after the \data\ marker that are not interpretable.
Guoguo, could you do this?

Dan

On Mon, Jun 22, 2015 at 3:45 PM, Kirill Katsnelson <kir...@sm...> wrote:
> Some LM files have spaces in ngram declarations in the \data\ section:
>
> \data\
> ngram 1=150000
> ngram 2= 9774628
> ngram 3= 44845299
>
> \1-grams:
> -7.89095 <s> -2.06214
> -2.92635 don't -1.85988
>
> arpa-to-const-arpa does not like them in a peculiar way. Namely, it bombs out on the 1st unigram with a backoff weight, because it decided 1 is the final order. Looking at the code, the library code in src/lm/const-arpa-lm.cc does not expect any space except between "ngram" and the rest of line, silently skipping any lines that begin with "ngram" and tokenized on space into less or more than 2 tokens. See line 316
>   if (keyword_found && col.size() == 2 && col[0] == "ngram") {
> -- not even a warning if "col.size() == 2" is false.
>
> Are these spaces legit? Should the tool be fixed, or the grammar? I never saw a formal spec of ARPA LM in my life.
>
> -kkm |
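The tolerant parsing suggested in this thread (accept optional spaces around "=", and warn about lines in the \data\ section that cannot be interpreted) can be sketched as below. This is not the fix that went into const-arpa-lm.cc; it is a standalone illustration, and the function name and return shape are assumptions.

```python
import re

# Accepts "ngram 1=150000", "ngram 2= 9774628", "ngram 3 = 44845299", etc.
NGRAM_DECL = re.compile(r"^ngram\s+(\d+)\s*=\s*(\d+)$")


def parse_data_section(lines):
    """Return ({order: count}, [uninterpretable lines]) for the data section."""
    counts = {}
    warnings = []
    in_data = False
    for raw in lines:
        line = raw.strip()
        if not in_data:
            # Skip everything before the \data\ marker.
            in_data = (line == "\\data\\")
            continue
        if not line:
            continue
        if line.startswith("\\"):
            break  # next section, e.g. \1-grams:
        match = NGRAM_DECL.match(line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
        else:
            warnings.append(line)  # report instead of silently skipping
    return counts, warnings
```

Fed the header quoted above, this yields counts for orders 1, 2, and 3, so the highest order is 3 rather than 1. Collecting the unrecognized lines leaves it to the caller to decide whether to print a warning or abort.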
From: Kirill K. <kir...@sm...> - 2015-06-23 03:37:14
|
Yes, it's been fixed. Thanks again Guoguo!

-kkm |
From: Guoguo C. <che...@gm...> - 2015-06-23 02:20:28
|
I checked in a fix for this. Kirill, could you have a try and see if this fixes your problem? Guoguo |
From: Daniel P. <dp...@gm...> - 2015-06-22 19:56:48
|
I don't know if there is a formal definition of the ARPA format; things like this come up occasionally. The easiest thing is to just allow the format ngram= 12344 as well as ngram = 12345, and also print a warning for any lines after the \data\ marker that are not interpretable. Guoguo, could you do this? Dan |
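The approach suggested above (accept optional spaces around "=", warn about anything else in the \data\ section) can be sketched as follows. This is only an illustration in Python, not the fix that was checked into Kaldi; the function name parse_data_section and the regular expression are assumptions made for this example.

```python
import re
import warnings

# Hypothetical pattern: accepts "ngram 2=9774628" and also tolerates spaces
# around "=", e.g. "ngram 2= 9774628" or "ngram 2 = 9774628".
NGRAM_DECL = re.compile(r"^ngram\s+(\d+)\s*=\s*(\d+)$")

def parse_data_section(lines):
    """Return {order: count} read from the \\data\\ section of ARPA-style lines."""
    counts = {}
    in_data = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            in_data = True
            continue
        if not in_data or not line:
            continue
        if line.startswith("\\"):  # next section marker, e.g. "\1-grams:"
            break
        m = NGRAM_DECL.match(line)
        if m:
            counts[int(m.group(1))] = int(m.group(2))
        else:
            # Warn instead of silently skipping the line.
            warnings.warn(f"uninterpretable line in \\data\\ section: {line!r}")
    return counts
```

With the declarations from the report, all three orders are picked up, so the highest order is 3 rather than 1.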
From: Kirill K. <kir...@sm...> - 2015-06-22 19:46:01
|
Some LM files have spaces in ngram declarations in the \data\ section:

\data\
ngram 1=150000
ngram 2= 9774628
ngram 3= 44845299

\1-grams:
-7.89095 <s> -2.06214
-2.92635 don't -1.85988

arpa-to-const-arpa does not like them in a peculiar way. Namely, it bombs out on the 1st unigram with a backoff weight, because it decided 1 is the final order. Looking at the code, the library code in src/lm/const-arpa-lm.cc does not expect any space except between "ngram" and the rest of the line, silently skipping any lines that begin with "ngram" and tokenize on spaces into fewer or more than 2 tokens. See line 316:

if (keyword_found && col.size() == 2 && col[0] == "ngram") {

There is not even a warning if "col.size() == 2" is false.

Are these spaces legit? Should the tool be fixed, or the grammar? I never saw a formal spec of ARPA LM in my life.

-kkm |
From: Kirill K. <kir...@sm...> - 2015-06-22 18:43:48
|
Another of my test LMs got "<s> <s>" with a few spaces in between, which let it slip by the usual grep -v '<s> <s>' guard, but got caught by arpa2fst now. Probably saved me another few troubleshooting hours, thanks a lot! -kkm |
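The reason the fixed-string guard misses such lines is that it only matches exactly one space between the two tokens. A whitespace-tolerant filter could look like the sketch below; the names DOUBLE_BOS and drop_double_bos are made up for this example, and it is not how arpa2fst performs its check.

```python
import re

# One or more whitespace characters (spaces or tabs) between the two
# start-of-sentence tokens, instead of exactly one space.
DOUBLE_BOS = re.compile(r"<s>\s+<s>")

def drop_double_bos(lines):
    """Keep only the lines that do not contain "<s>" followed by another "<s>"."""
    return [ln for ln in lines if not DOUBLE_BOS.search(ln)]
```

On the shell side, roughly the same effect should be achievable with grep -v -E '<s>[[:space:]]+<s>'.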
From: Daniel P. <dp...@gm...> - 2015-06-17 19:33:54
|
You need to understand the basics of speech recognition first. Read this http://mi.eng.cam.ac.uk/~mjfg/mjfg_NOW.pdf and also the data-preparation section of the Kaldi documentation. Dan |
From: Ananya G. <ana...@gm...> - 2015-06-17 09:18:08
|
Sir, I have been trying to use Kaldi to develop a speech recognizer. I used the YesNo example as a starting point, but I am unable to extend it to multiple words. Kindly let me know of any alternative if this isn't possible, or direct me to a suitable link. Any help will be appreciated. Thank You Ananya Goel |
From: Jan T. <jt...@gm...> - 2015-06-04 18:11:07
|
Paul, if I understand the question, the speech toolkits take the digital output of an ADC as input, i.e. level-sampled (PCM) audio (not a differential encoding such as DSD in SACD). For speech recognition purposes, formats with linear 16-bit samples at ~16 kHz are usually used. 12 bits can be fine, as A-law/mu-law audio is often used as well, which after de-companding has a dynamic range of 12 bits. y. |
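As an illustration of the point about sample formats, 12-bit ADC readings can be rescaled to linear 16-bit PCM before being handed to a toolkit. The sketch below assumes the ADC delivers unsigned offset-binary codes (0..4095 with mid-scale at 2048); that is a guess about the hardware, not something stated in the thread, and the function names are made up for this example.

```python
import struct

def adc12_to_pcm16(samples):
    """Map unsigned 12-bit ADC codes (0..4095) to signed 16-bit PCM values."""
    out = []
    for s in samples:
        if not 0 <= s <= 4095:
            raise ValueError(f"not a 12-bit code: {s}")
        # Remove the assumed mid-scale offset, then scale by 16 (shift left 4 bits).
        out.append((s - 2048) << 4)
    return out

def pack_pcm16le(pcm):
    """Pack signed 16-bit values as little-endian bytes (raw PCM payload)."""
    return struct.pack(f"<{len(pcm)}h", *pcm)
```

The resulting bytes are headerless PCM; writing a WAV file would additionally need the sampling rate, which the original question does not give.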
From: Gary M. <gar...@gm...> - 2015-06-04 18:09:12
|
You and Gannady should join Sincerely, Gary |
From: Paul R. <pa...@rc...> - 2015-06-04 17:32:10
|
From: Paul Romero <pa...@rc...> To: jul...@li... Subject: Speech Recognition - FFT Recommendation Date: Thu, 04 Jun 2015 09:37:02 -0700 Dear Kaldi Staff: I think most current speech recognition software is FFT based. Please correct me if I am wrong. What kind of input does most of the software require: dB, pascals, voltage, watts, ADC readings, or other? My application acquires sound samples with a microphone and a 12-bit successive ADC. Best Regards, Paul Romero |
From: Niranjan V. <nvi...@gm...> - 2015-06-03 10:37:00
|
Yes, now it is working fine. Thanks a lot!

On 3 June 2015 at 15:59, Tanel Alumäe <tan...@ph...> wrote:
> OK, I see.
>
> Try replacing the line (in the original Makefile):
> EXTRA_LDLIBS += -pthread -lgstbase-1.0 -lgstcontroller-1.0 -lgstreamer-1.0 -lgobject-2.0 -lgmodule-2.0 -lgthread-2.0 -lrt -lglib-2.0
>
> with:
>
> EXTRA_LDLIBS += -pthread -lgstbase-1.0 -lgstcontroller-1.0 -lgmodule-2.0 -lgthread-2.0 -lrt
> EXTRA_LDLIBS += $(shell pkg-config --libs gstreamer-1.0)
> EXTRA_LDLIBS += $(shell pkg-config --libs glib-2.0)
>
> Does it help?
>
> Tanel
>
> On Wed, 2015-06-03 at 14:53 +0530, Niranjan Viladkar wrote:
> > Hello,
> >
> > Thanks for the quick reply. I am using RHEL 6.2.
> > My glib installation is under my user area (/home/niranjan/* ... ).
> >
> > Output of "pkg-config --cflags glib-2.0" =>
> > -I/home/niranjan/kaldi-svn/tools/glib/glib-2.45.2/include/glib-2.0
> > -I/home/niranjan/kaldi-svn/tools/glib/glib-2.45.2/lib/glib-2.0/include
> >
> > Output of "pkg-config --cflags --libs glib-2.0" =>
> > -I/home/niranjan/kaldi-svn/tools/glib/glib-2.45.2/include/glib-2.0
> > -I/home/niranjan/kaldi-svn/tools/glib/glib-2.45.2/lib/glib-2.0/include
> > -L/home/niranjan/kaldi-svn/tools/glib/glib-2.45.2/lib -lglib-2.0
> >
> > I agree that "-lglib-2.0" gets included via EXTRA_LDLIBS.
> >
> > But as my glib installation is local, "-L" information is additional.
> > (I don't have root/sudo access)
> >
> > Secondly, is the change in rule required? (addition of EXTRA_CXXFLAGS)
> >
> > 41 $(LIBFILE): $(OBJFILES)
> > 42 $(CXX) -shared -DPIC -o $(LIBFILE) -Wl,-soname=$(LIBFILE) -Wl,--no-as-needed \
> > 43 -L$(KALDILIBDIR) -Wl,-rpath=$(KALDILIBDIR) $(EXTRA_LDLIBS) $(LDLIBS) $(LDFLAGS) $(EXTRA_CXXFLAGS) \
> > 44 $(OBJFILES)
> >
> > Thanks,
> >
> > Niranjan.
> >
> > On 3 June 2015 at 14:18, Tanel Alumäe <tan...@ph...> wrote:
> > Hello,
> >
> > On my system (Debian testing, amd64), the output of `pkg-config --cflags glib-2.0` is:
> > -I/usr/include/glib-2.0
> > -I/usr/lib/x86_64-linux-gnu/glib-2.0/include
> >
> > The output of `pkg-config --cflags --libs glib-2.0` is:
> > -I/usr/include/glib-2.0
> > -I/usr/lib/x86_64-linux-gnu/glib-2.0/include
> > -lglib-2.0
> >
> > So, the only difference is the "-lglib-2.0" flag, which is already
> > included in EXTRA_LDLIBS, which is included in the linker flags on line 43.
> >
> > What OS are you using? Can you give your output of the commands
> > `pkg-config --cflags glib-2.0` and `pkg-config --cflags --libs glib-2.0`?
> >
> > Regards,
> >
> > Tanel
> >
> > On Wed, 2015-06-03 at 13:17 +0530, Niranjan Viladkar wrote:
> > > Hello,
> > >
> > > I am new to kaldi. I am at revision 5112.
> > >
> > > I had to make following changes to src/gst-plugin/Makefile -
> > >
> > > in line 9 and 10, add "--libs" flag.
> > >
> > > for eg.
> > >
> > > 9 EXTRA_CXXFLAGS += $(shell pkg-config --cflags --libs gstreamer-1.0)
> > > 10 EXTRA_CXXFLAGS += $(shell pkg-config --cflags --libs glib-2.0)
> > >
> > > and I had to include $(EXTRA_CXXFLAGS) in the rule corresponding to
> > > $(LIBFILE) at line no. 41.
> > >
> > > For eg.
> > >
> > > 41 $(LIBFILE): $(OBJFILES)
> > > 42 $(CXX) -shared -DPIC -o $(LIBFILE) -Wl,-soname=$(LIBFILE) -Wl,--no-as-needed \
> > > 43 -L$(KALDILIBDIR) -Wl,-rpath=$(KALDILIBDIR) $(EXTRA_LDLIBS) $(LDLIBS) $(LDFLAGS) $(EXTRA_CXXFLAGS) \
> > > 44 $(OBJFILES)
> > >
> > > Before these changes, "make" was failing as it could not find gst
> > > specific plugins from $(EXTRA_LDLIBS).
> > > (I have gstreamer and glib installed in local area as I don't have a
> > > root access on my system)
> > >
> > > Are these changes correct?
> > > If yes, they can be included in Makefile.
> > > If not, then what is correct way of building gstreamer without
> > > modifications to the Makefile.
> > >
> > > Thanks in advance,
> > >
> > > Niranjan.

-- Effort is important, but knowing where to make an effort in your life makes all the difference. |