From: Daniel P. <dp...@gm...> - 2014-03-17 16:43:46

Hi,

There is no explicit support for multi-stream ASR in Kaldi; you'll have to try to understand the codebase and code something yourself. [Although if you build separate models with the same tree, you can use the DecodableSum class to help you decode with scores summed over the models; you'll need to write code for this, though.]

Regarding a phone confusion matrix: if you build a system to decode phones, I think the program compute-wer has an option to output confusion data, but I doubt it is in the format you want. However, I would advise against this. Phone confusion matrices are a little old-fashioned.

Dan

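The score-summing idea in the reply above can be illustrated with a small self-contained sketch. It does not use Kaldi's DecodableSum class or any Kaldi headers; `StreamScores`, `CombineStreams` and the per-stream weights are hypothetical names chosen for this example, and it assumes that every stream scores the same set of tied states (that is, the models were built with the same tree).

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// scores[frame][state] = log-likelihood from one stream. This is a toy
// stand-in for a decodable object, not a Kaldi type.
using StreamScores = std::vector<std::vector<float>>;

// Returns the weighted sum of log-likelihoods over all streams.
// Every stream must have the same number of frames and states.
StreamScores CombineStreams(
    const std::vector<std::pair<StreamScores, float>> &streams) {
  if (streams.empty()) throw std::invalid_argument("no streams given");
  const StreamScores &first = streams[0].first;
  StreamScores sum(first.size());
  for (std::size_t t = 0; t < first.size(); ++t)
    sum[t].assign(first[t].size(), 0.0f);
  for (const auto &stream : streams) {
    const StreamScores &scores = stream.first;
    const float weight = stream.second;
    if (scores.size() != first.size())
      throw std::invalid_argument("streams differ in number of frames");
    for (std::size_t t = 0; t < scores.size(); ++t) {
      if (scores[t].size() != first[t].size())
        throw std::invalid_argument("streams differ in number of states");
      for (std::size_t s = 0; s < scores[t].size(); ++s)
        sum[t][s] += weight * scores[t][s];
    }
  }
  return sum;
}
```

A decoder would then look up the combined score for a frame and state wherever it would otherwise query a single model; how the weights should be chosen is left open here.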
From: <fe...@in...> - 2014-03-17 13:51:12

Dear Sirs,

I am with the Speech Processing and Transmission Lab at the University of Chile. We are working on multistream speech recognition in Kaldi, and we have a couple of questions:

- We want to create a per-phoneme confusion matrix to assess the performance of the acoustic features alone. How could we address this in Kaldi? I think we have to build a phoneme recognizer (without word-position dependency), so we read these posts from 2013, http://sourceforge.net/p/kaldi/discussion/1355348/thread/51258bf4/ and http://sourceforge.net/p/kaldi/discussion/1355348/thread/2294d269/, but we did not find any specific solution.

- Is there any recipe for multistream ASR in Kaldi? Any help with this?

Best Regards,

Felipe Espic

From: Daniel P. <dp...@gm...> - 2014-03-12 20:19:57

I'm wondering whether anyone is brave enough to try to get VTLN working in the current scripts. In the early days (see ^/branches/complete, and the egs/rm/s1 scripts) I did do some work with VTLN, but never really got improvements that carried through after fMLLR. However, I have to assume that this was some deficiency in the implementation or the scripts, or something unusual about the dataset.

Is anyone brave enough to revisit this? [Experienced people only!]

Dan

From: Augusto H. H. <au...@li...> - 2014-03-07 13:22:40

I'll try that. Thanks very much for the quick answer!

--
Augusto Henrique Hentz
au...@li...

From: Paul D. <pau...@gm...> - 2014-03-07 13:11:03

You can probably just modify the token expansion code in any of the decoders. Find the code that iterates over the outgoing arcs and, if an arc has an output label, add an insertion penalty. The modified fragment below is from faster-decoder.cc. Both the emitting and the non-emitting expansion will need to be changed. In the FST the output labels won't be synchronised, so the insertion penalty will not come exactly at the start or end of a word, but there should be the correct number of insertions along a path. This probably doesn't matter in practice.

Paul

    for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
         !aiter.Done(); aiter.Next()) {
      Arc arc = aiter.Value();
      if (arc.ilabel != 0) {
        Weight ac_weight(-decodable->LogLikelihood(frame, arc.ilabel));
        BaseFloat new_weight = arc.weight.Value() + tok->weight_.Value()
            + ac_weight.Value();

        if (arc.olabel)  // Word label, so add insertion penalty
          new_weight += insertion_penalty;
        ...

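To make the counting argument in Paul's message concrete, here is a toy version of the cost update that compiles on its own. It is only a sketch: `ToyArc`, `ExtendCost` and `PathCost` are hypothetical names and do not correspond to the OpenFst or Kaldi types in faster-decoder.cc, and the acoustic cost is a constant per frame purely to keep the arithmetic easy to follow.

```cpp
#include <vector>

// Toy arc. Label 0 means epsilon, following the OpenFst convention.
// This is not the OpenFst Arc type.
struct ToyArc {
  int ilabel;    // input label; 0 = non-emitting arc
  int olabel;    // output (word) label; 0 = no word on this arc
  float weight;  // graph cost (negated log-probability)
  int nextstate;
};

// Cost of extending a token along `arc`. `acoustic_cost` is the negated
// acoustic log-likelihood for the frame and is only added on emitting arcs.
// The penalty is added once per word label, on emitting and non-emitting
// arcs alike.
float ExtendCost(float token_cost, const ToyArc &arc, float acoustic_cost,
                 float insertion_penalty) {
  float new_cost = token_cost + arc.weight;
  if (arc.ilabel != 0) new_cost += acoustic_cost;
  if (arc.olabel != 0) new_cost += insertion_penalty;
  return new_cost;
}

// Total cost of a path, assuming a constant per-frame acoustic cost.
float PathCost(const std::vector<ToyArc> &path, float acoustic_cost,
               float insertion_penalty) {
  float cost = 0.0f;
  for (const ToyArc &arc : path)
    cost = ExtendCost(cost, arc, acoustic_cost, insertion_penalty);
  return cost;
}
```

For a path carrying two word labels, the total cost grows by exactly twice the penalty, regardless of where along the path the labels sit.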
From: Augusto H. H. <au...@li...> - 2014-03-07 12:46:10

Hello everyone,

I would like to have a word insertion penalty option in the decoders. I am aware of the `lattice-add-inspen' program, but I think it would be nicer if this option was inside the decoder core.

In a recent discussion on the forum, Daniel mentioned that

> If you really wanted to bake the insertion penalty into the decoding graph,
> you could just add something to the graph cost for any arcs that have word
> labels on them.

(https://sourceforge.net/p/kaldi/discussion/1355348/thread/b07e145d/#2d84/a3b4)

This interests me. Would anyone care to elaborate? I'm thinking of the BiglmFasterDecoder class. Is the PropagateLm method a good place to add this graph cost?

Thanks

--
Augusto Henrique Hentz
au...@li...

From: Steven E. - S. <s....@sa...> - 2014-03-06 23:23:25

Hi all,

First, impressive piece of open-source software!!!

I've been reading through various Kaldi resources and started experimenting a bit lately. I have a few basic questions that I hope my colleagues have not already asked you:

0) You mention Sun / Oracle Grid Engine and the ability to launch a truly data-parallel run for training. Have you compiled any statistics on how classification accuracy degrades as the number of nodes increases? I expected the results would not be equivalent, and a post on your forums by D. Povey confirmed this.

1) A slight follow-up to the previous question. Given that there is a degradation in accuracy, has anyone considered a more tightly coupled model where nodes do not work so independently? For example, using MPI (assuming fewer than 1024 nodes) and sharing data in a way that could increase accuracy?

2) Do you have any big-O estimates (space, time) for the various portions of the toolkit? Likewise, are there any timing features that will profile the runtime of the various components / subroutines?

3) I believe I read that you chose the multi-process model with lightweight `fork`. Are there any components that use a more shared-memory model, such as threads? Do you see any areas that would benefit significantly from this methodology, or any significant drawbacks to this approach?

4) Recent advancements in CUDA present a more shared-memory approach and also provide means to do remote DMA between nodes through CUDA-aware MPI. Given this low-latency communication medium, would this entice you to change any aspects of Kaldi, or possibly extend it?

5) Do you document anywhere the various parameters and their effects on runtime, big-O complexity, and classification accuracy?

6) What is your process for contributing patches and features? I have a couple of build patches I hope to upstream; once they are approved, I shall forward them.

Kindest Regards and looking forward to your reply,

--
Steven Eliuk, Ph.D. Comp Sci,
Staff Engineer, Advanced Software Platforms Lab, SRA - SV,
Samsung Electronics,
1732 North First Street, San Jose, CA 95112.
Work: +1 408-652-1976
Cell: +1 408-797-5771

From: Daniel P. <dp...@gm...> - 2014-03-05 19:04:01

OK, that's strange, because our intention was to handle UTF-8 wherever we handle words. You will notice that C++ programs in Kaldi essentially never deal with text; they only handle integer ids (except for printing out the best path in decoding, and WER scoring). The only things that deal with UTF-8 text should be Perl scripts, and we try to make them handle this correctly. But we could have messed up.

I would appreciate it if you could find out which program is causing the problem (perhaps sym2int.pl?) and either let us know how to fix it or show us how to replicate the problem.

Dan

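The point that the C++ side only ever sees integer ids can be shown with a short sketch of a word-to-id lookup. This is not sym2int.pl and not Kaldi code; `WordsToIds` and the in-memory table are hypothetical. Words are treated as opaque byte strings, so UTF-8 text maps correctly as long as the table and the transcript use the same encoding. The sketch relies on the default "C" locale, in which only ASCII whitespace separates words.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Maps each whitespace-separated word of `line` to its integer id.
// Words are compared byte for byte, so no decoding of UTF-8 is needed.
// Words missing from the table are mapped to `oov_id`.
std::vector<int> WordsToIds(const std::string &line,
                            const std::map<std::string, int> &table,
                            int oov_id) {
  std::vector<int> ids;
  std::istringstream words(line);
  std::string word;
  while (words >> word) {
    auto it = table.find(word);
    ids.push_back(it == table.end() ? oov_id : it->second);
  }
  return ids;
}
```

If a transcript and the symbol table were ever encoded or normalised differently, every affected word would silently become the OOV id in a lookup like this one, which is one reason to compare the integer output against the text when chasing a problem of this kind.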
From: Wael Abd-A. <wam...@is...> - 2014-03-05 19:01:23

Thanks Dan.

It appears that this problem is caused by the generation of the transcription FSTs from UTF-8-encoded transcriptions. The transcription files I have are in Arabic, and I generated the lexicon, phones and other files properly. For some reason, "compiling training graphs" for the Arabic transcriptions along with the lexicon files generates a much larger number of arcs.

When I use English "transliterations" it works well. I think it might be a problem handling UTF-8-encoded files.

Thanks

Wael

From: Daniel P. <dp...@gm...> - 2014-03-03 17:24:54

That error is generally because you had a too-long transcription of a too-short utterance.

Are you sure you're not just skipping some lines of the printed output when you paste your output? That output makes no sense to me otherwise.

If not, it could be a memory problem. Run in valgrind to test:

    valgrind <program> <args>

I doubt this code is buggy, as it's very old and always used. Possibly a compiler or machine issue.

Dan

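The condition behind Dan's explanation can be written down as a small stand-alone check. This is a sketch of the reasoning only, not the EqualAlign code from fstext-utils-inl.h; `CountInputLabels` and `EnoughFrames` are hypothetical helpers that work on a plain list of input labels.

```cpp
#include <vector>

// Counts the non-epsilon (non-zero) input labels along a path.
int CountInputLabels(const std::vector<int> &path_ilabels) {
  int count = 0;
  for (int label : path_ilabels)
    if (label != 0) ++count;
  return count;
}

// Each non-epsilon input label consumes at least one frame, and adding
// self-loops can only make a path longer. Alignment is therefore impossible
// when the shortest path already needs more frames than the utterance has.
bool EnoughFrames(const std::vector<int> &path_ilabels, int num_frames) {
  return CountInputLabels(path_ilabels) <= num_frames;
}
```

With the numbers from the report, a path with 300 input labels cannot be aligned to 222 frames, so the check fails no matter how the self-loops are distributed.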
From: Wael Abd-A. <wam...@is...> - 2014-02-28 23:36:55

Hi

I am using Kaldi in an OCR project, but it is my first time using Kaldi.

For some of my utterances, Kaldi fails to align as follows:

    EqualAlign: utterance has too to frames 222 to align.

I tried to debug it, and it fails at the following condition in fstext-utils-inl.h:

    if (num_ilabels > length) {
      cout << "Exiting .." << endl;
      KALDI_WARN << "EqualAlign: utterance has too to frames " << length
                 << " to align." << num_ilabels << "\n";
      return false;  // can't make it shorter by adding self-loops!.
    }

I added some debugging lines to the EqualAlign function, and it appears that num_ilabels suddenly increases for some reason, which makes EqualAlign fail. Here is a trace of the value of num_ilabels and another variable "x" that I defined in the function.

    Before while ---------
    in while .. x = 0
        in if ..
            incrementing num_ilabels .. before = 0 ... after = 1
    in while .. x = 1
        in if ..
    in while .. x = 2
        in if ..
    in while .. x = 3
        in if ..
            incrementing num_ilabels .. before = 1 ... after = 2
    in while .. x = 4
        in if ..
            incrementing num_ilabels .. before = 2 ... after = 3
    in while .. x = 5
        in if ..
    in while .. x = 6
        in if ..
    in while .. x = 7
        in if ..
            incrementing num_ilabels .. before = 3 ... after = 4
    in while .. x = 8
        in if ..
    in while .. x = 9
        in if ..
    in while .. x = 10
        in if ..
            incrementing num_ilabels .. before = 4 ... after = 5
    in while .. x = 622
        in if ..
            incrementing num_ilabels .. before = 298 ... after = 299
    in while .. x = 623
        in if ..
    in while .. x = 624
        in if ..
            incrementing num_ilabels .. before = 299 ... after = 300
    in while .. x = 625
        in if ..
    in while .. x = 626
    Exiting ..

My modifications to debug the code are as follows:

      cout << "in while .. x = " << x++ << std::endl;

      if (arc_offset < num_arcs) {  // an actual arc.
        cout << "\t in if ..\n";
        ArcIterator<Fst<Arc> > aiter(ifst, s);
        aiter.Seek(arc_offset);
        const Arc &arc = aiter.Value();
        if (arc.nextstate == s) {
          continue;  // don't take this self-loop arc
        } else {
          arc_offsets.push_back(arc_offset);
          path.push_back(arc.nextstate);
          if (arc.ilabel != 0) {
            cout << "\t\t incrementing num_ilabels .. before = " << num_ilabels;
            //num_ilabels++;
            num_ilabels = num_ilabels + 1;
            cout << " ... after = " << num_ilabels << "\n";
          }
        }
      } else {
        break;  // Chose final-prob.
      }
    }

I would appreciate it if you could help.

Best regards,

Wael

From: Xavier A. <xan...@gm...> - 2014-02-26 19:16:17

Indeed, this was my question. I will give it a try. It seems like a lot of processing for aligning a single utterance, but I guess it will be the same if I create a single lang directory for OOV words in all my utterances.

Thanks a lot,

Xavi Anguera

From: Daniel P. <dp...@gm...> - 2014-02-26 19:03:59

Hi,

> I am using the Switchboard recipe to force-align some audio files for
> which I have the transcriptions. I also have the phoneme transcriptions of
> all the words.
> I am doing it using the scripts in steps/align_si.sh
> and steps/get_train_ctm.sh
> Whenever one word is not found in the Switchboard lexicon it aligns an OOV
> word instead (i.e. <unk>). Is there a way to tell kaldi at alignment time
> the transcription of these words?

I think what you mean is, "is there a way to give Kaldi at alignment time the lexicon entry for these words?"

The easiest way to do this is to create a new "lang" directory that has a larger lexicon, including the new words, and provide this directory to the script that does the alignment. You can use the prepare_lang.sh script for this; just give it an input directory that has a larger lexicon.txt or lexiconp.txt. Make sure after you do this that the phones.txt files are identical in the old and new directories, except possibly for extra disambiguation symbols (#1, #2, etc.).

Dan

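As a rough sketch of the lexicon-extension step described above, the following code merges extra entries into a base lexicon and refuses to do so if they would introduce a phone the base lexicon does not already use. It is an illustration under assumptions, not part of prepare_lang.sh: `Lexicon`, `ParseLexicon`, `PhoneSet` and `MergeLexicon` are hypothetical names, the input is assumed to be plain "word phone1 phone2 ..." lines without pronunciation probabilities, and the phones seen in the base lexicon are used as an approximation of the existing phone set.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// word -> set of pronunciations, each a sequence of phone symbols.
using Lexicon = std::map<std::string, std::set<std::vector<std::string>>>;

// Parses lines of the form "word phone1 phone2 ...".
Lexicon ParseLexicon(const std::string &text) {
  Lexicon lex;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string word, phone;
    if (!(fields >> word)) continue;  // skip empty lines
    std::vector<std::string> pron;
    while (fields >> phone) pron.push_back(phone);
    lex[word].insert(pron);
  }
  return lex;
}

// Collects every phone used anywhere in the lexicon.
std::set<std::string> PhoneSet(const Lexicon &lex) {
  std::set<std::string> phones;
  for (const auto &entry : lex)
    for (const auto &pron : entry.second)
      phones.insert(pron.begin(), pron.end());
  return phones;
}

// Adds the entries of `extra` to `base`. Returns false and leaves `base`
// untouched if `extra` uses a phone that `base` does not, because that
// would change the phone set.
bool MergeLexicon(Lexicon *base, const Lexicon &extra) {
  const std::set<std::string> known = PhoneSet(*base);
  for (const auto &entry : extra)
    for (const auto &pron : entry.second)
      for (const auto &phone : pron)
        if (known.count(phone) == 0) return false;
  for (const auto &entry : extra)
    (*base)[entry.first].insert(entry.second.begin(), entry.second.end());
  return true;
}
```

The merged lexicon would then be written out as the larger lexicon.txt for the new input directory. The phones.txt comparison between the old and new "lang" directories is still worth doing by hand, since this check only looks at the lexicon.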
From: Daniel P. <dp...@gm...> - 2014-02-26 18:07:24

Send me a patch, i.e. the output of "svn diff", at dp...@gm..., and I'll have a look.

Dan

From: Augusto H. H. <au...@li...> - 2014-02-26 17:53:19
|
Hello,

I'd like to know what is the policy for contributions to the project. I added an input implementation that uses libcurl to read stuff from HTTP, and I'd like to submit it for inclusion in the mainline. What should I do?

Thanks

--
Augusto Henrique Hentz
au...@li...
|
From: Xavier A. <xan...@gm...> - 2014-02-26 14:25:24
|
Hi,

I am using the Switchboard recipe to force-align some audio files for which I have the transcriptions. I also have the phoneme transcriptions of all the words.
I am doing it using the scripts in steps/align_si.sh and steps/get_train_ctm.sh
Whenever one word is not found in the Switchboard lexicon it aligns an OOV word instead (i.e. <unk>). Is there a way to tell kaldi at alignment time the transcription of these words?

Thanks in advance,

Xavi Anguera
|
From: Daniel P. <dp...@gm...> - 2014-02-21 17:46:09
|
queue.pl tries to submit jobs with GridEngine (qsub), and if it is not installed, you will get an error. You could change the run.sh to set train_cmd=run.pl, decode_cmd=run.pl. This should work, but some later parts of the setup might give you a problem, as they may try to run too many jobs on a single machine and exhaust memory.

Dan

On Wed, Feb 12, 2014 at 4:42 AM, 牙擦苏 <289...@qq...> wrote:
> I want to run the example s5, but an error occurred in the log, such as:
> queue.pl: error submitting jobs to queue(return status was 512).
> How can I solve this bug?
|
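The switch suggested in the reply above can be made automatic. This is a minimal sketch, not taken from any recipe: the helper name `pick_cmd` is hypothetical, and it assumes the recipe reads `train_cmd` and `decode_cmd` as plain shell variables, as the reply implies.

```shell
# Hypothetical helper: print queue.pl when GridEngine's qsub is on PATH,
# and fall back to run.pl (local execution) otherwise.
pick_cmd() {
  if command -v qsub >/dev/null 2>&1; then
    echo "queue.pl"
  else
    echo "run.pl"
  fi
}

train_cmd=$(pick_cmd)
decode_cmd=$(pick_cmd)
echo "train_cmd=$train_cmd decode_cmd=$decode_cmd"
```

With run.pl every job runs on the local machine, so lowering the number of parallel jobs in the later stages is one way to limit the memory pressure mentioned in the reply.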
From: Daniel P. <dp...@gm...> - 2014-02-20 18:21:03
|
Thanks for reporting it! I'll fix it next time I commit.

Dan

On Thu, Feb 20, 2014 at 9:01 AM, Simon Klüpfel <sim...@gm...> wrote:
> Hi,
>
> I observed a small bug in train_deltas.sh. At the end, not only the link
> to the .mdl, but also to the .occs should be deleted, before being
> reassigned.
>
> ----
> rm $dir/final.mdl 2>/dev/null
> ln -s $x.mdl $dir/final.mdl
> ln -s $x.occs $dir/final.occs
> ----
> ==>
> ----
> rm $dir/final.mdl 2>/dev/null
> rm $dir/final.occs 2>/dev/null
> ln -s $x.mdl $dir/final.mdl
> ln -s $x.occs $dir/final.occs
> ----
>
> Hope this is right,
>
> Simon
|
From: Simon K. <sim...@gm...> - 2014-02-20 17:01:44
|
Hi,

I observed a small bug in train_deltas.sh. At the end, not only the link to the .mdl, but also to the .occs should be deleted, before being reassigned.

----
rm $dir/final.mdl 2>/dev/null
ln -s $x.mdl $dir/final.mdl
ln -s $x.occs $dir/final.occs
----
==>
----
rm $dir/final.mdl 2>/dev/null
rm $dir/final.occs 2>/dev/null
ln -s $x.mdl $dir/final.mdl
ln -s $x.occs $dir/final.occs
----

Hope this is right,

Simon
|
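The reason the extra rm matters is that "ln -s" refuses to overwrite an existing link, so a rerun in the same directory would leave final.occs pointing at the old iteration. A minimal sketch of the same idea follows; the function name `relink_final` is hypothetical and this is not the fix that was committed, only one way to treat both links the same.

```shell
# Hypothetical helper: point $dir/final.mdl and $dir/final.occs at iteration $x.
# Usage: relink_final <dir> <x>
relink_final() {
  rl_dir="$1"
  rl_x="$2"
  for rl_ext in mdl occs; do
    rm -f "$rl_dir/final.$rl_ext"                # silent if the link is absent
    ln -s "$rl_x.$rl_ext" "$rl_dir/final.$rl_ext"  # relative link, as in the report
  done
}
```

Using "rm -f" avoids the need for the 2>/dev/null redirect, since it does not complain when the file does not exist.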
From: Daniel P. <dp...@gm...> - 2014-02-18 18:46:07
|
Thanks for letting us know! I'll commit the fix next time I commit.

Dan

On Tue, Feb 18, 2014 at 10:40 AM, Simon Klüpfel <sim...@gm...> wrote:
> Hi,
>
> I think there is a bug in train_quick.sh. Unless the feature type is
> LDA, the $feats variable is not set.
>
> The snippet below should probably have an 'else' part.
>
> ------
> if [ -f $alidir/trans.1 ]; then
>   echo "$0: using transforms from $alidir"
>   ln.pl $alidir/trans.* $dir # Link them to dest dir.
>   feats="$sifeats transform-feats --utt2spk=ark:$sdata/JOB/utt2spk ark,s,cs:$dir/trans.JOB ark:- ark:- |"
> fi
> ------
>
> something like
>
> ------
> else
>   feats="$sifeats"
> ------
>
> At least this worked for me to avoid the error of passing an empty
> parameter on further down.
>
> All the best,
>
> Simon
|
From: Simon K. <sim...@gm...> - 2014-02-18 18:40:17
|
Hi,

I think there is a bug in train_quick.sh. Unless the feature type is LDA, the $feats variable is not set.

The snippet below should probably have an 'else' part.

------
if [ -f $alidir/trans.1 ]; then
  echo "$0: using transforms from $alidir"
  ln.pl $alidir/trans.* $dir # Link them to dest dir.
  feats="$sifeats transform-feats --utt2spk=ark:$sdata/JOB/utt2spk ark,s,cs:$dir/trans.JOB ark:- ark:- |"
fi
------

something like

------
else
  feats="$sifeats"
------

At least this worked for me to avoid the error of passing an empty parameter on further down.

All the best,

Simon
|
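Put together, the branch with the proposed "else" part might look like the sketch below. It is only an illustration: the function `make_feats` is hypothetical, it prints the pipeline string instead of assigning `$feats`, and it leaves out the ln.pl linking step so that it can run outside a Kaldi checkout.

```shell
# Hypothetical helper: print the feature pipeline for an alignment directory.
# Usage: make_feats <sifeats> <alidir> <dir> <sdata>
make_feats() {
  mf_sifeats="$1"
  mf_alidir="$2"
  mf_dir="$3"
  mf_sdata="$4"
  if [ -f "$mf_alidir/trans.1" ]; then
    # Speaker transforms exist: append transform-feats to the pipeline.
    echo "$mf_sifeats transform-feats --utt2spk=ark:$mf_sdata/JOB/utt2spk ark,s,cs:$mf_dir/trans.JOB ark:- ark:- |"
  else
    # No transforms: fall back to the speaker-independent features,
    # so the result is never empty.
    echo "$mf_sifeats"
  fi
}
```

Either way the caller receives a non-empty pipeline, which is what avoids the empty-parameter error described in the report.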
From: 牙. <289...@qq...> - 2014-02-12 09:42:42
|
I want to run the example s5, but an error occurred in the log, such as:
queue.pl: error submitting jobs to queue(return status was 512).
How can I solve this bug?
|
From: Vesely K. <ive...@fi...> - 2014-02-10 13:17:56
|
Fixed,

Sending utils/convert_slf.pl
Transmitting file data .
Committed revision 3492.

K.

On 02/09/2014 01:03 AM, Daniel Povey wrote:
> if the fourth field is not present, it means the "unit weight"
|
From: Daniel P. <dp...@gm...> - 2014-02-09 00:04:06
|
In OpenFst the convention is that if the fourth field is not present, it means the "unit weight", i.e. equivalent to "0,0," in this case. Karel should be able to fix it.

Dan

Hi,
> I want to convert kaldi n-best lattices to HTK SLF format. For this I use
> the commands lattice-to-nbest and convert_slf.pl
>
> I am encountering an error in some cases when the input of convert_slf.pl
> is something like the following:
>
> 0 1 <eps>
> 1 2 sil 5.98436,615.382,
> 2 3 hh_B 8.18498,343.171,2_8_18
> 3 4 ah_E 0,0,7762_7788_7787_7810
> 4 5 dh_B 4.07396,1041.53,17278_17346_17396_17395_17395
> 5 6 ae_I 0,0,6934_6976_6975_6975_7010_7009_7009
> 6 7 n_I 7.94708,1145.48,16638_16730_16729_16729_16790_16789
> 7 8 ch_I 3.78969,383.177,11742_11812_11811_11830_11829_11829
> 8 9 ax_I 1.75077,720.301,540_556_555_568_567_567_567
> 9 10 l_E 11.86,1544.22,1968_1967_1967_2080_2079_2079_2098
> 10 11 iy_S 3.74688,373.832,10382_10381_10381_10381_10381_10381_10426_10425_10425_10425_10498_5694_5693_5693_5728_5727_5727_5834_5833_5833
> 11
>
> Digging into convert_slf.pl I see that when the phoneme <eps> appears, it
> is expected to always have a 4th column. Given that in my example it has
> only 3, it crashes.
> I modified the perl script and got it to work when finding only 3 columns,
> but I wonder whether the lack of a score might be indicating some deeper
> and more obscure error I should be aware of.
>
> Thanks for your help,
>
> Xavier Anguera
|
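Until a fixed script is available, the convention described above can be applied as a preprocessing step. The filter below is a hedged sketch, not the change that went into convert_slf.pl: the name `fill_unit_weight` is hypothetical, and it assumes whitespace-separated text where an arc line without a weight has exactly three fields (source state, destination state, label).

```shell
# Hypothetical filter: append the unit weight "0,0," to arc lines that omit it.
# Lines with any other field count (weighted arcs, final states such as "11",
# blank lines) pass through unchanged.
fill_unit_weight() {
  awk 'NF == 3 { print $0, "0,0,"; next } { print }'
}

# Example:
#   printf '0 1 <eps>\n1 2 sil 5.98436,615.382,\n11\n' | fill_unit_weight
```

Because the added weight is the unit weight, the scores of every path through the lattice stay the same; only the text representation changes.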
From: Xavier A. <xan...@gm...> - 2014-02-08 22:47:34
|
Hi,

I want to convert kaldi n-best lattices to HTK SLF format. For this I use the commands lattice-to-nbest and convert_slf.pl

I am encountering an error in some cases when the input of convert_slf.pl is something like the following:

0 1 <eps>
1 2 sil 5.98436,615.382,
2 3 hh_B 8.18498,343.171,2_8_18
3 4 ah_E 0,0,7762_7788_7787_7810
4 5 dh_B 4.07396,1041.53,17278_17346_17396_17395_17395
5 6 ae_I 0,0,6934_6976_6975_6975_7010_7009_7009
6 7 n_I 7.94708,1145.48,16638_16730_16729_16729_16790_16789
7 8 ch_I 3.78969,383.177,11742_11812_11811_11830_11829_11829
8 9 ax_I 1.75077,720.301,540_556_555_568_567_567_567
9 10 l_E 11.86,1544.22,1968_1967_1967_2080_2079_2079_2098
10 11 iy_S 3.74688,373.832,10382_10381_10381_10381_10381_10381_10426_10425_10425_10425_10498_5694_5693_5693_5728_5727_5727_5834_5833_5833
11

Digging into convert_slf.pl I see that when the phoneme <eps> appears, it is expected to always have a 4th column. Given that in my example it has only 3, it crashes.
I modified the perl script and got it to work when finding only 3 columns, but I wonder whether the lack of a score might be indicating some deeper and more obscure error I should be aware of.

Thanks for your help,

Xavier Anguera
|