From: Xavier A. <xan...@gm...> - 2013-12-28 20:35:37
Sure, here it is:

The error:

# utils/mkgraph.sh --mono data/lang_test_phn-mono exp/mono exp/mono/graph_phn
# Started at Sat Dec 28 20:47:57 CET 2013
#
fstminimizeencoded
fsttablecompose data/lang_test_phn-mono/L_disambig.fst data/lang_test_phn-mono/G.fst
fstdeterminizestar --use-log=true
fstisstochastic data/lang_test_phn-mono/tmp/LG.fst
0.000358155 -0.000356635
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_phn-mono/phones/disambig.int --write-disambig-syms=data/lang_test_phn-mono/tmp/disambig_ilabels_1_0.int data/lang_test_phn-mono/tmp/ilabels_1_0
WARNING (fstcomposecontext:main():fstcomposecontext.cc:130) Disambiguation symbols list is empty; this likely indicates an error in data preparation.
fstcomposecontext: ../fstext/context-fst-inl.h:105: fst::ContextFstImpl<Arc, LabelT>::ContextFstImpl(typename Arc::Label, const std::vector<B, std::allocator<_T2> >&, const std::vector<B, std::allocator<_T2> >&, int, int) [with Arc = fst::ArcTpl<fst::TropicalWeightTpl<float> >, LabelT = int]: Assertion `subsequential_symbol != 0 && disambig_syms_.count(subsequential_symbol) == 0 && phone_syms_.count(subsequential_symbol) == 0' failed.
utils/mkgraph.sh: line 76:  7661 Aborted    fstcomposecontext --context-size=$N --central-position=$P --read-disambig-syms=$lang/phones/disambig.int --write-disambig-syms=$lang/tmp/disambig_ilabels_${N}_${P}.int $lang/tmp/ilabels_${N}_${P} < $lang/tmp/LG.fst > $clg
fstisstochastic data/lang_test_phn-mono/tmp/CLG_1_0.fst
ERROR: FstHeader::Read: Bad FST header: data/lang_test_phn-mono/tmp/CLG_1_0.fst
ERROR (fstisstochastic:ReadFstKaldi():fstext/fstext-utils-inl.h:1183) Reading FST: error reading FST header from data/lang_test_phn-mono/tmp/CLG_1_0.fst
ERROR (fstisstochastic:ReadFstKaldi():fstext/fstext-utils-inl.h:1183) Reading FST: error reading FST header from data/lang_test_phn-mono/tmp/CLG_1_0.fst

The execution of gdb:

(gdb) where
#0  0x00007ffff6be9475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff6bec6f0 in *__GI_abort () at abort.c:92
#2  0x00007ffff6be2621 in *__GI___assert_fail (
    assertion=0x498448 "subsequential_symbol != 0 && disambig_syms_.count(subsequential_symbol) == 0 && phone_syms_.count(subsequential_symbol) == 0",
    file=<optimized out>, line=105,
    function=0x499700 "fst::ContextFstImpl<Arc, LabelT>::ContextFstImpl(typename Arc::Label, const std::vector<B, std::allocator<_T2> >&, const std::vector<B, std::allocator<_T2> >&, int, int) [with Arc = fst::ArcTpl<fst::T"...) at assert.c:81
#3  0x000000000045b419 in fst::ContextFstImpl<fst::ArcTpl<fst::TropicalWeightTpl<float> >, int>::ContextFstImpl (this=0x6bd520, subsequential_symbol=97, phone_syms=..., disambig_syms=..., N=1, P=0) at ../fstext/context-fst-inl.h:103
#4  0x0000000000457610 in fst::ContextFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, int>::ContextFst (this=0x7fffffffd100, subsequential_symbol=97, phones=..., disambig_syms=..., N=1, P=0) at ../fstext/context-fst.h:223
#5  0x0000000000455b95 in fst::ComposeContext (disambig_syms_in=..., N=1, P=0, ifst=0x6c5be0, ofst=0x7fffffffd390, ilabels_out=0x7fffffffd3a0) at ../fstext/context-fst-inl.h:522
#6  0x00000000004522a3 in main (argc=7, argv=0x7fffffffdaa8) at fstcomposecontext.cc:138
(gdb) up
#1  0x00007ffff6bec6f0 in *__GI_abort () at abort.c:92
92      abort.c: No such file or directory.
(gdb) up
#2  0x00007ffff6be2621 in *__GI___assert_fail (
    assertion=0x498448 "subsequential_symbol != 0 && disambig_syms_.count(subsequential_symbol) == 0 && phone_syms_.count(subsequential_symbol) == 0",
    file=<optimized out>, line=105,
    function=0x499700 "fst::ContextFstImpl<Arc, LabelT>::ContextFstImpl(typename Arc::Label, const std::vector<B, std::allocator<_T2> >&, const std::vector<B, std::allocator<_T2> >&, int, int) [with Arc = fst::ArcTpl<fst::T"...) at assert.c:81
81      assert.c: No such file or directory.
(gdb) p subsequential_symbol
No symbol "subsequential_symbol" in current context.
(gdb) up
#3  0x000000000045b419 in fst::ContextFstImpl<fst::ArcTpl<fst::TropicalWeightTpl<float> >, int>::ContextFstImpl (this=0x6bd520, subsequential_symbol=97, phone_syms=..., disambig_syms=..., N=1, P=0) at ../fstext/context-fst-inl.h:103
103       assert(subsequential_symbol != 0
(gdb) p subsequential_symbol
$1 = 97
(gdb) p disambig_syms_.count(subsequential_symbol)
$2 = 0
(gdb) p phone_syms_.count(subsequential_symbol)
$3 = 1
(gdb) p phone_syms_.size()
$4 = 78
(gdb) p disambig_syms_.size()
$5 = 0

Thanks

X.

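For readers following the gdb session, the three clauses of the failed assertion can be checked one by one against the printed values. The sketch below is only an illustration in shell, not Kaldi code: the function name `assertion_holds` is hypothetical, and the numbers are copied from the gdb output above.

```shell
#!/usr/bin/env bash
# Values copied from the gdb session ($1, $2, $3 in the gdb output).
subsequential_symbol=97
disambig_count=0   # disambig_syms_.count(subsequential_symbol)
phone_count=1      # phone_syms_.count(subsequential_symbol)

# Hypothetical helper mirroring the asserted condition:
#   subsequential_symbol != 0 && disambig count == 0 && phone count == 0
assertion_holds() {
  [ "$1" -ne 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]
}

if assertion_holds "$subsequential_symbol" "$disambig_count" "$phone_count"; then
  echo "assertion holds"
else
  echo "assertion fails"   # this branch is taken: phone_count is 1, not 0
fi
```

With these values only the third clause is violated: symbol 97 is already counted among the phone symbols, while the disambiguation symbol list is empty.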
On Sat, Dec 28, 2013 at 9:01 PM, Daniel Povey <dp...@gm...> wrote:

> The same error should not have happened. Can you please do the same steps
> in gdb as last time, and paste the screen from gdb?
> Dan
>
> On Sat, Dec 28, 2013 at 11:49 AM, Xavier Anguera <xan...@gm...> wrote:
>
>> Dan,
>> the same error occurred, just that now I got the extra Warning you
>> inserted.
>> Should I maybe modify the make_phone_bigram_lang.sh script to copy the
>> current disambig.* files into the new lang directory?
>>
>> Thanks,
>>
>> X.
>>
>> On Sat, Dec 28, 2013 at 8:03 PM, Daniel Povey <dp...@gm...> wrote:
>>
>>> OK, then try running the script with the code fix I checked in. I
>>> forgot about the existence of that script. Possibly it will work. I'll
>>> have to modify validate_lang.pl in that case.
>>> Dan
>>>
>>> On Sat, Dec 28, 2013 at 7:02 AM, Xavier Anguera <xan...@gm...> wrote:
>>>
>>>> Dan,
>>>> there must be something I am not doing correctly in my current setup, or
>>>> you did not understand where my problem is.
>>>> I am currently calling the script mkgraph.sh (the one that is crashing)
>>>> in the following context:
>>>>
>>>> # Create phone-bigram grammar (unsmoothed) estimated from alignments
>>>> utils/make_phone_bigram_lang.sh data/lang exp/mono_ali_all data/lang_test_phn-mono || exit 1;
>>>> # Create phone recognition graph
>>>> $train_cmd exp/mono/graph/mkgraph_phn.log \
>>>>   utils/mkgraph.sh --mono data/lang_test_phn-mono exp/mono exp/mono/graph_phn || exit 1
>>>>
>>>> As you can see, first the script make_phone_bigram_lang.sh is called,
>>>> which takes a lang directory as input and creates a "test" lang
>>>> directory. Looking into this script I see that the disambig.* files are
>>>> left empty on purpose in the new directory (they are not empty in the
>>>> original lang directory; in fact, they have the #0 #1 values you proposed
>>>> in the previous email).
>>>> Then, when calling the mkgraph.sh script with this test lang directory,
>>>> it complains as stated in my previous emails.
>>>> The question is then whether I should modify make_phone_bigram_lang.sh
>>>> to copy the original disambig.* files, or pass the original lang
>>>> directory to the mkgraph.sh script, or whether I am doing something else
>>>> very wrong.
>>>>
>>>> Thanks for your help.
>>>>
>>>> Xavier Anguera
>>>>
>>>> On Sat, Dec 28, 2013 at 1:43 AM, Daniel Povey <dp...@gm...> wrote:
>>>>
>>>>> OK, I just committed a fix because it should not have crashed at that
>>>>> particular point in the code, but the underlying error is with your lang
>>>>> directory. You do need to have the disambiguation symbols "disambig.txt",
>>>>> with at least #0 and #1. You should probably be creating the lang
>>>>> directory with the prepare_lang.sh script, and if not, at least you should
>>>>> validate it with the validate_lang.pl script. Also, there is no
>>>>> reason to have a separate "lang" directory for the monophone setup; the
>>>>> same directory is valid for monophone or triphone setups.
>>>>>
>>>>> Dan
>>>>>
>>>>> On Fri, Dec 27, 2013 at 4:18 PM, Xavier Anguera <xan...@gm...> wrote:
>>>>>
>>>>>> Dear Dan,
>>>>>> thank you for your help.
>>>>>> Here are the tests you asked me to perform:
>>>>>>
>>>>>> Running utils/validate_lang.pl data/lang_test_phn-mono/ gives:
>>>>>>
>>>>>> Checking data/lang_test_phn-mono//phones/roots.{txt, int} ...
>>>>>> --> 30 entry/entries in data/lang_test_phn-mono//phones/roots.txt
>>>>>> --> data/lang_test_phn-mono//phones/roots.int corresponds to data/lang_test_phn-mono//phones/roots.txt
>>>>>> --> data/lang_test_phn-mono//phones/roots.{txt, int} are OK
>>>>>>
>>>>>> Checking data/lang_test_phn-mono//phones/sets.{txt, int} ...
>>>>>> --> 30 entry/entries in data/lang_test_phn-mono//phones/sets.txt
>>>>>> --> data/lang_test_phn-mono//phones/sets.int corresponds to data/lang_test_phn-mono//phones/sets.txt
>>>>>> --> data/lang_test_phn-mono//phones/sets.{txt, int} are OK
>>>>>>
>>>>>> Checking data/lang_test_phn-mono//phones/extra_questions.{txt, int} ...
>>>>>> --> 9 entry/entries in data/lang_test_phn-mono//phones/extra_questions.txt
>>>>>> --> data/lang_test_phn-mono//phones/extra_questions.int corresponds to data/lang_test_phn-mono//phones/extra_questions.txt
>>>>>> --> data/lang_test_phn-mono//phones/extra_questions.{txt, int} are OK
>>>>>>
>>>>>> Checking disjoint: silence.txt, nosilenct.txt, disambig.txt ...
>>>>>> --> silence.txt and nonsilence.txt are disjoint
>>>>>> --> silence.txt and disambig.txt are disjoint
>>>>>> --> disambig.txt and nonsilence.txt are disjoint
>>>>>> --> disjoint property is OK
>>>>>>
>>>>>> Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
>>>>>> --> ERROR: data/lang_test_phn-mono//phones/disambig.txt is empty or not exists
>>>>>>
>>>>>> Checking optional_silence.txt ...
>>>>>> --> reading data/lang_test_phn-mono//phones/optional_silence.txt
>>>>>> --> data/lang_test_phn-mono//phones/optional_silence.txt is OK
>>>>>>
>>>>>> Checking disambiguation symbols: #0 and #1
>>>>>> --> ERROR: data/lang_test_phn-mono//phones/disambig.txt is empty or not exists
>>>>>> --> ERROR: data/lang_test_phn-mono//phones/disambig.txt doesn't have "#0" or "#1"
>>>>>> Checking topo ...
>>>>>> --> data/lang_test_phn-mono//topo's nonsilence section is OK
>>>>>> --> data/lang_test_phn-mono//topo's silence section is OK
>>>>>> --> data/lang_test_phn-mono//topo is OK
>>>>>>
>>>>>> Checking data/lang_test_phn-mono//oov.{txt, int} ...
>>>>>> --> ERROR: fail to open data/lang_test_phn-mono//oov.txt
>>>>>>
>>>>>> --> ERROR
>>>>>>
>>>>>> Apparently I do not have either oov.txt or disambig.txt.
>>>>>> Probably the test data I am using does not have any OOV in it. I can
>>>>>> add it artificially, but I guess this is not the main problem here...
>>>>>> Regarding the disambig.txt file, what should it contain?
>>>>>>
>>>>>> I did run gdb as you indicated (thank you for such detailed info) and
>>>>>> it gives me:
>>>>>> (gdb) p subsequential_symbol
>>>>>> $1 = 97
>>>>>> (gdb) p disambig_syms_.count(subsequential_symbol)
>>>>>> $2 = 0
>>>>>> (gdb) p phone_syms_.count(subsequential_symbol)
>>>>>> $3 = 1
>>>>>> (gdb) p phone_syms_.size()
>>>>>> $4 = 78
>>>>>> (gdb) p disambig_syms_.size()
>>>>>> $5 = 0
>>>>>>
>>>>>> Finally, the output of cat data/lang_test_phn-mono/phones/disambig.int
>>>>>> is also empty.
>>>>>>
>>>>>> Thanks again for your help!
>>>>>>
>>>>>> yours,
>>>>>>
>>>>>> Xavier Anguera
>>>>>>
>>>>>> On Fri, Dec 27, 2013 at 10:26 PM, Daniel Povey <dp...@gm...> wrote:
>>>>>>
>>>>>>> Could you please do the following. [apologies if you already know
>>>>>>> gdb]
>>>>>>>
>>>>>>> First do utils/validate_lang.pl data/lang_test_phn-mono/
>>>>>>> and let me know if it fails.
>>>>>>> If it doesn't fail, do:
>>>>>>>
>>>>>>> gdb --args fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_phn-mono/phones/disambig.int --write-disambig-syms=data/lang_test_phn-mono/tmp/disambig_ilabels_1_0.int data/lang_test_phn-mono/tmp/ilabels_1_0 data/lang_test_phn-mono/tmp/LG.fst
>>>>>>>
>>>>>>> (gdb) r
>>>>>>> # wait till it crashes
>>>>>>> # go up the stack by typing "up" until you get to the right frame;
>>>>>>> # type "down" if you go too far
>>>>>>>
>>>>>>> (gdb) p subsequential_symbol
>>>>>>> (gdb) p disambig_syms_.count(subsequential_symbol)
>>>>>>> (gdb) p phone_syms_.count(subsequential_symbol)
>>>>>>> (gdb) p phone_syms_.size()
>>>>>>> (gdb) p disambig_syms_.size()
>>>>>>> (gdb) quit
>>>>>>>
>>>>>>> [I hope this works; sometimes it will fail because functions are
>>>>>>> inlined].
>>>>>>> Anyway, send the output, and also
>>>>>>> cat data/lang_test_phn-mono/phones/disambig.int
>>>>>>> and show me that output too.
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>> On Fri, Dec 27, 2013 at 10:23 AM, Xavier Anguera <xan...@gm...> wrote:
>>>>>>>
>>>>>>>> Dear all,
>>>>>>>> I am encountering a problem when training a mono-state NN using a
>>>>>>>> recipe adapted from the SWBD S5 recipe. I am able to train, decode and
>>>>>>>> phone-align a GMM system, but when I use these results to train the NN I
>>>>>>>> get the following error (see below). I have used this recipe in the past to
>>>>>>>> successfully train one ASR system, and now the only difference is that I am
>>>>>>>> trying to train a similar system using graphemes as phonemes (for which I
>>>>>>>> have assigned the graphemes of the words as transcriptions to each word).
>>>>>>>> Any help is appreciated.
>>>>>>>>
>>>>>>>> This is the beginning of the file exp/mono/graph/mkgraph_phn.log:
>>>>>>>>
>>>>>>>> # utils/mkgraph.sh --mono data/lang_test_phn-mono exp/mono exp/mono/graph_phn
>>>>>>>> # Started at Fri Dec 27 18:57:19 CET 2013
>>>>>>>> #
>>>>>>>> fsttablecompose data/lang_test_phn-mono/L_disambig.fst data/lang_test_phn-mono/G.fst
>>>>>>>> fstdeterminizestar --use-log=true
>>>>>>>> fstminimizeencoded
>>>>>>>> fstisstochastic data/lang_test_phn-mono/tmp/LG.fst
>>>>>>>> 0.000358155 -0.000356635
>>>>>>>> fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_phn-mono/phones/disambig.int --write-disambig-syms=data/lang_test_phn-mono/tmp/disambig_ilabels_1_0.int data/lang_test_phn-mono/tmp/ilabels_1_0
>>>>>>>> fstcomposecontext: ../fstext/context-fst-inl.h:105: fst::ContextFstImpl<Arc, LabelT>::ContextFstImpl(typename Arc::Label, const std::vector<B, std::allocator<_T2> >&, const std::vector<B, std::allocator<_T2> >&, int, int) [with Arc = fst::ArcTpl<fst::TropicalWeightTpl<float> >, LabelT = int]: Assertion `subsequential_symbol != 0 && disambig_syms_.count(subsequential_symbol) == 0 && phone_syms_.count(subsequential_symbol) == 0' failed.
>>>>>>>> utils/mkgraph.sh: line 76:  6263 Aborted    fstcomposecontext --context-size=$N --central-position=$P --read-disambig-syms=$lang/phones/disambig.int --write-disambig-syms=$lang/tmp/disambig_ilabels_${N}_${P}.int $lang/tmp/ilabels_${N}_${P} < $lang/tmp/LG.fst > $clg
>>>>>>>> fstisstochastic data/lang_test_phn-mono/tmp/CLG_1_0.fst
>>>>>>>> ERROR: FstHeader::Read: Bad FST header: data/lang_test_phn-mono/tmp/CLG_1_0.fst
>>>>>>>> ERROR (fstisstochastic:ReadFstKaldi():fstext/fstext-utils-inl.h:1183) Reading FST: error reading FST header from data/lang_test_phn-mono/tmp/CLG_1_0.fst
>>>>>>>> ERROR (fstisstochastic:ReadFstKaldi():fstext/fstext-utils-inl.h:1183) Reading FST: error reading FST header from data/lang_test_phn-mono/tmp/CLG_1_0.fst
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Kaldi-developers mailing list
>>>>>>>> Kal...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
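The requirement stated in the thread, that the lang directory's disambig.txt must be non-empty and contain at least #0 and #1, can be checked quickly before running mkgraph.sh. The following is a minimal sketch, not a replacement for utils/validate_lang.pl: the function name `check_disambig` is hypothetical, and it assumes phones/disambig.txt lists one symbol per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): report whether
# <lang>/phones/disambig.txt exists, is non-empty, and lists #0 and #1.
# Assumption: the file holds one disambiguation symbol per line.
check_disambig() {
  local file="$1/phones/disambig.txt"
  local sym
  # -s is true only if the file exists and has a size greater than zero.
  if [ ! -s "$file" ]; then
    echo "missing-or-empty"
    return 1
  fi
  for sym in '#0' '#1'; do
    # -x: match the whole line, -F: treat the symbol as a fixed string.
    if ! grep -qxF -- "$sym" "$file"; then
      echo "missing $sym"
      return 1
    fi
  done
  echo "ok"
}

# Example call (directory name taken from the thread):
#   check_disambig data/lang_test_phn-mono
```

For the directory described in the thread, where disambig.txt was left empty, the first branch would be the one reported; the authoritative check remains utils/validate_lang.pl.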