Kaldi / Discussion / Help: decode_biglm.sh No arc available in LM

L-A - 2015-07-11

Hi,
I am having trouble using the script "decode_biglm.sh".

time ./decode_biglm.sh.org tri2b_mmi_1200/graph_80k_173 lang_80k_tg173/G.fst lang_80k_tg172/G.fst test_utt tri2b_mmi_1200/test
WARNING (gmm-latgen-biglm-faster:PropagateLm():decoder/lattice-biglm-faster-decoder.h:673) No arc available in LM (unlikely to be correct if a statistical language model); will not warn again
KALDI_ASSERT: at gmm-latgen-biglm-faster:PruneForwardLinks:decoder/lattice-biglm-faster-decoder.h:394, failed: link_extra_cost == link_extra_cost

The new LM and the old LM differ in cutoff size; the vocabulary size is the same.
Should I remove the disambiguation symbol #0 after building G.fst?
Please give me some suggestions, thanks!

- Gilles Boulianne - 2015-07-11
  
  Does your new lm contain all unigrams that were present in the old lm?
  Or maybe your words.txt file was changed inadvertently between compiling the old and the new lm?
  The decoder is looking for a word probability (following a history) and cannot find it in the new LM.
  A correct LM is supposed to provide such a word probability for any word that appears in the old LM.
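
  One way to test the unigram question above is to compare the unigram sections of the two ARPA files directly. The following is a minimal sketch, assuming both LMs are available as plain-text ARPA files; the function names `arpa_unigrams` and `missing_unigrams` are illustrative helpers, not part of Kaldi.

  ```shell
  # Print the sorted unigram vocabulary of an ARPA-format LM (hypothetical helper).
  arpa_unigrams() {
    awk '
      /^\\1-grams:/ { in_uni = 1; next }   # start of the unigram section
      /^\\/         { in_uni = 0 }         # any other section header ends it
      in_uni && NF >= 2 { print $2 }       # fields: logprob, word, [backoff]
    ' "$1" | LC_ALL=C sort -u
  }

  # Print words that have a unigram in the old LM ($1) but not in the new LM ($2).
  missing_unigrams() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    arpa_unigrams "$1" > "$old_list"
    arpa_unigrams "$2" > "$new_list"
    LC_ALL=C comm -23 "$old_list" "$new_list"   # lines only in the old list
    rm -f "$old_list" "$new_list"
  }
  ```

  Calling `missing_unigrams old.arpa new.arpa` (file names assumed) prints nothing if every unigram of the old LM is also in the new one. Running `cmp` on the words.txt files used for each G.fst would cover the second question.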
  

L-A - 2015-07-12

Hi,

Thanks for your replies.
I'm sure that the new LM contains all unigrams that are present in the old LM.
The words.txt is the same.
I have successfully decoded with the new LM and with the old LM separately, using the same words.txt, with "gmm-latgen-faster".
But I still have trouble with "gmm-latgen-biglm-faster".

- Daniel Povey - 2015-07-12
  
  How did you create the LM?
  Also, you might want to run it in a debugger and figure out which word
  it was complaining about. (It will be an integer, but you can look it up
  in words.txt.)
  Dan
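
  The lookup from integer id back to word can be sketched as follows, assuming words.txt uses the usual two-column "word id" layout; `word_for_id` is a hypothetical helper name, not a Kaldi script.

  ```shell
  # Print the word whose integer id (second column of words.txt) equals $1.
  # $1: integer id reported by the debugger, $2: path to words.txt
  word_for_id() {
    awk -v id="$1" '$2 == id { print $1 }' "$2"
  }
  ```

  For example, `word_for_id 1234 words.txt` (the id 1234 is made up) prints the matching word, or nothing if the id is not in the table.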
  
  - L-A - 2015-07-13
    
    Hi,
    
    The LM is created in ARPA format and converted to an FST with the following steps.
    
    cat lm.arpa | \
      grep -v '<s> <s>' | \
      grep -v '</s> <s>' | \
      grep -v '</s> </s>' | \
      arpa2fst - | \
      fstprint | \
      eps2disambig.pl | \
      s2eps.pl | \
      fstcompile --isymbols=words.txt --osymbols=words.txt \
        --keep_isymbols=false --keep_osymbols=false | \
      fstrmepsilon > G.fst
    
    - Daniel Povey - 2015-07-13
      
      Yes, what I meant is, how did you create the ARPA-format LM? Also,
      make sure your Kaldi code is fully up to date; arpa2fst now does more
      checking, which might catch certain problems.
      You should get in a debugger and figure out which word is involved. Do
        gdb --args (program) (args)
      Commands you will need include:
        run (r)
        catch throw
        continue (c)
        up
        list
        print (e.g. print some-c++-expression)
      
      Dan
      
