grammar-based language model

Help
gary
2015-07-21
2015-07-22
  • gary

    gary - 2015-07-21

    Dear all

    I wrote the grammar below:
    <s> = <hi> <names>;
    <hi> = hi | hello;
    <names> = gary | mary | sophie | tony | scott;

    and converted it to the following FST (OpenFst text format):
    0 1 hi hi
    0 1 hello hello
    1 2 gary gary
    1 2 mary mary
    1 2 sophie sophie
    1 2 tony tony
    1 2 scott scott
    2 0
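
For reference, the expansion from the grammar to the arc list above can be sketched in Python. This is only an illustration, not part of any Kaldi recipe: the function name `grammar_to_fst` and the list-of-slots representation are assumptions made here. It relies on the OpenFst text convention that an arc line is `src dst isym osym` and a line holding a state followed by a weight marks a final state.

```python
def grammar_to_fst(slots):
    """Expand a fixed-length grammar into OpenFst text-format lines.

    `slots` is a list of word lists, one per sentence position
    (hypothetical representation chosen for this sketch).
    """
    lines = []
    for state, words in enumerate(slots):
        for word in words:
            # Acceptor arc: same symbol on input and output side.
            lines.append(f"{state} {state + 1} {word} {word}")
    # Final state, written as "state weight" with weight 0.
    lines.append(f"{len(slots)} 0")
    return lines


hi = ["hi", "hello"]
names = ["gary", "mary", "sophie", "tony", "scott"]
fst_lines = grammar_to_fst([hi, names])
print("\n".join(fst_lines))
```

The printed lines are the same eight lines as the hand-written FST, so every accepted path is one greeting followed by one name.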

    I used the command below to compile the text FST to binary format:
    cat text.fst | epsdisambig.pl | s2eps.pl | fstcompile --isymbols=isyms.txt --osymbols=osyms.txt --keep_isymbols=false --keep_osymbols=false | fstrmepsilon > G.fst

    I drew a picture of G.fst:
    http://i.imgur.com/s1F5CCL.png

    Then I followed kaldi/egs/yesno to prepare the input files lexicon.txt and lexicon_nosil.txt.
    Before calling run.sh to generate HCLG.fst, I substituted my own G.fst instead of one built from an ARPA file.

    My lexicon.txt:
    <SIL> SIL
    hi HH AY1
    hello HH AH0 L OW1
    gary G EH1 R IY0
    mary M EH1 R IY0
    sophie S OW1 F IY0
    tony T OW1 N IY0
    scott S K AA1 T

    When decoding, if I say "hi sophie", I get the answer "gary sophie".
    But my grammar has no such rule. Is something wrong with my G.fst, or is something else wrong?
    Thanks.

     

    Last edit: gary 2015-07-21
    • Daniel Povey

      Daniel Povey - 2015-07-21

      Everything looks right in what you described. Possibly there was a
      mismatch in a words.txt somewhere, or maybe you somehow ended up decoding
      using a different G.fst than what you described in your post.
      Dan

  • gary

    gary - 2015-07-22

    Dear Dan

    Thank you very much.
    I solved this problem.
    The cause was as you said: a words.txt mismatch between L.fst and G.fst.
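
A mismatch of this kind can be caught before building HCLG.fst by comparing the symbol table used to compile G.fst with the words.txt used for L.fst. The sketch below assumes both tables are in OpenFst text format (one `symbol id` pair per line). The helper names `read_symbols` and `symbol_mismatches` are hypothetical, and the two example tables are invented for illustration; they are not the actual files from this thread.

```python
def read_symbols(text):
    """Parse an OpenFst-style symbol table ("symbol id" per line) into a dict."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = int(parts[1])
    return table


def symbol_mismatches(table_a, table_b):
    """Return (symbol, id_in_a, id_in_b) for every symbol whose id differs
    between the two tables or that is missing from one of them (id None)."""
    problems = []
    for sym in sorted(set(table_a) | set(table_b)):
        if table_a.get(sym) != table_b.get(sym):
            problems.append((sym, table_a.get(sym), table_b.get(sym)))
    return problems


# Invented example: the two tables disagree on the ids of "hi" and "gary".
words_for_L = read_symbols("<eps> 0\ngary 1\nhello 2\nhi 3\n")
words_for_G = read_symbols("<eps> 0\nhi 1\nhello 2\ngary 3\n")
mismatches = symbol_mismatches(words_for_L, words_for_G)
print(mismatches)
```

With tables like these, the id that G.fst uses for "hi" is printed as "gary" when looked up in the other table, which would look like the symptom described in the first post. An empty result means the two tables agree.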