Grammar with one-word sentence

2015-07-01
  • Konstantinos Themelis

    Hello everyone,

    I am using Kaldi to train an HMM that classifies acoustic events. All of my events are isolated, so I model each event as one word consisting of a single monophone. For example, I have the word "knock" and its monophone is KN. But when I train my HMM and then move on to decoding, I get a WER above 40%. In the exp/mono0a/decode/scoring directory I see that some acoustic files have been scored with more than one event.
    For example: "file1 knockdoor phoreringing", but I would like each file to be scored with one event or none.
    As a result I get insertion and deletion errors, which lead to a bad WER.
    I define my grammar in the task.arpabo file like this:


    \data\ ngram l=8
    -1 knock
    -1 doorslam
    -1 steps
    -1 chairmoving
    -1 spoon # these are the words #
    -1 paperwork
    -1 keyjingle
    -1 speech
    -99 <S>
    -1 </S>

    \end\

    I also tried adding a word insertion penalty at the decoding stage with the option "--word_ins_penalty". I tried the values 5, 10, 100 and 200, but nothing seems to eliminate the insertion and deletion errors.
    My question: How can I force my grammar to produce sentences of only one word?

    Your help would be greatly appreciated, as I have been stuck on this for days!
    Thanks in advance!

    (P.S. I followed the yesno example.)

     

    Last edit: Konstantinos Themelis 2015-07-01
    • Daniel Povey

      Daniel Povey - 2015-07-01

      When you say "scoring", I think what you are really talking about is
      decoding. Scoring is the process of computing WERs from the outputs.
      If you want a grammar that outputs exactly one word per sentence, you
      should construct it as an FST (it will actually be a finite state
      acceptor as the input and output symbols will be the same). Something
      like:

      0 1 knock knock 0.0
      0 1 ring ring 0.0
      [etc.]
      1 0.0

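      Each arc line reads: source state, destination state, input word,
      output word, cost. The last line marks state 1 as final. The 0.0 means
      zero cost; these values are interpreted as negative log-probs and
      would normally be positive.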
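
A minimal sketch of how the text-form FST above could be generated for the full event list. The word list is copied from the grammar in the question; the function name `one_word_grammar` and the `cost` parameter are illustrative assumptions, not part of Kaldi or OpenFst.

```python
# Event labels copied from the task.arpabo grammar in the question.
EVENTS = ["knock", "doorslam", "steps", "chairmoving",
          "spoon", "paperwork", "keyjingle", "speech"]


def one_word_grammar(words, cost=0.0):
    """Return text-form FST lines that accept exactly one word.

    Every word gets its own arc from state 0 to state 1, and state 1 is
    the only final state, so no path can contain two words.
    `cost` is the per-arc cost (a negative log-probability).
    """
    lines = [f"0 1 {w} {w} {cost}" for w in words]
    lines.append("1 0.0")  # final state with zero final cost
    return lines


if __name__ == "__main__":
    # Print the grammar; redirect to a file such as G.txt if desired.
    print("\n".join(one_word_grammar(EVENTS)))
```

Because there is no arc leaving state 1 and none returning to state 0, the decoder cannot output a second word, which is the constraint a word insertion penalty can only approximate.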

      Dan

       
      • Daniel Povey

        Daniel Povey - 2015-07-01

        BTW, I am showing the text-form FST with words instead of integer
        symbols. The RM setup might be a good example to show you how to turn
        this into a binary FST using fstcompile. Also see the tutorial at
        openfst.org.
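
A rough sketch of the compile step, written as a helper that only assembles the shell pipeline rather than running it. The file paths are hypothetical placeholders, and the helper name `fstcompile_pipeline` is an assumption for illustration; the flags shown are standard `fstcompile` and `fstarcsort` options, but the RM recipe itself should be checked for the exact invocation it uses.

```python
import shlex


def fstcompile_pipeline(grammar_txt, words_txt, out_fst):
    """Build a shell pipeline that compiles a text-form FST to binary.

    words_txt maps each word to its integer id; it is used for both the
    input and output symbols because the grammar is an acceptor.
    """
    compile_cmd = [
        "fstcompile",
        f"--isymbols={words_txt}",
        f"--osymbols={words_txt}",
        "--keep_isymbols=false",   # do not embed symbol tables in the FST
        "--keep_osymbols=false",
        grammar_txt,
    ]
    sort_cmd = ["fstarcsort", "--sort_type=ilabel"]
    return (f"{shlex.join(compile_cmd)} | {shlex.join(sort_cmd)} "
            f"> {shlex.quote(out_fst)}")


if __name__ == "__main__":
    # Hypothetical paths; adjust them to the actual lang directory.
    print(fstcompile_pipeline("G.txt", "data/lang/words.txt",
                              "data/lang/G.fst"))
```

Every word used in the grammar has to appear in the symbol table passed to `fstcompile`, otherwise compilation fails on the unknown symbol.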
        Dan
