Kaldi / Discussion / Help: Grammar with one-word sentence

Konstantinos Themelis - 2015-07-01

Hello everyone,

I am using the Kaldi toolkit to train HMMs that classify acoustic events. I have only isolated acoustic events, so I model each event as one word with a single monophone. For example, I have the word "knock" and its monophone is KN. But when I train my HMM model and then move to decoding, I get a WER above 40%. In the exp/mono0a/decode/scoring directory I see that some acoustic files have been scored with more than one event.
For example: "file1 knockdoor phoreringing", but I would like each file to be scored with one event or none.
As a result I get insertion and deletion errors, which lead to a bad WER.
I define my grammar in the task.arpabo file like this:

\data\ ngram l=8
-1 knock
-1 doorslam
-1 steps
-1 chairmoving
-1 spoon # these are the words #
-1 paperwork
-1 keyjingle
-1 speech
-99 <S>
-1 </S>

\end\

I also tried adding a word insertion penalty at the decoding stage with the option "--word_ins_penalty". I tried the values 5, 10, 100 and 200, but nothing seems to eliminate the insertion and deletion errors.
My question: How can I force my grammar to produce sentences of only one word?

Your help would be greatly appreciated, as I have been stuck on this for days!
Thanks in advance!

(P.S. I followed the yesno example.)

Last edit: Konstantinos Themelis 2015-07-01

- Daniel Povey - 2015-07-01
  
  When you say "scoring", I think what you are really talking about is
  decoding. Scoring is the process of computing WERs from the outputs.
  If you want a grammar that outputs exactly one word per sentence, you
  should construct it as an FST (it will actually be a finite state
  acceptor as the input and output symbols will be the same). Something
  like:
  
  0 1 knock knock 0.0
  0 1 ring ring 0.0
  [etc.]
  1 0.0
  
  The 0.0 means zero cost; these values are interpreted as negative
  log-probabilities, so they would normally be positive.
  
  Dan
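
As a rough illustration of the acceptor described above, the following Python sketch generates the text-form FST for the eight event labels from the original post. The function name `one_word_grammar` is hypothetical and not part of Kaldi or OpenFst; the sketch only assumes the text layout shown in the reply (arc lines of the form "src dst input output cost" and a final-state line "state cost").

```python
# Event labels taken from the grammar in the original post.
EVENTS = ["knock", "doorslam", "steps", "chairmoving",
          "spoon", "paperwork", "keyjingle", "speech"]


def one_word_grammar(words, cost=0.0):
    """Return text-form FST lines accepting exactly one word from `words`.

    Hypothetical helper for illustration only.
    """
    # One arc per word from start state 0 to state 1:
    # "src dst input-label output-label cost".
    lines = [f"0 1 {w} {w} {cost}" for w in words]
    # State 1 is the only final state and has no outgoing arcs,
    # so an accepted path can never contain a second word.
    lines.append("1 0.0")
    return lines


if __name__ == "__main__":
    print("\n".join(one_word_grammar(EVENTS)))
```

Because no arc leaves state 1, every accepted path carries exactly one word label, which is what rules out multi-word hypotheses at the grammar level.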
  
  - Daniel Povey - 2015-07-01
    
    BTW, I am showing the text-form FST with words instead of integer
    symbols. The RM setup might be a good example to show you how to turn
    this into a binary FST using fstcompile. Also see the tutorial at
    openfst.org.
    Dan
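
To make the distinction between word symbols and integer symbols concrete, here is a small Python sketch that maps the word labels of a text-form FST to integer ids using a symbol table. This is the same mapping that fstcompile performs when given the --isymbols and --osymbols options. The helper names and the tiny symbol table below are assumptions for illustration; in a Kaldi setup the table would normally be the words.txt file of the lang directory.

```python
def load_symbols(table_text):
    """Parse a symbol table with one 'symbol integer-id' pair per line."""
    symbols = {}
    for line in table_text.splitlines():
        if line.strip():
            sym, idx = line.split()
            symbols[sym] = int(idx)
    return symbols


def to_integer_fst(fst_lines, symbols):
    """Replace the word labels on arc lines with their integer ids."""
    out = []
    for line in fst_lines:
        fields = line.split()
        if len(fields) >= 4:
            # Arc line: src dst input-label output-label [cost]
            fields[2] = str(symbols[fields[2]])
            fields[3] = str(symbols[fields[3]])
        # Final-state lines ("state [cost]") pass through unchanged.
        out.append(" ".join(fields))
    return out


# Hypothetical symbol table, for illustration only.
WORDS_TXT = """\
<eps> 0
knock 1
ring 2
"""

if __name__ == "__main__":
    fst = ["0 1 knock knock 0.0", "0 1 ring ring 0.0", "1 0.0"]
    for line in to_integer_fst(fst, load_symbols(WORDS_TXT)):
        print(line)
```

In practice the conversion and binary compilation happen in one step, along the lines of `fstcompile --isymbols=words.txt --osymbols=words.txt --keep_isymbols=false --keep_osymbols=false G.txt G.fst`, where G.txt and G.fst are placeholder file names.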
    