Kaldi / Discussion / Help: Getting phone posteriors with Kaldi

Horia Cucu - 2014-08-11

Hi all,

First of all, I must mention that this is my first contact with Kaldi. I have some experience with other speech recognition toolkits (HTK, Sphinx) and have used them for small and large vocabulary ASR tasks.

I haven't installed anything yet and I'm not quite sure where to begin, but my goal for now is to create posterior features for a speech database using Kaldi.

Can you give me some guidelines on how to begin?

Thanks,
Horia

- Daniel Povey - 2014-08-11
  
  Can you say what you intend to use these posterior features for?
  Dan
  
  - Horia Cucu - 2014-08-12
    
    I want to use them for spoken term detection. My experiments cover two
    scenarios:
    a) Spoken term detection based on phone posterior features
    b) Spoken term detection based on the actual phones (the 1-best hypothesis
    string of phones)
    
    Horia
    
    - Daniel Povey - 2014-08-12
      
      It would probably be better to generate a lattice, possibly a phone-level
      lattice, and do keyword search on the lattice. We already have
      keyword-search code in Kaldi that was used for the BABEL project (see
      egs/babel/s5b), but the setup is fairly complicated. There is also an
      example script for keyword search in the WSJ example, but I don't know how
      recently it has been tested, and it probably does not handle words that
      are not in the vocabulary.
      
      To generate a phone-level lattice you could do one of two things:
      a) Convert a word lattice to a phone lattice using lattice-align-phones
      with --replace-output-symbols=true. This will only contain phone
      sequences that correspond to actual word sequences.
      b) Generate a language model at the phone level and create a decoding
      graph from it. This is probably only practical if you have a system
      without word-position-dependent phones (--position-dependent-phones false
      to prepare_lang.sh), and I'm afraid a script doesn't currently exist for
      it, at least in the checked-in code, although it should be doable.
      
      If you really want phone-posterior features, not from a lattice, one way
      to do it is to train a neural net to get the posteriors of
      context-dependent states, evaluate the neural net using nnet-forward or
      nnet-compute (for the nnet1 or nnet2 setup, respectively), convert to
      pdf-level posteriors using logprob-to-post or prob-to-post, then convert
      to phone-level posteriors using post-to-phone-post.
      Guoguo may want to add more regarding the keyword search.
      Dan
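
The last step described above, going from pdf-level posteriors to phone-level posteriors, amounts to summing, for each frame, the posteriors of all pdfs that belong to the same phone. Below is a minimal sketch of that idea in plain Python, not using any Kaldi tool. It assumes each pdf maps to exactly one phone, which need not hold in a real Kaldi decision tree; the function name `pdf_post_to_phone_post`, the `pdf_to_phone` mapping, and the toy numbers are all hypothetical and only illustrate the summation.

```python
from collections import defaultdict


def pdf_post_to_phone_post(pdf_post, pdf_to_phone):
    """Sum pdf-level posteriors into phone-level posteriors, frame by frame.

    pdf_post: list of frames; each frame is a list of (pdf_id, posterior) pairs.
    pdf_to_phone: dict mapping pdf_id -> phone label (assumed one phone per pdf).
    Returns a list of dicts, one per frame, mapping phone label -> posterior.
    """
    phone_post = []
    for frame in pdf_post:
        acc = defaultdict(float)
        for pdf_id, p in frame:
            # Accumulate the posterior mass of every pdf owned by this phone.
            acc[pdf_to_phone[pdf_id]] += p
        phone_post.append(dict(acc))
    return phone_post


# Toy data (hypothetical): 4 pdfs shared between 2 phones, 2 frames.
pdf_to_phone = {0: "a", 1: "a", 2: "b", 3: "b"}
pdf_post = [
    [(0, 0.5), (1, 0.25), (2, 0.25)],
    [(2, 0.125), (3, 0.875)],
]

phone_post = pdf_post_to_phone_post(pdf_post, pdf_to_phone)
for t, frame in enumerate(phone_post):
    print(t, frame)
```

Each output frame still sums to the same total mass as the input frame, so the result can be used directly as a per-frame phone posterior feature vector once the phones are put in a fixed order.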
      
