Kaldi / Discussion / Developers: lattice-mbr-decode sausage stats

Patrick L. Lange - 2015-07-05

Hi everyone,

We are currently using lattice-mbr-decode to create a confusion network, and we are wondering whether lattice-mbr-decode considers the full lattice or prunes out certain alternatives that lattice-oracle does consider.

Furthermore, is there an easier way to get the word-level confidence scores for the oracle hypothesis than to somehow parse the sausage stats? Something like lattice-to-ctm-conf, which only considers the best path?

Lastly, I noticed a difference between the times produced by lattice-to-ctm-conf and lattice-mbr-decode. The times correspond to a non-integer number of frames. Is this due to averaging the frame count over the considered alternatives?

Any input would be greatly appreciated.

Patrick

- Guoguo Chen - 2015-07-07
  
  On Sun, Jul 5, 2015 at 12:47 PM, Patrick Lange <langep@users.sf.net> wrote:
  
  Furthermore, is there an easier way to get the word-level confidence
  scores for the oracle hypothesis than to somehow parse the sausage stats?
  Something like lattice-to-ctm-conf, which only considers the best path?
  
  I just committed a change for this. Please update your repository and try
  the new lattice-to-ctm-conf program. Let me know if it works or not.
  
  Guoguo
  
  - Patrick L. Lange - 2015-07-08
    
    When I run the following commands:
    
    # Try to save the oracle path in 0000000006590022-VC787776.oraclelat
    cat development_set.reference_transcription_kaldi_format | \
      sed 's:<NOISE>::g' | \
      scripts/sym2int.pl --ignore-first-field ./graph/words.txt | \
      ~/halef-cassandra/kaldi-070615/src/latbin/lattice-oracle \
        --write-lattices=ark,t:0000000006590022-VC787776.oraclelat \
        --word-symbol-table=./graph/words.txt \
        ark:./lats/0000000006590022-VC787776.lat ark:- ark,t:tmp.txt
    
    ~/halef-cassandra/kaldi-070615/src/latbin/lattice-to-ctm-conf \
      --acoustic-scale=0.1 --decode-mbr=false \
      ark:./lats/0000000006590022-VC787776.lat \
      ark:0000000006590022-VC787776.oraclelat tmp.ctm
    
    I get an empty tmp.ctm file.
    
    When I try to store the oracle path as a binary archive (ark:0000000006590022-VC787776.oraclelat), I get:
    
    WARNING (lattice-to-ctm-conf:Read():util/kaldi-holder-inl.h:255) BasicVectorHolder::Read, could not interpret line: ▒▒~
    
    Furthermore, I have not been able to extract the oracle path this way from a lattice aligned with lattice-align-word-lexicon.
    
    Am I doing something wrong?
    
    Patrick
    
  - Patrick L. Lange - 2015-07-08
    
    I think that I am using the wrong format for <1best-rspecifier> in lattice-to-ctm-conf.
    
    - Guoguo Chen - 2015-07-08
      
      What you need to provide to lattice-to-ctm-conf is your tmp.txt, i.e.,
      the <transcriptions-wspecifier> from lattice-oracle.
      
      Guoguo
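
For readers following along, here is a minimal Python sketch of how the corrected invocation could be assembled, assuming the 3-argument order used earlier in this thread (lattice rspecifier, 1-best rspecifier, CTM output). The helper name `ctm_conf_command` is hypothetical; the flags and file names are the ones from the commands posted above, and the sketch only builds the argument list rather than running Kaldi.

```python
import shlex


def ctm_conf_command(lattice_rspecifier, one_best_rspecifier, ctm_out,
                     acoustic_scale=0.1, binary="lattice-to-ctm-conf"):
    """Build (but do not run) a lattice-to-ctm-conf argument list.

    Hypothetical helper: the second positional argument is the integer
    transcription archive written by lattice-oracle (tmp.txt in this
    thread), not the oracle lattice itself.
    """
    return [
        binary,
        f"--acoustic-scale={acoustic_scale}",
        "--decode-mbr=false",
        lattice_rspecifier,
        one_best_rspecifier,
        ctm_out,
    ]


cmd = ctm_conf_command(
    "ark:./lats/0000000006590022-VC787776.lat",
    "ark:tmp.txt",   # transcriptions from lattice-oracle
    "tmp.ctm",
)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where the Kaldi binary is on the PATH.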
      
      - Patrick L. Lange - 2015-07-08
        
        This is a sample of my output. It looks like it is working as expected. Every word with a confidence of 1 in the best path is also showing up in the oracle with a confidence of 1. This is perfect. Thank you.
        
        /1best.ctm
        0000000006590022-VC787776 1 0.48 1.00 9758 0.48
        0000000006590022-VC787776 1 1.49 0.39 22646 0.93
        0000000006590022-VC787776 1 2.05 0.10 20757 0.56
        0000000006590022-VC787776 1 2.15 0.30 9145 0.49
        0000000006590022-VC787776 1 2.44 0.34 20886 0.93
        0000000006590022-VC787776 1 2.78 0.41 20743 1.00
        0000000006590022-VC787776 1 3.26 0.44 21627 1.00
        0000000006590022-VC787776 1 4.22 0.48 21574 0.85
        0000000006590022-VC787776 1 4.80 0.32 21574 0.88
        
        /oracle.ctm
        0000000006590022-VC787776 1 1.06 0.67 22714 0.01
        0000000006590022-VC787776 1 1.82 0.21 20757 0.56
        0000000006590022-VC787776 1 2.12 0.25 9145 0.49
        0000000006590022-VC787776 1 2.44 0.34 20886 0.93
        0000000006590022-VC787776 1 2.78 0.41 20743 1.00
        0000000006590022-VC787776 1 3.26 0.44 21627 1.00
        0000000006590022-VC787776 1 4.21 0.48 21574 0.84
        0000000006590022-VC787776 1 4.79 0.32 21574 0.89
        
        Last edit: Patrick L. Lange 2015-07-08
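
The observation above can be checked mechanically. The following is a small Python sketch, not part of Kaldi, that parses the two CTM listings from this post and verifies that every (word, start time) pair with confidence 1.00 in the 1-best output also has confidence 1.00 in the oracle output. The names `parse_ctm` and `confident_keys` are made up for this illustration; the word column holds integer word IDs, as in the sample.

```python
from typing import List, NamedTuple, Set, Tuple


class CtmEntry(NamedTuple):
    utt: str
    channel: str
    start: float
    dur: float
    word: str    # integer word ID, kept as a string
    conf: float


def parse_ctm(text: str) -> List[CtmEntry]:
    """Parse 6-column CTM lines: utt channel start dur word conf."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        utt, ch, start, dur, word, conf = fields
        entries.append(CtmEntry(utt, ch, float(start), float(dur), word, float(conf)))
    return entries


def confident_keys(entries: List[CtmEntry], threshold: float = 1.0) -> Set[Tuple[str, float]]:
    """Return (word, start) pairs whose confidence is at least threshold."""
    return {(e.word, e.start) for e in entries if e.conf >= threshold}


ONE_BEST_CTM = """\
0000000006590022-VC787776 1 0.48 1.00 9758 0.48
0000000006590022-VC787776 1 1.49 0.39 22646 0.93
0000000006590022-VC787776 1 2.05 0.10 20757 0.56
0000000006590022-VC787776 1 2.15 0.30 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.22 0.48 21574 0.85
0000000006590022-VC787776 1 4.80 0.32 21574 0.88
"""

ORACLE_CTM = """\
0000000006590022-VC787776 1 1.06 0.67 22714 0.01
0000000006590022-VC787776 1 1.82 0.21 20757 0.56
0000000006590022-VC787776 1 2.12 0.25 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.21 0.48 21574 0.84
0000000006590022-VC787776 1 4.79 0.32 21574 0.89
"""

one_best = parse_ctm(ONE_BEST_CTM)
oracle = parse_ctm(ORACLE_CTM)

# True when every fully confident 1-best word also appears, fully confident,
# at the same start time in the oracle CTM.
all_confident_words_shared = confident_keys(one_best) <= confident_keys(oracle)
print(all_confident_words_shared)
```

Matching on exact start time is a simplification that happens to hold for this sample; on other data a small time tolerance would probably be needed.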
        

Patrick L. Lange - 2015-07-05

Sorry, I wanted to create this in the help forum. Please move it, because I think it belongs there.

- Daniel Povey - 2015-07-05
  
  It's OK, there is no real difference between help and discuss.
  
  We are currently using lattice-mbr-decode to create a confusion network, and we are wondering whether lattice-mbr-decode considers the full lattice or prunes out certain alternatives that lattice-oracle does consider.
  
  lattice-mbr-decode does consider the full file, but very
  low-probability paths will make very little difference to the result.
  Also, some of the scripts that use lattice-mbr-decode may precede it
  with lattice-prune, in which case not all of the paths will be
  considered (but this will hardly affect the results).
  
  Furthermore, is there an easier way to get the word-level confidence scores for the oracle hypothesis than to somehow parse the sausage stats? Something like lattice-to-ctm-conf, which only considers the best path?
  
  Vassil, do you have time to help with this? What he is asking for is
  possible but would require a little coding.
  In sausages.cc, there is a block that starts with this:
  // Now set R_ to one best in the FST.
  It would be possible to add a new constructor to the MinimumBayesRisk
  class with an FST argument representing the best path you want to
  compute confidences against. It would compute the integer vector R_ as
  the word sequence in the provided FST argument, using the function
  GetLinearSymbolSequence, instead of as the best path of the provided
  lattice. (Note: you would usually want to set do_mbr_ to false to
  prevent updating R_, but this could be made optional for greater
  flexibility.) The program lattice-to-ctm-conf could be modified so it
  takes an optional second argument (i.e. a 3-argument usage) where it
  would take a vector of linear lattices (<1best-lattice-rspecifier>),
  and in that case it would use this other constructor. We should warn
  users that in the 3-argument form it will usually make sense to set
  --decode-mbr to false; otherwise it will simply use the provided
  1-best as a starting point for optimization.
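
To make the idea concrete, here is a toy Python sketch of scoring a caller-supplied word sequence against sausage statistics instead of the lattice's own best path. It is not Kaldi's MinimumBayesRisk class: it assumes the sausage bins are already aligned one-to-one with the hypothesis words, which the real code has to work out through its edit-distance recursion. The function names and the word IDs and posteriors are invented for illustration.

```python
from typing import Dict, List


def one_best(bins: List[Dict[int, float]]) -> List[int]:
    """Pick the highest-posterior word ID in each sausage bin."""
    return [max(b, key=b.get) for b in bins]


def confidences_for(bins: List[Dict[int, float]], hyp: List[int]) -> List[float]:
    """Confidence of each hypothesis word = its posterior in the aligned bin.

    Assumes len(hyp) == len(bins) and that position i of hyp corresponds
    to bin i (a simplifying assumption of this sketch).
    """
    if len(bins) != len(hyp):
        raise ValueError("hypothesis and sausage bins must have equal length")
    return [b.get(word, 0.0) for b, word in zip(bins, hyp)]


# Hypothetical sausage stats: one dict of {word_id: posterior} per bin.
bins = [
    {101: 0.7, 102: 0.3},
    {201: 0.55, 202: 0.45},
    {301: 1.0},
]

best = one_best(bins)                        # default: score the 1-best
best_conf = confidences_for(bins, best)

oracle_hyp = [102, 201, 301]                 # caller-supplied word sequence
oracle_conf = confidences_for(bins, oracle_hyp)

print(best, best_conf)
print(oracle_hyp, oracle_conf)
```

In this toy, a word that wins its bin gets the same confidence whether it is scored as part of the 1-best or as part of the supplied hypothesis; only the positions where the two sequences differ change.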
  
  Lastly, I noticed a difference between the times produced by lattice-to-ctm-conf and lattice-mbr-decode. The times correspond to a non-integer number of frames. Is this due to averaging the frame count over the considered alternatives?
  
  Yes, it is. It's explained in the paper,
  "Minimum Bayes Risk decoding and system combination based on a
  recursion for edit distance", Haihua Xu, Daniel Povey, Lidia Mangu and
  Jie Zhu, Computer Speech and Language, 2011.
  
  Dan
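
As a rough illustration of why averaging produces non-integer frame counts, the sketch below takes a posterior-weighted average of integer frame indices. The alternatives and their posteriors are made up, and the exact weighting used by the MBR code is the one described in the cited paper, which this sketch does not attempt to reproduce.

```python
from typing import List, Tuple


def expected_frame(alternatives: List[Tuple[float, int]]) -> float:
    """Posterior-weighted average of integer frame indices.

    alternatives: (posterior, frame_index) pairs for the competing
    alignments of one word position. Hypothetical helper.
    """
    total = sum(p for p, _ in alternatives)
    if total <= 0:
        raise ValueError("posteriors must sum to a positive value")
    return sum(p * frame for p, frame in alternatives) / total


# Three made-up alternatives for where a word starts, in frames.
starts = [(0.6, 148), (0.3, 150), (0.1, 155)]
avg_start = expected_frame(starts)

# Each alternative starts on a whole frame, but the average is
# about 149.3 frames, which is not an integer.
print(avg_start)
```

When a single alternative carries all of the posterior mass, the average falls back to that alternative's integer frame index.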
  
  - Daniel Povey - 2015-07-06
    
    Just a follow-up on this:
    I realized it doesn't make sense to have the input be a linear FST;
    it is more convenient to have it be a std::vector<int32>.
    Dan
    

Patrick L. Lange - 2015-07-06

Thank you very much for the information.
