We are currently using lattice-mbr-decode to create a confusion network and are wondering whether lattice-mbr-decode considers the full lattice file or itself prunes out certain alternatives that are considered when using lattice-oracle.
Furthermore, is there an easier way to get the word-level confidence scores for the oracle hypothesis than to somehow parse the sausage stats? Something like lattice-to-ctm-conf which only considers the best path?
Lastly, I noticed a difference between the times produced by lattice-to-ctm-conf and lattice-mbr-decode. The times correspond to a non-integer number of frames. Is this due to averaging the frame count over the considered alternatives?
Any input would be greatly appreciated.
Patrick
Furthermore, is there an easier way to get the word-level confidence
scores for the oracle hypothesis than to somehow parse the sausage stats?
Something like lattice-to-ctm-conf which only considers the best path?
I just committed a change for this. Please update your repository and try
the new lattice-to-ctm-conf program. Let me know if it works or not.
Guoguo
It's OK, there is no real difference between help and discuss.
We are currently using lattice-mbr-decode to create a confusion network and are wondering whether lattice-mbr-decode considers the full lattice file or itself prunes out certain alternatives that are considered when using lattice-oracle.
lattice-mbr-decode does consider the full file, but very
low-probability paths will make very little difference to the result.
Also, some of the scripts that use lattice-mbr-decode may precede it
with lattice-prune, in which case not all of the paths will be
considered (but this will hardly affect the results).
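To make the point about low-probability paths concrete, here is a minimal sketch in Python. It is not Kaldi code: the path costs are made-up numbers, the function names are hypothetical, and it assumes a path's posterior is proportional to exp(-cost) and that pruning keeps every path whose cost is within a beam of the best one.

```python
import math

def path_posteriors(costs):
    """Turn path costs (negative log-probabilities) into posteriors."""
    best = min(costs.values())
    weights = {p: math.exp(-(c - best)) for p, c in costs.items()}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

def prune(costs, beam):
    """Keep only the paths whose cost is within `beam` of the best path."""
    best = min(costs.values())
    return {p: c for p, c in costs.items() if c - best <= beam}

# Hypothetical scaled costs for three competing word sequences.
costs = {"a b c": 0.0, "a d c": 1.0, "a e c": 12.0}

full = path_posteriors(costs)
pruned = path_posteriors(prune(costs, beam=8.0))

# Print each path with its posterior before and after pruning.
for path in costs:
    print(path, round(full[path], 6), round(pruned.get(path, 0.0), 6))
```

In this toy case the pruned path carries so little probability mass that the posteriors of the surviving paths barely move, which is why the statistics derived from them change very little.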
Furthermore, is there an easier way to get the word-level confidence scores for the oracle hypothesis than to somehow parse the sausage stats? Something like lattice-to-ctm-conf which only considers the best path?
Vassil, do you have time to help with this? What he is asking for is
possible but would require a little coding.
In sausages.cc, there is a block that starts with this:
// Now set R_ to one best in the FST.
It would be possible to add a new constructor to the MinimumBayesRisk
class with an FST argument representing the best path you want to
compute confidences against. It would compute the integer vector R_
as the word sequence in the provided FST argument, using the function
GetLinearSymbolSequence, instead of as the best path of the provided
lattice. (Note: you would usually want to set do_mbr_ to false to
prevent updating R_, but this could be made optional for greater
flexibility.) The program lattice-to-ctm-conf could be modified so it
takes an optional second argument (i.e. a 3-argument usage) where it
would take a vector of linear lattices (<1best-lattice-rspecifier>),
and in that case it would use this other constructor. We should warn
users that in the 3-argument form it will usually make sense to set
--decode-mbr to false; otherwise it will simply use the provided
1-best as a starting point for optimization.
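As a rough illustration of the idea (not the sausages.cc implementation), the Python sketch below reads confidences for a supplied word sequence out of confusion-network bins instead of for the lattice's own best path. Everything in it is an assumption made for illustration: the bin contents, the word ids, the function names, and in particular the simplification that the supplied sequence already has exactly one entry per bin, whereas the real class aligns the sequence to the lattice with an edit-distance recursion.

```python
from typing import Dict, List

def one_best(bins: List[Dict[int, float]]) -> List[int]:
    """Pick the highest-posterior word id in every bin."""
    return [max(b, key=b.get) for b in bins]

def confidences_for(ref: List[int], bins: List[Dict[int, float]]) -> List[float]:
    """Posterior of each supplied word in its bin (0.0 if it is absent)."""
    if len(ref) != len(bins):
        raise ValueError("this sketch assumes one reference entry per bin")
    return [b.get(word, 0.0) for word, b in zip(ref, bins)]

# Hypothetical bins: word id -> posterior.
bins = [{7: 0.6, 3: 0.4}, {5: 1.0}, {9: 0.55, 2: 0.45}]

best = one_best(bins)      # what the 2-argument usage would score
supplied = [3, 5, 2]       # e.g. an oracle word sequence given by the user

print(best, confidences_for(best, bins))
print(supplied, confidences_for(supplied, bins))
```

Keeping the supplied sequence fixed and only looking up its posteriors corresponds to leaving MBR updating off; with updating on, the supplied sequence would only be the starting point.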
Lastly, I noticed a difference between the times produced by lattice-to-ctm-conf and lattice-mbr-decode. The times correspond to a non-integer number of frames. Is this due to averaging the frame count over the considered alternatives?
Yes, it is. It's explained in the paper,
"Minimum Bayes Risk decoding and system combination based on a
recursion for edit distance", Haihua Xu, Daniel Povey, Lidia Mangu and
Jie Zhu, Computer Speech and Language, 2011.
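A small Python sketch shows why averaging gives non-integer frame counts. The posteriors and frame indices are invented, and the 0.01 s frame shift and the function name are assumptions; the paper describes the actual computation.

```python
def expected_frame(alternatives):
    """Posterior-weighted average of integer frame indices.

    `alternatives` is a list of (posterior, frame) pairs for the same
    word boundary as it appears on different lattice paths.
    """
    total = sum(p for p, _ in alternatives)
    return sum(p * f for p, f in alternatives) / total

# Hypothetical start frames of one word on three competing paths.
starts = [(0.7, 48), (0.2, 49), (0.1, 51)]

frame = expected_frame(starts)
frame_shift = 0.01  # seconds per frame (assumed)
print(frame, frame * frame_shift)
```

Each individual path starts the word on a whole frame, but the weighted average falls between frames, so the time written to the output is generally not a multiple of the frame shift.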
Dan
On Sun, Jul 5, 2015 at 9:49 AM, Patrick Lange langep@users.sf.net wrote:
Sorry, I wanted to create this in the help forum. Please move it because I
think it does belong there.
On Sun, Jul 5, 2015 at 12:47 PM, Patrick Lange langep@users.sf.net wrote:
Guoguo
When I run the following commands:
/try to save the oracle path in 0000000006590022-VC787776.oraclelat
cat development_set.reference_transcription_kaldi_format | \
  sed 's:<NOISE>::g' | \
  scripts/sym2int.pl --ignore-first-field ./graph/words.txt | \
  ~/halef-cassandra/kaldi-070615/src/latbin/lattice-oracle \
    --write-lattices=ark,t:0000000006590022-VC787776.oraclelat \
    --word-symbol-table=./graph/words.txt \
    ark:./lats/0000000006590022-VC787776.lat ark:- ark,t:tmp.txt

~/halef-cassandra/kaldi-070615/src/latbin/lattice-to-ctm-conf \
    --acoustic-scale=0.1 --decode-mbr=false \
    ark:./lats/0000000006590022-VC787776.lat \
    ark:0000000006590022-VC787776.oraclelat tmp.ctm

I get an empty tmp.ctm file.
When I try to store the oracle path as binary ark:0000000006590022-VC787776.oraclelat
I get:
WARNING (lattice-to-ctm-conf:Read():util/kaldi-holder-inl.h:255) BasicVectorHolder::Read, could not interpret line: ▒▒~
Furthermore, I have not been able to extract the oracle path this way from a lattice aligned with lattice-align-word-lexicon.
Am I doing something wrong?
Patrick
I think that I am using a wrong format for <1best-rspecifier> in lattice-to-ctm-conf.
What you need to provide to lattice-to-ctm-conf is your tmp.txt, i.e.,
the <transcriptions-wspecifier> from lattice-oracle.
Guoguo
On Wed, Jul 8, 2015 at 11:27 AM, Patrick Lange langep@users.sf.net wrote:
This is a sample of my output. It looks like it is working as expected. Every word with a confidence of 1 in the best path is also showing up in the oracle with a confidence of 1. This is perfect. Thank you.
/1best.ctm
0000000006590022-VC787776 1 0.48 1.00 9758 0.48
0000000006590022-VC787776 1 1.49 0.39 22646 0.93
0000000006590022-VC787776 1 2.05 0.10 20757 0.56
0000000006590022-VC787776 1 2.15 0.30 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.22 0.48 21574 0.85
0000000006590022-VC787776 1 4.80 0.32 21574 0.88
/oracle.ctm
0000000006590022-VC787776 1 1.06 0.67 22714 0.01
0000000006590022-VC787776 1 1.82 0.21 20757 0.56
0000000006590022-VC787776 1 2.12 0.25 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.21 0.48 21574 0.84
0000000006590022-VC787776 1 4.79 0.32 21574 0.89
Last edit: Patrick L. Lange 2015-07-08
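The observation about the two CTM listings above can be checked mechanically. The Python sketch below copies the rows from the post, assumes the usual CTM column order (utterance, channel, start, duration, word id, confidence), and uses hypothetical helper names.

```python
ONE_BEST = """\
0000000006590022-VC787776 1 0.48 1.00 9758 0.48
0000000006590022-VC787776 1 1.49 0.39 22646 0.93
0000000006590022-VC787776 1 2.05 0.10 20757 0.56
0000000006590022-VC787776 1 2.15 0.30 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.22 0.48 21574 0.85
0000000006590022-VC787776 1 4.80 0.32 21574 0.88
"""

ORACLE = """\
0000000006590022-VC787776 1 1.06 0.67 22714 0.01
0000000006590022-VC787776 1 1.82 0.21 20757 0.56
0000000006590022-VC787776 1 2.12 0.25 9145 0.49
0000000006590022-VC787776 1 2.44 0.34 20886 0.93
0000000006590022-VC787776 1 2.78 0.41 20743 1.00
0000000006590022-VC787776 1 3.26 0.44 21627 1.00
0000000006590022-VC787776 1 4.21 0.48 21574 0.84
0000000006590022-VC787776 1 4.79 0.32 21574 0.89
"""

def parse_ctm(text):
    """Return (word, start, duration, confidence) tuples from CTM text."""
    rows = []
    for line in text.splitlines():
        _utt, _chan, start, dur, word, conf = line.split()
        rows.append((word, float(start), float(dur), float(conf)))
    return rows

def certain_words(rows, threshold=1.0):
    """Word ids whose confidence reaches the threshold."""
    return {word for word, _, _, conf in rows if conf >= threshold}

best_certain = certain_words(parse_ctm(ONE_BEST))
oracle_certain = certain_words(parse_ctm(ORACLE))

print(sorted(best_certain))
print(best_certain <= oracle_certain)  # subset test
```

The subset test is the check described in the post: every word id that has confidence 1 in the best path should also have confidence 1 in the oracle output.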
Just a follow-up on this:
I realized it doesn't make sense to have the input be a linear FST;
it is more convenient to have it be a std::vector<int32>.
Dan
On Sun, Jul 5, 2015 at 11:28 AM, Daniel Povey danielpovey@users.sf.net wrote:
Thank you very much for the information.