The patch has various stylistic problems, i.e. departures from the Google
style guide (e.g. a non-const reference as a function argument, and an
output argument placed before the inputs). But lattices should not contain
arcs with infinite costs at all.
I think the better solution is to work backward and figure out how your
lattices got broken. Can you say what sequence of steps you used to
generate the lattices? If you used the standard scripts, what top-level
script did you use?
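[For reference, a minimal standalone sketch of the convention being cited — not Kaldi's actual API, and the function name is hypothetical: under the Google C++ style guide, inputs come first (by value or const reference) and outputs come last (as pointers), rather than passing a status flag through a non-const reference placed before the inputs:]

```cpp
#include <cmath>
#include <vector>

// Hypothetical sketch, not Kaldi's actual API: inputs first as const
// references, outputs last as pointers, so mutation is visible at the
// call site via '&'. The status is reported through the output pointer
// instead of a leading non-const reference.
double SumFiniteCosts(const std::vector<double> &costs, bool *success) {
  *success = true;
  double total = 0.0;
  for (double c : costs) {
    if (std::isinf(c)) {  // an infinite cost marks a broken arc
      *success = false;
      return 0.0;
    }
    total += c;
  }
  return total;
}
```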
Below is the patch I have been using to avoid occasional DNN sMBR training
failures. Please check whether it can be committed to trunk. --Ricky
Index: nnetbin/nnet-train-mpe-sequential.cc
Also, Ricky, let us know what the issue was with the bad weights that you
found on some arcs, i.e. what the weights were: were they
LatticeWeight::Zero(), i.e. (inf, inf, empty-string), or were they something
else?
Dan
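[For context, a simplified stand-in — not Kaldi's actual class, and the names are hypothetical — illustrating why LatticeWeight::Zero() shows up as (inf, inf): the semiring zero pairs an infinite graph cost with an infinite acoustic cost, i.e. an arc that can never be taken:]

```cpp
#include <cmath>
#include <limits>

// Simplified stand-in for Kaldi's LatticeWeight: a pair of costs
// (value1 = graph/LM cost, value2 = acoustic cost).
struct ToyLatticeWeight {
  double value1;  // graph cost
  double value2;  // acoustic cost

  // The semiring zero: both costs infinite, i.e. an impossible path.
  static ToyLatticeWeight Zero() {
    const double inf = std::numeric_limits<double>::infinity();
    return {inf, inf};
  }

  // An arc with any infinite cost component should never appear
  // in a well-formed lattice.
  bool HasInfiniteCost() const {
    return std::isinf(value1) || std::isinf(value2);
  }
};
```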
Below is the patch I have been using to avoid occasional DNN sMBR training failures. Please check whether it can be committed to trunk. --Ricky
Index: nnetbin/nnet-train-mpe-sequential.cc
--- nnetbin/nnet-train-mpe-sequential.cc (revision 4840)
+++ nnetbin/nnet-train-mpe-sequential.cc (working copy)
@@ -292,15 +292,23 @@
fst::ScaleLattice(fst::LatticeScale(lm_scale, acoustic_scale), &den_lat);
bool success;
if (do_smbr) { // use state-level accuracies, i.e. sMBR estimation
utt_frame_acc = LatticeForwardBackwardMpeVariants(
- trans_model, silence_phones, den_lat, ref_ali, "smbr", &post);
+ success, trans_model, silence_phones, den_lat, ref_ali, "smbr", &post);
} else { // use phone-level accuracies, i.e. MPFE (minimum phone frame error)
utt_frame_acc = LatticeForwardBackwardMpeVariants(
- trans_model, silence_phones, den_lat, ref_ali, "mpfe", &post);
+ success, trans_model, silence_phones, den_lat, ref_ali, "mpfe", &post);
}
if(success==false) {
+
// 6) convert the Posterior to a matrix
nnet_diff_h.Resize(num_frames, num_pdfs, kSetZero);
for (int32 t = 0; t < post.size(); t++) {
Index: latbin/lattice-to-mpe-post.cc
--- latbin/lattice-to-mpe-post.cc (revision 4840)
+++ latbin/lattice-to-mpe-post.cc (working copy)
@@ -114,8 +114,9 @@
} else {
const std::vector<int32> &alignment = alignments_reader.Value(key);
Posterior post;
+ bool success = true;
lat_frame_acc = LatticeForwardBackwardMpeVariants(
- trans_model, silence_phones, lat, alignment,
+ success, trans_model, silence_phones, lat, alignment,
"mpfe", &post);
total_lat_frame_acc += lat_frame_acc;
lat_time = post.size();
Index: latbin/lattice-to-smbr-post.cc
--- latbin/lattice-to-smbr-post.cc (revision 4840)
+++ latbin/lattice-to-smbr-post.cc (working copy)
@@ -115,8 +115,9 @@
} else {
const std::vector<int32> &alignment = alignments_reader.Value(key);
Posterior post;
+ bool success = true;
lat_frame_acc = LatticeForwardBackwardMpeVariants(
- trans_model, silence_phones, lat, alignment,
+ success, trans_model, silence_phones, lat, alignment,
"smbr", &post);
total_lat_frame_acc += lat_frame_acc;
lat_time = post.size();
Index: lat/lattice-functions.cc
--- lat/lattice-functions.cc (revision 4840)
+++ lat/lattice-functions.cc (working copy)
@@ -660,6 +660,7 @@
BaseFloat LatticeForwardBackwardMpeVariants(
+ bool& success,
const TransitionModel &trans,
const std::vector<int32> &silence_phones,
const Lattice &lat,
@@ -699,6 +700,10 @@
double this_alpha = alpha[s];
for (ArcIterator<Lattice> aiter(lat, s); !aiter.Done(); aiter.Next()) {
const Arc &arc = aiter.Value();
+ if(KALDI_ISINF(arc.weight.Value1()) || KALDI_ISINF(arc.weight.Value2())) {
+ success = false;
+ return 0.0;
+ }
double arc_like = -ConvertToCost(arc.weight);
alpha[arc.nextstate] = LogAdd(alpha[arc.nextstate], this_alpha + arc_like);
}
@@ -716,6 +721,10 @@
double this_beta = -(f.Value1() + f.Value2());
for (ArcIterator<Lattice> aiter(lat, s); !aiter.Done(); aiter.Next()) {
const Arc &arc = aiter.Value();
+ if(KALDI_ISINF(arc.weight.Value1()) || KALDI_ISINF(arc.weight.Value2())) {
+ success = false;
+ return 0.0;
+ }
double arc_like = -ConvertToCost(arc.weight),
arc_beta = beta[arc.nextstate] + arc_like;
this_beta = LogAdd(this_beta, arc_beta);
@@ -736,6 +745,10 @@
double this_alpha = alpha[s];
for (ArcIterator<Lattice> aiter(lat, s); !aiter.Done(); aiter.Next()) {
const Arc &arc = aiter.Value();
+ if(KALDI_ISINF(arc.weight.Value1()) || KALDI_ISINF(arc.weight.Value2())) {
+ success = false;
+ return 0.0;
+ }
double arc_like = -ConvertToCost(arc.weight);
double frame_acc = 0.0;
if (arc.ilabel != 0) {
@@ -760,14 +773,21 @@
double final_like = this_alpha - (f.Value1() + f.Value2());
double arc_scale = Exp(final_like - tot_forward_prob);
tot_forward_score += arc_scale * alpha_smbr[s];
- KALDI_ASSERT(state_times[s] == max_time &&
- "Lattice is inconsistent (final-prob not at max_time)");
+ if(state_times[s] != max_time) {
+ KALDI_WARN << "Lattice is inconsistent (final-prob not at max_time)";
+ success = false;
+ return 0.0;
+ }
}
}
// Second Pass Backward, collect Mpe style posteriors
for (StateId s = num_states-1; s >= 0; s--) {
for (ArcIterator<Lattice> aiter(lat, s); !aiter.Done(); aiter.Next()) {
const Arc &arc = aiter.Value();
+ if(KALDI_ISINF(arc.weight.Value1()) || KALDI_ISINF(arc.weight.Value2())) {
+ success = false;
+ return 0.0;
+ }
double arc_like = -ConvertToCost(arc.weight),
arc_beta = beta[arc.nextstate] + arc_like;
double frame_acc = 0.0;
@@ -815,6 +835,7 @@
// Output the computed posteriors
for (int32 t = 0; t < max_time; t++)
MergePairVectorSumming(&((*post)[t]));
+ success = true;
return tot_forward_score;
}
Index: lat/lattice-functions.h
--- lat/lattice-functions.h (revision 4840)
+++ lat/lattice-functions.h (working copy)
@@ -154,6 +154,7 @@
positive or negative) into "arc_post".
*/
BaseFloat LatticeForwardBackwardMpeVariants(
+ bool &success,
const TransitionModel &trans,
const std::vector<int32> &silence_phones,
const Lattice &lat,
Index: nnet2/nnet-example-functions.cc
--- nnet2/nnet-example-functions.cc (revision 4840)
+++ nnet2/nnet-example-functions.cc (working copy)
@@ -843,12 +843,13 @@
Posterior *post) {
KALDI_ASSERT(criterion == "mpfe" || criterion == "smbr" || criterion == "mmi");
Lattice lat;
ConvertLattice(eg.den_lat, &lat);
TopSort(&lat);
if (criterion == "mpfe" || criterion == "smbr") {
Posterior tid_post;
LatticeForwardBackwardMpeVariants(success, tmodel, silence_phones, lat, eg.num_ali,
criterion, &tid_post);
ConvertPosteriorToPdfs(tmodel, tid_post, post);
Index: nnet2/nnet-compute-discriminative.cc
--- nnet2/nnet-compute-discriminative.cc (revision 4840)
+++ nnet2/nnet-compute-discriminative.cc (working copy)
@@ -325,7 +325,8 @@
if (opts_.criterion == "mpfe" || opts_.criterion == "smbr") {
Posterior tid_post;
double ans;
- ans = LatticeForwardBackwardMpeVariants(tmodel_, silence_phones_, lat_,
+ bool success = true;
+ ans = LatticeForwardBackwardMpeVariants(success, tmodel_, silence_phones_, lat_,
eg_.num_ali, opts_.criterion,
&tid_post) * eg_.weight;
ConvertPosteriorToPdfs(tmodel_, tid_post, post);
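[As an aside, the patch repeats the same four-line infinity check in several loops. Were a check like this kept at all, it could be factored into one small helper; a sketch under hypothetical names, mirroring the two-cost structure of the real LatticeWeight:]

```cpp
#include <cmath>

// Toy arc weight mirroring the two-cost structure used in the patch
// (hypothetical names; Kaldi's real type is LatticeWeight).
struct ArcWeight {
  double value1, value2;
  double Value1() const { return value1; }
  double Value2() const { return value2; }
};

// One helper replaces the repeated inline checks: true if either
// cost component of the arc weight is infinite.
inline bool HasInfiniteCost(const ArcWeight &w) {
  return std::isinf(w.Value1()) || std::isinf(w.Value2());
}
```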
On Wed, Jan 28, 2015 at 11:47 AM, RickyChan rickychanhoyin@users.sf.net
wrote:
Standard scripts: steps/align_nnet.sh and steps/make_denlats_nnet.sh for lattice generation.
Tracing nnet-train-mpe-sequential, which crashed on occasion, showed "inf" in "arc.weight.Value".
Those scripts no longer exist. I think you must be using a very
out-of-date version of Kaldi, and this issue has probably been resolved
long ago. Make sure you're not pointing to the old repo.
Dan
On Thu, Jan 29, 2015 at 9:55 AM, RickyChan rickychanhoyin@users.sf.net
wrote:
Yes.
I just wonder whether anyone may hit a crash in the program from similar issues with the current version.
I think this issue was fixed a long time ago. If we get reports of
problems I'll look into it.
Dan
On Thu, Jan 29, 2015 at 9:52 PM, RickyChan rickychanhoyin@users.sf.net
wrote: