In sphinxtrain-1.0.8/src/programs/bw, when performing MMIE training, the program "bw" crashes if a numerator lattice contains an arc whose start frame is zero. For example, see the first arc of the lattice below:
Total arcs: 20 True arcs: 20
arc_id, arc_name, start frame, end frame, lmscore, number of preceding arcs, number of succeeding arcs, preceding arc_ids, succeeding arc_ids
1 <s> 0 1 -230.235388422 1 1 < 3 > 4
2 <sil> 1 15 -2.30258509299 1 1 < 0 > 1
3 <sil> 1 25 -2.30258509299 1 1 < 0 > 1
4 <sil> 1 29 -2.30258509299 1 1 < 0 > 5
5 an 29 38 -53684.4070244 1 1 < 4 > 6
6 ge 38 48 -53684.4070244 1 1 < 5 > 7
7 you 48 56 -53684.4070244 1 1 < 6 > 8
8 tai 56 67 -53684.4070244 1 1 < 7 > 9
9 wan 67 77 -53684.4070244 1 1 < 8 > 10
10 tian 77 86 -53684.4070244 1 1 < 9 > 11
11 hou 86 95 -53684.4070244 1 1 < 10 > 12
12 gong 95 103 -53684.4070244 1 1 < 11 > 13
13 de 103 112 -53684.4070244 1 1 < 12 > 14
14 zi 112 121 -53684.4070244 1 1 < 13 > 15
15 liao 121 143 -53684.4070244 1 1 < 14 > 16
16 <sil> 143 167 -2.30258509299 1 1 < 15 > 0
17 </s> 167 167 -1.81960902107 1 1 < 16 > 18
18 <sil> 168 180 -2.30258509299 1 1 < 17 > 0
19 </s> 180 180 -1.81960902107 1 1 < 18 > 20
20 <sil> 181 184 -2.30258509299 1 1 < 19 > 0
This is caused by the code in main.c (in sphinxtrain-1.0.8/src/programs/bw), in the functions "mmi_rand_train", "mmi_best_train" and "mmi_ci_train", where the line
arc_f[k] = f[k+lat->arc[n].sf-1];
causes an out-of-range access into the feature array when sf is 0. A fix looks like this (I have also tried running MMIE training with it, which gave around 8% absolute error reduction, though MMIE converged in a single iteration):
[ricky@ bw]$ diff main.c main.c.orig
1155c1155
<
---
>
1158,1160c1158,1160
< for (k=0; k<n_word_obs; k++)
< arc_f[k] = f[k+lat->arc[n].sf];
<
---
> for (k=0; k<n_word_obs; k++)
> arc_f[k] = f[k+lat->arc[n].sf-1];
>
1241,1242c1241,1242
< arc_f[k] = f[k+lat->arc[n].sf];
<
---
> arc_f[k] = f[k+lat->arc[n].sf-1];
>
1318c1318
< arc_f[k] = f[k+lat->arc[n].sf];
---
> arc_f[k] = f[k+lat->arc[n].sf-1];
1414c1414
< arc_f[k] = f[k+lat->arc[n].sf];
---
> arc_f[k] = f[k+lat->arc[n].sf-1];
1487c1487
< arc_f[k] = f[k+lat->arc[n].sf];
---
> arc_f[k] = f[k+lat->arc[n].sf-1];
1522c1522
< arc_f[k] = f[k+lat->arc[n].sf];
---
> arc_f[k] = f[k+lat->arc[n].sf-1];
1724a1725
>
A small correction to my previous post: MMIE training with this fix gave around 8% relative word error rate reduction (not absolute error reduction).
Hi Ricky
Can you please create a unified diff instead of a plain diff? Unified diffs are created with the -u option and are the more standard practice:
diff -u main.c.orig main.c
Please note the order on the command line: you put the original file first, then the new file. In your case the diff is reversed.
Please do not paste the diff into the post; it's better to attach it. There is no need to attach the C file, just attach the diff.
Also, I don't think your patch is 100% correct. We take one more frame into obs, and that creates a problem. If you remove the - 1 from sf - 1, you no longer have an invalid access when sf = 0, but you get an array overflow when ef is the last frame, since n_word_obs includes a + 1. Please consider this situation too.
I understand the code is pretty dirty, and I agree it probably has to be rewritten from scratch.
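To make the two failure modes concrete, here is a small sketch in C. It is not SphinxTrain code: the helper name is hypothetical, and it assumes that n_word_obs is computed as ef - sf + 1 (the "+ 1" mentioned above) and that the feature array f has n_frames entries indexed 0..n_frames-1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of SphinxTrain.
 * Reports whether the loop
 *     for (k = 0; k < n_word_obs; k++) arc_f[k] = f[k + sf + offset];
 * only reads indices inside f[0 .. n_frames-1].
 * Assumption: n_word_obs = ef - sf + 1. */
bool arc_copy_in_bounds(int sf, int ef, int offset, int n_frames)
{
    assert(ef >= sf && n_frames > 0);
    int n_word_obs = ef - sf + 1;
    int first = sf + offset;            /* index read when k = 0 */
    int last = first + n_word_obs - 1;  /* index read when k = n_word_obs - 1 */
    return first >= 0 && last < n_frames;
}
```

Assuming, for illustration only, an utterance of 184 frames: with offset -1 the arc (sf=0, ef=1) reads f[-1], while with offset 0 the arc (sf=181, ef=184) reads f[184], one past the end. Neither offset is safe for every arc under these assumptions.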
Ignoring the last frame for the last arc in the lattice should fix the above issue.
Sorry, but what if the last arc is not the only one that ends in the last frame?
I think the proper patch should consistently use fewer frames in n_word_obs for all the arcs.
Thanks for the reminder.
I think it is more appropriate to check every arc in the lattice for whether it ends in the last frame (and to ignore the last frame in that case), since mmi_viterbi_run will assert on n_word_obs.
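One possible reading of the suggestions above (guard every arc, not only the last one, and shorten the copy where needed) is to clamp the frame count per arc. The sketch below is only an illustration under stated assumptions, not the fix that went into SphinxTrain: the function name and the n_frames parameter are hypothetical, f is assumed to be indexed 0..n_frames-1, and n_word_obs is assumed to be ef - sf + 1.

```c
#include <assert.h>

/* Hypothetical helper, not part of SphinxTrain.
 * Returns how many frames can be copied from f[sf], f[sf+1], ...
 * without reading past f[n_frames-1]. Frames beyond the end of the
 * utterance are dropped, for any arc, not just the last one.
 * Assumption: the unclamped count is ef - sf + 1. */
int clamped_word_obs(int sf, int ef, int n_frames)
{
    assert(sf >= 0 && ef >= sf && n_frames > 0);
    int n_word_obs = ef - sf + 1;
    if (sf >= n_frames)
        return 0;                       /* arc starts past the last frame */
    if (sf + n_word_obs > n_frames)
        n_word_obs = n_frames - sf;     /* drop the frames that overflow */
    return n_word_obs;
}
```

With an assumed 184-frame utterance, the arc (sf=181, ef=184) would be shortened from 4 frames to 3, while arcs that end earlier are unchanged. A caller would still need to decide what to do when the result is 0, since a zero-length arc would presumably trip the assertion on n_word_obs mentioned above.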