CMU Sphinx / Forums / Help: SphinxTrain: Assertion j==n_total_map failed

marekl - 2010-08-25

Hello all,
While testing the new version of SphinxTrain (latest trunk) I encountered the
following error in MODULE: 30 Training Context Dependent models.

INFO: main.c(261): Reading /home/marekl/workspace/training/model_architecture/boolware.untied.mdef
bw: model_def_io.c:585: model_def_read: Assertion `j == n_total_map' failed.

The model was created successfully many times with different ST versions,
including the version just before MMIE training, so I have no idea what is
going on.

The configuration I used can be seen at
https://sourceforge.net/projects/cmusphinx/forums/forum/382337/topic/3744788?message=8519421
except that in this case the 1s_c_d_dd feature was used, as well as the POSIX
queue on Linux.

I would be grateful for any suggestions.
Marek


Nickolay V. Shmyrev - 2010-08-25

Hello Marek,

It might be a bug caused by the updated newline handling. Something like an
empty line or a Windows-style \r could cause it. Can you please add a printf
to model_def_io.c and find out the value of j, the value of n_total_map, which
of the two is correct, and whether there are empty lines in the mdef file?
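
The file check suggested above can be scripted instead of done by eye. The following is a minimal Python sketch, not part of SphinxTrain; the function name `find_suspicious_lines` and the command-line usage are illustrative assumptions. It reports the 1-based numbers of lines that contain a carriage return and of lines that are empty or whitespace-only.

```python
import sys


def find_suspicious_lines(data: bytes):
    """Return (cr_lines, empty_lines) as lists of 1-based line numbers.

    cr_lines:    lines containing a carriage return (Windows-style endings).
    empty_lines: lines that are empty or contain only whitespace.
    """
    lines = data.split(b"\n")
    # A trailing "\n" terminates the last line; it does not start a new one.
    if lines and lines[-1] == b"":
        lines.pop()
    cr_lines = [i for i, line in enumerate(lines, 1) if b"\r" in line]
    empty_lines = [i for i, line in enumerate(lines, 1) if line.strip() == b""]
    return cr_lines, empty_lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_mdef.py boolware.untied.mdef
    with open(sys.argv[1], "rb") as handle:
        cr, empty = find_suspicious_lines(handle.read())
    print("lines with \\r:", cr)
    print("empty lines:", empty)
```

Reading the file in binary mode matters here, because text mode would silently translate \r\n into \n and hide exactly the characters being looked for.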

marekl - 2010-08-25

Hi Nickolay,

Thanks for the hint, I'll do that and report the result to you. In the
meantime I have changed the configuration to use the 1s_c feature. This time
training failed in MODULE: 06 Train MLLT transformation, with the following
error from norm during flat initialization (strange, as with 1s_c_d_dd MLLT
completed without any problems):

/home/monika/workspace/training/bin/norm \
 -accumdir /home/monika/workspace/training/bwaccumdir/boolware_buff_1 \
 -meanfn /home/monika/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean

[Switch]   [Default] [Value]
-help      no        no     
-example   no        no     
-accumdir            /home/marekl/workspace/training/bwaccumdir/boolware_buff_1
-oaccumdir                  
-tmatfn                     
-mixwfn                     
-meanfn              /home/marekl/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean
-varfn                      
-regmatfn                   
-dcountfn                   
-inmixwfn                   
-inmeanfn                   
-invarfn                    
-fullvar   no        no     
-tiedvar   no        no     
-mmie      no        no     
-constE    3.0       3.000000e+00
INFO: main.c(165): No -mixwfn specified, will skip if any
INFO: main.c(168): No -tmatfn specified, will skip if any
INFO: main.c(174): No -varfn specified, will skip if any
INFO: main.c(227): Reading and accumulating counts from /home/marekl/workspace/training/bwaccumdir/boolware_buff_1
ERROR: "s3acc_io.c", line 339: Unable to access /home/marekl/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts
INFO: main.c(476): No means or variances to normalize
WARNING: "main.c", line 542: NO reestimated means seen, but -meanfn specified
Wed Aug 25 14:53:07 2010

Do you have any ideas about this?

marekl - 2010-08-25

Oops,

the log header was taken from a different run. The correct one is:

home/monika/workspace/training/bin/norm \
 -accumdir /home/marekl/workspace/training/bwaccumdir/boolware_buff_1 \
 -meanfn /home/marekl/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean

I double-checked: it is definitely not a problem with paths.

Nickolay V. Shmyrev - 2010-08-25

That's magic ;)

06 MLLT looks like a really old script. It must be stage 02. Are you sure you
are running the latest version?

As for the access error, it tries to open the file and fails. It might be a
permission issue. I have no other hypothesis.
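
To tell a permission problem apart from a file that was never created, a quick check outside the trainer can help. This is a minimal Python sketch under stated assumptions: `diagnose_access` is a hypothetical helper, not a SphinxTrain tool, and it only inspects the filesystem as seen by the user running it (a directory that cannot be searched may make an existing file look missing).

```python
import os


def diagnose_access(path):
    """Classify why a file might fail to open for reading."""
    if not os.path.exists(path):
        parent = os.path.dirname(path) or "."
        if not os.path.isdir(parent):
            return "parent directory missing"
        # The directory is there but the file is not: nothing wrote it.
        return "file missing"
    if not os.access(path, os.R_OK):
        # The file exists but the current user may not read it.
        return "not readable"
    return "readable"


if __name__ == "__main__":
    # Path taken from the error message quoted in this thread.
    print(diagnose_access(
        "/home/marekl/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts"))
```

A result of "file missing" would point at the step that should have written gauden_counts rather than at permissions.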

marekl - 2010-08-25

06 MLLT is stage 2 of LDA-MLLT training. I downloaded a snapshot version
2 days ago, and I have just reinstalled ST from trunk and get the same
error. After some investigation, it seems that gauden_counts is not accessible
because it is never written.

The init_gau log file from this stage looks like this:

INFO: corpus.c(1313): Will process all remaining utts starting at 0
INFO: init_gau.c(146): Computing 1x1x1 mean estimates
Wed Aug 25 18:41:24 2010

while the same log from the previous version of ST is:

INFO: corpus.c(1343): Will process all remaining utts starting at 0
INFO: init_gau.c(146): Computing 1x1x1 mean estimates
[100] [200] [300] [400] [500] [600] [700] [800] [900] [1000] [1100] [1200] [1300] [1400] [1500] [1600] [1700] [1800] [1900] [2000] [2100] [2200] [2300] [2400] [2500] [2600] [2700] [2800] [2900] [3000] [3100] [3200] [3300] [3400] [3500] [3600] [3700] [3800] [3900] [4000] [4100] [4200] [4300] [4400] [4500] [4600] [4700] [4800] [4900] [5000] [5100] [5200] [5300] [5400] [5500] [5600] [5700] [5800] [5900] [6000] [6100] [6200] [6300] [6400] [6500] [6600] [6700] [6800] [6900] [7000] [7100] [7200] [7300] [7400] [7500] [7600] [7700] [7800] [7900] [8000] [8100] [8200] [8300] [8400] [8500] [8600] [8700] [8800] [8900] [9000] [9100] [9200] [9300] [9400] [9500] [9600] [9700] [9800] [9900] [10000] [10100] [10200] [10300] [10400] [10500] [10600] [10700] [10800] [10900] [11000] [11100] [11200] [11300] [11400] [11500] [11600] [11700] [11800] [11900] [12000] [12100] [12200] [12300] [12400] [12500] [12600] [12700] [12800] [12900] [13000] [13100] [13200] [13300] [13400] [13500] [13600] [13700] [13800] [13900] [14000] [14100] [14200] [14300] [14400] [14500] [14600] [14700] [14800] [14900] [15000] [15100] [15200] [15300] [15400] [15500] INFO: s3gau_io.c(478): Wrote /home/marekl/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts with means [1x1x1 vector arrays]
Sat Aug  7 13:48:36 2010

The init_gau log from the LDA training step of the new ST is pretty similar:

INFO: corpus.c(1313): Will process all remaining utts starting at 0
INFO: init_gau.c(146): Computing 1x1x1 mean estimates
[100] [200] [300] [400] [500] [600] [700] [800] [900] [1000] [1100] [1200] [1300] [1400] [1500] [1600] [1700] [1800] [1900] [2000] [2100] [2200] [2300] [2400] [2500] [2600] [2700] [2800] [2900] [3000] [3100] [3200] [3300] [3400] [3500] [3600] [3700] [3800] [3900] [4000] [4100] [4200] [4300] [4400] [4500] [4600] [4700] [4800] [4900] [5000] [5100] [5200] [5300] [5400] [5500] [5600] [5700] [5800] [5900] [6000] [6100] [6200] [6300] [6400] [6500] [6600] [6700] [6800] [6900] [7000] [7100] [7200] [7300] [7400] [7500] [7600] [7700] [7800] [7900] [8000] [8100] [8200] [8300] [8400] [8500] [8600] [8700] [8800] [8900] [9000] [9100] [9200] [9300] [9400] [9500] [9600] [9700] [9800] [9900] [10000] [10100] [10200] [10300] [10400] [10500] [10600] [10700] [10800] [10900] [11000] [11100] [11200] [11300] [11400] [11500] [11600] [11700] [11800] [11900] [12000] [12100] [12200] [12300] [12400] [12500] [12600] [12700] [12800] [12900] [13000] [13100] [13200] [13300] [13400] [13500] [13600] [13700] [13800] [13900] [14000] [14100] [14200] [14300] [14400] [14500] [14600] [14700] [14800] [14900] [15000] [15100] [15200] [15300] [15400] [15500] INFO: s3gau_io.c(478): Wrote /home/monika/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts with means [1x1x1 vector arrays]
Wed Aug 25 21:12:55 2010

To me it looks like something crashes init_gau for MLLT without leaving any
message in the log file.

Permissions are certainly not the problem, because 1) these are temporary
files that ST created without trouble in stage 1 (LDA training), and 2) the
files are created without trouble for MLLT when the 1s_c_d_dd feature set is
used.
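
The difference between the good and bad logs above can be detected automatically. Below is a minimal Python sketch that assumes the log format shown in this thread (progress counters like `[100]` and a final `INFO: s3gau_io.c(...): Wrote <path>` line); the function name `summarize_init_gau_log` is hypothetical, and the sample strings are abridged copies of the logs posted here.

```python
import re

# Progress counters such as "[100]"; "[1x1x1 vector arrays]" does not match.
PROGRESS_RE = re.compile(r"\[(\d+)\]")
# The line init_gau prints after it has written its output file.
WROTE_RE = re.compile(r"INFO: s3gau_io\.c\(\d+\): Wrote (\S+)")


def summarize_init_gau_log(text):
    """Return the last progress counter seen and the file reported as written."""
    counters = [int(n) for n in PROGRESS_RE.findall(text)]
    wrote = WROTE_RE.search(text)
    return {
        "last_progress": max(counters) if counters else None,
        "wrote": wrote.group(1) if wrote else None,
    }


BAD_LOG = """INFO: corpus.c(1313): Will process all remaining utts starting at 0
INFO: init_gau.c(146): Computing 1x1x1 mean estimates
Wed Aug 25 18:41:24 2010
"""

GOOD_LOG = """INFO: corpus.c(1343): Will process all remaining utts starting at 0
INFO: init_gau.c(146): Computing 1x1x1 mean estimates
[100] [200] INFO: s3gau_io.c(478): Wrote gauden_counts with means [1x1x1 vector arrays]
"""

if __name__ == "__main__":
    print(summarize_init_gau_log(BAD_LOG))
    print(summarize_init_gau_log(GOOD_LOG))
```

A log with no progress counters and no "Wrote" line, as in the failing MLLT run, suggests the tool stopped before processing any utterance.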

Nickolay V. Shmyrev - 2010-08-25

It writes to /home/monika and not to /home/marekl. Is it ok?


marekl - 2010-08-25

Yes, that's OK. I have two environments and I am running both. The last run
was on /home/monika, so it's fine. It might be confusing for you, sorry about
that, but as these are separate computers there is no chance of me mixing
them up.


marekl - 2010-08-28

INFO: main.c(261): Reading /home/monika/workspace/training/model_architecture/boolware.untied.mdef
INFO: model_def_io.c(585): j = 154209, n_total_map = 154044
bw: model_def_io.c:586: model_def_read: Assertion `j == n_total_map' failed.
Sat Aug 28 22:29:36 2010

boolware.untied.mdef:

# Generated by /home/monika/workspace/training/bin/mk_mdef_gen on Sat Aug 28 22:29:35 2010
0.3
38 n_base
25636 n_tri
154044 n_state_map
128370 n_tied_state
190 n_tied_ci_state
38 n_tied_tmat

boolware.untied.mdef does not contain any \r characters; however, there is an
empty line at the very end of the file (the last line ends with \n).
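
The header counts posted above can be cross-checked with a little arithmetic. The sketch below is Python, not SphinxTrain code, and rests on one assumption about the mdef format: that n_state_map holds one entry per state of every model (base phones plus triphones), including a final non-emitting state. The function names are hypothetical. Under that assumption the header is self-consistent, (38 + 25636) * 6 = 154044, while the reported j = 154209 is 165 entries too many. This only quantifies the mismatch; it does not identify its cause.

```python
import re

# Header counts copied from the boolware.untied.mdef excerpt in this thread.
POSTED_HEADER = """0.3
38 n_base
25636 n_tri
154044 n_state_map
128370 n_tied_state
190 n_tied_ci_state
38 n_tied_tmat
"""


def parse_mdef_header(text):
    """Collect '<number> n_<name>' lines into a dict of integer counts."""
    counts = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\d+)\s+(n_\w+)\s*", line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def states_per_model(counts):
    """Return n_state_map / (n_base + n_tri) if it divides evenly, else None."""
    n_models = counts["n_base"] + counts["n_tri"]
    per_model, remainder = divmod(counts["n_state_map"], n_models)
    return per_model if remainder == 0 else None


if __name__ == "__main__":
    counts = parse_mdef_header(POSTED_HEADER)
    print("state map entries per model:", states_per_model(counts))
    # j is the value printed by the added diagnostic in model_def_io.c.
    print("excess entries read:", 154209 - counts["n_state_map"])
```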

Nickolay V. Shmyrev - 2010-08-28

Hm, can you mail me that mdef? My address is nshmyrev@nexiwave.com. Please
also send the log from where it was created, make_alltriphonelist.log in
stage 30.


Nickolay V. Shmyrev - 2010-08-31

The fix for this issue is committed in SphinxTrain trunk.


marekl - 2010-09-01

It seems that this change solved the assertion problem. Thank you very much,
Nickolay.
