SphinxTrain: Assertion j==n_total_map failed

Help
marekl
2010-08-25
2012-09-22
  • marekl

    marekl - 2010-08-25

    Hello All,
    During tests with the new version of SphinxTrain (latest trunk) I encountered
    the following error in MODULE: 30 Training Context Dependent models.

    INFO: main.c(261): Reading
    /home/marekl/workspace/training/model_architecture/boolware.untied.mdef
    bw: model_def_io.c:585: model_def_read: Assertion `j == n_total_map' failed.

    The model was created successfully many times with different ST versions,
    including the version just before MMIE training, so I have no idea what is
    going on.

    The configuration used can be seen at
    https://sourceforge.net/projects/cmusphinx/forums/forum/382337/topic/3744788?message=8519421
    with the exception that in this case the 1s_c_d_dd feature was used, as well
    as the POSIX queue on Linux.

    I would be grateful for any suggestions.
    Marek

     
  • Nickolay V. Shmyrev

    Hello Marek

    It might be a bug caused by the updated newline handling. Something like an
    empty line or a Windows-style \r could cause it. Could you please add a
    printf to model_def_io.c and find out the values of j and n_total_map, which
    of the two is correct, and whether there are empty lines in the mdef file?
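
    The newline check suggested above can also be done without touching the C
    source. The following is only a minimal sketch, not part of SphinxTrain:
    scan_mdef_newlines is a hypothetical helper name, and it assumes the whole
    mdef file fits in memory.

```python
def scan_mdef_newlines(data: bytes):
    """Return (lines containing a carriage return, blank lines), 1-based.

    Hypothetical helper for illustration; it only inspects raw bytes and
    knows nothing about the mdef format itself.
    """
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        # The final "\n" terminates the last line; it is not an extra line.
        lines.pop()
    cr_lines = [n for n, line in enumerate(lines, 1) if b"\r" in line]
    blank_lines = [n for n, line in enumerate(lines, 1) if line.strip() == b""]
    return cr_lines, blank_lines


# Usage (the path is whatever your own setup uses):
#   from pathlib import Path
#   cr, blank = scan_mdef_newlines(Path("boolware.untied.mdef").read_bytes())
#   print("lines with \\r:", cr, "blank lines:", blank)
```

    Reading the file as bytes is deliberate: opening it in text mode would let
    Python translate \r\n silently and hide exactly the characters in question.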

     
  • marekl

    marekl - 2010-08-25

    Hi Nickolay,

    Thanks for the hint, I'll do that and report the result to you. In the
    meantime I have changed the configuration to use the 1s_c feature. This time
    training failed in MODULE: 06 Train MLLT transformation with the following
    error in norm during flat initialization (strange, as with 1s_c_d_dd MLLT
    completed without any problems):

    /home/monika/workspace/training/bin/norm \
     -accumdir /home/monika/workspace/training/bwaccumdir/boolware_buff_1 \
     -meanfn /home/monika/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean
    
    [Switch]   [Default] [Value]
    -help      no        no     
    -example   no        no     
    -accumdir            /home/marekl/workspace/training/bwaccumdir/boolware_buff_1
    -oaccumdir                  
    -tmatfn                     
    -mixwfn                     
    -meanfn              /home/marekl/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean
    -varfn                      
    -regmatfn                   
    -dcountfn                   
    -inmixwfn                   
    -inmeanfn                   
    -invarfn                    
    -fullvar   no        no     
    -tiedvar   no        no     
    -mmie      no        no     
    -constE    3.0       3.000000e+00
    INFO: main.c(165): No -mixwfn specified, will skip if any
    INFO: main.c(168): No -tmatfn specified, will skip if any
    INFO: main.c(174): No -varfn specified, will skip if any
    INFO: main.c(227): Reading and accumulating counts from /home/marekl/workspace/training/bwaccumdir/boolware_buff_1
    ERROR: "s3acc_io.c", line 339: Unable to access /home/marekl/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts
    INFO: main.c(476): No means or variances to normalize
    WARNING: "main.c", line 542: NO reestimated means seen, but -meanfn specified
    Wed Aug 25 14:53:07 2010
    

    Do you have any ideas about this?

     
  • marekl

    marekl - 2010-08-25

    Oops,

    the log header was taken from a different run. The correct one is:

    home/monika/workspace/training/bin/norm \
     -accumdir /home/marekl/workspace/training/bwaccumdir/boolware_buff_1 \
     -meanfn /home/marekl/workspace/training/model_parameters/boolware.ci_mllt_flatinitial/globalmean
    

    I double checked - it's for sure not a problem with paths.

     
  • Nickolay V. Shmyrev

    That's magic ;)

    06 MLLT looks like a really old script. It must be stage 02. Are you sure you
    are running the latest version?

    As for the access error, it tries to open the file and fails. It might be a
    permission issue. I have no other hypothesis.
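
    To tell a permission problem apart from a file that simply does not exist,
    a quick check along these lines may help. This is only a sketch:
    diagnose_access is a hypothetical helper name, and the result reflects the
    user running the script, so it should be run as the same user that runs
    the training.

```python
import os


def diagnose_access(path):
    """Classify a path as 'missing', 'no read permission' or 'readable'.

    Hypothetical helper for illustration. Note that os.path.exists() also
    returns False when a parent directory cannot be searched, so 'missing'
    can still hide a permission problem higher up in the tree.
    """
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK):
        return "no read permission"
    return "readable"


# Usage (substitute the accumulator directory from your own log):
#   print(diagnose_access("<accumdir>/gauden_counts"))
```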

     
  • marekl

    marekl - 2010-08-25

    06 MLLT is stage 2 of LDA-MLLT training. I downloaded a snapshot version
    2 days ago, and I have now reinstalled ST from trunk and get the same
    error. After some investigation it seems that gauden_counts is not accessible
    because it is never written.

    The log file from init_gau for this stage looks like

    INFO: corpus.c(1313): Will process all remaining utts starting at 0
    INFO: init_gau.c(146): Computing 1x1x1 mean estimates
    Wed Aug 25 18:41:24 2010
    

    while the same log from the previous version of ST is

    INFO: corpus.c(1343): Will process all remaining utts starting at 0
    INFO: init_gau.c(146): Computing 1x1x1 mean estimates
    [100] [200] [300] [400] [500] [600] [700] [800] [900] [1000] [1100] [1200] [1300] [1400] [1500] [1600] [1700] [1800] [1900] [2000] [2100] [2200] [2300] [2400] [2500] [2600] [2700] [2800] [2900] [3000] [3100] [3200] [3300] [3400] [3500] [3600] [3700] [3800] [3900] [4000] [4100] [4200] [4300] [4400] [4500] [4600] [4700] [4800] [4900] [5000] [5100] [5200] [5300] [5400] [5500] [5600] [5700] [5800] [5900] [6000] [6100] [6200] [6300] [6400] [6500] [6600] [6700] [6800] [6900] [7000] [7100] [7200] [7300] [7400] [7500] [7600] [7700] [7800] [7900] [8000] [8100] [8200] [8300] [8400] [8500] [8600] [8700] [8800] [8900] [9000] [9100] [9200] [9300] [9400] [9500] [9600] [9700] [9800] [9900] [10000] [10100] [10200] [10300] [10400] [10500] [10600] [10700] [10800] [10900] [11000] [11100] [11200] [11300] [11400] [11500] [11600] [11700] [11800] [11900] [12000] [12100] [12200] [12300] [12400] [12500] [12600] [12700] [12800] [12900] [13000] [13100] [13200] [13300] [13400] [13500] [13600] [13700] [13800] [13900] [14000] [14100] [14200] [14300] [14400] [14500] [14600] [14700] [14800] [14900] [15000] [15100] [15200] [15300] [15400] [15500] INFO: s3gau_io.c(478): Wrote /home/marekl/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts with means [1x1x1 vector arrays]
    Sat Aug  7 13:48:36 2010
    

    The init_gau log from the LDA training step of the new ST is pretty similar

    INFO: corpus.c(1313): Will process all remaining utts starting at 0
    INFO: init_gau.c(146): Computing 1x1x1 mean estimates
    [100] [200] [300] [400] [500] [600] [700] [800] [900] [1000] [1100] [1200] [1300] [1400] [1500] [1600] [1700] [1800] [1900] [2000] [2100] [2200] [2300] [2400] [2500] [2600] [2700] [2800] [2900] [3000] [3100] [3200] [3300] [3400] [3500] [3600] [3700] [3800] [3900] [4000] [4100] [4200] [4300] [4400] [4500] [4600] [4700] [4800] [4900] [5000] [5100] [5200] [5300] [5400] [5500] [5600] [5700] [5800] [5900] [6000] [6100] [6200] [6300] [6400] [6500] [6600] [6700] [6800] [6900] [7000] [7100] [7200] [7300] [7400] [7500] [7600] [7700] [7800] [7900] [8000] [8100] [8200] [8300] [8400] [8500] [8600] [8700] [8800] [8900] [9000] [9100] [9200] [9300] [9400] [9500] [9600] [9700] [9800] [9900] [10000] [10100] [10200] [10300] [10400] [10500] [10600] [10700] [10800] [10900] [11000] [11100] [11200] [11300] [11400] [11500] [11600] [11700] [11800] [11900] [12000] [12100] [12200] [12300] [12400] [12500] [12600] [12700] [12800] [12900] [13000] [13100] [13200] [13300] [13400] [13500] [13600] [13700] [13800] [13900] [14000] [14100] [14200] [14300] [14400] [14500] [14600] [14700] [14800] [14900] [15000] [15100] [15200] [15300] [15400] [15500] INFO: s3gau_io.c(478): Wrote /home/monika/workspace/training/bwaccumdir/boolware_buff_1/gauden_counts with means [1x1x1 vector arrays]
    Wed Aug 25 21:12:55 2010
    

    To me it looks like something crashes init_gau for MLLT without leaving any
    message in the log file.

    Permissions are for sure not the problem, as 1) these are temporary files
    created by ST in stage 1 (LDA training) without any problem, and 2) the files
    are also created without any problem for MLLT when the 1s_c_d_dd feature set
    is used.
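
    The difference between the logs above comes down to whether the final
    "Wrote ..." line is present. A small sketch for checking that over many
    log files follows; written_files is a hypothetical helper name, and the
    pattern is taken only from the log lines quoted in this thread, so other
    ST versions may word the message differently.

```python
import re

# Pattern based on the line seen in the logs above, e.g.
#   INFO: s3gau_io.c(478): Wrote /path/gauden_counts with means [...]
WROTE_RE = re.compile(r"INFO: s3gau_io\.c\(\d+\): Wrote (\S+)")


def written_files(log_text):
    """Return the paths that the log reports as written (possibly empty)."""
    return WROTE_RE.findall(log_text)


# Usage (the log file name is an assumption about your own setup):
#   with open("init_gau.log") as f:
#       if not written_files(f.read()):
#           print("init_gau ended without writing gauden_counts")
```

    An empty result for a log that still ends with a timestamp would match the
    silent exit described above.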

     
  • Nickolay V. Shmyrev

    It writes to /home/monika and not to /home/marekl. Is that OK?

     
  • marekl

    marekl - 2010-08-25

    Yes, that's OK. I have 2 environments and I'm running both of them. The last
    run was on /home/monika, so it's fine. It might be confusing for you, sorry
    about that, but as these are separate computers there is no chance of me
    mixing them up.

     
  • marekl

    marekl - 2010-08-28

    INFO: main.c(261): Reading
    /home/monika/workspace/training/model_architecture/boolware.untied.mdef

    INFO: model_def_io.c(585): j = 154209,  n_total_map = 154044
    bw: model_def_io.c:586: model_def_read: Assertion `j == n_total_map' failed.
    Sat Aug 28 22:29:36 2010
    

    boolware.untied.mdef:

    # Generated by /home/monika/workspace/training/bin/mk_mdef_gen on Sat Aug 28 22:29:35 2010
    0.3
    38 n_base
    25636 n_tri
    154044 n_state_map
    128370 n_tied_state
    190 n_tied_ci_state
    38 n_tied_tmat
    

    boolware.untied.mdef does not contain any \r characters; however, there is
    an empty line at the very end of the file (the last line ends with \n).
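
    As a sanity check, the header counts quoted above can be compared with each
    other. The sketch below assumes that every model contributes 6 state-map
    entries (5 emitting states plus one non-emitting state). That figure is an
    assumption inferred from the numbers in this header, not something taken
    from the SphinxTrain source, and both function names are hypothetical.

```python
HEADER = """\
0.3
38 n_base
25636 n_tri
154044 n_state_map
128370 n_tied_state
190 n_tied_ci_state
38 n_tied_tmat
"""


def parse_mdef_header(text):
    """Collect '<integer> n_<name>' lines into a dict; skip everything else."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit() and parts[1].startswith("n_"):
            counts[parts[1]] = int(parts[0])
    return counts


def expected_state_map(counts, entries_per_model):
    """State-map size if every model has the same number of entries."""
    return (counts["n_base"] + counts["n_tri"]) * entries_per_model


counts = parse_mdef_header(HEADER)
# Assumption: 5 emitting states + 1 non-emitting state per model.
print(expected_state_map(counts, 6) == counts["n_state_map"])
# Surplus of entries counted by the reader (j) over the header value.
print(154209 - counts["n_state_map"])
```

    Under that assumption the header is self-consistent ((38 + 25636) * 6 =
    154044), which would suggest that the 165 extra entries counted in j come
    from how the body of the file is read rather than from a wrong header.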

     
  • Nickolay V. Shmyrev

    Hm, can you mail me that mdef? My mail is nshmyrev@nexiwave.com. Please also
    send the log from the step that created it, make_alltriphonelist.log from
    stage 30.

     
  • Nickolay V. Shmyrev

    The fix for this issue is committed in SphinxTrain trunk.

     
  • marekl

    marekl - 2010-09-01

    It seems that this change solved the assertion problem. Thank you very much,
    Nickolay.

     
