[sphinx3] : LDA/MLLT training questions

svanni
2010-07-28
2012-09-22
  • svanni

    svanni - 2010-07-28

    Hello!

    First of all, thanks to the team for providing such a great toolkit!

    I have tried the LDA/MLLT step in my training (only one iteration, lda_dim =
    29); it seems to work, but:

    1/ I "only" get an absolute gain of 1.5% (WER = 51.5% with LDA/MLLT, WER =
    53% without).
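    To put that 1.5-point absolute gain in perspective, it can also be expressed
    as a relative WER reduction. A minimal sketch, using only the numbers quoted
    above:

    ```python
    # WER figures from the post: with and without LDA/MLLT.
    wer_baseline = 53.0   # % without LDA/MLLT
    wer_ldamllt = 51.5    # % with LDA/MLLT

    absolute_gain = wer_baseline - wer_ldamllt            # 1.5 points
    relative_gain = 100.0 * absolute_gain / wer_baseline  # ~2.8 % relative

    print(f"absolute gain: {absolute_gain:.1f} points")
    print(f"relative gain: {relative_gain:.1f} %")
    ```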

    2/ I get two similar errors in my log files (for the LDA and MLLT steps):

    % grep "ERROR" ../../sphinxtrain_cont8000_numgau32_ldamllt/logdir/02.mllt_train/*.log | grep -v "backward.c" | grep -v "baum_welch.c"
    
    ../../sphinxtrain_cont8000_numgau32_ldamllt/logdir/02.mllt_train/callsurf.N-1.bw.log:ERROR: "s3gau_full_io.c", line 129: Failed to read full covariance file ../sphinxtrain_cont8000_numgau32_ladmmlt+mmie/model_parameters/training_xxx.ci_mllt/variances (expected 98397 values, got 3393)
    
    % grep "ERROR" ../../sphinxtrain_cont8000_numgau32_ldamllt/logdir/01.lda_train/training_xxx.* | grep -v "backward.c" | grep -v "baum_welch.c"
    'training_xxx.mllt'
    ../../sphinxtrain_cont8000_numgau32_ldamllt/logdir/01.lda_train/callsurf.N-1.bw.log:ERROR: "s3gau_full_io.c", line 129: Failed to read full covariance file ../../sphinxtrain_cont8000_numgau32_ladmmlt+mmie/model_parameters/training_xxx.ci_lda/variances (expected 177957 values, got 4563)
    

    However, the LDA and MLLT steps complete, and I get, as expected, the
    'training_xxx.lda' file (39x39 matrix) and the 'training_xxx.mllt' file (29x39
    matrix).

    But I don't understand the meaning of these errors, and I don't know whether
    they can be resolved.
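    For intuition about what such a 29x39 matrix does: each 39-dimensional
    feature frame is multiplied by the transform, projecting it down to
    lda_dim = 29 dimensions. A minimal sketch with hypothetical random stand-ins
    for the transform and the feature frames (SphinxTrain and the decoder apply
    the real matrix for you):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins: a 29x39 LDA/MLLT matrix (the shape reported for
    # 'training_xxx.mllt') and a batch of one hundred 39-dimensional frames.
    transform = rng.standard_normal((29, 39))
    frames = rng.standard_normal((100, 39))

    # Each output frame is the matrix-vector product A @ x, reducing
    # 39-dim features to lda_dim = 29 dimensions.
    reduced = frames @ transform.T

    print(reduced.shape)  # (100, 29)
    ```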

    3/ In the wiki page, the author says : "The reason is that it's necessary to
    do some parts of training several times over. ... This has to be done for each
    feature transformation (currently there are two of them as they have been
    found to have additive effects)."

    I don't really understand this passage.

    Do I have to re-run the LDA step with a new bootstrap acoustic model (from
    the preceding training)?

    If so, for the LDA & MLLT steps, are these the 'bw' options to modify:

    -hmmdir                    Default directory for acoustic model files (mdef, means, variances, transition_matrices, noisedict) 
    -moddeffn                  The model definition file for the model inventory to train                                         
    -tmatfn                    The transition matrix parameter file name                                                          
    -mixwfn                    The mixture weight parameter file name                                                             
    -meanfn                    The mean parameter file name                                                                       
    -varfn                     The var parameter file name
    -ldafn                     File containing an LDA transformation matrix (set to 'training_xxx.mllt' from the preceding training?)
    

    Thanks in advance for your response!
    Stephan

     
  • Nickolay V. Shmyrev

    Why sphinx3?

    1/ i "only" have a absolute gain of 1.5 % (WER = 51.5 % with lda/mllt , WER
    = 53 % without)

    It means that the quality of your acoustic model doesn't really contribute to
    the WER. Most likely your model is overtrained, or the language model/language
    weight isn't properly tuned.

    ./../sphinxtrain_cont8000_numgau32_ldamllt/logdir/01.lda_train/callsurf.N-1.
    bw.log:ERROR: "s3gau_full_io.c", line 129: Failed to read full covariance file
    But I don't understand the meaning of these errors, and I don't know whether
    they can be resolved.

    This error is expected and you can ignore it. Moreover, in recent trunk it's
    no longer shown.

    Do I have to re-run the LDA step with a new bootstrap acoustic model (from
    the preceding training)?

    No, the scripts already take care of that. The training from stage 01 is
    repeated in stage 02, and again in stage 20.

     
  • svanni

    svanni - 2010-07-29

    It means that the quality of your acoustic model doesn't really contribute to
    the WER. Most likely your model is overtrained, or the language model/language
    weight isn't properly tuned.

    OK for the language model/language weight tuning, but do you mean that an
    overtrained acoustic model could explain that?

    Why sphinx3

    And not Sphinx4?
    Mostly because I'm not familiar with Java code... but I'll try it, of course.

    This error is expected and you can ignore it. Moreover, in recent trunk
    it's no longer shown.

    Ok

    No, the scripts already take care of that. The training from stage 01 is
    repeated in stage 02, and again in stage 20.

    Ok

    Thanks again for your help !
    Stephan

     
  • eliasmajic

    eliasmajic - 2010-07-29

    I suggest you try pocketsphinx instead of sphinx3 for reasons outlined on the
    website.

    By "overtrained" he means with respect to the data the model has already
    seen. For example, one speaker could account for a disproportionate share of
    the audio, so adding even more audio from that speaker won't help.
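    A speaker imbalance like that is easy to check from your training metadata.
    A minimal sketch, assuming a hypothetical list of (speaker, duration)
    utterance records (in practice you would derive these from your SphinxTrain
    fileids/transcript files):

    ```python
    from collections import defaultdict

    # Hypothetical utterance list: (speaker id, duration in seconds).
    utterances = [
        ("spk_a", 120.0), ("spk_a", 340.0), ("spk_a", 500.0),
        ("spk_b", 60.0),  ("spk_c", 80.0),
    ]

    # Sum the audio per speaker.
    totals = defaultdict(float)
    for speaker, duration in utterances:
        totals[speaker] += duration

    # Express each speaker's total as a share of all the audio.
    grand_total = sum(totals.values())
    shares = {spk: 100.0 * t / grand_total for spk, t in totals.items()}

    for spk, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{spk}: {share:.1f} % of the audio")
    ```

    With these made-up numbers, spk_a accounts for roughly 87% of the audio,
    which is the kind of skew that can make a model look overtrained.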

     
  • svanni

    svanni - 2010-07-30

    I suggest you try pocketsphinx instead of sphinx3 for reasons outlined on
    the website.

    OK, but my understanding was that pocketsphinx is intended for mobile devices
    and lightweight models.
    My goal here is to build an acoustic model for conversational speech: large
    vocabulary, noisy environment, 2 speakers (overlapping speech). I know it's
    hard, but it's just an attempt.
    So I think I'll switch to Sphinx4 in order to evaluate things not available
    with Sphinx3:
    lattice rescoring, PLP extraction, unsupervised and online acoustic adaptation
    (as soon as it's available)...

    Thanks again,
    Stephan

    By "overtrained" he means with respect to the data the model has already
    seen. For example, one speaker could account for a disproportionate share of
    the audio, so adding even more audio from that speaker won't help.

    Ok.

     
