Problem in adapting acoustic model

Help
2016-06-29
  • Vishnu Vardhan Dhanabalan

    Hi,

    I have created a C++ application to train/adapt the default en-us acoustic model using 20 Arctic speech files in WAV format, sampled at 16000 Hz.

    I created the .fileids and .transcription files and followed the steps in the adaptation tutorial: http://cmusphinx.sourceforge.net/wiki/tutorialadapt

    The adaptation process runs without apparent errors, but I do not see any improvement in decoding accuracy. I tried both MLLR and MAP adaptation, to no avail. Then I noticed that the mllr_solve output says "Estimation of 0 th regression in MLLR failed", "Estimation of 1 th regression in MLLR failed", and so on. I have pasted my MLLR command-line output below for reference:

    Current configuration:
    [NAME]      [DEFLT] [VALUE]
    -accumdir       .,
    -cb2mllrfn  .1cls.  .1cls.
    -cdonly     no  no
    -example    no  no
    -fullvar    no  no
    -help       no  no
    -meanfn         en-us/means
    -mllradd    yes yes
    -mllrmult   yes yes
    -moddeffn       
    -outmllrfn      mllr_matrix
    -varfloor   1e-3    1.000000e-03
    -varfn          en-us/variances
    
    
    INFO: main.c(382): -- 1. Read input mean, (var) and accumulation.
    INFO: s3gau_io.c(169): Read en-us/means [42x3x128 array]
    INFO: main.c(397): Reading and accumulating counts from .
    INFO: s3gau_io.c(386): Read ./gauden_counts with means with vars [42x3x128 vector arrays]
    
    INFO: main.c(436): -- 2. Read cb2mllrfn
    INFO: main.c(455): n_mllr_class = 1
    
    INFO: main.c(475): -- 3. Calculate mllr matrices
    INFO: main.c(127): 
    INFO: main.c(128):  ---- mllr_solve(): Conventional MLLR method
    INFO: s3gau_io.c(169): Read en-us/variances [42x3x128 array]
    
    INFO: main.c(208):  ---- A. Accum regl, regr
    INFO: main.c(209):  No classes 1, no. stream 3
    INFO: main.c(281):  ---- B. Compute MLLR matrices (A,B)
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
    INFO: mllr.c(186): Estimation of 0 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 1 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 2 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 3 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 4 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 5 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 6 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 7 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 8 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 9 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 10 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 11 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 12 th regression in MLLR failed
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
    INFO: mllr.c(186): Estimation of 0 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 1 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 2 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 3 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 4 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 5 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 6 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 7 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 8 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 9 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 10 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 11 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 12 th regression in MLLR failed
    INFO: mllr.c(182): Computing both multiplicative and additive part of MLLR
    INFO: mllr.c(186): Estimation of 0 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 1 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 2 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 3 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 4 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 5 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 6 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 7 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 8 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 9 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 10 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 11 th regression in MLLR failed
    INFO: mllr.c(186): Estimation of 12 th regression in MLLR failed
    
    INFO: main.c(497): -- 4. Store mllr matrices (A,B) to mllr_matrix
    
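    In the output above, the failures are reported at INFO level and the tool still goes on to store mllr_matrix, so a wrapper script may not notice them. A minimal sketch of a check, assuming the mllr_solve output (stdout and stderr) was redirected to a file; the file name mllr_solve.log and the function name count_mllr_failures are hypothetical:

    ```shell
    #!/bin/bash
    # Hypothetical helper: count "regression in MLLR failed" lines in a saved log.
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    count_mllr_failures() {
        grep -c 'regression in MLLR failed' "$1" || true
    }

    # Example use, guarded so it is a no-op when no log has been saved.
    # Assumed earlier step: ./mllr_solve ... > mllr_solve.log 2>&1
    if [ -f mllr_solve.log ]; then
        n=$(count_mllr_failures mllr_solve.log)
        if [ "$n" -gt 0 ]; then
            echo "mllr_solve reported $n failed regression estimates" >&2
        fi
    fi
    ```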

    I am also not sure which model type I am using. I checked the feat.params file and saw

    -feat 1s_c_d_dd
    

    and

    -model ptm
    

    in there. So am I using a continuous or a phonetically tied model?

    I put all the adaptation commands in a shell script, pasted below for reference. Note that the script saves the newly adapted model in a folder of the user's choice, exported to the environment as the PSt_ADAPTED_MODEL_PATH variable. All the wav files and the copied acoustic model, language model and dictionary are saved and processed in the path given by the user (PSt_TRAINING_WORKSPACE).

    #!/bin/bash
    #<==============================================================
    # This shell script runs configuration commands for training
    # the acoustic model of pocketsphinx. Training before decoding 
    # the speech samples is not mandatory.
    #<==============================================================
    # Compiling ps trainer package. 
    
    cd $PSt_TRAINING_WORKSPACE
    
    echo $"Copying model from pocketsphinx source to the working directory"
    cp -a /usr/local/share/pocketsphinx/model/en-us/en-us .
    echo $"Copying model: SUCCESSFUL"
    echo $"Copying dictionary from pocketsphinx source to the working directory"
    cp -a /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict .
    echo $"Copying dictionary: SUCCESSFUL"
    echo $"Copying lm from pocketsphinx source to the working directory"
    cp -a /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin .
    echo $"Copying lm: SUCCESSFUL"
    
    # Generating acoustic feature files 
    sphinx_fe -argfile $PSt_FEAT_PARAMS \
            -samprate 16000 -c $PSt_FILEID \
           -di . -do . -ei wav -eo mfc -mswav yes
    
    # Convert the compressed mdef into mdef.txt
    pocketsphinx_mdef_convert -text en-us/mdef en-us/mdef.txt
    
    # Copy bw, map-adapt and mk-s2sendump
    cp -a /usr/local/libexec/sphinxtrain/bw .
    cp -a /usr/local/libexec/sphinxtrain/map_adapt .
    cp -a /usr/local/libexec/sphinxtrain/mk_s2sendump .
    
    # Accumulating observation counts
    ./bw \
     -hmmdir en-us \
     -moddeffn en-us/mdef.txt \
     -ts2cbfn .ptm. \
     -feat 1s_c_d_dd \
     -svspec 0-12/13-25/26-38 \
     -cmn current \
     -agc none \
     -dictfn cmudict-en-us.dict \
     -ctlfn $PSt_FILEID \
     -lsnfn $PSt_TRANSCRIPTION \
     -accumdir .
    
    # Copy the model again so that the adapted files overwrite the copy
    cp -a en-us en-us-adapt
    
    # Copy mllr_solve executable first
    cp -a /usr/local/libexec/sphinxtrain/mllr_solve .
    
    # Create MLLR matrix for improved accuracy
    ./mllr_solve \
        -meanfn en-us/means \
        -varfn en-us/variances \
        -outmllrfn mllr_matrix -accumdir .
    
    # Use MAP adaptation technique
    ./map_adapt \
        -moddeffn en-us/mdef.txt \
        -ts2cbfn .ptm. \
        -meanfn en-us/means \
        -varfn en-us/variances \
        -mixwfn en-us/mixture_weights \
        -tmatfn en-us/transition_matrices \
        -accumdir . \
        -mapmeanfn en-us-adapt/means \
        -mapvarfn en-us-adapt/variances \
        -mapmixwfn en-us-adapt/mixture_weights \
        -maptmatfn en-us-adapt/transition_matrices
    
    # Moving the adapted model, dictionary and language model to the desired location
    mkdir $PSt_ADAPTED_MODEL_PATH
    cd $PSt_ADAPTED_MODEL_PATH
    mkdir en-us
    cd en-us
    cp -a /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict .
    cp -a /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin .
    mkdir en-us
    cd en-us
    cp -a $PSt_TRAINING_WORKSPACE"en-us-adapt/" .
    
    # Come back to Training workspace and delete unnecessary things
    cd $PSt_TRAINING_WORKSPACE
    
    # Delete the copied model
    rm cmudict-en-us.dict
    rm en-us.lm.bin
    rm -r en-us
    rm -r en-us-adapt
    rm bw 
    rm map_adapt
    rm mllr_solve
    rm mk_s2sendump
    rm gauden_counts
    rm mixw_counts
    # rm mllr_matrix
    rm tmat_counts
    find . -type f -name '*.mfc' -delete
    

    I also read that for a continuous model we need to pass the MLLR matrix on the command line, like -mllr mllr_matrix. But do we also need to change the model directory to the MAP-adapted model path?

    Any help would be highly appreciated!

     
  • Nickolay V. Shmyrev

    So am I using a continuous or a phonetically tied model?

    The default en-us model is phonetically tied (PTM). You need to use MAP adaptation for it, not MLLR.
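
    One way to check this from a script is to read the -model line of feat.params before choosing an adaptation step. The following is only a sketch: the function names model_type and pick_adaptation are hypothetical, and it assumes feat.params holds one "-option value" pair per line, as in the excerpt quoted in the question.

    ```shell
    #!/bin/bash
    # Hypothetical helper: print the value of the -model option in feat.params.
    # Assumes one "-option value" pair per line.
    model_type() {
        awk '$1 == "-model" { print $2; exit }' "$1"
    }

    # Hypothetical helper: map a model type to an adaptation method.
    # Only the ptm case comes from this thread (use MAP, not MLLR);
    # other types are left undecided here.
    pick_adaptation() {
        case "$1" in
            ptm) echo "map" ;;
            *)   echo "undecided" ;;
        esac
    }

    # Example use, guarded so it is a no-op outside the training workspace.
    if [ -f en-us/feat.params ]; then
        type=$(model_type en-us/feat.params)
        echo "model type: $type, adaptation: $(pick_adaptation "$type")"
    fi
    ```

    With a check like this, the mllr_solve step could be skipped whenever the model type is ptm.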

    I also read that for a continuous model we need to pass the MLLR matrix on the command line, like -mllr mllr_matrix. But do we also need to change the model directory to the MAP-adapted model path?

    You apply either MAP or MLLR; joint MAP+MLLR is not covered by the tutorial.
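
    To make the two alternatives concrete, here is a rough sketch of the decoder command line for each case. It prints the command rather than running it. The function name decode_cmd and the file name test.wav are hypothetical, and the directory names follow the script in the question (en-us for the original model, en-us-adapt for the MAP-adapted copy). The MLLR variant is shown only for comparison, since it is not the right choice for the default PTM model.

    ```shell
    #!/bin/bash
    # Hypothetical helper: print a pocketsphinx_continuous command for one mode.
    #   map  -> MAP-adapted model directory, no -mllr option
    #   mllr -> original model directory plus the MLLR transform file
    decode_cmd() {
        local mode=$1 wav=$2
        local common="-lm en-us.lm.bin -dict cmudict-en-us.dict -infile $wav"
        case "$mode" in
            map)  echo "pocketsphinx_continuous -hmm en-us-adapt $common" ;;
            mllr) echo "pocketsphinx_continuous -hmm en-us -mllr mllr_matrix $common" ;;
            *)    echo "unknown mode: $mode" >&2; return 1 ;;
        esac
    }

    # Dry run: show the command for the MAP-adapted model.
    decode_cmd map test.wav
    ```

    In the MAP case only the -hmm directory changes; in the MLLR case the original model directory stays and the transform is added with -mllr.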

     
