Help to improve accuracy

Help
Shammur
2012-04-24
2012-09-22
  • Shammur

    Shammur - 2012-04-24

    Hello,

    I am training a continuous model on around 20K sentences, which gives an
    n_tri of around 98785 using the CMU dictionary.

    My LM has a perplexity of around 14.3 on the test text.

    I trained my last model with 3 states/HMM, 32 Gaussian mixtures, and 3000
    tied states.
    I used forced alignment.
    My wav files are in NIST SPHERE format.

    Please let me know what I can do to improve it. At present my WER = 91.7%,
    which is really not expected.
    I have tried different numbers of tied states, but I don't know which one I
    should use, or whether there is a calculation by which I can compute it.

    My model, etc, logdir, a sample wav and an align file are at the link below:
    http://www.mediafire.com/file/5qy1x01186tqzdr/help.tar.gz

    Any suggestion or comment would be really helpful.

    At the moment I am re-running the training with 3 states/HMM, 16 Gaussian
    mixtures and 8000 tied states, which takes around 15 hours to train.

    Thanks in advance

     
  • Shammur

    Shammur - 2012-04-24

    And here are the other details:

    Mon Apr 23 09:10:55 CEST 2012
    MODULE: 00 verify training files
    O.S. is case sensitive ("A" != "a").
    Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the
    phonelist file.
    Found 133120 words using 40 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the
    dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive);
    files exist
    Phase 4: CTL - Checking number of lines in the transcript should match lines
    in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems
    reasonable.
    Estimated Total Hours Training: 15.6430972222222
    Rule of thumb suggests 3000, however there is no correct answer
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the
    dictionary
    Words in dictionary: 133117
    Words in filler dictionary: 3
    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in
    the phonelist, and all phones in the phonelist appear at least once
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 30 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = 26.1319655403433
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 12 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 26.2070756188023
    Training completed after 2 iterations
    MODULE: 11 Force-aligning transcripts
    Phase 1: Cleaning up directories:
    logs...output...qmanager...
    Phase 3: Creating dictionary for alignment...
    Phase 4: Creating transcript for alignment...
    Phase 5: Running force alignment in 1 parts
    Force alignment starting: (1 of 1)
    0%
    MODULE: 12 Force-aligning data for VTLN
    Skipped: $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Copy initialize from falign model
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 1
    Current Overall Likelihood Per Frame = 26.2272230474393
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 26.3358971786455
    Convergence Ratio = 0.108674131206236
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 2 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 26.5023794074265
    Convergence Ratio = 0.166482228781007
    Baum welch starting for 1 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 18 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = 26.7898282866141
    Convergence Ratio = 0.2874488791876
    Baum welch starting for 1 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 96 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = 26.9731678517901
    Convergence Ratio = 0.183339565176016
    Baum welch starting for 1 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 168 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = 27.0386115539069
    Training completed after 6 iterations
    MODULE: 30 Training Context Dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Initialization
    Phase 3: Forward-Backward
    Baum welch starting for iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 204 ERROR messages and 1 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    WARNING: This step had 0 ERROR messages and 96 WARNING messages. Please check
    the log file for details.
    Current Overall Likelihood Per Frame = 27.0620323767413
    Baum welch starting for iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 200 ERROR messages and 1 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    WARNING: This step had 0 ERROR messages and 96 WARNING messages. Please check
    the log file for details.
    Current Overall Likelihood Per Frame = 27.3165607504742
    Training completed after 2 iterations
    MODULE: 40 Build Trees
    Phase 1: Cleaning up old log files...
    Phase 2: Make Questions
    Phase 3: Tree building
    Processing each phone with each state
    AA 0
    AA 1
    AA 2
    AE 0
    AE 1
    AE 2
    AH 0
    AH 1
    AH 2
    AO 0
    AO 1
    AO 2
    AW 0
    AW 1
    AW 2
    AY 0
    AY 1
    AY 2
    B 0
    B 1
    B 2
    CH 0
    CH 1
    CH 2
    D 0
    D 1
    D 2
    DH 0
    DH 1
    DH 2
    EH 0
    EH 1
    EH 2
    ER 0
    ER 1
    ER 2
    EY 0
    EY 1
    EY 2
    F 0
    F 1
    F 2
    G 0
    G 1
    G 2
    HH 0
    HH 1
    HH 2
    IH 0
    IH 1
    IH 2
    IY 0
    IY 1
    IY 2
    JH 0
    JH 1
    JH 2
    K 0
    K 1
    K 2
    L 0
    L 1
    L 2
    M 0
    M 1
    M 2
    N 0
    N 1
    N 2
    NG 0
    NG 1
    NG 2
    OW 0
    OW 1
    OW 2
    OY 0
    OY 1
    OY 2
    P 0
    P 1
    P 2
    R 0
    R 1
    R 2
    S 0
    S 1
    S 2
    SH 0
    SH 1
    SH 2
    T 0
    T 1
    T 2
    TH 0
    TH 1
    TH 2
    UH 0
    UH 1
    UH 2
    UW 0
    UW 1
    UW 2
    V 0
    V 1
    V 2
    W 0
    W 1
    W 2
    Y 0
    Y 1
    Y 2
    Z 0
    Z 1
    Z 2
    ZH 0
    ZH 1
    ZH 2
    Skipping SIL
    MODULE: 45 Prune Trees
    Phase 1: Tree Pruning
    Phase 2: State Tying
    MODULE: 50 Training Context dependent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...
    Phase 2: Copy CI to CD initialize
    Phase 3: Forward-Backward
    Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 204 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 3 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 27.0620323767413
    Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 200 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 27.1756667388801
    Convergence Ratio = 0.113634362138786
    Baum welch starting for 1 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 210 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 27.2068794897292
    Split Gaussians, increase by 1
    Current Overall Likelihood Per Frame = 27.2068794897292
    Convergence Ratio = 0.0312127508491464
    Baum welch starting for 2 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 208 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 147 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 27.0087931553322
    Baum welch starting for 2 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 210 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 27.2304503029369
    Convergence Ratio = 0.221657147604699
    Baum welch starting for 2 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 208 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 27.2678195942578
    Split Gaussians, increase by 2
    Current Overall Likelihood Per Frame = 27.2678195942578
    Convergence Ratio = 0.0373692913208608
    Baum welch starting for 4 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 206 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 316 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 27.0696735549779
    Baum welch starting for 4 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 208 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 27.3343836749616
    Convergence Ratio = 0.264710119983711
    Baum welch starting for 4 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 206 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 27.4163755213446
    Split Gaussians, increase by 4
    Current Overall Likelihood Per Frame = 27.4163755213446
    Convergence Ratio = 0.0819918463830192
    Baum welch starting for 8 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 210 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 516 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 27.2537423845428
    Baum welch starting for 8 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 208 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 27.532755561439
    Convergence Ratio = 0.279013176896225
    Baum welch starting for 8 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 208 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 27.6295347126804
    Split Gaussians, increase by 8
    Current Overall Likelihood Per Frame = 27.6295347126804
    Convergence Ratio = 0.0967791512413747
    Baum welch starting for 16 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 206 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 1152 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 27.4939123333241
    Baum welch starting for 16 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 204 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 27.8240073740902
    Convergence Ratio = 0.330095040766142
    Baum welch starting for 16 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 200 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 27.9680046230904
    Convergence Ratio = 0.143997249000204
    Baum welch starting for 16 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 200 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = 28.0956212601184
    Convergence Ratio = 0.127616637028019
    Baum welch starting for 16 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 200 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = 28.2003838146574
    Convergence Ratio = 0.10476255453899
    Baum welch starting for 16 Gaussian(s), iteration: 6 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 196 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 6
    Current Overall Likelihood Per Frame = 28.2845113158031
    Split Gaussians, increase by 16
    Current Overall Likelihood Per Frame = 28.2845113158031
    Convergence Ratio = 0.0841275011457157
    Baum welch starting for 32 Gaussian(s), iteration: 1 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 196 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 1
    This step had 1838 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Current Overall Likelihood Per Frame = 28.1444890706236
    Baum welch starting for 32 Gaussian(s), iteration: 2 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 194 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 2
    Current Overall Likelihood Per Frame = 28.4805715265889
    Convergence Ratio = 0.336082455965343
    Baum welch starting for 32 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 196 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 28.623325488486
    Convergence Ratio = 0.142753961897132
    Baum welch starting for 32 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 196 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 4
    Current Overall Likelihood Per Frame = 28.7337483050926
    Convergence Ratio = 0.110422816606597
    Baum welch starting for 32 Gaussian(s), iteration: 5 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    This step had 196 ERROR messages and 0 WARNING messages. Please check the log
    file for details.
    Normalization for iteration: 5
    Current Overall Likelihood Per Frame = 28.8125174122122
    Split Gaussians, increase by 0
    Training for 32 Gaussian(s) completed after 5 iterations
    MODULE: 60 Lattice Generation
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 61 Lattice Pruning
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 62 Lattice Format Conversion
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 65 MMIE Training
    Skipped: $ST::CFG_MMIE set to 'no' in sphinx_train.cfg
    MODULE: 90 deleted interpolation
    Skipped for continuous models
    Tue Apr 24 03:55:36 CEST 2012
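
    The summary above reports ERROR messages in almost every Baum-Welch step. A
    minimal sketch for totalling them per module, assuming only the two line
    shapes visible in this log ("MODULE: NN ..." and "This step had N ERROR
    messages ..."); the function name errors_per_module is hypothetical:

```python
import re

def errors_per_module(log_text):
    """Sum the reported ERROR counts under each 'MODULE: NN' heading."""
    totals = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        module = re.match(r"MODULE:\s+(\d+)", line)
        if module:
            current = module.group(1)
            totals.setdefault(current, 0)
            continue
        errors = re.search(r"had (\d+) ERROR messages", line)
        if errors and current is not None:
            totals[current] += int(errors.group(1))
    return totals

sample = """
MODULE: 10 Training Context Independent models for forced alignment and VTLN
This step had 30 ERROR messages and 0 WARNING messages. Please check the log
file for details.
This step had 12 ERROR messages and 0 WARNING messages. Please check the log
file for details.
MODULE: 11 Force-aligning transcripts
"""
print(errors_per_module(sample))
```

    The counts only say where to look; the per-step log files still have to be
    opened to see what the errors actually are.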

     
  • Nickolay V. Shmyrev

    Your feature extraction is not correct.

    Your wav files are encoded with shorten.

    You need to decode them with sph2pipe before training, or you can use the
    -sph2pipe yes option of sphinx_fe. You need to have sph2pipe installed
    either way. To change the option you need to edit the feature extraction
    script; there is no way to do it from a configuration file.
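
    One way to see whether a SPHERE file is shorten-compressed is to read its
    text header. The sketch below assumes the usual NIST SPHERE layout: a first
    line "NIST_1A", a second line giving the header size, then "name type value"
    fields up to "end_head", with the compression named in the sample_coding
    field. The function names are hypothetical, and the sketch only inspects the
    header; it does not decode audio:

```python
def read_sphere_header(data: bytes) -> dict:
    """Parse the text header of a NIST SPHERE file into {field: value}."""
    lines = data.split(b"\n")
    if len(lines) < 2 or lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    header_size = int(lines[1].decode("ascii").strip())
    text = data[:header_size].decode("ascii", errors="replace")
    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)  # name, type (e.g. -i, -s26), value
        if len(parts) == 3:
            fields[parts[0]] = parts[2]
    return fields

def is_shorten_compressed(fields: dict) -> bool:
    """True if the sample_coding field mentions shorten."""
    return "shorten" in fields.get("sample_coding", "pcm")

# Synthetic header for illustration, padded to the declared 1024 bytes.
example = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 16000\n"
    b"sample_coding -s26 pcm,embedded-shorten-v2.00\n"
    b"end_head\n"
).ljust(1024)
print(is_shorten_compressed(read_sphere_header(example)))
```

    For a real file, passing the first few kilobytes read in binary mode should
    be enough, since the header is plain text at the start of the file.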

     
  • Shammur

    Shammur - 2012-04-24

    Thank you.
    I am trying it now and will let you know the result.

     
  • Shammur

    Shammur - 2012-04-26

    Hello,

    I have tried the sph2pipe option.
    My new model: http://www.mediafire.com/file/lrdepc4aqk2ndey/help_2.tar.gz
    I am still getting a WER of 92.5%.

    Is my feature extraction OK, or is this error due to a wrong choice of tied
    states?

    Is there any way to decide how many tied states I should use?

    This training has parameters: 3 states/HMM, 16 Gaussian mixtures, tied
    states = 8000

    Please advise.

    Thanks in advance

     
  • Nickolay V. Shmyrev

    It doesn't seem you used sph2pipe correctly. You need to provide an example
    feature file for me to say more. You can control CMN value in the logs, it
    must be around 10-12.

     
  • Nickolay V. Shmyrev

    The mfc file is not correct. It doesn't look like you installed sph2pipe
    properly. You can dump the file with

     sphinx_cepview -f file.mfc
    

    The proper values should look like:

      6.951  -0.736  -0.112  -0.182  -0.146  -0.017  -0.261  -0.116  -0.202  -0.078 
      6.483  -0.736  -0.040  -0.243  -0.100  -0.072  -0.281  -0.076  -0.079  -0.008 
      6.676  -0.795  -0.136  -0.213  -0.112  -0.061  -0.325   0.039   0.006  -0.065 
      6.898  -0.614  -0.293  -0.249  -0.145  -0.193  -0.139  -0.042  -0.116  -0.078 
      7.093  -0.675  -0.152  -0.284  -0.194  -0.129  -0.317  -0.076  -0.081  -0.071 
      6.938  -0.567  -0.056  -0.235  -0.159  -0.054  -0.358  -0.211  -0.063  -0.048
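
    The same inspection can be sketched in Python. This assumes the usual Sphinx
    feature file layout as I understand it: a 4-byte integer giving the number
    of floats, followed by that many 32-bit floats, 13 per frame by default, in
    either byte order. The function name read_mfc is hypothetical, and
    sphinx_cepview remains the reference tool:

```python
import struct

def read_mfc(data: bytes, ncep: int = 13):
    """Return a list of frames (tuples of ncep floats) from raw .mfc bytes."""
    if len(data) < 4:
        raise ValueError("file too short to hold a header")
    payload = len(data) - 4
    # Try both byte orders; the right one makes the header match the size.
    for order in ("<", ">"):
        (count,) = struct.unpack(order + "i", data[:4])
        if count >= 0 and count * 4 == payload:
            break
    else:
        raise ValueError("header count does not match file size")
    if count % ncep:
        raise ValueError("float count is not a multiple of ncep")
    values = struct.unpack(f"{order}{count}f", data[4:])
    return [values[i:i + ncep] for i in range(0, count, ncep)]

# Synthetic two-frame file for illustration (big-endian).
frame = [6.951, -0.736, -0.112, -0.182, -0.146, -0.017, -0.261,
         -0.116, -0.202, -0.078, 0.0, 0.0, 0.0]
raw = struct.pack(">i", 26) + struct.pack(">26f", *(frame + frame))
frames = read_mfc(raw)
print(len(frames), round(frames[0][0], 3))
```

    With frames in hand, the first coefficient of each frame can be compared
    against the sample values shown above.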
    
     
  • Shammur

    Shammur - 2012-05-02

    Thank You,

    It works now, WER = 8.3%

    Now I will be checking how the results differ with different parameters.

    Thanks for all your help

     
  • Shammur

    Shammur - 2012-05-03

    Can you please give me any information about how I can control the CMN value?

    Thanks

     
  • Nickolay V. Shmyrev

    Can you please give me any information about how I can control the CMN value?

    When I wrote you this:

    You can control CMN value in the logs, it must be around 10-12.

    I meant that you can open the log, read it, and look at the first value in
    the CMN line.
    It looks like this:

    INFO: cmn.c(175): CMN:  7.30 -0.08 -0.06  0.19 -0.33 -0.00 -0.12 -0.01  0.07  0.01 -0.21 -0.18 -0.
    

    You need to make sure that the first value is reasonable.
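
    A minimal sketch of that check, assuming log lines shaped like the one
    above. The function names are hypothetical, and the acceptable range is left
    to the caller rather than hard-coded:

```python
import re

# First number after "CMN:" on a log line.
CMN_FIRST = re.compile(r"\bCMN:\s*(-?\d+(?:\.\d+)?)")

def cmn_first_values(log_lines):
    """Collect the first CMN value from every line that has a CMN entry."""
    values = []
    for line in log_lines:
        match = CMN_FIRST.search(line)
        if match:
            values.append(float(match.group(1)))
    return values

def out_of_range(values, low, high):
    """Return the values that fall outside [low, high]."""
    return [v for v in values if not (low <= v <= high)]

log = [
    "INFO: cmn.c(175): CMN:  7.30 -0.08 -0.06  0.19 -0.33 -0.00 -0.12",
    "INFO: some other line without the value",
]
print(cmn_first_values(log))
```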

     
