
Elapsed time for different Gaussian mixtures

2016-05-04
2016-06-02
  • Senjam Shantirani

    I have done training by changing the number of Gaussian mixtures.
    I find that as the number of Gaussian mixtures increases, the time taken also increases:

    PocketSphinx:
    4 GMM took 73 secs giving accuracy 49.20
    8 GMM took 90 secs giving accuracy 51.20
    16 GMM took 117 secs giving accuracy 54.117

    HDecode:
    4 GMM took 12 secs giving accuracy 52.86
    8 GMM took 23 secs giving accuracy 56.79
    16 GMM took 44 secs giving accuracy 59.48

    Can you please suggest why the time taken increases as we increase the number of Gaussians, and why it is so much higher for PocketSphinx compared to HDecode?

    I also find the trend reversed in Kaldi: as we increase the number of Gaussians the time taken reduces, while it stays around 200 secs in all runs. How and why does PocketSphinx differ from Kaldi in this trend?

    Am I doing something wrong, or is this the expected trend? I am not sure.

    Please advise.

    Senjam

     
    • Nickolay V. Shmyrev

      There are many parameters involved here - topn scoring, language weight, beams for decoding, multiple scoring passes. You can't simply change the number of Gaussians and keep everything else the same. It is also not clear how you measure the time; it seems you are doing something strange. You need to measure the decoding time only, not the application startup time.

      You need to provide more data - command lines, models, data - to get help on this issue. Without a clear understanding of what is going on, the numbers are sort of meaningless.

      Usually more Gaussians means more computation, so it is natural that decoding will take a longer time. As for Kaldi, you are probably measuring something different, not the actual computation.
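
      As a sketch of the kind of controlled comparison this needs (model directories, file names, and beam values here are only illustrative assumptions, not your setup): decode the same test set with each model while keeping every tuning parameter fixed, and take the speed from the decoder's own report:

      # decode with each model, everything else identical, then read xRT from the log
      for ngau in 4 8 16; do
          pocketsphinx_batch \
              -hmm model_${ngau}gau -lm an4.lm -dict an4.dic \
              -ctl test.fileids -cepdir feat -hyp test_${ngau}.hyp \
              -lw 13 -beam 1e-80 -wbeam 1e-40 \
              -logfn decode_${ngau}.log
          # the log ends with TOTAL/AVERAGE lines reporting CPU time and xRT
          grep "AVERAGE" decode_${ngau}.log
      done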

       
  • Senjam Shantirani

    For PocketSphinx I simply ran the command "sphinxtrain -s decode run", and a shell script measures the execution time of this command from start to end and finally prints the total time taken in minutes.

    If this is not the right method, kindly suggest how I should do this and which particular script's execution time I should measure.

    In Kaldi I did this in the run.sh:

    start=$(date +'%s')
    steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
        exp/mono/graph data/test exp/mono/decode_test || exit 1;
    echo "It took $(($(date +'%s') - $start)) seconds"

    Please suggest if this is NOT proper.

     
    • Nickolay V. Shmyrev

      You need to take the time taken and the real-time ratio from the decoder logs; both CMUSphinx and Kaldi report that.

      In CMUSphinx:

      INFO: batch.c(778): TOTAL 474.73 seconds speech, 1263.35 seconds CPU, 1264.48 seconds wall
      INFO: batch.c(780): AVERAGE 2.66 xRT (CPU), 2.66 xRT (elapsed)
      

      In Kaldi:

      LOG (gmm-latgen-faster:main():gmm-latgen-faster.cc:176) Time taken 191.82s: real-time factor assuming 100 frames/sec is 0.204942
      

      Measuring the time of the whole script is sort of senseless, since it includes the initialization time and other utility time.
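
      For example, a quick way to pull those numbers out of the logs (the log file name and the Kaldi decode directory below are assumptions based on a typical setup - adjust them to your layout):

      # CMUSphinx batch decoder: speech/CPU/wall seconds and the xRT summary
      grep -E "TOTAL|AVERAGE" decode.log

      # Kaldi: each decoding job writes its own log with a "Time taken" line
      grep "Time taken" exp/mono/decode_test/log/decode.*.log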

       
  • Senjam Shantirani

    Thank you Nickolay.

    I have changed the language weight to 13 and the accuracy improves on an4, with 23.8% WER.

    Can you please let me know about the following:

    topn scoring: Is this related to accuracy, or only to saving training time?
    I set $CFG_CI_TOPN = 8; $CFG_CD_TOPN = 8;
    but all I get are models for CD only, up to 8 Gaussians. I cannot see CI in the model parameters.

    Beams for decoding: what ranges can we use for the word and global (sentence) beams?

    Multiple scoring passes: what are the important scoring passes?

    Also, I came to know from a search about maxhmmpf and maxwpf. Where can I set them?
    Also, are the parameter settings mentioned above available for HTK and Kaldi?

    Regards,
    Shanti

     
    • Nickolay V. Shmyrev

      topn scoring: Is this related to accuracy, or only to saving training time?

      Topn scoring improves training speed. It slightly reduces the model accuracy because it doesn't score all Gaussians in every training step.

      But all I get are models for CD only, up to 8 Gaussians. I cannot see CI in the model parameters.

      To enable CI mgau training you can set $CFG_CD_TRAIN in the config file.

      Beams for decoding: what ranges can we use for the word and global (sentence) beams?

      1e-10 to 1e-200

      Multiple scoring passes: what are the important scoring passes?

      You can read http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.72.3560 about that

      Also, I came to know from a search about maxhmmpf and maxwpf. Where can I set them?

      In the decoder script; there is no configuration option for them in sphinxtrain.
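
      For illustration, they would be passed on the pocketsphinx_batch command line together with the beams discussed above; the model, dictionary, and file names below are only placeholders, and the values are not tuned recommendations:

      pocketsphinx_batch \
          -hmm model_parameters/an4.cd_cont_8 -lm an4.lm -dict an4.dic \
          -ctl test.fileids -cepdir feat -hyp test.hyp \
          -lw 13 -beam 1e-80 -wbeam 1e-40 \
          -maxhmmpf 10000 -maxwpf 20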

      Also, are the parameter settings mentioned above available for HTK and Kaldi?

      Yes, they are just called differently.

       
  • Senjam Shantirani

    Thank you so much, Nickolay; I don't know how to thank you enough.
    Open source is changing the world. I am new to this, but I will work harder to contribute back.

     
