I have done training by changing the number of Gaussian mixtures.
I find that as the number of Gaussian mixtures increases, the time taken also increases:
PocketSphinx:
4 GMM took 73 secs giving accuracy 49.20
8 GMM took 90 secs giving accuracy 51.20
16 GMM took 117 secs giving accuracy 54.117
HDecode:
4 GMM took 12 secs giving accuracy 52.86
8 GMM took 23 secs giving accuracy 56.79
16 GMM took 44 secs giving accuracy 59.48
Can you please suggest why the time taken increases as we increase the number of Gaussians, and why it is so much higher for PocketSphinx compared to HDecode?
Also, I find the reverse trend in Kaldi: as we increase the number of Gaussians the time taken decreases, though it stays around 200 secs in all runs. How and why does PocketSphinx differ from Kaldi in this trend?
Am I doing something wrong, or is this the expected trend? I am not sure.
Please advise.
Senjam
There are many parameters involved here - topn scoring, language weight, beams for decoding, multiple scoring passes. You can't simply change the number of Gaussians and keep everything else the same. It is also not clear how you measure the time; it seems you are doing something strange. You need to measure the decoding time only, not the application startup time.
You need to provide more data - command lines, models, data - to get help on this issue. Without a clear understanding of what is going on, the numbers are sort of meaningless.
Usually more Gaussians mean more computation, so it is natural that decoding will take longer. As for Kaldi, you are probably measuring something different, not the actual computation.
For PocketSphinx: I simply ran the command "sphinxtrain -s decode run", and a shell script measures the execution time of this command from start to end and finally prints the total time taken in minutes.
If this is not the right method, kindly suggest how I should do this, and which particular script's execution time I should measure.
In Kaldi I did this in the run.sh:
start=$(date +'%s')
steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
  exp/mono/graph data/test exp/mono/decode_test || exit 1;
echo "It took $(($(date +'%s') - $start)) seconds"
Please suggest if this is not the proper way.
You need to take the time taken and the real-time ratio from the decoder logs; both CMUSphinx and Kaldi report them.
In cmusphinx:
In Kaldi:
Measuring the time of the whole script is sort of senseless, since it includes the initialization time and other utility time.
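As a rough illustration of this advice (not part of the original reply), something like the following could pull the decoder-reported timing figures out of the logs. The log paths and the exact wording of the timing lines are assumptions here - check what your own run actually writes:

# Hypothetical log locations; adjust to your setup.
# PocketSphinx batch decoding reports xRT (real-time) figures in its decode log:
grep -i "xRT" logdir/decode/*.log
# Kaldi's steps/decode.sh writes per-job logs; the decoder reports a real-time factor there:
grep -i "real-time factor" exp/mono/decode_test/log/decode.*.log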
Thank you Nickolay.
I have changed the language weight to 13 and the accuracy on an4 improves, with a WER of 23.8%.
Can you please let me know about the following:
topn scoring: Is this related to accuracy, or is it only for reducing training time?
I set $CFG_CI_TOPN = 8; $CFG_CD_TOPN = 8;
but all I get are CD models up to 8 Gaussians; I cannot see CI models in the model parameters.
Beams for decoding: what ranges can we use for the word beam and the global (sentence) beam?
Multiple scoring passes: which are the important scoring passes?
Also, I came to know from a search about maxhmmpf and maxwpf. Where can I set them?
Also, are the settings for the parameters mentioned above available in HTK and Kaldi?
Regards,
Shanti
Topn scoring improves training speed. It slightly reduces model accuracy because it does not score all Gaussians in every training step.
To enable CI mgau training you can set $CFG_CD_TRAIN in the config file.
Beams for decoding: the usable range is roughly 1e-10 to 1e-200.
Multiple scoring passes: you can read http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.72.3560 about that.
maxhmmpf and maxwpf: you set them in the decoder script; there is no configuration option for them in sphinxtrain (see the sketch after this reply).
Yes, they are just called differently.
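As an illustration of how those beam and pruning options are usually passed to the PocketSphinx decoder (not part of the original reply; the model and file paths below are placeholders and the values are examples, not recommendations):

# Hypothetical pocketsphinx_batch invocation; all paths and values are placeholders.
pocketsphinx_batch \
  -hmm model_parameters/an4.cd_cont_200 \
  -lm etc/an4.lm.DMP \
  -dict etc/an4.dic \
  -ctl etc/an4_test.fileids \
  -cepdir feat -cepext .mfc \
  -hyp result/an4.match \
  -lw 13 \
  -beam 1e-80 -wbeam 1e-40 \
  -maxhmmpf 10000 -maxwpf 20 \
  -topn 4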
Thank you so much Nickolay, I don't know how to thank you.
Open source is changing the world. I am new to this, but I will work harder to contribute back.