I have collected 30 hours of Indian English speech data and run experiments on both Sphinx and Kaldi, keeping the experimental conditions the same: feature extraction, number of Gaussians, tied states, and basic EM training, with no additional techniques such as SAT, fMLLR, or MMI.
The test set contains 8,000 utterances.
I am getting a WER of 6.5% on Sphinx and 4.3% on Kaldi.
I can't figure out the reason for the difference in accuracy.
Am I missing something important in the experiments?
Thanks in advance,
Bhargav
Last edit: bhargav 2016-02-29
To compare decoders fairly you need to tune the decoding and training parameters - number of Gaussians, beams, language weights - for each system separately. Those must be different for Kaldi and CMUSphinx, not the same. For CMUSphinx you generally need more Gaussians than for Kaldi, since CMUSphinx assigns them uniformly. A small difference in accuracy is normal. Also, for CMUSphinx you need different language weights (in fwdflat, fwdtree, and bestpath) due to the score normalizations inside the decoder.
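As a rough sketch of what tuning the three CMUSphinx language weights separately might look like (model paths, file names, and weight values here are placeholders, not recommendations - check `pocketsphinx_batch` in your build for the exact flags it supports):

```shell
#!/bin/sh
# Hypothetical sweep of the fwdtree language weight while holding the
# fwdflat and bestpath weights fixed; each pass writes its own hypothesis
# file so the resulting WERs can be compared afterwards.
for lw in 8 10 12; do
  pocketsphinx_batch \
      -hmm model/en-in \
      -lm model/lm.bin \
      -dict model/dict.dic \
      -ctl test.fileids \
      -hyp test_lw${lw}.hyp \
      -lw ${lw} \
      -fwdflatlw 8.5 \
      -bestpathlw 9.5
done
```

In practice each of the three weights would be swept in turn against a held-out development set, not the test set.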
Hi everyone,
I am testing the LIUM/CMU French language model on Kaldi, but the format doesn't seem to be the same.
Here is what I get when trying to evaluate its perplexity on my corpus:
$ ngram -lm ModelsRef/3gLIUM/trigram_LM.DMP.gz -ppl data/ACSYNT/Mix1/ACSYNTMix1_${p}_ts/text
ModelsRef/3gLIUM/trigram_LM.DMP.gz: line 182630: reached EOF before \end\
format error in lm file ModelsRef/3gLIUM/trigram_LM.DMP.gz
Is there a way to convert the model into a format that Kaldi can evaluate?
When I try with SphinxBase, here is the error:
$ sphinx_lm_convert -i ModelsRef/3gLIUM/trigram_LM.DMP -o model.lm -ofmt bin
Current configuration:
[NAME][DEFLT][VALUE]
-case
-debug 0
-help no no
-i ModelsRef/3gLIUM/trigram_LM.DMP
-ifmt
-logbase 1.0001 1.000100e+00
-mmap no no
-o model.lm
-ofmt bin
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(70): No \data\ mark in LM file
INFO: ngram_model_trie.c(445): Trying to read LM in dmp format
INFO: ngram_model_trie.c(527): ngrams 1=65533, 2=18408667, 3=22235344
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
ERROR: "ngram_model_trie.c", line 323: Error reading word strings (904402888 doesn't match n_unigrams 65533)
$ sphinx_lm_convert -i ModelsRef/3gLIUM/trigram_LM.DMP -o model.lm -ofmt arpa
Current configuration:
[NAME][DEFLT][VALUE]
-case
-debug 0
-help no no
-i ModelsRef/3gLIUM/trigram_LM.DMP
-ifmt
-logbase 1.0001 1.000100e+00
-mmap no no
-o model.lm
-ofmt arpa
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(70): No \data\ mark in LM file
INFO: ngram_model_trie.c(445): Trying to read LM in dmp format
INFO: ngram_model_trie.c(527): ngrams 1=65533, 2=18408667, 3=22235344
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
ERROR: "ngram_model_trie.c", line 323: Error reading word strings (904402888 doesn't match n_unigrams 65533)