Hi guys,
after upgrading from sphinx* 1.0.8 to 5prealpha, I noticed that recognition (via the sphinxtrain decode step and also pocketsphinx_batch itself) is much slower. The config and parameters stayed the same. The same goes for the latest Subversion version. The log file shows nothing suspicious; it just seems to be very slow, like 10 sentences in several minutes...
Did anyone experience this as well? Are weights and beam values interpreted in a different way? How should the config and parameters be changed when upgrading to the latest version?
The AM build does not seem to be affected by this.
Thanks for your help,
Benjamin
5prealpha is expected to be faster and significantly more accurate.
You are welcome to provide data to reproduce your problem, decoder configuration, files and exact times you see.
Hi Nickolay,
thanks for your help. The recognition of the test set (45 sentences) took about 2.5 h with 5prealpha. Decoding the same files with the same AM and config on the older sphinx version takes about 2-3 min.
Sharing the files for a build will take a while. Would the acoustic model be of any help? Here is the config file.
Also, I'm using an lmctl, but the same goes for a setup with a plain language model.
Thanks!
Last edit: bekoe 2015-06-24
I'm sorry, without test sentences I can't help you.
Do you mean the test sentences or the audio files for training?
I meant "test sentences", the ones you are running on. I also need your acoustic model. I need to reproduce your problems.
Alright, there you go:
https://www.dropbox.com/sh/b7rpjryneydb632/AAACMBRzMCzlL6Hy66YlQ6jKa?dl=0
Do you need anything else?
Sorry, there is no phonetic dictionary in the archive. I can't run the sample without the dictionary.
Oh I forgot that. I've uploaded it just now.
I run
My results for 5prealpha
~~~~~~~~~~~~~~
INFO: batch.c(777): TOTAL 92.32 seconds speech, 11.45 seconds CPU, 11.46 seconds wall
INFO: batch.c(779): AVERAGE 0.12 xRT (CPU), 0.12 xRT (elapsed)
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 7.98 CPU 0.086 xRT
INFO: ngram_search_fwdtree.c(435): TOTAL fwdtree 7.99 wall 0.087 xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 2.96 CPU 0.032 xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 2.97 wall 0.032 xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.51 CPU 0.005 xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.51 wall 0.005 xRT
~~~~~~~~~~~~~~
As expected, 5prealpha is faster.
You probably want to provide your decoding log if you were able to reproduce the original problem.
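For anyone reading the log figures: xRT is processing time divided by the duration of the audio, so values below 1.0 mean faster than real time. Below is a minimal Python sketch of that arithmetic, using the numbers from the batch.c TOTAL line. The function name is made up for illustration, and the second calculation assumes the 2.5 h run covered the same 92.32 seconds of speech, which the thread does not confirm.

```python
def real_time_factor(processing_seconds, speech_seconds):
    """Return xRT: processing time divided by audio duration (lower is faster)."""
    if speech_seconds <= 0:
        raise ValueError("speech_seconds must be positive")
    return processing_seconds / speech_seconds


# Figures taken from the batch.c TOTAL line: 92.32 s speech, 11.45 s CPU.
xrt_fast = real_time_factor(11.45, 92.32)
print(f"{xrt_fast:.2f} xRT")  # 0.12 xRT, matching the AVERAGE line in the log

# ASSUMPTION: the slow 2.5 h run decoded the same 92.32 s of speech.
xrt_slow = real_time_factor(2.5 * 3600, 92.32)
print(f"{xrt_slow:.0f} xRT")  # about 97 times slower than real time
```

Under that assumption, the slow setup is several hundred times slower than the reference run, which points to a configuration mismatch rather than ordinary tuning differences.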
Hi Nickolay,
thanks for your reply. Running the command you provided, I only get "er" as the result for all files. Same results for pocketsphinx_continuous with the -infile option.
When extracting the audio files' features with make_feats.pl and running pocketsphinx_batch with -cepext set to mfc instead of wav, the results are fine, but it just takes too long to recognize. Like several minutes.
So is there something wrong with the feature extraction?
Last edit: bekoe 2015-06-26
Yes, you need to add lines in feat.params:
-transform dct
-lifter 22
Worked like a charm! Thank you so much Nickolay
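For others who hit the same problem after upgrading, here is a small Python sketch that checks whether a feat.params text already contains the two settings from the reply above and adds them if not. It assumes the file holds one "-flag value" pair per line, as in the two lines quoted in the thread. The helper names and the sample contents are hypothetical, so treat this as an illustration, not as part of pocketsphinx or sphinxtrain.

```python
# The two settings named in the thread.
REQUIRED = {"-transform": "dct", "-lifter": "22"}


def parse_feat_params(text):
    """Parse '-flag value' lines into a dict, ignoring anything else."""
    params = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("-"):
            params[parts[0]] = parts[1]
    return params


def missing_settings(text, required=REQUIRED):
    """Return the required settings that are absent or set to another value."""
    params = parse_feat_params(text)
    return {flag: value for flag, value in required.items()
            if params.get(flag) != value}


def patch_feat_params(text, required=REQUIRED):
    """Drop any existing lines for the required flags, then append them."""
    kept = [line for line in text.splitlines()
            if not (line.split() and line.split()[0] in required)]
    kept += [f"{flag} {value}" for flag, value in required.items()]
    return "\n".join(kept) + "\n"


# Hypothetical file contents, for illustration only.
sample = "-lowerf 130\n-upperf 6800\n-nfilt 25\n"
print(missing_settings(sample))   # both settings are reported as missing
patched = patch_feat_params(sample)
print(patched)                    # original lines plus the two added ones
```

To apply it to a real model, read the model's feat.params into a string, pass it through patch_feat_params, and write the result back after making a backup.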