Hi, I am a newbie to CMU Sphinx.
I created the language model using: http://www.speech.cs.cmu.edu/tools/lmtool-new.html
And trained the acoustic model following: https://cmusphinx.github.io/wiki/tutorialam/
I am using the latest release from https://cmusphinx.github.io/wiki/download/
The training data is very simple: every utterance has the form "Please take the characters <digits> once again <digits>".
There are 2,000 wav files for training (5 hours in total) and 500 files for testing, which took me two days to prepare.
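For reference, the control files under etc/ follow the layout from the acoustic model tutorial; the file names below are placeholders (only the db name "apple" is taken from the align output), not necessarily the exact ones in the uploaded db:
etc/apple_train.fileids (one recording per line, path without extension):
    file_0001
    file_0002
etc/apple_train.transcription (same order, utterance id in parentheses):
    <s> please take the characters one three eight seven two once again one three eight seven two </s> (file_0001)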
The training results look very good according to the align file:
TOTAL Words: 8256 Correct: 7278 Errors: 1105
TOTAL Percent correct = 88.15% Error = 13.38% Accuracy = 86.62%
TOTAL Insertions: 127 Deletions: 281 Substitutions: 697
However, I get very bad accuracy when running PocketSphinx/Sphinx4 with my model.
For example, file_2001.wav and file_2002.wav appear in the db.align file as:
please take the characters one three eight seven two once again one three eight seven two (apple-FILE_2001)
please take the characters one three eight seven two once again one three eight seven two (apple-FILE_2001)
Words: 16 Correct: 16 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%
Insertions: 0 Deletions: 0 Substitutions: 0
please take the characters one seven THREE nine one ONCE AGAIN one seven three nine ONE (apple-FILE_2002)
please take the characters one seven NINE nine one *** *** one seven three nine *** (apple-FILE_2002)
Words: 16 Correct: 12 Errors: 4 Percent correct = 75.00% Error = 25.00% Accuracy = 75.00%
But running pocketsphinx on the same files gives:
file_2001.wav : NINE ZERO EIGHT NINE TWO NINE ONE
file_2002.wav : NINE
The pocketsphinx command is:
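A minimal sketch of such a decode, assuming the model directory, language model, and dictionary produced by the steps above (the paths and the senone count in the model directory name are placeholders):
pocketsphinx_continuous \
    -hmm model_parameters/apple.cd_cont_200 \
    -lm etc/apple.lm \
    -dict etc/apple.dic \
    -infile file_2001.wav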
The training db, including etc, model_parameters and the results, can be found at: https://drive.google.com/open?id=0ByvxjxiH6xq3SEotczdNNVhacUU
And I've uploaded 10 wav files for testing.
Am I missing something in training or in how I use the model?
Last edit: kyon 2017-11-03
Running with pocketsphinx_batch gives a very good result.
The command is:
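A sketch following the testing section of the CMUSphinx acoustic model tutorial; the fileids, hypothesis file, and model paths are placeholders, not necessarily the exact values used here:
pocketsphinx_batch \
    -adcin yes \
    -cepdir wav \
    -cepext .wav \
    -ctl etc/apple_test.fileids \
    -lm etc/apple.lm \
    -dict etc/apple.dic \
    -hmm model_parameters/apple.cd_cont_200 \
    -hyp result/apple_test.hyp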
Result:
Any magic here?
Our software has several hidden tweaks to reduce captcha cracking capabilities.