Unable to test WER

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Creator: Simen H

Created: 2019-03-05

Updated: 2019-03-18

Simen H - 2019-03-05

After running the entire sphinxtrain training, we are unable to test the WER with the decoder. This does not appear to be related to the training process: according to our logs, all previous steps completed up to this point.

When we run sphinxtrain -s decode run, the only output is "MODULE: DECODE Decoding using models previously trained". It never continues to "Decoding 130 segments starting at 0 (part 1 of 1)". The decoder creates 2 files in the result folder, but not the align file. We are using the sphinxtrain-master version from GitHub for training.

I know for sure that the word_align.pl script exists on the machine, but it seems that the decoder is unable to access it. The path to the script is "/usr/local/lib/sphinxtrain/scripts/decode/word_align.pl".

Is there any way to tell the decoder where the script is? I have tried adding its location to PATH, without success.

- Nickolay V. Shmyrev - 2019-03-05
  
  The decoder creates 2 files in the result folder, but not the align file.
  
  What is inside those files? What is inside the decoding log in logdir/decode?
  

Simen H - 2019-03-05

Attached are the logdir/decode and result files.

logdir_result.zip

- Nickolay V. Shmyrev - 2019-03-05
  
  Ok, share testing/04/r5310980/u0980055.wav
  

Simen H - 2019-03-05

I sent you a DM.

- Nickolay V. Shmyrev - 2019-03-05
  
  You didn't convert the training data into the proper format. It should be 16 kHz, 16-bit mono PCM. You can convert the files with sph2pipe.
  
  Because the data is garbage, the decoder runs out of memory and crashes.
  
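One way to check whether a file matches the format described above is to read its header directly. The following is only a sketch, and it assumes the files are RIFF/WAVE containers (files with a NIST SPHERE header, which sph2pipe is meant for, will be rejected). The function names `read_wav_format` and `is_pcm16_mono` are made up for illustration and are not part of sphinxtrain. The format tags come from the WAVE specification: 1 is linear PCM, 6 is A-law, 7 is mu-law.

```python
import struct

# WAVE format tags as defined by the RIFF/WAVE specification.
FORMAT_NAMES = {1: "PCM", 3: "IEEE float", 6: "A-law", 7: "mu-law"}


def read_wav_format(data: bytes) -> dict:
    """Parse the 'fmt ' chunk of a RIFF/WAVE file held in memory.

    Hypothetical helper for illustration; raises ValueError when the
    bytes are not a RIFF/WAVE file or no 'fmt ' chunk is present.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            # tag, channels, sample rate, byte rate, block align, bits
            tag, channels, rate, _, _, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8
            )
            return {
                "format": FORMAT_NAMES.get(tag, "unknown (%d)" % tag),
                "channels": channels,
                "rate": rate,
                "bits": bits,
            }
        # Chunks are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    raise ValueError("no 'fmt ' chunk found")


def is_pcm16_mono(fmt: dict, expected_rate: int) -> bool:
    """True when the header describes 16-bit mono linear PCM at expected_rate."""
    return (
        fmt["format"] == "PCM"
        and fmt["channels"] == 1
        and fmt["bits"] == 16
        and fmt["rate"] == expected_rate
    )
```

Reading the first few hundred bytes of each file, for example with `open(path, "rb").read(512)`, is normally enough to reach the 'fmt ' chunk, so a whole test set can be scanned quickly.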

Simen H - 2019-03-05

We are training a model for 8 kHz (specified in sphinx_train.cfg). Do we have to specify this when running the decoder as well?

- Nickolay V. Shmyrev - 2019-03-05
  
  No, it picks the values from the cfg file.
  

Simen H - 2019-03-05

Upon further investigation I have found that some of the wav files are PCM MU-LAW, while others are PCM ALAW. Do you think this might be the problem?

EDIT:
I would like to add that the model we have trained has very poor recognition accuracy, which is why testing the WER is so important.

Last edit: Simen H 2019-03-05

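For reference, mu-law and A-law are 8-bit companded encodings (G.711), so their sample bytes cannot be read as 16-bit linear PCM without being expanded first. The sketch below shows the standard G.711 expansion in Python. It is an illustration of what a conversion step has to do, not what sph2pipe or sphinxtrain does internally, and the function names are made up for this example. It does not resample: 8 kHz input stays 8 kHz.

```python
import struct


def ulaw_to_linear(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                      # mu-law bytes are stored inverted
    t = ((u & 0x0F) << 3) + 0x84          # mantissa plus bias
    t <<= (u & 0x70) >> 4                 # scale by the segment (exponent)
    return (0x84 - t) if (u & 0x80) else (t - 0x84)


def alaw_to_linear(byte: int) -> int:
    """Expand one G.711 A-law byte to a signed 16-bit linear sample."""
    a = byte ^ 0x55                       # undo the even-bit inversion
    t = (a & 0x0F) << 4
    seg = (a & 0x70) >> 4
    if seg == 0:
        t += 8
    else:
        t += 0x108
        t <<= seg - 1
    return t if (a & 0x80) else -t        # in A-law a set sign bit means positive


def g711_to_pcm16(payload: bytes, law: str) -> bytes:
    """Decode a G.711 payload ('ulaw' or 'alaw') to little-endian 16-bit PCM."""
    decode = {"ulaw": ulaw_to_linear, "alaw": alaw_to_linear}[law]
    samples = [decode(b) for b in payload]
    return struct.pack("<%dh" % len(samples), *samples)
```

The returned bytes could then be written out as a 16-bit mono file, for instance with Python's standard `wave` module. Because the two laws decode differently, a data set that mixes them needs the encoding checked per file.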
