So I'm trying to adapt the en-us model with ~400 recordings. About 150 of the recordings are of a single word and these appear to work fine, but the rest are sentences 10-30 words long. When running bw, all of the longer recordings fail with:
ERROR: "backward.c", line 421: Failed to align audio to trancript: final state of the search is not reached
ERROR: "baum_welch.c", line 324: cleaned/263_0 ignored
After searching this forum I found someone with a similar problem, which was fixed by making sure that the feat.params used to create the mfc files were identical to the arguments given to bw. I ran through the process again paying strict attention to these params, and the problem remains.
This is the command I used to create the mfc files:
sphinx_fe -argfile cmusphinx-en-us-5.2/feat.params -samprate 16000 -c ../etc/voiceVim_train.fileids -di ../wav -do . -ei wav -eo mfc -mswav yes
This is how I run bw:
./bw -hmmdir cmusphinx-en-us-5.2 -moddeffn cmusphinx-en-us-5.2/mdef -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -varnorm no -cmninit 40,3,-1 -dictfn mydict.dict -ctlfn ../etc/voiceVim_train.fileids -lsnfn ../etc/voiceVim_train.transcription -accumdir .
Any ideas what I am doing wrong?
You are asking a question about alignment without providing a data example.
Forgot to subscribe to thread. My mistake.
You mean you want examples of the wav files I'm trying to adapt with?
Here is one wav each of the kind that works (short) and the kind that doesn't (long). I also included the transcription file in case that's helpful. Let me know if you need more than this.
Edit: first attachment was a bad tarball
Last edit: Benjamin Roye 2017-02-01
Possibly you have out-of-vocabulary words (though there may be other problems, too). In your 19_0 example, "xray" does not exist in the CMU dict (only "x-ray"). I did not check the other words.
This was an issue at first, but I added all of the missing words to the CMU dict. This expanded dict is what I've used in the call to bw.
Last edit: Benjamin Roye 2017-02-01
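For anyone who wants to double-check the same thing, here is a minimal sketch that lists transcript words missing from the dictionary. It assumes the transcription uses the usual SphinxTrain layout (`<s> words </s> (utterance_id)`) and that the dictionary has one pronunciation per line, with alternates written as `word(2)`. The name `list_oov` is a hypothetical helper, not something shipped with SphinxTrain.

```shell
# Hypothetical helper: print each transcript word that has no dictionary entry.
# Usage: list_oov DICT TRANSCRIPTION
list_oov() {
    awk '
        NR == FNR {                        # first file: the dictionary
            word = $1
            sub(/\([0-9]+\)$/, "", word)   # strip alternate marker: "read(2)" -> "read"
            known[word] = 1
            next
        }
        {                                  # second file: the transcription
            for (i = 1; i <= NF; i++) {
                w = $i
                if (w == "<s>" || w == "</s>") continue    # sentence markers
                if (i == NF && w ~ /^\(.*\)$/) continue    # trailing (utterance_id)
                if (!(w in known) && !(w in seen)) {       # report each missing word once
                    seen[w] = 1
                    print w
                }
            }
        }
    ' "$1" "$2"
}
```

With the files from this thread the call would be `list_oov mydict.dict ../etc/voiceVim_train.transcription`. Empty output means every word was found. The comparison is case-sensitive, so the case used in the transcription has to match the dictionary.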
I have done so, but it was in an edit so that might not be caught by the email.
Last edit: Nickolay V. Shmyrev 2017-02-01
You need to add
-lda cmusphinx-en-us-5.2/feature_transform
as described in the tutorial.
That was the problem. Thanks for your help, Nickolay and Arseniy!
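For reference, the fix amounts to passing the model's feature_transform file to bw through -lda. The sketch below is one way to wire that in, assuming the model directory is the cmusphinx-en-us-5.2 folder used above. `lda_args` is a hypothetical helper name introduced only for illustration. The full bw call is left commented out so the snippet has no side effects.

```shell
# Hypothetical helper (not part of SphinxTrain): print the -lda argument for bw
# when the model directory contains a feature_transform file, nothing otherwise.
lda_args() {
    model_dir=$1
    if [ -f "$model_dir/feature_transform" ]; then
        printf '%s %s\n' -lda "$model_dir/feature_transform"
    fi
}

# The bw call from this thread with the transform added. The command
# substitution is left unquoted on purpose so it expands to two arguments,
# which means it will not work for model paths that contain spaces.
#
# ./bw -hmmdir cmusphinx-en-us-5.2 -moddeffn cmusphinx-en-us-5.2/mdef \
#      -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -varnorm no \
#      -cmninit 40,3,-1 $(lda_args cmusphinx-en-us-5.2) \
#      -dictfn mydict.dict -ctlfn ../etc/voiceVim_train.fileids \
#      -lsnfn ../etc/voiceVim_train.transcription -accumdir .
```

Writing `-lda cmusphinx-en-us-5.2/feature_transform` directly into the command, as in the reply above, does the same thing. The helper only saves you from passing -lda to a model that has no transform file.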