Hi.
I used the wave2feat executable from SphinxTrain to generate my feature files. That worked, but when I trained my context-independent models using 3 states per phone, many of my utterances were ignored at some iterations of Baum-Welch, with the error "final state not reached".
I'm not sure my feature files were generated correctly, because I was using a set of 1000 audio files in .wav format with a sampling rate of 48 kHz. I understand the default values of the wave2feat parameters assume 16 kHz, but I didn't know exactly what to change.
So I used the default values except for the FFT size, which I set to 2048. Is that enough, or is there anything else I could do to improve this?
As for the utterances being ignored at some iterations of Baum-Welch: is that common, or should I do something about it?
Could someone help me?
Thanks in advance...
Resample the files to 16000 Hz with sox.
OK. And how should I do this? Is there any way to make sox convert a list of files instead of just one?
And what should I do if I want to convert my files to raw format?
There is no need to use raw. Use wav and just set the variables in the config file properly. About sox:
mkdir wav_new
for f in *.wav; do sox "$f" -r 16000 "wav_new/$f"; done
will do the trick.
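If the file names may contain spaces, or you want to reuse the loop for other directories, a slightly more defensive variant could look like the sketch below. The function name resample_dir and the SOX override variable are made up for illustration; the sox call itself is the same one as above.

```shell
#!/bin/sh
# Sketch: resample every .wav in a source directory to a target rate.
# resample_dir and the SOX variable are hypothetical names, not part of sox.
resample_dir() {
    src=$1
    dst=$2
    rate=${3:-16000}                      # default to 16 kHz
    mkdir -p "$dst" || return 1
    for f in "$src"/*.wav; do
        [ -e "$f" ] || continue           # skip if the glob matched nothing
        out="$dst/$(basename "$f")"
        # -r before the output file sets the output sample rate
        "${SOX:-sox}" "$f" -r "$rate" "$out" || echo "failed: $f" >&2
    done
}

# Usage: resample_dir . wav_new 16000
```

Quoting "$f" keeps names with spaces intact, and files that fail to convert are reported instead of silently skipped.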
thanks..
I did this, but I'm still getting the errors in Baum-Welch.
Everything else seems to be right.
What could it be?
I hope you didn't forget to rebuild the features from the new audio files. It could be anything, actually. If you like, pack a few files and the etc folder into an archive and upload it somewhere.
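Before rebuilding the features, it may be worth confirming that every file in the new directory really is at 16 kHz. A minimal sketch, assuming the soxi tool that ships with sox is installed; check_rate and the SOXI override variable are hypothetical names used only for this example:

```shell
#!/bin/sh
# Sketch: report any .wav whose sample rate differs from the expected one.
# check_rate and the SOXI variable are hypothetical names for illustration.
check_rate() {
    dir=$1
    want=${2:-16000}
    bad=0
    for f in "$dir"/*.wav; do
        [ -e "$f" ] || continue           # skip if the glob matched nothing
        # soxi -r prints only the sample rate of the file
        got=$("${SOXI:-soxi}" -r "$f") || { echo "unreadable: $f" >&2; bad=1; continue; }
        if [ "$got" != "$want" ]; then
            echo "$f: $got Hz (expected $want)"
            bad=1
        fi
    done
    return "$bad"                         # 0 only if every file matched
}

# Usage: check_rate wav_new 16000 && echo "all files are 16 kHz"
```

If this reports nothing, the audio itself is at the expected rate, and the next thing to check is that the feature files were regenerated from that directory.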