I generated a new dictionary using the Language Modelling Tool and a text file of the eleven digits used in the TIDIGITS speech database. I have successfully run sphinx2_test.bat on TIDIGITS files that were resampled from 20 KSPS to 16 KSPS (using Matlab), and I get only a few percent errors (two or three words wrong out of a few dozen multi-word sentences). I then resampled the original speech from 20 KSPS to 8 KSPS and listened to it at 8 KSPS to verify that it was resampled correctly. I changed the sphinx2_test.bat file to run at 8 KSPS (set samp to 8000) and tested the new versions of the files. I get no correct sentences and very few correct words. My question is: what else needs to be changed to get the 8 KSPS files to work?
Anonymous
-
2003-01-23
If you're still using the model/hmm/6k/ acoustic model, that's a 16 KSPS model, and simply setting -samp 8000 won't suffice. You need to use an 8 KSPS acoustic model as well.
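For reference, the 20 KSPS to 8 KSPS conversion described in the question is a rational resampling by 2/5. The sketch below shows one common way to do that (zero-stuffing, windowed-sinc low-pass, decimation) in plain Python. It is not the poster's Matlab procedure, and the function name `resample_rational` and its `taps_per_phase` parameter are made up for illustration. It only produces correctly band-limited 8 KSPS samples; it does not remove the need for an 8 KSPS acoustic model.

```python
import math

def resample_rational(x, up, down, taps_per_phase=16):
    """Resample the sequence x by the ratio up/down (hypothetical helper)."""
    # Low-pass cutoff in cycles per sample at the upsampled rate: the lower
    # of the input and output Nyquist limits.
    cutoff = 0.5 / max(up, down)
    half = taps_per_phase * max(up, down)
    n_taps = 2 * half + 1

    # Hamming-windowed sinc filter, scaled by `up` to compensate for the
    # amplitude lost when zeros are inserted between input samples.
    h = []
    for n in range(n_taps):
        m = n - half
        if m == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (n_taps - 1))
        h.append(up * sinc * window)

    out_len = (len(x) * up + down - 1) // down  # ceil(len(x) * up / down)
    y = []
    for k in range(out_len):
        centre = k * down  # position of output sample k in the upsampled stream
        # Only upsampled positions i * up hold real input samples, so the
        # inserted zeros are skipped entirely.
        i_min = max(0, -(-(centre - half) // up))       # ceil((centre - half) / up)
        i_max = min(len(x) - 1, (centre + half) // up)  # floor((centre + half) / up)
        acc = 0.0
        for i in range(i_min, i_max + 1):
            acc += x[i] * h[centre - i * up + half]
        y.append(acc)
    return y

# Example: 0.1 s of a 1 kHz tone sampled at 20 KSPS, converted to 8 KSPS.
tone = [math.sin(2 * math.pi * 1000 * n / 20000) for n in range(2000)]
resampled = resample_rational(tone, up=2, down=5)
print(len(resampled))  # 800
```

The filter is symmetric about its centre tap, so the output is time-aligned with the input apart from edge effects in the first and last few samples.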