Hello, I've been doing my initial experiments with sphinx3 using the hub4 model that comes with the distribution. My input files are 8 kHz telephone captures, which I had been up-sampling to 16 kHz with sox.
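For reference, the up-sampling step was roughly along these lines. This is only a sketch: it assumes 16-bit signed little-endian mono raw files and newer sox option syntax, and the filenames are placeholders.

# up-sample an 8 kHz raw capture to 16 kHz (format flags are assumptions)
sox -t raw -r 8000 -e signed-integer -b 16 -c 1 capture.raw \
    -t raw -e signed-integer -b 16 -c 1 capture-16k.raw rate 16000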
I expected that when I downloaded and started using Communicator, the accuracy would go up since it should better approximate my acoustic conditions. I had been getting about 77% accuracy with hub4, but I've been getting 66% or so using Communicator.
I'm just testing with a single speaker right now, with about a 10k word vocabulary. I've been using the same language model and dictionary for all my tests.
First, I run sphinx_fe like this:
sphinx_fe -samprate 8000 -c masterCtl -di . -do . -ei raw -eo mfc -raw yes -argfile /usr/src/sphinx/sphinx3/model/hmm/Communicator_40.cd_cont_4000/feat.params
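In case it matters, masterCtl is (as I understand the sphinx_fe documentation) just a list of utterance base names relative to -di, one per line with no extension, since the -ei/-eo extensions are appended automatically. A made-up two-entry example:

utt0001
utt0002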
Then I run sphinx3_decode with these arguments (the fully assembled command is sketched after the list):
-featparams /usr/src/sphinx/sphinx3/model/hmm/Communicator_40.cd_cont_4000/feat.params
-samprate 8000
-hmm /usr/src/sphinx/sphinx3/model/hmm/Communicator_40.cd_cont_4000
-dict dict/getz-6-25.dict
-fdict dict/filler.dict
-lm LM/getz-6-26.corpus.arpa.DMP
-hypseg 8kbase.scorefile
-hyp 8kbase.txt
-beam 1.0e-100
-pbeam 1.0e-80
-wbeam 1.0e-70
-subvqbeam 1.0e-100
(I know the beams are pretty wide and this takes a while, but I wanted to make sure that wasn't the problem)
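For completeness, here is the whole invocation roughly as I run it, assembled into one command. The -ctl/-cepdir/-cepext values are my reconstruction of how I point the decoder at the .mfc files produced by sphinx_fe above, so treat them as approximate rather than a verbatim copy of my script:

sphinx3_decode \
  -hmm /usr/src/sphinx/sphinx3/model/hmm/Communicator_40.cd_cont_4000 \
  -featparams /usr/src/sphinx/sphinx3/model/hmm/Communicator_40.cd_cont_4000/feat.params \
  -samprate 8000 \
  -ctl masterCtl \
  -cepdir . \
  -cepext mfc \
  -dict dict/getz-6-25.dict \
  -fdict dict/filler.dict \
  -lm LM/getz-6-26.corpus.arpa.DMP \
  -hyp 8kbase.txt \
  -hypseg 8kbase.scorefile \
  -beam 1.0e-100 \
  -pbeam 1.0e-80 \
  -wbeam 1.0e-70 \
  -subvqbeam 1.0e-100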
Are there any errors in my process, or have I made an incorrect assumption somewhere?
Thanks in advance!