I am just starting to investigate Sphinx3.5 and am trying to run some sample data through decoder.exe to make sure my setup is correct before trying my own audio input.
I have downloaded the open source sample acoustic and language models. I am inputting the file an391-mjwl-6.mfc from the AN4 database. According to the transcript, it contains just the words ENTER, TWO, NINE, EIGHT and ONE. When I run the decoder on it, it gets 0% recognition. If I use the an4.ug.lm.DMP language model, which I got from another link on the Sphinx web site, it gets only 2 out of 5 words, or 40% recognition. Below are the contents of the default.arg file I am feeding to the decoder. InputControl.txt contains just the single .MFC file. Am I specifying something incorrectly? Am I missing a parameter? I was expecting a much higher recognition rate using the provided sample data.
-mdef hub4opensrc.6000.mdef
-mean means
-var variances
-mixw mixture_weights
-tmat transition_matrices
-subvq 8gau.6000sen.quant
-dict cmudict.06d
-fdict fillerdict.txt
-lm language_model.arpaformat.DMP
-hypseg hypseg.txt
-ctl inputcontrol.txt
-logfn log.txt
Thanks
Hello,
You can find some instructions and example files in the sphinx3 source code directory under model->lm->an4. The right parameters are very important for good recognition, so I would suggest looking at the args.an4* files for some good parameter values.
I hope this helps
shio
PS:
You can look for the files in CVS, too:
http://cvs.sourceforge.net/viewcvs.py/cmusphinx/sphinx3/model/lm/an4/
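
To make the comparison with the args.an4* files less error-prone, something like the following Python sketch could list which flags a reference file sets that default.arg does not, and which flags are set in both but with different values. It assumes both files use the same one-flag-per-line "-flag value" layout as the default.arg posted above, which may not hold for every args file. The function names are made up for this example and are not part of Sphinx.

```python
import sys


def parse_args_file(text):
    """Parse '-flag value' lines into a dict.

    Assumption: one flag per line, as in the default.arg shown above.
    Blank lines and lines starting with '#' are skipped.
    """
    args = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        flag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        args[flag] = value
    return args


def missing_flags(mine, reference):
    """Flags present in the reference file but absent from mine."""
    return sorted(set(reference) - set(mine))


def differing_flags(mine, reference):
    """Flags set in both files but with different values."""
    return {
        flag: (mine[flag], reference[flag])
        for flag in mine
        if flag in reference and mine[flag] != reference[flag]
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_args.py default.arg <reference args file>
    with open(sys.argv[1]) as f:
        mine = parse_args_file(f.read())
    with open(sys.argv[2]) as f:
        reference = parse_args_file(f.read())
    print("Missing from your file:", missing_flags(mine, reference))
    for flag, (a, b) in differing_flags(mine, reference).items():
        print(f"{flag}: yours={a!r} reference={b!r}")
```

Paths to models and dictionaries will naturally differ between the two files, so the interesting output is mostly the list of missing flags.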