Dear All
Thanks for the earlier help. I am edging forward with using an FSG for my word recogniser, using the python test script. I have a new error: it may be that I am missing something in the config/command-line arguments, or it may be that my acoustic model is no good.
Command Line:
Instead of giving an -lm argument, I am setting -mode to "fsg" and setting -fsg to the path to the fsg file. Apart from -hmm, -dict and -fdict, I am giving no other arguments.
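For reference, the argument set described above can be sketched as below. This is only an illustration: the helper name build_fsg_args is made up, and it does not call Sphinx at all. It just assembles the flags named in this post and checks that each path exists first, which rules out a mistyped filename before looking at anything else.

```python
import os


def build_fsg_args(hmm_dir, dict_path, fdict_path, fsg_path):
    """Return a flat argument list for FSG mode (hypothetical helper).

    Only the flags mentioned in this post are used: -mode, -hmm, -dict,
    -fdict and -fsg. Every path is checked before the list is built.
    """
    paths = {
        "-hmm": hmm_dir,
        "-dict": dict_path,
        "-fdict": fdict_path,
        "-fsg": fsg_path,
    }
    # Collect the flags whose path does not exist on disk.
    missing = [flag for flag, path in paths.items() if not os.path.exists(path)]
    if missing:
        raise FileNotFoundError("missing path for: " + ", ".join(missing))

    args = ["-mode", "fsg"]
    for flag, path in paths.items():
        args += [flag, path]
    return args
```

If a path is wrong, the helper raises FileNotFoundError naming the offending flag, so a bad -fsg filename is caught before the decoder is ever started.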
Error Message:
Comparing the output with output from earlier attempts with an LM, Sphinx seems to read in the AM and the dictionaries OK. I notice I get the same error output if I give a bogus/non-existent filename for the fsg file, so the error can't be to do with the fsg itself. Perhaps I am missing some command-line arguments, although this looks like something more to do with the acoustic model.
Here it is reading in the dictionaries OK, then going pear-shaped:
Any help much appreciated. Once I get this working, I'll be sure to write it up and put a howto or similar on the web.
Thanks and best wishes
Ivan
Dear All
Further on! It's not the AM (yet).
The last line in the error message is:
MAX_HMM_NSTATE is hardcoded in hmm.h to be 5:
However, the AM I've built is a word model, and the SphinxTrain docs [1] recommend using more hmm states for a word model (I'm using seven). Is there a command-line argument I can set for Sphinx to use more states per hmm, or must I edit and recompile the source?
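To check what limit a given source tree was built with, the value can be read straight from the header. A rough sketch, assuming the constant appears in hmm.h as a plain #define with an integer value; the function name find_define is made up for illustration, and the caller is expected to pass in the text of the header.

```python
import re


def find_define(header_text, name="MAX_HMM_NSTATE"):
    """Return the integer value of '#define <name> <int>' or None if absent.

    Assumes the constant is a simple integer literal on one line.
    """
    pattern = re.compile(
        r"^\s*#\s*define\s+" + re.escape(name) + r"\s+(\d+)",
        re.MULTILINE,
    )
    match = pattern.search(header_text)
    return int(match.group(1)) if match else None
```

Reading hmm.h into a string and passing it to find_define gives the compiled-in limit, which can then be compared by hand with the number of states used in training.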
Thanks
Ivan
[1] from http://www.speech.cs.cmu.edu/sphinxman/FAQ.html
Q: How many states-per-hmm should I specify for my training?
A: If you have "difficult" speech (noisy/spontaneous/damaged), use 3-state hmms with a noskip topology. For clean speech you may choose to use any odd number of states, depending on the amount of data you have and the type of acoustic units you are training. If you are training word models, for example, you might be better off using 5 states or higher. 3-5 states are good for shorter acoustic units like phones. You cannot currently train 1 state hmms with the Sphinx.
Dear All
I've recompiled sphinx3 with a higher MAX_HMM_NSTATE in hmm.h and everything "just works"! Would still like to know if there's a command-line argument I can use instead.
Sorry for thinking aloud in public!
Best
Ivan