I am new to the Sphinx4 project and am looking for a quick and easy way to add new vocabulary to the HelloDigits demo.
I am prototyping a new project and would just like to get a feel for the functionality.
I have been reading about the Sphinx4 configuration file, which is where I think I have to modify the corresponding acoustic models to add different vocabulary.
However, the whole process seems somewhat convoluted and I am not sure if it should be this difficult.
Can anyone provide me with some insight into the methodology?
Also, I found the following thread from a while back, but it contained no solution to the actual question:
<--------------------------------------->
hi,
i have just downloaded and installed the src version. i have built the demos and all is working fine.
however, i was just wondering if someone could tell me how i can recognize certain words, my own words for my application.
like for example for HelloDigits it recognizes numbers 0-9.
my Q is where can i change this dictionary of 0-9 to recognize my own words, i.e if it's even possible.
is there a demo that would recognize any words really, like my words are 'close', 'save', 'submit' etc...
it won't look very well if i have to say a digit to execute my task, like "one" closes application :)
thanks in advance
<--------------------------------------->
You should use a more generic acoustic model instead of TIDIGITS. You should also modify the dictionary and the JSGF grammar. I suggest starting with the HelloWorld example instead, since it already has a more generic dictionary and acoustic model. Change hello.gram to describe your patterns. Grammar description:
<component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
    <property name="dictionary" value="dictionary"/>
    <property name="grammarLocation"
              value="resource:/demo.sphinx.helloworld.HelloWorld!/demo/sphinx/helloworld/"/>
    <property name="grammarName" value="hello"/>
    <property name="logMath" value="logMath"/>
</component>
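For the words from the question ('close', 'save', 'submit'), a replacement hello.gram could be generated with something like the sketch below. It is only an illustration: the class GrammarSketch, the method buildGrammar, and the rule name command are made up for this example and are not part of Sphinx4. The grammar name hello is chosen to match the grammarName property in the config above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a Sphinx4 class: builds the text of a small
// JSGF grammar in which each word is one alternative of a single public rule.
class GrammarSketch {

    static String buildGrammar(String grammarName, String ruleName, List<String> words) {
        return "#JSGF V1.0;\n\n"                      // JSGF header line
             + "grammar " + grammarName + ";\n\n"     // should match grammarName in the config
             + "public <" + ruleName + "> = "
             + String.join(" | ", words) + ";\n";     // alternatives separated by '|'
    }

    public static void main(String[] args) {
        // Words taken from the question; save the printed text as hello.gram.
        System.out.print(buildGrammar("hello", "command",
                Arrays.asList("close", "save", "submit")));
    }
}
```

Running main prints a grammar with the JSGF header, the line `grammar hello;`, and one public rule listing the three words as alternatives. Saved as hello.gram in the location given by grammarLocation, it would take the place of the HelloWorld grammar.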
You can also try trimming the dictionary to make it smaller. The dictionary is described in the following part of the config:
<component name="dictionary"
           type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
    <property name="dictionaryPath"
              value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/cmudict.0.6d"/>
    <property name="fillerPath"
              value="resource:/edu.cmu.sphinx.model.acoustic.WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model!/edu/cmu/sphinx/model/acoustic/WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz/dict/fillerdict"/>
    <property name="addSilEndingPronunciation" value="false"/>
    <property name="allowMissingWords" value="false"/>
    <property name="unitManager" value="unitManager"/>
</component>
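One way to make the dictionary smaller is to keep only the entries for the words your grammar uses. The sketch below is a rough illustration and not Sphinx4 code: the class DictionaryTrimSketch and the method trim are made up. It assumes each dictionary line has the form "WORD PHONE PHONE ...", with alternate pronunciations marked like "WORD(2)". The sample entries in main are hand-written stand-ins, not copied from cmudict.0.6d.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Sphinx4 class: keeps only the dictionary lines
// whose head word (ignoring an alternate marker such as "(2)") is in the vocabulary.
class DictionaryTrimSketch {

    static List<String> trim(List<String> dictLines, Set<String> vocabulary) {
        // Compare case-insensitively, since the dictionary's casing is an assumption here.
        Set<String> wanted = new HashSet<>();
        for (String w : vocabulary) {
            wanted.add(w.toUpperCase(Locale.ROOT));
        }
        List<String> kept = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String head = trimmed.split("\\s+", 2)[0];   // first token is the word
            int paren = head.indexOf('(');
            String word = paren > 0 ? head.substring(0, paren) : head; // drop "(2)" etc.
            if (wanted.contains(word.toUpperCase(Locale.ROOT))) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative entries only; real pronunciations should come from the actual dictionary.
        List<String> sample = Arrays.asList(
                "CLOSE  K L OW S",
                "CLOSE(2)  K L OW Z",
                "SAVE  S EY V",
                "SUBMIT  S AH B M IH T",
                "ZERO  Z IY R OW");
        Set<String> vocab = new HashSet<>(Arrays.asList("close", "save", "submit"));
        for (String line : trim(sample, vocab)) {
            System.out.println(line);
        }
    }
}
```

The trimmed lines could be written to a new file, and the dictionaryPath property pointed at that file instead of the full cmudict.0.6d. With allowMissingWords set to false as above, every word in the grammar presumably still needs an entry in the trimmed dictionary.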
any ideas?