Hey,
I am trying to understand the implementation of Sphinx. I am having a hard time figuring out how grammars are implemented.
I understand that a grammar graph is created from the JSGF (or grXML) grammars. I was hoping to find a function which, given a string, provides its probability (not a bool, since we can also provide weights in the grammar; correct me if I am wrong) of matching the grammar. I am expecting this because a similar thing is done "in general" for language models, and to my best understanding grammars can be interpreted as an "LM which is based on a grammar" (a more context-driven LM). Can you please point me to such an implementation (if there is one) in the code base?
In the file JSGFGrammar.java, the implementation notes (in the initial comments) say:
All internal probabilities are maintained in LogMath log base.
I am not able to understand which internal probabilities and which log base are being referred to here.
Please let me know the correct implementation details, if my above assumptions are wrong.
Much Thanks,
Ram
(Newbie in Domain)
I was hoping to find a function which, given a string, provides its probability (not a bool, since we can also provide weights in the grammar; correct me if I am wrong) of matching the grammar.
There is no such thing in sphinx4.
I am not able to understand which internal probabilities and which log base are being referred to here.
https://en.wikipedia.org/wiki/Log_probability
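In other words, the grammar and language-model probabilities are stored as logarithms in a fixed base, so that multiplying probabilities along a path becomes adding their log values and long products do not underflow. A minimal, generic sketch of that idea (an illustration only, not the actual sphinx4 LogMath code; the base value is an assumption):

// Generic illustration of log-domain probabilities (not the actual sphinx4 LogMath code).
// A probability p is stored as log_BASE(p); multiplying probabilities then becomes
// adding their log values, which avoids floating-point underflow for long products.
public class LogBaseDemo {
    static final double BASE = 1.0001;   // a "close to 1" log base (assumed value)

    // p -> log_BASE(p)
    static double linearToLog(double p) {
        return Math.log(p) / Math.log(BASE);
    }

    // log_BASE(p) -> p
    static double logToLinear(double logP) {
        return Math.pow(BASE, logP);
    }

    public static void main(String[] args) {
        double w1 = 0.8, w2 = 0.5;                              // two branch probabilities/weights
        double logProduct = linearToLog(w1) + linearToLog(w2);  // multiplication becomes addition
        System.out.println(logToLinear(logProduct));            // ~0.4, i.e. 0.8 * 0.5
    }
}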
Ok, so this is my current understanding.
After getting the javax.speech.recognition.RuleGrammar instance via the edu.cmu.sphinx.jsapi.JSGFGrammar class, we can run public RuleParse parse(String text, String ruleName), which will give us a javax.speech.recognition.RuleParse object (a small usage sketch follows the parse tree below).
The RuleParse object will have all the details about matching the string with the grammar, in the form:
RuleParse(<command> =                      // Match <command>
  RuleSequence(                            //  by a sequence of 3 entities
    RuleParse(<action> =                   //  First match <action>
      RuleAlternatives(                    //   One of a set of alternatives
        RuleTag(                           //    matching the tagged
          RuleToken("close"), "CL")))      //    token "close"
    RuleParse(<object> =                   //  Now match <object>
      RuleSequence(                        //   by a sequence of 2 entities
        RuleSequence(                      //    RuleCount becomes RuleSequence
          RuleParse(<this_that_etc> =      //    Match <this_that_etc>
            RuleAlternatives(              //     One of a set of alternatives
              RuleToken("that"))))         //     is the token "that"
        RuleAlternatives(                  //    Match "window | door"
          RuleToken("door"))))             //     as the token "door"
    RuleSequence(                          //  RuleCount becomes RuleSequence
      RuleParse(<polite> =                 //   Now match <polite>
        RuleAlternatives(                  //    by 1 of 2 alternatives
          RuleToken("please"))))           //    The token "please"
  )
)
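To make this concrete, the call that produces such a RuleParse could look roughly like the sketch below. It assumes the JSAPI javax.speech.recognition interfaces and a grammar that defines a <command> rule; how the RuleGrammar instance is obtained, and the exact rule name, depend on your setup:

import javax.speech.recognition.GrammarException;
import javax.speech.recognition.RuleGrammar;
import javax.speech.recognition.RuleParse;

// Sketch: parse a text string against a loaded rule grammar (JSAPI interfaces).
// The RuleGrammar is assumed to come from edu.cmu.sphinx.jsapi.JSGFGrammar as described above.
public class ParseSketch {
    static void checkAgainstGrammar(RuleGrammar ruleGrammar, String text) {
        try {
            // Parse the text against the grammar's <command> rule
            // (the rule name "command" is an assumption; adjust it to your grammar)
            RuleParse parse = ruleGrammar.parse(text, "command");
            if (parse == null) {
                System.out.println("Text does not match the grammar");
            } else {
                System.out.println("Matched: " + parse);  // the RuleParse carries the nested Rule* structure shown above
            }
        } catch (GrammarException e) {
            e.printStackTrace();
        }
    }
}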
So my question is: what will be the behaviour when weights are used with JSGF? This result feels like a bool (whether the string matched the grammar or not). So how are the weights provided in JSGF used by the final decoder? Please let me know, it will be a huge help.
Result of the parsing is a search graph (a finite state transducer). The edges can have weights; for example, RuleAlternatives has a weight property.
The speech recognizer does not match text strings, it matches the audio signal. The result is a float score, and the graph weights are accounted for in that score. To understand how matching is performed you can read Rabiner's HMM tutorial.
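As a rough illustration (not the actual sphinx4 decoder code): a JSGF weight such as in <action> = /0.8/ open | /0.2/ close; ends up as a log-domain weight on the corresponding edge of the search graph, and during search it is simply added to the acoustic log-score of any path that goes through that edge:

// Rough illustration of how a grammar weight can enter the path score during search
// (an assumption-level sketch, not the actual sphinx4 decoder code).
public class WeightedPathDemo {
    public static void main(String[] args) {
        // JSGF-style alternative weights, e.g.  <action> = /0.8/ open | /0.2/ close;
        double weightOpen = 0.8, weightClose = 0.2;

        // Hypothetical acoustic log-scores produced by the acoustic model for each word
        double acousticOpen = -120.0, acousticClose = -115.0;

        // Path score = acoustic log-score + grammar edge weight in the log domain
        double scoreOpen  = acousticOpen  + Math.log(weightOpen);
        double scoreClose = acousticClose + Math.log(weightClose);

        // The decoder keeps the higher-scoring path
        System.out.println(scoreOpen > scoreClose ? "open" : "close");
    }
}

So the weights bias the search toward the heavier alternatives, but the final result still depends mostly on how well the audio matches each path.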
Hey Nickolay,
Apologies for my above very uninformed questions.
Can you please let me know how weights in JSGF are used in the speech recognition pipeline of Sphinx?
Thanks
Last edit: Kaustav Datta 2018-04-03
Thanks for the reply.
Result of the parsing is a search graph (a finite state transducer). The edges can have weights; for example, RuleAlternatives has a weight property.
I think this will solve my issue. Much Thanks
Last edit: Kaustav Datta 2018-04-03