I am new to Sphinx4, having used it for only the past 6 days. I built my project with Gradle and was able to work through quite a few of the tutorials posted online. I created my own grammar file; it is very simple and appears to work correctly: whenever I talk into my microphone, it always picks up something that is in my .gram file. What I am wondering is whether Sphinx is fast enough for real-time applications. Our requirement is that the user says something like "Hey (Machine Name) Turn Laser On Please" and the machine responds within a second. From my experience with the tutorials, there seems to be a lot of lag between capturing a command and interpreting it. The calls to recognizer.getResult() and then speechResult.getHypothesis() seem to take seconds, even for 1 or 2 word phrases. Is it possible to tell Sphinx to use my grammar file as the dictionary? I am not planning on tracking many of the user's phrases other than the important ones (50-some-odd words). My desktop application, which controls a robot-like device, would work similarly to Windows Cortana: you say "Hey Cortana" then ..., and the corresponding action occurs with the servo motors. My current code:
After each message is parsed, however, it appears to pause for 3-4 seconds before it can start picking up new words from my microphone. Is there a way to set something like a "microphone timeout", which would stop listening and attempt to parse the words once some number of milliseconds has passed without any speech? Basically, I would like some insight into whether Sphinx4 is capable of handling real-time audio: parsing and responding to short phrases, and then immediately going back to parsing audio.
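To make the "microphone timeout" idea concrete, here is a rough sketch of the behavior I have in mind, written in plain Java. It is not Sphinx4 code and uses no Sphinx4 API. The class name SilenceTimeout, the method endOfUtterance, and the RMS threshold are all made up for illustration, and it assumes 16-bit mono PCM samples that have already been captured into an array.

```java
// Hypothetical sketch only: an energy-based "silence timeout".
// Not part of Sphinx4; all names and thresholds here are assumptions.
final class SilenceTimeout {

    /**
     * Scans 16-bit mono PCM in fixed-size frames and returns the sample index
     * at which the utterance can be considered finished, i.e. the end of the
     * first run of at least timeoutMs of quiet frames that follows some speech.
     * Returns -1 if no such point is found in the buffer.
     */
    static int endOfUtterance(short[] pcm, int sampleRate, int frameMs,
                              double rmsThreshold, int timeoutMs) {
        int frameLen = sampleRate * frameMs / 1000;
        if (frameLen <= 0) {
            throw new IllegalArgumentException("frame length must be positive");
        }
        // Number of consecutive quiet frames needed (rounded up).
        int framesNeeded = (timeoutMs + frameMs - 1) / frameMs;

        boolean heardSpeech = false;
        int silentFrames = 0;

        for (int start = 0; start + frameLen <= pcm.length; start += frameLen) {
            // Root-mean-square energy of this frame.
            double sumSq = 0.0;
            for (int i = start; i < start + frameLen; i++) {
                sumSq += (double) pcm[i] * pcm[i];
            }
            double rms = Math.sqrt(sumSq / frameLen);

            if (rms >= rmsThreshold) {
                // Loud frame: treat as speech and restart the silence counter.
                heardSpeech = true;
                silentFrames = 0;
            } else if (heardSpeech) {
                // Quiet frame after speech: count toward the timeout.
                silentFrames++;
                if (silentFrames >= framesNeeded) {
                    return start + frameLen;
                }
            }
        }
        return -1;
    }
}
```

In other words, as soon as something like endOfUtterance reports a position, I would want the audio up to that point handed to the recognizer right away, instead of waiting several more seconds.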
Last edit: Sean Connor Phillips 2017-12-03