I am new to Sphinx4, having only used it for the past 6 days. I built my project with Gradle and was able to work through quite a few of the tutorials posted online. I created my own grammar file. It is very simple and appears to work correctly: whenever I talk into my microphone, the recognizer always picks up something that is in my .gram file.

What I am wondering is whether Sphinx is fast enough for real-time applications. Our requirement is for the user to say something like "Hey (Machine Name) Turn Laser On Please" and for the machine to respond within a second. From my experience with the tutorials, there seems to be a lot of lag between receiving a command and interpreting it. The calls to recognizer.getResult() and then speechResult.getHypothesis() seem to take seconds for phrases of just 1 or 2 words. Is it possible to tell Sphinx to use my grammar file as the dictionary? I am not planning on tracking many of the user's phrases other than the important ones (50-some-odd words).

My desktop application, which controls a robot-like device, would work similarly to Windows Cortana: you say "Hey Cortana", then a command, and the corresponding action is carried out by the servo motors. My current code:

try {
                    while (speechRecognizerThreadRunning) {
                        SpeechResult speechResult = recognizer.getResult();

                        //Check if we ignore the speech recognition results
                        if (!ignoreSpeechRecognitionResults) {

                            //Check the result
                            if (speechResult == null)
                                logger.log(Level.INFO, "I can't understand what you said.\n");
                            else {

                                //Get the hypothesis
                                speechRecognitionResult = speechResult.getHypothesis();

                                //You said?
                                System.out.println("You said: [" + speechRecognitionResult + "]\n");

                                //Call the appropriate method 
                                makeDecision(speechRecognitionResult, speechResult.getWords());

                            }
                        } else
                            logger.log(Level.INFO, "Ignoring Speech Recognition Results...");

                    }
                } catch (Exception ex) {
                    logger.log(Level.WARNING, null, ex);
                    speechRecognizerThreadRunning = false;
                }

But after each message is parsed, the recognizer appears to pause for 3-4 seconds before it can start picking up new words from my microphone. Is there a way to set something like a "microphone timeout", which would stop listening and attempt to parse the words once some number of milliseconds has passed without any speech? Basically, I would like some insight into whether Sphinx4 is capable of handling real-time audio: parsing and responding to short phrases, then immediately going back to parsing audio.
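
To make the "microphone timeout" idea concrete, here is a minimal sketch of the behavior I have in mind. It does not use any Sphinx4 API, and I am not claiming Sphinx4 works this way internally. The class name SilenceTimeout, the RMS threshold of 500, the 10 ms frame size and the 200 ms timeout are all hypothetical values chosen for illustration. It scans frames of 16-bit PCM samples and reports the frame at which a given stretch of continuous silence has elapsed after speech was first heard.

```java
import java.util.Arrays;

// Hypothetical helper, NOT part of Sphinx4: decides when an utterance has ended
// by waiting for a fixed stretch of silence after speech has been heard.
class SilenceTimeout {

    // Root-mean-square energy of one frame of 16-bit PCM samples.
    static double rms(short[] frame) {
        if (frame.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (short s : frame) {
            sum += (double) s * s;
        }
        return Math.sqrt(sum / frame.length);
    }

    // Returns the index of the frame that completes timeoutMs of continuous
    // silence after speech was heard, or -1 if the timeout never fires.
    // The threshold is an assumed value and would need tuning per microphone.
    static int endOfUtterance(short[][] frames, int frameMs, double threshold, int timeoutMs) {
        boolean heardSpeech = false;
        int silentMs = 0;
        for (int i = 0; i < frames.length; i++) {
            if (rms(frames[i]) >= threshold) {
                heardSpeech = true;   // speech frame: reset the silence timer
                silentMs = 0;
            } else if (heardSpeech) {
                silentMs += frameMs;  // silence after speech: count toward the timeout
                if (silentMs >= timeoutMs) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        short[] quiet = new short[160];   // 10 ms of silence at 16 kHz
        short[] loud = new short[160];    // 10 ms of synthetic "speech"
        Arrays.fill(loud, (short) 1000);

        // Frames 0-4 are silent, 5-14 contain speech, 15-39 are silent again.
        short[][] frames = new short[40][];
        for (int i = 0; i < frames.length; i++) {
            frames[i] = (i >= 5 && i < 15) ? loud : quiet;
        }

        int end = endOfUtterance(frames, 10, 500.0, 200);
        System.out.println("Timeout fired at frame " + end);
    }
}
```

In this sketch the leading silence is ignored, so the timer only starts once something has actually been said. That is the behavior I would like from the recognizer: hand back a result shortly after I stop talking instead of several seconds later.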

 

Last edit: Sean Connor Phillips 2017-12-03