I have been mucking around with my Raspberry Pi for a while now and have set up Pocketsphinx on it. I have been working with open source and am well versed in Linux/Unix, with basic scripting capability. I am able to work through Python code (if it's not too complex) but am not much of a Python programmer (though I would love to pick up some Python skills over the coming years).
My Raspberry Pi is part of a robot I've built (GoPiGo), and I am trying to work out how to control it using some simple commands. This is where I am really stuck, and I would really appreciate some direction. I've looked at the forums and various examples, but I can't seem to put together a simple program in Python that picks up what I am saying and translates it to text. Eventually I want to be able to run a command from the Python program once a string is identified.
I am able to run Pocketsphinx by itself at the command line and it's able to pick up the words, so no issues with that. The challenge is getting it to work through Python as a streaming application: listening to my mic, picking up the commands and passing them along to the robot (GoPiGo) to do stuff.
The code below is quite amateurish... feel free to bag me for any inefficient coding practices you find in there. I would really appreciate some direction on sorting this out.
from os import environ, path
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *
import pyaudio
import wave
import socket

MODELDIR = "/usr/local/share/pocketsphinx/model"
DATADIR = "/home/perf/Downloads/Python/Dev/PocketSpinx_TTS/data"

config = Decoder.default_config()
config.set_string('-adcdev', 'sysdefault')
config.set_string('-hmm', '/usr/local/share/pocketsphinx/model/en-us/en-us')
config.set_string('-lm', '/home/perf/Downloads/Python/Dev/PocketSpinx_TTS/data/9735.lm')
config.set_string('-dict', '/home/perf/Downloads/Python/Dev/PocketSpinx_TTS/data/9735.dic')
config.set_string('-samprate', '8000')
config.set_string('-inmic', 'yes')
decoder = Decoder(config)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=8000, input=True, frames_per_buffer=1024, input_device_index=0)
stream.start_stream()
in_speech_bf = True
decoder.start_utt()
print "Starting to listen"

while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    if decoder.hyp() != None and decoder.hyp().hypstr == 'FORWARD':
        decoder.end_utt()
        print "Detected Move Forward, restarting search"
        decoder.start_utt()
print "Am not listening any more"
Last edit: VisualizeIT 2016-05-08
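For the last step described in the post (running a command once a string is identified), a lookup table from recognized text to a handler function is one simple option. This is only a minimal sketch: the handler functions and the "STOP" entry are hypothetical placeholders, and on a real GoPiGo the handlers would call into the robot's own library instead of returning strings.

```python
# Placeholder handlers (hypothetical). On the robot these would drive the motors.
def move_forward():
    return "moving forward"


def stop_robot():
    return "stopping"


# Keys are upper-case because the decoder in the post reports 'FORWARD'.
# "STOP" is an assumed extra command, added only for illustration.
COMMANDS = {
    "FORWARD": move_forward,
    "STOP": stop_robot,
}


def dispatch(hypstr):
    """Run the handler for a recognized phrase; return None if it is unknown."""
    if not hypstr:
        return None
    handler = COMMANDS.get(hypstr.strip().upper())
    if handler is None:
        return None
    return handler()
```

In the recognition loop, the text from decoder.hyp().hypstr would be passed to dispatch() in place of the hard-coded comparison against 'FORWARD'.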
You forgot to say what the problem with your code is. You also forgot to format it properly.
You need to use keyword spotting mode to continuously look for commands. The sample code is here:
https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py
Keyword spotting is explained in the tutorial; I recommend you read it too:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
Nickolay, thank you so much for responding. Appreciate the assistance.
OK, I figured out what the issue was with the code I submitted. I've re-submitted it as code.
I've also attached the code as a text file.
The issue is that nothing is ever recognized. The program jumps directly to the last block, i.e. nothing detected, restarting search.
Any pointers would be appreciated.
Last edit: VisualizeIT 2016-05-08
Nickolay,
Thank you so much for responding. Appreciate the assistance.
I've also tried the "keyword" spotting mode examples, and the program seems to fail on the Raspberry Pi with an input overflow error. I've done a bit of digging around the forums; folks have suggested changing the buffer size, which I've done (256, 512, etc.), but I still get the same error.
The example I am using is -
#!/usr/bin/python
import sys, os
import pyaudio
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

modeldir = "/usr/local/share/pocketsphinx/model"
datadir = "../../../test/data"

# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-adcdev', 'sysdefault')
config.set_string('-samprate', '8000')
config.set_string('-inmic', 'yes')
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-keyphrase', 'forward')
config.set_float('-kws_threshold', 1e+20)

# Open file to read the data
#stream = open(os.path.join(datadir, "goforward.raw"), "rb")

# Alternatively you can read from microphone
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=8000, input=True, output=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. On keyword detected perform action and restart search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() != None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt()
The error it fails with is as follows -
Last edit: VisualizeIT 2016-05-08
It's a PyAudio audio recording issue. See the link below for a PyAudio record example.
https://people.csail.mit.edu/hubert/pyaudio/
The error indicates that your mic does not support a sample rate of 8000. You need to set a default plug plugin in ~/.asoundrc (Raspbian),
something like this:
pcm.!default {
    type plug
    slave {
        pcm "hw:0,0"
    }
}
ctl.!default {
    type hw
    card 0
}
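To confirm the sample-rate diagnosis above from Python, one option is to ask PyAudio which rates the input device accepts. The sketch below is a rough illustration, not a tested recipe for the Pi: it assumes input device index 0, mono 16-bit capture as in the posted code, and an arbitrary list of candidate rates. The helper names supported_rates and pyaudio_checker are made up for this example; is_format_supported is the PyAudio call doing the real work, and it raises ValueError for a format the device rejects.

```python
def supported_rates(is_supported, rates=(8000, 16000, 44100, 48000)):
    """Return the rates for which is_supported(rate) succeeds."""
    ok = []
    for rate in rates:
        try:
            if is_supported(rate):
                ok.append(rate)
        except ValueError:
            # PyAudio signals an unsupported format with ValueError.
            pass
    return ok


def pyaudio_checker(device_index=0):
    """Build a rate checker backed by PyAudio (imported only when called)."""
    import pyaudio
    pa = pyaudio.PyAudio()

    def check(rate):
        return pa.is_format_supported(
            rate,
            input_device=device_index,
            input_channels=1,
            input_format=pyaudio.paInt16,
        )

    return check

# On the Pi (assumed usage): print(supported_rates(pyaudio_checker(0)))
```

If 8000 is missing from the result for the raw hardware device, that would be consistent with the advice to route capture through the plug plugin, which can convert sample rates.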
Thanks for that, G10DRAS. I've tried the config you mentioned above, but the error persists.
Here's the output from /proc/asound/card0/stream0
Here's the error from the Python program (the keyword detection example I've modified; code provided above).
Thanks for that G10DRAS. You were spot on. I spent a lot of time digging around and realized that I needed to play around with the buffer size.
stream = p.open(format=pyaudio.paInt16, channels=1, rate=8000, input=True, output=True, frames_per_buffer=8192)
I can now get the program to run, but even for the keyword searches nothing ever shows up. The Python program never manages to find any keywords!
Last edit: VisualizeIT 2016-05-09
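As a rough way to see why the larger buffer in the post above helped, it is worth working out how much audio each buffer size holds at 8000 Hz. The helper name buffer_seconds is made up for this sketch. The reasoning is an assumption rather than something confirmed in the thread: a longer buffer gives the decoding loop more slack before the capture side overflows.

```python
def buffer_seconds(frames_per_buffer, rate):
    """Duration of audio, in seconds, held by one buffer."""
    return float(frames_per_buffer) / rate


# Buffer sizes mentioned in the thread, at the 8000 Hz rate used in the code.
for frames in (256, 512, 1024, 8192):
    print("%5d frames at 8000 Hz = %.3f s" % (frames, buffer_seconds(frames, 8000)))
```

So 1024 frames cover about 0.128 s of audio, while 8192 frames cover about 1.024 s. Note that the posted loop still calls stream.read(1024), so each read returns a smaller chunk than the buffer holds.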
You need to configure the threshold as described in the tutorial.
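One way to act on the threshold advice is to try a series of -kws_threshold values and watch which one balances missed detections against false alarms. The sketch below is a hedged illustration only. The range from 1e-1 down toward 1e-50 is my recollection of what the tutorial suggests and should be checked there, and the helper names candidate_thresholds and make_kws_config are made up for this example. The config calls and model paths are the same ones used in the code posted earlier in the thread.

```python
def candidate_thresholds(start_exp=-1, stop_exp=-50, step=-5):
    """Return thresholds 1e-1, 1e-6, 1e-11, ... no smaller than 10**stop_exp."""
    return [10.0 ** e for e in range(start_exp, stop_exp - 1, step)]


def make_kws_config(keyphrase, threshold,
                    modeldir="/usr/local/share/pocketsphinx/model"):
    """Build a keyword-spotting config (pocketsphinx imported only when called)."""
    import os
    from pocketsphinx.pocketsphinx import Decoder

    config = Decoder.default_config()
    config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
    config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
    config.set_string('-keyphrase', keyphrase)
    config.set_float('-kws_threshold', threshold)
    return config

# Assumed usage on the Pi: build one decoder per candidate value and test each
# against the same spoken commands, e.g.
#   for t in candidate_thresholds():
#       config = make_kws_config('forward', t)
```

The posted example uses 1e+20, which is far outside this range, so that value is a reasonable first thing to change.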
Thanks Nickolay. I'll take a look and come back with additional questions if required.
I appreciate the support.