I could use a little help, though I suspect I'll be told to go to python forums, who are telling me to go to Sphinx forums :-(.
I've read everything I can find but am still at a bit of a loss as to how to set up pocketsphinx to do realtime ASR within a python program, analogous to using pocketsphinx_continuous on the command line.
Every tutorial I've found that uses an import pocketsphinx statement followed by some example code is for decoding a pre-existing wav file. I want to recognize real time speech input from a microphone.
I've spent many hours poring over the gst-pocketsphinx-gtk page, and I moderately understand how everything works, although vader still eludes me a bit. I get what it does, but I get confused when I try to work out the button method: vader seems to be active when paused? Anyway, the python code included in that tutorial is perfect for my application, except that I want the text output to be returned to the python program from which this code is called, so it can be further processed, as opposed to just being sent to a text-box gui.
I realize this is more likely a python question, but they keep sending me back here. Any help at all would be greatly appreciated.
regards, Richard
Every tutorial I've found that uses an import pocketsphinx statement followed by some example code is for decoding a pre-existing wav file. I want to recognize real time speech input from a microphone.
Yes, the Python bindings don't support continuous processing right now. You need to use an external tool to split long audio into utterances, or you can use the GStreamer bindings, which are continuous.
Anyway, the python code included in that tutorial is perfect for my application except I want the text output to be returned to the python program from which this code is called so it can be further processed as opposed to just being sent to a text-box gui.
GStreamer works in a message-passing callback mode, so you need to understand the "callback" pattern to work with it effectively. Basically, GStreamer sends you messages with the decoded text and you need to process them. This is not a function-call pattern.
You can add your own handler here if you need that.
def final_result(self, hyp, uttid):
    """Insert the final result."""
    # Add your code here, do whatever you want with a hyp
    print(hyp)
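To get the text back into the rest of your program rather than just printing it, one common approach is to have the handler push each hypothesis onto a queue that your own code reads from. The following is only a rough sketch of that idea: FakeRecognizer is a hypothetical stand-in for the GStreamer pipeline (it is not a pocketsphinx or GStreamer API), and App, results and recognize_all are names made up for illustration. In the tutorial code, only the body of final_result would change.

```python
import queue
import threading


class FakeRecognizer:
    """Hypothetical stand-in for the GStreamer pipeline (not a real API)."""

    def __init__(self, on_final_result):
        # The handler is stored now and called later, once per utterance.
        self.on_final_result = on_final_result

    def run(self, utterances):
        # A real pipeline would call the handler from its own message loop.
        for uttid, hyp in enumerate(utterances):
            self.on_final_result(hyp, str(uttid))


class App:
    def __init__(self):
        # Thread-safe hand-off point between the handler and the main code.
        self.results = queue.Queue()

    def final_result(self, hyp, uttid):
        """Called by the recognizer; passes the text on instead of printing it."""
        self.results.put((uttid, hyp))


def recognize_all(utterances):
    """Run the fake recognizer in a thread and collect what the handler received."""
    app = App()
    recognizer = FakeRecognizer(app.final_result)
    worker = threading.Thread(target=recognizer.run, args=(utterances,))
    worker.start()
    worker.join()
    collected = []
    while not app.results.empty():
        collected.append(app.results.get())
    return collected
```

In a long-running program you would not wait for the recognizer to finish. Instead the main code would call results.get() in a loop, which blocks until the next utterance arrives, and do its further processing there.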