I want to improve my voice recognition system on a Raspberry Pi 3 running PocketSphinx. It's pretty accurate (limited recognition vocabulary) but performs terribly when there is other noise around.
Is there any way to pre-process the audio before feeding it into PocketSphinx so that only the human voice gets through? I was thinking there might be some way to reduce the data being passed in, e.g. keeping only the frequency range of the human voice, removing white noise, etc. I haven't found anything simple so far. I'm thinking of filtering and processing the voice recordings on an STM32 that has an Arm Cortex-M4 with DSP instructions.
You already asked the same at https://dsp.stackexchange.com/questions/60334/preprocess-sound-for-voice-recognition-in-pocket-sphinx-on-raspberry-pi-3
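For the band-limiting idea in the question, here is a minimal sketch of a single second-order band-pass filter, using the widely used "Audio EQ Cookbook" biquad formulas. Everything specific in it is an assumption for illustration, not something from the thread: the 300-3400 Hz telephone band as "voice range", 16 kHz mono audio, and the function names `bandpass_coeffs`, `biquad_filter`, `rms` and `tone`. It only attenuates hum and hiss outside the band; it will not remove noise that overlaps the voice band, so it may not help much with competing speech or music.

```python
import math

def bandpass_coeffs(low_hz, high_hz, sample_rate):
    """Biquad band-pass (0 dB peak gain) centred between low_hz and high_hz.

    Assumption: the band edges are treated as the bandwidth of one
    second-order section, so the roll-off is gentle (about 6 dB/octave per side).
    """
    f0 = math.sqrt(low_hz * high_hz)       # geometric centre frequency
    q = f0 / (high_hz - low_hz)            # Q from centre frequency / bandwidth
    w0 = 2.0 * math.pi * f0 / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    # Normalise so that the leading denominator coefficient is 1.
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad_filter(samples, b, a):
    """Apply the filter sample by sample (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tone(freq_hz, sample_rate, count):
    """Unit-amplitude sine test tone (hypothetical helper for the demo)."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [math.sin(step * n) for n in range(count)]

if __name__ == "__main__":
    rate = 16000  # assumed sample rate
    b, a = bandpass_coeffs(300.0, 3400.0, rate)
    for freq in (50.0, 1000.0, 7000.0):
        signal = tone(freq, rate, rate)
        gain = rms(biquad_filter(signal, b, a)) / rms(signal)
        print(f"{freq:7.0f} Hz: gain {gain:.2f}")
```

The same coefficients could in principle be used on the Cortex-M4, for example with CMSIS-DSP's `arm_biquad_cascade_df1_f32`, but note that CMSIS expects the feedback coefficients with the opposite sign to the convention used here. Cascading two or more sections would give a steeper roll-off if one stage turns out to be too gentle.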