I have set up the SpeechRecognition library with PocketSphinx. The default model isn't working for my audio, so I need the 8 kHz model. I believe I need to download cmusphinx-en-us-8khz-5.2.tar.gz from https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English/.
I'm not sure where to put this file or where to set a relevant parameter to ensure that the correct model is used. This is the code that I tried using:
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:  # AUDIO_FILE: path to the audio file
    audio = r.record(source)  # read the entire file into memory
r.recognize_sphinx(audio, language="en-us-8khz")
Which produced the error:
Sphinx error; missing PocketSphinx language data directory: "/usr/local/lib/python3.6/dist-packages/speech_recognition/pocketsphinx-data/en-us-8khz"
There is already an "en-us" folder at that location, which I assume is the default 16 kHz model. If I put cmusphinx-en-us-8khz-5.2.tar.gz in that path and extract it, it does not create an "en-us-8khz" folder with files matching those in the "en-us" folder. Is this not the correct tar file to use? Or am I wrong about how to use the file?
Thanks in advance for any help on this.
The files in cmusphinx-en-us-8khz-5.2.tar.gz are the equivalent of those in en-us/acoustic-model. I tried making a folder with that name and moving them in there. The next error was that I'm missing language-model.lm.bin. Where do I find that?
The language model can be the same en-us.lm.bin from the pocketsphinx distribution. Or you can build the language model yourself.
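As a rough sketch of that suggestion, the 8 kHz acoustic model can be combined with the language model and dictionary already present in the default language folder. The helper name build_8khz_language_dir and any paths passed to it are hypothetical. The names acoustic-model and language-model.lm.bin come from the messages earlier in this thread; pronounciation-dictionary.dict is an assumption about what the package expects, so compare it with the existing "en-us" folder before relying on it.

```python
import shutil
from pathlib import Path

# File names assumed for a SpeechRecognition language folder; verify them
# against the default folder that is already installed on your machine.
ACOUSTIC_DIR = "acoustic-model"
LANGUAGE_MODEL = "language-model.lm.bin"
DICTIONARY = "pronounciation-dictionary.dict"  # assumed spelling, check locally


def build_8khz_language_dir(default_dir, extracted_8khz_dir, target_dir):
    """Assemble a language folder from the extracted 8 kHz acoustic model plus
    the default language model and dictionary. Hypothetical helper."""
    default_dir = Path(default_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Acoustic model: the files extracted from cmusphinx-en-us-8khz-5.2.tar.gz.
    acoustic_target = target_dir / ACOUSTIC_DIR
    if not acoustic_target.exists():
        shutil.copytree(str(extracted_8khz_dir), str(acoustic_target))

    # Language model and dictionary: reuse the ones from the default folder.
    for name in (LANGUAGE_MODEL, DICTIONARY):
        shutil.copy2(str(default_dir / name), str(target_dir / name))
    return target_dir
```

Calling it with the existing default folder, the extracted tarball contents, and a new "en-us-8khz" folder under pocketsphinx-data should give recognize_sphinx a directory it can find under that language name, provided the assumed file names match your installation.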
Once we build a model, do we simply point the path to the new model instead of en-us.lm.bin?
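One possible way to point at a custom model without copying files into the package folder is sketched below. It assumes the tuple form of the language argument that the SpeechRecognition documentation describes for recognize_sphinx (acoustic model directory, language model file, dictionary file). The helper names sphinx_model_paths and transcribe, and whatever paths are passed to them, are hypothetical.

```python
from pathlib import Path


def sphinx_model_paths(acoustic_dir, language_model, dictionary):
    """Return (acoustic dir, language model, dictionary) as strings after
    checking that each path exists. Hypothetical helper, not a library API."""
    paths = (Path(acoustic_dir), Path(language_model), Path(dictionary))
    missing = [str(p) for p in paths if not p.exists()]
    if missing:
        raise FileNotFoundError("missing model files: " + ", ".join(missing))
    return tuple(str(p) for p in paths)


def transcribe(audio_file, model_paths):
    # Imported here so the path check above works without the package installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(audio_file) as source:
        audio = r.record(source)  # read the entire file
    # Assumption: language accepts a 3-tuple of paths instead of a folder name.
    return r.recognize_sphinx(audio, language=model_paths)
```

Checking the paths first gives a clearer message than the Sphinx error when one of the three files is in the wrong place.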