CMU Sphinx / Forums / Help: Help with finding documentation on PocketSphinx-Python Methods please?

James B Ross - 2020-04-20

I have PocketSphinx installed on a small SBC running Ubuntu 18.04. I have also installed PocketSphinx-Python and I have been able to run some example python code I found including one that allows me to use microphone input.

However, I would like to write my own Python programs to access pocketsphinx, but I don't know where to find the methods used by pocketsphinx-python or how to use them.

For example, there is an object called "decoder".

In the example program they are calling decoder.end_utt()

I have no idea what this does other than it seems to print out a lot of information to the terminal and decode anything I say into the microphone. I might add that pocketsphinx has been decoding my speech at a very good accuracy so far.

However, I would like to gain more control over how my Python program interacts with pocketsphinx.

My Python IDE has IntelliSense, and when I type in "decoder." it offers up a very long list of options that can be chosen.

Where can I find information on what all these options do?

Is there any documentation for the methods I can use via PocketSphinx-Python?

I'm building a Linguistic AI system and I'm using pocketsphinx as the speech recognizer for the input. So I'd like to get the results of the decoding process into a variable as efficiently as possible. All I need to know is what was said into the microphone.

Thank you.

- Nickolay V. Shmyrev - 2020-04-20
  
  Try https://github.com/alphacep/vosk-api, it is much more accurate.
  

James B Ross - 2020-04-20

Other people have pointed me to vosk as well. The problem is that I haven't been able to find much information about vosk. I don't even know what it is, or how to use it?

Does vosk use pocketsphinx?

If not, what exactly is vosk? And where do I find detailed information on it beyond that github page, especially in terms of tutorials?

I'm already having close to 100% accuracy with Pocket Sphinx. It's been decoding everything I throw at it with near perfection. Perhaps it likes the way I speak?

Also, if I move over to vosk, it appears to be dependent on Kaldi?

What's Kaldi?

I need total independence from the Internet. Are Vosk and Kaldi Internet independent?

Do they have dictionaries that I can modify like PocketSphinx has?

And will my original question even be answered? Will I be able to find detailed information on the methods I can use in Python when using vosk-Kaldi?

I don't want to have to install a whole new system only to discover that I'll have the same questions about how to access it efficiently from Python.

Is CMUSphinx dead?

If so why isn't this CMUSphinx group proclaimed to be obsolete or dead, and we aren't all just being pointed over to vosk-Kaldi?

This is supposed to be a CMUSphinx forum. Is CMUSphinx garbage now?

- Nickolay V. Shmyrev - 2020-04-20
  
  I don't even know what it is, or how to use it?
  
  You are welcome to ask.
  
  Does vosk use pocketsphinx?
  
  No
  
  If not, what exactly is vosk? And where do I find detailed information on it beyond that github page, especially in terms of tutorials?
  
  It is a software library to recognize speech just like pocketsphinx.
  
  I'm already having close to 100% accuracy with Pocket Sphinx. It's been decoding everything I throw at it with near perfection. Perhaps it likes the way I speak?
  
  If it is perfect already, what are you asking about then?
  
  Also, if I move over to vosk, it appears to be dependent on Kaldi? What's Kaldi?
  
  Kaldi is a speech recognition toolkit.
  
  I need total independence from the Internet. Are Vosk and Kaldi Internet independent?
  
  Yes, you do not need internet.
  
  Do they have dictionaries that I can modify like PocketSphinx has?
  
  Yes
  
  And will my original question even be answered? Will I be able to find detailed information on the methods I can use in Python when using vosk-Kaldi?
  
  Sure, it is all in the sources and demos.
  
  I don't want to have to install a whole new system only to discover that I'll have the same questions about how to access it efficiently from Python.
  
  Ok, it is up to you.
  
  Is CMUSphinx dead?
  
  Yes.
  
  If so why isn't this CMUSphinx group proclaimed to be obsolete or dead, and we aren't all just being pointed over to vosk-Kaldi?
  
  You are pointed over to vosk-kaldi.
  

James B Ross - 2020-04-20

"If it is perfect already, what are you asking about then?"

I was asking for information on how to access and control pocketsphinx from Python. I wasn't complaining that it doesn't decode speech well.

And I have a feeling that if I move over to vosk-kaldi I'll end up having the same questions. And I will have basically gained nothing.

But hey, I'll give it a shot.

The thing that is so disgusting is that I have already spent about 2 weeks learning all about pocketsphinx. Now I'll need to start all over from scratch learning about vosk-kaldi.

And I've already done searches for tutorials on vosk and kaldi and I haven't found much. Do they have a tutorial page like CMUSphinx has? At least CMUSphinx provided quite a bit of documentation. https://cmusphinx.github.io/wiki/tutorial/

This feels like going back to square one moving over to vosk-kaldi. I hope it's worth it.

- Nickolay V. Shmyrev - 2020-04-20
  
  This feels like going back to square one moving over to vosk-kaldi. I hope it's worth it.
  
  Absolutely! Let me know if you have further questions.
  

James B Ross - 2020-04-21

Ok, I have a problem right off the bat:

I'm running Ubuntu 18.04 on a Jetson Nano arm64.

From the GitHub page I tried the following:

james@james-desktop:~/vosk-kaldi$ pip3 --version
pip 20.0.2 from /home/james/.local/lib/python3.6/site-packages/pip (python 3.6)
(I have the correct version of pip)

james@james-desktop:~/vosk-kaldi$ pip3 install vosk
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement vosk (from versions: none)
ERROR: No matching distribution found for vosk

(So then I tried the following:)

james@james-desktop:~/vosk-kaldi$ python3 -m pip install vosk
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement vosk (from versions: none)
ERROR: No matching distribution found for vosk

(Same error)

Just for the record, I was able to install sphinxbase, pocketsphinx, and pocketsphinx-python on this machine with no problems. And it's all working.

- Nickolay V. Shmyrev - 2020-04-21
  
  Try
  
  pip3 install https://github.com/alphacep/vosk-api/releases/download/0.3.3/vosk-0.3.3-cp36-cp36m-linux_aarch64.whl
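
The "No matching distribution found" error above usually means pip could not find any package on the index whose tags match the running interpreter and platform, which is why pointing pip at a specific wheel file can help. A minimal sketch, assuming CPython, that prints the tags to compare against the cp36 and linux_aarch64 parts of the wheel name above (the helper name interpreter_tags is made up for illustration):

```python
import sys
import sysconfig


def interpreter_tags():
    """Return (python_tag, platform_tag) for the running interpreter.

    Assumption: CPython. Other implementations use a prefix other than "cp".
    """
    python_tag = "cp{0}{1}".format(sys.version_info[0], sys.version_info[1])
    # sysconfig.get_platform() returns a string such as "linux-aarch64";
    # wheel file names use underscores in place of dashes and dots.
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag
```

Running print(interpreter_tags()) with python3 shows which wheel file from the releases page is worth trying. The ABI part of the file name (cp36m above) is not derived here.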
  

James B Ross - 2020-04-21

Ok, that seemed to work.

james@james-desktop:~/vosk-kaldi$ pip3 install https://github.com/alphacep/vosk-api/releases/download/0.3.3/vosk-0.3.3-cp36-cp36m-linux_aarch64.whl
Defaulting to user installation because normal site-packages is not writeable
Collecting vosk==0.3.3
Downloading https://github.com/alphacep/vosk-api/releases/download/0.3.3/vosk-0.3.3-cp36-cp36m-linux_aarch64.whl (2.5 MB)
|████████████████████████████████| 2.5 MB 4.9 kB/s
Installing collected packages: vosk
Successfully installed vosk-0.3.3
james@james-desktop:~/vosk-kaldi$

I have another question. I found this page: http://kaldi-asr.org/doc/index.html

That appears to have a lot of information on it about Kaldi. But there's no mention of vosk anywhere.

What exactly is vosk, and why is it needed? Isn't Kaldi supposed to be the SRE?

- Nickolay V. Shmyrev - 2020-04-21
  
  What exactly is vosk, and why is it needed? Isn't Kaldi supposed to be the SRE?
  
  If you want to simply use a speech recognizer from Python, you can use the prepackaged vosk wheels and models. Kaldi is more of a system for speech researchers, with a complex install, API, and usage.
  

I'm not getting anywhere with vosk.
I've installed vosk and Kaldi.

I'm trying to run the following Python test code for vosk:

#!/usr/bin/python3

from vosk import Model, KaldiRecognizer
import sys
import os
import wave

if not os.path.exists("model-en"):
    print ("Please download the model from https://github.com/alphacep/kaldi-android-demo/releases and unpack as 'model-en' in the current folder.")
    exit (1)

wf = wave.open(sys.argv[1], "rb")
if wf.getnchannels() != 1 or wf.getsampwidth() != 2 or wf.getcomptype() != "NONE":
    print ("Audio file must be WAV format mono PCM.")
    exit (1)

model = Model("model-en")
rec = KaldiRecognizer(model, wf.getframerate())

while True:
    data = wf.readframes(1000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        print(rec.Result())
    else:
        print(rec.PartialResult())

print(rec.FinalResult())

And I get the following errors:

~/vosk-kaldi/kaldi$ /usr/bin/python3 /home/james/vosk-kaldi/kaldi/test_simple.py
Traceback (most recent call last):
  File "/home/james/vosk-kaldi/kaldi/test_simple.py", line 3, in <module>
    from vosk import Model, KaldiRecognizer
  File "/home/james/.local/lib/python3.6/site-packages/vosk/__init__.py", line 1, in <module>
    from .vosk import KaldiRecognizer, Model, SpkModel
  File "/home/james/.local/lib/python3.6/site-packages/vosk/vosk.py", line 13, in <module>
    from . import _vosk
ImportError: libgfortran.so.3: cannot open shared object file: No such file or directory

Is there anyone still alive who can answer questions about Pocket Sphinx? I'd rather be using pocketsphinx to be honest. It was looking really promising for my specific project.

In terms of not being able to find any help, I'm finding that vosk isn't any better. I can't find any information on vosk at all beyond the github page which has very limited information.

Nickolay V. Shmyrev - 2020-04-23

ImportError: libgfortran.so.3: cannot open shared object file: No such file or directory

You need to install libgfortran.so.3 with sudo apt-get install libgfortran3

Is there anyone still alive who can answer questions about Pocket Sphinx? I'd rather be using pocketsphinx to be honest. It was looking really promising for my specific project.
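
To check whether a shared library such as the one named in the ImportError can be opened at all, before importing vosk again, something like the following may help. It is only a sketch: the function name is hypothetical, and it relies on ctypes.CDLL raising OSError when the dynamic loader cannot open the file.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True
```

For example, can_load_shared_library("libgfortran.so.3") should return False before the package is installed and True afterwards.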

You are welcome to ask.


James B Ross - 2020-04-23

I did install libgfortran as a dependency of Kaldi.

I'll install libgfortran3 and see if that helps.

Although I'm probably going to have as many questions about vosk as I have for pocketsphinx.

The first thing I'll want to do is empty its dictionary and start my own dictionary from scratch. I already know how to do that for pocketsphinx.

Currently I have pocketsphinx running. And it's decoding my speech close to 100% accuracy. In fact, most of the time it is 100% accurate. As I say, it must like my voice.

The things I need to know are little things, like how to stop it from printing out all the INFO: and debug statements on the command line. It basically prints out several pages of what it's doing, with the decoded speech somewhere in the middle. I'd like to be able to turn off those printouts. It's nice to have them when needed, but I don't need them when everything is working properly.

Here's my Python Code for Pocket Sphinx:

# This program is working!
# Needed to set the microphone input correctly on the desktop mic app.
# Saturday April 18th.
import subprocess as cmdLine
from os import environ, path
import pyaudio
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

cmdLine.call('clear')
MODELDIR = "pocketsphinx/model"

config = Decoder.default_config()
config.set_string('-hmm', path.join(MODELDIR, 'en-us/en-us'))
config.set_string('-lm', path.join(MODELDIR, 'en-us/en-us.lm.bin'))
config.set_string('-dict', path.join(MODELDIR, 'en-us/cmudict-en-us.dict'))
decoder = Decoder(config)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

in_speech_bf = False
decoder.start_utt()
Result = ""
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
        if decoder.get_in_speech() != in_speech_bf:
            in_speech_bf = decoder.get_in_speech()
            if not in_speech_bf:
                decoder.end_utt()
                Result = decoder.hyp().hypstr
                print (Result)
                break
    else:
        break
print ("Decoded Speech: {0}".format(Result))

And all I want it to do is print out the final decoded speech.

But it prints out all of the following:

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from pocketsphinx/model/en-us/en-us/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn live batch
-cmninit 40,3,-1 41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
-compallsen no no
-debug 0
-dict pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm pocketsphinx/model/en-us/en-us
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm pocketsphinx/model/en-us/en-us.lm.bin
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(166): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: pocketsphinx/model/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: pocketsphinx/model/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(149): Reading HMM transition probability matrices: pocketsphinx/model/en-us/en-us/transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: pocketsphinx/model/en-us/en-us/means
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: pocketsphinx/model/en-us/en-us/variances
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file pocketsphinx/model/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138824 * 32 bytes (4338 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Dictionary size 134723, allocated 1016 KiB for strings, 1679 KiB for phones
INFO: dict.c(336): 134723 words read
INFO: dict.c(358): Reading filler dictionary: pocketsphinx/model/en-us/en-us/noisedict
INFO: dict.c(213): Dictionary size 134728, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 791 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 152609
INFO: ngram_search_fwdtree.c(333): Created 723 root, 152481 non-root channels, 53 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.front.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM front
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround21
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround21
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround40.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround40
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround41
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround50
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround51
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround71.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround71
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM iec958
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM spdif
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM spdif
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:990:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
INFO: ngram_search.c(459): Resized backpointer table to 10000 entries
INFO: ngram_search.c(467): Resized score stack to 200000 entries
INFO: cmn_live.c(120): Update from < 41.00 -5.29 -0.12 5.09 2.48 -4.07 -1.37 -1.78 -5.08 -2.05 -6.45 -1.42 1.17 >
INFO: cmn_live.c(138): Update to < 51.82 -0.02 -12.76 7.46 -4.59 -1.31 1.28 -0.43 1.27 -0.31 1.46 6.54 -1.84 >
INFO: ngram_search_fwdtree.c(1550): 6515 words recognized (17/fr)
INFO: ngram_search_fwdtree.c(1552): 1118489 senones evaluated (2967/fr)
INFO: ngram_search_fwdtree.c(1556): 5693477 channels searched (15102/fr), 200745 1st, 257404 last
INFO: ngram_search_fwdtree.c(1559): 13944 words for which last channels evaluated (36/fr)
INFO: ngram_search_fwdtree.c(1561): 414142 candidate words for entering last phone (1098/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 7.28 CPU 1.932 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 7.76 wall 2.058 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 202 words
INFO: ngram_search_fwdflat.c(948): 2710 words recognized (7/fr)
INFO: ngram_search_fwdflat.c(950): 279330 senones evaluated (741/fr)
INFO: ngram_search_fwdflat.c(952): 348757 channels searched (925/fr)
INFO: ngram_search_fwdflat.c(954): 19506 words searched (51/fr)
INFO: ngram_search_fwdflat.c(957): 12693 word transitions (33/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.50 CPU 0.133 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.52 wall 0.138 xRT
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.330
INFO: ngram_search.c(1276): Eliminated 2 nodes before end node
INFO: ngram_search.c(1381): Lattice has 645 nodes, 1119 links
INFO: ps_lattice.c(1380): Bestpath score: -10877
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:330:375) = -768052
INFO: ps_lattice.c(1441): Joint P(O,S) = -794637 P(S|O) = -26585
INFO: ngram_search.c(872): bestpath 0.01 CPU 0.002 xRT
INFO: ngram_search.c(875): bestpath 0.01 wall 0.002 xRT
hello my name is james
Decoded Speech: hello my name is james
INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 7.28 CPU 1.937 xRT
INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 7.76 wall 2.064 xRT
INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.50 CPU 0.133 xRT
INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.52 wall 0.138 xRT
INFO: ngram_search.c(303): TOTAL bestpath 0.01 CPU 0.002 xRT
INFO: ngram_search.c(306): TOTAL bestpath 0.01 wall 0.002 xRT
james@james-desktop:~/speech_recognition/pocketsphinx-python$

Notice near the bottom it prints:

hello my name is james
Decoded Speech: hello my name is james

That's 100% perfectly decoded speech.

And it printed both those lines because my program printed those lines.

How do I shut off all the other messages?

I'd really like to find documentation on the Pocket Sphinx source code so I can see what the methods I call actually do, and what parameters I can send them to do things like telling them not to print messages.

I have pocketsphinx running. I like it. I just want to have more control over it from Python, that's all. Simple little things like telling it to quit printing debug messages.
- Nickolay V. Shmyrev - 2020-04-23
  
  How do I shut off all the other messages?
  
  add config.set_string('-logfn', '/dev/null')
  
  see also
  
  https://stackoverflow.com/questions/17825820/how-do-i-turn-off-e-info-in-pocketsphinx
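
As a rough sketch of where that line goes: it should be set on the config before Decoder(config) is created, since the model-loading messages are printed during construction. The helper name below is made up for illustration, and config is assumed to be the object returned by Decoder.default_config().

```python
import os


def silence_decoder_log(config):
    """Send PocketSphinx log output to the null device.

    Assumption: `config` is the object returned by
    Decoder.default_config(); call this before Decoder(config).
    """
    # os.devnull is '/dev/null' on Linux.
    config.set_string('-logfn', os.devnull)
    return config
```

In the program above this would be used as decoder = Decoder(silence_decoder_log(config)), after the -hmm, -lm and -dict options have been set.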
  
  I'd really like to find documentation on the Pocket Sphinx source code so I can see what the methods I call actually do, and what parameters I can send them to do things like telling them not to print messages.
  
  There is C documentation here: https://cmusphinx.github.io/doc/pocketsphinx/files.html. There is no Python documentation, just the source code.
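
In the absence of Python documentation, the object itself can be inspected from the interpreter. The following is a generic sketch using only the standard library; public_methods is a hypothetical helper name, and the docstrings on generated bindings may be sparse or contain only signatures.

```python
import inspect


def public_methods(obj):
    """Return {name: first docstring line} for the public callables of obj."""
    summary = {}
    for name in dir(obj):
        if name.startswith('_'):
            continue
        try:
            member = getattr(obj, name)
        except AttributeError:
            continue
        if callable(member):
            doc = inspect.getdoc(member) or ''
            summary[name] = doc.splitlines()[0] if doc else ''
    return summary
```

Calling public_methods(decoder) lists the same names the IDE offers, and help(decoder.end_utt) prints whatever docstring is attached to a single method. The names can then be looked up in the C documentation linked above.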
  
  I have pocketsphinx running. I like it. I just want to have more control over it from Python, that's all. Simple little things like telling it to quit printing debug messages.
  
  Ok!
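
Note that -logfn only covers the INFO: lines from PocketSphinx. The "ALSA lib" and JACK lines in the output come from the audio libraries loaded by PyAudio, which normally write straight to the process's standard error. One common workaround, sketched here under that assumption, is to point file descriptor 2 at the null device while the audio device is opened (the context manager name is hypothetical):

```python
import contextlib
import os


@contextlib.contextmanager
def suppress_native_stderr():
    """Temporarily redirect file descriptor 2 (stderr) to the null device."""
    saved_fd = os.dup(2)                      # keep a copy of the real stderr
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)                # native code now writes to null
        yield
    finally:
        os.dup2(saved_fd, 2)                  # restore the real stderr
        os.close(devnull_fd)
        os.close(saved_fd)
```

A possible use is wrapping p = pyaudio.PyAudio() and the p.open(...) call in "with suppress_native_stderr():". Genuine error messages written to stderr are hidden as well while the block is active, so it is best kept around the device setup only.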
  

Ok, I finally got vosk running, but as I suspected I have similar questions with vosk:

Here's my Python code:

#!/usr/bin/python3

from vosk import Model, KaldiRecognizer
import time
import os
os.system('clear')
time.sleep(1)

if not os.path.exists("model-en"):
    print ("Please download the model from https://github.com/alphacep/kaldi-android-demo/releases and unpack as 'model-en' in the current folder.")
    exit (1)

import pyaudio

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=8000)
stream.start_stream()

model = Model("model-en")
rec = KaldiRecognizer(model, 16000)

Decoded_speech = ""
print("STARTING HERE:")
while True:
    data = stream.read(2000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        Decoded_speech = rec.Result()
        # print(rec.Result())
        break
    else:
        pass
        # Decoded_speech =rec.PartialResult()

print("My Print Statement: = {0}".format(Decoded_speech))
print("Done")
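
In the script above, rec.Result() returns a JSON string rather than plain text, so Decoded_speech holds the whole JSON document. A small sketch for pulling out just the words, assuming the recognized sentence is stored under a "text" key (the helper name is made up for illustration):

```python
import json


def text_from_result(result_json):
    """Extract the recognized sentence from a Vosk result string.

    Assumption: the result is a JSON object with a "text" field.
    Returns an empty string if the field is absent.
    """
    return json.loads(result_json).get("text", "")
```

With this, Decoded_speech = text_from_result(rec.Result()) leaves only what was said in the variable.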

And here's the output:

ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.front.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM front
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround21
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround21
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround40.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround40
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround41
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround50
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround51.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround51
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.surround71.0:CARD=0'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM surround71
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM iec958
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM spdif
ALSA lib confmisc.c:1281:(snd_func_refer) Unable to find definition 'cards.tegra-hda.pcm.iec958.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4528:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:5007:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM spdif
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.modem
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm.c:2495:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline
ALSA lib pcm_dmix.c:990:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_params.c:2162:(snd1_pcm_hw_refine_slave) Slave PCM not usable
ALSA lib pcm_dmix.c:1052:(snd_pcm_dmix_open) unable to open slave
Cannot connect to server socket err = No such file or directory
Cannot connect to server request channel
jack server is not running or cannot be started
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
JackShmReadWritePtr::~JackShmReadWritePtr - Init not done for -1, skipping unlock
vosk --min-active=200 --max-active=3000 --beam=10.0 --lattice-beam=2.0 --acoustic-scale=1.0 --frame-subsampling-factor=3 --endpoint.silence-phones=1:2:3:4:5:6:7:8:9:10 --endpoint.rule2.min-trailing-silence=0.5 --endpoint.rule3.min-trailing-silence=1.0 --endpoint.rule4.min-trailing-silence=2.0
LOG (vosk[5.5.643~3-7e185]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (vosk[5.5.643~3-7e185]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (vosk[5.5.643~3-7e185]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (vosk[5.5.643~3-7e185]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (vosk[5.5.643~3-7e185]:Collapse():nnet-utils.cc:1472) Added 1 components, removed 2
LOG (vosk[5.5.643~3-7e185]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.163441 seconds in looped compilation.
STARTING HERE:
My Print Statement: = {
"result" : [{
"conf" : 1.000000,
"end" : 3.330000,
"start" : 3.090000,
"word" : "i'm"
}, {
"conf" : 1.000000,
"end" : 3.660000,
"start" : 3.330000,
"word" : "having"
}, {
"conf" : 1.000000,
"end" : 3.750000,
"start" : 3.660000,
"word" : "the"
}, {
"conf" : 1.000000,
"end" : 4.200000,
"start" : 3.750000,
"word" : "same"
}, {
"conf" : 1.000000,
"end" : 4.860000,
"start" : 4.260000,
"word" : "problems"
}, {
"conf" : 1.000000,
"end" : 5.070000,
"start" : 4.860000,
"word" : "with"
}, {
"conf" : 1.000000,
"end" : 5.790000,
"start" : 5.070000,
"word" : "control"
}, {
"conf" : 1.000000,
"end" : 6.270000,
"start" : 5.820000,
"word" : "here"
}],
"text" : "i'm having the same problems with control here"
}
Done
james@james-desktop:~/vosk-kaldi/kaldi$

All the ALSA lib comments are coming from PyAudio, so I'll need to look into how I can turn those off.

But then vosk prints a bunch of comments before starting to capture speech. It doesn't actually start capturing speech until I've printed out "STARTING HERE".

Then I'm getting information for every word it detects, along with the "text" field, where it finally gives me what was actually said in quotes.

Do I then need to dig that result out of there to put it into a string I can actually use?

Isn't there a way to just get the final result and move on to working with that?

I'm basically right back to where I was with PocketSphinx. I mean, I could make this work, but it's kind of ugly. Surely there are cleaner ways to work with this? I just want an SRE that's going to report what was said without printing out a bunch of unnecessary comments to the terminal.

Is there any documented source code for Vosk?

Nickolay V. Shmyrev - 2020-04-23

All the ALSA lib comments are coming from PyAudio, so I'll need to look into how I can turn those off.

You need to clean up the ALSA config; see https://stackoverflow.com/questions/7088672/pyaudio-working-but-spits-out-error-messages-each-time
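
If cleaning up the ALSA config is not practical, a commonly used workaround (not the fix recommended above, and only a sketch) is to silence stderr at the file-descriptor level while the audio library initializes. The helper name `suppress_stderr` is hypothetical, and the PyAudio call in the usage comment assumes PyAudio is installed. Note that this also hides genuine error messages printed during that window.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_stderr():
    """Temporarily redirect file descriptor 2 (stderr) to os.devnull.

    Redirecting the descriptor itself (rather than sys.stderr) also
    silences messages written by C libraries such as ALSA.
    """
    sys.stderr.flush()
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    saved_fd = os.dup(2)            # keep a copy of the real stderr
    try:
        os.dup2(devnull_fd, 2)      # anything written to fd 2 is discarded
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved_fd, 2)        # restore the real stderr
        os.close(saved_fd)
        os.close(devnull_fd)


# Usage (assumption: PyAudio is installed):
# import pyaudio
# with suppress_stderr():
#     audio = pyaudio.PyAudio()
```

Only the initialization is wrapped, so errors raised later by your own code still reach the terminal.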

Do I then need to dig that result out of there to put it into a string I can actually use?

It is JSON; you can parse it with json.loads:

import json
# rec.Result() returns a JSON string; parse it and keep only the recognized text.
result = json.loads(rec.Result())
text = result['text']
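
To make the parsing step concrete without needing a recognizer, here is a minimal sketch that runs on a shortened copy of the JSON printed earlier in this thread. The helper names `extract_text` and `extract_words` are hypothetical, and the sample string stands in for whatever `rec.Result()` returns.

```python
import json

# Shortened copy of the recognizer output shown earlier in the thread.
sample_result = '''{
  "result" : [
    {"conf" : 1.0, "end" : 3.33, "start" : 3.09, "word" : "i'm"},
    {"conf" : 1.0, "end" : 3.66, "start" : 3.33, "word" : "having"}
  ],
  "text" : "i'm having"
}'''


def extract_text(result_json):
    """Return only the recognized sentence from a JSON result string."""
    return json.loads(result_json).get("text", "")


def extract_words(result_json):
    """Return (word, start, end) tuples; empty list if no word details."""
    data = json.loads(result_json)
    return [(w["word"], w["start"], w["end"]) for w in data.get("result", [])]


print(extract_text(sample_result))  # prints: i'm having
```

Using `.get()` with a default keeps the helpers from raising a `KeyError` if a result arrives without the `"text"` or `"result"` field.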

Is there any documented source code for Vosk?

No

James B Ross - 2020-04-23

"There is C documentation here. https://cmusphinx.github.io/doc/pocketsphinx/files.html. There is no Python documentation, just the source code."

This will be very helpful. Thank you!

Thanks for the information on alsa config, and json too.

Sorry to hear that there's no documentation for vosk.

How's a person supposed to learn it? Just keep coming here and asking questions?

I just want to iron out a few things so I can use an SRE for my Linguistic AI project.

I like Pocket Sphinx because I understand how to modify its dictionary.

I have no clue how to modify the dictionary vosk uses.

My Linguistic AI project is all about building a dictionary from the ground up. So I really need to start with an empty dictionary, something my AI software can then build up as it learns new words.

vosk does look like a better SRE, but I need it to be as flexible as Pocket Sphinx was in terms of building the dictionary.

So little information, and no tutorials. That's a bummer! I'd think there should be some good tutorials around on how to use and modify vosk.

- Nickolay V. Shmyrev - 2020-04-23
  
  How's a person supposed to learn it? Just keep coming here and asking questions?
  
  Yes. Many people can read code too; it is the best documentation.
  
  So little information, and no tutorials. That's a bummer! I'd think there should be some good tutorials around on how to use and modify vosk.
  
  Eventually there will be some tutorials, but for now the pace of development of the technology makes it very hard to create extensive documentation. You can also check:
  
  https://groups.google.com/d/topic/kaldi-help/3LBSzmploC0/discussion
  
James B Ross - 2020-04-24

It can be difficult to get a simple answer to a question.

I've got vosk installed and running on a computer. Vosk only, not Kaldi, although Vosk apparently comes with some form of a KaldiRecognizer.

My question is: where do I find the vocabulary dictionary, and how do I modify it?
And is it easy to do?

With pocketsphinx it is extremely easy to modify the dictionary or even create a new dictionary from scratch. And that was the main attraction of pocketsphinx. I need to be able to create a dictionary from scratch and be able to modify it and add to it on the fly, easily and programmatically. This is extremely easy to do with pocketsphinx.
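
As an illustration of the pocketsphinx workflow described here, the following is a minimal sketch of building a dictionary file programmatically. It assumes the plain-text format used by the dictionaries distributed with pocketsphinx (one entry per line, the word followed by its space-separated phones) and that the phones belong to the acoustic model's phone set. The function names `write_dictionary` and `read_dictionary` are hypothetical.

```python
import tempfile
from pathlib import Path


def write_dictionary(entries, path):
    """Write {word: [phones]} as one 'word PH1 PH2 ...' line per word."""
    lines = [f"{word} {' '.join(phones)}" for word, phones in sorted(entries.items())]
    Path(path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")


def read_dictionary(path):
    """Read a dictionary file back into a {word: [phones]} mapping."""
    entries = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if parts:                      # skip blank lines
            entries[parts[0]] = parts[1:]
    return entries


# Start from an empty dictionary and grow it as new words are learned.
vocabulary = {}
vocabulary["hello"] = ["HH", "AH", "L", "OW"]
vocabulary["world"] = ["W", "ER", "L", "D"]

with tempfile.TemporaryDirectory() as tmp:
    dict_path = Path(tmp) / "custom.dict"
    write_dictionary(vocabulary, dict_path)
    print(dict_path.read_text(encoding="utf-8"), end="")
```

The resulting file would then be handed to the decoder through its dictionary option (for example `-dict` in the decoder configuration). Swapping dictionaries would amount to writing several such files and choosing which one to load.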

Currently with vosk I have no idea where to even find the vocabulary dictionary much less how to modify it or swap it out. In my Linguistic AI project I need the ability to create several different dictionaries for special cases and be able to swap between them programmatically on the fly.

As I say, this is very easy to do with pocketsphinx. But I have no clue how to do this with vosk.

Where can I find information on how to create specialized dictionaries for vosk, and how difficult is it to do?

I'm thinking this may be the deciding factor of whether I use vosk or pocketsphinx for my project. If it's not easy to work with the vosk vocabulary dictionary, then it's not going to be of much use in my project. So the answer to this question is paramount for my application.

And a simple answer like "Yes it can be easily done" is not a useful answer. A useful answer is to point to information on exactly how to do it.

Without knowing how to do it, knowing that it's supposedly easy to do isn't very helpful.

- Nickolay V. Shmyrev - 2020-04-24
  
  It is not very easy but somewhat doable; for details, see http://vpanayotov.blogspot.com/2012/06/kaldi-decoding-graph-construction.html
  
James B Ross - 2020-04-25

Thanks for all the information, Nickolay. You've been very helpful. Vosk does seem to be a very good SRE, but modifying its dictionary appears to be too complex for my purpose. So I'm going to need to go back to using pocketsphinx just for the ease of use of its dictionary.

In any case, thanks for your patience and time. You've been very helpful.
