CMU Sphinx / Forums / Help: Stop detection with intermediary result when using grammar

christian schuch - 2020-02-04

I've built a recognition system for German digits (short words) using pocketsphinx (because I need to run it on an embedded system). So far it recognizes the digits and rejects false positives with a certainty of about 80%. The setup is Asterisk 13 and a plugin that connects the speech recognition core with a client, which in turn connects to a small server that talks to pocketsphinx and communicates with the plugin over an internal port. I can see the intermediary results on standard out as well as the final verdict with the recognition result. The model is set up using a JSGF grammar to ensure false-positive rejection. When using a grammar instead of a .lm, the probabilities are not readable, although some sort of evaluation is obviously happening, which finally reaches a certain threshold, which in turn makes PS hand over a result. My problem is that sometimes the recognition core needs more than one try of the utterance to deliver a result, although monitoring the internal results clearly shows that the correct interpretation has already been found. Is it possible to force PS to hand over results below that (maybe imaginary) threshold? How is the internal recognition set up when it is not using p(x)?

- Nickolay V. Shmyrev - 2020-02-04
  
  I'm sorry, it is hard to understand the purpose of the system and to give you advice. You are simply experiencing accuracy issues.
  
  Asterisk is a bad choice here since it is limited to 8 kHz audio, which is less accurate than 16 kHz.
  
  Another option would be a more accurate system based on neural networks. https://github.com/alphacep/vosk-api should work on an RPi if that is your embedded system. The German model is here: https://github.com/alphacep/kaldi-android-demo/releases/download/2020-01/alphacep-model-android-de-zamia-0.3.tar.gz
  
  You can explain your system in more detail to get better advice.
  
  - christian schuch - 2020-02-05
    
    The purpose of the system is an IVR menu.
    Unfortunately, only pocketsphinx is possible for system and contract reasons, and so far it works very well (I've got an accuracy of about 80% with only about one hour of recorded material). Let me describe the setup a little more precisely: On the Asterisk side I use a plugin, which works with the resspeech.... - core. This one takes the audio stream and hands it over via an internal port to a server, which maintains a connection to an open pocketsphinx thread and hands the audio data to the PS core for interpretation (both the plugin and the server are based on astsphinx). The PS core then prints out several lines of intermediary results (see attached jpg). I just want to add a timeout after which the core returns the intermediary result or, if that is not possible, a fault (empty string). Is that possible?
    
    Last edit: christian schuch 2020-02-05
    
    std_out_ps.jpg
    
    - Nickolay V. Shmyrev - 2020-02-06
      
      OK then. A timeout is perfectly possible; you just have to implement it yourself in your code. I do not see any problem here: just count the bytes processed and return once you have got enough bytes.
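
As a rough illustration of the byte-counting idea, here is a minimal sketch in C. It assumes 8 kHz, 16-bit mono PCM (the 8 kHz figure comes from the Asterisk remark earlier in the thread; the 16-bit sample width is an assumption). The names `utt_timer_t`, `bytes_for_ms`, `utt_timer_start` and `utt_timer_feed` are hypothetical helpers, not part of pocketsphinx.

```c
#include <stddef.h>

/* Assumed audio format: 8 kHz, 16-bit mono PCM (adjust to your stream). */
enum { SAMPLE_RATE_HZ = 8000, BYTES_PER_SAMPLE = 2 };

/* Hypothetical helper: tracks how much audio one utterance has consumed. */
typedef struct {
    size_t bytes_seen;   /* bytes handed to the decoder so far */
    size_t byte_limit;   /* byte count that corresponds to the timeout */
} utt_timer_t;

/* Convert a timeout in milliseconds into a number of audio bytes. */
static size_t bytes_for_ms(unsigned timeout_ms)
{
    return (size_t)SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * timeout_ms / 1000;
}

/* Reset the counter at the start of every utterance. */
static void utt_timer_start(utt_timer_t *t, unsigned timeout_ms)
{
    t->bytes_seen = 0;
    t->byte_limit = bytes_for_ms(timeout_ms);
}

/* Account for one audio chunk; returns 1 once the limit has been reached. */
static int utt_timer_feed(utt_timer_t *t, size_t chunk_bytes)
{
    t->bytes_seen += chunk_bytes;
    return t->bytes_seen >= t->byte_limit;
}
```

The server loop would call `utt_timer_start` when an utterance begins and `utt_timer_feed` after each chunk it passes to the decoder, and would stop and return whatever result it has as soon as the function reports 1. Counting audio bytes rather than wall-clock time keeps the timeout tied to the amount of speech actually processed.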
      

christian schuch - 2020-02-06

Or what else do you want to know about my setup?


christian schuch - 2020-02-06

OK, with what API command or data structure can I read out the intermediary results?

- Nickolay V. Shmyrev - 2020-02-06
  
  The ps_get_hyp function returns the current result whenever you call it, whether final or intermediate.
  
  https://cmusphinx.github.io/doc/pocketsphinx/pocketsphinx_8h.html#ada74b12d71e9d4db5d959b94004ff812
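
One way to return the last intermediate result when a limit is hit, or an empty string if there is none, is sketched below. The decoder is hidden behind a small hypothetical interface (`decoder_if_t`) so the sketch compiles without pocketsphinx; in a real setup `feed` would presumably wrap `ps_process_raw` and `current_hyp` would wrap `ps_get_hyp`. The `fake_*` functions and `recognize_with_timeout` are stand-ins invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder interface (not a pocketsphinx type). */
typedef struct {
    void *ctx;
    void (*feed)(void *ctx, const short *samples, size_t n_samples);
    const char *(*current_hyp)(void *ctx);   /* NULL while nothing is found */
} decoder_if_t;

/* Stand-in decoder for demonstration only: it reports "drei" once it has
 * seen at least 1600 samples and nothing before that. */
typedef struct { size_t samples_seen; } fake_decoder_t;

static void fake_feed(void *ctx, const short *samples, size_t n_samples)
{
    (void)samples;
    ((fake_decoder_t *)ctx)->samples_seen += n_samples;
}

static const char *fake_hyp(void *ctx)
{
    return ((fake_decoder_t *)ctx)->samples_seen >= 1600 ? "drei" : NULL;
}

/* Feed audio chunk by chunk until it is used up or max_samples is reached.
 * The last non-empty intermediate hypothesis is copied into out.
 * Returns 1 if a hypothesis was found, 0 (and an empty string) otherwise. */
static int recognize_with_timeout(const decoder_if_t *dec,
                                  const short *audio, size_t total_samples,
                                  size_t chunk_samples, size_t max_samples,
                                  char *out, size_t out_size)
{
    size_t done = 0;

    if (out_size == 0 || chunk_samples == 0)
        return 0;
    out[0] = '\0';

    while (done < total_samples && done < max_samples) {
        size_t n = total_samples - done;
        if (n > chunk_samples)
            n = chunk_samples;

        dec->feed(dec->ctx, audio + done, n);
        done += n;

        /* Poll the intermediate result after every chunk. */
        const char *hyp = dec->current_hyp(dec->ctx);
        if (hyp != NULL && hyp[0] != '\0') {
            strncpy(out, hyp, out_size - 1);
            out[out_size - 1] = '\0';
        }
    }
    return out[0] != '\0';
}
```

Copying the hypothesis into a caller-owned buffer is a deliberate choice here: it avoids relying on how long a string returned by the decoder stays valid after further audio is processed.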
  

christian schuch - 2020-02-07

thx very much

