CMU Sphinx / Forums / Help: input audio duration limit in sphinx 3.5?

MarkCatchpole - 2005-07-20

Hi,

I am running Sphinx 3.5 on Windows 2000 using the WSJ acoustic and language models. My problem is that livedecode currently recognises only a short fragment of my input audio: about 1 second of speech.

This is perhaps partly because I can't configure my PC sound card to any sampling rate other than 22050 Hz, so a buffer somewhere may be filling up with the extra audio samples. The recogniser itself does appear to be working correctly: when I play back the raw audio file dumped by the recogniser, it corresponds to the words the recogniser generated.

I seem to remember seeing something about a hard limit on the amount of audio data somewhere. Any pointers to code I could modify, or other suggestions, would be appreciated.

Mark

- The Grand Janitor - 2005-07-20
  
  Hi Mark,
  Allow me to explain. The Sphinx engines (2, 3.x and 4) are all basically speech recognition engines. Despite a popular misunderstanding, Sphinx doesn't provide any dictation interface.
  What you are seeing, livedecode, is just a demo: a way to let developers try out the engine.
  
  As for your problem: hmm, it is very strange that capturing audio at 22.05 kHz gives you a correct answer, because usually you need to match the conditions the model was trained under (in the WSJ case, 16 kHz). I would therefore recommend configuring your PC sound card to capture at 16 kHz, or down-sampling the waveform. livedecode does have a limit on duration. Until I have asked the original author to confirm, try playing with something like BUFSIZE. I will follow this up asap.
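  
  To illustrate the down-sampling suggestion, here is a rough sketch in Python (not part of Sphinx). It assumes the dumped audio is a headerless, mono, 16-bit signed PCM file in the machine's native byte order; the function names and file paths are made up for the example. It uses plain linear interpolation with no low-pass filter, so some aliasing is possible, and a proper resampling tool would likely do a better job.
  
  ```python
  from array import array


  def resample_linear(samples, src_rate=22050, dst_rate=16000):
      """Resample 16-bit PCM samples by linear interpolation (no anti-alias filter)."""
      out = array("h")
      if len(samples) == 0:
          return out
      # Number of output samples covering the same duration as the input.
      n_out = len(samples) * dst_rate // src_rate
      last = len(samples) - 1
      for i in range(n_out):
          pos = i * src_rate / dst_rate      # fractional position in the source
          j = int(pos)
          frac = pos - j
          a = samples[j]
          b = samples[min(j + 1, last)]      # clamp at the final sample
          out.append(int(round(a + (b - a) * frac)))
      return out


  def resample_raw_file(src_path, dst_path, src_rate=22050, dst_rate=16000):
      """Convert a raw 16-bit mono PCM file from src_rate to dst_rate."""
      with open(src_path, "rb") as f:
          data = f.read()
      samples = array("h")
      # Drop a trailing odd byte, if any, so the data splits into whole samples.
      samples.frombytes(data[: len(data) - len(data) % 2])
      with open(dst_path, "wb") as f:
          f.write(resample_linear(samples, src_rate, dst_rate).tobytes())
  ```
  
  For example, `resample_raw_file("capture_22k.raw", "capture_16k.raw")` (hypothetical file names) would write a 16 kHz version of a 22.05 kHz capture, which could then be fed to the decoder in place of the original.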
  
  Arthur
  
- shiosai - 2005-07-23
  
  I had a similar problem with livedecode under Linux. It seemed the decoder didn't capture the speech correctly, while the dumped raw file was perfect.
  Perhaps this thread can help you:
  http://sourceforge.net/forum/forum.php?thread_id=1282514&forum_id=5470
  