Long silence increases misrecognition

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Creator: Krisztian Loki

Created: 2010-07-14

Updated: 2012-09-22

Krisztian Loki - 2010-07-14

Hi All,

I recently tried to implement some basic barge-in capability, but quickly
discovered that both PocketSphinx and Sphinx3 have the same bug: if the
utterance begins with a long silence (a couple of seconds), the recognition
result is completely wrong. An all-silence utterance is correctly recognised
as noise/silence. Has anyone experienced something similar?

Nickolay V. Shmyrev - 2010-07-14

This is not a bug. You are not using an endpointer, that's all.

Krisztian Loki - 2010-07-14

Indeed I'm not; I don't even know what it is. So a simple pocketsphinx_batch
run on a raw file starting with silence won't give the expected result? What
do I have to do to make it work?

Nickolay V. Shmyrev - 2010-07-14

pocketsphinx_batch is meant to decode short files without silences.

If you need to decode long files with silences, you need to use the
pocketsphinx API differently: filter out the silences using the cont_ad
functions from sphinxbase. You can find an example of the API usage in
pocketsphinx_continuous. There have also been many threads about this in this
forum.
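
To make the idea of filtering silence before decoding concrete, here is a
minimal sketch of an energy-based endpointer that drops leading silence from
raw 16-bit PCM audio. It is not the cont_ad implementation from sphinxbase and
does not call any Sphinx API; the function names, the 16 kHz rate, the 10 ms
frame size, the RMS threshold of 500 and the padding of 5 frames are all
assumptions chosen for illustration.

```python
import array
import math


def frame_rms(samples):
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def trim_leading_silence(samples, rate=16000, frame_ms=10,
                         threshold=500.0, pad_frames=5):
    """Return samples starting shortly before the first loud frame.

    All parameter defaults are illustrative assumptions, not values
    taken from sphinxbase. An all-silence input yields an empty result.
    """
    frame_len = rate * frame_ms // 1000
    for start in range(0, len(samples), frame_len):
        if frame_rms(samples[start:start + frame_len]) > threshold:
            # Keep a little context before the onset so the first
            # phoneme is not clipped.
            keep_from = max(0, start - pad_frames * frame_len)
            return samples[keep_from:]
    return samples[:0]


def load_raw(path):
    """Read a headerless 16-bit mono PCM file (native byte order assumed)."""
    with open(path, "rb") as f:
        data = f.read()
    pcm = array.array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    pcm.frombytes(data[:len(data) - len(data) % 2])
    return pcm
```

The trimmed samples would then be handed to the decoder in place of the
original file contents. A fixed threshold like this only works for clean
recordings at a known level; a real endpointer adapts to the background noise
and also handles silence in the middle and at the end of the audio.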
