CMU Sphinx / Forums / Help: Pocketsphinx continuous problem

Hi, I have been using pocketsphinx_continuous with a custom language model and
acoustic model, and it has been working very well, but after a while it just
stops. Below is the usual output for the last two utterances before the halt.
I would be very grateful if someone could advise where I should look for a
solution before I lose myself in the code completely :)

READY....
Listening...
Stopped listening, please wait...
INFO: cmn_prior.c(121): cmn_prior_update: from < 11.63  0.18  0.07 -0.16 -0.30 -0.30 -0.11 -0.33 -0.05 -0.20 -0.05 -0.13 -0.09 >
INFO: cmn_prior.c(139): cmn_prior_update: to   < 11.61  0.18  0.06 -0.17 -0.31 -0.30 -0.12 -0.34 -0.05 -0.20 -0.05 -0.13 -0.09 >
INFO: ngram_search_fwdtree.c(1502):      428 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1504):    40277 senones evaluated (491/fr)
INFO: ngram_search_fwdtree.c(1506):    13346 channels searched (162/fr), 2574 1st, 3921 last
INFO: ngram_search_fwdtree.c(1510):      659 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1513):      427 candidate words for entering last phone (5/fr)
INFO: ngram_search_fwdflat.c(295): Utterance vocabulary contains 11 words
INFO: ngram_search_fwdflat.c(912):      270 words recognized (3/fr)
INFO: ngram_search_fwdflat.c(914):    22248 senones evaluated (271/fr)
INFO: ngram_search_fwdflat.c(916):     7823 channels searched (95/fr)
INFO: ngram_search_fwdflat.c(918):      912 words searched (11/fr)
INFO: ngram_search_fwdflat.c(920):      371 word transitions (4/fr)
WARNING: "ngram_search.c", line 1082: </s> not found in last frame, using vrata instead
INFO: ngram_search.c(1132): lattice start node <s>.0 end node vrata.44
INFO: ps_lattice.c(1228): Normalizer P(O) = alpha(vrata:44:80) = -831767
INFO: ps_lattice.c(1266): Joint P(O,S) = -831943 P(S|O) = -176
000000113: otkljuczaj vrata (-16040810)
READY....
Listening...
Stopped listening, please wait...
INFO: cmn_prior.c(121): cmn_prior_update: from < 11.61  0.18  0.06 -0.17 -0.31 -0.30 -0.12 -0.34 -0.05 -0.20 -0.05 -0.13 -0.09 >
INFO: cmn_prior.c(139): cmn_prior_update: to   < 11.65  0.14  0.04 -0.15 -0.29 -0.27 -0.12 -0.35 -0.05 -0.21 -0.05 -0.13 -0.10 >
INFO: ngram_search_fwdtree.c(1502):      372 words recognized (3/fr)
INFO: ngram_search_fwdtree.c(1504):    41649 senones evaluated (389/fr)
INFO: ngram_search_fwdtree.c(1506):    12908 channels searched (120/fr), 2978 1st, 3233 last
INFO: ngram_search_fwdtree.c(1510):      580 words for which last channels evaluated (5/fr)
INFO: ngram_search_fwdtree.c(1513):      373 candidate words for entering last phone (3/fr)
INFO: ngram_search_fwdflat.c(295): Utterance vocabulary contains 10 words
INFO: ngram_search_fwdflat.c(912):      190 words recognized (2/fr)
INFO: ngram_search_fwdflat.c(914):    13517 senones evaluated (126/fr)
INFO: ngram_search_fwdflat.c(916):     4863 channels searched (45/fr)
INFO: ngram_search_fwdflat.c(918):      689 words searched (6/fr)
INFO: ngram_search_fwdflat.c(920):      460 word transitions (4/fr)
WARNING: "ngram_search.c", line 1082: </s> not found in last frame, using <sil> instead
INFO: ngram_search.c(1132): lattice start node <s>.0 end node <sil>.98
INFO: ps_lattice.c(1228): Normalizer P(O) = alpha(<sil>:98:105) = -952607
INFO: ps_lattice.c(1266): Joint P(O,S) = -952607 P(S|O) = 0
000000114: otkljuczaj vrata (-18248974)
READY....
Listening...
INFO: ngram_search.c(407): Resized backpointer table to 10000 entries
INFO: ngram_search.c(407): Resized backpointer table to 20000 entries
INFO: ngram_search_fwdtree.c(1433): Renormalizing Scores at frame 4136, best score -534675725
INFO: ngram_search.c(407): Resized backpointer table to 40000 entries
INFO: ngram_search.c(407): Resized backpointer table to 80000 entries
INFO: ngram_search.c(415): Resized score stack to 200000 entries
INFO: ngram_search.c(407): Resized backpointer table to 160000 entries
pocketsphinx_continuous: ngram_search_fwdtree.c:803: prune_nonroot_chan: Assertion `(&hmm->hmm)->frame >= frame_idx' failed.

Nickolay V. Shmyrev - 2010-09-11

Hello

That sounds like a bug. What version are you using? Can you reproduce it with
recent trunk? Can you provide the models so I can reproduce it locally? Can
you also enable audio dump with -rawlogdir . and share the last audio chunk
that causes the issue?

Are you running on a system with restricted memory? It looks like you need
tighter beams then.

Bula - 2010-09-11

Hi,

I'm using sphinxbase and pocketsphinx, both version 0.6, on an embedded system
with 256MB. I can try to reproduce it with recent trunk and with -rawlogdir
later today or tomorrow and send you the results.

Bula - 2010-09-13

Today I finally managed to run some tests, and here is what I found.
pocketsphinx_continuous gets stuck in continuous listening mode, and these
messages (memory related?) are a consequence of a would-be-infinite utterance.

When I introduce a break in the listening loop (if the utterance is more than
a few seconds long), it keeps breaking out of it with (null) or some other,
mostly wrong, hypothesis.

I tried to find out what triggers that behavior but I can't put my finger on
it. It seems like the sensitivity to noise suddenly gets really high. When I
play the raw files I can hear slightly elevated noise (interference) in the
background, but nothing too loud or out of the ordinary. If you would like to
help me, I can send you the raw files from before and after the constant
listening mode kicks in.

Also, there is one more thing that is probably unrelated: at the beginning of
each raw file a little stuttering can be heard. When I make a normal rec (in
parallel with pocketsphinx) in another terminal it sounds OK, so I wonder
whether that is normal behavior and what the cause is.

Nickolay V. Shmyrev - 2010-09-13

If I were able to reproduce it, I could solve this easily. That parallel
recording file would also help.

As for the embedded device, there are a number of tricks to keep memory usage
low. For example, -maxhmmpf 3000 should be better.

Bula - 2010-09-13

Hey, thank you for the tip, but it seems that memory is not the problem; the
problem is the infinite utterance loop. I looked into it further and the real
problem seems to be the silence/speech filtering in the sphinxbase cont_ad
module. The cause might be that my program often mutes the capture device, so
by recalibrating during muted periods the module becomes too sensitive to
noise/interference and thinks it's speech. If that is the case, the solution
is to avoid muting the device and to keep interference as low as possible by
solving some hardware issues.
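
To illustrate the suspected mechanism, here is a toy adaptive energy detector.
It is NOT the cont_ad algorithm; the struct, the function names, the margin
and the adaptation rate are all made up for the illustration, and energies are
in arbitrary units. It only shows how a noise floor that adapts during muted
(zero-energy) periods can make ordinary background noise look like speech
afterwards.

```c
/* Toy model only: not the sphinxbase cont_ad implementation. */
typedef struct {
    double noise_floor; /* running estimate of background energy */
    double margin;      /* energy above the floor that counts as speech */
    double adapt_rate;  /* 0..1, how fast the floor follows the input */
} toy_detector_t;

static void toy_init(toy_detector_t *d, double initial_floor)
{
    d->noise_floor = initial_floor;
    d->margin = 10.0;
    d->adapt_rate = 0.1;
}

/* Classify one frame: returns 1 for speech, 0 for silence.
 * Only silence frames pull the noise floor toward their energy. */
static int toy_process(toy_detector_t *d, double frame_energy)
{
    if (frame_energy > d->noise_floor + d->margin)
        return 1;
    d->noise_floor += d->adapt_rate * (frame_energy - d->noise_floor);
    return 0;
}

/* Feed n identical frames; returns how many were classified as speech. */
static int toy_feed(toy_detector_t *d, double frame_energy, int n)
{
    int speech = 0;
    int i;
    for (i = 0; i < n; i++)
        speech += toy_process(d, frame_energy);
    return speech;
}
```

In this toy model, a detector that has only ever seen background noise keeps
classifying it as silence. After a long run of zero-energy (muted) frames the
floor has dropped close to zero, the same background noise is above the
threshold, and because speech frames do not update the floor the detector never
recovers, which resembles the never-ending utterance described above.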

There is one other thing that bothers me and that I can't figure out: the
stuttering in the raw files. I'm sending you the recordings by mail.

Nickolay V. Shmyrev - 2010-09-14

Hi

I've got your file; it seems to have a DC offset. Please try adding the
"-remove_dc yes" option to remove it. Also, please send me a long continuous
recording from your device that bypasses pocketsphinx. I'll try to test
cont_ad locally.
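
For readers unfamiliar with the term, a DC offset means the samples are
centred on some non-zero value instead of zero. The sketch below shows the
general idea of removing it from a buffer of 16-bit samples by subtracting the
mean. It is an illustration only, not how sphinxbase implements -remove_dc,
and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate the DC offset of a buffer as the mean of its samples. */
static double dc_offset(const int16_t *samples, size_t n)
{
    double sum = 0.0;
    size_t i;
    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Subtract the estimated offset in place, clamping to the int16 range
 * and rounding to the nearest integer. */
static void remove_dc(int16_t *samples, size_t n)
{
    double offset = dc_offset(samples, n);
    size_t i;
    for (i = 0; i < n; i++) {
        double v = samples[i] - offset;
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        samples[i] = (int16_t)(v < 0 ? v - 0.5 : v + 0.5);
    }
}
```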

Nickolay V. Shmyrev - 2010-09-19

Hi

Sorry, it took me a while. I checked your file and it seems OK. It might
indeed be as you said: when you mute, the energy endpointer breaks. Can you
try inserting into your code

cont_ad_set_logfp(cont, stdout);

and share the log of the endpointer?

Also, I wonder: if you mute the device, why don't you stop recording as well?
You can just pause cont_ad_read for a while.

Bula - 2010-09-20

Hi, thank you for looking into it. The conditions that have to be satisfied
for the problem to occur are very subtle and difficult to grasp, but the
solution was not hard: removing the mute part. Indeed, it should have been
avoided in the first place, as it was only a bad temporary shortcut in place
of the real solution (which is pausing cont_ad_read, as you said).

The second problem from my last post is a little harder to trace: as you can
hear from the files in my last mail, a normal rec (in parallel with
pocketsphinx) records a perfect sound file, but the raw audio files logged by
pocketsphinx at the same time have stuttering all over. I wonder what the
cause is and whether solving this problem could improve recognition accuracy.

Nickolay V. Shmyrev - 2010-09-24

Hello

In order to investigate this, can you please dump the raw input to a file?
Not a parallel recording, but the actual data that comes from the device. You
need to insert this chunk of code into the application.

if ((rawfp = fopen(copyfile, "wb")) == NULL)
    E_ERROR("Failed to open raw output file '%s' for writing: %s\n",
            copyfile, strerror(errno));
else
    cont_ad_set_rawfp(cont, rawfp);


Bula - 2010-09-25

Hi, I have inserted the code and files are sent by mail.

Erik Andresen - 2012-03-05

Hi,

Yes, I know this bug is a bit older, but I have the same problem running
pocketsphinx over gstreamer:

ngram_search_fwdtree.c:825: prune_nonroot_chan: Assertion `(&hmm->hmm)->frame >= frame_idx' failed.

Bug is in both version 0.7 and current svn trunk.

Nickolay V. Shmyrev - 2012-03-06

Hello nxdefiant

If you have the same problem, maybe the same solution can help you. You just
need to configure the frontend to split the audio into utterances properly.
You can also change the frame_idx_t type to int32 to allow longer utterances.
See the hmm.h header for details.
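
Why a wider frame_idx_t allows longer utterances can be sketched as follows.
Two things here are assumptions that the thread does not state: that the index
was 16 bits wide before the change, and that the frontend produces 100 frames
per second. The type and function names are hypothetical and do not come from
hmm.h.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a narrow and a wide frame index type. */
typedef int16_t narrow_frame_idx_t;
typedef int32_t wide_frame_idx_t;

/* Seconds of audio that can be indexed before the index runs out,
 * given the largest representable index and the frame rate. */
static double max_utterance_seconds(int64_t max_index, int frames_per_second)
{
    return (double)(max_index + 1) / (double)frames_per_second;
}

/* Store a frame number in the narrow type and check whether a
 * "stored frame >= current frame" comparison still holds.
 * Converting an out-of-range value to int16_t is implementation-defined;
 * on common compilers it wraps around to a negative number. */
static int narrow_index_is_valid(int32_t frame)
{
    narrow_frame_idx_t stored = (narrow_frame_idx_t)frame;
    return stored >= frame;
}
```

Under these assumptions a 16-bit index covers about 327 seconds of audio, so
an utterance that is never split would eventually exceed it, while a 32-bit
index pushes that limit far beyond any practical recording length. The thread
does not confirm that this is the exact mechanism behind the assertion.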

Erik Andresen - 2012-03-06

Just to be on the safe side: this means configuring vader in gstreamer, right?

Nickolay V. Shmyrev - 2012-03-06

And that too

Erik Andresen - 2012-03-24

OK, with frame_idx_t as int32 it is way more stable (not a single crash yet).
Thanks.
