I see in your pocketsphinx_continuous source code that you drop all silence
segments in normal mode. I don't understand why you drop them, because the SIL
model was already trained during the training phase, and dropping silence may
cause a mismatch between the training and testing phases.
And a small question: if I switch to raw mode, can I detect the start and the
end of an utterance so I can stop the recognizer? I have tried replacing
cont_ad_init() with cont_ad_init_rawmode(), but the recognizer still
recognizes forever...
Thanks so much
I see in your pocketsphinx_continuous source code that you drop all silence
segments in normal mode. I don't understand why you drop them, because the SIL
model was already trained during the training phase, and dropping silence may
cause a mismatch between the training and testing phases.
That's one of the possible approaches; it is faster, because decoding silence
is still computationally expensive. Other approaches, like the one you
describe, also make sense. If the model is trained properly, a mismatch caused
by dropped silence shouldn't matter.
If I switch to raw mode, can I detect the start and the end of an utterance so
I can stop the recognizer? I have tried replacing cont_ad_init() with
cont_ad_init_rawmode(), but the recognizer still recognizes forever...
Sorry, I'm not sure what you need. If you don't need a silence detector, you
can just use ad_read() instead of cont_ad_read().