I trained an acoustic model and decoded the test data with pocketsphinx_batch; the WER is about 2.5%. I want to get the time boundaries of the recognized words, but there seems to be no such function in pocketsphinx_batch, so I turned to pocketsphinx_continuous. However, its recognition results differ from those of pocketsphinx_batch, even with the same parameters. I then tried using the recognized text from pocketsphinx_batch as the grammar and decoding with pocketsphinx_continuous, but for some files it reports errors during decoding and produces no result.
I want to ask:
1. What is the difference between these two decoders?
2. How can I get time boundaries with pocketsphinx_batch? (Or is there any other way to get them?)
Thank you.
I notice that the output is always slightly shifted from our labels. Is that because of the feature extraction mechanism in Sphinx? As far as I know, some toolkits do not start from the first signal sample, because they need a little preceding context to compute the features. Could you tell me where I can find the details of feature extraction in Sphinx? I looked but couldn't find them.
Thank you.
I notice that the output is always slightly shifted from our labels. Is that because of the feature extraction mechanism in Sphinx? As far as I know, some toolkits do not start from the first signal sample, because they need a little preceding context to compute the features.
Maybe
Could you tell me where I can find the details about the feature extraction in Sphinx? I tried but didn't find them.
In source code
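For what it's worth, here is a rough sketch of why frame-based times can look shifted against sample-accurate labels. It is not taken from the Sphinx source. It assumes 100 frames per second and a 25.625 ms analysis window, which I believe are the defaults for -frate and -wlen, so check your own configuration. The function names are made up for illustration.

```python
def frame_span(frame_index, frame_rate=100, window_sec=0.025625):
    """Return (start, end) in seconds of the audio covered by one frame.

    Assumption: frame k starts at k / frame_rate and its analysis window
    extends window_sec past that start.
    """
    start = frame_index / frame_rate
    return start, start + window_sec


def label_to_frame(time_sec, frame_rate=100):
    """Return the index of the frame whose hop interval contains time_sec."""
    return int(time_sec * frame_rate)


if __name__ == "__main__":
    # A label at 0.537 s can only be reported at 10 ms granularity.
    frame = label_to_frame(0.537)
    print(frame, frame_span(frame))
```

Under these assumptions a boundary is quantized to a 10 ms grid, and each frame also covers audio beyond its nominal start, so a small constant-looking offset against hand labels would not be surprising.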
pocketsphinx_batch uses LiveCMN; pocketsphinx_continuous uses continuous CMN, whose initial estimate is not always accurate.
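To illustrate the general idea only (this is not Sphinx's actual update rule, and the function names and the weight are made up), here is a small sketch of why a cepstral mean estimated incrementally from an initial guess can give different features than a mean computed over the whole utterance. Each frame is reduced to a single coefficient to keep it short.

```python
def whole_utterance_cmn(frames):
    """Subtract the mean computed over all frames of the utterance."""
    mean = sum(frames) / len(frames)
    return [x - mean for x in frames]


def running_cmn(frames, initial_mean, weight=0.9):
    """Subtract a mean that starts from a guess and is updated per frame.

    The exponential update below is an illustrative assumption, not the
    rule used by any particular decoder.
    """
    mean = initial_mean
    normalized = []
    for x in frames:
        normalized.append(x - mean)
        mean = weight * mean + (1.0 - weight) * x
    return normalized


if __name__ == "__main__":
    frames = [1.0, 2.0, 3.0]
    print(whole_utterance_cmn(frames))
    print(running_cmn(frames, initial_mean=0.0))
```

If the initial guess is far from the true mean, the first frames are normalized poorly, which is one way two decoders can disagree on the same audio with otherwise identical parameters.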
The -hypseg option writes a file with word times.
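A minimal sketch of reading such a file, under assumptions you should verify against your own output: each line is taken to look like `uttid S <scale> T <total> A <acoustic> L <lm>`, followed by repeated `<start_frame> <acoustic> <lm> <word>` groups and a final end frame, with 100 frames per second. The sample line is invented for illustration, and `parse_hypseg_line` is a made-up helper, not part of pocketsphinx.

```python
def parse_hypseg_line(line, frame_rate=100):
    """Parse one hypseg line into (utt_id, [(word, start_sec, end_sec), ...]).

    Assumed layout (check against your own file):
      uttid S scale T total A ascr L lscr {sf ascr lscr word}... ef
    """
    tokens = line.split()
    utt_id = tokens[0]
    body = tokens[9:]           # skip uttid and the four header key/value pairs
    end_frame = int(body[-1])   # final token: last frame of the utterance
    entries = body[:-1]
    if len(entries) % 4 != 0:
        raise ValueError("unexpected hypseg layout: " + line)

    starts = [int(entries[i]) for i in range(0, len(entries), 4)]
    words = [entries[i + 3] for i in range(0, len(entries), 4)]
    # Assumption: each word ends where the next one starts; the last word
    # ends at the final frame.
    ends = starts[1:] + [end_frame]
    segments = [(w, s / frame_rate, e / frame_rate)
                for w, s, e in zip(words, starts, ends)]
    return utt_id, segments


if __name__ == "__main__":
    # Invented sample line, not real decoder output.
    sample = ("utt1 S 0 T -1000 A -900 L -100 "
              "0 -300 -30 <s> 12 -400 -40 hello 57 -200 -30 </s> 80")
    print(parse_hypseg_line(sample))
```

Times come out in seconds at frame resolution; if your frame rate differs, pass it as `frame_rate`.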
Thank you.
Does anyone else have the same problem as mine?