I built a hybrid model of words and syllables for OOV detection for the Kannada language.
When I run the command sphinxtrain -s decode run, it runs pocketsphinx_batch and then creates the .align file, which has the mapping between the reference and the hypothesis.
kannada.align file
REF:
ii *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** STHALLAVANNU NAVA BRUAMDAAVANA eamdu *** *** *** *** *** *** *** *** *** *** *** *** *** *** KAREYALAAGUTTADE (kannadatest_clstk-KAN_0717)
HYP:
ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ eamdu ^K A R E Y A L AA G U T T A D E$ (kannadatest_clstk-KAN_0717
When I try to decode with pocketsphinx_continuous -infile wav/kannadatest_clstk/kan_0717.wav -hmm model_parameters/kannada2.cd_cont_200/ -lm etc/kannada2.lm.DMP -dict etc/kannada2.dic the hypothesis I obtain is different.
HYP with pocketsphinx_continuous:
^S_II STHALLA V_A N_N_U$ N_A V_A ^G_R_A AM_TH_A VARGA EAMDU ^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$
Whatever is marked in bold is wrong.
Why is there a difference in the output when the same model is used?
When I run pocketsphinx_continuous it gives me extra syllables, whereas in the .align file those extra syllables aren't there.
Last edit: Tania Mendonca 2017-02-21
Not quite clear. Do you mean that kannada.align is OK? Then why is EAMDU wrong in the second case?
I think your question should be reformulated.
As for the difference, you can check which parameters are used when you run sphinxtrain in logdir/decode. When I test with your example, I get results somewhat closer to the first one with the '-lw 5' option. Not sure if that is what you want to achieve.
When it runs in batch mode the hypothesis is right.
But when I run pocketsphinx_continuous with -lw 5 I still get extra syllables which aren't there in my batch output.
Actual transcription:
ii STHALLAVANNU NAVA BRUAMDAAVANA eamdu KAREYALAAGUTTADE (kannadatest_clstk-KAN_0717)
Batch mode output:
ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ eamdu ^K A R E Y A L AA G U T T A D E$ (kannadatest_clstk-KAN_0717
pocketsphinx_continuous output with language weight 5:
^S_II STHALLA V_A N_N_U$ N_A V_A V_RU AM_DD_A V_A R_A EAMDU ^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$
In batch mode I have just one syllable which is misrecognized, as shown in bold.
When I try to decode with pocketsphinx_continuous I get extra syllables at the beginning and end that it is trying to get from the audio file.
I am not able to understand why the outputs are different in batch and continuous mode.
According to the output seen, the batch mode recognition is better than the continuous one.
So how can I achieve the same recognition in pocketsphinx_continuous?
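To see exactly where two hypotheses diverge without relying on bold markup, a small token-level diff may help. The following is only a sketch: the function names are made up for illustration, and it assumes that an underscore in the pocketsphinx_continuous output joins the same units that the batch output separates with spaces, and that case does not matter.

```python
import difflib


def units(hyp):
    """Split a hypothesis into upper-cased units.

    Assumption: '_' joins units that the other output separates with spaces.
    """
    return hyp.replace("_", " ").upper().split()


def diff_units(hyp_a, hyp_b):
    """Return (tag, units_from_a, units_from_b) for every non-matching span."""
    a, b = units(hyp_a), units(hyp_b)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Hypotheses copied from this thread, without the utterance ID.
    batch = ("ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ "
             "eamdu ^K A R E Y A L AA G U T T A D E$")
    continuous = ("^S_II STHALLA V_A N_N_U$ N_A V_A V_RU AM_DD_A V_A R_A "
                  "EAMDU ^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$")
    for tag, left, right in diff_units(batch, continuous):
        print(tag, left, right)
```

Each printed row is a span that was replaced, inserted, or deleted going from the batch output to the continuous output.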
Last edit: Tania Mendonca 2017-02-22
Try checking logdir/decode from your training for the right parameters, or post it here. In general the result can be a bit different, I believe.
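As a rough sketch of that check, the Python below pulls the flag values out of the "Current configuration:" table that pocketsphinx writes to its log and lists the flags whose values differ between two logs, for example the batch log under logdir/decode and a log captured from pocketsphinx_continuous by redirecting its stderr. The function names and the log file names are hypothetical, and the parser assumes a whitespace-separated table in which the last column of each flag row is the value in effect; adjust it if your logs look different.

```python
import sys


def parse_config(log_text):
    """Return {flag: value} from the first 'Current configuration:' table.

    Assumption: each row is '-flag [default] [value]' separated by whitespace,
    so the last field is the value in effect (empty if the row has one field).
    """
    params = {}
    in_table = False
    for line in log_text.splitlines():
        fields = line.split()
        if "Current configuration:" in line:
            in_table = True
            continue
        if not in_table or not fields or fields[0] == "[NAME]":
            continue
        if not fields[0].startswith("-"):
            break  # first non-flag line after the table ends it
        params[fields[0]] = fields[-1] if len(fields) > 1 else ""
    return params


def diff_configs(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags whose values differ."""
    return {
        flag: (a.get(flag), b.get(flag))
        for flag in sorted(set(a) | set(b))
        if a.get(flag) != b.get(flag)
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file names): python compare_cfg.py batch.log continuous.log
    configs = []
    for path in sys.argv[1:3]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            configs.append(parse_config(handle.read()))
    for flag, (batch_value, cont_value) in diff_configs(*configs).items():
        print(f"{flag}: batch={batch_value} continuous={cont_value}")
```

Any flag it prints is a candidate to pass explicitly to pocketsphinx_continuous so that both decoders run with the same settings.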
I tried using the Sphinx4 decoder, where I achieved the same results as pocketsphinx_batch, with the OOVs replaced by syllables. Just as we can find the right parameters in logdir/decode, how can I check the parameters chosen by the Sphinx4 decoder? The only inputs I give the Sphinx4 decoder are the acoustic model path, the dictionary, and the language model.
Last edit: Tania Mendonca 2017-02-23