
Difference in decoding result

2017-02-21
2017-03-01
  • Tania Mendonca

    Tania Mendonca - 2017-02-21

    I built a hybrid model of words and syllables for OOV detection for the Kannada language.
    When I run the command sphinxtrain -s decode run, it runs pocketsphinx_batch and then creates the .align file, which maps the reference to the hypothesis.

    kannada.align file
    REF:
    ii *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** STHALLAVANNU NAVA BRUAMDAAVANA eamdu *** *** *** *** *** *** *** *** *** *** *** *** *** *** KAREYALAAGUTTADE (kannadatest_clstk-KAN_0717)

    HYP:
    ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ eamdu ^K A R E Y A L AA G U T T A D E$ (kannadatest_clstk-KAN_0717

    When I try to decode with pocketsphinx_continuous -infile wav/kannadatest_clstk/kan_0717.wav -hmm model_parameters/kannada2.cd_cont_200/ -lm etc/kannada2.lm.DMP -dict etc/kannada2.dic, the hypothesis I obtain is:

    HYP with pocketsphinx_continuous:

    ^S_II STHALLA V_A N_N_U$ N_A V_A ^G_R_A AM_TH_A VARGA EAMDU ^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$

    Whatever is marked in bold is wrong.
    Why is there a difference in the output when the same model is used?
    When I run pocketsphinx_continuous it gives me extra syllables, whereas those extra syllables aren't there in the .align file.

     

    Last edit: Tania Mendonca 2017-02-21
    • Arseniy Gorin

      Arseniy Gorin - 2017-02-21

      Not quite clear. Do you mean that kannada.align is OK? Then why is EAMDU wrong in the second case?

      I think your question should be reformulated.

      As for the difference, you can check in logdir/decode which parameters are used when you run sphinxtrain. When I test with your example, I get results somewhat closer to the first one with the '-lw 5' option. Not sure if that is what you want to achieve.
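One way to follow this suggestion is to compare the settings recorded in the two decoder logs. The sketch below assumes each log contains the "Current configuration:" table with [NAME], [DEFLT] and [VALUE] columns that PocketSphinx tools usually print at startup, and it takes the last field of each row as the effective value. The function names and the sample log text are hypothetical and for illustration only; they are not taken from the thread.

```python
def parse_config(log_text):
    """Return {flag: value} from the 'Current configuration' table of a log.

    Assumption: rows look like '-lw <default> <value>'; a row with only a
    flag name is treated as unset (None).
    """
    config = {}
    in_table = False
    for line in log_text.splitlines():
        if line.startswith("Current configuration"):
            in_table = True
            continue
        if not in_table:
            continue
        fields = line.split()
        if fields and fields[0] == "[NAME]":
            continue  # column header row
        if not fields or not fields[0].startswith("-"):
            if config:
                break  # first non-flag line after the rows ends the table
            continue
        config[fields[0]] = fields[-1] if len(fields) > 1 else None
    return config


def diff_configs(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Made-up log excerpts; in practice, read the real log files instead.
    batch_log = (
        "Current configuration:\n"
        "[NAME]\t[DEFLT]\t[VALUE]\n"
        "-lw\t6.5\t1.000000e+01\n"
        "-beam\t1e-48\t1.000000e-80\n"
        "\n"
    )
    continuous_log = (
        "Current configuration:\n"
        "[NAME]\t[DEFLT]\t[VALUE]\n"
        "-lw\t6.5\t6.500000e+00\n"
        "-beam\t1e-48\t1.000000e-48\n"
        "\n"
    )
    differences = diff_configs(parse_config(batch_log), parse_config(continuous_log))
    for flag, (batch_value, continuous_value) in differences.items():
        print(f"{flag}: batch={batch_value} continuous={continuous_value}")
```

Any flag reported here is a candidate to pass explicitly to pocketsphinx_continuous so that both runs use the same settings.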

       
      • Tania Mendonca

        Tania Mendonca - 2017-02-22

        When it is run in batch mode, the hypothesis is right.
        But when I run pocketsphinx_continuous with -lw 5, I still get extra syllables which aren't there in my batch output.

        Actual transcription:

        ii STHALLAVANNU NAVA BRUAMDAAVANA eamdu KAREYALAAGUTTADE (kannadatest_clstk-KAN_0717)

        Batch mode output:

        ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ eamdu ^K A R E Y A L AA G U T T A D E$ (kannadatest_clstk-KAN_0717

        pocketsphinx_continuous output with language weight 5:

        ^S_II STHALLA V_A N_N_U$ N_A V_A V_RU AM_DD_A V_A R_A EAMDU ^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$

        In batch mode I have just one syllable which is misrecognized, as shown in bold,
        whereas when I decode with pocketsphinx_continuous I get extra syllables at the beginning and end that it is trying to extract from the audio file.
        I am not able to understand why the outputs are different in batch and continuous mode.

        Judging by the output, batch mode recognition is better than continuous.
        So how can I achieve the same recognition in pocketsphinx_continuous?
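Because the bold markup did not survive in the plain text, the differences between the two outputs can instead be located programmatically. The sketch below is only an illustration: it assumes that `^` and `$` mark word boundaries and that `_` joins sub-units, which is how the outputs above appear, and the helper names are hypothetical. Whole words such as STHALLA are not decomposed into syllables, so the comparison is coarse.

```python
import difflib


def to_units(hyp):
    """Split a hypothesis string into upper-cased units without ^, $ or _."""
    units = []
    for token in hyp.split():
        if token.startswith("("):
            continue  # skip the trailing utterance id
        for unit in token.split("_"):
            unit = unit.strip("^$").upper()
            if unit:
                units.append(unit)
    return units


def diff_units(a, b):
    """Return (tag, units_from_a, units_from_b) for every non-matching span."""
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Hypotheses copied from the post above.
    batch = (
        "ii ^S TH A LL A V A N N U$ N A V A V RU AM G AA V A N A$ eamdu "
        "^K A R E Y A L AA G U T T A D E$ (kannadatest_clstk-KAN_0717"
    )
    continuous = (
        "^S_II STHALLA V_A N_N_U$ N_A V_A V_RU AM_DD_A V_A R_A EAMDU "
        "^K_A R_E Y_A L_AA G_U T_T_A D_E$ S$"
    )
    for tag, left, right in diff_units(to_units(batch), to_units(continuous)):
        print(tag, left, right)
```

Each printed entry is an insert, delete, or replace span, which makes the extra units at the beginning and end of the continuous output easy to spot.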

         

        Last edit: Tania Mendonca 2017-02-22
        • Arseniy Gorin

          Arseniy Gorin - 2017-02-22

          Try checking logdir/decode from your training for the right parameters, or post it here. In general, the results can be a bit different, I believe.

           
          • Tania Mendonca

            Tania Mendonca - 2017-02-23

            I tried using the Sphinx4 decoder, where I achieved the same results as pocketsphinx batch mode, with the OOVs replaced by syllables. Just as the right parameters can be found in logdir/decode, how can I check the parameters chosen by the Sphinx4 decoder? The only inputs I give to the Sphinx4 decoder are the acoustic model path, the dictionary, and the language model.

             

            Last edit: Tania Mendonca 2017-02-23
          • Tania Mendonca

            Tania Mendonca - 2017-03-01
             
