
Pocketsphinx – Hypothesis changes for the same audio file

Oren G.
2018-01-31
2018-02-01
  • Oren G.

    Oren G. - 2018-01-31

    I read here (https://github.com/watsonbox/pocketsphinx-ruby/issues/10) about the issue where the decoder gives an incorrect hypothesis the first time it decodes a file, and correct hypotheses from the second time on.

    However, I’ve observed a more complex pattern, in which the decoder fluctuates between two hyps over and over.

    To reproduce the issue – this is what I did:

    1) I’ve made a change to the main function in the file continuous.c, so the same audio file will be decoded 20 times:

    if (cmd_ln_str_r(config, "-infile") != NULL) {
        int i;
        /* Decode the same input file 20 times in a row. */
        for (i = 1; i <= 20; i++) {
            recognize_from_file();
        }
    } …
    

    2) I’ve created this grammar:

    #JSGF V1.0;
    grammar grammar1;
    public <rule1> = (
        /1/ i like to play football |
        /50000000000/ they like to play football
    );

    3) I’ve run pocketsphinx_continuous to decode an audio file with that grammar (file attached):

    pocketsphinx_continuous -hmm hmm -dict cmu.dict -jsgf 1.jsgf -infile 1.wav -logfn log.txt

    The output is:

    they like to play football
    they like to play football
    they like to play football
    they like to play football
    they like to play football
    they like to play football
    i like to play football
    i like to play football
    they like to play football
    i like to play football
    i like to play football
    they like to play football
    i like to play football
    i like to play football
    they like to play football
    i like to play football
    i like to play football
    they like to play football
    i like to play football
    i like to play football
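
The output alternates even though the two alternatives are weighted very unevenly in the grammar. As a rough illustration of how lopsided they are, the sketch below turns raw weights into prior probabilities by dividing each by the total. This normalization is an assumption made for illustration; the thread does not say how the decoder actually treats the weights, and `normalize_weights` is a hypothetical helper, not part of pocketsphinx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scale raw alternative weights so they sum to 1.
 * Assumption: weights are treated as relative likelihoods. */
void normalize_weights(const double *weights, size_t n, double *probs)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += weights[i];
    assert(total > 0.0);  /* at least one alternative must have weight */
    for (size_t i = 0; i < n; i++)
        probs[i] = weights[i] / total;
}
```

Under that assumption, weights of 1 and 50000000000 give the "i like to play football" alternative a prior of roughly 2e-11, so the acoustic scores have to differ considerably between runs for it to win.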

    Log is attached. I would like to ask:
    1. Why does it happen?
    2. Can I alter this behavior?

     

    Last edit: Oren G. 2018-01-31
    • Alex Rudnicky

      Alex Rudnicky - 2018-01-31

      Judging from your log file, the CMN vector is being updated from utterance
      to utterance.
      Eventually it settles into a 3-utterance loop, which is why you're seeing
      what you're seeing.

      This doesn't make sense, since you have '-cmn current' in your config; it's
      acting more like 'prior'.
      Look at the code and check that the CMN logic is correct...

      The 'first utterance is garbage' problem happens in prior mode, when the
      default CMN values are just wrong for the utterance at hand.
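
The difference between the two modes can be sketched with a toy cepstral mean normalization. This is a simplified illustration, not the sphinxbase implementation: `cmn_state`, `cmn_current`, `cmn_prior`, the two-coefficient feature size, and the `alpha` blending rule are all assumptions made for the example. The point is only that a per-utterance mean gives the same result every time for the same audio, while a mean carried across utterances makes the result depend on what was decoded before.

```c
#include <assert.h>
#include <stddef.h>

#define NDIM 2  /* toy feature size; real cepstral vectors are longer */

/* Hypothetical running state for the "prior"-style variant. */
typedef struct {
    double mean[NDIM];
} cmn_state;

/* "current"-style: subtract the utterance's own mean.
 * Nothing survives the call, so identical input gives identical output. */
void cmn_current(double feats[][NDIM], size_t nframes)
{
    assert(nframes > 0);
    for (size_t d = 0; d < NDIM; d++) {
        double sum = 0.0;
        for (size_t t = 0; t < nframes; t++)
            sum += feats[t][d];
        double mean = sum / nframes;
        for (size_t t = 0; t < nframes; t++)
            feats[t][d] -= mean;
    }
}

/* "prior"-style: subtract the mean inherited from earlier utterances,
 * then blend this utterance's mean into the state (assumed update rule). */
void cmn_prior(cmn_state *st, double feats[][NDIM], size_t nframes,
               double alpha)
{
    assert(nframes > 0);
    for (size_t d = 0; d < NDIM; d++) {
        double sum = 0.0;
        for (size_t t = 0; t < nframes; t++) {
            sum += feats[t][d];          /* accumulate the raw value */
            feats[t][d] -= st->mean[d];  /* normalize with the old mean */
        }
        st->mean[d] = (1.0 - alpha) * st->mean[d] + alpha * (sum / nframes);
    }
}
```

Feeding the same frames through `cmn_prior` twice with a poorly chosen initial mean produces different normalized features on each pass, which matches the 'first utterance is garbage' behavior. The sketch does not reproduce the 3-utterance cycle itself; with this blending rule and identical input the mean simply converges.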

       
      • Oren G.

        Oren G. - 2018-02-01

        Look at the code and check that the cmn logic is correct...

        Note that none of the code is mine. It's just the pocketsphinx_continuous program, with the change that I described in my post.

         

        Last edit: Oren G. 2018-02-01
