CMU Sphinx / Forums / Help: Pocketsphinx – Hypothesis changes for the same audio file

Oren G. - 2018-01-31

I read here (https://github.com/watsonbox/pocketsphinx-ruby/issues/10) about the issue where the decoder gives an incorrect hypothesis the first time it decodes a file, while the second and all later passes give correct hypotheses.

However, I've observed a more complex pattern, in which the decoder keeps fluctuating between two hypotheses.

To reproduce the issue – this is what I did:

1) I’ve made a change to the main function in the file continuous.c, so the same audio file will be decoded 20 times:

if (cmd_ln_str_r(config, "-infile") != NULL) {
    int i;
    for (i = 1; i <= 20; i++) {
        recognize_from_file();
    }
} …

2) I’ve created this grammar:

#JSGF V1.0;
grammar grammar1;
public <rule1> = (
    /1/ i like to play football |
    /50000000000/ they like to play football
);

3) I’ve run pocketsphinx_continuous to decode an audio file with that grammar (file attached):

pocketsphinx_continuous -hmm hmm -dict cmu.dict -jsgf 1.jsgf -infile 1.wav -logfn log.txt

The output is:

they like to play football
they like to play football
they like to play football
they like to play football
they like to play football
they like to play football
i like to play football
i like to play football
they like to play football
i like to play football
i like to play football
they like to play football
i like to play football
i like to play football
they like to play football
i like to play football
i like to play football
they like to play football
i like to play football
i like to play football

Log is attached. I would like to ask:
1. Why does it happen?
2. Can I alter this behavior?

Last edit: Oren G. 2018-01-31

1.jsgf

1.wav

log.txt
- Alex Rudnicky - 2018-01-31
  
  Judging from your log file, the CMN (cepstral mean normalization) vector is
  being recomputed from utterance to utterance.
  Eventually it settles into a 3-utterance loop, which is why the hypotheses
  alternate the way they do.
  
  This doesn't make sense, since you have '-cmn current' in your config; it's
  acting more like 'prior'.
  Look at the code and check that the CMN logic is correct...
  
  The 'first utterance is garbage' problem happens in 'prior' mode, when the
  default CMN values are just wrong for the utterance at hand.
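
To illustrate the distinction drawn above, here is a minimal sketch in C of the two styles of mean normalization. It is not the pocketsphinx implementation: the type `cmn_state_t`, the functions `cmn_batch` and `cmn_carried`, the single-dimension "feature", and the 50/50 update rule are all assumptions made for illustration. It only shows why a mean computed per utterance gives identical features on every pass over the same audio, while a mean carried over between utterances can give different features each time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-dimensional stand-in for a cepstral mean vector. */
typedef struct {
    float mean; /* estimate carried over from earlier utterances */
} cmn_state_t;

/* "current"-style: subtract this utterance's own mean.
 * No state survives the call, so repeated passes over the same
 * input always produce the same output. Returns the mean applied. */
float cmn_batch(const float *in, float *out, size_t n)
{
    float sum = 0.0f, mean;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += in[i];
    mean = sum / (float)n;
    for (i = 0; i < n; i++)
        out[i] = in[i] - mean;
    return mean;
}

/* "prior"-style: subtract the mean estimated BEFORE this utterance,
 * then fold this utterance's mean into the carried state.
 * The 50/50 blend is an assumed, simplified update rule.
 * Returns the mean that was applied to this utterance. */
float cmn_carried(cmn_state_t *st, const float *in, float *out, size_t n)
{
    float sum = 0.0f, applied = st->mean;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        sum += in[i];
        out[i] = in[i] - applied;
    }
    st->mean = 0.5f * (st->mean + sum / (float)n);
    return applied;
}
```

Calling `cmn_batch` twice on the same buffer yields the same normalized values both times. Calling `cmn_carried` twice with one shared `cmn_state_t` does not, because the second call subtracts a mean that the first call has already updated. This simplified update converges rather than cycling, so it does not reproduce the 3-utterance loop seen in the log; it only shows how carried-over state lets identical audio decode differently.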
  
  - Oren G. - 2018-02-01
    
    "Look at the code and check that the cmn logic is correct..."
    
    Note that none of this is my own code. It is just the pocketsphinx_continuous program, with the change that I described in my post.
    
    Last edit: Oren G. 2018-02-01
    