Setting cmn to 'none'

Help
2017-07-24
  • Dino The Dinosaur

    Hello!
    I want to recognize my wav files without cepstral mean normalization, but apparently, instead of setting the value to none, the decoder is configured with -cmn 'batch'. How do I disable cmn?

    The question came up while I was investigating why the output differs between recognition passes over the same files. Even if I configure a separate decoder for each audio file, the results still differ. What could cause this randomness, besides cmn?

     

    Last edit: Dino The Dinosaur 2017-07-24
    • Nickolay V. Shmyrev

      I want to recognize my wav files without cepstral mean normalization, but apparently, instead of setting the value to none, the decoder is configured with -cmn 'batch'. How do I disable cmn?

      You can probably remove the corresponding line from feat.params in the acoustic model.
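
A minimal sketch of the feat.params edit suggested here, assuming the file holds one whitespace-separated `-key value` pair per line. The helper name `set_feat_param` and the sample contents are hypothetical. Whether the decoder treats a missing -cmn line as "no normalization" or falls back to some default is not confirmed in this thread, so the sketch also shows writing the value 'none' explicitly.

```python
def set_feat_param(text, key, value=None):
    """Return feat.params text with `key` set to `value`.

    If `value` is None, the line for `key` is dropped instead.
    """
    out = []
    replaced = False
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == key:
            if value is not None:
                out.append(f"{key} {value}")
                replaced = True
            continue  # skip the original line for this key
        out.append(line)
    if value is not None and not replaced:
        out.append(f"{key} {value}")  # key was absent: append it
    return "\n".join(out) + "\n"


# Hypothetical feat.params contents, for illustration only.
sample = "-lowerf 130\n-upperf 6800\n-cmn batch\n-varnorm no\n"

print(set_feat_param(sample, "-cmn"))          # line removed
print(set_feat_param(sample, "-cmn", "none"))  # value overridden
```

Work on a copy of the acoustic model directory so the original feat.params stays available for comparison.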

      The question came up while I was investigating why the output differs between recognition passes over the same files. Even if I configure a separate decoder for each audio file, the results still differ. What could cause this randomness, besides cmn?

      Maybe you have dither enabled. It is disabled by default though.
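
To illustrate why dither would produce run-to-run differences, here is a toy sketch. It is not the decoder's actual dithering code; the `dither` function and the sample values are assumptions made for illustration. The point is only that noise drawn from an unseeded random generator changes the input on every run, while a fixed seed makes it repeatable.

```python
import random


def dither(samples, rng):
    """Add noise of at most one unit to each sample (toy model of dithering)."""
    return [s + rng.choice((-1, 0, 1)) for s in samples]


samples = [0, 120, -340, 560, -80]

# Same seed: the "dithered" signal is identical on every run.
a = dither(samples, random.Random(1))
b = dither(samples, random.Random(1))
print(a == b)

# No seed: the noise, and hence the signal, can differ between runs.
c = dither(samples, random.Random())
print(c)
```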

       
      • Dino The Dinosaur

        Thank you! Will try it out :)

         
      • Dino The Dinosaur

        I see now, as was argued before, that cmn helps recognition significantly and that recognition without it is poor.
        But suppose I wanted to fix some cmn values and recognize all of the audio with those values: would that be possible? Logically, it should work in Python if I configured a new decoder for every utterance and used cmn 'batch' mode. I tried that, but my results still vary. What could be the cause? The -dither parameter is disabled, as you mentioned.

         
        • Nickolay V. Shmyrev

          But suppose I wanted to fix some cmn values and recognize all of the audio with those values: would that be possible?

          You have to modify the code for that.
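
The difference between the two behaviors discussed here can be sketched in plain Python. This is an illustration of the arithmetic only, not the decoder's implementation; `batch_cmn`, `fixed_cmn`, and the two-dimensional toy frames are assumptions. Batch normalization subtracts the mean computed from the utterance itself, while a fixed-value variant would subtract the same externally supplied vector from every utterance.

```python
def batch_cmn(frames):
    """Subtract the per-utterance mean from every cepstral frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def fixed_cmn(frames, mean):
    """Subtract a fixed, externally supplied mean from every frame."""
    return [[x - m for x, m in zip(f, mean)] for f in frames]


# Toy "utterance": three frames of two cepstral coefficients each.
utt = [[10.0, 2.0], [12.0, 4.0], [14.0, 6.0]]

print(batch_cmn(utt))                # mean [12.0, 4.0] is derived from utt
print(fixed_cmn(utt, [11.0, 3.0]))   # mean is chosen once and reused
```

In this sketch the batch mean is a deterministic function of the frames, so the same audio always yields the same normalized features. If that also holds in the decoder, batch mode by itself would not explain results that vary between runs on identical files.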

           
          • Dino The Dinosaur

            You mean the source code?

             
          • Dino The Dinosaur

            And could you please also comment on this?

            Logically, it should work in Python if I configured a new decoder for every utterance and used cmn 'batch' mode. I tried that, but my results still vary. What could be the cause?

             

            Last edit: Dino The Dinosaur 2017-07-26
