Hello!
I want to recognize my WAV files without cepstral mean normalization (CMN), but apparently, instead of the value being set to none, the decoder is configured with -cmn 'batch'. How do I disable CMN?
The question comes from investigating why the same files produce different outputs from one recognition to the next. Even if I configure a separate decoder for each audio file, the results still differ. What could cause this randomness, besides CMN?
Last edit: Dino The Dinosaur 2017-07-24
I want to recognize my WAV files without cepstral mean normalization (CMN), but apparently, instead of the value being set to none, the decoder is configured with -cmn 'batch'. How do I disable CMN?
You can probably remove the corresponding line from feat.params in the acoustic model.
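A minimal sketch of the feat.params edit suggested above, assuming the file holds one "-option value" pair per line. The helper name set_option and the sample contents are hypothetical, not taken from any real model. Note that with the line removed, the decoder presumably falls back to its own built-in default for -cmn, which may still not be none, so the sketch can also write an explicit value instead (whether 'none' is accepted by your build is an assumption to check).

```python
def set_option(feat_params_text, option, value=None):
    """Return feat.params text with `option` removed (value=None)
    or set to `value`. Hypothetical helper for illustration only."""
    kept = []
    found = False
    for line in feat_params_text.splitlines():
        fields = line.split()
        if fields and fields[0] == option:
            found = True
            if value is not None:
                # Replace the existing setting in place.
                kept.append(f"{option} {value}")
            continue
        kept.append(line)
    if value is not None and not found:
        # Option was absent, so append it.
        kept.append(f"{option} {value}")
    return "\n".join(kept) + "\n"


# Made-up sample contents, not a real model's feat.params.
sample = "-lowerf 130\n-cmn batch\n-varnorm no\n"
print(set_option(sample, "-cmn"))          # drops the -cmn line
print(set_option(sample, "-cmn", "none"))  # sets -cmn explicitly
```

Work on a copy of the acoustic model directory so the original feat.params stays available for comparison.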
The question comes from investigating why the same files produce different outputs from one recognition to the next. Even if I configure a separate decoder for each audio file, the results still differ. What could cause this randomness, besides CMN?
Maybe you have dither enabled. It is disabled by default, though.
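To show why dither would explain run-to-run differences, here is a generic sketch of dithering: a small random value is added to each audio sample, so the input differs between runs unless the random generator is seeded. This is not the decoder's actual dither code; the function name and noise values are assumptions for illustration.

```python
import random


def add_dither(samples, rng):
    """Add a small random offset to every sample (generic illustration)."""
    return [s + rng.choice((-1, 0, 1)) for s in samples]


samples = [100, -250, 4000, 0, 37]

# Same seed: the "random" offsets repeat, so the input is reproducible.
run_a = add_dither(samples, random.Random(1234))
run_b = add_dither(samples, random.Random(1234))
print(run_a == run_b)

# Unseeded generator: the offsets can change from run to run.
run_c = add_dither(samples, random.Random())
print(run_c)
```

With dither disabled, this step is skipped entirely, so it cannot be the source of the variation.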
I see that, as was argued before, CMN helps recognition significantly, and recognition without CMN is poor.
But if, for instance, I wanted to fix some CMN values and recognize all of the audio with those values, would that be possible? Logically, it should work in Python if I configured a new decoder before recognizing each utterance and used -cmn 'batch' mode. I tried that, but my results still vary. What could be the cause? The -dither parameter is disabled, as you mentioned.
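To make the distinction in this question concrete, here is a small sketch contrasting per-utterance (batch) mean subtraction with subtracting one fixed mean from every utterance. It uses plain lists of made-up two-dimensional "cepstral" frames and is only an illustration of the idea, not the decoder's implementation; all names and numbers are assumptions.

```python
def column_means(frames):
    """Mean of each coefficient over all frames of one utterance."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def subtract_mean(frames, mean):
    """Subtract the given mean vector from every frame."""
    return [[x - m for x, m in zip(frame, mean)] for frame in frames]


def batch_cmn(frames):
    """Batch mode: the mean is estimated from the utterance itself."""
    return subtract_mean(frames, column_means(frames))


# Two made-up utterances that differ only by a constant offset.
utt_a = [[10.0, 1.0], [12.0, 3.0]]
utt_b = [[20.0, 1.0], [22.0, 3.0]]

# Batch mode removes each utterance's own offset.
print(batch_cmn(utt_a), batch_cmn(utt_b))

# Fixed mode applies the same hypothetical mean to every utterance.
fixed_mean = [11.0, 2.0]
print(subtract_mean(utt_a, fixed_mean), subtract_mean(utt_b, fixed_mean))
```

In this sketch the batch result depends only on the frames of the utterance being normalized, so batch mode by itself gives the same output for the same input every time.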
Logically, it should work in Python if I configured a new decoder before recognizing each utterance and used -cmn 'batch' mode. I tried that, but my results still vary. What could be the cause?
Last edit: Dino The Dinosaur 2017-07-26
Thank you! Will try it out :)
You have to modify the code for that.
You mean the source code?
And could you please also verify these words?
Last edit: Dino The Dinosaur 2017-07-26