Hello,
It is said here https://sourceforge.net/p/kaldi/discussion/1355348/thread/052a15fd/
that
"Normalizing the amplitude of the input is important- at least to have it in
the right range"
But since we use CMVN, why would any other kind of normalization be required at all? Isn't the energy coefficient normalized to fix this issue?
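(For reference, per-utterance CMVN amounts to subtracting the mean and dividing by the standard deviation of each cepstral coefficient over the utterance. A minimal numpy sketch, with the function name and epsilon chosen for illustration:)

```python
import numpy as np

def apply_cmvn(feats):
    """Per-utterance cepstral mean and variance normalization.

    feats: (num_frames, num_ceps) array of e.g. MFCC features.
    Subtracting the per-coefficient mean removes any constant
    cepstral offset (including the C0/energy term); dividing by
    the standard deviation normalizes the dynamic range.
    """
    mean = feats.mean(axis=0)
    std = feats.std(axis=0)
    # Guard against zero variance in a coefficient.
    return (feats - mean) / np.maximum(std, 1e-10)
```

Since a constant gain change shifts the log-energy term by a constant, CMVN does remove that particular effect, which is what motivates the question.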
That thread was about the online-nnet2 setup, and it does not use CMVN
(because that is tricky to implement online), but instead relies on
the iVectors to represent the cepstral offset and the nnet training to
use the iVector appropriately. However, if the training data was too
carefully normalized, or its levels varied only within a narrow range,
the network can fail to learn full invariance to volume differences.
In some more recent recipes we have started perturbing the volume of
the training data to help it learn this better.
Dan
On Wed, Jun 24, 2015 at 7:27 AM, dovark dovark@users.sf.net wrote:
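(The volume perturbation Dan describes amounts to rescaling each training waveform by a random gain factor. A minimal sketch, assuming a uniform scale range of 1/8 to 2, which is an illustrative choice rather than a statement of what any particular recipe uses:)

```python
import numpy as np

def perturb_volume(waveform, rng, low=0.125, high=2.0):
    """Randomly rescale a waveform's amplitude.

    waveform: 1-D float array of samples.
    rng: a numpy random Generator.
    Each utterance gets its own random gain factor, so the network
    sees a variety of signal levels during training and is pushed
    to learn volume invariance rather than rely on a fixed level.
    """
    factor = rng.uniform(low, high)
    return waveform * factor
```

Applied once per training utterance (rather than per frame), this leaves the spectral shape untouched and changes only the overall level.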