We are trying to evaluate feature normalization/enhancement algorithms, for example VTS, on CDHMM and DNN systems. We currently proceed as follows:
1. Generate VTS features in Matlab and save them in the format required by Kaldi
2. Build a CDHMM model using them
3. Build DNN models
4. Decode
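For step 1, a minimal sketch of the export might look like the following. This writes a feature matrix in Kaldi's text archive (ark,t) format, which `copy-feats` can then convert to a binary archive; the utterance ID, matrix values, and file name here are illustrative placeholders, not part of our actual pipeline.

```python
# Sketch: write per-utterance feature matrices in Kaldi's text
# archive (ark,t) format. The utterance ID, feature values, and
# output path are hypothetical, for illustration only.

def write_text_ark(path, utts):
    """utts: dict mapping utterance-id -> list of feature rows (lists of floats)."""
    with open(path, "w") as f:
        for utt_id, rows in utts.items():
            f.write(utt_id + "  [\n")
            for i, row in enumerate(rows):
                f.write("  " + " ".join("%.6f" % x for x in row))
                # Kaldi closes the matrix with " ]" on the last row.
                f.write(" ]\n" if i == len(rows) - 1 else "\n")

feats = {"utt1": [[0.1, 0.2], [0.3, 0.4]]}
write_text_ark("vts_feats.ark.txt", feats)
# Then convert to binary, e.g.:
#   copy-feats ark,t:vts_feats.ark.txt ark:vts_feats.ark
```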
Now, as I understand it, both the CDHMM and DNN model-building steps apply CMVN to the features before training, i.e., the features are now VTS+CMVN rather than VTS alone. We would like to disable CMVN for both CDHMM and DNN in order to study the performance of VTS alone. Could anyone please suggest the best way to disable the CMVN computation and get the required output?
We also notice that in RBM pretraining, CMVN is applied at every iteration. Is it OK to disable CMVN, at least in the first iteration, so that it uses our features directly? Is this the right approach?
Thanks in advance for your help.
Some of the training scripts take an option like --cmvn-opts "--norm-means=false", which can be used to switch off CMN (variance normalization is off by default). You should do this at the train_lda_mllt.sh stage.
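To make concrete what those flags control, here is a small pure-Python sketch of mean-only normalization (CMN) versus mean-and-variance normalization; it mirrors the semantics of --norm-means/--norm-vars but is not Kaldi's implementation, and it normalizes over a single matrix rather than per-speaker statistics.

```python
# Sketch of CMVN semantics (not Kaldi's code):
#   norm_means=True  subtracts the per-dimension mean (CMN);
#   norm_vars=True   additionally scales each dimension to unit variance.
# With both False, the features pass through unchanged.

def cmvn(rows, norm_means=True, norm_vars=False):
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    if norm_vars:
        var = [sum((r[d] - means[d]) ** 2 for r in rows) / n for d in range(dim)]
        stds = [v ** 0.5 if v > 0 else 1.0 for v in var]
    out = []
    for r in rows:
        row = list(r)
        if norm_means:
            row = [x - m for x, m in zip(row, means)]
        if norm_vars:
            row = [x / s for x, s in zip(row, stds)]
        out.append(row)
    return out

feats = [[1.0, 2.0], [3.0, 6.0]]
print(cmvn(feats))                    # mean removed per dimension
print(cmvn(feats, norm_means=False))  # passthrough: VTS features untouched
```

Disabling both normalizations (as the question asks) corresponds to the passthrough case, so the model trains on the VTS features as-is.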
The CMN in RBM pretraining may just be doing a global normalization of the
features, rather than per speaker; you'd have to check the script.
In any case, I would recommend leaving CMVN switched on. It tends to give better performance, and there is no reason to think this would no longer be true after applying VTS.
Dan