If I have clean training data with low amplitude (average speech amplitude <= 0.02), will the models work well on test data with twice the average speech amplitude or more?
Hi,
If I have clean training data with low amplitude (average speech amplitude <= 0.02), will the models work well on test data with twice the average speech amplitude or more?
Here are the amplitude distribution plots for two wav files.
https://docs.google.com/file/d/0BzNdoGke8tyRQmY1M2MxRnl2WjA/edit?usp=sharing
The file above shows the amplitude distribution for an audio file from the training data (WSJ1 database). The maximum amplitude is ~0.05. The average speech amplitude is <= 0.02 (I didn't calculate it with silence frames removed; it is just a guess from the figure).
The image below shows the amplitude histogram for my test file. The maximum amplitude is 0.08 and the average speech amplitude is >= 0.04.
https://docs.google.com/file/d/0BzNdoGke8tyReE1acGpMWFM4c1U/edit?usp=sharing
If the training data amplitude is this low, can I just amplify the data by X dB and retrain the models?
Regarding the plots, the Y axis is frequency of occurrence and the X axis is the magnitude of the amplitude.
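For reference, the "amplify by X dB" step asked about above could be sketched roughly as follows. This is only an illustration, assuming the samples are floating-point values normalized to [-1, 1]; the function name `apply_gain_db`, the 6 dB value, and the synthetic test tone are made up for the example and are not from any particular toolkit.

```python
import numpy as np

def apply_gain_db(samples, gain_db):
    """Scale samples (assumed normalized to [-1, 1]) by gain_db decibels.

    Returns the scaled samples and the number of samples that would
    exceed full scale (and therefore clip) after the gain is applied.
    """
    factor = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude ratio
    scaled = np.asarray(samples, dtype=np.float64) * factor
    n_clipped = int(np.count_nonzero(np.abs(scaled) > 1.0))
    return scaled, n_clipped

# Hypothetical stand-in for a quiet training file: a tone with peak 0.05.
t = np.arange(16000) / 16000.0
quiet = 0.05 * np.sin(2 * np.pi * 440.0 * t)

# +6 dB roughly doubles the amplitude.
louder, n_clipped = apply_gain_db(quiet, 6.0)
print(np.max(np.abs(quiet)), np.max(np.abs(louder)), n_clipped)
```

Counting the samples that would exceed full scale is a cheap way to confirm that the chosen gain does not itself introduce clipping.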
Amplitude (the first cepstral coefficient) is actually normalized during CMN (cepstral mean normalization). So the accuracy will be the same unless the amplitude is too high, which causes clipping, or too low, which causes precision loss.
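The point about CMN can be checked numerically with a small sketch. It assumes a simplified cepstrum (log power spectrum followed by a DCT-II, with no mel filterbank), which is not the exact front end of any specific recognizer; the helper names `cepstra` and `cmn`, the frame sizes, and the random test signal are all made up for illustration.

```python
import numpy as np

def cepstra(signal, frame_len=400, hop=160, n_ceps=13):
    """Simplified cepstra: log power spectrum of each frame, then DCT-II."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_power = np.log(np.maximum(power, 1e-30))  # floor avoids log(0)
    n_bins = log_power.shape[1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bins)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bins))  # DCT-II basis
    return log_power @ dct.T

def cmn(ceps):
    """Cepstral mean normalization: subtract the per-coefficient mean."""
    return ceps - ceps.mean(axis=0, keepdims=True)

# Hypothetical signals: the same noise at two gain levels.
rng = np.random.default_rng(0)
quiet = 0.02 * rng.standard_normal(16000)
loud = 2.0 * quiet

c_quiet, c_loud = cepstra(quiet), cepstra(loud)

# The gain only adds a constant to the first coefficient (c0) ...
c0_shift = c_loud[:, 0] - c_quiet[:, 0]
# ... and CMN removes that constant, so the features match.
max_diff_after_cmn = np.max(np.abs(cmn(c_loud) - cmn(c_quiet)))
print(c0_shift.min(), c0_shift.max(), max_diff_after_cmn)
```

In this simplified setting, a constant gain adds the same offset to every log-spectrum bin, so only the first coefficient changes, and subtracting the mean removes that offset. The sketch does not model clipping or quantization, which are the cases where amplitude does matter.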
Thanks Nickolay. This is helpful for my case as well. Are there actual values you would recommend as thresholds for "too high" and "too low"? In your experience, how would you quantify these?