What is the best way to do speech recognition in a noisy environment? a) Is it to use an
acoustic model trained on clean speech and do noise filtering on the incoming speech, or b) to train
the acoustic models on noisy speech itself? For b), how can you ensure that all kinds of noise seen
in practice are covered in training?
Thanks,
Li
a) Is it to use an acoustic model trained on clean speech and do noise filtering on the incoming speech, or b) to train the acoustic models on noisy speech itself?
With conventional training, the trained model has no model of the noise, so training on noisy speech will actually hurt accuracy. If your model only models speech, it is preferable to train on clean speech.
It is possible to have a model that trains both noise parameters and speech parameters; such a model could be trained on noisy speech. CDCN training is an example of such a framework.
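To make "noise parameters and speech parameters" a bit more concrete, here is a sketch of the usual environment model with a linear channel plus additive noise. As far as I understand it, this is the kind of model CDCN builds on, but treat the notation as my assumption: x, n, q and z stand for the cepstra of the clean speech, the noise, the channel and the observed noisy speech, and P denotes a power spectrum.

```latex
\begin{align}
P_z(\omega) &= P_x(\omega)\,|H(\omega)|^2 + P_n(\omega) \\
\ln P_z(\omega) &= \ln P_x(\omega) + \ln |H(\omega)|^2
  + \ln\!\left(1 + e^{\,\ln P_n(\omega) - \ln P_x(\omega) - \ln |H(\omega)|^2}\right) \\
z &= x + q + r(x, n, q), \qquad
r(x, n, q) = \mathrm{IDFT}\!\left\{\ln\!\left(1 + e^{\,\mathrm{DFT}[\,n - q - x\,]}\right)\right\}
\end{align}
```

The speech parameters describe the distribution of x, while n and q are the environment parameters estimated alongside them. The correction term r depends on x itself, so the effect of noise is not a fixed offset that a speech-only model could absorb.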
b) How can you ensure that all kinds of noise seen in practice are
covered in training?
It's hard to have a model for all kinds of noise; that's why this method is not recommended.
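To illustrate why coverage is hard, here is a minimal sketch of how noisy training data is typically simulated: a noise recording is scaled to a target signal-to-noise ratio and added to clean speech. The noise type names, the SNR levels and the helper `mix_at_snr` are hypothetical and chosen only for illustration; the "speech" is a stand-in sine tone, not real audio.

```python
import itertools
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent")
    # SNR_dB = 20*log10(rms(speech) / rms(gain*noise)), solved for gain.
    gain = rms(speech) / (noise_rms * 10.0 ** (snr_db / 20.0))
    return [s + gain * n for s, n in zip(speech, noise)]


# Hypothetical condition grid: every noise type is paired with every SNR level.
noise_types = ["car", "babble", "street", "fan"]
snr_levels_db = [0, 5, 10, 15, 20]
conditions = list(itertools.product(noise_types, snr_levels_db))
print(len(conditions), "training conditions for only", len(noise_types), "noise types")

rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]  # stand-in tone
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]  # stand-in white noise
noisy = mix_at_snr(speech, noise, snr_db=10)
```

Even this toy grid multiplies quickly, and a noise type that is missing from the list is simply never seen in training.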
a) For speech recognition using acoustic models trained on clean speech, which noise filtering algorithms work well on incoming audio without introducing nonlinear distortions that would hurt accuracy?
b) Will a microphone array help?
Li
a) For speech recognition using acoustic models trained on clean speech, which noise filtering algorithms work well on incoming audio without introducing nonlinear distortions that would hurt accuracy?
There is no silver bullet for noisy speech recognition. You can find detailed information on robust speech recognition here:
http://books.google.ru/books?id=EwyqfWv24l8C
I sincerely recommend this book from the CMUSphinx team.
Microphone arrays help with signal source separation. I'm not sure whether that applies to your particular situation.
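As a rough illustration of what an array can do, here is a minimal delay-and-sum sketch, one of the simplest array techniques. It is a toy under stated assumptions, not something taken from CMUSphinx: the per-microphone delays are assumed to be known whole numbers of samples (in practice they have to be estimated), the noise at each microphone is assumed independent, and the function name `delay_and_sum` is my own.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def delay_and_sum(channels, delays):
    """Undo each channel's arrival delay (in whole samples) and average the channels."""
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    if usable <= 0:
        raise ValueError("channels are too short for the given delays")
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(usable)
    ]


# Simulated example: one source reaches four microphones with different delays,
# and each microphone adds its own independent noise.
rng = random.Random(0)
source = [math.sin(2 * math.pi * 300 * t / 16000) for t in range(2000)]
delays = [0, 3, 6, 9]  # hypothetical arrival delays in samples
channels = [
    [0.0] * d + [s + rng.gauss(0.0, 0.5) for s in source]
    for d in delays
]

enhanced = delay_and_sum(channels, delays)
single_err = rms([c - s for c, s in zip(channels[0], source)])
array_err = rms([e - s for e, s in zip(enhanced, source)])
print("noise RMS, one microphone:", round(single_err, 3))
print("noise RMS, delay-and-sum: ", round(array_err, 3))
```

Averaging aligned channels reinforces the source from the chosen direction while independent noise partly cancels. Sound from other directions is misaligned by the same delays and is attenuated, which is where the separation effect comes from; how much it helps depends on the room and the geometry of the array.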