Hello, I trained a model with Sphinx, and it gives me around 13% WER in decoding. It also works well when I am 1 meter from the microphone. However, when I move farther away (3 meters), the accuracy drops dramatically.
I am using a very advanced microphone array, and I also use a VAD program.
Do you think I should include low-volume input data in training? Would that solve the problem, or at least help in theory? Or do you have any other suggestions?
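To make the idea concrete, here is a minimal sketch of what I mean by low-volume training data: making quieter copies of the existing recordings by scaling the samples. The function name `attenuate_pcm16`, the -20 dB value, and the sample values are only illustrative assumptions, not anything from Sphinx. It only changes the signal level, so it does not model room reverberation or extra noise at 3 meters.

```python
import array


def attenuate_pcm16(samples, gain_db):
    """Return a copy of 16-bit PCM samples scaled by gain_db decibels.

    A negative gain_db makes the audio quieter. This is a hypothetical
    helper for illustration, not part of any speech toolkit.
    """
    # Convert decibels to a linear amplitude factor.
    factor = 10.0 ** (gain_db / 20.0)
    out = array.array("h")
    for s in samples:
        v = int(round(s * factor))
        # Clip to the valid 16-bit range in case gain_db is positive.
        out.append(max(-32768, min(32767, v)))
    return out


# Made-up sample values standing in for a real recording.
original = array.array("h", [1000, -2000, 30000, -32768])

# -20 dB divides the amplitude by 10 (assumed value, chosen for illustration).
quiet = attenuate_pcm16(original, -20.0)
```

The quieter copies could then be added to the training set next to the original recordings.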
Thanks.
You should try Kaldi instead.