Hello,
I would like to ask for help with the following problem. On some systems it is possible to set the microphone gain applied to the input audio samples, and I have read that this can affect recognition accuracy. What I could not find is a method for computing the gain so that the loss in accuracy is minimized. I am interested in general approaches.

From what I have read, gain can be applied at several levels (at the feature level, in the spectral domain), but I am interested in one that operates in the time domain and applies the gain directly to the audio samples. I have read that the gain is computed from a desired "target level" or "desired amplitude", and the term "gain curve" has also come up a few times. What are the steps to find those values?

I realize my knowledge of this is incomplete, so I am asking for some pointers, such as a document that covers the subject from top to bottom. If it can be explained in simple steps, or if examples are available, a link would be much appreciated. The only thing I found was in sphinxbase, in ad_oss.c, but I could not find an explanation of how the input gain value there was determined.

If more details are required, please ask. Thank you in advance.
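To make the question more concrete, here is a rough sketch of what I understand by a time-domain gain computed from a "target level". Everything in it is my own assumption for illustration: 16-bit PCM input, RMS as the level measure, a target of -20 dBFS, and the function names `rms_dbfs`, `gain_for_target` and `apply_gain`. None of it comes from sphinxbase or ad_oss.c, and I do not know whether this is how the gain is meant to be chosen for recognition.

```python
import math

FULL_SCALE = 32768.0  # assumed: signed 16-bit PCM samples


def rms_dbfs(samples):
    """RMS level of the samples in dB relative to full scale (dBFS)."""
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_sq / (FULL_SCALE * FULL_SCALE))


def gain_for_target(samples, target_dbfs=-20.0):
    """Linear gain that would move the measured RMS level to target_dbfs.

    The -20 dBFS default is an arbitrary illustrative value.
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return 1.0  # pure silence: leave the signal unchanged
    return 10.0 ** ((target_dbfs - level) / 20.0)


def apply_gain(samples, gain):
    """Multiply every sample by gain and clip to the 16-bit range."""
    out = []
    for s in samples:
        v = int(round(s * gain))
        out.append(max(-32768, min(32767, v)))
    return out


if __name__ == "__main__":
    # One second of a quiet 440 Hz tone at 16 kHz as stand-in input.
    quiet = [int(1000 * math.sin(2 * math.pi * 440 * n / 16000))
             for n in range(16000)]
    g = gain_for_target(quiet)
    louder = apply_gain(quiet, g)
    print("gain:", g)
    print("level before:", rms_dbfs(quiet), "dBFS")
    print("level after:", rms_dbfs(louder), "dBFS")
```

This applies a single fixed gain to the whole buffer. As far as I understand, a "gain curve" would instead make the gain depend on the measured level or change over time, and that is the part I do not know how to derive.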