--- The situation ---
We recently recorded about a dozen interviews that we want to run speech recognition on.
We have the following additional material at hand:
- For each interview, we have 9 sentences of speaker adaptation data (i.e. each speaker was asked to read 9 sentences). The sentences were recorded in the same environment and with the same equipment as the interview itself.
- For each interview, we have the background noise, recorded with an additional microphone placed in the vicinity of the speaker. This means we have not just a sample of the background noise, but the exact background noise of the interview.
We are already using MLLR to adapt to the speakers and the background noise.
--- The problem ---
Is noise removal redundant when we already use MLLR, since MLLR adapts to the environment anyway, or can we gain anything from additional noise removal?
If noise removal makes sense, which software would be best to use?
The usual techniques assume that we have a sample of the noise, not the complete background noise at the time of recording. So instead of using a sample, I guess we would have to subtract the noise recording directly from the interview. What would be the best tools for this?
You can get additional gains from noise removal.
Any programming language you are familiar with will do.
I was hoping we could use pre-existing software. We tried Audacity's noise removal, but that doesn't directly subtract the background noise.
We ended up trying phase cancellation with Audacity, using the method described here:
http://blog.youdownwithfcp.com/2010/06/29/how-to-remove-vocals-from-music-with-phase-cancellation/
The results are, most of the time, worse than without the phase cancellation. o_O
Is phase cancellation the correct technique to use for subtracting background noise?
Without data it's hard to comment on this issue. You need to provide the data to reproduce your issue in order to get help on this.
It's a similar technique, but you need to understand what is going on before applying it.
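To illustrate what is going on: plain phase cancellation only works if the noise reaches both microphones with the same delay and the same level. With a second microphone some distance away, that is generally not the case, and subtracting a misaligned copy adds noise instead of removing it. The following is a minimal sketch on synthetic signals, not on real recordings. The helper names (best_lag, subtract_aligned), the delay of 5 samples and the gain of 0.5 are all assumptions made up for the illustration. It estimates a single delay by cross-correlation and a single least-squares gain, which is a much simpler model than a real room.

```python
import math
import random


def best_lag(primary, reference, max_lag):
    """Return the delay (in samples) of `reference` that best matches `primary`."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(primary)):
            j = i - lag
            if 0 <= j < len(reference):
                score += primary[i] * reference[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def subtract_aligned(primary, reference, lag):
    """Delay the reference by `lag`, scale it by a least-squares gain, subtract."""
    shifted = [reference[i - lag] if 0 <= i - lag < len(reference) else 0.0
               for i in range(len(primary))]
    num = sum(p * s for p, s in zip(primary, shifted))
    den = sum(s * s for s in shifted) or 1.0
    gain = num / den
    return [p - gain * s for p, s in zip(primary, shifted)]


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


# Synthetic stand-ins (assumptions): a sine as "speech", white noise as "background".
random.seed(0)
n = 2000
speech = [math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
true_delay, true_gain = 5, 0.5
primary = [speech[i] + (true_gain * noise[i - true_delay] if i >= true_delay else 0.0)
           for i in range(n)]

# Plain phase cancellation: subtract the reference sample by sample.
naive = [p - r for p, r in zip(primary, noise)]

# Subtraction after estimating delay and gain.
lag = best_lag(primary, noise, max_lag=20)
aligned = subtract_aligned(primary, noise, lag)

err_before = power([p - s for p, s in zip(primary, speech)])
err_naive = power([a - s for a, s in zip(naive, speech)])
err_aligned = power([a - s for a, s in zip(aligned, speech)])
print("estimated lag:", lag)
print("noise power before / naive / aligned:", err_before, err_naive, err_aligned)
```

In this toy setup the naive subtraction leaves more noise than doing nothing, which matches the observation that phase cancellation made things worse. Real recordings add reverberation and speech leaking into the second microphone, so a single delay and gain is usually not enough.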
The problem is that we're not allowed to hand out our material, so we cannot provide the files used. For the moment, we can therefore only talk about the general procedure.
Assuming the situation described above, i.e. one wav file for the interview and one file from a separate microphone stationed about a metre away, what would be the correct way to use the second file to remove the noise from the first?
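One general procedure for this two-microphone setup is adaptive noise cancellation: an adaptive filter learns how the noise in the second file shows up in the interview file, and the filtered noise is subtracted. The following is a minimal sketch using a normalized LMS (NLMS) filter in plain Python, run on synthetic signals. It is not a recommendation from this thread. The function name nlms_cancel, the number of taps, the step size and the toy noise path are all assumptions. It also assumes the second microphone picks up noise only. A microphone a metre from the speaker will pick up speech as well, and then the filter can cancel part of the speech too.

```python
import math
import random


def nlms_cancel(primary, reference, taps=8, mu=0.1, eps=1e-8):
    """Adaptive noise cancellation with a normalized LMS filter.

    primary:   speech plus noise (interview microphone)
    reference: noise channel (second microphone)
    Returns the error signal, which serves as the cleaned-speech estimate.
    """
    weights = [0.0] * taps
    buf = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        predicted_noise = sum(w * b for w, b in zip(weights, buf))
        e = d - predicted_noise  # what the reference cannot explain
        norm = sum(b * b for b in buf) + eps
        step = mu * e / norm
        weights = [w + step * b for w, b in zip(weights, buf)]
        cleaned.append(e)
    return cleaned


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


# Synthetic stand-ins (assumptions): a sine as "speech", white noise as "background".
random.seed(1)
n = 4000
speech = [0.3 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]


def past(signal, i, delay):
    """Sample `delay` steps before index i, or 0.0 before the signal starts."""
    return signal[i - delay] if i >= delay else 0.0


# Toy noise path from the reference mic to the interview mic: two delayed copies.
primary = [speech[i] + 0.6 * past(noise, i, 2) - 0.3 * past(noise, i, 3)
           for i in range(n)]

cleaned = nlms_cancel(primary, noise)

# Compare the remaining noise in the second half, after the filter has adapted.
half = n // 2
noise_before = power([p - s for p, s in zip(primary[half:], speech[half:])])
noise_after = power([c - s for c, s in zip(cleaned[half:], speech[half:])])
print("noise power before / after:", noise_before, noise_after)
```

For real files, the two recordings would first have to be read at the same sample rate and roughly time-aligned, and the filter length would need to cover the acoustic delay between the microphones.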