From: Jan S. <ha...@st...> - 2015-05-20 07:02:36
On May 08 22:26:03, ji...@ji... wrote:
> Currently, I use sox like this:
>
> sox -d -e u-law --endian little -b 8 -c 1 -r 8000 -t ul - silence 1 0.3 1% 1 0.3 1%
>
> For reference, this is recording audio from the default microphone and
> outputting little endian, ulaw formatted audio at 8 bits and a 8k rate. The
> effects filter trims audio until the noise hits a threshold for 0.3
> seconds, then continues to record until there is 0.3 seconds of silence.
> All of this streams to stdout which I use to stream to a remote server.
>
> I am using all of this to record a bit of voice and finish when I am done
> speaking. To trigger sox, I use specialized hardware to trigger the start
> of the recording.

What specialized hardware do you use to start sox?

> I can switch to using almost any audio format or codec as
> long as it supports on the fly formatting/encoding. My target platform is
> raspbian on the raspberry pi 2 B.
>
> My ideal solution would be to use vad to stop the recording when the user
> is finished speaking. My hope is that this would work even with background
> chatter. However, the sox documentation on the vad effect states this:
>
> The use of the norm effect is recommended, but remember that neither
> reverse nor norm is suitable for use with streamed audio.

It also says:

  The effect can trim only from the front of the audio, so in order to
  trim from the back, the reverse effect must also be used.

And you can't use reverse, because it's streamed.

> I haven't been able to piece parameters together to get vad and streaming
> working. Is it possible to use the vad effect to stop the recording of
> audio while still maintaining the stdin->sox->stdout piping? Are there
> better alternatives?

The 'silence' effect :-)

Are you having a specific issue with 'silence'? False positives, false
negatives? It's usually about specifying the right threshold.

Jan
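
For anyone wanting to experiment with the threshold, here is a minimal sketch, not a tested recipe. It reuses the format options from the quoted command and only varies the duration and threshold given to the 'silence' effect. The function names silence_args and record_utterance and the output file name are made up for illustration, and the 2% value is just a starting guess that will depend on the microphone and the room.

```shell
#!/bin/sh
# Build the arguments for the 'silence' effect, using the same duration
# and threshold for the start condition and the stop condition.
silence_args() {
    duration=$1     # seconds of sound/silence required, e.g. 0.3
    threshold=$2    # percentage of full scale, e.g. 1% or 2%
    printf 'silence 1 %s %s 1 %s %s\n' \
        "$duration" "$threshold" "$duration" "$threshold"
}

# Record one utterance from the default device and write raw 8-bit,
# 8 kHz, mono u-law to stdout, as in the quoted command.
record_utterance() {
    # The command substitution is left unquoted on purpose so that it
    # splits into separate arguments for sox.
    sox -d -e u-law --endian little -b 8 -c 1 -r 8000 -t ul - \
        $(silence_args "$1" "$2")
}

# Example (needs sox and a microphone, so it is left commented out):
#   record_utterance 0.3 2% > utterance-2pct.ul
```

Recording the same phrase at a few thresholds and listening to the results should show fairly quickly whether the recording is cut too early (threshold too high) or never stops because of background noise (threshold too low).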