Re: [Kaldi-developers] DTMF and dialtone detection

Are you looking for a signal-processing solution or a model-based one?
For a signal-processing solution that does not make many assumptions about
the duration of the tones, we could take the approach below.

Given the DTMF frequencies (table below) and the requirement to process
speech that overlaps with DTMF tones, I think a good detector can be
designed by

   1. measuring the energy at the pair of DTMF frequencies for each digit
   2. comparing it with the energy in the total speech frequency range (this
   can be done after the FFT is computed during MFCC extraction, to avoid
   redundant computation)
   3. detecting a digit if the energy difference is greater than a certain
   level
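
The three steps above can be sketched roughly as follows. This is only an
illustration, not Kaldi code: the function names, the 8 kHz sample rate and
the -3 dB threshold are all assumptions, and it measures tone energy with the
Goertzel recurrence instead of reusing the FFT from MFCC extraction, purely to
keep the example free of dependencies.

```python
import math

LOW_FREQS = (697.0, 770.0, 852.0, 941.0)       # keypad row tones, Hz
HIGH_FREQS = (1209.0, 1336.0, 1477.0, 1633.0)  # keypad column tones, Hz
KEYS = ("123A", "456B", "789C", "*0#D")        # KEYS[row][column]


def goertzel_power(frame, freq, sample_rate):
    """Squared magnitude of the DFT of `frame` evaluated at `freq` Hz."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in frame:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf(frame, sample_rate=8000, threshold_db=-3.0):
    """Return the DTMF key found in one frame of samples, or None.

    threshold_db is a hypothetical tuning parameter: the strongest
    row/column pair must hold at least this share of the frame energy.
    """
    n = len(frame)
    total = sum(x * x for x in frame)  # step 2: energy of the whole frame
    if n == 0 or total == 0.0:
        return None

    # Step 1: energy at each candidate DTMF frequency.
    low_p = [goertzel_power(frame, f, sample_rate) for f in LOW_FREQS]
    high_p = [goertzel_power(frame, f, sample_rate) for f in HIGH_FREQS]
    row = max(range(4), key=lambda i: low_p[i])
    col = max(range(4), key=lambda i: high_p[i])

    # A sinusoid of amplitude A over n samples has |X(f)|^2 ~ (A*n/2)^2 and
    # time-domain energy ~ A^2*n/2, so 2*|X|^2/n puts both on the same scale.
    pair_energy = 2.0 * (low_p[row] + high_p[col]) / n
    if pair_energy <= 0.0:
        return None

    # Step 3: compare the two energies in dB and apply the threshold.
    diff_db = 10.0 * math.log10(pair_energy / total)
    return KEYS[row][col] if diff_db > threshold_db else None
```

Calling detect_dtmf() on successive 25 ms frames (200 samples at 8 kHz) would
give a per-frame decision; a real detector would presumably also require the
same digit over several consecutive frames before reporting it.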

With regard to eliminating the DTMF tones, since we are only interested in
clean MFCCs:

   1. we can subtract the detected DTMF energies from the corresponding
   mel bins, after scaling the DTMF energies by the mel weights of the
   corresponding frequencies
   *or*
   2. we can run the signal through a notch filter with notches at the two
   frequencies corresponding to the digit, for the duration of the digit,
   and extract MFCCs from the filtered signal
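
Option 2 could look something like the sketch below. It is an illustration
only: the Notch class, remove_dtmf() and the Q value of 30 are assumptions,
and the coefficients are the standard second-order (biquad) notch design. The
filter keeps its state between calls, so it can be fed the signal chunk by
chunk as it comes in.

```python
import math


class Notch:
    """Second-order IIR notch filter centred on freq_hz (hypothetical helper)."""

    def __init__(self, freq_hz, sample_rate, q=30.0):
        w0 = 2.0 * math.pi * freq_hz / sample_rate
        alpha = math.sin(w0) / (2.0 * q)  # larger q gives a narrower notch
        a0 = 1.0 + alpha
        # Coefficients normalised so that the leading denominator term is 1.
        self.b0 = 1.0 / a0
        self.b1 = -2.0 * math.cos(w0) / a0
        self.b2 = 1.0 / a0
        self.a1 = -2.0 * math.cos(w0) / a0
        self.a2 = (1.0 - alpha) / a0
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, samples):
        """Filter one chunk of samples, carrying state over to the next call."""
        out = []
        for x in samples:
            y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
                 - self.a1 * self.y1 - self.a2 * self.y2)
            self.x2, self.x1 = self.x1, x
            self.y2, self.y1 = self.y1, y
            out.append(y)
        return out


def remove_dtmf(samples, low_hz, high_hz, sample_rate=8000, q=30.0):
    """Notch out both tones of one digit from `samples`.

    `samples` is assumed to be the span in which the digit was detected.
    """
    low_notch = Notch(low_hz, sample_rate, q)
    high_notch = Notch(high_hz, sample_rate, q)
    return high_notch.process(low_notch.process(samples))
```

Note that the filter needs some time to settle at the start of the digit (a
few tens of milliseconds at this Q), so some tone energy would leak through at
the onset; a lower Q settles faster but removes more of the surrounding
speech.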

spectrogram for 112163_112196_11#9632_##9696 (audio file attached):

DTMF keypad frequencies (with sound clips):

            1209 Hz   1336 Hz   1477 Hz   1633 Hz
   697 Hz      1         2         3         A
   770 Hz      4         5         6         B
   852 Hz      7         8         9         C
   941 Hz      *         0         #         D

Sound clips:

   1  <http://upload.wikimedia.org/wikipedia/commons/b/bf/Dtmf1.ogg>
   2  <http://upload.wikimedia.org/wikipedia/commons/7/7d/Dtmf2.ogg>
   3  <http://upload.wikimedia.org/wikipedia/commons/2/28/Dtmf3.ogg>
   A  <http://upload.wikimedia.org/wikipedia/commons/d/d5/DtmfA.ogg>
   4  <http://upload.wikimedia.org/wikipedia/commons/9/9f/Dtmf4.ogg>
   5  <http://upload.wikimedia.org/wikipedia/commons/1/1c/Dtmf5.ogg>
   6  <http://upload.wikimedia.org/wikipedia/commons/7/7b/Dtmf6.ogg>
   B  <http://upload.wikimedia.org/wikipedia/commons/5/5a/DtmfB.ogg>
   7  <http://upload.wikimedia.org/wikipedia/commons/9/9f/Dtmf7.ogg>
   8  <http://upload.wikimedia.org/wikipedia/commons/f/f7/Dtmf8.ogg>
   9  <http://upload.wikimedia.org/wikipedia/commons/5/59/Dtmf9.ogg>
   C  <http://upload.wikimedia.org/wikipedia/commons/9/96/DtmfC.ogg>
   *  <http://upload.wikimedia.org/wikipedia/commons/e/e7/DtmfStar.ogg>
   0  <http://upload.wikimedia.org/wikipedia/commons/2/2d/Dtmf0.ogg>
   #  <http://upload.wikimedia.org/wikipedia/commons/c/c4/Dtmf-.ogg>
   D  <http://upload.wikimedia.org/wikipedia/commons/9/99/DtmfD.ogg>
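
For testing a detector, the keypad frequencies above can be written as a
lookup from key to (row, column) frequency pair and used to synthesize tones.
This is just a sketch; the names, the 8 kHz sample rate, the 100 ms duration
and the amplitude are arbitrary assumptions.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Map each key to its (low, high) frequency pair in Hz.
KEY_TO_FREQS = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def dtmf_samples(key, duration_s=0.1, sample_rate=8000, amplitude=0.5):
    """Return the sum of the two sinusoids for `key` as a list of floats."""
    low, high = KEY_TO_FREQS[key]
    n = round(duration_s * sample_rate)
    return [
        amplitude * (math.sin(2.0 * math.pi * low * t / sample_rate)
                     + math.sin(2.0 * math.pi * high * t / sample_rate))
        for t in range(n)
    ]
```

Mixing such a tone into a speech recording gives a test signal with a known
digit and known timing.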

On Fri, Mar 13, 2015 at 1:04 AM, Daniel Povey <dp...@gm...> wrote:

> Everyone,
>
> Something that is sometimes needed is code to detect DTMF tones and dial
> tones.  It would be good to have the option to remove them from the audio
> in addition to detecting them (so we can correctly process speech that
> occurs on top of DTMF tones).  And this ideally should be done in an
> algorithm which can be in principle applied online, as the signal comes in.
> Does anyone want to help with this?  If so, you might want to draft an
> interface for this.
>
> Dan