Re: [Audacity-devel] MFCC display -- initial attempt, + minor patch for Vector class
From: James C. <cr...@in...> - 2008-04-24 14:41:29
Prashant,

I'm not familiar with MFCCs. Your best bet for intelligent feedback about whether it looks right is from Chris, Roger and Jen. I'm actually unlikely to have time to try out new Audacity code until the GSoC coding period starts - I've a lot of catching up to do with work-related coding between now and then.

The patch looks non-invasive: it's an optional addition and does not take anything away. Could you post screenshots showing how this improves things, e.g. for vowel sounds? I'd expect that to show it to be a useful addition to our analysis, and something well worth having in head.

Do you have a SourceForge id to which we could give CVS access, so that we can get it into the next official beta easily?

A couple of small things leapt out from the code:

Instead of:
  in[i] = log(sqrt((out[i] * out[i]) + (out2[i] * out2[i])));
why not:
  in[i] = 0.5f * log( power );
?

in Vector operator/(const Vector &left, double right)
possibly set up a constant
  one_over_right = 1.0 / right
and multiply by that?

If I had the time, I'd read up on MFCC, compile the patch, check it, and look for related adaptations that we might benefit from as well. I'm sorry I can't do this, but maybe Jen can help here, as she has the most immediate need for it?

--James.

Prashant Vaibhav wrote:
> Dev-team:
>
> Turns out it was not that trivial after all. Attached is a patch
> against cvs head which attempts to add a preliminary MFCC display to
> the spectrum analysis window (Analyze > Plot spectrum > choose
> Mel-frequency Cepstrum).
>
> I have sourced the transformation matrix from here:
> http://labrosa.ee.columbia.edu/matlab/rastamat/fft2melmx.m
> The general algorithm which I got from Wikipedia: FFT -> log
> amplitude -> transform to mel-filter bank -> DCT (via fft).
>
> Currently the FFT size for calculating mfcc is fixed at 512 samples,
> and the number of MFCC bins is set to 32 (both modifiable in
> FreqWindow.cpp), from 0 Hz to 0.5*sampleRate Hz.
>
> I can't say how mathematically correct this is, so if someone wants to
> go through it, most welcome :-) The code for computing the mfcc
> transformation matrix (which converts N point FFT to M point mel
> filter bank), and a routine to apply the transformation to float
> arrays reside in the new class "MFCC", as static methods.
>
> Apart from this, I also made minor additions to Matrix.cpp/.h (the
> "Vector" class) so that it now allows MATLAB-style subtraction,
> addition, division of vectors with scalars, and negation of vectors. I
> think the Vector and Matrix classes, however, need to be expanded
> significantly..
>
> Comments welcome. I am still apprehensive about the mathematical
> soundness of the algorithm .. I don't know if the mfcc is supposed to
> look like that :-D
>
> Best,
> Prashant
>
> 2008/4/21, Richard Ash <ri...@au...>:
>
> On Sun, 2008-04-20 at 12:17 -0700, Jennifer Murdoch wrote:
> > I'm wondering if there's any existing facility for computing MFCCs in
> > Audacity?
> > A quick grep of the source code didn't reveal anything. I did see
> > code for plain cepstrum computation in FreqWindow.cpp.
>
> I think that's a no, if it was anywhere I'd expect it to be in the
> vicinity of the cepstrum stuff.
>
> > A segmentation algorithm I'm working on requires them, and while I could
> > add MFCC computation code directly to the Effect sub-class I'm
> > working on, it may be of benefit to others (??) to put this code
> > elsewhere (??).
>
> An obvious use is to add it to the display options in Analyse > Plot
> Spectrum (which already has some non-spectral functions available).
> Unless it's trivial, it would probably stand being a class (and file) in
> its own right, in the same way that FFT.cpp is.
>
> Richard
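
For anyone who wants to try James's two suggestions outside the patch, here is a minimal standalone sketch. It assumes that "power" in his snippet means the sum of squares (out[i] * out[i]) + (out2[i] * out2[i]), which the thread does not state explicitly. The function names and the use of std::vector<double> in place of Audacity's Vector class are hypothetical stand-ins for illustration, not code from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Log magnitude of one FFT bin, written the way the patch line does it.
inline float LogMagnitudeWithSqrt(float re, float im)
{
   return std::log(std::sqrt(re * re + im * im));
}

// Same value without the sqrt call: log(sqrt(p)) == 0.5 * log(p) for p > 0.
// "power" is assumed here to be the sum of squares of the two parts.
inline float LogMagnitudeFromPower(float re, float im)
{
   const float power = re * re + im * im;
   return 0.5f * std::log(power);
}

// Hypothetical stand-in for Vector operator/(const Vector &left, double right):
// compute the reciprocal once, then multiply each element by it.
inline std::vector<double> DivideByScalar(const std::vector<double> &left,
                                          double right)
{
   assert(right != 0.0);
   const double one_over_right = 1.0 / right;
   std::vector<double> result(left.size());
   for (std::size_t i = 0; i < left.size(); ++i)
      result[i] = left[i] * one_over_right;
   return result;
}
```

Two caveats: a bin with zero power gives minus infinity in either form of the log, and multiplying by a precomputed reciprocal can differ from a true division in the last bit of the result, which should not matter for a display.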