Simon Wood Jr - 2003-08-03

A lot of times I find myself working with large data sets. Using the FFT to compute cross correlations and perform filtering operations is extremely efficient and saves a lot of computation time.

Here is a short program that compares the performance of the "time domain" approach presently used in IT++ versus a "frequency domain" approach.  There are a couple of things to note:

1. The speed of the frequency domain approach starts getting better after a couple of hundred samples or so. However the real increase kicks in at 1000 samples and above.

2. I'm using version 3.7.0 of IT++. So the frequency domain XCORR int the test routine (my_xcorr) is reflective of the XCORR function found in version 3.7.0. I've noticed the XCORR routine(s) in version 3.7.1 have been upgraded with additional options. An equivalent version of 'my_xcorr' utilizing a frequency domain approach should have the same speed advantages over the current version in 3.7.1.

3. The frequency domain approach can also be used for convolution (i.e. filtering) as well. It would be nice to have a 'fastfilter' routine.

-Simon

Code starts here...

/*
* File: xcorr_time.cpp
* Info: Main driver routine to test the execution time of a frequency
* domain (FFT based) XCORR function.
*
* Here are some sample timing results run on a P4 3.08 GHz with 1GB Ram:
*
* Vector Size:     10
* Original time:   8.8277e-07
* New time:        4.4843e-06
*
* Vector Size:     100
* Original time:   1.7967e-05
* New time:        2.0642e-05
*
* Vector Size:     1000
* Original time:   9.0319e-04
* New time:        1.7584e-04
*
* Vector Size:     5000
* Original time:   2.2624e-02
* New time:        3.2606e-03
*
* Vector Size:     10000
* Original time:   8.9885e-02
* New time:        9.0020e-03
*
* Vector Size:     20000
* Original time:   3.6374e-01
* New time:        3.6200e-02
*
* Vector Size:     40000
* Original time:   1.4726e+00
* New time:        8.9889e-02
*
* Vector Size:     60000
* Original time:   6.0551e+00
* New time:        8.2238e-02
*
* Vector Size:     80000
* Original time:   1.3277e+01
* New time:        1.9100e-01
*
* Vector Size:     100000
* Original time:   2.6620e+01
* New time:        1.7816e-01
*/

#include "itbase.h"

using namespace std;

// Frequency domaing XCORR function
vec my_xcorr(const vec &x);

int main()
{
  // Number of iterations to average time over
#if 1
  const int Iterations = 1000;
  int N[5] = {10, 100, 1000, 5000, 10000};
#endif

  // Larger vector sizes. For these you want to use a smaller number
  // of iterations (Trust me ;-)
#if 0
  const int Iterations = 2;
  int N[5] = {20000, 40000, 60000, 80000, 100000};
#endif

  // Timer object
  Real_Timer tt;

  for(int jj=0; jj<5; jj++)
    {
      // Generate a vector to perform XCORR on
      vec a = linspace(0,1,N[jj]);

      double orig_time = 0.0;
      double new_time = 0.0;
      for(int ii=0; ii<Iterations; ii++)
    {
      tt.reset();
      tt.start();
      // Call original XCORR function
      vec b = xcorr(a);
      tt.stop();
      orig_time += tt.get_time();
   
      tt.reset();
      tt.start();
      // Call new (faster) XCORR function
      vec c = my_xcorr(a);
      tt.stop();
      new_time += tt.get_time();
    }

      printf("Vector Size:\t %d\n", N[jj]);
      printf("Original time:\t %2.4e\n",orig_time/Iterations);
      printf("New time:\t %2.4e\n\n",new_time/Iterations);
    }

  return 0;
}

vec my_xcorr(const vec &x)
{
  // Compute the FFT size as the "next power of 2" of the input vector's length
  int b = static_cast<int>(ceil( log2(2*x.length()-1) ));
  int fftsize = pow2i(b);

  int end = fftsize - 1;
  int maxlag = x.length() - 1;

  // Take FFT of input vector
  cvec X = fft_real(zero_pad(x,fftsize));

  // Compute the abs(X).^2 and take the inverse FFT...this is the Autocorrelation
  vec temp2 = ifft_real(elem_mult(X,conj(X)));

  // Move negative lags to the beginning. Any "extra" values from the FFT/IFFT
  // computations are dropped.
  vec out = concat(temp2(end-maxlag+1,end),temp2(0,maxlag));

  return out;
}