|
From: Benno S. <be...@ga...> - 2003-08-09 20:07:37
|
Steve Harris <S.W...@ec...> writes:
> On Sat, Aug 09, 2003 at 07:01:40 +0200, Benno Senoner wrote:
> > basically instead of doing (like LADSPA would do)
> >
> > am_sample=amplitude_modulator(input_sample,256samples)
> > output_sample=LP_filter(am_sample, 256samples)
> >
> > (output_sample is a pointer to 256 samples, the current audio block).
> >
> > it would be faster to do
> >
> > for(i=0;i<256;i++) {
> > output_sample[i]=do_filter(do_amplitude_modulation(input_sample[i]));
> > }
>
> That's not what I did, I did something like:
>
> for(i=0;i<256;i++) {
> do_amplitude_modulation(input_sample+i, &tmp);
> do_filter(&tmp, output_sample[i]);
> }
>
> Otherwise you're limited in the number of outputs you can have, and this way
> makes it more obvious what the execution order is (imagine branches in the
> graph) and it's no slower.
You mean that your method is almost as fast as my equation since the compiler
optimizes away these (needless) temporary variables?
And yes, you are right, with your method you can tap any output you like
(and probably often you need to do so).
>
> > The question is indeed whether, if you do LADSPA style processing
> > (applying all DSP processing in sequence), the compiler uses SIMD
> > and optimization of the processing loops and is therefore faster
>
> I doubt it, I've not come across a compiler that can generate halfway decent
> SIMD instructions - including the intel one.
Yes
>
> NB you can still use SIMD instructions in blockless code, you just do it
> across the input channels, rather than along them.
But since the channels are dynamic and often each new voice points to
different sample data (with different pitches etc), I find it hard to
see how to get any speedup from SIMD.
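(Just to illustrate what "SIMD across the channels" could look like - a rough,
untested sketch with made-up names, using SSE intrinsics. Four voices are
multiplied by their gains in lockstep; the per-frame gather is exactly where
dynamic voices with different sample data hurt:)

#include <xmmintrin.h>

#define VOICES 4
#define BLOCKSIZE 256

void mix4voices(const float *src[VOICES], const float gain[VOICES],
                float out[VOICES][BLOCKSIZE])
{
    __m128 g = _mm_loadu_ps(gain);   /* one gain per voice */
    int i;
    for (i = 0; i < BLOCKSIZE; i++) {
        /* gather one frame from each voice - this is the awkward part when
           every voice reads different sample data at a different pitch */
        __m128 s = _mm_set_ps(src[3][i], src[2][i], src[1][i], src[0][i]);
        float tmp[4];
        _mm_storeu_ps(tmp, _mm_mul_ps(s, g));
        out[0][i] = tmp[0];
        out[1][i] = tmp[1];
        out[2][i] = tmp[2];
        out[3][i] = tmp[3];
    }
}
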
>
> > Ah you are using the concept of duration.
> > Isn't it a bit redundant? Instead of using duration one can use
> > duration-less RAMP events and just generate an event that sets
> > the delta ramp value to zero when you want the ramp to stop.
>
> If you just specify the delta, then the receiver is limited to linear
> segments, as it can't second-guess the duration. This won't work well e.g.
> for envelopes. Some of the commercial guys (e.g. Cakewalk) are now evaluating
> their envelope curves every sample anyway, so if we go for ramp events for
> envelope curves we will be behind the state of the art. FWIW Adobe use
> 1/4 audio rate controls, as it's convenient for SIMD processing (this came
> out in some GMPI discussions).
But since you can set the ramping value at each sample, you can
perform any kind of modulation not only at 1/4 sample rate but
at sample rate too. (It will be a bit more expensive than
the pure streamed approach, but OTOH such accuracy is seldom needed;
see exponential curves, where you can approximate a large part of the curve
using only a few linear segments while the first part needs a denser
event stream. On average you win over the streamed approach.)
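(As a hedged example of what I mean - the event-sending hook here is an
assumption, not real code: an exponential decay approximated by a few linear
RAMP segments, dense at the start where the curve bends and sparse later:)

#include <math.h>

/* assumed event-sending hook: ramp linearly to 'target' over 'frames' samples */
typedef void (*send_ramp_fn)(double target, unsigned frames);

void send_exp_decay(send_ramp_fn send_ramp, double start_level,
                    double tau_frames, unsigned total_frames)
{
    unsigned t = 0;
    unsigned seg = 16;       /* short segments where the curve bends hard */
    while (t < total_frames) {
        if (t + seg > total_frames)
            seg = total_frames - t;
        /* aim each linear segment at the exact exponential value at its end */
        send_ramp(start_level * exp(-(double)(t + seg) / tau_frames), seg);
        t += seg;
        seg *= 2;            /* longer segments as the curve flattens out */
    }
}
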
PS: I attached a small C file to test the performance of code using
temporary variables and code that inlines all in one equation
(amplitude modulator -> amplitude modulator -> filter).
Steve, you are right, the speed difference is almost null.
On my box 40k iterations of func1() take 18.1 sec while the optimized case
(func2) takes 18.0 sec, about a 0.5% speed difference.
Of course we cannot generalize, but I guess that the speed difference
between a hand-optimized function and the code generated using
tmp variables is in the low single-digit percentages.
Benno.
#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE 256

int func1(void);
int func2(void);

static double oldfiltervalue;
static double sample[BLOCKSIZE];
static double output[BLOCKSIZE];

int main(void) {
    int i;
    int u;
    int res = 0;

    /* init the sample array */
    for (i = 0; i < BLOCKSIZE; i++) {
        sample[i] = i;
    }
    oldfiltervalue = 0.0;

    for (u = 0; u < 40000; u++) {
        res = func1();
        /* res = func2(); */
    }
    return res;
}

/* amplitude modulation -> amplitude modulation -> filter,
   with temporary variables between the stages */
int func1(void) {
    int i;
    double tmp, tmp2, tmp3;
    double v1, v2;

    v1 = 1.0;
    v2 = 1.1;
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1;
        v1 += 0.00001;
        tmp2 = tmp * v2;
        v2 += 0.00001;
        tmp3 = (tmp2 + oldfiltervalue) / 2.0;
        oldfiltervalue = tmp2;
        output[i] = tmp3;
    }
    return 0;
}

/* same processing, hand-inlined into one expression */
int func2(void) {
    int i;
    double tmp;
    double v1, v2;

    v1 = 1.0;
    v2 = 1.1;
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1 * v2;
        v1 += 0.00001;
        v2 += 0.00001;
        output[i] = (tmp + oldfiltervalue) / 2.0;
        oldfiltervalue = tmp;
    }
    return 0;
}
|
|
From: David O. <da...@ol...> - 2003-08-09 22:31:08
|
On Saturday 09 August 2003 19.01, Benno Senoner wrote:
[...]
> > > The other philosopy is not to adopt the "everything is a CV",
> > > use typed ports and use time stamped scheduled events.
> >
> > Another way of thinking about it is to view streams of control
> > ramp events as "structured audio data". It allows for various
> > optimizations for low bandwidth data, but still doesn't have any
> > absolute bandwidth limit apart from the audio sample rate. (And
> > not even that, if you use a different unit for event timestamps -
> > but that's probably not quite as easy as it may sound.)
>
> So at this time for linuxsampler would you advocate an event
> based approach or a continuous control stream (that runs at a
> fraction of the samplerate)?
I don't like "control rate streams" and the like very much at all.
They're only slightly easier to deal with than timestamped events,
while they restrict timing accuracy in a way that's not musically
logical. They get easier to deal with if you fix the C-rate at
compile time, but then you have a quantization that may differ
between installed systems - annoying if you want to move
sounds/projects/whatever around.
I've had enough trouble with h/w synths that have these kinds of
restrictions (MCUs and/or IRQ driven EGs + LFOs and whatnot) that I
don't want to see anything like it in any serious synth or sampler
again. Envelopes and stuff *have* to be sample accurate for serious
sound programming. (That is, anything beyond applying slow filter
sweeps to samples. Worse than sample accurate timing essentially
rules out virtual analog synthesis, at least when it comes to
percussive sounds.)
> As far as I understood it from reading your mail it seems that you
> agree that on current machines (see filters that need to
> recalculate coefficients etc) it makes sense to use an event based
> system.
Yes, that seems to be the best compromise at the moment. Maybe that
will change some time, but I'm not sure. *If* audio rate controls and
blockless processing become state-of-the-art for native processing,
it's only because CPUs are so fast that the cost is acceptable in
relation to the gains. (Audio rate control data everywhere and
slightly simpler DSP code, I guess...)
[...]
> > Now, that is apparently impossible to implement that on some
> > platforms (poor RT scheduling), but some people using broken OSes
> > is no argument for broken API designs, IMNSHO...
>
> Ok but even if there is a jitter of a few samples it is much better
> than having an event jitter equivalent to the audio fragment size.
Sure - but it seems that on Windoze, the higher priority MIDI thread
will pretty much be quantized to audio block granularity by the
scheduler, so the results are no better than polling MIDI once per
audio block.
Anyway, that's not our problem - and it's not an issue with h/w
timestamping MIDI interfaces, since they do the job properly
regardless of OS RT performance.
> It will be impossible for the user to notice that the MIDI pitchbend
> event was scheduled a few usecs too late compared to the ideal time.
> Plus as said it will work with relatively large audio fragsizes
> too.
Yes. Or on Win32; maybe. ;-) I don't have any first hand experience
with pure Win32. The last time I messed with it, it seemed to "sort
of" work as intended, but that was on Win95 with Win16 code running
in IRQ context, and that's a different story entirely. You can't do
that on Win32 without some serious Deep Hack Mode sessions, AFAIK.
Anyway, who cares!? ;-) It works on Linux, QNX, BeOS, Mac OS X, Irix
and probably some other platforms.
The only reason I brought it up is that Win32 failing to support this
kind of timestamped real time input was seriously suggested as an
argument to drop timestamped events in GMPI. Of course, that was
overruled, as the kind of plugins GMPI is meant for are usually
driven from the host's integrated sequencer anyway, and there's no
point in having two event systems; one with timestamps and one
without.
> > > With the streamed approach we would need some scheduling of
> > > MIDI events too, thus we would probably need to create a module
> > > that waits N samples (control samples) and then emits the
> > > event. So basically we end up in a timestamped event scenario
> > > too.
> >
> > Or the usual approach; MIDI is processed once per block and
> > quantized to block boundaries...
>
> I don't like that; it might work ok for very small fragsizes, e.g.
> 32-64 samples/block, but if you go up to, let's say, 512-1024,
> timing of MIDI events will suck badly.
Right. I don't like it either, but suggested it only for completeness.
(See above, on Win32 etc; they do this for RT MIDI input, because it
hardly gets any better anyway.)
> > > Not sure we gain in performance compared to the block based
> > > processing where we apply all operations sequentially on a
> > > buffer (filters, envelopes etc) like they were ladspa modules
> > > but without calling external modules but instead "pasting"
> > > their sources in sequence without function calls etc.
> >
> > I suspect it is *always* a performance loss, except in a few
> > special cases and/or with very small nets and a good optimizing
> > "compiler".
>
> So it seems that the best compromise is to process audio in blocks
> but to perform all dsp operations relative to an output sample
> (relative to a voice) in one rush.
Sometimes... I think this depends on the architecture (# of registers,
cache behavior etc) and on the algorithm. For most simple algorithms,
it's the memory access pattern that determines how things should be
done.
[...]
> it would be faster to do
>
> for(i=0;i<256;i++) {
>     output_sample[i]=do_filter(do_amplitude_modulation(input_sample[i]));
> }
>
> right?
Maybe - if you still get a properly optimized inner loop that way. Two
optimized inner loops with a hot buffer in between is probably faster
than one that isn't properly optimized.
> While for pure audio processing the approach is quite
> straightforward, when we take envelope generators etc into account
> we must inject code that checks the current timestamp (an if()
> statement) and then modifies the right values according to the
> events pending in the queue (or autogenerated events).
It shouldn't make much of a difference if the "integration" is done
properly. In Audiality, I just wrap the DSP loop inside the event
decoding loop (the normal, efficient and obvious way of handling
timestamped events), so there's only event checking overhead per
event queue (normally one per "unit") and per block - not per sample.
Code compiled from multiple plugins should be structured the same
way, and then it doesn't matter if some of these plugins do various
forms of processing that don't fit the "once-per-sample" model.
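(A minimal sketch of that structure - hypothetical types and names, not
actual Audiality code: the event-decoding loop wraps the DSP loop, so event
checking costs once per event and once per block, not once per sample:)

#include <stddef.h>

/* hypothetical types, just to show the shape */
struct event {
    unsigned timestamp;      /* frame offset within the current block */
    float value;
    struct event *next;
};

struct unit {
    struct event *events;    /* time-sorted queue for the current block */
    float level;             /* stand-in for internal DSP state */
};

static void run_dsp(struct unit *u, float *out, unsigned from, unsigned n)
{
    unsigned i;
    for (i = 0; i < n; i++)
        out[from + i] = u->level;    /* stand-in for the real DSP work */
}

void process_block(struct unit *u, float *out, unsigned frames)
{
    unsigned now = 0;
    while (now < frames) {
        /* apply all events scheduled for this exact frame... */
        while (u->events && u->events->timestamp == now) {
            u->level = u->events->value;
            u->events = u->events->next;
        }
        /* ...then run DSP up to the next event (or the end of the block) */
        unsigned until = u->events ? u->events->timestamp : frames;
        run_dsp(u, out, now, until - now);
        now = until;
    }
}
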
[...]
> sample_ptr_and_len: pointer to a sample stored in RAM with
> associated len
>
>
> attack looping: a list of looping points:
> (position_to_jump, loop_len, number_of_repeats)
>
> release looping: same structure as above but it is used when
> the sampler module goes into release phase.
> Basically, when you release a key, if the sample has loops, then
> after the current loop reaches its end position you switch to the
> release_looping list.
This is where it goes wrong, I think. These things are not musical
events, but rather "self triggered" chains of events, closely related
to the audio data and the inner workings of the sampler. You can't
really control this from a sequencer in a useful way, because it
requires sub-sample accurate timing as well as intimate knowledge of
how the sampler handles pitch bend, and how accurate its
pitch/position tracking is. You'll most likely get small glitches
everywhere for no obvious reason if you try this approach. Lots of
fun! ;-)
Of course, you'll need some nice way of passing this stuff as
*parameters* to the sampler - but that goes for the audio data as
well. If it can't be represented as plain values and text controls in
any useful way, it's better handled as "raw data" controls; ie just
blocks of private data. (Somewhat like MIDI SysEx dumps.)
[...]
> So I'd be interested how the RAM sampler module described above
> could be made to work with only the RAMP event.
You just handle the private stuff as "raw data", and there's no
problem. :-) Controls are meant for things that can actually be
driven by a sequencer, or some other event generator or processor.
Also note that a control event is just a command that tells the plugin
to change a control to a specified value at a specified time. It's
not to be viewed as an object passed to the plugin, to be added to
some list or anything like that.
> BTW: you said RAMP with value = 0 means set a value.
No, RAMP (<value>, <duration>), where <duration> is 0. The value and
timestamp fields are always valid.
> But what do you set exactly to 0, the timestamp?
The 'duration' field. (Which, BTW, is completely ignored by controls
that support only SET operations.)
> this would not be ideal since 0 is a legitimate value.
> It would be better to use -1 or something similar.
No, -1 (or rather, whatever it is in unsigned format) is a perfectly
valid timestamp as well, at least in Audiality. It uses 16 bit
timestamps that wrap pretty frequently, which is just fine, since
events are only for the current buffer anyway.
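(Something like this, as an assumed layout loosely following the description
above - not the actual Audiality struct:)

struct ramp_event {
    unsigned short timestamp;  /* frame within the current block; may wrap */
    unsigned short duration;   /* frames to reach 'value'; 0 means plain SET */
    unsigned short index;      /* which control on the target is addressed */
    float value;               /* ramp target (immediate value if duration==0) */
};
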
> OTOH this would require an additional if() statement
> (to check if it is a regular ramp or a set statement) and it could
> possibly slow down things a bit.
This cannot be avoided anyway. You'll have to put that if() somewhere
anyway, to avoid div-by-zero and other nasty issues. Consider this
code from Audiality:
(in the envelope generator:)
        case PES_DELAY:
                duration = S2S(p->param[APP_ENV_T1]);
                target = p->param[APP_ENV_L1];
                clos->env_state = PES_ATTACK;
                break;
This is the switch from DELAY to ATTACK. 'duration' goes into the
event's duration field, and may well be 0, either because the
APP_ENV_T1 parameter is very small, or because the sample rate is
low. (S2S() is a macro that converts seconds to samples, based on the
current system sample rate.) Since RAMP(target, 0) does exactly what
we want in that case, that's just fine. We leave the special case
handling to the receiver:
(in the voice mixer:)
        case VE_IRAMP:
                if(ev->arg2)
                {
                        v->ic[ev->index].dv = ev->arg1 << RAMP_BITS;
                        v->ic[ev->index].dv -= v->ic[ev->index].v;
                        v->ic[ev->index].dv /= ev->arg2 + 1;
                }
                else
                        v->ic[ev->index].v = ev->arg1 << RAMP_BITS;
                v->ic[ev->index].aimv = ev->arg1;
                v->ic[ev->index].aimt = ev->arg2 + s;
                break;
Here we calculate our internal ramping increments - or, if the
duration argument (ev->arg2) is 0, we just grab the target value
right away.
One would think that it would be appropriate to set .dv to 0 in the
SET case, but it's irrelevant, since what happens after the RAMP
duration is undefined, and it's illegal to leave a connected control
without input.
That is, the receiver will never end up in the "where do I go now?"
situation. In the case of Audiality, the instance is destroyed at the
very moment the RAMP event stream ends. In the case of XAP, there is
always the option of disconnecting the control, and in that case, the
disconnect() call would clear .dv as required to lock the control at
a fixed value. (Or the plugin will reconfigure itself internally to
remove the control entirely, or whatever...)
> My proposed ramping approach that consists of
> value_to_be_set, delta
> does not require an if(), and if you simply want to set a value
> you set delta = 0
Delta based ramping has the error build-up issue - but <value, delta>
tuples should be ok... The <target, duration> approach has the nice
side effects of
        1) telling the plugin for how long the (possibly
           approximated) ramp must behave nicely, and
        2) eliminating the requirement for the sender to
           know the exact current value of controls.
(The latter can make life easier for senders in some cases, but in the
case of Audiality - which uses fixed point controls - it's mostly
about avoiding error build-up drift without introducing clicks.)
> But my approach has the disadvantage that if you want to do mostly
> ramping you always have to calculate value_to_be_set at each event,
> and this could become non-trivial if you do not track the values
> within the modules.
Actually, I think you'll have to keep track of what you're doing most
of the time anyway, so that's not a major one. The <target, duration>
approach doesn't have that requirement, though...
[...]
> blockless as referred to above by me (blockless = one single equation
> but processed in blocks), or blockless using another kind of
> approach? (elaborate please)
By "blockless", I mean "no blocks", or rather "blocks of exactly one
sample frame". That is, no granularity below the audio rate, since
plugins are executed once per sample frame. C-rate == A-rate, and
there's no need for timestamped events or anything like that, since
everything is sample accurate anyway.
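(A tiny sketch of that execution model - hypothetical types, not from any
real API: every unit processes exactly one sample frame per call, so control
changes can land between any two frames:)

/* hypothetical plugin type: process() handles exactly one sample frame */
struct blockless_unit {
    float (*process)(struct blockless_unit *self, float in);
    void *state;
};

/* one pass through the whole chain per sample frame */
float run_chain(struct blockless_unit *units, int n, float in)
{
    int i;
    for (i = 0; i < n; i++)
        in = units[i].process(&units[i], in);
    return in;
}
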
> > That said, something that generates C code that's passed to a
> > good optimizing compiler might shift things around a bit,
> > especially now that there are compilers that automatically
> > generate SIMD code and stuff like that.
>
> The question is indeed whether, if you do LADSPA style processing
> (applying all DSP processing in sequence), the compiler uses SIMD
> and optimization of the processing loops and is therefore faster
> than calculating the result as one single big equation at a time,
> which could possibly not take advantage of SIMD etc.
> But OTOH the blockless processing has the advantage that
> things are not moved around much in the cache.
> The output value of the first module is directly available as the
> input value of the next module without needing to move it to
> a temporary buffer or variable.
It's just that the cost of temporary buffers is very low, as long as
they all fit in the cache. Meanwhile, if you throw too many cache
abusing plugins into the same inner loop, you may end up with
something that thrashes the cache on a once-per-sample-frame basis...
[...]
> > > As said I dislike "everything is a CV" a bit because you cannot
> > > do what I proposed:
[...]
> > I disagree to some extent - but this is a very complex subject.
> > Have you followed the XAP discussions? I think we pretty much
> > concluded that you can get away with "everything is a control",
> > only one event type (RAMP, where duration == 0 means SET) and a
> > few data types. That's what I'm using internally in Audiality,
> > and I'm not seeing any problems with it.
>
> Ah you are using the concept of duration.
> Isn't it a bit redundant?
No - especially not if you consider that some plugins will only
*approximate* the linear ramps. When you do that, it becomes useful
to know where the sender intends the ramp to end, to come up with a
nice approximation that hits the start and end points dead on.
> Instead of using duration one can use
> duration-less RAMP events and just generate an event that sets
> the delta ramp value to zero when you want the ramp to stop.
Yes, but since it's not useful to "just let go" of a control anyway,
this doesn't make any difference. Each control receives a continuous
stream of structured audio rate data, and that stream must be
maintained by the sender until the control is disconnected.
Of course, there are other ways of looking at it, but I don't really
see the point. Either you're connected, or you aren't. Analog CV
outputs don't have a high impedance "free" mode, AFAIK. ;-) (Or maybe
some weird variants do - but that would still be an extra feature, ie
"soft disconnect", rather than a property of CV control signals.)
> > > Basically in my model you cannot connect everything with
> > > everything (Steve says it is bad but I don't think so) but you
> > > can connect everything with "everything that makes sense to
> > > connect to".
> >
> > Well, you *can* convert back and forth, but it ain't free... You
> > can't have everything.
>
> Ok but converters will be the exception and not the rule:
One would hope so... That should be carefully considered when deciding
which controls use what format.
> for example the MIDI mapper module
>
> see the GUI screenshot message here:
> http://sourceforge.net/mailarchive/forum.php?thread_id=2841483&forum_id=12792
>
> acts as a proxy between the MIDI Input and the RAM sampler module.
> So it makes the right port types available.
> No converters are needed. It's all done internally in the best
> possible way without needless float to int conversions,
> interpreting pointers as floats and other "everything is a CV"
> oddities ;-)
*hehe*
Well, it gets harder when those modular synth dudes show up and want
to play around with everything. ;-)
[...]
> > Yes... In XAP, we tried to forget about the "argument bundling"
> > of MIDI, and just have plain controls. We came up with a nice and
> > clean design that can do everything that MIDI can, and then some,
> > still without any multiple argument events. (Well, events *have*
> > multiple arguments, but only one value argument - the others are
> > the timestamp and various addressing info.)
>
> Hmm I did not follow the XAP discussions (I was overloaded during
> that time as usual ;-) ) but can you briefly explain how this XAP
> model would fit the model where the MIDI IN module talks to the
> MIDI mapper which in turn talks to the RAM sampler.
Well, the MIDI IN module would map everything to XAP Instrument
Control events, for starters.
        NoteOn(Ch, Pitch, Vel);
would become something like
        vid = Pitch;                    //Use Pitch for voice ID, like MIDI...
        ALLOC_VOICE(vid);               //Tell the receiver we need a voice for 'vid'
        VCTRL(vid, VELOCITY, Vel);      //Set velocity
        VCTRL(vid, PITCH, Pitch);       //Set pitch
        VCTRL(vid, VOICE, 1);           //Voice on
and
        NoteOff(Ch, Pitch);
becomes
        vid = Pitch;
        VCTRL(vid, VOICE, 0);           //Voice off
        RELEASE_VOICE(vid);             //We won't talk about 'vid' no more.
Note that RELEASE_VOICE does *not* mean the actual voice dies
instantly. That's entirely up to the receiver. What it *does* mean is
that you give up the voice ID, so you can't control the voice from
now on. If you ALLOC_VOICE(that_same_vid), it will be assigned to a
new, independent voice.
Anyway, VELOCITY, PITCH (and any others you like) are just plain
controls. They may be continuous, "voice on latched" and/or "voice off
latched", so you can emulate the same (rather restrictive) logic that
applies to MIDI NoteOn/Off parameters. That is, we use controls that
are latched at state transitions instead of explicit arguments to
some VOICE ON/OFF event.
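(The same mapping, written out as compilable C with stubbed-out senders -
all names here are placeholders, not a real XAP header:)

#include <stdio.h>

enum { VELOCITY, PITCH, VOICE };

/* stubs; a real implementation would queue timestamped events instead */
static void alloc_voice(int vid)           { printf("ALLOC_VOICE %d\n", vid); }
static void release_voice(int vid)         { printf("RELEASE_VOICE %d\n", vid); }
static void vctrl(int vid, int ctl, int v) { printf("VCTRL %d %d %d\n", vid, ctl, v); }

void note_on(int pitch, int vel)
{
    int vid = pitch;            /* use pitch as the voice ID, like MIDI */
    alloc_voice(vid);
    vctrl(vid, VELOCITY, vel);  /* latched or continuous, as the control declares */
    vctrl(vid, PITCH, pitch);
    vctrl(vid, VOICE, 1);       /* voice on */
}

void note_off(int pitch)
{
    int vid = pitch;
    vctrl(vid, VOICE, 0);       /* voice off - the receiver decides when it dies */
    release_voice(vid);         /* give up the voice ID from here on */
}
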
[...]
> > Either way, the real heavy stuff is always the DSP code. In cases
> > where it isn't, the whole plugin is usually so simple that it
> > doesn't really matter what kind of control interface you're
> > using; the DSP code fits right into the basic "standard model"
> > anyway. In such cases, an API like XAP or Audiality's internal
> > "plugin API" could provide some macros that make it all insanely
> > simple - maybe simpler than LADSPA.
>
> So you are saying that in practical terms (processing performance)
> it does not matter whether you use events or streamed control
> values?
I'm saying it doesn't matter much in terms of code complexity. (Which
is rather important - someone has to code all those kewl plugins! :-)
It *does* matter a great deal in terms of performance, though, which
is why we have to go with the slightly more (and sometimes very much
more) complicated solutions.
> I still prefer the event based system because it allows you to deal
> more easily with real time events (with sub audio block precision)
> and if you need it you can run at full sample rate.
Sure, some things actually get *simpler* with timestamped events, and
other things don't get all that complicated. After all, you can just
do the "VST style quick hack" and process all events first and then
process the audio for the full block, if you can't be arsed to do it
properly. (They actually do that in some examples that come with the
VST SDK, at least in the older versions... *heh*)
> > Anyway, need to get back to work now... :-)
>
> yeah, we unleashed those km-long mails again ... just like in the
> old times ;-) can you say infinite recursion ;-)))
*hehe* :-)
//David Olofson - Programmer, Composer, Open Source Advocate
.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`-----------------------------------> http://audiality.org -'
--- http://olofson.net --- http://www.reologica.se ---
|
|
From: Steve H. <S.W...@ec...> - 2003-08-09 18:31:08
|
On Sat, Aug 09, 2003 at 05:29:05PM +0200, David Olofson wrote:
> However, keep in mind that what we design now will run on hardware
> that's at least twice as fast as what we have now. It's likely that
> the MIPS/memory bandwidth ratio will be worse, but you never know...
I'm always nervous about this suggestion. I will most likely still be using
the machine I'm using now, so Moore's law won't have affected me personally.
We should support machines that are current now and not try to second-guess.
> What I'm saying is basically that benchmarking for future hardware is
> pretty much gambling, and results on current hardware may not give us
> the right answer.
But at least we know they are accurate for /some/ hardware.
> Audio rate controls *are* the real answer (except for some special
> cases, perhaps; audio rate text messages, anyone? ;-), but it's still
> a bit on the expensive side on current hardware. (Filters have to
> recalculate coefficients, or at least check the input, every sample
> frame, for example.) In modular synths, it probably is the right
Not really, they /can/ recalculate every sample, they don't have to.
- Steve
|
|
From: Simon J. <sje...@bl...> - 2003-08-10 04:22:52
|
David Olofson wrote:
>>[Benno:] Now if we assume we do all blockless processing eg the
>>dsp compiler generates one giant equation for each dsp network
>>(instrument). output = func(input1,input2,....)
>>
>>Not sure we gain in performance compared to the block based
>>processing where we apply all operations sequentially on a buffer
>>(filters, envelopes etc) like they were ladspa modules but without
>>calling external modules but instead "pasting" their sources in
>>sequence without function calls etc.
>>
>I suspect it is *always* a performance loss, except in a few special
>cases and/or with very small nets and a good optimizing "compiler".
>
You certainly need an optimising compiler because the gains, if present,
would be achieved by register optimisations. I was getting better
performance from the paste-together-one-big-function approach and had
thought that it "just was" better (provided the function fits in the
cache, of course), but there's been some disagreement on the matter so
I'm going over it again.
>Audio rate controls *are* the real answer (except for some special
>cases, perhaps; audio rate text messages, anyone? ;-), but it's still
>a bit on the expensive side on current hardware. (Filters have to
>recalculate coefficients, or at least check the input, every sample
>frame, for example.)
>
Agreed, it may be very expensive to send a-rate control to certain
inputs of certain modules. OTOH that's a limitation of those particular
inputs, on those particular modules. Given that we're going to
downcompile, it would maybe be possible to deal with such inputs
"surgically" by (for example) generating code which uses k-rated
connections to just those inputs.
Simon Jenkins
(Bristol, UK)
|