From: Benno S. <be...@ga...> - 2003-08-09 20:07:37
Steve Harris <S.W...@ec...> writes:
> On Sat, Aug 09, 2003 at 07:01:40 +0200, Benno Senoner wrote:
> > basically instead of doing (like LADSPA would do)
> >
> > am_sample=amplitude_modulator(input_sample,256samples)
> > output_sample=LP_filter(am_sample, 256samples)
> >
> > (output_sample is a pointer to 256 samples, the current audio block)
> >
> > it would be faster to do
> >
> > for(i=0;i<256;i++) {
> > output_sample[i]=do_filter(do_amplitude_modulation(input_sample[i]));
> > }
>
> That's not what I did; I did something like:
>
> for(i=0;i<256;i++) {
> do_amplitude_modulation(input_sample+i, &tmp);
> do_filter(&tmp, output_sample+i);
> }
>
> Otherwise you're limited in the number of outputs you can have, and this way
> makes it more obvious what the execution order is (imagine branches in the
> graph), and it's no slower.
You mean that your method is almost as fast as my single expression because the
compiler optimizes away these (needless) temporary variables?
And yes, you are right: with your method you can tap any output you like
(and you probably need to do so often).
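
A minimal sketch of the two loop structures discussed above, assuming a
constant-gain amplitude modulator and a two-point averaging filter. The
names lp_state, process_blockless, process_blockwise and the tap buffer
are hypothetical and not taken from LADSPA or from Steve's code.

```c
#include <stddef.h>

/* Hypothetical filter state: remembers the previous input sample. */
typedef struct { float prev; } lp_state;

/* Per-sample stages that read and write through pointers. */
static void do_amplitude_modulation(const float *in, float gain, float *out)
{
    *out = *in * gain;
}

static void do_filter(lp_state *st, const float *in, float *out)
{
    *out = 0.5f * (*in + st->prev);   /* average of current and previous input */
    st->prev = *in;
}

/* Blockless: every sample runs through the whole chain before the next one.
   tap[] receives the intermediate AM signal so another node could read it. */
void process_blockless(const float *in, float *tap, float *out,
                       size_t n, float gain, lp_state *st)
{
    for (size_t i = 0; i < n; i++) {
        float tmp;
        do_amplitude_modulation(in + i, gain, &tmp);
        tap[i] = tmp;
        do_filter(st, &tmp, out + i);
    }
}

/* Block-wise: each stage runs over the whole block in turn,
   using tmp_block[] as the buffer between the two stages. */
void process_blockwise(const float *in, float *tmp_block, float *out,
                       size_t n, float gain, lp_state *st)
{
    for (size_t i = 0; i < n; i++)
        do_amplitude_modulation(in + i, gain, tmp_block + i);
    for (size_t i = 0; i < n; i++)
        do_filter(st, tmp_block + i, out + i);
}
```

Both functions perform the same arithmetic per sample; they differ only in
loop order and in whether the intermediate signal lives in a scalar or in a
block-sized buffer.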
>
> > The question is indeed whether, if you do LADSPA-style processing
> > (applying all DSP stages in sequence), the compiler uses SIMD
> > and optimizes the processing loops, and is therefore faster
>
> I doubt it; I've not come across a compiler that can generate halfway decent
> SIMD instructions, including the Intel one.
Yes
>
> NB you can still use SIMD instructions in blockless code; you just do it
> across the input channels, rather than along them.
But since the channels are dynamic, and each new voice often points to
different sample data (with different pitches etc.), I find it hard to see
how to get any speedup from SIMD.
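
A rough sketch of what "across the channels" could look like, written as
plain C rather than real SIMD intrinsics. The voice_group layout, the lane
count of four and the function render_group are assumptions made for this
illustration only. It also shows the objection above: the per-voice sample
fetch stays scalar because every voice reads a different buffer at a
different position.

```c
#include <stddef.h>

#define LANES 4   /* assumed vector width: four floats */

/* Hypothetical structure-of-arrays voice state: element k of every
   array belongs to voice k, so one vector would hold four voices. */
typedef struct {
    const float *data[LANES];   /* each voice has its own sample data */
    size_t       len[LANES];    /* number of samples in data[k] */
    float        pos[LANES];    /* read position */
    float        step[LANES];   /* pitch: position increment per frame */
    float        gain[LANES];
} voice_group;

/* Render n frames, mixing the four voices into out[]. */
void render_group(voice_group *g, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float s[LANES];

        /* Gather step: one scalar load per voice (nearest-lower sample,
           no interpolation); a voice past its end contributes silence. */
        for (int k = 0; k < LANES; k++) {
            size_t idx = (size_t)g->pos[k];
            s[k] = idx < g->len[k] ? g->data[k][idx] : 0.0f;
        }

        /* Across-channel step: identical arithmetic on all four lanes,
           which is the part a SIMD multiply and add could cover. */
        float mix = 0.0f;
        for (int k = 0; k < LANES; k++) {
            s[k] *= g->gain[k];
            g->pos[k] += g->step[k];
            mix += s[k];
        }
        out[i] = mix;
    }
}
```

Whether this pays off depends on how much per-lane arithmetic follows the
gather; with only a gain multiply, as here, the scalar loads are likely to
dominate.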
>
> > Ah, you are using the concept of duration.
> > Isn't it a bit redundant? Instead of using duration one can use
> > duration-less RAMP events and just generate an event that sets
> > the delta ramp value to zero when you want the ramp to stop.
>
> If you just specify the delta, then the receiver is limited to linear
> segments, as it can't second-guess the duration. This won't work well, e.g. for
> envelopes. Some of the commercial guys (e.g. Cakewalk) are now evaluating
> their envelope curves every sample anyway, so if we go for ramp events for
> envelope curves we will be behind the state of the art. FWIW Adobe use
> 1/4 audio rate controls, as it's convenient for SIMD processing (this came
> out in some GMPI discussions).
But since you can set the ramp delta at every sample, you can
perform any kind of modulation, not only at 1/4 sample rate but
at the full sample rate too. That is a bit more expensive than
the pure streamed approach, but OTOH such accuracy is seldom needed.
Take exponential curves: you can approximate a large part of the curve
with only a few linear segments, while the first part needs a denser
event stream, so on average you still win over the streamed approach.
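
A minimal sketch of duration-less ramp events as described above, assuming
an event carries only a frame offset and a per-sample delta. The ramp_event
struct and render_ramps function are hypothetical names for illustration,
not part of any existing event API.

```c
#include <stddef.h>

/* Hypothetical duration-less ramp event: from `frame` onward the control
   value changes by `delta` per sample until a later event replaces it. */
typedef struct {
    size_t frame;
    float  delta;
} ramp_event;

/* Expand a time-ordered event list into a per-sample control signal.
   A ramp is stopped by sending an event with delta == 0. */
void render_ramps(const ramp_event *ev, size_t n_ev,
                  float start, float *out, size_t n)
{
    float  value = start;
    float  delta = 0.0f;
    size_t next  = 0;

    for (size_t i = 0; i < n; i++) {
        /* Apply every event that is due at or before this frame. */
        while (next < n_ev && ev[next].frame <= i)
            delta = ev[next++].delta;
        out[i] = value;
        value += delta;
    }
}
```

For example, the events {0, 0.5f} and {4, 0.0f} with start 0 ramp up for
four samples and then hold. An exponential envelope could be approximated by
placing events close together where the curve bends sharply and far apart
where it is nearly straight.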
PS: I attached a small C file to compare the performance of code using
temporary variables with code that inlines everything into one expression
(amplitude modulator -> amplitude modulator -> filter).
Steve, you are right: the speed difference is almost nil.
On my box 40k iterations of func1() take 18.1 sec while the optimized case
(func2) takes 18.0 sec, about a 0.5% speed difference.
Of course we cannot generalize, but I guess that the speed difference
between a hand-optimized function and the code generated using
tmp variables is in the low single-digit percentage range.
Benno.
#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE 256

int func1(void);
int func2(void);

static double oldfiltervalue;
static double sample[BLOCKSIZE];
static double output[BLOCKSIZE];

int main(void) {
    int i;
    int u;
    int res;

    /* init the sample array */
    for (i = 0; i < BLOCKSIZE; i++) {
        sample[i] = i;
    }
    oldfiltervalue = 0.0;

    /* time one variant per run; swap the comments to test func2() */
    for (u = 0; u < 40000; u++) {
        res = func1();
        // res = func2();
    }
    (void)res;
    return 0;
}
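/* Variant 1: every stage stores its result in a temporary variable. */
int func1(void) {
    int i;
    double tmp;
    double tmp2;
    double tmp3;
    double v1, v2;

    v1 = 1.0;   /* gain of the first amplitude modulator */
    v2 = 1.1;   /* gain of the second amplitude modulator */
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1;
        v1 += 0.00001;
        tmp2 = tmp * v2;
        v2 += 0.00001;
        tmp3 = (tmp2 + oldfiltervalue) / 2.0;   /* two-point averaging filter */
        oldfiltervalue = tmp2;
        output[i] = tmp3;
    }
    return 0;
}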
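/* Variant 2: both modulators folded into a single expression. */
int func2(void) {
    int i;
    double tmp;
    double v1, v2;

    v1 = 1.0;   /* gain of the first amplitude modulator */
    v2 = 1.1;   /* gain of the second amplitude modulator */
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1 * v2;
        v1 += 0.00001;
        v2 += 0.00001;
        output[i] = (tmp + oldfiltervalue) / 2.0;   /* two-point averaging filter */
        oldfiltervalue = tmp;
    }
    return 0;
}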