|
From: Benno S. <be...@ga...> - 2003-08-09 20:07:37
|
Steve Harris <S.W...@ec...> writes:
> On Sat, Aug 09, 2003 at 07:01:40 +0200, Benno Senoner wrote:
> > basically instead of doing (like LADSPA would do)
> >
> > am_sample=amplitude_modulator(input_sample,256samples)
> > output_sample=LP_filter(am_sample, 256samples)
> >
> > (output_sample is a pointer to 256 samples, the current audio block).
> >
> > it would be faster to do
> >
> > for(i=0;i<256;i++) {
> > output_sample[i]=do_filter(do_amplitude_modulation(input_sample[i]));
> > }
>
> That's not what I did, I did something like:
>
> for(i=0;i<256;i++) {
> do_amplitude_modulation(input_sample+i, &tmp);
> do_filter(&tmp, output_sample[i]);
> }
>
> Otherwise you're limited in the number of outputs you can have, and this way
> makes it more obvious what the execution order is (imagine branches in the
> graph) and it's no slower.
You mean that your method is almost as fast as my equation since the compiler
optimizes away these (needless) temporary variables?
And yes, you are right, with your method you can tap any output you like
(and probably often you need to do so).
>
> > The question is indeed whether, if you do LADSPA style processing
> > (applying all DSP processing in sequence), the compiler uses SIMD
> > and optimization of the processing loops and is therefore faster
>
> I doubt it, I've not come across a compiler that can generate halfway decent
> SIMD instructions - including the intel one.
Yes
>
> NB you can still use SIMD instructions in blockless code, you just do it
> across the input channels, rather than along them.
But since the channels are dynamic and often each new voice points to
different sample data (with different pitches etc), I find it hard to
see how to get any speedup from SIMD.
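(Just to illustrate what "SIMD across the channels" could look like - a rough,
untested sketch with made-up names, using SSE intrinsics. Four voices are
multiplied by their gains in lockstep; the per-frame gather is exactly where
dynamic voices with different sample data hurt:)

#include <xmmintrin.h>

#define VOICES 4
#define BLOCKSIZE 256

void mix4voices(const float *src[VOICES], const float gain[VOICES],
                float out[VOICES][BLOCKSIZE])
{
    __m128 g = _mm_loadu_ps(gain);   /* one gain per voice */
    int i;
    for (i = 0; i < BLOCKSIZE; i++) {
        /* gather one frame from each voice - this is the awkward part when
           every voice reads different sample data at a different pitch */
        __m128 s = _mm_set_ps(src[3][i], src[2][i], src[1][i], src[0][i]);
        float tmp[4];
        _mm_storeu_ps(tmp, _mm_mul_ps(s, g));
        out[0][i] = tmp[0];
        out[1][i] = tmp[1];
        out[2][i] = tmp[2];
        out[3][i] = tmp[3];
    }
}
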
>
> > Ah you are using the concept of duration.
> > Isn't it a bit redundant? Instead of using duration one can use
> > duration-less RAMP events and just generate an event that sets
> > the delta ramp value to zero when you want the ramp to stop.
>
> If you just specify the delta, then the receiver is limited to linear
> segments, as it can't second-guess the duration. This won't work well e.g.
> for envelopes. Some of the commercial guys (e.g. Cakewalk) are now evaluating
> their envelope curves every sample anyway, so if we go for ramp events for
> envelope curves we will be behind the state of the art. FWIW Adobe use
> 1/4 audio rate controls, as it's convenient for SIMD processing (this came
> out in some GMPI discussions).
But since you can set the ramping value at each sample, you can
perform any kind of modulation not only at 1/4 sample rate but
at sample rate too. (It will be a bit more expensive than
the pure streamed approach, but OTOH such accuracy is seldom needed;
see exponential curves, where you can approximate a large part of the curve
using only a few linear segments while the first part needs a denser
event stream. On average you win over the streamed approach.)
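(As a hedged example of what I mean - the event-sending hook here is an
assumption, not real code: an exponential decay approximated by a few linear
RAMP segments, dense at the start where the curve bends and sparse later:)

#include <math.h>

/* assumed event-sending hook: ramp linearly to 'target' over 'frames' samples */
typedef void (*send_ramp_fn)(double target, unsigned frames);

void send_exp_decay(send_ramp_fn send_ramp, double start_level,
                    double tau_frames, unsigned total_frames)
{
    unsigned t = 0;
    unsigned seg = 16;       /* short segments where the curve bends hard */
    while (t < total_frames) {
        if (t + seg > total_frames)
            seg = total_frames - t;
        /* aim each linear segment at the exact exponential value at its end */
        send_ramp(start_level * exp(-(double)(t + seg) / tau_frames), seg);
        t += seg;
        seg *= 2;            /* longer segments as the curve flattens out */
    }
}
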
PS: I attached a small C file to test the performance of code using
temporary variables and code that inlines all in one equation
(amplitude modulator -> amplitude modulator -> filter).
Steve, you are right, the speed difference is almost null.
On my box 40k iterations of func1() take 18.1 sec while the optimized case
(func2) takes 18.0 sec, about a 0.5% speed difference.
Of course we cannot generalize, but I guess that the speed difference
between a hand-optimized function and the code generated using
tmp variables is in the low single-digit percentages.
Benno.
#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE 256

int func1(void);
int func2(void);

static double oldfiltervalue;
static double sample[BLOCKSIZE];
static double output[BLOCKSIZE];

int main(void) {
    int i;
    int u;
    int res = 0;

    /* init the sample array */
    for (i = 0; i < BLOCKSIZE; i++) {
        sample[i] = i;
    }
    oldfiltervalue = 0.0;

    for (u = 0; u < 40000; u++) {
        res = func1();
        /* res = func2(); */
    }
    return res;
}

/* amplitude modulation -> amplitude modulation -> filter,
   with temporary variables between the stages */
int func1(void) {
    int i;
    double tmp, tmp2, tmp3;
    double v1, v2;

    v1 = 1.0;
    v2 = 1.1;
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1;
        v1 += 0.00001;
        tmp2 = tmp * v2;
        v2 += 0.00001;
        tmp3 = (tmp2 + oldfiltervalue) / 2.0;
        oldfiltervalue = tmp2;
        output[i] = tmp3;
    }
    return 0;
}

/* same processing, hand-inlined into one expression */
int func2(void) {
    int i;
    double tmp;
    double v1, v2;

    v1 = 1.0;
    v2 = 1.1;
    for (i = 0; i < BLOCKSIZE; i++) {
        tmp = sample[i] * v1 * v2;
        v1 += 0.00001;
        v2 += 0.00001;
        output[i] = (tmp + oldfiltervalue) / 2.0;
        oldfiltervalue = tmp;
    }
    return 0;
}
|
|
From: David O. <da...@ol...> - 2003-08-09 22:31:08
|
On Saturday 09 August 2003 19.01, Benno Senoner wrote:
[...]
> > > The other philosopy is not to adopt the "everything is a CV",
> > > use typed ports and use time stamped scheduled events.
> >
> > Another way of thinking about it is to view streams of control
> > ramp events as "structured audio data". It allows for various
> > optimizations for low bandwidth data, but still doesn't have any
> > absolute bandwidth limit apart from the audio sample rate. (And
> > not even that, if you use a different unit for event timestamps -
> > but that's probably not quite as easy as it may sound.)
>
> So at this time for linuxsampler would you advocate an event
> based approach or a continuous control stream (that runs at a
> fraction of the samplerate)?
I don't like "control rate streams" and the like very much at all.
They're only slightly easier to deal with than timestamped events,
while they restrict timing accuracy in a way that's not musically
logical. They get easier to deal with if you fix the C-rate at
compile time, but then you have a quantization that may differ
between installed systems - annoying if you want to move
sounds/projects/whatever around.
I've had enough trouble with h/w synths that have these kinds of
restrictions (MCUs and/or IRQ driven EGs + LFOs and whatnot) that I
don't want to see anything like it in any serious synth or sampler
again. Envelopes and stuff *have* to be sample accurate for serious
sound programming. (That is, anything beyond applying slow filter
sweeps to samples. Worse than sample accurate timing essentially
rules out virtual analog synthesis, at least when it comes to
percussive sounds.)
> As far as I understood it from reading your mail it seems that you
> agree that on current machines (see filters that need to
> recalculate coefficients etc) it makes sense to use an event based
> system.
Yes, that seems to be the best compromise at the moment. Maybe that
will change some time, but I'm not sure. *If* audio rate controls and
blockless processing become state-of-the-art for native processing,
it's only because CPUs are so fast that the cost is acceptable in
relation to the gains. (Audio rate control data everywhere and
slightly simpler DSP code, I guess...)
[...]
> > Now, that is apparently impossible to implement that on some
> > platforms (poor RT scheduling), but some people using broken OSes
> > is no argument for broken API designs, IMNSHO...
>
> Ok but even if there is a jitter of a few samples it is much better
> than having an event jitter equivalent to the audio fragment size.
Sure - but it seems that on Windoze, the higher priority MIDI thread
will pretty much be quantized to audio block granularity by the
scheduler, so the results are no better than polling MIDI once per
audio block.
Anyway, that's not our problem - and it's not an issue with h/w
timestamping MIDI interfaces, since they do the job properly
regardless of OS RT performance.
> It will be impossible for the user to notice that the MIDI pitchbend
> event was scheduled a few usecs too late compared to the ideal time.
> Plus as said it will work with relatively large audio fragsizes
> too.
Yes. Or on Win32; maybe. ;-) I don't have any first hand experience
with pure Win32. The last time I messed with it, it seemed to "sort
of" work as intended, but that was on Win95 with Win16 code running
in IRQ context, and that's a different story entirely. You can't do
that on Win32 without some serious Deep Hack Mode sessions, AFAIK.
Anyway, who cares!? ;-) It works on Linux, QNX, BeOS, Mac OS X, Irix
and probably some other platforms.
The only reason I brought it up is that Win32 failing to support this
kind of timestamped real time input was seriously suggested as an
argument to drop timestamped events in GMPI. Of course, that was
overruled, as the kind of plugins GMPI is meant for are usually
driven from the host's integrated sequencer anyway, and there's no
point in having two event systems; one with timestamps and one
without.
> > > With the streamed approach we would need some scheduling of
> > > MIDI events too, thus we would probably need to create a module
> > > that waits N samples (control samples) and then emits the
> > > event. So basically we end up in a timestamped event scenario
> > > too.
> >
> > Or the usual approach; MIDI is processed once per block and
> > quantized to block boundaries...
>
> I don't like that; it might work ok for very small fragsizes, e.g.
> 32-64 samples/block, but if you go up to, let's say, 512-1024,
> timing of MIDI events will suck badly.
Right. I don't like it either, but suggested it only for completeness.
(See above, on Win32 etc; they do this for RT MIDI input, because it
hardly gets any better anyway.)
> > > Not sure we gain in performance compared to the block based
> > > processing where we apply all operations sequentially on a
> > > buffer (filters, envelopes etc) like they were ladspa modules
> > > but without calling external modules but instead "pasting"
> > > their sources in sequence without function calls etc.
> >
> > I suspect it is *always* a performance loss, except in a few
> > special cases and/or with very small nets and a good optimizing
> > "compiler".
>
> So it seems that the best compromise is to process audio in blocks
> but to perform all dsp operations relative to an output sample
> (relative to a voice) in one rush.
Sometimes... I think this depends on the architecture (# of registers,
cache behavior etc) and on the algorithm. For most simple algorithms,
it's the memory access pattern that determines how things should be
done.
[...]
> it would be faster to do
>
> for(i=0;i<256;i++) {
>     output_sample[i]=do_filter(do_amplitude_modulation(input_sample[i]));
> }
>
> right?
Maybe - if you still get a properly optimized inner loop that way. Two
optimized inner loops with a hot buffer in between is probably faster
than one that isn't properly optimized.
> While for pure audio processing the approach is quite
> straightforward, when we take envelope generators etc into account
> we must inject code that checks the current timestamp (an if()
> statement) and then modifies the right values according to the
> events pending in the queue (or autogenerated events).
It shouldn't make much of a difference if the "integration" is done
properly. In Audiality, I just wrap the DSP loop inside the event
decoding loop (the normal, efficient and obvious way of handling
timestamped events), so there's only event checking overhead per
event queue (normally one per "unit") and per block - not per sample.
Code compiled from multiple plugins should be structured the same
way, and then it doesn't matter if some of these plugins do various
forms of processing that don't fit the "once-per-sample" model.
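(A minimal sketch of that structure - hypothetical types and names, not
actual Audiality code: the event-decoding loop wraps the DSP loop, so event
checking costs once per event and once per block, not once per sample:)

#include <stddef.h>

/* hypothetical types, just to show the shape */
struct event {
    unsigned timestamp;      /* frame offset within the current block */
    float value;
    struct event *next;
};

struct unit {
    struct event *events;    /* time-sorted queue for the current block */
    float level;             /* stand-in for internal DSP state */
};

static void run_dsp(struct unit *u, float *out, unsigned from, unsigned n)
{
    unsigned i;
    for (i = 0; i < n; i++)
        out[from + i] = u->level;    /* stand-in for the real DSP work */
}

void process_block(struct unit *u, float *out, unsigned frames)
{
    unsigned now = 0;
    while (now < frames) {
        /* apply all events scheduled for this exact frame... */
        while (u->events && u->events->timestamp == now) {
            u->level = u->events->value;
            u->events = u->events->next;
        }
        /* ...then run DSP up to the next event (or the end of the block) */
        unsigned until = u->events ? u->events->timestamp : frames;
        run_dsp(u, out, now, until - now);
        now = until;
    }
}
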
[...]
> sample_ptr_and_len: pointer to a sample stored in RAM with
> associated len
>
>
> attack looping: a list of looping points:
> (position_to_jump, loop_len, number_of_repeats)
>
> release looping: same structure as above but it is used when
> the sampler module goes into release phase.
> Basically, when you release a key, if the sample has loops, then
> after the current loop reaches its end position you switch to the
> release_looping list.
This is where it goes wrong, I think. These things are not musical
events, but rather "self triggered" chains of events, closely related
to the audio data and the inner workings of the sampler. You can't
really control this from a sequencer in a useful way, because it
requires sub-sample accurate timing as well as intimate knowledge of
how the sampler handles pitch bend, and how accurate its
pitch/position tracking is. You'll most likely get small glitches
everywhere for no obvious reason if you try this approach. Lots of
fun! ;-)
Of course, you'll need some nice way of passing this stuff as
*parameters* to the sampler - but that goes for the audio data as
well. If it can't be represented as plain values and text controls in
any useful way, it's better handled as "raw data" controls; ie just
blocks of private data. (Somewhat like MIDI SysEx dumps.)
[...]
> So I'd be interested how the RAM sampler module described above
> could be made to work with only the RAMP event.
You just handle the private stuff as "raw data", and there's no
problem. :-) Controls are meant for things that can actually be
driven by a sequencer, or some other event generator or processor.
Also note that a control event is just a command that tells the plugin
to change a control to a specified value at a specified time. It's
not to be viewed as an object passed to the plugin, to be added to
some list or anything like that.
> BTW: you said RAMP with value = 0 means set a value.
No, RAMP (<value>, <duration>), where <duration> is 0. The value and
timestamp fields are always valid.
> But what do you set exactly to 0, the timestamp?
The 'duration' field. (Which, BTW, is completely ignored by controls
that support only SET operations.)
> this would not be ideal since 0 is a legitimate value.
> It would be better to use -1 or something similar.
No, -1 (or rather, whatever it is in unsigned format) is a perfectly
valid timestamp as well, at least in Audiality. It uses 16 bit
timestamps that wrap pretty frequently, which is just fine, since
events are only for the current buffer anyway.
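(Something like this, as an assumed layout loosely following the description
above - not the actual Audiality struct:)

struct ramp_event {
    unsigned short timestamp;  /* frame within the current block; may wrap */
    unsigned short duration;   /* frames to reach 'value'; 0 means plain SET */
    unsigned short index;      /* which control on the target is addressed */
    float value;               /* ramp target (immediate value if duration==0) */
};
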
> OTOH this would require an additional if() statement
> (to check if it is a regular ramp or a set statement) and it could
> possibly slow down things a bit.
This cannot be avoided anyway. You'll have to put that if() somewhere
anyway, to avoid div-by-zero and other nasty issues. Consider this
code from Audiality:
(in the envelope generator:)
        case PES_DELAY:
                duration = S2S(p->param[APP_ENV_T1]);
                target = p->param[APP_ENV_L1];
                clos->env_state = PES_ATTACK;
                break;
This is the switch from DELAY to ATTACK. 'duration' goes into the
event's duration field, and may well be 0, either because the
APP_ENV_T1 parameter is very small, or because the sample rate is
low. (S2S() is a macro that converts seconds to samples, based on the
current system sample rate.) Since RAMP(target, 0) does exactly what
we want in that case, that's just fine. We leave the special case
handling to the receiver:
(in the voice mixer:)
        case VE_IRAMP:
                if(ev->arg2)
                {
                        v->ic[ev->index].dv = ev->arg1 << RAMP_BITS;
                        v->ic[ev->index].dv -= v->ic[ev->index].v;
                        v->ic[ev->index].dv /= ev->arg2 + 1;
                }
                else
                        v->ic[ev->index].v = ev->arg1 << RAMP_BITS;
                v->ic[ev->index].aimv = ev->arg1;
                v->ic[ev->index].aimt = ev->arg2 + s;
                break;
Here we calculate our internal ramping increments - or, if the
duration argument (ev->arg2) is 0, we just grab the target value
right away.
One would think that it would be appropriate to set .dv to 0 in the
SET case, but it's irrelevant, since what happens after the RAMP
duration is undefined, and it's illegal to leave a connected control
without input.
That is, the receiver will never end up in the "where do I go now?"
situation. In the case of Audiality, the instance is destroyed at the
very moment the RAMP event stream ends. In the case of XAP, there is
always the option of disconnecting the control, and in that case, the
disconnect() call would clear .dv as required to lock the control at
a fixed value. (Or the plugin will reconfigure itself internally to
remove the control entirely, or whatever...)
> My proposed ramping approach that consists of
> value_to_be_set, delta
> does not require an if(), and if you simply want to set a value
> you set delta = 0
Delta based ramping has the error build-up issue - but <value, delta>
tuples should be ok... The <target, duration> approach has the nice
side effects of
        1) telling the plugin for how long the (possibly
           approximated) ramp must behave nicely, and
        2) eliminating the requirement for the sender to
           know the exact current value of controls.
(The latter can make life easier for senders in some cases, but in the
case of Audiality - which uses fixed point controls - it's mostly
about avoiding error build-up drift without introducing clicks.)
> But my approach has the disadvantage that if you want to do mostly
> ramping you always have to calculate value_to_be_set at each event,
> and this could become non-trivial if you do not track the values
> within the modules.
Actually, I think you'll have to keep track of what you're doing most
of the time anyway, so that's not a major one. The <target, duration>
approach doesn't have that requirement, though...
[...]
> blockless as referred to above by me (blockless = one single equation
> but processed in blocks), or blockless using another kind of
> approach? (elaborate please)
By "blockless", I mean "no blocks", or rather "blocks of exactly one
sample frame". That is, no granularity below the audio rate, since
plugins are executed once per sample frame. C-rate == A-rate, and
there's no need for timestamped events or anything like that, since
everything is sample accurate anyway.
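(A tiny sketch of that execution model - hypothetical types, not from any
real API: every unit processes exactly one sample frame per call, so control
changes can land between any two frames:)

/* hypothetical plugin type: process() handles exactly one sample frame */
struct blockless_unit {
    float (*process)(struct blockless_unit *self, float in);
    void *state;
};

/* one pass through the whole chain per sample frame */
float run_chain(struct blockless_unit *units, int n, float in)
{
    int i;
    for (i = 0; i < n; i++)
        in = units[i].process(&units[i], in);
    return in;
}
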
> > That said, something that generates C code that's passed to a
> > good optimizing compiler might shift things around a bit,
> > especially now that there are compilers that automatically
> > generate SIMD code and stuff like that.
>
> The question is indeed whether, if you do LADSPA style processing
> (applying all DSP processing in sequence), the compiler uses SIMD
> and optimization of the processing loops and is therefore faster
> than calculating the result as one single big equation at a time,
> which could possibly not take advantage of SIMD etc.
> But OTOH the blockless processing has the advantage that
> things are not moved around much in the cache.
> The output value of the first module is directly available as the
> input value of the next module without needing to move it to
> a temporary buffer or variable.
It's just that the cost of temporary buffers is very low, as long as
they all fit in the cache. Meanwhile, if you throw too many cache
abusing plugins into the same inner loop, you may end up with
something that thrashes the cache on a once-per-sample-frame basis...
[...]
> > > As said I dislike "everything is a CV" a bit because you cannot
> > > do what I proposed:
[...]
> > I disagree to some extent - but this is a very complex subject.
> > Have you followed the XAP discussions? I think we pretty much
> > concluded that you can get away with "everything is a control",
> > only one event type (RAMP, where duration == 0 means SET) and a
> > few data types. That's what I'm using internally in Audiality,
> > and I'm not seeing any problems with it.
>
> Ah you are using the concept of duration.
> Isn't it a bit redundant?
No - especially not if you consider that some plugins will only
*approximate* the linear ramps. When you do that, it becomes useful
to know where the sender intends the ramp to end, to come up with a
nice approximation that hits the start and end points dead on.
> Instead of using duration one can use
> duration-less RAMP events and just generate an event that sets
> the delta ramp value to zero when you want the ramp to stop.
Yes, but since it's not useful to "just let go" of a control anyway,
this doesn't make any difference. Each control receives a continuous
stream of structured audio rate data, and that stream must be
maintained by the sender until the control is disconnected.
Of course, there are other ways of looking at it, but I don't really
see the point. Either you're connected, or you aren't. Analog CV
outputs don't have a high impedance "free" mode, AFAIK. ;-) (Or maybe
some weird variants do - but that would still be an extra feature, ie
"soft disconnect", rather than a property of CV control signals.)
> > > Basically in my model you cannot connect everything with
> > > everything (Steve says it is bad but I don't think so) but you
> > > can connect everything with "everything that makes sense to
> > > connect to".
> >
> > Well, you *can* convert back and forth, but it ain't free... You
> > can't have everything.
>
> Ok but converters will be the exception and not the rule:
One would hope so... That should be carefully considered when deciding
which controls use what format.
> for example the MIDI mapper module
>
> see the GUI screenshot message here:
> http://sourceforge.net/mailarchive/forum.php?thread_id=2841483&forum_id=12792
>
> acts as a proxy between the MIDI Input and the RAM sampler module.
> So it makes the right port types available.
> No converters are needed. It's all done internally in the best
> possible way without needless float to int conversions,
> interpreting pointers as floats and other "everything is a CV"
> oddities ;-)
*hehe*
Well, it gets harder when those modular synth dudes show up and want
to play around with everything. ;-)
[...]
> > Yes... In XAP, we tried to forget about the "argument bundling"
> > of MIDI, and just have plain controls. We came up with a nice and
> > clean design that can do everything that MIDI can, and then some,
> > still without any multiple argument events. (Well, events *have*
> > multiple arguments, but only one value argument - the others are
> > the timestamp and various addressing info.)
>
> Hmm I did not follow the XAP discussions (I was overloaded during
> that time as usual ;-) ) but can you briefly explain how this XAP
> model would fit the model where the MIDI IN module talks to the
> MIDI mapper which in turn talks to the RAM sampler.
Well, the MIDI IN module would map everything to XAP Instrument
Control events, for starters.
        NoteOn(Ch, Pitch, Vel);
would become something like
        vid = Pitch;                    //Use Pitch for voice ID, like MIDI...
        ALLOC_VOICE(vid);               //Tell the receiver we need a voice for 'vid'
        VCTRL(vid, VELOCITY, Vel);      //Set velocity
        VCTRL(vid, PITCH, Pitch);       //Set pitch
        VCTRL(vid, VOICE, 1);           //Voice on
and
        NoteOff(Ch, Pitch);
becomes
        vid = Pitch;
        VCTRL(vid, VOICE, 0);           //Voice off
        RELEASE_VOICE(vid);             //We won't talk about 'vid' no more.
Note that RELEASE_VOICE does *not* mean the actual voice dies
instantly. That's entirely up to the receiver. What it *does* mean is
that you give up the voice ID, so you can't control the voice from
now on. If you ALLOC_VOICE(that_same_vid), it will be assigned to a
new, independent voice.
Anyway, VELOCITY, PITCH (and any others you like) are just plain
controls. They may be continuous, "voice on latched" and/or "voice off
latched", so you can emulate the same (rather restrictive) logic that
applies to MIDI NoteOn/Off parameters. That is, we use controls that
are latched at state transitions instead of explicit arguments to
some VOICE ON/OFF event.
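(The same mapping, written out as compilable C with stubbed-out senders -
all names here are placeholders, not a real XAP header:)

#include <stdio.h>

enum { VELOCITY, PITCH, VOICE };

/* stubs; a real implementation would queue timestamped events instead */
static void alloc_voice(int vid)           { printf("ALLOC_VOICE %d\n", vid); }
static void release_voice(int vid)         { printf("RELEASE_VOICE %d\n", vid); }
static void vctrl(int vid, int ctl, int v) { printf("VCTRL %d %d %d\n", vid, ctl, v); }

void note_on(int pitch, int vel)
{
    int vid = pitch;            /* use pitch as the voice ID, like MIDI */
    alloc_voice(vid);
    vctrl(vid, VELOCITY, vel);  /* latched or continuous, as the control declares */
    vctrl(vid, PITCH, pitch);
    vctrl(vid, VOICE, 1);       /* voice on */
}

void note_off(int pitch)
{
    int vid = pitch;
    vctrl(vid, VOICE, 0);       /* voice off - the receiver decides when it dies */
    release_voice(vid);         /* give up the voice ID from here on */
}
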
[...]
> > Either way, the real heavy stuff is always the DSP code. In cases
> > where it isn't, the whole plugin is usually so simple that it
> > doesn't really matter what kind of control interface you're
> > using; the DSP code fits right into the basic "standard model"
> > anyway. In such cases, an API like XAP or Audiality's internal
> > "plugin API" could provide some macros that make it all insanely
> > simple - maybe simpler than LADSPA.
>
> So you are saying that in practical terms (processing performance)
> it does not matter whether you use events or streamed control
> values?
I'm saying it doesn't matter much in terms of code complexity. (Which
is rather important - someone has to code all those kewl plugins! :-)
It *does* matter a great deal in terms of performance, though, which
is why we have to go with the slightly more (and sometimes very much
more) complicated solutions.
> I still prefer the event based system because it allows you to deal
> more easily with real time events (with sub audio block precision)
> and if you need it you can run at full sample rate.
Sure, some things actually get *simpler* with timestamped events, and
other things don't get all that complicated. After all, you can just
do the "VST style quick hack" and process all events first and then
process the audio for the full block, if you can't be arsed to do it
properly. (They actually do that in some examples that come with the
VST SDK, at least in the older versions... *heh*)
> > Anyway, need to get back to work now... :-)
>
> yeah, we unleashed those km-long mails again ... just like in the
> old times ;-) can you say infinite recursion ;-)))
*hehe* :-)
//David Olofson - Programmer, Composer, Open Source Advocate
.- The Return of Audiality! --------------------------------.
| Free/Open Source Audio Engine for use in Games or Studio. |
| RT and off-line synth. Scripting. Sample accurate timing. |
`-----------------------------------> http://audiality.org -'
--- http://olofson.net --- http://www.reologica.se ---
|
|
From: Steve H. <S.W...@ec...> - 2003-08-09 18:31:08
|
On Sat, Aug 09, 2003 at 05:29:05PM +0200, David Olofson wrote:
> However, keep in mind that what we design now will run on hardware
> that's at least twice as fast as what we have now. It's likely that
> the MIPS/memory bandwidth ratio will be worse, but you never know...
I'm always nervous about this suggestion. I will most likely still be using
the machine I'm using now, so Moore's law won't have affected me personally.
We should support machines that are current now and not try to second-guess.
> What I'm saying is basically that benchmarking for future hardware is
> pretty much gambling, and results on current hardware may not give us
> the right answer.
But at least we know they are accurate for /some/ hardware.
> Audio rate controls *are* the real answer (except for some special
> cases, perhaps; audio rate text messages, anyone? ;-), but it's still
> a bit on the expensive side on current hardware. (Filters have to
> recalculate coefficients, or at least check the input, every sample
> frame, for example.) In modular synths, it probably is the right
Not really, they /can/ recalculate every sample, they don't have to.
- Steve
|
|
From: Simon J. <sje...@bl...> - 2003-08-10 04:22:52
|
David Olofson wrote:
>>[Benno:] Now if we assume we do all blockless processing eg the
>>dsp compiler generates one giant equation for each dsp network
>>(instrument). output = func(input1,input2,....)
>>
>>Not sure we gain in performance compared to the block based
>>processing where we apply all operations sequentially on a buffer
>>(filters, envelopes etc) like they were ladspa modules but without
>>calling external modules but instead "pasting" their sources in
>>sequence without function calls etc.
>>
>I suspect it is *always* a performance loss, except in a few special
>cases and/or with very small nets and a good optimizing "compiler".
>
You certainly need an optimising compiler because the gains, if present,
would be achieved by register optimisations. I was getting better
performance from the paste-together-one-big-function approach and had
thought that it "just was" better (provided the function fits in the
cache, of course), but there's been some disagreement on the matter so
I'm going over it again.
>Audio rate controls *are* the real answer (except for some special
>cases, perhaps; audio rate text messages, anyone? ;-), but it's still
>a bit on the expensive side on current hardware. (Filters have to
>recalculate coefficients, or at least check the input, every sample
>frame, for example.)
>
Agreed, it may be very expensive to send a-rate control to certain
inputs of certain modules. OTOH that's a limitation of those particular
inputs, on those particular modules. Given that we're going to
downcompile, it would maybe be possible to deal with such inputs
"surgically" by (for example) generating code which uses k-rated
connections to just those inputs.
Simon Jenkins
(Bristol, UK)
|