From: Matthias W. <mat...@in...> - 2002-11-15 21:22:50

On Wed, Nov 13, 2002 at 10:35:51PM -0800, Josh Green wrote:
> consist of samples, instruments, presets, zones etc, there are quite a
> lot of them. That's a lot of wasted space just for a lock. I guess I'll
> have to look into some other threading libraries, perhaps just go
> pthread or something. The ideal would be a recursive R/W lock. I've seen
> these in the pthread library, but I'm not sure if they are recursive.

AFAIK there are no r/w locks in the current linuxthreads implementation.
I read about plans to include those in connection with the NPT (native
POSIX threads).

matthias
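
For reference, the POSIX r/w lock API (which the native POSIX threads work later brought to Linux) looks roughly like the sketch below. Note that these locks are not recursive for writers, so they would not directly cover the recursive case described above; a minimal usage sketch with hypothetical reader/writer functions:

#include <pthread.h>

static pthread_rwlock_t patch_lock = PTHREAD_RWLOCK_INITIALIZER;

void inspect_patch(void)   /* hypothetical reader */
{
    pthread_rwlock_rdlock(&patch_lock);  /* many readers may hold this */
    /* ... read the patch object ... */
    pthread_rwlock_unlock(&patch_lock);
}

void modify_patch(void)    /* hypothetical writer */
{
    pthread_rwlock_wrlock(&patch_lock);  /* exclusive: blocks all readers */
    /* ... modify the patch object ... */
    pthread_rwlock_unlock(&patch_lock);
}
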
From: Paul K. <pau...@ma...> - 2002-11-15 14:15:44

Steve Harris <S.W...@ec...> wrote:
>
> Benno and I were discussing envelope generation last night, I think that
> the right way to generate an exponential envelope (I checked some synths
> outputs too and it looks like this is that way its done) is to feed a
> constant value into a LP filter with different parameters for each stage.

Yes, so the envelope level tends exponentially to a target level.
Where this gets complicated is the attack, which should have a target level
maybe 1.5 times its end level, otherwise you spend a long time at nearly full
volume waiting for the decay to start. DLS specifies the attack should be
linear not exponential, and I tend to agree with that - for short attacks it
doesn't sound any different, but for long attacks an exponential curve gets
too loud too soon.
Some softsynths now have much more complicated envelopes, with a time,
target level and curve (variable between exp/lin/log) for each stage, but
it's important to let the user set up a simple ADSR if that is all that's
needed.
> I suspect that, in general you need to calculate the amplitude of the
> envelope for each sample, to avoid stairstepping (zipper noise).
Yes, or use short linear segments, and update the envelope every 32
samples for example (64 samples is too long, and people will complain
about the resolution).
> Example code:
>
> env = env_input * ai + env * a;
May be faster with one multiplication: env += ai * (env_input - env);
If you allow real-time control of envelope times/rates, counting down the
time to the next stage can get complicated, so it might be better to
trigger the next stage when you reach a certain level. Here is some
nasty code that does it that way, so env_rate could be modulated in
real-time (but to be able to modulate the sustain_level, you would have
to make env_target a pointer).
//initialize:
env = 0.0f;
env_rate = ATTACK_RATE;
env_thresh = 1.0f;
env_target = 1.5f; //else we will never reach threshold
//per sample:
env += env_rate * (env_target - env);
if (env > env_thresh) //end of attack
{
    env_rate = DECAY_RATE;
    env_thresh = env; //move threshold out of the way
    env_target = SUSTAIN_LEVEL;
    //could set a flag so this block is skipped in future
}
//note off:
env_rate = RELEASE_RATE;
env_target = 0.0f; //kill the voice before this de-normals!
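
For the short-linear-segments variant mentioned earlier, a minimal sketch (assuming a 32-sample control period; buf is a hypothetical audio buffer, and env/env_rate/env_target are the variables from the snippet above, passed by pointer here):

#define CTRL_PERIOD 32 /* one envelope update per 32 output samples */

/* apply one control period's worth of envelope to buf */
void env_apply_block(float *buf, float *env, float env_rate, float env_target)
{
    float seg_start = *env;

    /* advance the one-pole recurrence by a whole control period */
    for (int k = 0; k < CTRL_PERIOD; k++)
        *env += env_rate * (env_target - *env);

    /* then ramp linearly across the period: one add per sample */
    float step = (*env - seg_start) / CTRL_PERIOD;
    for (int k = 0; k < CTRL_PERIOD; k++) {
        buf[k] *= seg_start;
        seg_start += step;
    }
}
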
Paul.
_____________________________
m a x i m | digital audio
http://mda-vst.com
_____________________________

From: Paul K. <pau...@ma...> - 2002-11-15 14:15:44

Benno Senoner <be...@ga...> wrote:
>
> > If each effect can be chained into the input of the next effect, one
> > routable FX send may be enough.
>
> Sorry I am no expert here, but taking the usual reverb, chorus case I
> don't think they can be chained, can they?
I think even some GM modules (Roland?) let you send some of the chorus
output into the reverb. This is what I meant... instead of the effects
being in series (which can also be useful, depending on the effects)
the first effect is sent to the output *and* has a send level into the
next effect.
One example of where series effects (with no way of routing the signal
from one effect to another) is not good enough is delay and reverb: You
hear the signal with reverb, but then the delay repeats are dry - this
sounds silly!
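
In code, that routing could look something like this per-block sketch (chorus_process() and reverb_process() are placeholder effect calls, and chorus_to_reverb is the routable send level):

void chorus_process(const float *in, float *out, int n);  /* placeholder */
void reverb_process(const float *in, float *out, int n);  /* placeholder */

/* the first effect goes to the main mix *and*, scaled by a send level,
   into the next effect's input bus */
void run_fx(float *out, float *chorus_bus, float *reverb_bus,
            float chorus_to_reverb, int n)
{
    float chorus_out[n], reverb_out[n];

    chorus_process(chorus_bus, chorus_out, n);
    for (int i = 0; i < n; i++) {
        out[i] += chorus_out[i];                            /* to main mix */
        reverb_bus[i] += chorus_to_reverb * chorus_out[i];  /* chained send */
    }

    reverb_process(reverb_bus, reverb_out, n);
    for (int i = 0; i < n; i++)
        out[i] += reverb_out[i];
}

With chorus_to_reverb at 0.0 this degenerates to the plain parallel case, so the routing stays configurable without a separate code path.
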
> I think separate send and insert FXes should be supported.
> They can be either stereo or mono (probably for insert FXes it makes
> sense to keep them mono in the cases the sample sources are mono).
...except you might have a mono sample, but the pan of each voice might
be random or track the keyboard, so unless you are applying effects to
each voice individually (something the Native Instruments Kontakt
sampler allows) inserts need stereo busses.
Paul.
_____________________________
m a x i m | digital audio
http://mda-vst.com
_____________________________

From: Benno S. <be...@ga...> - 2002-11-15 13:04:09

On Fri, 2002-11-15 at 13:31, Steve Harris wrote:
>
> I don't understand the question, the envelope shape is dynamic depending on
> the length of the appropriate section (attack, decay, etc). The only thing
> that needs to be selected is the fudge factor (4.0 ish IIRC) which
> determines epsilon for the envelope.
Ok.
>
> Well, the code for this is created on demand right? (a JIT sampler ;)
> So if there is no envelope there is no envelope code. If there is an
> envelope, then when we have reached epsilon, the envelope has (in theory)
> come to an end, we can fade to 0.0 to make sure, and then the voice has
> finished.
But I meant a voice that can be dynamically enveloped at runtime, where
the voice sucks up less CPU when no enveloping is active.
>
> BTW do we know for sure that samplers use exponential envelopes? I guess we
> need linear ones too, but they are easy to implement. We should probably
> get some recordings from samples of high freq sinewaves with envelopes.
> I think Frank N. has done this already for the S2000, Frank are you on
> the list?
Since I'm a bit CPU-power paranoid at this design stage, I wondered
whether it really makes sense to use the exponentials.
I mean: why not use only linear envelopes and "emulate" exponentials
with a few linear segments?
The big advantage of linear is that it costs us only one additional
addition, while your exponential code is more expensive.
with linear envelopes we have:

while (1) {
    output_sample(sample_pos);
    sample_pos += pitch;
    pitch += pitch_delta;
    volume += volume_delta;
}
These two _delta additions add very little to the total CPU cost but
provide us with good flexibility.
When no pitch/volume enveloping is occurring, simply set
pitch_delta and volume_delta to 0
When applying modulation, generate events that set the *_delta variables
at appropriate intervals.
Assume we want to emulate exponentials: generate events at, let's say, 1/8
of the sampling rate, and I think it is probably impossible to distinguish
the approximation from the exact curve. (We are talking of relatively "slow"
volume and pitch modulation.)
For example assume we process MIDI pitch bend events with the linear
interpolator.
It is probably easiest to keep track of the previous pitch value
and then simply interpolate linearly to the new value within a, let's say,
0.5 - 1 msec time frame. I think this would smooth out things nicely,
right?
(Steve, others, do you agree?)
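
A sketch of that smoothing (RAMP_LEN and the event hook are hypothetical; pitch and pitch_delta are the variables from the loop above):

#define RAMP_LEN 32        /* roughly 0.7 msec at 44.1kHz */

extern float pitch;        /* current pitch, from the voice loop above */
extern float pitch_delta;  /* per-sample increment, from the voice loop */

/* on a pitch bend event: ramp linearly from the current pitch to the
   new one over RAMP_LEN samples */
void on_pitch_bend(float new_pitch)
{
    pitch_delta = (new_pitch - pitch) / RAMP_LEN;
    /* a second event, queued RAMP_LEN samples later, should set
       pitch_delta back to 0 so the ramp stops at new_pitch */
}
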
>
> Yes. Though you can get away without it if you know the cutoff isn't
> changing too rapidly (eg. a slow LFO).
Ok, but assume these nice LP filter wah-wah effects.
The LFO frequency in this case is up to a few Hz, but what about the
modulation frequency (the filter coefficient change rate)? How low can
it go so that zipper noise can be avoided?
>
> > What would be a good tradeoff for achieving low-CPU-usage, zipper-noise-free
> > filter modulation? (Some kind of interpolation of precomputed
> > coefficients?)
>
> I think so. Not all filters like you doing this, but most are OK IIRC.
> I have some SVF and biquad test code around, so I can test this if you
> like. I think my LADSPA SVF uses linear interp. to smooth the coefficients.
OK, let us know.
Benno.
--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it.

From: Steve H. <S.W...@ec...> - 2002-11-15 12:32:34

On Fri, Nov 15, 2002 at 01:12:42 +0100, Benno Senoner wrote:
> Hi,
> the shapes do look nice, I was wondering about two things:
> What would the ideal coefficients be so that very fast envelopes can
> be achieved while slower ones are smoothed out so that no zipper noise
> is audible?

I don't understand the question, the envelope shape is dynamic depending on
the length of the appropriate section (attack, decay, etc). The only thing
that needs to be selected is the fudge factor (4.0 ish IIRC) which
determines epsilon for the envelope.

> BTW since in most cases (in absence of envelopes) the LP smoothing is
> not needed, it would be wise to skip that code (to save cycles) when the
> envelope reaches the final point (up to a very small epsilon since we
> are talking of exponentials). What do you suggest here?

Well, the code for this is created on demand right? (a JIT sampler ;)
So if there is no envelope there is no envelope code. If there is an
envelope, then when we have reached epsilon, the envelope has (in theory)
come to an end, we can fade to 0.0 to make sure, and then the voice has
finished.

BTW do we know for sure that samplers use exponential envelopes? I guess we
need linear ones too, but they are easy to implement. We should probably
get some recordings from samples of high freq sinewaves with envelopes.
I think Frank N. has done this already for the S2000, Frank are you on
the list?

> Regarding adding a, let's say, resonant LP filter to the sample output:
> I guess the coefficients need to be smoothed out in some way too,
> otherwise zipper noise will probably show up again.

Yes. Though you can get away without it if you know the cutoff isn't
changing too rapidly (eg. a slow LFO).

> What would be a good tradeoff for achieving low-CPU-usage, zipper-noise-free
> filter modulation? (Some kind of interpolation of precomputed
> coefficients?)

I think so. Not all filters like you doing this, but most are OK IIRC.
I have some SVF and biquad test code around, so I can test this if you
like. I think my LADSPA SVF uses linear interp. to smooth the coefficients.

- Steve

From: Benno S. <be...@ga...> - 2002-11-15 12:00:59

Hi,
the shapes do look nice, I was wondering about two things:
What would the ideal coefficients be so that very fast envelopes can
be achieved while slower ones are smoothed out so that no zipper noise
is audible?

BTW since in most cases (in absence of envelopes) the LP smoothing is
not needed, it would be wise to skip that code (to save cycles) when the
envelope reaches the final point (up to a very small epsilon since we
are talking of exponentials). What do you suggest here?

Regarding adding a, let's say, resonant LP filter to the sample output:
I guess the coefficients need to be smoothed out in some way too,
otherwise zipper noise will probably show up again.

I've found these interesting msgs on the saol-users list but I do not
have a deep understanding of the things mentioned in these mails.
http://sound.media.mit.edu/mpeg4/saol-users/0179.html
http://sound.media.mit.edu/mpeg4/saol-users/0178.html

What would be a good tradeoff for achieving low-CPU-usage, zipper-noise-free
filter modulation? (Some kind of interpolation of precomputed
coefficients?)

Benno

On Thu, 2002-11-14 at 13:28, Steve Harris wrote:
> Hi all,
>
> Benno and I were discussing envelope generation last night, I think that
> the right way to generate an exponential envelope (I checked some synths
> outputs too and it looks like this is that way its done) is to feed a
> constant value into a LP filter with different parameters for each stage.

--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it.

From: Steve H. <S.W...@ec...> - 2002-11-14 14:44:49

I'm assuming you meant to apply to the list...

On Thu, Nov 14, 2002 at 02:37:54 +0000, Nathaniel Virgo wrote:
> > For reverb and chorus I think you want them to be serial; parallel:
> >
> >      .----> chorus ---.
> >      |                |
> > -----+                +------>
> >      |                |
> >      '----> reverb ---'
> >
> > will sound odd I think.
>
> On my aging Yamaha CS1x keyboard (an XG thing with lots of "analogue-style"
> sounds and built in fx) they are in parallel on most of the presets. You can
> put them in series (kind of) but it makes the reverb take up more of the mix.

OK, well that answers that then. Sure, this can be configurable, we have to
support both anyway. And the code will be built dynamically, so it won't
really hurt speed.

> Why not allow the user to set up the routing however they want it? Perhaps
> you could simplify things a lot by letting the user send each voice to either
> a mono JACK output or a stereo pair, and do effects routing in something like
> Ardour. Or would that be inefficient/impractical/at odds with the aims of
> this project?

One of the (eventual) aims is to build optimal code paths for the effects
routing, to get the voice count as high / cpu load as low as possible. That
kind of rules out external processing, and it would be problematic anyway
as the number of active voices varies from block to block.

PS I was vaguely worried about the overhead from having to use position
independent code in the recompiled voices, but it turns out to only be a
few percent overhead on my PIIIM (which has terrible rspeed4 benchmark
performance BTW, >100 cycles for the last test). Benno, you could add
-fPIC -DPIC to the CFLAGS if you want to account for this in your
benchmarks.

- Steve

From: Steve H. <S.W...@ec...> - 2002-11-14 13:54:18

On Thu, Nov 14, 2002 at 03:52:15 +0100, Benno Senoner wrote:
> This means the benchmarks I posted in my previous mail
> (284 voices on P4 1.8GHz, 331 voices on Athlon 1.4GHz) are meant with 2
> different FX sends on a per-voice basis. With per-MIDI-channel sends the
> performance is probably around 500-600 voices on the same boxes. This
> means that there will be plenty of room for running the actual FXes and
> providing additional insert FXes like LP filters etc.

I think these benchmarks are optimistic, but for very simple voices it
may be approachable.
> > If each effect can be chained into the input of the next effect, one
> > routable FX send may be enough.
>
> Sorry I am no expert here, but taking the usual reverb, chorus case I
> don't think they can be chained, can they?
For reverb and chorus I think you want them to be serial; parallel:

     .----> chorus ---.
     |                |
-----+                +------>
     |                |
     '----> reverb ---'

will sound odd I think.
- Steve

From: Steve H. <S.W...@ec...> - 2002-11-14 13:44:28

On Thu, Nov 14, 2002 at 12:28:02 +0000, Steve Harris wrote:
> Example code:

There's a graph of the output here:
http://inanna.ecs.soton.ac.uk/~swh/foo.png

- Steve

From: Benno S. <be...@ga...> - 2002-11-14 13:41:21

Paul Kellett wrote:
> Mono FX sends / stereo FX returns is the most common configuration -
> on hardware samplers and synths, and on all but the biggest analogue
> mixers.

Ok, but what does this mean for the audible result of a panned mono signal
processed by a FX with mono sends?
Eg only the dry part gets panned while the wet part is still centered in
the middle? Is the audible result ok or does it sound bad in extreme
panning positions?
I'd prefer to use mono sends (as default, stereo will be possible too) if
that is the standard, since it will save us some CPU cycles which helps to
increase polyphony.

> But hardware synths and samplers will usually have fewer FX sends than
> they have FX busses, so you set the send level and destination for
> each source.

Yes of course. Since we use the recompiler we can create as many FX sends
per voice as we wish.
AFAIK the usual GM MIDI standard has two sends for each MIDI channel
(reverb and chorus).
The flexible nature of linuxsampler will allow arbitrary per-voice
dynamically routable FX sends, but in most cases this will not be needed:
when implementing a MIDI sampler we can simply mix all voices on the same
channel and then send the result to the FX, since all voices belonging to
the same channel share the same FX send level. This saves a lot of CPU.
This means the benchmarks I posted in my previous mail
(284 voices on P4 1.8GHz, 331 voices on Athlon 1.4GHz) are meant with 2
different FX sends on a per-voice basis. With per-MIDI-channel sends the
performance is probably around 500-600 voices on the same boxes. This
means that there will be plenty of room for running the actual FXes and
providing additional insert FXes like LP filters etc.

> If each effect can be chained into the input of the next effect, one
> routable FX send may be enough.

Sorry I am no expert here, but taking the usual reverb, chorus case I
don't think they can be chained, can they?

> But if you want to support "insert" effects like compression, these
> should be on stereo busses. So the question turns into: do we have
> one sort of effect and use stereo FX busses, or do we have separate
> send and insert FX (or just have FX sends, and insert effects can be
> applied to outputs).

I think separate send and insert FXes should be supported.
They can be either stereo or mono (probably for insert FXes it makes sense
to keep them mono in the cases where the sample sources are mono).
Can the concept of per-channel FXes, in the case of MIDI devices, be
applied to inserts too? I guess yes.
(Eg let's say on MIDI chan 1 we have a polyphonic synth sound and we want
to use an insert FX (a lowpass) to process the sound. In that case we can
simply apply the FX to the channel mix buffer, right?)

Benno

--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it.

From: Steve H. <S.W...@ec...> - 2002-11-14 12:28:09

Hi all,
Benno and I were discussing envelope generation last night, I think that
the right way to generate an exponential envelope (I checked some synths
outputs too and it looks like this is that way its done) is to feed a
constant value into a LP filter with different parameters for each stage.
I suspect that, in general you need to calculate the amplitude of the
envelope for each sample, to avoid stairstepping (zipper noise).
I expect Paul Kellett knows the right approach, so should be able to say
if we're barking up the wrong tree.
Example code:
#include <math.h>
#include <stdio.h>

#define EVENTS 5

#define ENV_NONE    0
#define ENV_ATTACK  1
#define ENV_DECAY   2
#define ENV_SUSTAIN 3
#define ENV_RELEASE 4

void lp_set_par(double time, double *a, double *ai) {
    *a = exp(-5.0 / time); // The 5.0 is a fudge factor
    *ai = 1.0 - *a;
}

int main() {
    unsigned int event_time[EVENTS] = {0, 100, 200, 400, 900};
    unsigned int event_action[EVENTS] = {ENV_ATTACK, ENV_DECAY, ENV_SUSTAIN,
                                         ENV_RELEASE, ENV_NONE};
    unsigned int i, event = 0;
    float env_input = 0.0f;
    double env = 0.0;
    double a = 0.0, ai = 1.0;
    float attack_level = 1.0f;
    float sustain_level = 0.5f;
    float release_level = 0.0f;

    for (i = 0; i < 1000; i++) {
        // stop at the ENV_NONE marker; this also avoids reading
        // event_time[EVENTS] in the lp_set_par() call below
        if (event < EVENTS - 1 && i == event_time[event]) {
            switch (event_action[event]) {
            case ENV_ATTACK:
                env_input = attack_level;
                break;
            case ENV_DECAY:
                env_input = sustain_level;
                break;
            case ENV_SUSTAIN:
                env_input = sustain_level;
                break;
            case ENV_RELEASE:
                env_input = release_level;
                break;
            }
            lp_set_par((double)(event_time[event + 1] -
                                event_time[event]), &a, &ai);
            event++;
        }
        env = env_input * ai + env * a;
        printf("%g\n", env);
    }
    return 0;
}

From: Paul K. <pau...@ma...> - 2002-11-14 11:30:32

Benno Senoner <be...@ga...> wrote:
>
> I have an important question regarding the effect sends: (since I am not
> an expert here) Are FXes in soft samplers/synths usually stereo or mono?
>
> The CPU power for two mono sends is about the same as for one single
> stereo send so I was just wondering which way we should go initially.
> (mono I guess?)
Mono FX sends / stereo FX returns is the most common configuration - on
hardware samplers and synths, and on all but the biggest analogue mixers.
But hardware synths and samplers will usually have fewer FX sends than they
have FX busses, so you set the send level and destination for each source.
If each effect can be chained into the input of the next effect, one
routable FX send may be enough.
But if you want to support "insert" effects like compression, these should
be on stereo busses. So the question turns into: do we have one sort of
effect and use stereo FX busses, or do we have separate send and insert
FX (or just have FX sends, and insert effects can be applied to outputs).
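
A rough sketch of the mono-send / stereo-return shape, per block (all names hypothetical; reverb_process() stands in for the effect):

void reverb_process(const float *in, float *ret_l, float *ret_r, int n);

/* per voice: panned dry signal plus an un-panned mono FX send */
void mix_voice(const float *voice, float vol_l, float vol_r, float fx_send,
               float *out_l, float *out_r, float *fx_bus, int n)
{
    for (int i = 0; i < n; i++) {
        out_l[i]  += vol_l * voice[i];
        out_r[i]  += vol_r * voice[i];
        fx_bus[i] += fx_send * voice[i];  /* mono send, pan ignored */
    }
}

/* once per block, after all voices: mix the stereo return into the outs */
void mix_fx_return(float *fx_bus, float *out_l, float *out_r, int n)
{
    float ret_l[n], ret_r[n];

    reverb_process(fx_bus, ret_l, ret_r, n);
    for (int i = 0; i < n; i++) {
        out_l[i] += ret_l[i];
        out_r[i] += ret_r[i];
    }
}
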
Paul.
_____________________________
m a x i m | digital audio
http://mda-vst.com
_____________________________

From: Josh G. <jg...@us...> - 2002-11-14 06:35:20

On Wed, 2002-11-13 at 13:16, Matthias Weiss wrote:
> Hi Josh!
>
> First of all, very interesting idea!
>
> On Wed, Nov 13, 2002 at 12:35:26AM -0800, Josh Green wrote:
> > The way I'm currently thinking of things is that libInstPatch could act
> > as a patch "server" and would contain the loaded patch objects.
>
> Why is it necessary that those patch servers run in their own process?
> Why not simply add this functionality to the nodes themselves?
>
> > size/name or perhaps MD5 on the parameter data). In the case of the GUI,
> > I think it needs to have direct access to the patch objects (so anything
> > it is editing should be locally available, either being served or
> > synchronized to another machine).
>
> Hm, would that mean the GUI has to keep a copy of the state data? I
> think the state data (modulation parameters, etc.) should only be kept
> by the nodes, because those are the ones who need the data always
> available.
>
> It would look something like this:
>
> GUI1 <--> libInstPatch.so/LinuxSampler1
>   ^            ^
>   |            |
>   |            +- Vintage Dreams SF2 (Master)
>   |            +- Lots of Foo.DLS (Slave)
>   |            +- Wide Load.gig
>   |
>   |
>   +-----> libInstPatch.so/LinuxSampler2
>   |            ^
>   |            |
>  ...           +--> GUI2
>
> With this scheme, libInstPatch would be a shared library that handles
> (to the outside world) the peer communication as well as the communication
> with the GUI and controls the state changes to the inside world (sample
> engine parameters). The state data would have to be kept only once per
> node, with no duplication in the patch server and GUI.

That's actually what I meant. I was using the term "server" in reference
to the patch objects being synchronized between multiple clients; I was
assuming that it would be a shared library (it is currently like this).
So, no, the GUI would not need a separate copy of the data.

> > GUI and LinuxSampler would not necessarily be communicating with the
> > same libInstPatch server, they could be on separate synchronized
> > machines.
>
> What's the advantage of this?

I think someone mentioned something about being able to edit patches on
one machine but actually sequencing it on another. That was what I was
referring to. Your diagram above is an example of that :)

> > Anyways, that's how I see it. Does this make sense? Cheers.
>
> To me the idea is great and makes a lot of sense ;-).
>
> matthias

Thanks for the encouragement. Unfortunately I'm finding that my current
thread implementation with libInstPatch is less than perfect. I'm using
the glib GStaticRecMutex to lock individual patch objects, but just
recently realized that each mutex requires 40 bytes! Since patch objects
consist of samples, instruments, presets, zones etc, there are quite a
lot of them. That's a lot of wasted space just for a lock. I guess I'll
have to look into some other threading libraries, perhaps just go
pthread or something. The ideal would be a recursive R/W lock. I've seen
these in the pthread library, but I'm not sure if they are recursive.

libInstPatch is going to need a bit of work. I have faith in the
architecture, but lots of debugging and optimization is in order :)
Cheers.

Josh Green

From: Matthias W. <mat...@in...> - 2002-11-13 21:20:02

Hi Josh!
First of all, very interesting idea!
On Wed, Nov 13, 2002 at 12:35:26AM -0800, Josh Green wrote:
> The way I'm currently thinking of things is that libInstPatch could act
> as a patch "server" and would contain the loaded patch objects.
Why is it necessary that those patch servers run in their own process?
Why not simply add this functionality to the nodes themselves?
> size/name or perhaps MD5 on the parameter data). In the case of the GUI,
> I think it needs to have direct access to the patch objects (so anything
> it is editing should be locally available, either being served or
> synchronized to another machine).
Hm, would that mean the GUI has to keep a copy of the state data? I
think the state data (modulation parameters, etc.) should only be kept
by the nodes, because those are the ones who need the data always
available.
It would look something like this:
GUI1 <--> libInstPatch.so/LinuxSampler1
  ^            ^
  |            |
  |            +- Vintage Dreams SF2 (Master)
  |            +- Lots of Foo.DLS (Slave)
  |            +- Wide Load.gig
  |
  |
  +-----> libInstPatch.so/LinuxSampler2
  |            ^
  |            |
 ...           +--> GUI2
With this scheme, libInstPatch would be a shared library that handles
(to the outside world) the peer communication as well as the communication
with the GUI and controls the state changes to the inside world (sample
engine parameters). The state data would have to be kept only once per
node, with no duplication in the patch server and GUI.
> GUI and LinuxSampler would not necessarily be communicating with the
> same libInstPatch server, they could be on separate synchronized
> machines.
What's the advantage of this?
> Anyways, that's how I see it. Does this make sense? Cheers.
To me the idea is great and makes a lot of sense ;-).
matthias

From: Steve H. <S.W...@ec...> - 2002-11-13 21:06:32

On Wed, Nov 13, 2002 at 09:46:42 +0100, Nicolas Justin wrote:
> On Wednesday 13 November 2002 21:25, Nicolas Justin wrote:
> > I can try to look at the code, and see if there is room for optimisations.
> > But I'm very new to this project, and I think there are more experienced
> > programmers than me on this list.
>
> Maybe you can look at libSIMD (http://libsimd.sf.net), this is a library
> implementing simple maths functions with SIMD instructions.

That's interesting; there only appear to be 3dnow accelerations at the
moment, but it could be useful once they get SSE done.

- Steve

From: Nicolas J. <nic...@fr...> - 2002-11-13 20:47:49

On Wednesday 13 November 2002 21:25, Nicolas Justin wrote:
> I can try to look at the code, and see if there is room for optimisations.
> But I'm very new to this project, and I think there are more experienced
> programmers than me on this list.

Maybe you can look at libSIMD (http://libsimd.sf.net), this is a library
implementing simple maths functions with SIMD instructions.

There is also a patch by Stéphane Marchesin implementing an MMX mixer and
audio converter for SDL (http://www.libsdl.org), you can find it here:
http://dea-dess-info.u-strasbg.fr/~marchesin/SDL_mmx.patch

Just my 2 cents...

--
Nicolas Justin - <nic...@fr...>

From: Steve H. <S.W...@ec...> - 2002-11-13 20:34:57

On Wed, Nov 13, 2002 at 09:25:46 +0100, Nicolas Justin wrote:
> I can try to look at the code, and see if there is room for optimisations.
> But I'm very new to this project, and I think there are more experienced
> programmers than me on this list.

I would wait until we have finalised the inner loop, it's likely to change
a lot.

- Steve

From: Nicolas J. <nic...@fr...> - 2002-11-13 20:26:25

On Wednesday 13 November 2002 18:58, Steve Harris wrote:
> > I heard the P4 heavily relies on optimal SSE2 optimizations in order to
> > deliver maximum performance and it seems that both gcc and icc do not
> > work optimally in this regard.
>
> SSE, not SSE2 IIRC. SSE2 is still only 128 bits wide, and uses 64bit floats
> so it can only go two-way.

Gcc and even icc are not really good at code vectorisation. IMHO it is a
better idea to parallelise the code manually using the SSE instructions;
you will get better performance.

I can try to look at the code, and see if there is room for optimisations.
But I'm very new to this project, and I think there are more experienced
programmers than me on this list.

--
Nicolas Justin - <nic...@fr...>

From: Steve H. <S.W...@ec...> - 2002-11-13 17:58:47

On Wed, Nov 13, 2002 at 05:47:14 +0100, Benno Senoner wrote:
> Since we will probably go all floating point (because of high precision,
> headroom and flexibility over integer) you need to be careful to
> optimize the code because as we all know x86 FPUs do suck a bit.

Right, but we can use SSE in P4's (and maybe P3's if its faster) with gcc3.
This just needs the flags I posted to l-a-d, no code changes.

> Steve H: I have added stereo mixing with volume support to better
> reflect the behaviour of a real sampler with pan support; fortunately
> the performance drop from the mono version is minimal thanks to caching.

Excellent. I thought we were wasting a lot of cycles waiting for the RAM
in the mono case.

[events and CV]
> One might say this is a waste of CPU but as Steve wrote in an earlier
> posting on this list, the rate of CV values is usually much lower (1/4 -
> 1/16) than the samplerate. This means that even if the event stream is
> very dense the added overhead is minimal.
> I think the best way to find a good compromise between flexibility
> and speed is to try out several methods and pick those with the best
> price/performance ratio.

OK, well events are more LADSPA like, which is convenient I suppose. This
is really an internal engine thing though, so we don't have to decide
upfront.

> Are FXes in soft samplers/synths usually stereo or mono?
> Since we are using recompilation this can be made flexible but I have
> noticed that FX send channels can chew up quite some CPU.
> see this:

I think on older samplers they are stereo return (to the main mix outs);
newer samplers have many more outputs, so I don't know how they handle it.
The number of send channels is equal to the number of channels in the
sample.

> P4:
> samples/sec = 12528321.035306 mono voices at 44.1kHz = 284.088912
> efficency: 144.401951 CPU cycles/sample
>
> Athlon:
> samples/sec = 14626412.219113 mono voices at 44.1kHz = 331.664676
> efficency: 95.721219 CPU cycles/sample
>
> This with both gcc 3.2 and 2.96. The P4 seems to suck quite a bit.

P4's really don't like branches from what I have heard (very long
pipelines). The Athlon is much shallower. What RAM systems did the two
machines have?

> Using the Intel C / gcc compilers with SSE optimizations did not
> provide any speedup; in some cases the performance was even worse.

Even on P4?

> I heard the P4 heavily relies on optimal SSE2 optimizations in order to
> deliver maximum performance and it seems that both gcc and icc do not
> work optimally in this regard.

SSE, not SSE2 IIRC. SSE2 is still only 128 bits wide, and uses 64bit floats
so it can only go two-way.

- Steve

From: Steve H. <S.W...@ec...> - 2002-11-13 17:44:15

On Wed, Nov 13, 2002 at 11:35:58 -0600, Richard A. Smith wrote:
> On 13 Nov 2002 17:47:14 +0100, Benno Senoner wrote:
> > The strange thing is that on most modern x86 CPUs using doubles is as
> > fast/faster than floats. That's good :-)
>
> Perhaps that's due to data bus size and the FPU size. Modern x86
> FPUs are 80-bit IIRC. The data bus is 64 bits wide so fetching a
> double or float from memory takes the same amount of cycles.

The 80bit format is mainly internal; it's used to maintain IEEE
compatibility in the 387 as I understand it. SSE does not use it.

The problem with using doubles (64bit) or long doubles (80bit) in your
code is the cache effects. You still have the same number of fp stack
registers though. See Tim G.'s early attempts with ladspa filters: it
makes no difference when that's the only thing running, but when you add
more processes it becomes much slower.

If using doubles was actually faster you may have missed the trailing f
off a constant or used, eg. sin() instead of sinf().

- Steve
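
A small illustration of that pitfall (a hypothetical gain conversion):

#include <math.h>

float gain_db_to_lin(float db)
{
    /* the 0.05 literal and pow() promote everything to double, and
       the result is converted back to float on return */
    return pow(10.0, db * 0.05);
}

float gain_db_to_lin_f(float db)
{
    /* trailing f constants and powf() keep it in single precision */
    return powf(10.0f, db * 0.05f);
}
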

From: Richard A. S. <rs...@bi...> - 2002-11-13 17:36:11

On 13 Nov 2002 17:47:14 +0100, Benno Senoner wrote:
> The strange thing is that on most modern x86 CPUs using doubles is as
> fast/faster than floats. That's good :-)

Perhaps that's due to data bus size and the FPU size. Modern x86
FPUs are 80-bit IIRC. The data bus is 64 bits wide so fetching a
double or float from memory takes the same amount of cycles.
Perhaps going from a 32-bit float to the 80-bit FPU format involves a
cast that uses more cycles than a 64-bit double to 80-bit.

--
Richard A. Smith                     Bitworks, Inc.
rs...@bi...                          479.846.5777 x104
Sr. Design Engineer                  http://www.bitworks.com

From: Benno S. <be...@ga...> - 2002-11-13 16:35:37

Hi,
during the last couple of days I performed benchmarks in order to analyze
the speed of the resampling/mixing routines which will make up the core of
the RAM sampler module.

Since we will probably go all floating point (because of high precision,
headroom and flexibility over integer) you need to be careful to
optimize the code because as we all know x86 FPUs do suck a bit.
I performed benchmarks on a Celeron, P4 and Athlon and must admit that the
Athlon will make for a damn good sampler box since it seems to have a
speedy FPU. The difference is notable especially when using cubic
interpolation: an Athlon 1400 matches the performance of a 1.8GHz P4.

Anyway if you want to play a bit with my benchmark (it's only a quick hack
to test a few routines) just download it from
http://www.linuxdj.com/benno/rspeed4.tgz

Steve H: I have added stereo mixing with volume support to better
reflect the behaviour of a real sampler with pan support; fortunately
the performance drop from the mono version is minimal thanks to caching.
The strange thing is that on most modern x86 CPUs using doubles is as
fast/faster than floats. That's good :-)

Regarding the RAM sampler module I proposed earlier: I studied some event
based stuff David Olofson proposed a long time ago, and since Steve H.
said "we will probably need both event based stuff and control values but
the control value frequency does not need to be that high", I made a few
calculations and it seems to pay off to implement the control values as
fine grained events.
One might say this is a waste of CPU but as Steve wrote in an earlier
posting on this list, the rate of CV values is usually much lower (1/4 -
1/16) than the samplerate. This means that even if the event stream is
very dense the added overhead is minimal.
I think the best way to find a good compromise between flexibility
and speed is to try out several methods and pick those with the best
price/performance ratio.

I have an important question regarding the effect sends: (since I am not
an expert here) Are FXes in soft samplers/synths usually stereo or mono?
Since we are using recompilation this can be made flexible but I have
noticed that FX send channels can chew up quite some CPU.
see this:

data of my celeron 366, cubic interpolation with looping, mono voices but
output is stereo (with pan):

no fx sends:
samples/sec = 4879341.532237 mono voices at 44.1kHz = 110.642665
efficency: 74.957245 CPU cycles/sample

one FX stereo send:
samples/sec = 4104676.981704 mono voices at 44.1kHz = 93.076576
efficency: 89.103723 CPU cycles/sample

two FX stereo sends:
samples/sec = 3508911.444682 mono voices at 44.1kHz = 79.567153
efficency: 104.232326 CPU cycles/sample

The CPU power for two mono sends is about the same as for one single
stereo send so I was just wondering which way we should go initially.
(mono I guess?)

The innermost mixing loop with 2 stereo FX sends looks like this:

sample_val = CUBIC_INTERPOLATOR;
output_sum_left[u]   += volume_left * sample_val;
output_sum_right[u]  += volume_right * sample_val;
effect_sum_left[u]   += fx_volume_left * sample_val;
effect_sum_right[u]  += fx_volume_right * sample_val;
effect2_sum_left[u]  += fx2_volume_left * sample_val;
effect2_sum_right[u] += fx2_volume_right * sample_val;

makes sense?
(output_sum_left/right is the dry component, effect_sum and effect2_sum
the FX sends)

Some other numbers I got from P4 1.8GHz vs Athlon 1400, cubic, looping
and 2 stereo FX sends:

P4:
samples/sec = 12528321.035306 mono voices at 44.1kHz = 284.088912
efficency: 144.401951 CPU cycles/sample

Athlon:
samples/sec = 14626412.219113 mono voices at 44.1kHz = 331.664676
efficency: 95.721219 CPU cycles/sample

This with both gcc 3.2 and 2.96. The P4 seems to suck quite a bit.
Using the Intel C / gcc compilers with SSE optimizations did not
provide any speedup; in some cases the performance was even worse.
I heard the P4 heavily relies on optimal SSE2 optimizations in order to
deliver maximum performance and it seems that both gcc and icc do not
work optimally in this regard.
(If I get my hands on a Visual C++ compiler on a P4 box I will try to run
it on that box to see what the performance looks like.)

Let me know your thoughts about all the issues I raised in this (boring)
mail :-)

cheers,
Benno

--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it.

From: Josh G. <jg...@us...> - 2002-11-13 08:34:44

I have been thinking about this issue for some time now. It's the primary
reason why I started making libInstPatch multi-threaded. My dream is to
have shared multi-peer patch editing (just add a touch of streaming MIDI
and you have yourself a Jam session). SwamiJam is the name I have given
this particular application of Swami. This same idea could be applied to
the GUI-->LinuxSampler problem, I believe.
Some thoughts on GUI(Swami?)/libInstPatch/LinuxSampler:
The way I'm currently thinking of things is that libInstPatch could act
as a patch "server" and would contain the loaded patch objects. Other
"servers" could be set up to synchronize to patch objects residing on
different servers. These "servers" could take advantage of locally
stored patch files (if someone already has the patch that another user
is using, use it - ensuring they are identical could be done via simple
size/name or perhaps MD5 on the parameter data). In the case of the GUI,
I think it needs to have direct access to the patch objects (so anything
it is editing should be locally available, either being served or
synchronized to another machine).
GUI <--> libInstPatch server --> LinuxSampler
              |
              |
              +- Vintage Dreams SF2 (Master)
              +- Lots of Foo.DLS (Slave)
              +- Wide Load.gig
GUI and LinuxSampler would not necessarily be communicating with the
same libInstPatch server, they could be on separate synchronized
machines.
A synchronization protocol then needs to be implemented. It would look
something like so (pseudo operations):
ADD <object-data>
REMOVE <object-id>
CHANGE <object-id> <property> <value>
Sample data should probably be handled specially to allow multiple
segments to be sent (rather than all at once). libFLAC could be used for
lossless compression of sample data.
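
A minimal sketch of how such a message could be laid out on the wire (all names hypothetical):

#include <stdint.h>

typedef enum { PATCH_ADD, PATCH_REMOVE, PATCH_CHANGE } patch_op_t;

typedef struct {
    uint32_t op;           /* a patch_op_t, fixed width for the wire */
    uint32_t object_id;    /* which patch object this refers to */
    char     property[32]; /* property name, used by PATCH_CHANGE */
    uint32_t value_len;    /* length of the payload that follows */
    /* followed by value_len bytes of serialized value or object data */
} patch_msg_t;
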
The question then becomes how LinuxSampler will talk to its local patch
server. LinuxSampler will only really care about presets that are
currently active (i.e. selected on a MIDI channel). I think it would be
too much of a performance bottleneck for LinuxSampler to query the patch
server directly (since the server needs to take into account multiple
peer access, and therefore locking of objects).
Each synth primitive (envelope, LFO, filter, etc) would have its own
internal state data as well as a copy of the parameters from the object
system. Updates to the object system could be queued and synchronized to
LinuxSampler local parameters when they are not being accessed by the
synth.
Anyways, that's how I see it. Does this make sense? Cheers.
Josh Green

From: Matthias W. <mat...@in...> - 2002-11-12 20:16:40

On Mon, Nov 11, 2002 at 11:53:02PM +0000, Steve Harris wrote:
> On Mon, Nov 11, 2002 at 10:20:06 +0100, Matthias Weiss wrote:
> > I think linuxsampler should also have the ability to create a sample
> > instrument/set. Therefore it will be necessary to edit wave files, set
> > loop points etc. Now the first problem that occurs to me: how do I
> > generate the waveview of the sample data when the samples are on one
> > machine and the GUI on the other? Should I pregenerate the sample view
> > data on the plugin machine and send it to the GUI machine? And if I edit
> > the sample, should the edit commands be sent over the net to the plugin
> > machine, the plugin calculates the result, obtains the new sample view
> > data and sends it back?
>
> I would say that you only send graphical information to the GUI, and it
> sends instructions back to the engine, which executes them.

That's what I called "sample view data".

> The amount of data in a visible lump of waveform is actually pretty low
> if you think about it, you can't see more than a couple of thousand samples
> at once.

I agree, but it seems to me that the effort to create a remote GUI is
considerable. Running it remote via network might also have an impact on
the latency side because of the interrupts generated by the NIC; of
course, when running it locally, this wouldn't be a problem.
Further, as I outlined in my previous mail, it probably won't be possible
to integrate an existing wave file editor in the sampler app, which I'd
consider a true drawback.

matthias

From: Steve H. <S.W...@ec...> - 2002-11-11 23:53:07

On Mon, Nov 11, 2002 at 10:20:06 +0100, Matthias Weiss wrote:
> I think linuxsampler should also have the ability to create a sample
> instrument/set. Therefore it will be necessary to edit wave files, set
> loop points etc. Now the first problem that occurs to me: how do I
> generate the waveview of the sample data when the samples are on one
> machine and the GUI on the other? Should I pregenerate the sample view
> data on the plugin machine and send it to the GUI machine? And if I edit
> the sample, should the edit commands be sent over the net to the plugin
> machine, the plugin calculates the result, obtains the new sample view
> data and sends it back?

I would say that you only send graphical information to the GUI, and it
sends instructions back to the engine, which executes them. This will
reduce the bandwidth over the pipe, and reduce the locking problems etc.

The amount of data in a visible lump of waveform is actually pretty low
if you think about it, you can't see more than a couple of thousand samples
at once.

- Steve
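
A sketch of the kind of reduction the engine could do before sending view data (hypothetical function: one min/max pair per screen column):

#include <stddef.h>

/* reduce n samples to one min/max pair per pixel column for display
   (assumes n >= width; any remainder samples are simply ignored) */
void make_wave_view(const float *buf, size_t n, int width,
                    float *mins, float *maxs)
{
    size_t per_px = n / width;

    for (int x = 0; x < width; x++) {
        const float *seg = buf + (size_t)x * per_px;
        float lo = seg[0], hi = seg[0];

        for (size_t i = 1; i < per_px; i++) {
            if (seg[i] < lo) lo = seg[i];
            if (seg[i] > hi) hi = seg[i];
        }
        mins[x] = lo;
        maxs[x] = hi;
    }
}

For a couple of thousand visible samples this is only a few kilobytes per refresh, which supports the point about the bandwidth being low.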