From: Benno S. <be...@ga...> - 2002-11-07 16:04:01
Hi, (please read carefully, long mail :-) )

I'd like you to comment on the design of a simple RAM sample playback module that can later be incorporated into the signal path compiler. Of course many other modules (FXes, modulators etc.) will be required for a real sample playback engine, but the sampler module is fundamental here and it is very important that it is well designed, efficient and provides good audio quality. At a later stage we can introduce the disk-based sample playback module, which will act in a similar way to the RAM version but with some limitations (e.g. loop point modulation will not be possible).

Now to my RAM sample module proposal; first see this simple diagram:
http://linuxsampler.sourceforge.net/images/ramsamplemodule1.png

Basically it contains no MIDI or keymapping capabilities, because these will be delegated to other modules. The module allows note start/stop triggering and processing of looping information via lists of looping segments, so that you can loop the sample in very flexible ways. Being RAM based, one could even modulate the loop points without any problem.

Regarding the note on/off triggers, I propose using timestamped events, which allow sample-accurate triggering. Without timestamped events, when using not-so-small buffer sizes, note starts/stops would get quantized to buffer boundaries, introducing unacceptable timing artifacts.

Where I am not so sure about using timestamped events is when modulating the pitch. The problem is that we need to handle both real-time pitch messages, like those generated by the pitch-bend MIDI controller, and at the same time allow the pitch of the sampler module to be modulated by elements like LFOs, envelopes etc. In the latter case it is perhaps more efficient to provide a continuous stream of control values, one for each sample. (Steve, do CV models work like that?) Or perhaps both models can be used?
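To make the timestamped-event idea concrete, here is a minimal C sketch of how a fragment could be split at event timestamps so that note on/off takes effect on the exact frame rather than at the buffer boundary. All names here (Event, render_fragment, the gain-only toy voice) are illustrative assumptions, not an existing LinuxSampler API:

```c
#include <stddef.h>

typedef enum { EV_NOTE_ON, EV_NOTE_OFF } EventType;

typedef struct {
    EventType type;
    size_t    frame;   /* timestamp: offset within the current fragment */
} Event;

/* Toy voice state: just a gain toggled by note on/off. */
static float gain = 0.0f;

static void handle_event(const Event *e)
{
    gain = (e->type == EV_NOTE_ON) ? 1.0f : 0.0f;
}

/* Render one fragment, splitting it at event timestamps so note
   starts/stops land on the exact frame, not on buffer boundaries.
   Events are assumed sorted by frame, with frame <= nframes. */
void render_fragment(float *out, size_t nframes,
                     const Event *ev, size_t nev)
{
    size_t pos = 0, e = 0;
    while (pos < nframes) {
        /* render up to the next event (or to the fragment end) */
        size_t until = (e < nev) ? ev[e].frame : nframes;
        for (; pos < until; pos++)
            out[pos] = gain;      /* real code would mix voices here */
        if (e < nev)
            handle_event(&ev[e++]);  /* apply note on/off exactly here */
    }
}
```

The per-fragment overhead stays small because the event list is typically short; the inner loop only breaks at event frames, not on every sample.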
(A timestamped event could supply an array of control values: e.g. with bufsize = 64 samples, to modulate the pitch continuously we simply generate an event at the beginning of each fragment that supplies 64 control values (floats).) This is a very important issue and I'd like the experts out here to give us the right advice.

Another important issue is how to process looping information efficiently:

    ----|--------------|--------------|----
        |              |              |
      start      playback ptr        end

Basically we need to figure out when the playback ptr goes past the loop end point and reset it to the loop start position. Assume that the playback ptr gets increased by the pitch value during each iteration of the audio rendering loop (pitch = 1.0: sample played at nominal speed; pitch = 2.0: sample played one octave higher; etc.). If the pitch remains constant for the entire audio fragment, we can figure out in advance at which iteration the playback ptr needs to be reset to the loop start position. That way we save CPU time and can avoid the statement

  if (sample_ptr >= loop_end_pos) reset_sample_ptr();

within the audio rendering loop.

When the pitch only gets changed by MIDI pitch bend events, the above event-based model works well, since the pitch remains constant between two events. The problems arise when we let the pitch get modulated with single-sample resolution by other modules like LFOs and envelopes. Generating an event for each sample is too heavy in terms of CPU cycles, and since external modules can modulate the pitch in an almost arbitrary way, it becomes hard to estimate when the sample playback ptr needs to be reset to the loop start position.

I see 3 solutions for this problem (I hope that you guys can come up with something more efficient if it exists):

a) preserve the if (sample_ptr >= loop_end_pos) ...
statement within the audio rendering loop: waste a bit of CPU, but allow arbitrary pitch modulation, regardless of whether it is event based or driven by continuous values.

b) limit the upward pitch modulation to, let's say, +5 octaves above the root note (max pitch = 32). That way you can estimate when you need to start checking whether the loop end point was reached: as long as (loop_end_pos - playback_pos) > max_pitch * samples_per_fragment, the playback ptr cannot reach the loop end within the current fragment, so the if() check can be skipped entirely. With max pitch = 32 you waste CPU time in the sense that you need to perform the if() check for up to 32 * samples_per_fragment sample positions. This is not that much, since when running in real time samples_per_fragment can be as low as 32 or 64, thus 64 * 32 = 2048: you perform the if() check 2048 times out of possibly hundreds of thousands (assuming each sample is around 100k frames long). This means the CPU waste is only a few percent, while still allowing arbitrary modulation with an upward limit (the +5 octave cap on pitch-up modulation).

c) allow only linear ramping between two pitch events, e.g. at each iteration you do:

  playback_ptr += pitch;
  pitch += delta_pitch;

Complex pitch modulation would be emulated through many linear ramps, by sending pitch events. The linear behaviour of the pitch lets you easily calculate the position where you need to reset the playback pointer to the loop_start position.

So what do you think about a), b) and c)? Personally I prefer a) or b); if a) does not waste that much CPU I'd like to use this method, since it allows flexible pitch modulation, but the impact will probably not be negligible. Does a d) solution that is more efficient than the above ones exist? Your thoughts and comments please.

Below I'm responding to other issues raised in the last messages, in order to avoid spamming the list too much:

Steve, Josh: Regarding the IIWU synth, I tried it today in conjunction with the large Fluid soundfont:
http://inanna.ecs.soton.ac.uk/~swh/fluid-unpacked/
IIWU/Fluid sounds quite nice, but it seems to be quite CPU heavy.
I took a look at the voice generation routines and it is all pretty much "hard coded"; plus it uses integer math (integer/fractional parts etc.), which is not as efficient as you might think. I tested it on my dual Celeron 366, and when playing MIDI files it often cannot keep up because the CPU load goes to 100%. The same MIDI file played in Timidity on the same box works OK without dropouts. I do not want to criticize IIWU here; I think the authors have done quite nice work, but I don't see it as a suitable base for our "sampler construction kit", or, like Steve H. said, "maybe the question should be whether it's easier to add disk streaming to FluidSynth". I know some of you want quick results, or say "if we set the target too high we will not reach it and developers might lose interest", but I think the open source world still lacks a very well thought out, flexible and efficient sampling engine, and this takes some time. Sure, we can learn a lot from Fluid, perhaps turning it into a SoundFont playback module for LinuxSampler, but I do not envision LinuxSampler as an extension of Fluid.

Phil K.: Regarding the GUI, socket and DMIDI issues: as some of you said, it is better to separate the GUI and the (MIDI) real-time control sockets. The GUI can easily use the real-time socket to issue MIDI commands etc. I think we should provide an intermediate layer for handling these real-time messages, so that one can easily support multiple backends like DMIDI, alsa-seq, raw MIDI, etc.

Alex Klein: Hi, welcome on board. If you have good experience with Windows audio software samplers, in particular GigaStudio, this is ideal, since you can do side-by-side comparisons, suggest improvements, check performance, etc.
(Especially since you said you have lots of spare time in the next months :-) )

Regarding VST support in LinuxSampler: according to this:
http://eca.cx/lad/2002/11/0109.html
VST for Linux would need some modifications in the headers, plus asking Steinberg for permission to redistribute the result. That given, you could quite easily port to Linux the DSP processing part of VST plugins where the source is available. The GUI is another issue and would probably require a complete rewrite, unless the author has used some cross-platform toolkit like Qt etc. But as you know, the native plugin API for Linux is LADSPA, and we will of course support it in LinuxSampler (mainly for FX processing).

Regarding the JACK issues that Matthias W. raised: I'm no JACK expert, but I hope that JACK supports running a JACK client directly in its own process space, as if it were a plugin. This would save unnecessary context switches, since there would be only one SCHED_FIFO process running. (Do the JACK ALSA I/O modules work that way?)

cheers,
Benno

--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it.