[Libvisual-devel] Time to rock the boat... ;)


Hello all (not sure how many people are on this list). I'm the eXtace author (I
did NOT originally write it, but I've maintained it and rewritten it a few
times since I took it over in 1998).

I'm interested in libvisual and have a bunch of ideas on how to greatly improve
its power and flexibility, and I'd like to see what everyone else thinks
(dennis seems to like my ideas so far).

First a list of some of the things I found that are non-optimal:

1. Plugins run in lockstep with the audio source, so if audio is written in
large blocks less often (esound, networked audio) the plugins run slowly and
piss off the end user.
2. No internal ringbuffer seems to be present in the current model, which
limits the audio block sizes handed to the vis plugins to fixed FFT sizes
(works for a lot of plugins but is pretty restrictive when you want to do some
creative things).
3. No way to compensate for lagged audio I/O (like esound, or networked audio
systems).

Solutions:  Use some of the code from eXtace (GPL) to provide a LOT more
flexibility:

eXtace's model has all input data (sound I/O) coming in and being fed into a
ringbuffer (statically sized, which is non-optimal, but see below for a more
flexible/better idea). This runs as a thread and is decoupled from the
rendering. When data arrives, it's copied into the ring (wrapping if needed),
and several shared variables that the renderer reads are updated: the last
write point into the ring and the TIME when it happened (from gettimeofday()).
This thread runs basically from now until the end of time (it uses poll() and
goes to sleep in between arrivals of data).
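
To make the feeder side concrete, here is a minimal sketch of a wrapping write
that also records where and when the write ended. This is NOT eXtace's actual
code: the Ring struct and ring_write() are names made up for illustration, and
the locking a real feeder thread would need around the shared fields is left
out.

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical ring shared between the feeder thread and the renderer. */
typedef struct {
    unsigned char *data;        /* backing storage */
    size_t size;                /* capacity in bytes */
    size_t write_pos;           /* position just past the last write */
    struct timeval last_write;  /* when the last write finished */
} Ring;

/* Copy len bytes into the ring, wrapping at the end if needed, then
 * update the two values the renderer reads. A real feeder would hold
 * a lock (or use atomics) while updating them; that is omitted here. */
void ring_write(Ring *r, const unsigned char *src, size_t len)
{
    if (len > r->size) {            /* keep only the newest bytes */
        src += len - r->size;
        len = r->size;
    }
    size_t first = r->size - r->write_pos;   /* room before the wrap */
    if (first > len)
        first = len;
    memcpy(r->data + r->write_pos, src, first);
    memcpy(r->data, src + first, len - first);   /* wrapped remainder */
    r->write_pos = (r->write_pos + len) % r->size;
    gettimeofday(&r->last_write, NULL);
}
```

The feeder thread would call ring_write() each time poll() reports new data
and then go back to sleep.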

The renderer in eXtace is called by a GTK+ timeout at a periodic rate. Each
time it runs it checks the status of the ring (the last write position, and the
time difference from "now" to when the last write occurred) to know how far
behind the feeder it should go. This is CONFIGURABLE in the GUI to allow
compensation for latency in the sound system (esound has about 329 ms of delay
by default). The data is read, crunched on IF needed (not all displays NEED an
FFT), and the display is rendered.
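
One way to read the "how far behind the feeder" calculation is sketched below.
The formula is an assumption on my part, not lifted from eXtace: it treats the
configured latency minus the time elapsed since the last write as the lag
behind the write point, converted to bytes. The function names and parameters
are hypothetical, and the ring size is assumed to be a multiple of the frame
size.

```c
#include <stddef.h>
#include <sys/time.h>

/* Milliseconds from 'then' to 'now'. */
long elapsed_ms(const struct timeval *then, const struct timeval *now)
{
    return (long)(now->tv_sec - then->tv_sec) * 1000L
         + (long)(now->tv_usec - then->tv_usec) / 1000L;
}

/* Where the renderer should start reading, as a byte offset in the ring.
 * ASSUMPTION: lag = configured latency minus time since the last write,
 * clamped to zero once the write is older than the latency. */
size_t ring_read_pos(size_t ring_size, size_t write_pos,
                     long elapsed, long latency_ms,
                     size_t bytes_per_sec, size_t frame_bytes)
{
    long lag_ms = latency_ms - elapsed;
    if (lag_ms < 0)
        lag_ms = 0;
    size_t lag = (size_t)lag_ms * bytes_per_sec / 1000;
    lag -= lag % frame_bytes;      /* stay aligned to whole sample frames */
    if (lag > ring_size)
        lag = ring_size;           /* cannot reach back past the ring */
    return (write_pos + ring_size - lag) % ring_size;
}
```

With 44100 Hz 16-bit stereo (176400 bytes/second, 4 bytes per frame), a 329 ms
latency setting and a write that landed 29 ms ago, the reader sits 300 ms, or
52920 bytes, behind the write point.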

I think that in libvisual the ringbuffer concept should be implemented in a
slightly different way, which I thought of yesterday. What I envisioned was a
doubly linked list of "segments" (think of it as a list of buffers). Each
buffer would be sized to coincide with the sample rate and type of audio data
coming in, i.e. assuming the audio input was 44100 Hz 16-bit stereo, we could
have a segment size of 100 milliseconds, which is 17640 bytes. Now since this
is a doubly linked list, the ringbuffer could be expanded/shrunk at almost
any time (simple locks required), with some simple tricks to alter the
"next/prev" pointers in the GList data structure. This is MORE complex than
one large buffer (which is what eXtace uses), but one large buffer doesn't
allow ring resize during runtime. The calcs should also be a little easier,
and there are a couple of other reasons I can't remember at the moment.
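
A rough sketch of the segment idea follows. To keep it self-contained it uses
a hand-rolled circular doubly linked list rather than GLib's GList, and every
name in it (Segment, segment_bytes, segment_insert_after, segment_remove) is
invented for illustration. The locks mentioned above are not shown.

```c
#include <stdlib.h>

/* One buffer in a circular, doubly linked ring of segments. */
typedef struct Segment {
    struct Segment *prev, *next;
    size_t size;
    unsigned char *data;
} Segment;

/* Bytes needed to hold 'ms' milliseconds of PCM audio. */
size_t segment_bytes(unsigned rate_hz, unsigned bits,
                     unsigned channels, unsigned ms)
{
    return (size_t)rate_hz * (bits / 8) * channels * ms / 1000;
}

/* Grow the ring by one segment after 'at' (or start a new ring if
 * 'at' is NULL). Only the neighbours' next/prev pointers change. */
Segment *segment_insert_after(Segment *at, size_t size)
{
    Segment *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->data = calloc(1, size);
    if (!s->data) {
        free(s);
        return NULL;
    }
    s->size = size;
    if (!at) {
        s->prev = s->next = s;
    } else {
        s->prev = at;
        s->next = at->next;
        at->next->prev = s;
        at->next = s;
    }
    return s;
}

/* Shrink the ring by one segment. Returns the following segment,
 * or NULL if that was the last one. */
Segment *segment_remove(Segment *s)
{
    Segment *next = (s->next == s) ? NULL : s->next;
    s->prev->next = s->next;
    s->next->prev = s->prev;
    free(s->data);
    free(s);
    return next;
}
```

segment_bytes(44100, 16, 2, 100) gives the 17640 bytes mentioned above, so a
one-second ring would be ten such segments.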

Currently in libvisual, plugins are synchronous with the audio input (at least
it seems that way). This has several problems: if the audio comes in in large
blocks, the plugin renders slowly and that's irritating as hell.

I think the plugin, in its init function, should request an FPS rate (which can
be honored directly or rounded up/down to a convenient number). If the core of
libvisual tied into the real-time clock (RTC) found on just about every
computer out there (Intel at least), you could have a simple thread that wakes
up each time the RTC ticks, and if it's close to the requested time for a
plugin to be fired, the plugin gets dispatched. This has several good and bad
points, e.g. what do you do when there's no RTC? What if you're running an
older kernel that limits the RTC interrupt rate to no more than 64
times/second? (All plugins could then only run at 64 fps, 32 fps, about 21.3
fps and so on, i.e. 64 divided by a whole number.) How do libraries like GTK+
provide the gtk_timeout? Perhaps something similar would work.
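
The rounding of a requested FPS to what a fixed tick rate can deliver might
look something like this. It is only a sketch of the arithmetic: the function
names are hypothetical, and it does not touch the RTC device at all.

```c
/* How many clock ticks should pass between frames so that the
 * delivered rate is as close as possible to the requested one. */
unsigned ticks_per_frame(unsigned tick_hz, double requested_fps)
{
    if (requested_fps <= 0.0)
        return 1;
    unsigned n = (unsigned)(tick_hz / requested_fps + 0.5); /* round */
    return n ? n : 1;       /* cannot go faster than the tick itself */
}

/* The rate the plugin actually gets. */
double effective_fps(unsigned tick_hz, unsigned ticks)
{
    return (double)tick_hz / ticks;
}

/* Called on every tick: is this plugin due to be dispatched? */
int frame_due(unsigned long tick_count, unsigned ticks)
{
    return tick_count % ticks == 0;
}
```

With a 64 Hz tick, a plugin asking for 30 fps would get every 2nd tick (32
fps), and one asking for 25 fps would get every 3rd tick (about 21.3 fps).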

Also, I think a plugin should be able to request what data and HOW much it
wants: things like arbitrary FFT sizes, or time-domain data block sizes (for
scopes). One cool optimization I can think of: if you have 5 plugins running
at once and they all ask for a 4096-point FFT, and they all have the same
latency amount (another benefit of the ringbuffer setup), then you ONLY need
to calculate the FFT ONCE and share it among all of them, instead of
calculating it 5 times. Similar optimizations can happen as well depending on
the number of plugins running.
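
The sharing idea could be sketched as a small per-frame cache keyed on FFT
size and latency. Everything here is hypothetical (SpectrumCache, cache_get,
and so on), and counting_stub() is only a stand-in that counts calls where a
real build would run an actual FFT over ring data.

```c
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 8

/* Computes a spectrum of 'points' bins for a given latency into 'out'. */
typedef void (*SpectrumFn)(size_t points, unsigned latency_ms, float *out);

typedef struct {
    size_t points;
    unsigned latency_ms;
    float *spectrum;            /* NULL means the slot is unused */
} CacheSlot;

typedef struct {
    CacheSlot slot[CACHE_SLOTS];
    SpectrumFn compute;
} SpectrumCache;

/* Stand-in for a real FFT: zero-fills the output and counts calls. */
int fft_calls = 0;
void counting_stub(size_t points, unsigned latency_ms, float *out)
{
    (void)latency_ms;
    memset(out, 0, points * sizeof *out);
    fft_calls++;
}

/* Drop last frame's results; call once per render tick. */
void cache_begin_frame(SpectrumCache *c)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        free(c->slot[i].spectrum);
        c->slot[i].spectrum = NULL;
    }
}

/* Return the spectrum for (points, latency), computing it only if no
 * plugin has asked for the same combination yet this frame. */
const float *cache_get(SpectrumCache *c, size_t points, unsigned latency_ms)
{
    CacheSlot *empty = NULL;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        CacheSlot *s = &c->slot[i];
        if (s->spectrum && s->points == points
                        && s->latency_ms == latency_ms)
            return s->spectrum;             /* shared result */
        if (!s->spectrum && !empty)
            empty = s;
    }
    if (!empty)
        return NULL;        /* cache full; caller must compute its own */
    empty->spectrum = malloc(points * sizeof *empty->spectrum);
    if (!empty->spectrum)
        return NULL;
    empty->points = points;
    empty->latency_ms = latency_ms;
    c->compute(points, latency_ms, empty->spectrum);
    return empty->spectrum;
}
```

Five plugins calling cache_get(&cache, 4096, 329) in the same frame would
trigger one computation and get the same pointer back; a plugin with a
different latency setting gets its own entry.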

For beat matching: I checked out a link from dennis and there's a mention of
correlation. You should definitely LOOK at eXtace's code for convolution used
for the scope synchronizer (I'm no math whiz; someone submitted that code).
The code is screaming fast, and it basically does a "pattern match" and is used
to keep the scope in sync (similar to triggering on a regular scope, but it
works with complex signals). It's described in eXtace's source code in
input_processing.c on lines 404-440 or so.
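
For anyone who hasn't seen the "pattern match" idea before, here is a naive
illustration: slide the previously displayed window over the new samples and
pick the offset where the raw cross-correlation is highest. This is NOT the
eXtace code (which is much faster); it is a brute-force sketch with a made-up
function name, and it does no normalization, so loud passages are favoured.

```c
#include <stddef.h>

/* Return the offset into buf where ref lines up best, judged by the
 * plain (unnormalized) cross-correlation sum. Returns 0 if buf is
 * shorter than ref. Cost is O(ref_len * buf_len). */
size_t best_sync_offset(const short *ref, size_t ref_len,
                        const short *buf, size_t buf_len)
{
    size_t best = 0;
    long long best_score = 0;
    int have_score = 0;

    if (buf_len < ref_len)
        return 0;
    for (size_t off = 0; off + ref_len <= buf_len; off++) {
        long long score = 0;
        for (size_t i = 0; i < ref_len; i++)
            score += (long long)ref[i] * buf[off + i];
        if (!have_score || score > best_score) {
            best_score = score;
            best = off;
            have_score = 1;
        }
    }
    return best;
}
```

A scope would start drawing at the returned offset each frame, so a repeating
waveform stays put on screen instead of sliding around.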

Any comments?  Am I out of my mind here?  I think some of these things can
REALLY make libvisual one seriously kickass setup.

Dave J. Andruczyk
