Re: [Audacity-devel] Digital Studio (was libaudacity)
From: Anthony A. O. <Ant...@ep...> - 2002-06-30 17:04:41
On Fri, 28 Jun 2002 21:51:45 -0700, Augustus Saunders wrote:

>Anthony Airon Oetzmann wrote:
>>Here are my absolute favorites for implementation as soon as possible.
>>(6), the multitracking environment, with proper regions, and consequently
>>a region bin (a list with subfolders, a find function, delete region,
>>delete disk file, optimize files on disk, etc.). This is the one feature
>>of Audacity that will get the screen cleaned up and therefore more
>>manageable. Productivity will jump, as users can push their audio around
>>between tracks, extend and contract the edges, and more easily group
>>together elements such as vocals over a small number of manageable tracks.
>Keep in mind that for right now, I want to focus on what the capabilities
>need to be and let the GUI work itself out. The library should probably
>only support things that actually have an impact on the sound. For
>example, something like folders could be a fiction maintained by the UI,
>so is there a good reason to add complexity (and size) to the library to
>support them? If it turns out that the library can make it more efficient,
>then that could be a good reason (for example, applying identical
>automation to multiple tracks might be somehow more efficient if the
>library natively supports folders).

You're right, of course. I was getting ahead of myself with the GUI. When I
refer to an audio mixer, I don't mean any GUI elements, but the code that
crunches the audio data in the background. The GUI elements that control
these mixing parameters are up for grabs at a later time.

As for the region bin: Pro Tools sticks to one sample rate and bit depth.
Therefore, upon importing audio, it converts files into the audio data
folder of the project, much like Audacity does. If the bit depth and sample
rate match the project specs, the file merely gets referenced, IF it is on
a hard disk.
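As a sketch of the Pro Tools-style import rule just described (the function name and parameters are mine, purely illustrative, not anyone's actual API):

```python
# Hypothetical sketch of the import decision described above: if a file
# already matches the project's sample rate and bit depth (and lives on a
# hard disk), just reference it in place; otherwise convert a copy into the
# project's audio data folder.

def import_action(file_rate, file_bits, project_rate, project_bits,
                  on_hard_disk=True):
    """Return "reference" or "convert" for an imported file."""
    if (file_rate == project_rate and file_bits == project_bits
            and on_hard_disk):
        return "reference"
    return "convert"

print(import_action(44100, 16, 44100, 16))  # reference
print(import_action(48000, 24, 44100, 16))  # convert
```

The point of the rule is that referencing costs nothing, while conversion is paid once, at import time.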
I say this because the issue will creep up sooner or later, and we need to
decide on some basic engine parameters.

=== Does it resample audio on the fly, like Vegas? ===
Pro: flexibility.
Contra: heavier CPU load, especially at higher quality.

One might concede that the final rendering (or EXPORT, as we call it) could
use the highest-quality resampling, but anyone doing serious mixing with
plugins is not going to want to waste CPU time on resampling, especially if
it's not top-quality resampling. It doesn't max out the CPU completely, but
it does have a cost.

=== If we use resampling, should there be a function to CONVERT ALL AUDIO IN THE PROJECT TO PROJECT SPECS? ===
This could remedy the above situation, because it lets the user choose when
to spend that CPU time, saving cycles for mixing. If we choose such an
approach, this has to be in there.

=== Should audio outside project specs (sample rate and bit depth) be converted to project specs? ===
Pro: saves massive amounts of CPU cycles and streamlines the audio mixer to
handle only ONE type of audio. Note that Pro Tools uses a 48-bit mixer,
dithered to 24 bits, no matter what you do. Bit depths are easy to handle,
but sample rates are a CPU-time nightmare.
Contra: converted material takes up extra disk space, and conversion takes
time (not much with SSRC, an excellent and precise GPL'ed resampler).

=== Should material with the right sample rate but a bit depth different from project specs be converted too? ===
No: this saves disk space and costs very little CPU time (Dominic, please
correct me if I'm wrong about the ease of this). Why? If we use a 24-, 32-
or 48-bit mixer, most material will be bit-depth converted anyway. It's
only a matter of writing a byte (8-bit) or word (16-bit) into a double word
(32-bit) or 48-bit value.
Yes: this wastes more disk space and doesn't streamline the audio mixer
very much.

The sample rate decision is hard to make.
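To illustrate why the bit-depth half of the question is cheap: widening a byte or word into the mixer's 32-bit format is a single shift per sample. A minimal sketch, with hypothetical names (not Audacity code):

```python
# Widening 8- or 16-bit samples into a 32-bit mixer word is just a shift,
# so the cost is tiny compared to sample rate conversion.

def widen_to_32bit(sample: int, bits: int) -> int:
    """Place an 8- or 16-bit signed sample in the top bits of a 32-bit word."""
    return sample << (32 - bits)

# Full-scale input stays full-scale in the wider format:
print(widen_to_32bit(32767, 16))  # 32767 << 16
print(widen_to_32bit(-128, 8))    # full-scale negative 8-bit sample
```

Sample rate conversion, by contrast, needs filtering and interpolation over many samples, which is where the real CPU cost lives.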
We should ask around to find out whether a convert-upon-import engine is
the only way to go, or whether an engine that includes on-the-fly
resampling has serious applications. Nobody should make the mistake of
thinking that resampling costs little CPU time; high quality costs LOTS of
it. My opinion is that resample-on-the-fly is a good OPTION to have,
especially for projects with a low track count (region stuff, folks). On
small machines (a P133, for example), the program should notify the user to
switch to convert-upon-import for material outside project specs. The
question I put forward is whether the resampler can be efficiently switched
out of the signal path of the audio mixer or not. If not, Audacity would
need two mixer engines.

=== Where should the bit depth conversion in the mixer occur? ===
Upon entry into the engine, of course. That way you avoid all the extremely
nasty hoopla about what happens when 22kHz/16-bit meets 44kHz/16-bit on a
BUS or the main physical OUTPUTS. But there are more decisions to make.

=== Should the project's bit depth be used in the mixer? ===
Of course not. 32 bits seem destined to be used by a 32-bit processor.
Dominic, is this the most efficient? Pro Tools is DSP-based, so they have
little trouble pulling off a 48-bit mixer. This means that data is
converted to 32 bits. The time counter reaches a region of audio data to be
mixed (everything is more prone to latency on CPU-based systems, so a lot
is preloaded), converts it to 32 bits, and off it goes: either to an
output, where it gets mixed with anything else sent to that output, or to a
BUS, where it gets mixed with anything else sent to that BUS, and so on. As
you can see, we'd need one resampler per track, and we haven't even talked
about how to handle crossfades on tracks yet. The resampler would have to
be dynamic too, as some people may have material of different sample rates
on the same track. This adds a little complexity to the latency, if it is
to remain constant.
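The "convert upon entry into the engine" idea could be sketched like this (illustrative names only, not Audacity code): every track is widened to the 32-bit mixer word before summing, so busses and outputs only ever see one format.

```python
# Each track's samples are widened to the mixer's 32-bit word before being
# summed onto a bus, so the bus only ever deals with one format and mixing
# itself is plain addition.

def widen(sample: int, bits: int) -> int:
    return sample << (32 - bits)

def mix_to_bus(tracks):
    """tracks: list of (samples, bit_depth) pairs at the SAME sample rate."""
    length = max(len(samples) for samples, _ in tracks)
    bus = [0] * length
    for samples, bits in tracks:
        for i, s in enumerate(samples):
            bus[i] += widen(s, bits)  # conversion happens on entry, once
    return bus

# 16-bit and 8-bit material meeting on one bus:
bus = mix_to_bus([([100, -100], 16), ([1, 1], 8)])
```

Note the precondition in the docstring: the sketch assumes all tracks already share one sample rate, which is exactly the assumption a convert-on-entry design buys you.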
I'm troubled by my lack of understanding of the input and output formats of
plugins. Do they accept 32-bit data? Do they output 32-bit data? Perhaps we
should use 24 bits for the audio, shoved into a 32-bit double word. Nobody
I know is going to sample 32-bit audio, BUT... If I send audio over a BUS,
which then gets picked up by an AUX track (or any number of them, I might
add), this audio consists of perhaps multiple streams of attenuated 24-bit
audio in a 32-bit double word, which, having been attenuated, actually
started using the lowest eight of the 32 bits (the other way around, we'd
lose more audio information). Wouldn't it be stupid to truncate the lowest
eight bits for plugins? Would we expend extra CPU time to dither this audio
to 24 bits to send to the plugin?

>>(8) mixing capabilities. Volume and panning faders for each track. There's
>>no way around this. Destructive editing as it works in Audacity right now
>>takes too much time.
>We also want to consider surround sound. More importantly though, this is
>where all of your talk about busses, sends, inserts, etc goes.

Not necessarily. Busses are not often used for this. The panning control is
the big one here. It can be anything from 2.0, 3.0 or 5.1 to 8.1 (SDDS).
These panning controls are way more efficient than controlling six sends by
hand with little faders, for example. Of course the output can consist of a
group of busses, but that's not their primary use. I doubt many Audacity
users will want to make multiple mixes at the same time, as is sometimes
done in Pro Tools. There, AUX tracks that use the BUSSES as inputs mix down
perhaps an 8.1 mix to 2.0 or 5.1. All you have to do then is MUTE the other
AUX tracks that grab the same busses but only pass the data through to nine
outputs, and voila, you've got an alternative mix. Here are two real-world
uses for busses that are applicable for most Audacity users. First,
submixes can be done.
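To make the truncate-versus-dither question concrete, here is a sketch (my assumption of how it might look, not a proposal for the actual engine) comparing plain truncation with TPDF dithering when reducing a 32-bit mixer word to 24 bits:

```python
import random

# Truncation simply drops the low eight bits, turning the rounding error
# into distortion correlated with the signal. TPDF dither adds triangular
# noise of +/- one 24-bit LSB first, which decorrelates that error at the
# cost of a little extra CPU and a slightly raised noise floor.

LSB = 1 << 8  # one 24-bit step, expressed in the 32-bit word

def truncate_to_24(sample32: int) -> int:
    return sample32 >> 8

def dither_to_24(sample32: int, rng=random.Random(0)) -> int:
    # Sum of two uniform variables gives a triangular (TPDF) distribution.
    tpdf = rng.randint(0, LSB - 1) + rng.randint(0, LSB - 1) - (LSB - 1)
    return (sample32 + tpdf) >> 8
```

The dithered result differs from the truncated one by at most one 24-bit step per sample, but statistically it preserves the low-level information that truncation would erase.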
Send all vocals to a submix and compress that a little, or EQ a group of
four guitars with one EQ instead of four. Then, and this is the real
kicker, professional effects handling. Imagine a vocal track. You want to
have some delay on it, and some reverb. Pretty basic, you may think. The
delayed signal has to be reverbed as well, however. This is what Logic
(without the DAE engine) and Cubase still don't have. So you want to send
some of that delay to the bus that gets picked up by the AUX track with the
reverb in its insert. The alternative, if we didn't have busses but only FX
tracks that cannot send anything to any other FX track, is to pop in a
SECOND, IDENTICAL reverb after the delay. The funny thing is, this kind of
functionality is present on any two-bit analog mixing desk with more than
one AUX send. And that's just two uses. Busses are very powerful, yet very
easy to use.

>Here's a very important issue: how independent can this be? I'd like for
>the mixer to only depend on a multitrack interface and an effect plugin
>interface, and then to export a mixer interface for use by an Automation
>engine. We need to work out what these interfaces should be, and then
>people can go off and do all of these modules independently.

Only a programmer can tell you how well this modularization can work. My
guess is that modularization will cost a lot of speed.

>>(7) realtime effects and fx previews. The previews are a killer feature
>>that'll improve productivity with fx dramatically. Current fx offer a
>>progress bar that doesn't say much. That's it. The fx work, but interfaces
>>are nowhere to be seen for the most part.
>>This is the basic core audio functionality, and we might as well get the
>>audio mixer rock solid right from the start.
>I think it's more important that we get these interfaces reasonable from
>the start. Then, we can offer a stable, simple mixer while we develop a
>complete virtual console. The complete version is going to take a lot of
>time.

Of course.
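The send-and-bus routing described above can be sketched roughly like this (all names hypothetical): a send taps a track's output at some level onto a bus, and the AUX track holding the reverb processes the summed bus, so both the dry vocal and its delay reach one shared reverb.

```python
# Toy routing sketch: sends accumulate attenuated copies of track audio
# onto busses; an AUX track would then apply its insert effect to the sum.

def apply_sends(tracks, sends, n_busses):
    """tracks: {name: [float samples]} (equal lengths);
    sends: [(track_name, bus_index, send_level)]."""
    length = len(next(iter(tracks.values())))
    busses = [[0.0] * length for _ in range(n_busses)]
    for name, bus, level in sends:
        for i, s in enumerate(tracks[name]):
            busses[bus][i] += s * level  # a send is just attenuate-and-sum
    return busses

tracks = {"vocal": [1.0, 0.5], "delay": [0.25, 0.5]}
# Both the dry vocal and its delay feed bus 0, which one reverb picks up:
reverb_in = apply_sends(tracks, [("vocal", 0, 0.5), ("delay", 0, 0.5)], 1)[0]
```

Because any track (including an FX track) can feed any bus, one reverb serves every source, instead of needing a second identical reverb after the delay.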
The fx previews aren't really part of a core mixing engine; they're
stand-alone parts.

>>Personally, its flashy little buttons and clunky interface drive me nuts,
>>and nobody in any larger studio I know uses it. They use Pro Tools or
>>Logic Audio.
>Keep in mind again that I'm talking about the library, and I'm deferring
>all GUI issues.

Yep, getting ahead of myself a bit :).

>>The interface to this engine is a whole other design issue altogether,
>>though taking the most useful things from Pro Tools, Logic, Cubase and the
>>like is a good idea.
>Something that I want to be clear about is that I intend to work toward a
>super-modular system where all subsystems can be easily swapped out for
>alternate implementations. I want to provide a platform where it is easy
>for people to try different approaches to audio processing. I don't want
>to just copy existing software. I want to promote innovation and
>competition by allowing people to focus on just the parts that they care
>about. This, to me, is the whole point of making a "libaudacity." Since
>you seem keen on the virtual console, I'd like for you to be able to work
>out ideas on that without worrying about how a soft synth fits in the
>picture and vice versa, for example.

Again, I pass on the question of whether too much speed would be sacrificed
by modularizing a core audio engine. I haven't seen it done in any software
I've used so far. As for copying features, that cannot be an issue. Pro
Tools is neat in all sorts of ways, but there's no real innovation on basic
features possible for a core mixing engine right now. CPUs are just too
slow for it, and Audacity is used by a lot of lower-end users. Like I said,
it remains to be seen whether one can, for example, leave out busses at
compile time. I do think it's possible. I think good parameters would be to
limit the number of sends, busses, outputs and inputs, or on-the-fly
resampling.
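The kind of swappable-module boundary being discussed might look like this in miniature (interface and class names are my invention, not libaudacity's): the mixer calls only through an abstract effect interface, so implementations can be exchanged without touching the engine.

```python
from abc import ABC, abstractmethod

# A minimal effect-plugin boundary: the engine depends on the abstract
# Effect interface, never on a concrete implementation.

class Effect(ABC):
    @abstractmethod
    def process(self, block: list) -> list:
        """Transform one block of samples; same length out as in."""

class Gain(Effect):
    """Trivial swappable implementation: scale every sample by a factor."""
    def __init__(self, factor: float):
        self.factor = factor

    def process(self, block):
        return [s * self.factor for s in block]

def run_insert(effect: Effect, block):
    # The "engine" side: it only knows the interface.
    return effect.process(block)
```

The speed worry raised above is real in general, but a per-block (rather than per-sample) interface like this keeps the indirection cost to one virtual call per buffer.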
The mixer would have to be designed in such a way that unused sends and
busses use little or no CPU time. But what else could one modularize? This
is, after all, the workhorse of the entire program; it does the processing.
One more thing to answer is who does the disk access. Having that as a
modular component could, for example, enable an easier implementation of a
record-and-play-to/from-RAM engine. People might implement a really small
program that uses only inputs, outputs and inserts and encodes live MP3 or
something; this output module would enable that kind of functionality. The
same for input, for decoding purposes? If modularizing the audio engine
comes at too great a cost, I'd vote for having two engines. A
high-performance engine that features regions (handled by a separate disk
access module?), busses, sends and inserts is still the right step toward a
high-quality open source audio editor, IMO.

Take care
Tony