Re: [Audacity-devel] Digital Studio (was libaudacity)
From: Anthony A. O. <Ant...@ep...> - 2002-06-30 17:04:41
On Fri, 28 Jun 2002 21:51:45 -0700, Augustus Saunders wrote:

>Anthony Airon Oetzmann wrote:
>>Here are my absolute favorites for implementation as soon as possible.
>>(6), the multitracking environment, with proper regions, and consequently
>>a region bin (a list with subfolders, a find function, delete region,
>>delete disk file, optimize files on disk, etc.). This is the one feature
>>of Audacity that will get the screen cleaned up and therefore more
>>manageable. Productivity will jump, as users can push their audio around
>>between tracks, extend and contract the edges, and more easily group
>>together elements such as vocals over a small number of manageable tracks.
>Keep in mind that for right now, I want to focus on what the capabilities
>need to be and let the GUI work itself out. The library should probably
>only support things that actually have an impact on the sound. For
>example, something like folders could be a fiction maintained by the UI,
>so is there a good reason to add complexity (and size) to the library to
>support them? If it turns out that the library can make it more efficient,
>then that could be a good reason (for example, applying identical
>automation to multiple tracks might be somehow more efficient if the
>library natively supports folders).

You're right, of course. I was getting ahead of myself with the GUI. When I
refer to an audio mixer, I don't mean any GUI elements, but the code that
crunches the audio data in the background. The GUI elements that control
these mixing parameters are up for grabs at a later time.

As for the region bin: Pro Tools sticks to one sample rate and bit depth.
Therefore, upon importing audio, it converts files into the audio data
folder of the project, much like Audacity does. If the bit depth and sample
rate match the project specs, the file merely gets referenced, IF it is on
a hard disk.
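As a sketch of the Pro Tools-style import rule just described (the function name and parameters are mine, purely illustrative, not anyone's actual API):

```python
# Hypothetical sketch of the import decision described above: if a file
# already matches the project's sample rate and bit depth (and lives on a
# hard disk), just reference it in place; otherwise convert a copy into the
# project's audio data folder.

def import_action(file_rate, file_bits, project_rate, project_bits,
                  on_hard_disk=True):
    """Return "reference" or "convert" for an imported file."""
    if (file_rate == project_rate and file_bits == project_bits
            and on_hard_disk):
        return "reference"
    return "convert"

print(import_action(44100, 16, 44100, 16))  # reference
print(import_action(48000, 24, 44100, 16))  # convert
```

The point of the rule is that referencing costs nothing, while conversion is paid once, at import time.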
I say this because the issue will creep up sooner or later, and we need to
decide on some basic engine parameters.

=== Does it resample audio on the fly, like Vegas? ===
Pro: flexibility.
Contra: heavier CPU load, especially at higher quality.

One might concede that the final rendering (or EXPORT, as we call it) could
use the highest-quality resampling, but anyone doing serious mixing with
plugins is not going to want to waste CPU time on resampling, especially if
it's not top-quality resampling. It doesn't max out the CPU completely, but
it does have a cost.

=== If we use resampling, should there be a function to CONVERT ALL AUDIO IN THE PROJECT TO PROJECT SPECS? ===
This could remedy the above situation, because it lets the user choose when
to spend that CPU time, saving cycles for mixing. If we choose such an
approach, this has to be in there.

=== Should audio outside project specs (sample rate and bit depth) be converted to project specs? ===
Pro: saves massive amounts of CPU cycles and streamlines the audio mixer to
handle only ONE type of audio. Note that Pro Tools uses a 48-bit mixer,
dithered to 24 bits, no matter what you do. Bit depths are easy to handle,
but sample rates are a CPU-time nightmare.
Contra: converted material takes up extra disk space, and conversion takes
time (not much with SSRC, an excellent and precise GPL'ed resampler).

=== Should material with the right sample rate but a bit depth different from project specs be converted too? ===
No: this saves disk space and costs very little CPU time (Dominic, please
correct me if I'm wrong about the ease of this). Why? If we use a 24-, 32-
or 48-bit mixer, most material will be bit-depth converted anyway. It's
only a matter of writing a byte (8-bit) or word (16-bit) into a double word
(32-bit) or 48-bit value.
Yes: this wastes more disk space and doesn't streamline the audio mixer
very much.

The sample rate decision is hard to make.
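To illustrate why the bit-depth half of the question is cheap: widening a byte or word into the mixer's 32-bit format is a single shift per sample. A minimal sketch, with hypothetical names (not Audacity code):

```python
# Widening 8- or 16-bit samples into a 32-bit mixer word is just a shift,
# so the cost is tiny compared to sample rate conversion.

def widen_to_32bit(sample: int, bits: int) -> int:
    """Place an 8- or 16-bit signed sample in the top bits of a 32-bit word."""
    return sample << (32 - bits)

# Full-scale input stays full-scale in the wider format:
print(widen_to_32bit(32767, 16))  # 32767 << 16
print(widen_to_32bit(-128, 8))    # full-scale negative 8-bit sample
```

Sample rate conversion, by contrast, needs filtering and interpolation over many samples, which is where the real CPU cost lives.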
We should ask around to find out whether a convert-upon-import engine is
the only way to go, or whether an engine that includes on-the-fly
resampling has serious applications. Nobody should make the mistake of
thinking that resampling costs little CPU time; high quality costs LOTS of
it. My opinion is that resample-on-the-fly is a good OPTION to have,
especially for projects with a low track count (region stuff, folks). On
small machines (a P133, for example), the program should notify the user to
switch to convert-upon-import for material outside project specs. The
question I put forward is whether the resampler can be efficiently switched
out of the signal path of the audio mixer or not. If not, Audacity would
need two mixer engines.

=== Where should the bit depth conversion in the mixer occur? ===
Upon entry into the engine, of course. That way you avoid all the extremely
nasty hoopla about what happens when 22kHz/16-bit meets 44kHz/16-bit on a
BUS or the main physical OUTPUTS. But there are more decisions to make.

=== Should the project's bit depth be used in the mixer? ===
Of course not. 32 bits seem destined to be used by a 32-bit processor.
Dominic, is this the most efficient? Pro Tools is DSP-based, so they have
little trouble pulling off a 48-bit mixer. This means that data is
converted to 32 bits. The time counter reaches a region of audio data to be
mixed (everything is more prone to latency on CPU-based systems, so a lot
is preloaded), converts it to 32 bits, and off it goes: either to an
output, where it gets mixed with anything else sent to that output, or to a
BUS, where it gets mixed with anything else sent to that BUS, and so on. As
you can see, we'd need one resampler per track, and we haven't even talked
about how to handle crossfades on tracks yet. The resampler would have to
be dynamic too, as some people may have material of different sample rates
on the same track. This adds a little complexity to the latency, if it is
to remain constant.
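The "convert upon entry into the engine" idea could be sketched like this (illustrative names only, not Audacity code): every track is widened to the 32-bit mixer word before summing, so busses and outputs only ever see one format.

```python
# Each track's samples are widened to the mixer's 32-bit word before being
# summed onto a bus, so the bus only ever deals with one format and mixing
# itself is plain addition.

def widen(sample: int, bits: int) -> int:
    return sample << (32 - bits)

def mix_to_bus(tracks):
    """tracks: list of (samples, bit_depth) pairs at the SAME sample rate."""
    length = max(len(samples) for samples, _ in tracks)
    bus = [0] * length
    for samples, bits in tracks:
        for i, s in enumerate(samples):
            bus[i] += widen(s, bits)  # conversion happens on entry, once
    return bus

# 16-bit and 8-bit material meeting on one bus:
bus = mix_to_bus([([100, -100], 16), ([1, 1], 8)])
```

Note the precondition in the docstring: the sketch assumes all tracks already share one sample rate, which is exactly the assumption a convert-on-entry design buys you.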
I'm troubled by my lack of understanding of the input and output formats of
plugins. Do they accept 32-bit data? Do they output 32-bit data? Perhaps we
should use 24 bits for the audio, shoved into a 32-bit double word. Nobody
I know is going to sample 32-bit audio, BUT... If I send audio over a BUS,
which then gets picked up by an AUX track (or any number of them, I might
add), this audio consists of perhaps multiple streams of attenuated 24-bit
audio in a 32-bit double word, which, having been attenuated, actually
started using the lowest eight of the 32 bits (the other way around, we'd
lose more audio information). Wouldn't it be stupid to truncate the lowest
eight bits for plugins? Would we expend extra CPU time to dither this audio
to 24 bits to send to the plugin?

>>(8) mixing capabilities. Volume and panning faders for each track. There's
>>no way around this. Destructive editing as it works in Audacity right now
>>takes too much time.
>We also want to consider surround sound. More importantly though, this is
>where all of your talk about busses, sends, inserts, etc goes.

Not necessarily. Busses are not often used for this. The panning control is
the big one here. It can be anything from 2.0, 3.0 or 5.1 to 8.1 (SDDS).
These panning controls are way more efficient than controlling six sends by
hand with little faders, for example. Of course the output can consist of a
group of busses, but that's not their primary use. I doubt many Audacity
users will want to make multiple mixes at the same time, as is sometimes
done in Pro Tools. There, AUX tracks that use the BUSSES as inputs mix down
perhaps an 8.1 mix to 2.0 or 5.1. All you have to do then is MUTE the other
AUX tracks that grab the same busses but only pass the data through to nine
outputs, and voila, you've got an alternative mix. Here are two real-world
uses for busses that are applicable for most Audacity users. First,
submixes can be done.
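To make the truncate-versus-dither question concrete, here is a sketch (my assumption of how it might look, not a proposal for the actual engine) comparing plain truncation with TPDF dithering when reducing a 32-bit mixer word to 24 bits:

```python
import random

# Truncation simply drops the low eight bits, turning the rounding error
# into distortion correlated with the signal. TPDF dither adds triangular
# noise of +/- one 24-bit LSB first, which decorrelates that error at the
# cost of a little extra CPU and a slightly raised noise floor.

LSB = 1 << 8  # one 24-bit step, expressed in the 32-bit word

def truncate_to_24(sample32: int) -> int:
    return sample32 >> 8

def dither_to_24(sample32: int, rng=random.Random(0)) -> int:
    # Sum of two uniform variables gives a triangular (TPDF) distribution.
    tpdf = rng.randint(0, LSB - 1) + rng.randint(0, LSB - 1) - (LSB - 1)
    return (sample32 + tpdf) >> 8
```

The dithered result differs from the truncated one by at most one 24-bit step per sample, but statistically it preserves the low-level information that truncation would erase.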
Send all vocals to a submix and compress that a little, or EQ a group of
four guitars with one EQ instead of four. Then, and this is the real
kicker, professional effects handling. Imagine a vocal track. You want to
have some delay on it, and some reverb. Pretty basic, you may think. The
delayed signal has to be reverbed as well, however. This is what Logic
(without the DAE engine) and Cubase still don't have. So you want to send
some of that delay to the bus that gets picked up by the AUX track with the
reverb in its insert. The alternative, if we didn't have busses but only FX
tracks that cannot send anything to any other FX track, is to pop in a
SECOND, IDENTICAL reverb after the delay. The funny thing is, this kind of
functionality is present on any two-bit analog mixing desk with more than
one AUX send. And that's just two uses. Busses are very powerful, yet very
easy to use.

>Here's a very important issue: how independent can this be? I'd like for
>the mixer to only depend on a multitrack interface and an effect plugin
>interface, and then to export a mixer interface for use by an Automation
>engine. We need to work out what these interfaces should be, and then
>people can go off and do all of these modules independently.

Only a programmer can tell you how well this modularization can work. My
guess is that modularization will cost a lot of speed.

>>(7) realtime effects and fx previews. The previews are a killer feature
>>that'll improve productivity with fx dramatically. Current fx offer a
>>progress bar that doesn't say much. That's it. The fx work, but interfaces
>>are nowhere to be seen for the most part.
>>This is the basic core audio functionality, and we might as well get the
>>audio mixer rock solid right from the start.
>I think it's more important that we get these interfaces reasonable from
>the start. Then, we can offer a stable, simple mixer while we develop a
>complete virtual console. The complete version is going to take a lot of
>time.

Of course.
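The send-and-bus routing described above can be sketched roughly like this (all names hypothetical): a send taps a track's output at some level onto a bus, and the AUX track holding the reverb processes the summed bus, so both the dry vocal and its delay reach one shared reverb.

```python
# Toy routing sketch: sends accumulate attenuated copies of track audio
# onto busses; an AUX track would then apply its insert effect to the sum.

def apply_sends(tracks, sends, n_busses):
    """tracks: {name: [float samples]} (equal lengths);
    sends: [(track_name, bus_index, send_level)]."""
    length = len(next(iter(tracks.values())))
    busses = [[0.0] * length for _ in range(n_busses)]
    for name, bus, level in sends:
        for i, s in enumerate(tracks[name]):
            busses[bus][i] += s * level  # a send is just attenuate-and-sum
    return busses

tracks = {"vocal": [1.0, 0.5], "delay": [0.25, 0.5]}
# Both the dry vocal and its delay feed bus 0, which one reverb picks up:
reverb_in = apply_sends(tracks, [("vocal", 0, 0.5), ("delay", 0, 0.5)], 1)[0]
```

Because any track (including an FX track) can feed any bus, one reverb serves every source, instead of needing a second identical reverb after the delay.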
The fx previews aren't really part of a core mixing engine; they're
stand-alone parts.

>>Personally, its flashy little buttons and clunky interface drive me nuts,
>>and nobody in any larger studio I know uses it. They use Pro Tools or
>>Logic Audio.
>Keep in mind again that I'm talking about the library, and I'm deferring
>all GUI issues.

Yep, getting ahead of myself a bit :).

>>The interface to this engine is a whole other design issue altogether,
>>though taking the most useful things from Pro Tools, Logic, Cubase and the
>>like is a good idea.
>Something that I want to be clear about is that I intend to work toward a
>super-modular system where all subsystems can be easily swapped out for
>alternate implementations. I want to provide a platform where it is easy
>for people to try different approaches to audio processing. I don't want
>to just copy existing software. I want to promote innovation and
>competition by allowing people to focus on just the parts that they care
>about. This, to me, is the whole point of making a "libaudacity." Since
>you seem keen on the virtual console, I'd like for you to be able to work
>out ideas on that without worrying about how a soft synth fits in the
>picture and vice versa, for example.

Again, I pass on the question of whether too much speed would be sacrificed
by modularizing a core audio engine. I haven't seen it done in any software
I've used so far. As for copying features, that cannot be an issue. Pro
Tools is neat in all sorts of ways, but there's no real innovation on basic
features possible for a core mixing engine right now. CPUs are just too
slow for it, and Audacity is used by a lot of lower-end users. Like I said,
it remains to be seen whether one can, for example, leave out busses at
compile time. I do think it's possible. I think good parameters would be to
limit the number of sends, busses, outputs and inputs, or on-the-fly
resampling.
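The kind of swappable-module boundary being discussed might look like this in miniature (interface and class names are my invention, not libaudacity's): the mixer calls only through an abstract effect interface, so implementations can be exchanged without touching the engine.

```python
from abc import ABC, abstractmethod

# A minimal effect-plugin boundary: the engine depends on the abstract
# Effect interface, never on a concrete implementation.

class Effect(ABC):
    @abstractmethod
    def process(self, block: list) -> list:
        """Transform one block of samples; same length out as in."""

class Gain(Effect):
    """Trivial swappable implementation: scale every sample by a factor."""
    def __init__(self, factor: float):
        self.factor = factor

    def process(self, block):
        return [s * self.factor for s in block]

def run_insert(effect: Effect, block):
    # The "engine" side: it only knows the interface.
    return effect.process(block)
```

The speed worry raised above is real in general, but a per-block (rather than per-sample) interface like this keeps the indirection cost to one virtual call per buffer.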
The mixer would have to be designed in such a way that unused sends and
busses use little or no CPU time. But what else could one modularize? This
is, after all, the workhorse of the entire program; it does the processing.
One more thing to answer is who does the disk access. Having that as a
modular component could, for example, enable an easier implementation of a
record-and-play-to/from-RAM engine. People might implement a really small
program that uses only inputs, outputs and inserts and encodes live MP3 or
something; this output module would enable that kind of functionality. The
same for input, for decoding purposes? If modularizing the audio engine
comes at too great a cost, I'd vote for having two engines. A
high-performance engine that features regions (handled by a separate disk
access module?), busses, sends and inserts is still the right step toward a
high-quality open source audio editor, IMO.

Take care
Tony