| 
      
      
      From: Christian S. <chr...@ep...> - 2002-11-23 20:58:50
       | 
| I read the evo-0.0.6 code to become acquainted with the basics involved in this project and audio applications in general (as I've never written an audio app before). I appreciate that the code isn't that complex, so you can easily get an overview of the most relevant parts. It left me with some minor questions, however, that I hope somebody can explain to me (I guess that's most likely Benno, but if anybody else can tell me too... :)

Is there a reason that int types were always used for what are actually bool types, or is it just some kind of habit?

Is there any benefit of using this double_to_int() function [audiovoice.cpp] and the asm code in it, instead of a normal type cast?

I was a bit surprised about the ring buffer. I expected a buffer similar to the ones which are used in ASIO or Gigasampler. These consume less CPU time, but on the other hand, of course, latency values are dependent on the buffer size. How about giving the user the choice of which kind of buffer to use?

And finally, I was a little bit unsure about the purpose of the check_wrap() method in the RingBuffer template. It ensures a sequential reading of 2*WRAP_ELEMTS = 2048 elements, right? If so, shouldn't there be a check whether there are enough elements available for reading?

Regards,
Christian |
| 
      
      
      From: Steve H. <S.W...@ec...> - 2002-11-24 00:24:14
       | 
| On Sat, Nov 23, 2002 at 10:00:21 +0100, Christian Schoenebeck wrote:
> Is there a reason that int types were always used for actually bool types or
> is it just some kind of habit?

Ints are generally the fastest types to access, so using them for bools makes sense.

> Is there any benefit of using this double_to_int() function [audiovoice.cpp]
> and the asm code in it, instead of a normal type cast?

Casting to int truncates toward zero (which isn't always what you want) and is very slow on x86. It requires flushing the register stack, resetting the FPU's parameters, doing the conversion, flushing the stack...

The other questions are specific to evo, so I can't help you there.

- Steve |
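The rounding difference Steve mentions can be shown with a small C++ sketch. It does not reproduce evo's double_to_int() or its asm; it uses the standard std::lrint from <cmath> as a portable stand-in, and the two helper names are hypothetical. Whether std::lrint is actually faster than a cast depends on the compiler and CPU, so the sketch only demonstrates the results, not the speed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: a plain cast discards the fractional part,
// i.e. it truncates toward zero.
inline int cast_to_int(double x) {
    assert(std::isfinite(x));  // converting NaN or infinity is undefined
    return static_cast<int>(x);
}

// Hypothetical helper: std::lrint rounds according to the current
// floating-point rounding mode, which is round-to-nearest by default.
inline int round_to_int(double x) {
    assert(std::isfinite(x));
    return static_cast<int>(std::lrint(x));
}
```

With the default rounding mode, cast_to_int(2.7) gives 2 while round_to_int(2.7) gives 3; for -2.7 the cast gives -2 and the rounding version gives -3.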
| 
      
      
      From: Benno S. <be...@ga...> - 2002-11-24 21:18:46
       | 
| On Sat, 2002-11-23 at 22:00, Christian Schoenebeck wrote:
> Is there a reason that int types were always used for actually bool types or
> is it just some kind of habit?

Not sure if ints are faster than bools, but OTOH I agree with you that bools are more elegant. (Sometimes I write inelegant code, sorry.)

> Is there any benefit of using this double_to_int() function [audiovoice.cpp]
> and the asm code in it, instead of a normal type cast?

See Steve's reply. This hack is needed only on x86; PPCs and most of the other archs can use simple casts.

> I was a bit surprised about the ring buffer. I expected a buffer similar to
> the ones which are used in ASIO or Gigasampler. These consume less CPU time
> but on the other hand of course, latency values are dependent on the buffer
> size. How about giving the user the choice which kind of buffer to use?

What do you mean by expecting buffers similar to ASIO etc.? Can you elaborate?

The ringbuffer is very, very efficient. Basically, the audio thread can access the ringbuffer as if it were a linear sample; at the end of the processing of a fragment you simply do

    read_ptr = (read_ptr + fragment_size) & (ringbuf_size-1);

(ringbuf_size is a power of two to avoid relatively slow % (modulus) ops.)

If you have a more efficient scheme in mind, please tell us.

The final latency does not depend on the ringbuffer size, since this buffer is only for compensating disk latency. The bigger this ringbuffer, the smaller the risk of ending up in an "empty disk voice buffer" situation. The disk buffer size can be smaller for low-msec disk access times.

On the audio side we simply use regular fragment-based audio output. This usually means only 2-3 fixed-size buffers, so no ring buffers are involved.

> And finally I was a little bit unsure about the purpose of the check_wrap()
> method in the RingBuffer template. It ensures a sequential reading of
> 2*WRAP_ELEMTS = 2048 elements, right???
> If so, shouldn't there be a check if there are enough elements available
> for reading?

The check_wrap is suboptimal, since it checks for when it is about time to replicate some data past the buffer end so that the interpolator can fetch samples past the "official ring buffer end boundary". This check can be moved into the disk thread, which accesses the buffer with a much lower frequency and where it is easy to figure out when the last read reached the ring buffer end position. (In that case, read the data from disk, write it at the ring buffer start, and replicate some samples past the ring buffer end.)

If you have more questions or if some of the issues are not clear, please let us know on the mailing list.

cheers,
Benno

--
http://linuxsampler.sourceforge.net
Building a professional grade software sampler for Linux.
Please help us design and develop it. |
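The masked pointer update Benno describes can be written out as a small C++ sketch. The variable names read_ptr, fragment_size and ringbuf_size come from his message; the function names and the power-of-two check are hypothetical and are not taken from the evo sources.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if n is a nonzero power of two.
constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Advance a read index by one fragment and wrap it with a bit mask.
// Gives the same result as (read_ptr + fragment_size) % ringbuf_size,
// but only when ringbuf_size is a power of two.
inline std::size_t advance_read_ptr(std::size_t read_ptr,
                                    std::size_t fragment_size,
                                    std::size_t ringbuf_size) {
    assert(is_power_of_two(ringbuf_size));
    return (read_ptr + fragment_size) & (ringbuf_size - 1);
}
```

For example, with a 1024-element buffer and 64-element fragments, a read index of 1000 advances to 40: the sum 1064 exceeds the buffer size and the mask folds it back to the start.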
| 
      
      
      From: Christian S. <chr...@ep...> - 2002-11-25 23:21:43
       | 
| On Sunday, 24 November 2002 22:25, Benno Senoner wrote:
> Not sure if ints are faster than bools, but OTOH I agree with you that
> bools are more elegant. (sometimes I write inelegant code, sorry).

I wasn't able to measure any performance benefit from it; nevertheless, at least a typedef won't hurt.

> > Is there any benefit of using this double_to_int() function
> > [audiovoice.cpp] and the asm code in it, instead of a normal type cast?
>
> See Steve's reply, this hack is needed only on x86, PPCs and most of the
> other archs can use simple casts.

I compared both and got exactly the same results whether using asm or a normal type cast, and whether the value was an integral or a rational number (Athlon). But performance is indeed a point.

> > I was a bit surprised about the ring buffer. I expected a buffer similar
> > to the ones which are used in ASIO or Gigasampler. These consume less CPU
> > time but on the other hand of course, latency values are dependent on the
> > buffer size. How about giving the user the choice which kind of buffer to
> > use?
>
> What does mean expected buffers similar to ASIO etc ? Can you elaborate
> ?

AFAIK ASIO uses two buffers which are isolated. Each buffer is only accessed for either reading or writing, never both at the same time. So buffer A gets filled by the disk stream while buffer B is read by the audio thread. After that, buffer B gets filled while buffer A is read, and so on...

The time for one period is fixed by the buffer size, sampling rate and number of audio channels.

Somebody already posted an article about that, but I can't find it anymore.

The advantage of those simple buffers is that you can always expect a fixed buffer size for reading/writing at any time. Whereas with your approach, you always have to check how many bytes are left to read/write and whether you have to continue from the beginning of the buffer after accessing some pieces at the end. That's why your way is a bit more CPU hungry, but it circumvents the latency problems ASIO and Gigasampler have (latency fixed to the buffer size, and latency jitter or even higher latency to correct that jitter).

> The ringbuffer is very very efficient, basically the audio thread can
> access the ringbuffer as it were a linear sample, at the end of the
> processing of a fragment you simply do read_ptr = (read_ptr +
> fragment_size) & (ringbuf_size-1); (ringbuf_size is a power of two to
> avoid relatively slow % (modulus) ops).

Yes, I found the latter a very clever and efficient trick to keep the pointer within the boundaries.

> If you have a more efficient scheme in mind please tell us.
> The final latency does not depend on the ringbuffer size since this
> buffer is only for compensating disk latency.

Of course not, I meant those ASIO/Giga buffers.

> On the audio side we simply use regular fragment-based audio output.
> This means usually only 2-3 fixed sized buffers so no ring buffers are
> involved.

I guess you mean these audio_sum, audio_buf arrays [audiothread.cpp] used to calculate the mix and send it to the audio device, right?

> > And finally I was a little bit unsure about the purpose of the
> > check_wrap() method in the RingBuffer template. It ensures a sequential
> > reading of 2*WRAP_ELEMTS = 2048 elements, right???
> > If so, shouldn't there be a check if there are enough elements
> > available for reading?
>
> The check_wrap is suboptimal since it checks for when it is about time
> to replicate some data past the buffer end so that the interpolator can
> fetch samples past the "official ring buffer end boundary".

Why is this inefficient? Do you mean because of the memcpy() in check_wrap() that copies a portion within the buffer?

> This check can be moved within the disk thread which accesses the buffer
> with a much lower frequency and where it is easy to figure out when the
> last read reached the ring buffer end position.

I'm not sure what you're getting at. Do you mean that it's more likely that read/write access to the buffer won't interfere/overlap, because the audio thread reads faster than the disk thread can fill up the space, due to the higher priority of the audio thread?

Regards,
Christian |
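The two-buffer scheme Christian describes can be sketched in C++ roughly as follows. This is only an illustration of the idea in his message, not how ASIO or Gigasampler are actually implemented: the class and method names are hypothetical, there is no thread synchronization, and the period formula assumes that the buffer holds interleaved samples for all channels.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical double buffer: one side is filled while the other is played.
template <typename T, std::size_t N>
class DoubleBuffer {
public:
    // Side currently owned by the producer (e.g. the disk stream).
    std::array<T, N>& fill_side() { return buffers_[fill_index_]; }

    // Side currently owned by the consumer (e.g. the audio thread).
    const std::array<T, N>& play_side() const { return buffers_[1 - fill_index_]; }

    // Exchange the roles of the two sides once per period.
    // A real implementation would need synchronization here.
    void swap() { fill_index_ = 1 - fill_index_; }

    // Duration of one period in seconds, assuming N interleaved samples.
    static double period_seconds(double sample_rate, std::size_t channels) {
        assert(sample_rate > 0.0 && channels > 0);
        return static_cast<double>(N) /
               (static_cast<double>(channels) * sample_rate);
    }

private:
    std::array<std::array<T, N>, 2> buffers_{};
    std::size_t fill_index_ = 0;
};
```

Each side always has the same fixed size, which is the advantage Christian points out; the cost is that the period, and with it the latency, is tied to that size.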
| 
      
      
      From: Benno S. <be...@ga...> - 2002-12-11 22:00:49
       | 
| On Tue, 2002-11-26 at 00:23, Christian Schoenebeck wrote:
> > What does mean expected buffers similar to ASIO etc ? Can you elaborate
> > ?
>
> AFAIK ASIO uses two buffers which are isolated. One buffer is only accessed
> for either reading or writing, never both at the same time. So buffer A gets
> filled by the disk stream while buffer B is read by the audio thread. After
> that buffer B gets filled while buffer A is read and so on...
> The time for one period is fixed by the buffer size, sampling rate and audio
> channels.
>
> Somebody already posted an article about that, but I haven't found it anymore.
>
> The advantage of those simple buffers is that you can always expect a fixed
> buffer size for reading/writing at any time. Whereas with your approach, you
> always have to check how many bytes are left to read/write and if you have to
> continue from the beginning of the buffer after accessing some pieces at the
> end. That's why your way is a bit more CPU hungry but circumvents those
> latency problems ASIO and Gigasampler have (latency fixed to the buffer size
> and latency jitter or even higher latency to correct that jitter).

Keep in mind that with variable sample pitch (which is always the case in a sampler) you will never be able to read an integral number of samples from the source (the sample on disk or in RAM), resample them, and write them to what you call the "ASIO buffer". In my code I write data to OSS or ALSA in chunks (fragments) too, so it is exactly what you described.

Thanks to the wraparound concept, the overhead is really small, since the buffer pointer updates are very infrequent and occur in a low-priority thread (the disk thread).

> > If you have a more efficient scheme in mind please tell us.
> > The final latency does not depend on the ringbuffer size since this
> > buffer is only for compensating disk latency.
>
> Of course not, I meant those ASIO/Giga Buffers.

But this is user selectable, no problem here...

As said before, I successfully ran tests with 3 x 32 sample audio buffers, which leads to 2.1 msec latency. :)

> > On the audio side we simply use regular fragment-based audio output.
> > This means usually only 2-3 fixed sized buffers so no ring buffers are
> > involved.
>
> I guess you mean these audio_sum, audio_buf arrays [audiothread.cpp] to
> calculate the mix and send it to the audio device, right?

Yes.

> > The check_wrap is suboptimal since it checks for when it is about time
> > to replicate some data past the buffer end so that the interpolator can
> > fetch samples past the "official ring buffer end boundary".
>
> Why is this inefficient? Do you mean because of the memcpy() in check_wrap()
> that copies a portion within the buffer?

It is not the function itself that is inefficient, only the fact that it is called relatively often from within the audio thread. The check_wrap can be moved to the disk thread, where it will be called less frequently, thus saving some CPU cycles (but not that many, since it is inlined).

> > This check can be moved within the disk thread which accesses the buffer
> > with a much lower frequency and where it is easy to figure out when the
> > last read reached the ring buffer end position.
>
> I'm not sure what you're getting at. Do you mean that it's more likely that
> read/write access to the buffer won't interfere/overlap, because the audio
> thread reads faster than the disk buffer can fill up the space, due to the
> higher priority of the audio thread?

No, the disk thread "always writes faster than the audio thread reads", or in short: there will always be some data in the ring buffer as long as the disk is not overloaded (if that happens, we must lower the max # of voices or use more powerful hardware ... it is not our fault).

Anyway, since under regular conditions there will always be some data in the buffer, resampling will work perfectly even at buffer boundaries, thanks to the check_wrap func that mirrors the data stored at the buffer beginning to the position just past the buffer end.

cheers,
Benno |
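The mirroring Benno describes, where data from the buffer beginning is replicated past the buffer end so that the interpolator can read across the boundary, might look roughly like this in C++. It is a sketch under assumptions, not evo's RingBuffer or check_wrap(): the struct, its members and the use of linear interpolation are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical ring buffer with a small mirror area past its nominal end.
struct MirroredRing {
    std::size_t size;         // nominal ring size in elements
    std::size_t wrap_elems;   // elements replicated past the end
    std::vector<float> data;  // holds size + wrap_elems elements

    MirroredRing(std::size_t size_, std::size_t wrap_elems_)
        : size(size_), wrap_elems(wrap_elems_),
          data(size_ + wrap_elems_, 0.0f) {
        assert(wrap_elems >= 1 && wrap_elems <= size);
    }

    // Copy the first wrap_elems samples to the area just past the ring
    // end. Source and destination cannot overlap, so memcpy is safe.
    void mirror_start() {
        std::memcpy(data.data() + size, data.data(),
                    wrap_elems * sizeof(float));
    }

    // Linear interpolation at a fractional position in [0, size).
    // Reading data[i + 1] at i == size - 1 lands in the mirror area,
    // so no wrap check is needed in this hot path.
    float interpolate(double pos) const {
        assert(pos >= 0.0 && pos < static_cast<double>(size));
        const std::size_t i = static_cast<std::size_t>(pos);
        const float frac = static_cast<float>(pos - static_cast<double>(i));
        return data[i] + frac * (data[i + 1] - data[i]);
    }
};
```

The point of the design is that the reader never branches on the buffer boundary; the copy is done by whoever last wrote the start of the buffer, which in Benno's proposal would be the less frequently running disk thread.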
| 
      
      
      From: Christian S. <chr...@ep...> - 2002-12-14 20:31:59
       | 
| On Wednesday, 11 December 2002 23:08, Benno Senoner wrote:
> > > This check can be moved within the disk thread which accesses the
> > > buffer with a much lower frequency and where it is easy to figure out
> > > when the last read reached the ring buffer end position.
> >
> > I'm not sure what you're getting at. Do you mean that it's more likely
> > that read/write access to the buffer won't interfere/overlap, because the
> > audio thread reads faster than the disk buffer can fill up the space, due
> > to the higher priority of the audio thread?
>
> no, the disk thread "always writes faster than the audio thread reads",
> or in short: there will be always some data in the ring buffer as long
> as the disk is not overloaded (if that happens we must lower the max #
> of voices or use more powerful hardware ... it is not our fault).

Oh, I expressed myself badly. Anyway, I think I got the idea. Thanks! |