Thread: [GD-Windows] Streaming WAVE audio
From: Brian H. <bri...@py...> - 2002-04-27 21:34:03
Yes, it's me with a remedial, way-back, old-school question. However, this is because I'm working on WinCE 3.0 (my God, all those WinCE docs that show up in MSDN searches now MEAN something!).

Since WinCE/PPC2K lack DirectSound, I have to do my own audio mixing into wave buffers. Conceptually, this is simple, although some of the implementation details were a bit tough to find out (specifically, using a thread instead of a callback).

In a nutshell, I'm doing this in the straightforward fashion you'd expect, double buffering two audio buffers that are filled periodically.

    fill( buffer[ 0 ] );
    fill( buffer[ 1 ] );

Under Win32's waveOut interface, you can specify a thread to handle messages indicating WOM_DONE, WOM_OPEN, WOM_CLOSE. This is pretty easy, you just CreateThread() and pass the identifier and CALLBACK_FUNCTION to waveOutOpen. That much works.

The thread function basically does this:

    while ( GetMessage( ... ) )
    {
        if ( msg == WOM_DONE )
        {
            waveOutWrite( nextBuffer );
            fill( bufferThatJustEnded );
        }
    }

The above works as well, in that it doesn't crash =)

That's all a streaming WAVE audio subsystem should really need to do. Of course, I do the necessary waveOutPrepareHeader, etc. I'm doing this in 8-bit, mono, 11KHz format, and the device definitely supports it. In fact, I have this code running on Win2000 just so that I can make sure it works on a desktop before moving to a PDA.

So the problem is that I'm getting dropouts and stuttering, echoing, and a host of other bad things. I boosted the thread's priority to highest, and I also set my buffers to some arbitrarily huge sizes to see if it was underruns, to no avail.

I check all the return values. Everything looks fine. I timed calls to WOM_DONE handling, and the delta times match the expected buffer lengths. The mixing code is the same for OS X and DirectSound and works as expected. Each buffer is 150ms, more than enough time to mix in some stuff (especially on a P3/1000).

So I'm at a loss what else could be causing problems. Originally I was using a CALLBACK_FUNCTION, but that's somewhat problematic for other reasons (you can't call waveOutXXX functions within the callback, since that can cause a deadlock).

Any suggestions on what else to look for that could be causing problems?

Brian
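
To put a number on the "150ms" buffers mentioned above, here is a minimal sketch of the size arithmetic for uncompressed PCM. The helper name `pcm_buffer_bytes` is hypothetical, and the sketch assumes that "11KHz" means the usual 11025 Hz rate, which the post does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): number of bytes needed to
 * hold `millis` milliseconds of uncompressed PCM audio. */
size_t pcm_buffer_bytes(unsigned sample_rate, unsigned channels,
                        unsigned bits_per_sample, unsigned millis)
{
    size_t bytes_per_frame;
    size_t frames;

    /* Only whole-byte sample sizes (8, 16, ...) are handled here. */
    assert(bits_per_sample >= 8 && bits_per_sample % 8 == 0);

    bytes_per_frame = (size_t)channels * (bits_per_sample / 8u);

    /* Multiply before dividing so short durations do not truncate to 0. */
    frames = ((size_t)sample_rate * millis) / 1000u;

    return frames * bytes_per_frame;
}
```

Under those assumptions, `pcm_buffer_bytes(11025, 1, 8, 150)` comes to 1653 bytes per buffer, so even three buffers hold less than 5 KB of audio.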
From: Jon W. <hp...@mi...> - 2002-04-27 23:07:26
> CALLBACK_FUNCTION

I suppose you mean CALLBACK_THREAD when you're actually using a thread. Have you tried CALLBACK_EVENT and WaitForSingleObject as an alternative, btw?

> Any suggestions on what else to look for that could be causing problems?

Poor drivers? Having had access to WinCE drivers as "reference code" when writing device drivers, I'm underwhelmed by some of the vendors' approaches to quality. Others do OK.

Another thing to try is to use at least three buffers, possibly four, because some drivers may have implementation issues with you handing it "only" one fresh buffer of advance queue. Also try to make the buffers powers of two in size, to work around some classes of bugs.

You also should not prepare a header more than once; prepare all the headers on start-up, then call waveOutWrite() in a cyclic fashion. You're probably already doing this right. It may be that preparing a header that has already been prepared is not the efficient no-op you'd want it to be.

Another thing to look for is internal logic errors where you play the "wrong" buffer, and/or get out of sync if you miss one buffer. The safest and most predictable way to do this is to always copy data from the mixer function, and fill all buffers with zero before you start the device, rather than trying to pre-fill buffers with data. While this will result in a startup latency equal to the steady-state latency and some people think this is a draw-back, I think it's the right thing to do.

Last, you can poll the WHDR_DONE bit in the headers for the buffers, rather than relying on a callback to fill the buffers. This lets you use a model slightly closer to that of DirectSound.

Let us know when you find out what it is!

Cheers,

/ h+
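
Two of the suggestions above are easy to get subtly wrong, so here is a small sketch of both: rounding a buffer size up to a power of two, and pre-filling a buffer with silence. The function names are hypothetical. One detail worth noting for the 8-bit format under discussion: 8-bit WAVE PCM is unsigned, so "zero" in the sense of silence is the byte value 0x80, whereas 16-bit PCM is signed and silent at 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round n up to the next power of two. */
size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    /* Reject 0 and values whose next power of two would overflow. */
    assert(n >= 1 && n <= SIZE_MAX / 2 + 1);

    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical helper: fill a PCM buffer with silence.
 * 8-bit PCM is unsigned (midpoint 0x80); wider PCM is signed (0). */
void fill_silence(void *buf, size_t bytes, unsigned bits_per_sample)
{
    memset(buf, bits_per_sample == 8 ? 0x80 : 0x00, bytes);
}
```

For example, a 1653-byte buffer (roughly 150 ms of 8-bit mono audio at 11025 Hz) would round up to 2048 bytes.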
From: Brian H. <bri...@py...> - 2002-04-27 23:32:10
At 04:07 PM 4/27/2002 -0700, Jon Watte wrote:
>I suppose you mean CALLBACK_THREAD when you're actually using a
>thread.

Correct.

>Have you tried CALLBACK_EVENT and WaitForSingleObject as
>an alternative, btw?

Not yet, but I was hoping to avoid getting that desperate.

>Poor drivers?

This is an i815 board (my Win2K box), so I'm assuming that Intel's drivers don't suck that bad.

>Another thing to try is to use at least three buffers, possibly four,

I'm actually using three buffers right now, to no avail. I fill and write them out when I start up to get the chain going, then when dealing with WOM_DONE I refill buffers that are WHDR_DONE and dispatch buffers that aren't (and which have presumably been filled previously).

>buffers powers of two in size, to work around some classes of bugs.

Did that, no difference.

>You also should not prepare a header more than once; prepare all the
>headers on start-up, then call waveOutWrite() in a cyclic fashion.

Yep, that's what I do. They're all pointing to the same buffers through their lives:

    GLOBAL char buffer[ 3 ][ 1024 ];

    header[ 0 ].lpData = buffer[ 0 ];
    header[ 0 ].dwBufferLength = sizeof( buffer[ 0 ] );
    waveOutPrepareHeader( ... );
    //etc. etc.

>Another thing to look for is internal logic errors where you play
>the "wrong" buffer, and/or get out of sync if you miss one buffer.

That's what I'm thinking it is, but so far I don't see it. It's probably a stupid bug on my part, unfortunately the tried-and-true "find the bug after sending an e-mail asking for help" trick didn't work this time =)

When processing WOM_DONE I do a bunch of sanity checks, including verifying that the buffer I'm writing is the next one that needs to be written; that the buffer I'm filling isn't INQUEUE; that the buffer I'm filling is DONE. Should be pretty simple and straightforward.

>Last, you can poll the WHDR_DONE bit in the headers for the buffers,
>rather than relying on a callback to fill the buffers. This lets you
>use a model slightly closer to that of DirectSound.

Yes, I could do that, and I may once I get REAL desperate. I could poll the WHDR_DONE in a separate thread a la my DSound buffering version and go with that, but I have a feeling if I implement it that way that things won't actually get any better, I'll just have shifted around the problem.

Thanks!

Brian
From: Brian H. <bri...@py...> - 2002-04-27 23:41:33
At 04:31 PM 4/27/2002 -0700, Brian Hook wrote:
>This is an i815 board (my Win2K box), so I'm assuming that Intel's drivers
>don't suck that bad.

Just to verify, this also occurred on a system with a different processor (Athlon), OS (WinXP) and sound card (Echo Mia), so it's probably not a driver thing. I'm pretty sure it's a programmer thing.

Since it sounds like I'm doing everything right, I'm going to assume that it's just Stupid Programmer Syndrome and I'll try to figure it out from there.

Thanks!

Brian
From: Brian H. <bri...@py...> - 2002-04-28 00:38:40
Woo-hoo, found it! And, of course, it was stupid programmer syndrome.

Specifically, I wasn't dispatching a buffer after filling it, instead I was waiting for it to get dispatched "when necessary". My WOM_DONE handler was doing this:

    waveOutWrite( nextBuffer );
    fillBuffer( completedBuffer );

Changing it to this fixed it:

    waveOutWrite( nextBuffer );
    fillBuffer( completedBuffer );
    waveOutWrite( completedBuffer ); //keep the chain going

I was counting on/assuming that the "waveOutWrite( nextBuffer )" would eventually dispatch a buffer that was filled NUM_BUFFERS-1 notifications ago, which apparently wasn't the case. Or maybe something else was wrong, but this fixes it for now.

Seat of the pants, baby!

Brian
From: Jon W. <hp...@mi...> - 2002-04-28 16:35:52
> I was counting on/assuming that the "waveOutWrite( nextBuffer )" would
> eventually dispatch a buffer that was filled NUM_BUFFERS-1 notifications
> ago, which apparently wasn't the case.

Yeah, that'd do it. If you count it, you're running with only a single buffer, if this is what you're doing.

> Changing it to this fixed it:
>
> waveOutWrite( nextBuffer );
> fillBuffer( completedBuffer );
> waveOutWrite( completedBuffer ); //keep the chain going

This can't possibly work right, as you enqueue two buffers for each buffer played. I think you actually mean something else, such as just calling fillBuffer() and waveOutWrite().

For the benefit of other readers of this thread, I've always found this logic to be the most robust when playing audio:

    unsigned char buffers[3][SIZE];
    int nextBuffer;

    setup:
      memset( buffers[0], QUIET, SIZE );
      memset( buffers[1], QUIET, SIZE );
      memset( buffers[2], QUIET, SIZE );
      prepareBuffer( 0 );
      prepareBuffer( 1 );
      prepareBuffer( 2 );
      nextBuffer = 0;

    start:
      queueBuffer( 0 );
      queueBuffer( 1 );
      queueBuffer( 2 );

    completion:
      fillBuffer( nextBuffer );
      queueBuffer( nextBuffer );
      nextBuffer = (nextBuffer + 1) % 3;

If there's a chance that you will lose completion events, then you can turn completion into a while loop:

    completion_with_misses:
      while( bufferDone( nextBuffer ) ) {
        fillBuffer( nextBuffer );
        queueBuffer( nextBuffer );
        nextBuffer = (nextBuffer + 1) % 3;
      }

Note that this mechanism scales very well to different numbers of buffers, and the code is very clean and decision-free, which is always a plus. This structure will work fine with double-buffering (only two buffers) assuming the buffers are big enough for the inherent scheduling jitter of your platform (missing a single buffer in double-buffering means you'll probably hear it).

Cheers,

/ h+
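
The scheme above can be exercised without any audio hardware. What follows is a sketch of that, not the original code: the `Buffer` struct stands in for a WAVEHDR, `queueBuffer()` stands in for waveOutWrite(), and `device_plays()` is a made-up mock of the driver finishing buffers in order. It follows the "completion_with_misses" variant and counts any attempt to queue a buffer that is already queued.

```c
#include <assert.h>
#include <string.h>

#define NUM_BUFFERS 3
#define SIZE        1024
#define QUIET       0x80   /* silence for 8-bit unsigned PCM */

/* Stand-in for a WAVEHDR plus its data. A real header reports its
 * state through the WHDR_INQUEUE / WHDR_DONE bits in dwFlags. */
typedef struct {
    unsigned char data[SIZE];
    int in_queue;              /* handed to the device, not yet played */
} Buffer;

Buffer buffers[NUM_BUFFERS];
int nextBuffer;                /* oldest buffer: next to finish and refill */
int queue_errors;              /* attempts to queue an already queued buffer */

/* Stand-in for waveOutWrite(). */
void queueBuffer(int i)
{
    if (buffers[i].in_queue) {
        queue_errors++;
        return;
    }
    buffers[i].in_queue = 1;
}

int bufferDone(int i)
{
    return !buffers[i].in_queue;
}

/* The mixer would write fresh samples here; the sketch writes silence. */
void fillBuffer(int i)
{
    memset(buffers[i].data, QUIET, SIZE);
}

/* setup + start: silence everything, then queue all buffers in order. */
void start_playback(void)
{
    int i;

    for (i = 0; i < NUM_BUFFERS; i++) {
        memset(buffers[i].data, QUIET, SIZE);
        buffers[i].in_queue = 0;
    }
    nextBuffer = 0;
    queue_errors = 0;
    for (i = 0; i < NUM_BUFFERS; i++)
        queueBuffer(i);
}

/* Mock device: finishes playing the `count` oldest queued buffers. */
void device_plays(int count)
{
    int i;

    for (i = 0; i < count; i++)
        buffers[(nextBuffer + i) % NUM_BUFFERS].in_queue = 0;
}

/* completion_with_misses: refill and requeue every finished buffer.
 * Returns how many buffers were refilled. */
int on_completion(void)
{
    int refilled = 0;

    while (bufferDone(nextBuffer)) {
        fillBuffer(nextBuffer);
        queueBuffer(nextBuffer);
        nextBuffer = (nextBuffer + 1) % NUM_BUFFERS;
        refilled++;
    }
    return refilled;
}
```

Calling `start_playback()`, then `device_plays(2)` followed by a single `on_completion()` models one lost notification: both finished buffers are refilled in one pass and `queue_errors` stays at zero.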
From: Brian H. <bri...@py...> - 2002-04-28 17:03:58
At 09:36 AM 4/28/2002 -0700, Jon Watte wrote:
>Yeah, that'd do it. If you count it, you're running with only a single
>buffer, if this is what you're doing.

Yeah, that was part of me being lazy and tired -- I didn't run through simulation steps.

> > Changing it to this fixed it:
> >
> > waveOutWrite( nextBuffer );
> > fillBuffer( completedBuffer );
> > waveOutWrite( completedBuffer ); //keep the chain going
>
>This can't possibly work right, as you enqueue two buffers for each
>buffer played. I think you actually mean something else, such as
>just calling fillBuffer() and waveOutWrite().

Actually, using your nomenclature, where "nextBuffer" indicates the "nextBuffer to be filled", I was doing this:

    waveOutWrite( ( nextBuffer + 1 ) % 3 );
    fillBuffer( nextBuffer );
    waveOutWrite( nextBuffer );
    nextBuffer = ( nextBuffer + 1 ) % 3;

And you're right, it didn't make sense -- it WORKED, but only because waveOutWrite is friendly when you send it a buffer it can't use. In the case of the first completion, nextBuffer+1 would have pointed to a previously INQUEUED buffer, generating an error (which would be silent if the return values weren't checked...fwiw, the return value was '33', which isn't an MMSYSERR_xxxx).

So basically that first waveOutWrite() wasn't doing anything -- in fact, it was a legacy of my initial implementation where I always tried to have a premixed buffer waiting to go next. That is, the completion routine was expecting to dispatch _immediately_ to avoid any hiccups, it originally didn't have any other buffers queued waiting to go.

That's the problem with long debugging sessions -- you start getting old code interfering with new code, but you're so tired and pissed you're scared to change stuff that seemingly works =)

>setup:
>  memset( buffers[0], QUIET, SIZE );
>  memset( buffers[1], QUIET, SIZE );
>  memset( buffers[2], QUIET, SIZE );
>  prepareBuffer( 0 );
>  prepareBuffer( 1 );
>  prepareBuffer( 2 );
>  nextBuffer = 0;

That's exactly what I was doing.

>start:
>  queueBuffer( 0 );
>  queueBuffer( 1 );
>  queueBuffer( 2 );

Ayup.

>completion:
>  fillBuffer( nextBuffer );
>  queueBuffer( nextBuffer );
>  nextBuffer = (nextBuffer + 1) % 3;

Yup. The above is pretty much identical to the code I have now, but with my last e-mail I had a spurious queueBuffer((nextBuffer+1)%3) before I did the fill buffer.

Brian
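
On the return value of 33 mentioned above: mmsystem.h keeps the wave-specific errors in their own range, with WAVERR_BASE defined as 32, so 33 is WAVERR_STILLPLAYING, which fits a write of a buffer that is still queued. The sketch below repeats those values under hypothetical `SKETCH_` names so that it builds without the Windows headers. Real code would include mmsystem.h and could call waveOutGetErrorText() to get a readable message.

```c
#include <assert.h>

/* Values as defined in mmsystem.h, repeated under hypothetical names
 * so this sketch compiles on any platform. */
enum {
    SKETCH_MMSYSERR_NOERROR    = 0,
    SKETCH_WAVERR_BASE         = 32,
    SKETCH_WAVERR_BADFORMAT    = SKETCH_WAVERR_BASE + 0,
    SKETCH_WAVERR_STILLPLAYING = SKETCH_WAVERR_BASE + 1,
    SKETCH_WAVERR_UNPREPARED   = SKETCH_WAVERR_BASE + 2,
    SKETCH_WAVERR_SYNC         = SKETCH_WAVERR_BASE + 3
};

/* Non-zero if `code` lies in the wave-specific error range. */
int is_waverr(unsigned code)
{
    return code >= SKETCH_WAVERR_BASE && code <= SKETCH_WAVERR_SYNC;
}

/* Hypothetical helper: symbolic name for a waveOut* result code. */
const char *wave_result_name(unsigned code)
{
    switch (code) {
    case SKETCH_MMSYSERR_NOERROR:    return "MMSYSERR_NOERROR";
    case SKETCH_WAVERR_BADFORMAT:    return "WAVERR_BADFORMAT";
    case SKETCH_WAVERR_STILLPLAYING: return "WAVERR_STILLPLAYING";
    case SKETCH_WAVERR_UNPREPARED:   return "WAVERR_UNPREPARED";
    case SKETCH_WAVERR_SYNC:         return "WAVERR_SYNC";
    default:                         return "other (see mmsystem.h)";
    }
}
```

Logging `wave_result_name()` next to every waveOutWrite() call during bring-up is one way to make a stray write like the one described above show up immediately.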
From: Brian H. <ho...@bo...> - 2004-07-09 05:21:12
I may actually win the award for oldest thread resurrection (this is about 2.25 years old).

So I'm rewriting my streaming audio code and open sourcing it (http://www.bookofhook.com/sal for those that are curious -- it's kind of pre-alpha right now but seems to work for the most part on DirectSound, OS X/CoreAudio, OSS/Linux, and ALSA/Linux).

When revisiting my streaming audio code, I realized I was using three different buffers when in _theory_ I should be able to use one largish buffer and just query the output position and write the necessary number of bytes out to it when I'm done. The key to this is that the buffer is marked as WHDR_BEGINLOOP | WHDR_ENDLOOP. I then query the output position in my mixing thread with waveOutGetPosition().

This works on my system, but it seems to start failing on some systems randomly depending on the output format (e.g. one user reported that 44KHz worked fine, but there was massive noise/distortion on 22KHz, using the same executable I was using successfully).

So the question is: is there a definite known problem doing it this way instead of multibuffering?

I can go back to multibuffering fairly trivially and just do something like:

    queue( buffer[0] );
    queue( buffer[1] );

Then in my main audio thread:

    while ( 1 )
    {
        if ( buffer[ 0 ].is_done )
        {
            fill( buffer[ 0 ] );
            waveOutWrite( buffer[ 0 ] );
        }
        if ( buffer[ 1 ].is_done )
        {
            fill( buffer[ 1 ] );
            waveOutWrite( buffer[ 1 ] );
        }
        sleep( buffer_len/2 );
    }

You get the gist -- poll for completed buffers and then fill them up on demand, relying on sleep + poll instead of an event system or callbacks or a separate thread function.

Brian
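
For the single looping buffer approach described above, the bookkeeping reduces to ring-buffer arithmetic on two cursors. Here is a minimal sketch of it, with hypothetical function names and no claim to match the code in question. It assumes the play position has already been obtained in bytes. The waveOutGetPosition() documentation says to check the wType member of the MMTIME structure after the call, because a driver may report a different time format than the one requested. Whether that explains the 22KHz report is not something the thread establishes.

```c
#include <assert.h>
#include <stddef.h>

/* Map a running count of bytes played onto an offset inside a looping
 * buffer of `size` bytes. */
size_t ring_offset(unsigned long total_bytes_played, size_t size)
{
    assert(size > 0);
    return (size_t)(total_bytes_played % size);
}

/* Bytes that can be refilled without overwriting unplayed audio: the
 * distance from the write cursor forward to the play cursor, wrapping
 * at the end of the buffer. Equal cursors are treated as "nothing to
 * write", which matches a buffer that starts out full of silence. */
size_t ring_writable(size_t play_pos, size_t write_pos, size_t size)
{
    assert(size > 0 && play_pos < size && write_pos < size);
    return (play_pos + size - write_pos) % size;
}
```

With a 1024-byte buffer, a play cursor at 10 and a write cursor at 1000, `ring_writable()` returns 34: 24 bytes up to the end of the buffer plus 10 after the wrap.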