RE: [GD-Windows] BitBlt() syncing to VBL
From: Brian S. <bs...@mi...> - 2002-01-15 19:19:07
> Nope. This is on OS X using off-screen Gworlds and CopyBits(). This is
> pretty much the MacOS equivalent to DIBSections and BitBlt(). In fact,
> I should have even WORSE performance under OS X because I'm actually
> triple buffering -- my blit goes into the window's off-screen
> backbuffer, which is then blitted by the OS later. Under Windows I'm
> just going straight from DIB section to the Window's front buffer (in
> theory).

Yeah, but assuming that the offscreen backbuffer is on the video card, the vidmem-to-vidmem blit is so fast that it's essentially free. It certainly doesn't cost you any CPU time to queue it up.

> My guess is that there's still some VBL action going on somewhere (note:
> the Mac also VBLs, even though I'm getting 125fps, but the triple
> buffering is probably accelerating things by allowing multiple blits in
> a single refresh?), since I'm locked very close to my monitor's
> ostensible frame rate.

Yeah, given that your frame rate is sitting at the refresh rate, it would be a stretch to suspect anything else.

Can you disable bits of your pipeline and log your framerate to see where the bottleneck is? i.e. if you build your frame but don't blit it to the card, how many fps do you get? What if you just blit an empty frame to the card every time? Etc etc.

I think the recipe for speed is to minimize blits across the bus - composite the new frame in system memory, do one blit to the back buffer of the video card, then flip or blit back to front. You don't want to send something across the bus that will later be overdrawn.

> I've tried disabling all the various "sync" parameters in the driver
> properties, but to no avail.
>
> I do find this quite a bit odd simply because I was expecting to do a
> lot of optimization work on the Mac since the Mac has a slower clock
> speed and significantly less memory bandwidth.
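
For what it's worth, here is a rough sketch of the "disable bits of your pipeline and log your framerate" experiment. Everything in it is a stand-in: `Pipeline`, `build_frame()`, `blit_frame()` and `measure_fps()` are hypothetical names, the "blit" is just a `memcpy()` into ordinary memory, and the frame size is arbitrary. In a real test you would drop your own compositing code and your actual BitBlt()/CopyBits() call into the two stage functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

enum { FRAME_W = 320, FRAME_H = 240 };     /* arbitrary frame size (assumption) */
enum { STAGE_BUILD = 1, STAGE_BLIT = 2 };  /* bit flags selecting stages to run */

/* Hypothetical pipeline state: both buffers are plain memory here. */
typedef struct {
    uint16_t sysmem[FRAME_W * FRAME_H]; /* frame composited in system memory */
    uint16_t target[FRAME_W * FRAME_H]; /* stand-in for the blit destination */
    unsigned built;                     /* frames composited so far */
    unsigned blitted;                   /* frames copied to the target so far */
} Pipeline;

/* Stand-in for compositing a frame; replace with the real drawing code. */
static void build_frame(Pipeline *p)
{
    memset(p->sysmem, (int)(p->built & 0xFF), sizeof p->sysmem);
    p->built++;
}

/* Stand-in for the blit; replace with the real BitBlt()/CopyBits() call. */
static void blit_frame(Pipeline *p)
{
    memcpy(p->target, p->sysmem, sizeof p->target);
    p->blitted++;
}

/* Run only the selected stages for `frames` iterations and return the
   wall-clock frames per second (0.0 if the run was too short to measure). */
double measure_fps(Pipeline *p, unsigned stages, int frames)
{
    struct timespec t0, t1;
    assert(p != NULL && frames > 0);

    timespec_get(&t0, TIME_UTC);
    for (int i = 0; i < frames; ++i) {
        if (stages & STAGE_BUILD) build_frame(p);
        if (stages & STAGE_BLIT)  blit_frame(p);
    }
    timespec_get(&t1, TIME_UTC);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0.0 ? frames / secs : 0.0;
}
```

Comparing `measure_fps(&p, STAGE_BUILD, n)`, `measure_fps(&p, STAGE_BLIT, n)` and `measure_fps(&p, STAGE_BUILD | STAGE_BLIT, n)` should show which stage caps the frame rate. If the blit-only number lands right at the monitor's refresh rate, the blit is the part that is waiting on the VBL.
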
> My nearest guess is that
> I'm either doing something terribly wrong on the Windows side, or the
> Mac has some kind of mad, stupid Altivec optimized memcpy()/CopyBits().

I would bet that CopyBits is heavily optimized, but BitBlt should be too. I think the key for both is to make sure you're on the fast path - no transparency, pixel formats match, palettes (if any) match - so that the function can just blast bits.

I prefer DirectDraw over GDI because if you're not on the fast path you can tell immediately - either nothing will draw, or in the case of 1555 vs. 565, everything looks very, very odd.

--brian
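
P.S. To make the fast-path conditions concrete, here is a minimal sketch of the kind of check I mean. It is not how GDI or DirectDraw decide internally; `PixelFormat`, `formats_match()`, `on_fast_path()` and `rgb555_to_rgb565()` are hypothetical names for illustration, and palettes are left out because the example only covers 16-bit formats. The channel masks are the usual ones for 1555/555 and 565.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a surface's pixel layout. */
typedef struct {
    unsigned bits_per_pixel;
    uint32_t r_mask, g_mask, b_mask;
} PixelFormat;

/* Usual 16-bit layouts: 5-5-5 (top bit unused or alpha) and 5-6-5. */
const PixelFormat FMT_RGB555 = { 16, 0x7C00, 0x03E0, 0x001F };
const PixelFormat FMT_RGB565 = { 16, 0xF800, 0x07E0, 0x001F };

/* True when both surfaces store pixels in exactly the same layout. */
bool formats_match(const PixelFormat *a, const PixelFormat *b)
{
    return a->bits_per_pixel == b->bits_per_pixel
        && a->r_mask == b->r_mask
        && a->g_mask == b->g_mask
        && a->b_mask == b->b_mask;
}

/* A blit can be a straight copy only if no per-pixel work is needed:
   no transparency and identical pixel formats. */
bool on_fast_path(const PixelFormat *src, const PixelFormat *dst,
                  bool uses_transparency)
{
    assert(src != 0 && dst != 0);
    return !uses_transparency && formats_match(src, dst);
}

/* The per-pixel work a mismatch forces: widen green from 5 to 6 bits
   (replicating its top bit) and shift red up by one position. */
uint16_t rgb555_to_rgb565(uint16_t p)
{
    uint16_t r  = (uint16_t)((p >> 10) & 0x1F);
    uint16_t g  = (uint16_t)((p >> 5) & 0x1F);
    uint16_t b  = (uint16_t)(p & 0x1F);
    uint16_t g6 = (uint16_t)((g << 1) | (g >> 4));
    return (uint16_t)((r << 11) | (g6 << 5) | b);
}
```

The "very, very odd" look comes from skipping that conversion: pure red in 555 is 0x7C00, and if those bits are read as 565 they decode to roughly half-intensity red plus half-intensity green.
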