RE: [GD-Windows] BitBlt() syncing to VBL
From: Brian H. <bri...@py...> - 2002-01-15 19:27:08
> Yeah, but assuming that the offscreen backbuffer is on the
> video card, the vidmem-to-vidmem blit is so fast that it's
> essentially free. Certainly doesn't cost you any CPU time to
> queue it up.

If only it were so simple =) You can specify where you want your
offscreen GWorld allocated, and all my allocations are hardcoded into
system memory. DX programming has taught me to stay away from anything
twitchy -- like VRAM/AGP buffers =)

> Yeah, the fact that your frame rate is at the refresh rate -
> it would be a stretch to suspect anything else. Can you
> disable bits of your pipeline and log your framerate to see
> where the bottleneck is? i.e. if you build your frame but don't
> blit it to the card, how many fps do you get? What if you
> just blit an empty frame to the card every time? Etc etc.

I'm going to check all that again. On the DX list this erupted into
the "how to measure time" thread, but for now I'm using QPC, and I'll
see what my timings are like for the screen build and the blit.

> I think the recipe for speed is to minimize blits across
> the bus - composite the new frame in system memory, do one
> blit to the back buffer of the video card, then flip or blit
> back to front. You don't want to send something across the
> bus that will later be overdrawn.

That's what I'm doing.

> I would bet that CopyBits is heavily optimized, but BitBlt
> should be too.

Not necessarily -- I would imagine the Windows engineers decided long
ago that GDI acceleration isn't going to be a major priority and
concentrated their efforts elsewhere. The Mac engineers, on the other
hand, recognize that they need to look for AltiVec optimizations
anywhere they can, since that's part of their marketing strategy (the
"MHz Myth").

> I think the key for both is to make sure
> you're on the fast path - no transparency, pixel formats
> match, palettes (if any) match - so that the function can
> just blast bits.

Well, this goes back to the XLTE_ColorTransform thread. There is some
conversion happening, but it's nearly unavoidable unless I write a
huge explosion of blitters. Right now I'm taking a friend's advice to
do everything in a canonical format (x555 in my case) and let the back
blitter handle the conversion. That's theoretically slower than making
a DDB and then writing every permutation of blitter needed to build my
buffers in DDB land, but that just seems like a failure case waiting
to happen, given the sheer number of pixel formats out there.

Brian
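P.S. Roughly, the QPC timing I have in mind looks like this --
BuildFrame() and BlitFrame() are just hypothetical stand-ins for the
real pipeline stages:

    /* Sketch: time the frame build vs. the blit with
       QueryPerformanceCounter. BuildFrame()/BlitFrame() are
       hypothetical placeholders for the real stages. */
    #include <windows.h>
    #include <stdio.h>

    extern void BuildFrame(void);   /* compose in system memory */
    extern void BlitFrame(void);    /* one blit to the card      */

    void TimeOneFrame(void)
    {
        LARGE_INTEGER freq, t0, t1, t2;

        QueryPerformanceFrequency(&freq);

        QueryPerformanceCounter(&t0);
        BuildFrame();
        QueryPerformanceCounter(&t1);
        BlitFrame();
        QueryPerformanceCounter(&t2);

        printf("build: %.3f ms  blit: %.3f ms\n",
               (t1.QuadPart - t0.QuadPart) * 1000.0 / freq.QuadPart,
               (t2.QuadPart - t1.QuadPart) * 1000.0 / freq.QuadPart);
    }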
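The "compose in system memory, one blit across the bus" setup would
look something like this sketch -- a system-memory DIB section as the
back buffer and a single BitBlt per frame (sizes and the 32-bit format
are placeholders, not my actual ones):

    /* Sketch: system-memory back buffer via CreateDIBSection,
       presented with one BitBlt per frame. Width/height/bit depth
       are placeholder assumptions. */
    #include <windows.h>

    HDC     g_memDC;
    HBITMAP g_backBuffer;
    void   *g_pixels;      /* system-memory frame buffer */

    void CreateBackBuffer(HDC windowDC, int width, int height)
    {
        BITMAPINFO bmi = {0};
        bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
        bmi.bmiHeader.biWidth       = width;
        bmi.bmiHeader.biHeight      = -height;   /* top-down rows */
        bmi.bmiHeader.biPlanes      = 1;
        bmi.bmiHeader.biBitCount    = 32;
        bmi.bmiHeader.biCompression = BI_RGB;

        g_memDC      = CreateCompatibleDC(windowDC);
        g_backBuffer = CreateDIBSection(g_memDC, &bmi, DIB_RGB_COLORS,
                                        &g_pixels, NULL, 0);
        SelectObject(g_memDC, g_backBuffer);
    }

    void PresentFrame(HDC windowDC, int width, int height)
    {
        /* the frame is already composed into g_pixels;
           this is the only trip across the bus */
        BitBlt(windowDC, 0, 0, width, height, g_memDC, 0, 0, SRCCOPY);
    }

Nothing touches the window DC except that one BitBlt, so nothing
crosses the bus only to get overdrawn later.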
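And the canonical-format idea boils down to one back-end conversion
blitter, something along these lines for an x555 source going to a
32-bit destination (just one illustrative permutation, not my actual
blitter):

    /* Sketch: expand x555 (xRRRRRGGGGGBBBBB) pixels to 32-bit xRGB.
       Source/destination layout here is an assumption for
       illustration. */
    #include <stdint.h>
    #include <stddef.h>

    void Blit555ToXRGB8888(const uint16_t *src, uint32_t *dst,
                           size_t count)
    {
        size_t i;
        for (i = 0; i < count; ++i) {
            uint16_t p = src[i];
            uint32_t r = (p >> 10) & 0x1F;
            uint32_t g = (p >> 5)  & 0x1F;
            uint32_t b =  p        & 0x1F;
            /* replicate the top bits so 0x1F maps to 0xFF */
            dst[i] = (((r << 3) | (r >> 2)) << 16) |
                     (((g << 3) | (g >> 2)) << 8)  |
                      ((b << 3) | (b >> 2));
        }
    }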