Re: [PyOpenGL-Users] 2d sprite engine performance.
From: Erik J. <ejo...@fa...> - 2005-04-06 21:04:55
On Wed, 06 Apr 2005 16:24:33 -0400, "Mike C. Fletcher" wrote:
> Btw, IIRC, performance for graphics chips of that age (Rage 128 (which I
> *think* is around the age of a TNT)) was to max out around 1000 to 1500
> textured, lit polygons/second (games use (fairly advanced) culling
> algorithms to reduce the number of polygons on-screen for any given
> rendering pass). Eliminating lighting should increase that to around 2
> or 3000 polys, but that was with the old (small) textures that were
> heavily reduced (32x32 or 64x64).

I believe the Rage 128 is roughly equivalent to a TNT2. I am using
mostly 16 * 16 or 32 * 32 textures, and the numbers that you are quoting
are what make me surprised to be maxing out at 250 polygons.

> I wouldn't be surprised if you're running into texture bandwidth
> problems and maybe even simple fill-rate problems. You may find that
> the card is extremely sensitive to colour mode for its performance (IIRC
> switching a TNT to 16-bit mode would get close to doubling frame-rates
> on our VR system of the time).

I have heard that older cards don't do 32-bit mode well. I will look
into 16-bit. As I said in another message, though, I can do 1400 sprites
with a Pentium 4 at 2.6 GHz using a Matrox G400, which I believe is only
slightly faster than a Rage 128. I don't think the video card is my
limiting factor at the moment. CPU speed seems to make all the
difference.

> Just to be clear:
>
>     * You *are* running those glBegin...glEnd blocks as display-lists,
>       not immediate-mode calls, right?
>           o You create a display list holding each sprite to draw (you
>             may only need one, depends on the proportions of the sprites)
>           o Sort the sprites by texture (and by potential overlap
>             (virtual Z ordering))
>           o Load the texture
>           o for (x,y,z),sprite in textureset:
>                 + glTranslated( x,y,z )
>                 + glCallList( sprite )

Interesting, I hadn't thought of this approach.
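The sort-by-texture loop quoted above might look roughly like the sketch below. It is only an illustration: `draw_sorted` and its two callbacks are hypothetical names, and the callbacks stand in for the real GL calls so that the ordering logic can be run without a GL context. Sorting by z only inside each texture group is an assumption; sprites that overlap across different textures would need a different ordering.

```python
from collections import defaultdict


def draw_sorted(sprites, bind_texture, draw_at):
    """Draw sprites grouped by texture, lowest z first within each group.

    sprites: iterable of (texture_id, (x, y, z), list_id) tuples.
    bind_texture, draw_at: callbacks standing in for the GL calls.
    Returns the number of texture binds issued.
    """
    by_texture = defaultdict(list)
    for texture_id, position, list_id in sprites:
        by_texture[texture_id].append((position, list_id))

    binds = 0
    for texture_id in sorted(by_texture):
        bind_texture(texture_id)  # one bind per texture, not per sprite
        binds += 1
        group = sorted(by_texture[texture_id], key=lambda item: item[0][2])
        for position, list_id in group:
            draw_at(position, list_id)
    return binds


# With PyOpenGL and a live GL context, the callbacks could be something like:
#   bind_texture = lambda tex: glBindTexture(GL_TEXTURE_2D, tex)
#   def draw_at(pos, lst):
#       glPushMatrix()        # keeps glTranslated from accumulating
#       glTranslated(*pos)
#       glCallList(lst)
#       glPopMatrix()
```

The per-sprite cost from Python is then a handful of calls rather than a full glBegin...glEnd block, and the number of binds equals the number of distinct textures.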

> Python is much slower than equivalent C; to get decent performance you
> do need to use a mechanism that pushes the code down into C. I normally
> use array geometry myself, but then I normally do 3D work with game-like
> rendering loads.

I understand this principle, I just haven't found a good way to
implement it with large numbers of polys that can move relative to each
other every frame.

> Display lists likely would help if you're currently drawing the polygons
> with run-time calls: create a single "sprite" display list for your
> standard sprite size, call that once for each sprite (after translate
> and texture load) to do the
> glBegin();glVertex(...);glTexCoord(...);glVertex(...);glTexCoord(...);glVertex(...);glTexCoord(...);glVertex(...);glTexCoord(...);glEnd(...)
> and you've just reduced the number of Python calls by a factor of ~10.
> That *should* have a significant effect on performance.

Yes, I can see this approach helping. I think that it does conflict with
my current scheme of grouping my small textures onto a big texture and
never changing away from my big texture. With your approach, I would
need to keep my small textures and do texture swaps, but I would greatly
reduce my function call overhead. I guess this is where sorting by
texture comes in. I will have to experiment with this and see how it
affects performance. Would creating a unique display list for every
sprite be a viable option?

> >The other optimizations that I can think to try now are cutting out as
> >many glBegins and glEnds as possible, and do big groups of Vertex calls.
> >Or I can try rendering any sprite that won't move for a few frames to
> >the background, and work around the problem by cutting down on the
> >number of sprites I have on screen.
> >
> Sounds like a lot of extra bitmap bandwidth (re-storing the
> background).

I've been avoiding trying this approach for this exact reason.
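On the question of one display list per sprite: a possible way to keep the single big texture is to bake each sub-image's texture coordinates into its own display list, so that no texture swap is needed. The sketch below only computes what such a list would record. `atlas_uv` and `quad_commands` are hypothetical helpers, and the atlas is assumed to be a regular grid of equally sized cells.

```python
def atlas_uv(index, columns, rows):
    """Return (u0, v0, u1, v1) for cell `index` of a columns x rows atlas."""
    if not 0 <= index < columns * rows:
        raise IndexError("cell index outside the atlas")
    row, col = divmod(index, columns)
    return (col / columns, row / rows, (col + 1) / columns, (row + 1) / rows)


def quad_commands(index, columns, rows, size):
    """List the (texcoord, vertex) pairs a display list for one cell would record."""
    u0, v0, u1, v1 = atlas_uv(index, columns, rows)
    return [
        ((u0, v0), (0, 0)),
        ((u1, v0), (size, 0)),
        ((u1, v1), (size, size)),
        ((u0, v1), (0, size)),
    ]


# Compiling one list per cell with PyOpenGL (needs a GL context) could then be:
#   glNewList(list_id, GL_COMPILE)
#   glBegin(GL_QUADS)
#   for uv, xy in quad_commands(cell, 4, 4, 16):
#       glTexCoord2f(*uv)
#       glVertex2f(*xy)
#   glEnd()
#   glEndList()
```

Each frame would then need only a translate and a glCallList per sprite, with the big texture bound once. Whether that many small lists performs well on older cards is something to measure rather than assume.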

> You likely do *not* want to be doing Vertex calls directly
> from Python save to generate a display list (as noted above). Python
> just isn't the right tool for that kind of low-level operation, it has
> too much per-call overhead. If you do that kind of thing you should be
> using array geometry (and be sure you use exactly the correct array type
> for the data-type of the calls you're making to avoid extra copying).

The problem that I ran into with vertex arrays is that while a single
call to glDrawArrays is faster than all the immediate-mode calls, the
overhead of building the needed arrays every frame ended up being too
great and made things slower. I will try using display lists for
individual quads, and hopefully that will help.

> Python is slower than C, but OpenGL has an enormous amount of room to
> play. Using higher-level features from the higher-level language can
> make the experience much more rewarding.

I get the impression that OpenGL can deliver all the speed I want; I
just seem to be having problems unlocking that speed.

Thanks for your help,
Erik
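As a footnote to the vertex-array cost mentioned above: one possible way to reduce the per-frame rebuild is to allocate the vertex buffer once and overwrite only the quads that moved. This is a sketch under assumptions, not a measured fix. `make_vertex_buffer` and `set_quad` are hypothetical names, the layout is assumed to be four (x, y) corners per sprite, and whether a given binding accepts this array type without copying would have to be checked.

```python
from array import array

FLOATS_PER_QUAD = 8  # four (x, y) corners per sprite


def make_vertex_buffer(sprite_count):
    """Allocate the flat float buffer once, outside the frame loop."""
    return array("f", [0.0]) * (FLOATS_PER_QUAD * sprite_count)


def set_quad(buffer, slot, x, y, width, height):
    """Overwrite one sprite's four corners in place; other slots are untouched."""
    base = slot * FLOATS_PER_QUAD
    buffer[base:base + FLOATS_PER_QUAD] = array("f", [
        x, y,
        x + width, y,
        x + width, y + height,
        x, y + height,
    ])


# Per frame, only sprites that moved call set_quad(); the whole buffer would
# then go to the array-drawing calls in one shot (glVertexPointer followed by
# glDrawArrays(GL_QUADS, 0, 4 * sprite_count) in a real program).
```

If most sprites move every frame this saves little, since each move is still a Python-level call, so the display-list route may remain the better fit.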