Re: [PyOpenGL-Users] 2d sprite engine performance.

> >Would creating a unique display list for every sprite be a viable
> >option?
> >  
> >
> Yes, keeping in mind the memory overhead required. Older cards were 
> extremely memory-limited.

I'm thinking that 16 MB of VRAM is a reasonable minimum for anything that
I would seriously consider playing games on, so I should be OK in this
regard.  My textures are small.

> BTW, you are using Textures, not doing copies
> for each frame, right?  Even if you've got more textures than card
> memory, letting OpenGL handle the back-and-forth swapping of textures is
> likely going to be better for performance than anything you're going to
> do.  i.e. use glBindTexture and glGenTextures, not just bald
> glTexImage2D calls... hmm, you know, it's been so long I'm not even sure
> you *can* use bald glTexImage2D in OpenGL... I think you can because of
> the video-display cases... you'd have to be able to if my understanding
> is correct of common practice there... need to go back to doing raw
> OpenGL coding sometime soon :) .

Yes, I'm using real textures.  I have read about the early pre-texture
days of OpenGL, and it doesn't sound like fun.

> >The problem that I ran into with vertex arrays is that while a single
> >call to drawarrays is faster than all the immediate mode calls, the
> >overhead of building the needed arrays every frame ended up being too
> >great and making things slower.
>
> Ah, there's a problem.  You'd want to keep an array handy, with each
> sprite knowing its index and updating the array directly,

This is more sophisticated than what I have tried so far.

I have taken a couple of passes at vertex arrays.  At first, I was just
using

    list.append(vertex)

to create a big Python list every frame and passing this as a vertex
array.  Performance was slower than immediate mode, but not by much.

Then I tried just allocating a big Numeric array of zeros, and using

    array[index] = vertex  # 4 times per sprite

to add sprites to the array as I needed them.  At the end of the frame,
after drawing the array, I would reset the index to 0.  With this second
approach, drawing the array was very fast because no type conversions
were needed, but building the array in this manner was so inefficient
that overall this approach was even slower than my first try.
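
A minimal sketch of how the per-vertex assignments in the second
approach might be collapsed into one array operation.  It assumes NumPy
rather than Numeric, and build_quads, positions, and sizes are
hypothetical names used only for illustration, not code from the engine
discussed here.

```python
import numpy as np

def build_quads(positions, sizes, out=None):
    """Fill an (n, 4, 2) array with the corners of n axis-aligned quads.

    positions: (n, 2) lower-left corners; sizes: (n, 2) widths and heights.
    out: optional preallocated (m, 4, 2) float32 array with m >= n.
    """
    positions = np.asarray(positions, dtype=np.float32)
    sizes = np.asarray(sizes, dtype=np.float32)
    n = len(positions)
    if out is None:
        out = np.zeros((n, 4, 2), dtype=np.float32)
    # Unit-square corner offsets in counter-clockwise order.
    corners = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=np.float32)
    # Broadcasting: (n, 1, 2) + (1, 4, 2) * (n, 1, 2) -> (n, 4, 2),
    # so all sprites are written with a single assignment.
    out[:n] = positions[:, None, :] + corners[None, :, :] * sizes[:, None, :]
    return out[:n]

verts = build_quads([(0, 0), (10, 5)], [(2, 2), (4, 1)])
flat = verts.reshape(-1, 2)  # one row per vertex, 4 rows per sprite
```

Whether this beats immediate mode would still need measuring; the point
is only that the Python-level loop over vertices goes away.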

> so a sprite's
> move command would look like:
> 
>     self.getSpriteVectors()[self.startIndex:self.stopIndex] += delta
> 
> (where delta would be a simple tuple), allowing the array to handle
> updates in Numpy code.

I might need to learn more about Numeric.  I didn't know that you could
do things like the above and have Numeric take care of it without Python
doing a bunch of intermediate steps behind the scenes.
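
A rough sketch of the in-place update suggested above, assuming NumPy
in place of Numeric.  SpriteBatch, Sprite, and the snake_case method and
attribute names are hypothetical; only the slice-plus-delta idea comes
from the quoted suggestion.

```python
import numpy as np

class SpriteBatch:
    """Owns one shared vertex array; sprites address it by index range."""

    def __init__(self, capacity):
        # 4 vertices per sprite, 2 coordinates per vertex.
        self.vertices = np.zeros((capacity * 4, 2), dtype=np.float32)

    def get_sprite_vectors(self):
        # Level of indirection so the array could be replaced on resize.
        return self.vertices


class Sprite:
    def __init__(self, batch, slot):
        self.batch = batch
        self.start_index = slot * 4
        self.stop_index = self.start_index + 4

    def move(self, delta):
        # The slice is a view, so += modifies the shared array in place;
        # the (dx, dy) tuple is broadcast across all four vertices.
        self.batch.get_sprite_vectors()[self.start_index:self.stop_index] += delta


batch = SpriteBatch(capacity=2)
sprite = Sprite(batch, slot=1)
sprite.move((1.5, -2.0))
```

No per-frame array is built here: each move touches only the four rows
that belong to the sprite.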

> You'd want to use the contiguous() function from PyOpenGL whenever you
> resize the array, hence the need for the getSpriteVectors level of
> indirection.  Goal there is that you don't *build* the array for each
> frame (lots of memory copying), but just update it in-place.

I thought that contiguous() just checked whether the array in question
could be passed to OpenGL directly as a pointer instead of being copied.
Am I missing something?
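
For reference, the distinction being asked about can be illustrated in
NumPy terms.  This is an assumption made for the sake of a runnable
example: it uses NumPy's own flags and ascontiguousarray, not PyOpenGL's
contiguous(), whose exact behavior is not shown in this thread.

```python
import numpy as np

vertices = np.zeros((8, 2), dtype=np.float32)

# A freshly allocated array occupies one unbroken block of memory.
assert vertices.flags["C_CONTIGUOUS"]

# Taking every other row gives a strided view, which is not contiguous
# and so could not be handed to C code as a plain pointer.
every_other = vertices[::2]
assert not every_other.flags["C_CONTIGUOUS"]

# ascontiguousarray avoids a copy when the input is already contiguous
# and produces a packed copy otherwise.
same = np.ascontiguousarray(vertices)
packed = np.ascontiguousarray(every_other)
```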

> You have
> to watch out for rotation problems with that approach, however.  Might
> want special code to watch for and fix skew when rotations are in play
> for a given sprite.

With my vertex array approaches I have just bypassed rotated sprites and
used immediate mode to draw them.  Most of my sprites aren't rotated.

> Honestly, though, this kind of code gets messy fast enough that I'd
> avoid it until I'd exhausted the display-list approach.

I will take your advice and start with the display list approach.

> Good luck!

Thank you, and thanks again for your help.  This has been very
informative.

Erik