Thread: [PyOpenGL-Users] Building an interleaved buffer for pyopengl
From: Nick S. <ni...@so...> - 2010-06-30 14:04:40
Attachments:
testing2.py
Hello!

(Full disclosure: I actually posted this on stackoverflow a few months ago but I didn't really receive a good response. url: http://bit.ly/a8W5hR )

I have a series of sprite objects rendering to textured quads. Right now they all have individual render() methods which return an interleaved buffer. I call them all in order and batch these vertices and texture coords before sending them to PyOpenGL's glInterleavedArrays/glDrawArrays. Is there a better way to be doing this? The fastest way seems to be generating a Python list and converting it to a numpy array later, which doesn't seem right.

Thank you for any help! I have attached some example code and its timings. Timings using random numbers on my machine:

$ python testing2.py
from list: 9.02473926544 ms
array: indexed: 14.8707222939 ms
array: slicing: 30.3278303146 ms
array: concat: 26.5012693405 ms
array: put: 45.0277900696 ms
ctypes float array: 26.240530014 ms

- Nick
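The gap Nick measures can be reproduced with a small stand-alone benchmark. This is a sketch only, not the attached testing2.py: the sprite count, the floats-per-sprite layout, and the function names are illustrative assumptions.

```python
# Hypothetical micro-benchmark: build an interleaved buffer by
# appending to a Python list vs. writing into a preallocated array.
import timeit
import numpy as np

NUM_SPRITES = 4096        # assumed sprite count
FLOATS_PER_SPRITE = 16    # assumed: 4 vertices * (2 tex + 2 pos) floats

def build_from_list():
    # Append plain Python floats, then convert once at the end.
    data = []
    for i in range(NUM_SPRITES):
        data.extend([float(i)] * FLOATS_PER_SPRITE)
    return np.array(data, dtype=np.float32)

def build_preallocated(buf):
    # Write into a preallocated array, one sprite-sized slice at a time.
    for i in range(NUM_SPRITES):
        buf[i * FLOATS_PER_SPRITE:(i + 1) * FLOATS_PER_SPRITE] = float(i)
    return buf

buf = np.empty(NUM_SPRITES * FLOATS_PER_SPRITE, dtype=np.float32)
print("list:  %.2f ms" % (timeit.timeit(build_from_list, number=10) * 100))
print("slice: %.2f ms" % (
    timeit.timeit(lambda: build_preallocated(buf), number=10) * 100))
```

Both builders produce the same buffer, so only the construction cost differs; the per-sprite Python loop is the dominant cost in either case, which is the point the replies below make.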
From: Dan H. <Dan...@no...> - 2010-06-30 18:40:45
On 06/30/2010 07:04 AM, Nick Sonneveld wrote:
> Hello!
>
> (Full disclosure: I actually posted this on stackoverflow a few months
> ago but I didn't really receive a good response. url:
> http://bit.ly/a8W5hR )
>
> I have a series of sprite objects rendering to textured quads. Right
> now they all have individual render() methods which return an
> interleaved buffer. I call them all in order and batch these vertices
> and texture coords before sending to pyOpengl's
> glInterleavedArrays/glDrawArrays. Is there a better way to be doing
> this? The fastest way seems to be generating a python list and
> converting to a numpy array later which doesn't seem right.
>
> Thankyou for any help?

I think the answer to improving performance here is actually in one of the responses to the stackoverflow question you posted: "The real savings will be realized by a recasting of the render routine so that you don't have to create a python object for every value that ends up being placed in the buffer."

In other words, don't create a separate object for each sprite, and don't call a render() function once for each sprite. Structure your code so that it's more data oriented rather than object oriented: Fill up a NumPy array all at once with all of your sprites, trying to avoid any looping in Python. Then send that array (interleaved or not) to a PyOpenGL VBO.

Dan
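A sketch of the data-oriented approach Dan describes: derive every sprite's quad corners in one vectorized NumPy step, with no per-sprite render() call. The sprite layout (axis-aligned square quads around a center point) and all names here are assumptions for illustration.

```python
# Vectorized quad construction: one broadcasted add replaces a loop
# over sprites. positions is (N, 2); the result is (N*4, 2) corners.
import numpy as np

def sprites_to_quads(positions, size):
    """Expand (N, 2) sprite centers into (N*4, 2) quad corner vertices."""
    offsets = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]],
                       dtype=np.float32) * (size / 2.0)
    # Broadcasting: (N, 1, 2) + (4, 2) -> (N, 4, 2), then flatten.
    return (positions[:, None, :] + offsets).reshape(-1, 2)

positions = np.array([[0.0, 0.0], [10.0, 5.0]], dtype=np.float32)
quads = sprites_to_quads(positions, size=2.0)
```

The resulting array could then be handed to PyOpenGL's `OpenGL.arrays.vbo.VBO` wrapper for upload; texture coords, being per-vertex constants here, could be pregenerated once and interleaved or bound as a second array.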
From: Nick S. <ni...@so...> - 2010-07-01 00:53:43
Hello, thanks for your response!

On Thu, Jul 1, 2010 at 4:40 AM, Dan Helfman <Dan...@no...> wrote:
> I think the answer to improving performance here is actually in one of
> the responses to the stackoverflow question you posted: "The real
> savings will be realized by a recasting of the render routine so that
> you don't have to create a python object for every value that ends up
> being placed in the buffer."

I believe the point the poster bpowah on stackoverflow was making was just that I should reduce the number of objects (e.g. by pregenerating texture coord lists)? At least that's how I interpreted it.

> In other words, don't create a separate object for each sprite, and
> don't call a render() function once for each sprite. Structure your code
> so that it's more data oriented rather than object oriented: Fill up a
> NumPy array all at once with all of your sprites, trying to avoid any
> looping in Python. Then send that array (interleaved or not) to a
> PyOpenGL VBO.

The problem is that I can't really see how to do things such as collision detection and response without doing at least one loop through all sprite objects. (I step through each sprite individually and do sprite movement and collision response to ensure no overlapping sprites.) Wouldn't things such as updating sprite position and adjusting for collision response mean the vertices change on every update? I can fill up a numpy array, but it seems that modifying the array is slower than just creating a new numpy array from a python list.

The random() methods may be a red herring. Testing with a dummy method (that doesn't add data to a list) shows that render() in the example does take a while to run. It doesn't quite explain why generating a new numpy array from a list is quicker than just modifying a preallocated numpy array, though.

# dummy array.. just iterate but don't save
def create_array_dummy():
    for x in xrange(4096):
        render(x)
    return []

$ python testing2.py   # on another, slower machine
from dummy: 9.96989965439 ms
from list: 16.1171197891 ms
etc...

- Nick
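One more variant worth timing alongside the list approach, not suggested in the thread itself: `np.fromiter` consumes a generator directly and skips the intermediate Python list entirely. The `render()` below is a hypothetical stand-in for the real per-sprite method.

```python
# np.fromiter builds the array straight from a generator; passing
# count= lets NumPy allocate the result in one step.
import numpy as np

def render(x):
    # Placeholder per-sprite render(); returns 4 dummy floats.
    return (float(x), float(x), 0.0, 1.0)

def create_array_fromiter(n=4096):
    gen = (value for x in range(n) for value in render(x))
    return np.fromiter(gen, dtype=np.float32, count=n * 4)
```

Whether this beats the plain-list path depends on the NumPy version and how expensive render() itself is, so it would need to be measured with the same harness as the other timings.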
From: Dan H. <Dan...@no...> - 2010-07-01 18:23:56
On 06/30/2010 05:53 PM, Nick Sonneveld wrote:
> Hello, thanks for your response!
>
> On Thu, Jul 1, 2010 at 4:40 AM, Dan Helfman <Dan...@no...> wrote:
>> I think the answer to improving performance here is actually in one of
>> the responses to the stackoverflow question you posted: "The real
>> savings will be realized by a recasting of the render routine so that
>> you don't have to create a python object for every value that ends up
>> being placed in the buffer."
>
> I believe the point the poster bpowah on stackoverflow was making was
> just that I should reduce the number of objects (eg pregenerating
> texture coord lists)? At least I guess that's how I interpreted it.

Regardless of what the poster meant, pre-generating texture coords is a good idea if you can get away with it.

>> In other words, don't create a separate object for each sprite, and
>> don't call a render() function once for each sprite. Structure your code
>> so that it's more data oriented rather than object oriented: Fill up a
>> NumPy array all at once with all of your sprites, trying to avoid any
>> looping in Python. Then send that array (interleaved or not) to a
>> PyOpenGL VBO.
>
> The problem is that I can't really see how to do things such as
> collision detection and response without doing at least one loop
> through all sprite objects. (I step through each sprite individually
> and do sprite movement and collision response to ensure no overlapping
> sprites). Wouldn't things such as updating sprite position and
> adjusting for collision response mean the vertices change on every
> update? I can fill up a numpy array but it seems that modifying the
> array is slower than just creating a new numpy array from a python
> list.

A couple of ideas that may or may not be applicable to your program:

First, try to vectorize your use of NumPy arrays whenever possible. For instance, let's say you've got an array containing all sprite positions. You can create another array of movement/velocity vectors for all sprites. (An example of one velocity vector might be [ +10, -5 ].) And then each frame just do:

    sprite_positions += sprite_velocities

to move the sprites in bulk without having to loop. For many sprites, velocities will only change occasionally, so you won't have to update all of them at once.

Second, if you really must loop over something, do it in C or Cython rather than pure Python. And for many cases, you can avoid having to do this by replacing a loop with a single NumPy operator or function call as described above.

> The random() methods may be a red herring. Testing with a dummy
> method (that doesn't add data to a list) shows that render() in the
> example does take a while to run. It doesn't quite explain why
> generating a new numpy array from a list is quicker than just
> modifying a preallocated numpy array though.

I'm not sure why that's happening, but ideally you'd be able to avoid generating a new list or a new array every frame. If you can generate an array once and then reuse it for as long as possible, that would be preferable.

Dan
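Dan's bulk-move idea as a runnable sketch: positions and velocities for all sprites live in two parallel arrays, and a single in-place add advances every sprite per frame. The array shapes, values, and the `dt` parameter are illustrative assumptions.

```python
# One vectorized add moves every sprite; no per-sprite Python loop.
import numpy as np

# (N, 2) arrays: one row per sprite.
sprite_positions = np.array([[0.0, 0.0], [100.0, 50.0]], dtype=np.float32)
sprite_velocities = np.array([[10.0, -5.0], [0.0, 2.0]], dtype=np.float32)

def step(positions, velocities, dt=1.0):
    # In-place update: modifies the positions array rather than
    # allocating a new one each frame.
    positions += velocities * dt
    return positions

step(sprite_positions, sprite_velocities)
```

Because the update is in place, the same array can be reused every frame and re-uploaded to the VBO, which also addresses Nick's concern about per-frame allocation; collision response would then adjust only the affected rows rather than rebuilding the whole buffer.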