Re: [PyOpenGL-Users] Rolling spectrogram with pyopengl
From: Timothée L. <tim...@lp...> - 2009-12-09 10:50:52
Ian Mallett wrote:
> 2009/12/6 Timothée Lecomte <tim...@lp...
> <mailto:tim...@lp...>>
>
> > Hi Ian,
> >
> > I precisely need it to run as fast as the refresh rate, and it
> > already is running fast enough, either with opengl or with pure
> > 2D. However, it burns too much CPU to my taste, since the drawing
> > part takes more time than the processing part, although the latter
> > is quite heavy (FFT...), so I think there is room for large
> > improvements.
>
> I'm confused. The CPU doesn't work too hard to coordinate the actions
> of the GPU--it just takes time to do it, because it must wait, as Gijs
> explained. If the CPU is working hard, that means something else is
> going on. You can reduce processing time by optimizing, if
> applicable, and/or using a JIT compiler (e.g. psyco).
>
> It sounds like you're computing the FFT (which I assume is for signal
> processing) on the CPU. If you're doing that for a thousand some
> times every frame, that is likely to be your speed problem. Using a
> shader would take all that load of the CPU.
> <...>
>
> Ian

Hi Ian,

Thanks for your comment. I understand your idea about the FFT. However, in my case profiling shows that drawing is the bottleneck, not processing.

To give you a more precise idea, here is the result of cProfile on my application:
http://imgur.com/deMyT.png

It's a bit crowded, so I'll try to explain where the relevant pieces of information are (it's also slightly different from my first post in this thread, since I've learnt to use PBOs, VBOs and a bit of shaders in the meantime).

Starting from the top of the profile, we see:

    <built-in method exec_> 99.49% (73.94%)

This is the main loop, which accounts for essentially 100% of the total application time (minus 0.51% for initialization). And it is idle 73.94% of the time (which is already quite good!).

Then, we have three leaves below exec_, two of which are prominent:

    spectrogram_timer_slot 8.11%

This is the function that retrieves the data from the audio card and does the FFT and other math processing. That's 8% of the total time, and the FFT in particular is only 1% of the total time.

And we have:

    paintGL 17.10%

This function again does a little bit of math, plus the OpenGL calls. The math is mostly some interpolation (1.17%) and the computation of pixel colors (2.85%) (which could be done in a fragment shader, by the way).

Finally, the OpenGL calls are:

send_data (2.14%), which copies my ~1000 pixels to a PBO with:

    GL.glBufferData(GL.GL_PIXEL_UNPACK_BUFFER_ARB, byteString, GL.GL_DYNAMIC_DRAW)

put_data_in_texture (5.23%), which uses the PBO to update the texture with:

    GL.glTexSubImage2D(GL.GL_TEXTURE_2D, 0, self.offset, 0, 1, height,
                       GL.GL_BGRA, GL.GL_UNSIGNED_INT_8_8_8_8_REV, None)
    GL.glTexSubImage2D(GL.GL_TEXTURE_2D, 0, self.offset + self.canvas_width, 0, 1, height,
                       GL.GL_BGRA, GL.GL_UNSIGNED_INT_8_8_8_8_REV, None)

realpaint (4.52%), which draws using two VBOs with:

    GL.glBindBuffer(GL.GL_ARRAY_BUFFER, self.vertex_vbo)
    GL.glVertexPointer(2, GL.GL_FLOAT, 0, None)
    GL.glBindBuffer(GL.GL_ARRAY_BUFFER, self.texture_vbo)
    GL.glTexCoordPointer(2, GL.GL_FLOAT, 0, None)
    # self.program contains a pixel shader
    # that draws the texture with a small offset on each frame
    GL.glUseProgram(self.program)
    GL.glUniform1f(self.loc, xoff)
    GL.glDrawArrays(GL.GL_QUADS, 0, 4)

All of these OpenGL commands boil down to a single entry in the profile:

    wrapper:__call__ (10.45%)

In conclusion, the FFT is only ~1% while OpenGL drawing is ~11%! That's why I want to improve the drawing and not the processing.

Now this profile raises another question: wrapper:__call__ seems to go through a lot of slow Python calls (calculate_pyArgs, calculate_cArguments...), whereas it does not seem to spend much time in the actual OpenGL functions (3% if I count correctly). Is there a way to improve this?

I hope this was not too complicated!

Thanks for your help.

Timothée
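
Editorial sketch: the two glTexSubImage2D calls in put_data_in_texture write the same column at x = self.offset and at x = self.offset + self.canvas_width. One plausible reading, which the post does not confirm, is that the texture is 2 * canvas_width columns wide, so that a contiguous window of canvas_width columns is always available to draw without moving old data. The pure-Python model below illustrates that idea only; the class and method names are hypothetical and no OpenGL is involved.

```python
# Pure-Python model of a "write each column twice" rolling texture.
# Assumption (not stated in the post): the texture holds 2 * canvas_width
# columns. RollingColumns, push and visible are hypothetical names.


class RollingColumns:
    def __init__(self, canvas_width):
        self.canvas_width = canvas_width
        self.offset = 0  # next column to overwrite, always in [0, canvas_width)
        # Stand-in for the texture: one list entry per texture column.
        self.texture = [None] * (2 * canvas_width)

    def push(self, column):
        # Mirrors the two glTexSubImage2D calls: the same data goes to
        # x = offset and to x = offset + canvas_width.
        self.texture[self.offset] = column
        self.texture[self.offset + self.canvas_width] = column
        self.offset = (self.offset + 1) % self.canvas_width

    def visible(self):
        # Contiguous window of canvas_width columns, oldest first.
        # In the real program the shader's x offset would select this window.
        start = self.offset
        return self.texture[start:start + self.canvas_width]


if __name__ == "__main__":
    roll = RollingColumns(canvas_width=4)
    for frame in range(6):
        roll.push(frame)
    print(roll.visible())  # the four most recent columns, oldest first
```

Under this assumption each frame uploads only two one-pixel-wide columns, and scrolling is just a change of the horizontal offset passed to the shader, which matches the per-frame glUniform1f(self.loc, xoff) call in realpaint.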