Re: [PyOpenGL-Users] Perl vs. Python OpenGL bindings benchmarks
From: Mike C. F. <mcf...@vr...> - 2007-06-05 03:02:57
Aleksandar B. Samardzic wrote:
> On 6/4/07, Mike C. Fletcher <mcf...@vr...> wrote:
> ...
>> PyOpenGL does a lot more than just wrap the C call.  In particular, it
>> adds a glGetError call after *every* call to provide Python-like
>> exception-on-error semantics instead of requiring explicit error checks
>> in the code.  The current 3.0.0a6 version also does a logging-enabling
>> call for every function (I'm thinking of making that feature default to
>> off, though it would mean less informative error messages).
>> ...
> Take a look say into benchmark() function in trislam_pyopengl_ctp.py -
> benchmark consist of same drawing sequence rendered either 100 times
> or for 10 seconds, whatever takes less.  For vertex arrays (for
> example, draw_quads_va() function), drawing sequence consists
> practically of glVertexPointer() and glDrawArrays() calls; further,
> array of vertex coordinates that is passed to glVertexPointer() is
> prepared beforehand (before benchmark() function loop entered).  Now I
> guess you, as PyOpenGL primary developer, may know better, but if
> array of vertex coordinates is already ctypes based array (and I sent
> some tiny patches, so trislam_pyopengl_ctp.py is benchmark version
> that is ctypes based now), then from what I can see in PyOpenGL code,
> overhead has to be negligible...
>
> On the other side, I do think that alike benchmark could be very
> useful in pointing to performance bottlenecks of a wrapper; thus,
> while it is indeed very simple benchmark, I guess PyOpenGL could only
> benefit if you find some time, sometimes, to play with trislam.
>

Here's what a cProfile run of their "benchmark" method shows (they had a
.tar.gz on the site; I'd thought it was just the one file, but it includes
the whole suite):

    * In ~400s, 66s are taken up with the error-checking functionality.
      Most of that time is in the individual-vertex versions (array
      versions amortize the cost of the operation over the size of the
      array)
    * 24.1s is due to overhead from the wrapper objects
          o 6.5s is due to direct overhead from the "wrapper" objects
            (mostly bookkeeping (e.g. temporary list creation) and
            function-call overhead)
          o Array conversion overhead
                + 8.4s is due to overhead in determining size of ctypes
                  arrays automatically
                + 7.9s is due to overhead in determining the "stride
                  size" of ctypes arrays automatically

The ctypes arrays are spending so much time in those two functions
because they don't have a "dims" member, as seen in numpy arrays, so they
have a little while loop that adds up the length of the sub-component
arrays, producing lots of intermediate objects as a result.

Making the error-checking an optional C module would be doable, but it's
still going to be a fairly large part of total run-time (we would see a
maximum of 16% speedup, since error checking is 66s of the ~400s total).
Similarly, we might get a 1.5% speedup by writing the wrapper function in
C.  We might get that to 6% if we were to write the whole of the wrapping
system in C (which I really do *not* want to do).  Altogether we could
get maybe 25% from the lower-hanging fruit of optimisation.

We also seem to spend 7.9s (2%) in the time.time function and almost 207s
(52% of time) waiting for glFlush calls to complete.  That would suggest
to me that we're seeing synchronization issues skewing results at least
part of the time.

I'm running all of this on Python 2.5 on a Gentoo Linux box, by the way.
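
To make the ctypes point above concrete, here is a minimal sketch of how the dimensions of a nested ctypes array can be recovered by walking its type. This is an illustration only, not PyOpenGL's actual code: the helper names `ctypes_dims` and `unit_size` are hypothetical, and the sketch relies only on the standard `_length_` and `_type_` attributes that every ctypes array type carries. A numpy array exposes the same information directly as its `shape`, with no walk needed.

```python
import ctypes


def ctypes_dims(array):
    """Return the dimensions of a (possibly nested) ctypes array as a tuple.

    Hypothetical helper for illustration; not part of PyOpenGL.
    """
    dims = []
    array_type = type(array)
    # Every ctypes array type stores its element count in _length_ and its
    # element type in _type_.  Nested arrays nest these types, so we walk
    # inward until we reach a non-array element type such as c_float.
    while issubclass(array_type, ctypes.Array):
        dims.append(array_type._length_)
        array_type = array_type._type_
    return tuple(dims)


def unit_size(array):
    """Return the innermost dimension, e.g. 3 for an (N, 3) vertex array."""
    return ctypes_dims(array)[-1]


# Four vertices of three floats each, as might be handed to glVertexPointer.
vertices = ((ctypes.c_float * 3) * 4)()
print(ctypes_dims(vertices))
print(unit_size(vertices))
```

Because the walk depends only on the array's type, its cost is paid again on every call that passes an array unless the result is kept somewhere, which is consistent with the per-call overhead visible in the profile.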
Anyway, hope that helps somewhat,
Mike

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      338    0.237    0.001    0.242    0.001 PyBench.py:311(fade_to_white)
   369465    1.329    0.000    1.826    0.000 arraydatatype.py:17(getHandler)
   123155    0.557    0.000    1.278    0.000 arraydatatype.py:45(dataPointer)
   123155    0.542    0.000    1.820    0.000 arraydatatype.py:55(voidDataPointer)
   123155    0.797    0.000    1.649    0.000 arraydatatype.py:59(asArray)
   123155    0.636    0.000    7.987    0.000 arraydatatype.py:71(unitSize)
   123155    0.452    0.000    1.941    0.000 arrayhelpers.py:52(__call__)
   123155    0.449    0.000    8.436    0.000 arrayhelpers.py:78(arraySizeOfFirst)
   123155    0.348    0.000    0.348    0.000 contextdata.py:22(getContext)
   123155    0.815    0.000    1.489    0.000 contextdata.py:35(setValue)
   123155    0.449    0.000    2.097    0.000 converters.py:111(__call__)
   123155    0.185    0.000    0.185    0.000 converters.py:155(__call__)
   123155    0.192    0.000    0.192    0.000 converters.py:169(__call__)
   492620    1.779    0.000    2.866    0.000 ctypesarrays.py:55(types)
   369465    1.727    0.000    5.339    0.000 ctypesarrays.py:63(dims)
   123155    0.169    0.000    0.169    0.000 ctypesarrays.py:69(asArray)
   123155    1.441    0.000    6.781    0.000 ctypesarrays.py:72(unitSize)
 17225384   48.363    0.000   66.352    0.000 error.py:162(glCheckError)
 14887425   17.989    0.000   17.989    0.000 error.py:191(nullGetError)
   125123    0.217    0.000    0.217    0.000 error.py:209(onBegin)
   125123    0.208    0.000    0.208    0.000 error.py:212(onEnd)
   125123    0.875    0.000    1.632    0.000 exceptional.py:44(glBegin)
   125123    1.020    0.000    1.552    0.000 exceptional.py:48(glEnd)
       49    0.000    0.000    0.000    0.000 formathandler.py:90(registerEquivalent)
    49176    0.058    0.000    0.058    0.000 trislam_pyopengl_ctp.py:156(draw_empty)
       26    0.000    0.000    0.000    0.000 trislam_pyopengl_ctp.py:159(stats_empty)
     5583   17.723    0.003   34.072    0.006 trislam_pyopengl_ctp.py:162(draw_quads)
    20661    2.306    0.000    6.676    0.000 trislam_pyopengl_ctp.py:173(draw_quads_va)
       52    0.000    0.000    0.000    0.000 trislam_pyopengl_ctp.py:181(stats_quads)
     6829   12.125    0.002   24.593    0.004 trislam_pyopengl_ctp.py:189(draw_qs)
    18933    4.886    0.000   10.204    0.001 trislam_pyopengl_ctp.py:198(draw_qs_va)
    24497    0.351    0.000    0.409    0.000 trislam_pyopengl_ctp.py:210(draw_qs_dl)
    22370    0.540    0.000    4.787    0.000 trislam_pyopengl_ctp.py:214(draw_qs_va_dl)
      104    0.000    0.000    0.000    0.000 trislam_pyopengl_ctp.py:224(stats_qs)
     4010   22.822    0.006   44.041    0.011 trislam_pyopengl_ctp.py:232(draw_tris)
    20307    2.815    0.000    7.291    0.000 trislam_pyopengl_ctp.py:246(draw_tris_va)
       52    0.000    0.000    0.000    0.000 trislam_pyopengl_ctp.py:256(stats_tris)
     9041   12.119    0.001   24.633    0.003 trislam_pyopengl_ctp.py:264(draw_ts)
    20087    4.529    0.000    9.986    0.000 trislam_pyopengl_ctp.py:273(draw_ts_va)
    29773    0.335    0.000    0.405    0.000 trislam_pyopengl_ctp.py:285(draw_ts_dl)
    20797    0.501    0.000    4.496    0.000 trislam_pyopengl_ctp.py:289(draw_ts_va_dl)
      104    0.000    0.000    0.000    0.000 trislam_pyopengl_ctp.py:299(stats_ts)
   252064    2.706    0.000    3.402    0.000 trislam_pyopengl_ctp.py:699(start_frame)
   252064  205.511    0.001  207.370    0.001 trislam_pyopengl_ctp.py:703(end_frame)
      339    7.990    0.024  398.633    1.176 trislam_pyopengl_ctp.py:707(benchmark)
   123155    6.467    0.000   24.188    0.000 wrapper.py:294(wrapperCall)
   123155    0.148    0.000    0.148    0.000 {_ctypes.addressof}
   492620    0.552    0.000    0.552    0.000 {callable}
   738930    1.246    0.000    1.246    0.000 {getattr}
       49    0.000    0.000    0.000    0.000 {hasattr}
   369465    0.587    0.000    0.587    0.000 {isinstance}
   247663    0.325    0.000    0.325    0.000 {len}
  1108733    1.279    0.000    1.279    0.000 {method 'append' of 'list' objects}
      339    0.001    0.000    0.001    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   615775    0.823    0.000    0.823    0.000 {method 'get' of 'dict' objects}
       98    0.000    0.000    0.000    0.000 {method 'has_key' of 'dict' objects}
      338    0.001    0.000    0.001    0.000 {method 'keys' of 'dict' objects}
       13    0.000    0.000    0.000    0.000 {method 'write' of 'file' objects}
   246655    0.555    0.000    0.555    0.000 {range}
  3848500    7.971    0.000    7.971    0.000 {time.time}
   246310    0.584    0.000    0.584    0.000 {zip}

For completeness, the diff against the trislam version in their archive,
this version will run under Python 2.5 (which provides the cProfile
module):

mcfletch@raistlin:~/tmp/trislam$ diff trislam_pyopengl_ctp.py trislam_pyopengl_ctp_profile.py
2a3
> from __future__ import division
15a17,18
> ##from OpenGL.error import ErrorChecker
> ##ErrorChecker.registerChecker( ErrorChecker.nullGetError )
20d22
< from __future__ import division
25a28,29
> import cProfile
> PROFILER = cProfile.Profile()
662a667,668
> PROFILER.print_stats( )
> PROFILER.dump_stats( 'test.profile' )
670a677
>
671a679
> import cProfile
676c684
< benchmark()
---
> PROFILER.runcall( benchmark )
914,919c922,928
< init()
< print "Benchmarks:",
< glutDisplayFunc(display)
< glutIdleFunc(display)
< glutKeyboardFunc(keyboard)
< glutMainLoop()
---
> if __name__ == "__main__":
>     init()
>     print "Benchmarks:",
>     glutDisplayFunc(display)
>     glutIdleFunc(display)
>     glutKeyboardFunc(keyboard)
>     glutMainLoop()

-- 
________________________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com
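
For readers who want to reproduce this kind of measurement, the profiling pattern used in the diff above (`cProfile.Profile()` plus `runcall`) can be sketched in isolation as follows. The `benchmark` function here is a hypothetical stand-in for the trislam one, and the report is sorted by cumulative time rather than the "standard name" ordering shown in the listing; both choices are assumptions made for the sake of a short, self-contained example.

```python
import cProfile
import io
import pstats


def benchmark(iterations=1000):
    """Hypothetical stand-in for the trislam benchmark() function."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


# Profile a single call, as the diff does with PROFILER.runcall( benchmark ).
profiler = cProfile.Profile()
result = profiler.runcall(benchmark)

# Collect the report into a string instead of writing straight to stdout,
# and sort by cumulative time so the most expensive call chains come first.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Calling `profiler.dump_stats('test.profile')`, as the diff does, would additionally save the raw data so it can be reloaded later with `pstats.Stats('test.profile')`.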