From: Sebastian H. <ha...@ms...> - 2004-06-25 16:49:50
|
Hi,

The long story is that I'm looking for a good, fast graph plotting program, and I found WxPyPlot (http://www.cyberus.ca/~g_will/wxPython/wxpyplot.html). It uses wxPython and plots 25000 data points (with lines + square markers) in under one second - using Numeric, that is. [The slow line in WxPyPlot is:

    dc.DrawLines(self.scaled)

where self.scaled is an array of shape (25000,2) and type Float64.]

The short story is that numarray takes maybe 10 times as long as Numeric, and I tracked the problem down to the wxPython SWIG typemap, which does this:

<code-snippet from wxPoint_LIST_helper() in helpers.cpp from wxPython>
wxPoint* wxPoint_LIST_helper(PyObject* source, int *count) {
    <snip>
    bool isFast = PyList_Check(source) || PyTuple_Check(source);
    <snip>
    for (x=0; x<*count; x++) {
        // Get an item: try fast way first.
        if (isFast) {
            o = PySequence_Fast_GET_ITEM(source, x);
        }
        else {
            o = PySequence_GetItem(source, x);
            if (o == NULL) {
                goto error1;
            }
        }
</code-snippet>

I'm not 100% sure that this is where the problem lies - is there a chance (or a known issue) that numarray does PySequence_GetItem() slower than Numeric?

I just ran this again using the Python profiler and I get this with numarray:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.140    1.140    1.320    1.320 gdi.py:554(DrawLines)
        1    1.250    1.250    1.520    1.520 gdi.py:792(_DrawRectangleList)
    50230    0.450    0.000    0.450    0.000 numarraycore.py:501(__del__)

and this with Numeric:

        1    0.080    0.080    0.080    0.080 gdi.py:554(DrawLines)
        1    0.090    0.090    0.090    0.090 gdi.py:792(_DrawRectangleList)

Thanks,
Sebastian Haase |
From: John H. <jdh...@ac...> - 2004-06-25 21:36:36
|
>>>>> "Sebastian" == Sebastian Haase <ha...@ms...> writes:

    Sebastian> Hi, The long story is that I'm looking for a good/fast
    Sebastian> graph plotting programs; so I found WxPyPlot
    Sebastian> (http://www.cyberus.ca/~g_will/wxPython/wxpyplot.html)
    Sebastian> It uses wxPython and plots 25000 data points (with
    Sebastian> lines + square markers) in under one second - using
    Sebastian> Numeric that is.

Not an answer to your question ....

matplotlib has full numarray support (no need to rely on sequence API). You need to set NUMERIX='numarray' in setup.py before building it *and* set

    numerix : numarray

in the matplotlib rc file. If you don't do both of these things, your numarray performance will suffer, sometimes dramatically.

With this test script

    from matplotlib.matlab import *
    N = 25000
    x = rand(N)
    y = rand(N)
    scatter(x,y, marker='s')
    #savefig('test')
    show()

you can do a scatter plot of squares, on my machine in under a second, using numarray (wxagg or agg backend). Some fairly recent changes to matplotlib have moved this drawing into extension code, with an approx 10x performance boost from older versions. The latest version on the sf site (0.54.2), however, does have these changes.

To plot markers with lines, you would need

    plot(x,y, marker='-s')

instead of scatter. This is considerably slower (approx 3s on my system), mainly because I haven't ported the new fast drawing of marker code to the line class. This is an easy fix, however, and will be added in short order.

JDH |
From: Sebastian H. <ha...@ms...> - 2004-06-25 22:33:29
|
Hi John,

I wanted to try matplotlib a few days ago, but first I had some trouble compiling it (my Debian still uses gcc 2.95, which doesn't understand some 'std' namespace/template stuff) - and then it compiled, but segfaulted. Maybe I didn't get the "set NUMERIX" stuff right - how do I know that it actually built _and_ uses the wx backend?

BTW, from the profiling/timing I did you can tell that wxPyPlot actually plots 25000 data points in 0.1 secs - so it's _really_ fast ... So it would be nice to get to the bottom of this ...

Thanks for the comment,
Sebastian

On Friday 25 June 2004 02:12 pm, John Hunter wrote:
> [SNIP]
> matplotlib has full numarray support (no need to rely on sequence
> API). You need to set NUMERIX='numarray' in setup.py before building
> it *and* set numerix : numarray in the matplotlib rc file. If you
> don't do both of these things, your numarray performance will suffer,
> sometimes dramatically.
> [SNIP] |
From: Chris B. <Chr...@no...> - 2004-06-28 18:01:32
|
Sebastian Haase wrote:
> BTW, from the profiling/timing I did you can tell that wxPyPlot actually plots
> 25000 data points in 0.1 secs - so it's _really_ fast ...

Actually, it's probably not that fast, if you are timing on Linux/wxGTK/X-Windows. X is asynchronous, so what you are timing is how long it takes your program to tell X what to draw, but it may take longer than that to actually draw it. However, what you are timing is all the stuff that is affected by numarray/Numeric.

I worked on part of the wxPython DC.DrawXXXList stuff, and I really wanted a Numeric native version, but Robin really didn't want an additional dependency. We discussed on this list a while back whether you could compile against Numeric, but let people run without it, and have it all work unless someone actually used it. What makes that tricky is that the functions that test whether a PyObject is a Numeric array are in Numeric... but it could probably be done if you tried hard enough (maybe include just that function in wxPython...). The same applies for numarray support.

Anyway, as it stands, wxPython DC methods are faster with lists or tuples of values than with Numeric or numarray arrays. You might try converting to a list with the array's tolist() method before making the DC call.

Another option is to write a few specialized DC functions that take numarray arrays to draw, but are not included in wxPython. I think you'd get it as fast as possible that way. I intend to do that some day. If you want to get it started, I'll help. You could probably get a particularly nice improvement in drawing a lot of rectangles, as looping through an N x 4 array of coords directly would be much, much faster than using the Sequence API on the whole thing, and on each item, and checking at every step what kind of object everything is.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no... |
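The suggestion above, converting to a list before the DC call, can be sketched as a small helper. This is only a minimal sketch, not wxPython code: `as_point_list` and `FakeArray` are hypothetical names, and the only thing assumed about the array object is that it offers a `tolist()` method, as the message says numarray arrays do.

```python
def as_point_list(points):
    """Return `points` as a plain Python list of coordinate pairs.

    Hypothetical helper: if the object has a tolist() method, use it,
    so the consumer sees a real list and can take its fast path.
    Otherwise fall back to copying item by item.
    """
    tolist = getattr(points, "tolist", None)
    if callable(tolist):
        return tolist()
    return [tuple(p) for p in points]


class FakeArray:
    """Stand-in for an Nx2 array so the sketch runs without numarray."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, i):
        return self._rows[i]

    def tolist(self):
        # Return an independent nested list, like an array's tolist().
        return [list(r) for r in self._rows]


if __name__ == "__main__":
    scaled = FakeArray([(0, 0), (10, 5), (20, 15)])
    print(as_point_list(scaled))
    # With wxPython the call would then look like:
    #     dc.DrawLines(as_point_list(self.scaled))
```

The conversion costs one pass over the data, so it only pays off if the consumer's list path is much cheaper than its generic sequence path.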
From: Todd M. <jm...@st...> - 2004-06-28 19:03:51
|
On Mon, 2004-06-28 at 13:59, Chris Barker wrote:
> [SNIP]
> I worked on part of the wxPython DC.DrawXXXList stuff, and I really
> wanted a Numeric native version, but Robin really didn't want an
> additional dependency. We discussed on this list a while back whether
> you could compile against Numeric, but let people run without it, and
> have it all work unless Someone actually used it. What makes that tricky
> is that the functions that test whether a PyObject is a Numeric array
> are in Numeric... but it could probably be done if you tried hard enough
> (maybe include just that function in wxPython...)

numarray-1.0 has two macros for dealing with this: PyArray_Present() and PyArray_isArray(obj). The former (safely) determines that numarray is installed, while the latter determines that numarray is installed and that obj is a NumArray. Both macros serve to guard sections of code which make more extensive use of the numarray C-API to keep them from segfaulting when numarray is not installed. I think this would be easy to do for Numeric as well.

One problem is that compiling a "numarray improved" extension requires some portion of the numarray headers. I refactored the numarray includes so that a relatively simple set of 3 files can be used to support the Numeric compatible interface (for numarray). These could either be included in core Python (with a successful PEP) or included in interested packages. This approach adds a small source code burden somewhere, but eliminates the requirement for users to have numarray installed either to run or compile from source.

I'll send out the draft PEP later today.

Regards,
Todd |
From: Sebastian H. <ha...@ms...> - 2004-06-28 21:14:27
|
On Monday 28 June 2004 12:03 pm, Todd Miller wrote:
> [SNIP]
> I'll send out the draft PEP later today.
>
> Regards,
> Todd

My original question was just this: Does anyone know why numarray is maybe 10 times slower than Numeric with that particular code segment (PySequence_GetItem)?

- Sebastian |
From: Todd M. <jm...@st...> - 2004-06-28 23:18:35
|
On Mon, 2004-06-28 at 17:14, Sebastian Haase wrote:
> [SNIP]
> My original question was just this: Does anyone know why numarray is maybe 10
> times slower than Numeric with that particular code segment
> (PySequence_GetItem) ?

Well, the short answer is probably: no.

Looking at the numarray sequence protocol benchmarks in Examples/bench.py, and looking at what wxPython is probably doing (fetching a 1x2 element array from an Nx2 and then fetching 2 numerical values from that)... I can't fully nail it down. My benchmarks show that numarray is 4x slower for fetching the two element array but only 1.1x slower for the two numbers; that makes me expect at most 4x slower.

Noticing the 50k __del__ calls in your profile, I eliminated __del__ (breaking numarray) to see if that was the problem; the ratios changed to 2.5x slower and 0.9x slower (actually faster) respectively.

The large number of "Check" routines preceding the numarray path (I count 7 looking at my copy of wxPython) has me a little concerned. I think those checks are more expensive for numarray because it is a new style class. I have a hard time imagining a 10x difference overall, but I think Python does have to traverse the numarray class hierarchy rather than do a type pointer comparison, so they are more expensive.

Is 10x a measured number or a gut feel?

One last thought: because the sequence protocol is being used rather than raw array access, compiling matplotlib for numarray (or not) is not the issue.

Regards,
Todd |
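The access pattern described above (fetch a 2-element row from an Nx2 sequence, then fetch two numbers from that row) can be mimicked in pure Python to see where per-item overhead might come from. This is a toy model under stated assumptions, not numarray's implementation: `Row`, `RowWithDel`, `Grid`, and `walk` are hypothetical names, and any timings it prints say nothing about numarray or Numeric themselves.

```python
import timeit


class Row:
    """Temporary 2-element view, created on every outer index."""

    __slots__ = ("_data", "_i")

    def __init__(self, data, i):
        self._data = data
        self._i = i

    def __len__(self):
        return 2

    def __getitem__(self, j):
        if not 0 <= j < 2:
            raise IndexError(j)
        return self._data[2 * self._i + j]


class RowWithDel(Row):
    """Same view, plus a Python-level destructor run per temporary."""

    __slots__ = ()

    def __del__(self):
        pass


class Grid:
    """Nx2 container whose rows are built on demand."""

    def __init__(self, flat, row_type=Row):
        self._flat = list(flat)
        self._row_type = row_type

    def __len__(self):
        return len(self._flat) // 2

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self._row_type(self._flat, i)


def walk(points):
    """Mimic the helper loop: fetch each item, then its two numbers."""
    total = 0.0
    for i in range(len(points)):
        p = points[i]
        total += p[0] + p[1]
    return total


if __name__ == "__main__":
    n = 25000
    flat = [float(k) for k in range(2 * n)]
    cases = [
        ("list of tuples", list(zip(flat[0::2], flat[1::2]))),
        ("Grid (temporary rows)", Grid(flat)),
        ("Grid (rows with __del__)", Grid(flat, RowWithDel)),
    ]
    for name, pts in cases:
        seconds = timeit.timeit(lambda: walk(pts), number=5)
        print(f"{name:26s} {seconds:.3f} s")
```

All three cases return the same sum; only the cost of producing each row differs, which is the kind of overhead the 50k `__del__` calls in the profile point at.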
From: Todd M. <jm...@st...> - 2004-06-29 14:52:46
|
On Mon, 2004-06-28 at 19:38, Sebastian Haase wrote:
> > Is 10x a measured number or a gut feel?
>
> I put some time.clock() statements into the wxPyPlot code
> I got this: (the times are differences: T_after-T_before)
> one run with numarray:
> <__main__.PolyLine instance at 0x868d414> time= 1.06
> <__main__.PolyMarker instance at 0x878e9c4> time= 1.37
> a second run with numarray:
> <__main__.PolyLine instance at 0x875da1c> time= 0.85
> <__main__.PolyMarker instance at 0x86da034> time= 1.04
> first run with Numeric:
> <__main__.PolyLine instance at 0x858babc> time= 0.07
> <__main__.PolyMarker instance at 0x858bc4c> time= 0.14
> a second one:
> <__main__.PolyLine instance at 0x858cd7c> time= 0.08
> <__main__.PolyMarker instance at 0x8585d8c> time= 0.17
> This seems to be consistent with the profiling I did before:
> I get this w/ numarray:
> ncalls tottime percall cumtime percall filename:lineno(function)
>      1   1.140   1.140   1.320   1.320 gdi.py:554(DrawLines)
>      1   1.250   1.250   1.520   1.520 gdi.py:792(_DrawRectangleList)
>  50230   0.450   0.000   0.450   0.000 numarraycore.py:501(__del__)
> and this with Numeric:
>      1   0.080   0.080   0.080   0.080 gdi.py:554(DrawLines)
>      1   0.090   0.090   0.090   0.090 gdi.py:792(_DrawRectangleList)
>
> so this looks to me like a factor of 10x.

Me too. Can you (or somebody) post the application code which does the drawlines? I can definitely instrument the bottleneck C-code, but I don't have time to ascend the wxPython learning curve.

Todd |
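For reference, the "factor of 10x" can be checked directly from the time.clock() figures quoted above. The numbers below are copied from that message; the dictionary layout and the `mean` and `slowdown` helpers are just an illustration.

```python
# Wall-clock differences (seconds) quoted in the message above,
# two runs per library.
timings = {
    "PolyLine": {"numarray": [1.06, 0.85], "Numeric": [0.07, 0.08]},
    "PolyMarker": {"numarray": [1.37, 1.04], "Numeric": [0.14, 0.17]},
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def slowdown(entry):
    """Ratio of mean numarray time to mean Numeric time."""
    return mean(entry["numarray"]) / mean(entry["Numeric"])


for name, entry in timings.items():
    print(f"{name}: numarray/Numeric = {slowdown(entry):.1f}x")
```

Averaged over the two runs, the ratios come out at roughly 13x for PolyLine and 8x for PolyMarker, so "about 10x" is a fair summary of these measurements.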
From: Tim H. <tim...@co...> - 2004-06-29 19:21:31
|
Todd Miller wrote:

>On Mon, 2004-06-28 at 19:38, Sebastian Haase wrote:
>
>>>Is 10x a measured number or a gut feel?
>>
>>[SNIP]
>>
>>so this looks to me like a factor of 10x.
>
>Me too. Can you (or somebody) post the application code which does the
>drawlines? I can definitely instrument the bottleneck C-code, but I
>don't have time to ascend the wxPython learning curve.

Let me second that. With a little nudge from Chris Barker, I managed to get wxPython to compile here and I have some changes that may speed up drawlines, but it'd probably be best if we were all using the same benchmark.

-tim |
From: Chris B. <Chr...@no...> - 2004-06-29 21:07:24
|
Tim Hochberg wrote:
> Todd Miller wrote:
>> Me too. Can you (or somebody) post the application code which does the
>> drawlines? I can definitely instrument the bottleneck C-code, but I
>> don't have time to ascend the wxPython learning curve.

> Let me second that. With a little nudge from Chris Barker, I managed to
> get wxPython to compile here and I have some changes that may speed up
> drawlines, but it'd probably be best if we were all using the same
> benchmark.

Well, if I can't code, at least I can nudge!

Perhaps I can also help get Todd what he's asking for, except that I'm not sure what you mean by "application code". Do you mean a small wxPython application that uses DC.DrawLines (and friends)? If so, yes, I can do that. What version of wxPython should I target?

CVS head?
2.5.1? (The latest "released" version)

There are some slight incompatibilities between versions, particularly with DC calls, unfortunately.

-Chris |
From: Todd M. <jm...@st...> - 2004-06-29 21:34:18
|
On Tue, 2004-06-29 at 17:05, Chris Barker wrote:
> [SNIP]
> Perhaps I can also help get Todd what he's asking for, except that I'm
> not sure what you mean by "application code". Do you mean a small
> wxPython application that uses DC.DrawLines (and friends)?

Yes. What I most want to do is a 50000 point drawlines, similar to what you profiled. Friends are fine too.

> If so, yes, I
> can do that. What version of wxPython should I target?
> CVS head?
> 2.5.1? (The latest "released" version)

I'd prefer 2.5.1 unless Tim says otherwise.

> there are some slight incompatibilities between versions, particularly
> with DC calls, unfortunately.

I'm hoping this won't affect the profile.

Regards,
Todd |
From: Chris B. <Chr...@no...> - 2004-06-29 23:27:29
Attachments:
DrawLinesTest.py
|
Todd Miller wrote:
> Yes. What I most want to do is a 50000 point drawlines, similar to what
> you profiled. Friends are fine too.

Actually, it was someone else that did the profiling, but here is a sample, about as simple as I could make it.

It draws an N point line and M points. At the moment, it is using Numeric for the points, and numarray for the lines. Numeric is MUCH faster (which is the whole point of this discussion). Otherwise, it takes about the same amount of time to draw the lines as the points.

Another note: if you call the tolist() method on the numarray array first, it's much faster also:

    dc.DrawLines(LinesPoints.tolist())

Obviously, the tolist method is much faster than wxPython's sequence methods, as would be expected. I'm going to send a note to Robin Dunn about this as well, and see what he thinks.

-Chris |
From: Andrew P. L. Jr. <bs...@ma...> - 2004-06-30 00:13:25
|
On Jun 29, 2004, at 4:25 PM, Chris Barker wrote:
> [SNIP]
> Another note: if use the tolist() method in the numarray first, it's
> much faster also:
>
> dc.DrawLines(LinesPoints.tolist())

If you are going to move lots of lines and points, I would recommend pushing this stuff through PyOpenGL with array objects and vertex objects. Letting OpenGL handle the transformations, clipping, movement and iteration in hardware stomps all over even the best written C code.

Most UI toolkits have some form of OpenGL widget.

For lots of the code I have written, even Mesa (the open-source software OpenGL renderer) was fast enough, and not having to write all of the display transformation code by hand was a huge win even when the speed was somewhat lagging.

-a |
From: Tim H. <tim...@co...> - 2004-06-29 00:45:28
|
Todd Miller wrote:

>On Mon, 2004-06-28 at 17:14, Sebastian Haase wrote:
>
>> [SNIP]
>>
>>My original question was just this: Does anyone know why numarray is maybe 10
>>times slower than Numeric with that particular code segment
>>(PySequence_GetItem) ?
>
>Well, the short answer is probably: no.
>
>Looking at the numarray sequence protocol benchmarks in
>Examples/bench.py, and looking at what wxPython is probably doing
>(fetching a 1x2 element array from an Nx2 and then fetching 2 numerical
>values from that)... I can't fully nail it down. My benchmarks show
>that numarray is 4x slower for fetching the two element array but only
>1.1x slower for the two numbers; that makes me expect at most 4x
>slower.
>
>Noticing the 50k __del__ calls in your profile, I eliminated __del__
>(breaking numarray) to see if that was the problem; the ratios changed
>to 2.5x slower and 0.9x slower (actually faster) respectively.

This reminds me, when profiling bits and pieces of my code I've often noticed that __del__ chews up a large chunk of time. Is there any prospect of this being knocked down at all, or is it inherent in the structure of numarray?

>The large number of "Check" routines preceding the numarray path (I
>count 7 looking at my copy of wxPython) has me a little concerned. I
>think those checks are more expensive for numarray because it is a new
>style class.

If that's really a significant slowdown, the culprits are likely PyTuple_Check, PyList_Check and wxPySwigInstance_Check. PySequence_Check appears to just be pointer compares and shouldn't invoke any new style class machinery. PySequence_Length calls sq_length, but appears also to not involve new class machinery. Of these, I think PyTuple_Check and PyList_Check could be replaced with PyTuple_CheckExact and PyList_CheckExact. This would slow down people using subclasses of tuple/list, but speed everyone else up since the latter pair of functions are just pointer compares. I think the former group is a very small, possibly nonexistent, minority, so this would probably be acceptable.

I don't see any easy/obvious ways to speed up wxPySwigInstance_Check, but I believe that wxPoints now obey the PySequence protocol, so I think that the whole wxPySwigInstance_Check branch could be removed. To get that into wxPython you'd probably have to convince Robin that it wouldn't unduly hurt the speed of lists of wxPoints.

Wait... If the above doesn't work, I think I do have a way that might work for speeding the check for a wxPoint. Before the loop starts, get a pointer to wx.core.Point (the class for wxPoints these days) and call it wxPoint_Type. Then just use for the check:

    o->ob_type == &wxPoint_Type

Worth a try anyway.

Unfortunately, I don't have any time to try any of this out right now.

Chris, are you feeling bored?

-tim

>I have a hard time imagining a 10x difference overall,
>but I think Python does have to traverse the numarray class hierarchy
>rather than do a type pointer comparison so they are more expensive. |
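The difference between the Check and CheckExact families discussed above has a direct Python-level analogue: `isinstance` accepts subclasses, while comparing `type(obj)` by identity accepts only the exact type. The sketch below illustrates the semantics only; `Point`, `check`, and `check_exact` are hypothetical names, and it makes no claim about the relative speed of the C functions.

```python
class Point(tuple):
    """A tuple subclass, standing in for 'subclasses of tuple/list'."""


def check(obj):
    """Like PyTuple_Check / PyList_Check: subclasses are accepted."""
    return isinstance(obj, (tuple, list))


def check_exact(obj):
    """Like PyTuple_CheckExact / PyList_CheckExact: exact type only."""
    t = type(obj)
    return t is tuple or t is list


def pick_path(obj):
    """Choose the fast path only for exact tuples and lists."""
    return "fast path" if check_exact(obj) else "generic sequence path"


if __name__ == "__main__":
    for candidate in [(1, 2), [1, 2], Point((1, 2)), "ab"]:
        print(type(candidate).__name__, check(candidate), pick_path(candidate))
```

With the exact check, a tuple subclass such as `Point` is still handled, just by the generic path, which is the trade-off described in the message.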
From: Todd M. <jm...@st...> - 2004-06-29 14:13:01
|
On Mon, 2004-06-28 at 20:45, Tim Hochberg wrote:
> [SNIP]
> This reminds me, when profiling bits and pieces of my code I've often
> noticed that __del__ chews up a large chunk of time. Is there any
> prospect of this being knocked down at all, or is it inherent in the
> structure of numarray?

__del__ is IMHO the elegant way to do numarray's shadowing of "misbehaved arrays". Misbehaved arrays are ones which don't meet the requirements of a particular C function; generally that means noncontiguous, byte-swapped, misaligned, or of the wrong type, but it can also mean some other sequence type like a list or tuple. I think using the destructor is "necessary" for maintaining Numeric compatibility in C because you can generally count on arrays being DECREF'd, but obviously you couldn't count on some new API call being called.

__del__ used to be implemented in C as tp_dealloc, but I was running into segfaults which I tracked down to the order in which a new style class instance is torn down. The purpose of __del__ is to copy the contents of a well behaved working array (the shadow) back onto the original misbehaved array. The problem was that, because of the numarray class hierarchy, critical pieces of the shadow (the instance dictionary) had already been torn down before the tp_dealloc was called. The only way I could think of to fix it was to move the destructor farther down in the class hierarchy, i.e. from _numarray.tp_dealloc to NumArray.__del__ in Python.

If anyone can think of a way to get rid of __del__, I'm all for it.

> [SNIP]
> I don't see any easy/obvious ways to speed up wxPySwigInstance_Check,

Why no CheckExact, even if it's hand coded? Maybe the setup is tedious?

> but I believe that wxPoints now obey the PySequence protocol, so I think
> that the whole wxPySwigInstance_Check branch could be removed. To get
> that into wxPython you'd probably have to convince Robin that it
> wouldn't hurt the speed of list of wxPoints unduly.
>
> Wait... If the above doesn't work, I think I do have a way that might
> work for speeding the check for a wxPoint. Before the loop starts, get a
> pointer to wx.core.Point (the class for wxPoints these days) and call it
> wxPoint_Type. Then just use for the check:
> o->ob_type == &wxPoint_Type
> Worth a try anyway.
>
> Unfortunately, I don't have any time to try any of this out right now.
>
> Chris, are you feeling bored?
>
> -tim

What's the chance of adding direct support for numarray to wxPython? Our PEP reduces the burden on a package to at worst adding 3 include files for numarray plus the specialized package code. With those files, the package can be compiled by users without numarray and also run without numarray, but would receive a real boost for people willing to install numarray since the sequence protocol could be bypassed.

Regards,
Todd |
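The shadowing mechanism described above, a well-behaved working copy whose contents are written back to the original when the copy is destroyed, can be illustrated with a toy pure-Python class. This is a sketch of the idea only, not numarray's implementation: `Shadow` and `scale_in_place` are hypothetical names, and the prompt write-back relies on CPython's reference counting calling `__del__` as soon as the last reference is dropped.

```python
class Shadow:
    """Well-behaved working copy of a 'misbehaved' sequence.

    Here 'misbehaved' just means 'elements of the wrong type'; the
    shadow holds floats, and its contents are copied back onto the
    original when the shadow is destroyed.
    """

    def __init__(self, original):
        self._original = original
        self.data = [float(v) for v in original]

    def flush(self):
        """Copy the working data back onto the original sequence."""
        self._original[:] = self.data

    def __del__(self):
        # Guard against a partially constructed instance.
        if hasattr(self, "data"):
            self.flush()


def scale_in_place(seq, factor):
    """Work on a shadow; no explicit write-back call is needed.

    Dropping the last reference to `shadow` at return triggers
    __del__, which flushes the result onto `seq` (CPython behaviour).
    """
    shadow = Shadow(seq)
    for i, value in enumerate(shadow.data):
        shadow.data[i] = value * factor


if __name__ == "__main__":
    values = ["1", "2", "3"]  # wrong element type for numeric work
    scale_in_place(values, 10)
    print(values)
```

The appeal of the destructor is visible in `scale_in_place`: existing code that merely releases its reference still gets the write-back, whereas an explicit `flush()` would require every caller to remember a new call.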
From: Tim H. <tim...@co...> - 2004-06-29 15:11:00
|
Todd Miller wrote:

>On Mon, 2004-06-28 at 20:45, Tim Hochberg wrote:
>
>>Todd Miller wrote:
>>
>> [SNIP]
>>This reminds me, when profiling bits and pieces of my code I've often
>>noticed that __del__ chews up a large chunk of time. Is there any
>>prospect of this being knocked down at all, or is it inherent in the
>>structure of numarray?
>
>__del__ is IMHO the elegant way to do numarray's shadowing of
>"misbehaved arrays". misbehaved arrays are ones which don't meet the
>requirements of a particular C-function, but generally that means
>noncontiguous, byte-swapped, misaligned, or of the wrong type; it also
>can mean some other sequence type like a list or tuple. I think using
>the destructor is "necessary" for maintaining Numeric compatibility in C
>because you can generally count on arrays being DECREF'd, but obviously
>you couldn't count on some new API call being called.

OK, that makes sense, in a general sense at least. I'll have to dig into
the source to figure out the details.

>__del__ used to be implemented in C as tp_dealloc, but I was running
>into segfaults which I tracked down to the order in which a new style
>class instance is torn down. The purpose of __del__ is to copy the
>contents of a well behaved working array (the shadow) back onto the
>original mis-behaved array. The problem was that, because of the
>numarray class hierarchy, critical pieces of the shadow (the instance
>dictionary) had already been torn down before the tp_dealloc was
>called. The only way I could think of to fix it was to move the
>destructor farther down in the class hierarchy, i.e. from
>_numarray.tp_dealloc to NumArray.__del__ in Python.

It seems that one could stash a reference to the instance dict somewhere
(in PyArrayObject perhaps) to prevent the instance dict from being torn
down when the rest of the subclass goes away. It would require another
field in PyArrayObject, but that's big enough already that no one would
notice. I could easily be way off base here though. If I end up with too
much spare time at some point, I may try to look into this.

>If anyone can think of a way to get rid of __del__, I'm all for it.

[SNIP]

>>I don't see any easy/obvious ways to speed up wxPySwigInstance_Check,
>
>Why no CheckExact, even if it's hand coded? Maybe the setup is tedious?

I don't know, although I suspect the setup being tedious might be part
of it. I also believe that until recently wxPython used old-style
classes, and you couldn't use CheckExact with those. What I propose below
is essentially a hand-coded version of CheckExact for wxPoints.

>>but I believe that wxPoints now obey the PySequence protocol, so I think
>>that the whole wxPySwigInstance_Check branch could be removed. To get
>>that into wxPython you'd probably have to convince Robin that it
>>wouldn't hurt the speed of list of wxPoints unduly.
>>
>>Wait... If the above doesn't work, I think I do have a way that might
>>work for speeding the check for a wxPoint. Before the loop starts, get a
>>pointer to wx.core.Point (the class for wxPoints these days) and call it
>>wxPoint_Type. Then just use for the check:
>>     o->ob_type == &wxPoint_Type
>>Worth a try anyway.
>>
>>Unfortunately, I don't have any time to try any of this out right now.
>>
>>Chris, are you feeling bored?
>>
>>-tim
>
>What's the chance of adding direct support for numarray to wxPython?
>Our PEP reduces the burden on a package to at worst adding 3 include
>files for numarray plus the specialized package code. With those
>files, the package can be compiled by users without numarray and also
>run without numarray, but would receive a real boost for people willing
>to install numarray since the sequence protocol could be bypassed.

No idea, sorry. I haven't been keeping up with wxPython development
lately.

-tim
|
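The copy-back role Todd describes for __del__ can be pictured with a small Python sketch. This is not numarray's code: the class name Shadow, the function scale_in_place, and the use of plain lists are all stand-ins assumed for illustration. A "misbehaved" input (here a list of ints where floats are wanted) gets a well-behaved working copy, the work happens on the copy, and the destructor writes the result back once the last reference goes away. The immediate destructor call relies on CPython's reference counting.

```python
class Shadow:
    """Well-behaved working copy of a 'misbehaved' sequence.

    Hypothetical stand-in for illustration; not numarray's implementation.
    """

    def __init__(self, original):
        self.original = original
        # Working copy with a uniform element type (floats).
        self.data = [float(v) for v in original]

    def __del__(self):
        # Copy the working data back onto the original when the shadow dies.
        self.original[:] = self.data


def scale_in_place(shadow, factor):
    """Stand-in for a C function that only accepts well-behaved input."""
    for i in range(len(shadow.data)):
        shadow.data[i] *= factor


values = [1, 2, 3]            # "misbehaved": wrong element type
shadow = Shadow(values)
scale_in_place(shadow, 2.0)
del shadow                    # last reference dropped, so __del__ runs (CPython)
```

The sketch also hints at why teardown order matters: the destructor needs self.original and self.data intact, and that instance state is what Todd reports was already gone by the time tp_dealloc ran.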
From: Chris B. <Chr...@no...> - 2004-06-29 16:34:35
|
Tim Hochberg wrote:

>>> but I believe that wxPoints now obey the PySequence protocol, so I
>>> think that the whole wxPySwigInstance_Check branch could be removed.
>>> To get that into wxPython you'd probably have to convince Robin that
>>> it wouldn't hurt the speed of list of wxPoints unduly.
>>>
>>> Wait... If the above doesn't work, I think I do have a way that might
>>> work for speeding the check for a wxPoint. Before the loop starts,
>>> get a pointer to wx.core.Point (the class for wxPoints these days)
>>> and call it wxPoint_Type. Then just use for the check:
>>>     o->ob_type == &wxPoint_Type
>>> Worth a try anyway.
>>>
>>> Unfortunately, I don't have any time to try any of this out right now.
>>>
>>> Chris, are you feeling bored?

Do you mean me? If so:

A) I'm not bored.
B) This is all kind of beyond me anyway.
C) I'm planning on implementing my own custom DC.DrawLotsOfStuff code,
   because I have some specialized needs that probably don't really
   belong in wxPython.

My stuff will take Numeric arrays as input (this is for my FloatCanvas,
if anyone cares). I'm still using Numeric, as numarray is a LOT slower
when used in FloatCanvas, probably because I do a lot with small arrays,
and maybe because of what we're talking about here as well.

However, this may turn out to be important to me some day, so who knows?
I'll keep this note around.

>> What's the chance of adding direct support for numarray to wxPython?
>> Our PEP reduces the burden on a package to at worst adding 3 include
>> files for numarray plus the specialized package code. With those
>> files, the package can be compiled by users without numarray and also
>> run without numarray, but would receive a real boost for people willing
>> to install numarray since the sequence protocol could be bypassed.

If the PEP is accepted, and those include files are part of the standard
Python distro, I suspect Robin would be quite happy to add direct
support, at least if someone else writes the code. Whether he'd be open
to including those files in the wxPython distribution itself, I don't
know. Perhaps I'll drop him a line.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no...
|
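The check Tim suggests, quoted in Chris's message above, compares an object's type against one known type, which is the same idea as the CheckExact-style macros in the Python C API. A rough Python analogue is `type(o) is Point` as opposed to isinstance(). In the sketch below, Point, Point3D, is_point_exact and is_point are made-up names for illustration, not wxPython classes or functions.

```python
class Point:
    """Made-up stand-in for a point class; not wx.core.Point."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Point3D(Point):
    """Subclass used to show where the two checks differ."""

    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z


def is_point_exact(o, point_type=Point):
    # Same idea as comparing ob_type in C: a single identity comparison,
    # with no walk through base classes.
    return type(o) is point_type


def is_point(o):
    # The general check: also accepts subclasses.
    return isinstance(o, Point)


items = [Point(1, 2), Point3D(1, 2, 3), (1, 2)]
exact = [is_point_exact(o) for o in items]
general = [is_point(o) for o in items]
```

The trade-off is that an exact check sends subclass instances down whatever generic path handles everything else, so it only pays off when nearly all items are plain points.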
From: Tim H. <tim...@co...> - 2004-06-29 20:20:14
|
I'd bet a case of beer (or cash equivalent) that one of the main
bottlenecks is the path

PySequence_GetItem->_ndarray_item->_universalIndexing->_simpleIndexing->_simpleIndexingCore.

The path through _universalIndexing in particular, if I deciphered it
correctly, looks very slow. I don't think it needs to be that way,
though. _universalIndexing could probably be sped up, but more
promisingly, I think _ndarray_item could be made to call
_simpleIndexingCore directly without all that much work. It appears that
this would save the creation of several intermediate objects, and also
what look like a couple of calls back to Python! I'm not familiar with
this code though, so I could easily be missing something that makes
calling _simpleIndexingCore harder than it looks.

-tim
|
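The shape of the shortcut Tim describes can be sketched in Python. Everything here is a toy assumed for illustration: Grid and its methods only mimic the layering of the call path named above and are not numarray's functions. The layered route normalises the key and builds a temporary on the way down, while the direct route hands the integer straight to the innermost step.

```python
class Grid:
    """Toy 2-D container; a stand-in, not numarray's NDArray."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]
        self.temporaries = 0  # intermediate objects built while indexing

    def _core(self, i):
        # Innermost step: bounds-check one integer and fetch the row.
        n = len(self._rows)
        if i < 0:
            i += n
        if not 0 <= i < n:
            raise IndexError("index out of range")
        return self._rows[i]

    def _simple(self, key):
        # Middle layer: expects a tuple of integers.
        return self._core(key[0])

    def _universal(self, key):
        # Generic entry point: wraps a bare integer in a tuple first.
        if not isinstance(key, tuple):
            key = (key,)
            self.temporaries += 1
        return self._simple(key)

    def item_layered(self, i):
        """Integer lookup routed through every layer."""
        return self._universal(i)

    def item_direct(self, i):
        """Integer lookup that jumps straight to the core."""
        return self._core(i)


grid = Grid([[0.0, 1.0], [2.0, 3.0]])
layered = grid.item_layered(1)
direct = grid.item_direct(1)
```

Both routes return the same row; the only difference in this toy is the temporary tuple counted on the layered route. Whether the real code allows such a jump is exactly the open question in the message above.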
From: Todd M. <jm...@st...> - 2004-06-29 22:02:35
|
On Tue, 2004-06-29 at 16:19, Tim Hochberg wrote:
> I'd bet a case of beer (or cash equivalent) that one of the main
> bottlenecks is the path
> PySequence_GetItem->_ndarray_item->_universalIndexing->_simpleIndexing->_simpleIndexingCore.

I won't take the bet but if this works out, you get the beer. If it
doesn't, well, I don't drink anymore anyway.

> The path through _universalIndexing in particular, if I deciphered it
> correctly, looks very slow. I don't think it needs to be that way
> though, _universalIndexing could probably be sped up, but more promising
> I think _ndarray_item could be made to call _simpleIndexingCore without
> all that much work. It appears that this would save the creation of
> several intermediate objects and it also looks like a couple of calls
> back to python! I'm not familiar with this code though, so I could
> easily be missing something that makes calling _simpleIndexingCore
> harder than it looks.

This looks very promising. I'll take a look tomorrow.

Regards,
Todd
|