From: Todd M. <jm...@st...> - 2004-06-29 14:13:01
On Mon, 2004-06-28 at 20:45, Tim Hochberg wrote:
> Todd Miller wrote:
>
> >On Mon, 2004-06-28 at 17:14, Sebastian Haase wrote:
> >
> >> [SNIP]
> >>
> >>My original question was just this: Does anyone know why numarray is maybe 10
> >>times slower than Numeric with that particular code segment
> >>(PySequence_GetItem) ?
> >>
> >
> >Well, the short answer is probably: no.
> >
> >Looking at the numarray sequence protocol benchmarks in
> >Examples/bench.py, and looking at what wxPython is probably doing
> >(fetching a 1x2 element array from an Nx2 and then fetching 2 numerical
> >values from that)... I can't fully nail it down. My benchmarks show
> >that numarray is 4x slower for fetching the two element array but only
> >1.1x slower for the two numbers; that makes me expect at most 4x
> >slower.
> >
> >Noticing the 50k __del__ calls in your profile, I eliminated __del__
> >(breaking numarray) to see if that was the problem; the ratios changed
> >to 2.5x slower and 0.9x slower (actually faster) respectively.
> >
> This reminds me, when profiling bits and pieces of my code I've often
> noticed that __del__ chews up a large chunk of time. Is there any
> prospect of this being knocked down at all, or is it inherent in the
> structure of numarray?

__del__ is IMHO the elegant way to do numarray's shadowing of
"misbehaved arrays". Misbehaved arrays are ones which don't meet the
requirements of a particular C function; generally that means
noncontiguous, byte-swapped, misaligned, or of the wrong type; it also
can mean some other sequence type like a list or tuple.

I think using the destructor is "necessary" for maintaining Numeric
compatibility in C because you can generally count on arrays being
DECREF'd, but obviously you couldn't count on some new API call being
called.

__del__ used to be implemented in C as tp_dealloc, but I was running
into segfaults which I tracked down to the order in which a new style
class instance is torn down.
The purpose of __del__ is to copy the contents of a well-behaved
working array (the shadow) back onto the original misbehaved array.
The problem was that, because of the numarray class hierarchy, critical
pieces of the shadow (the instance dictionary) had already been torn
down before the tp_dealloc was called. The only way I could think of to
fix it was to move the destructor farther down in the class hierarchy,
i.e. from _numarray.tp_dealloc to NumArray.__del__ in Python.

If anyone can think of a way to get rid of __del__, I'm all for it.

> >The large number of "Check" routines preceding the numarray path (I
> >count 7 looking at my copy of wxPython) has me a little concerned. I
> >think those checks are more expensive for numarray because it is a new
> >style class.
> >
> If that's really a significant slowdown, the culprits are likely
> PyTuple_Check, PyList_Check and wxPySwigInstance_Check.
> PySequence_Check appears to just be pointer compares and shouldn't
> invoke any new style class machinery. PySequence_Length calls sq_length,
> but appears also to not involve new class machinery. Of these, I think
> PyTuple_Check and PyList_Check could be replaced with PyTuple_CheckExact
> and PyList_CheckExact. This would slow down people using subclasses of
> tuple/list, but speed everyone else up since the latter pair of
> functions are just pointer compares. I think the former group is a very
> small, possibly nonexistent, minority, so this would probably
> be acceptable.
>
> I don't see any easy/obvious ways to speed up wxPySwigInstance_Check,

Why no CheckExact, even if it's hand coded? Maybe the setup is tedious?

> but I believe that wxPoints now obey the PySequence protocol, so I think
> that the whole wxPySwigInstance_Check branch could be removed. To get
> that into wxPython you'd probably have to convince Robin that it
> wouldn't hurt the speed of lists of wxPoints unduly.
>
> Wait...
> If the above doesn't work, I think I do have a way that might
> work for speeding the check for a wxPoint. Before the loop starts, get a
> pointer to wx.core.Point (the class for wxPoints these days) and call it
> wxPoint_Type. Then just use for the check:
>       o->ob_type == &wxPoint_Type
> Worth a try anyway.
>
> Unfortunately, I don't have any time to try any of this out right now.
>
> Chris, are you feeling bored?
>
> -tim

What's the chance of adding direct support for numarray to wxPython?
Our PEP reduces the burden on a package to at worst adding 3 include
files for numarray plus the specialized package code. With those files,
the package can be compiled by users without numarray and also run
without numarray, but would receive a real boost for people willing to
install numarray since the sequence protocol could be bypassed.

Regards,
Todd
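
For readers following the Check versus CheckExact discussion, here is a
minimal Python-level sketch of the trade-off. It is only an analogy:
isinstance() stands in for the subclass-aware PyTuple_Check/PyList_Check,
and an identity test on type() stands in for PyTuple_CheckExact/
PyList_CheckExact. PointTuple, check and check_exact are hypothetical
names made up for this illustration; nothing here is wxPython or
numarray code.

```python
class PointTuple(tuple):
    """Hypothetical tuple subclass, used only to show where the checks differ."""


def check(obj):
    # Subclass-aware test, analogous to PyTuple_Check / PyList_Check:
    # accepts tuple, list, and any of their subclasses.
    return isinstance(obj, (tuple, list))


def check_exact(obj):
    # Exact-type test, analogous to PyTuple_CheckExact / PyList_CheckExact:
    # identity comparisons on the object's type, so subclasses are rejected.
    t = type(obj)
    return t is tuple or t is list


samples = [(1, 2), [1, 2], PointTuple((1, 2))]
results = [(check(s), check_exact(s)) for s in samples]

for sample, (loose, exact) in zip(samples, results):
    print(type(sample).__name__, loose, exact)
```

Plain tuples and lists pass both tests; only the subclass instance is
rejected by the exact test, which is the group that would fall through
to a slower path if the exact checks were adopted.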