From: Todd M. <jm...@st...> - 2004-07-02 16:02:50
On Fri, 2004-07-02 at 11:27, Sebastian Haase wrote:
> On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote:
> > Hi,
> > Is this a bug?:
[...]
> Hi again,
> I think the reason that no one responded to this is that it just sounds
> too unbelievable ...

This just slipped through the cracks for me.

> Sorry for the missing piece of information, but 'd' is actually a
> memmapped array!
> >>> d.info()
> class: <class 'numarray.numarraycore.NumArray'>
> shape: (80, 150, 150)
> strides: (90000, 600, 4)
> byteoffset: 0
> bytestride: 4
> itemsize: 4
> aligned: 1
> contiguous: 1
> data: <MemmapSlice of length:7290000 readonly>
> byteorder: big
> byteswap: 1
> type: Float32
> >>> dd = d.copy()
> >>> na.maximum.reduce(dd[:,136, 122])
> 85.8426361084
> >>> na.maximum.reduce(dd)[136, 122]
> 85.8426361084
>
> Apparently we are using memmap so frequently now that I didn't even think
> about that - which is good news for everyone, because it means that it
> works (mostly).
>
> I just see that 'byteorder' is 'big' - I'm running this on an Intel
> Linux PC. Could this be the problem?

I think byteorder is a good guess at this point. What version of Python
and numarray are you using?

Regards,
Todd
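A minimal sketch of the workaround implied by this exchange (the helper name is made up; only d.copy() and maximum.reduce come from the posts, and byteorder is still only the suspected root cause):

import numarray as na

def max_projection(d, axis=0):
    # Workaround sketch: maximum.reduce misbehaves on this byteswapped,
    # memmapped array but behaves correctly on an in-memory copy, so copy
    # first.  copy() is assumed to produce a native-byteorder, contiguous
    # array, as Sebastian's dd = d.copy() transcript shows.
    return na.maximum.reduce(d.copy(), axis)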
From: Sebastian H. <ha...@ms...> - 2004-07-02 15:27:11
On Tuesday 29 June 2004 05:05 pm, Sebastian Haase wrote:
> Hi,
> Is this a bug?:
> >>> # (import numarray as na ; 'd' is a 3 dimensional array)
> >>> d.type()
> Float32
> >>> d[80, 136, 122]
> 80.3997039795
> >>> na.maximum.reduce(d[:,136, 122])
> 85.8426361084
> >>> na.maximum.reduce(d)[136, 122]
> 37.3658103943
> >>> na.maximum.reduce(d,0)[136, 122]
> 37.3658103943
> >>> na.maximum.reduce(d,1)[136, 122]
> Traceback (most recent call last):
>   File "<input>", line 1, in ?
> IndexError: Index out of range
>
> I was using na.maximum.reduce(d) to get a "pixelwise" maximum along Z
> (axis 0). But as seen above it does not get it right. I then tried to
> reproduce this with some simple arrays, but here it works just fine:
> >>> a = na.arange(4*4*4)
> >>> a.shape=(4,4,4)
> >>> na.maximum.reduce(a)
> [[48 49 50 51]
>  [52 53 54 55]
>  [56 57 58 59]
>  [60 61 62 63]]
> >>> a = na.arange(4*4*4).astype(na.Float32)
> >>> a.shape=(4,4,4)
> >>> na.maximum.reduce(a)
> [[ 48. 49. 50. 51.]
>  [ 52. 53. 54. 55.]
>  [ 56. 57. 58. 59.]
>  [ 60. 61. 62. 63.]]
>
> Any hint?
>
> Regards,
> Sebastian Haase

Hi again,
I think the reason that no one responded to this is that it just sounds
too unbelievable ...
Sorry for the missing piece of information, but 'd' is actually a
memmapped array!

>>> d.info()
class: <class 'numarray.numarraycore.NumArray'>
shape: (80, 150, 150)
strides: (90000, 600, 4)
byteoffset: 0
bytestride: 4
itemsize: 4
aligned: 1
contiguous: 1
data: <MemmapSlice of length:7290000 readonly>
byteorder: big
byteswap: 1
type: Float32
>>> dd = d.copy()
>>> na.maximum.reduce(dd[:,136, 122])
85.8426361084
>>> na.maximum.reduce(dd)[136, 122]
85.8426361084
>>>

Apparently we are using memmap so frequently now that I didn't even think
about that - which is good news for everyone, because it means that it
works (mostly).

I just see that 'byteorder' is 'big' - I'm running this on an Intel Linux
PC. Could this be the problem?

Please, some comments!
Thanks,
Sebastian
From: Christopher T K. <squ...@WP...> - 2004-07-02 13:36:58
On Thu, 1 Jul 2004, Perry Greenfield wrote:

> 1) I suppose you did this for generated ufunc code? (ideally one
> would put this in the codegenerator stuff but for the purposes
> of testing it would be fine). I guess we would like to see
> how you actually changed the code fragment (you can email
> me or Todd Miller directly if you wish)

Yep, I didn't know it was automatically generated :P

> 2) How much improvement you would see depends on many details.
> But if you were doing this for 10 million element arrays, I'm
> surprised you saw such a small improvement (30% for 4 processors
> isn't worth the trouble it would seem). So seeing the actual
> test code would be helpful. If the array operations you are doing
> for numarray aren't simple (that's a specialized use of the word;
> by that I mean if the arrays are not the same type, aren't
> contiguous, aren't aligned, or aren't of proper byte-order)
> then there are a number of other issues that may slow it down
> quite a bit (and there are ways of improving these for
> parallel processing).

I've been careful not to use anything to cause discontiguities in the
arrays, and to keep them all the same type (Float64 in this case). See my
next post for the code I'm using.
From: Perry G. <pe...@st...> - 2004-07-01 22:00:28
Christopher T King wrote:
> (I originally posted this in comp.lang.python and was redirected here)
>
> In a quest to speed up numarray computations, I tried writing a 'threaded
> array' class for use on SMP systems that would distribute its workload
> across the processors. I hit a snag when I found out that since the
> Python interpreter is not reentrant, this effectively disables parallel
> processing in Python. I've come up with two solutions to this problem,
> both involving numarray's C functions that perform the actual vector
> operations:
>
> 1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and
> Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't
> access Python structures) to run in parallel with the interpreter.
> Python glue code would take care of threading and locking.
>
> 2) Move the parallelization into the C vector functions themselves. This
> would likely get poorer performance (a chain of vector operations
> couldn't be combined into one threaded operation).
>
> I'd much rather do #1, but will playing around with the interpreter state
> like that cause any problems?

I don't think so, but it raises a number of questions that I ask just
below.

> Update from original posting:
>
> I've partially implemented method #1 for Float64s. Running on four 2.4GHz
> Xeons (possibly two with hyperthreading?), I get about a 30% speedup
> while dividing 10 million Float64s, but a small (<10%) slowdown doing
> addition or multiplication. The operation was repeated 100 times, with
> the threads created outside of the loop (i.e. the threads weren't
> recreated for each iteration). Is there really that much overhead in
> Python? I can post the code I'm using and the numarray patch if it's
> requested.

Questions and comments:

1) I suppose you did this for generated ufunc code? (ideally one would put
this in the codegenerator stuff but for the purposes of testing it would
be fine). I guess we would like to see how you actually changed the code
fragment (you can email me or Todd Miller directly if you wish)

2) How much improvement you would see depends on many details. But if you
were doing this for 10 million element arrays, I'm surprised you saw such
a small improvement (30% for 4 processors isn't worth the trouble it would
seem). So seeing the actual test code would be helpful. If the array
operations you are doing for numarray aren't simple (that's a specialized
use of the word; by that I mean if the arrays are not the same type,
aren't contiguous, aren't aligned, or aren't of proper byte-order) then
there are a number of other issues that may slow it down quite a bit (and
there are ways of improving these for parallel processing).

3) I don't speak as an expert on threading or parallel processors, but I
believe so long as you don't call any Python API functions (either
directly or indirectly) between the global interpreter lock release and
reacquisition, you should be fine. The vector ufunc code in numarray
should satisfy this fine.

Perry Greenfield
From: Perry G. <pe...@st...> - 2004-07-01 20:56:11
Colin J. Williams wrote:
> I feel lower on the understanding tree with respect to what is being
> proposed in the draft PEP, but would still like to offer my 2 cents
> worth. I get the feeling that numarray is being bent out of shape to
> fit Numeric.

Todd and Gerard address this point well.

> It was my understanding that Numeric had certain weakness which made it
> unacceptable as a Python component and that numarray was intended to
> provide the same or better functionality within a pythonic framework.

Let me reiterate what our motivations were. We wanted to use an array
package for our software, and Numeric had enough shortcomings that we
needed some changes in behavior (e.g., type coercion for scalars), changes
in performance (particularly with regard to memory usage), and
enhancements in capabilities (e.g., memory mapping, record arrays, etc.).
It was the opinion of some (Paul Dubois, for example) that a rewrite was
in order in any case since the code was not that maintainable (not
everyone felt this way, though at the time that wasn't as clear). At the
same time there was some hope that Numeric could be accepted into the
standard Python distribution. That's something we thought would be good
(but wasn't the highest priority for us) and I've come to believe that
perhaps a better solution with regard to that is what this PEP is trying
to address. In any case Guido made it clear that he would not accept
Numeric in its (then) current form. That it be written mostly in Python
was something suggested by Guido, and we started off that way, mainly
because it would get us going much faster than writing it all in C. We
definitely understood that it would also have the consequence of making
small array performance worse. We said as much when we started; it wasn't
as clear as it is now that many users objected to a factor of a few slower
performance (as it turned out, a mostly Python based implementation was
more than an order of magnitude slower for small arrays).

> numarray has not achieved the expected performance level to date, but
> progress is being made and I believe that, for larger arrays, numarray
> has been shown to be superior to Numeric - please correct me if I'm
> wrong here.

We never expected numarray to ever reach the performance level for small
arrays that Numeric has. If it were within a factor of two I would be
thrilled (it's more like a factor of 3 or 4 currently for simple ufuncs).
I still don't think it ever will be as fast for small arrays. The focus
all along was on handling large arrays, which I think it does quite well,
both with regard to memory and speed. Yes, there are some functions and
operations that may be much slower. Mainly they need to be called out so
they can be improved. Generally we only notice performance issues that
affect our software. Others need to point out remaining large
discrepancies. I'm still of the opinion that if small array performance is
really important, a very different approach should be used and have a
completely different implementation. I would think that improvements of an
order of magnitude over what Numeric does now are possible. But since that
isn't important to us (STScI), don't expect us to work on that :-)

> The shock came for me when Todd Miller said:
>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> Wasn't it the intent of numarray to work towards the full use of the
> Python class structure to provide the benefits which it offers?
>
> The Python class has two constructors and one destructor.
>
> The constructors are __init__ and __new__, the latter only provides the
> shell of an instance which later has to be initialized. In version 0.9,
> which I use, there is no __new__, but there is a new function which has
> a functionality similar to that intended for __new__. Thus, with this
> change, numarray appears to be moving further away from being pythonic.

I'll agree that optimization is driving the underlying implementation to
one that is more complex and that is the drawback (no surprise there).
There's Pythonic in use and Pythonic in implementation. We are certainly
receptive to better ideas for the implementation, but I doubt that a
heavily Python-based implementation is ever going to be competitive for
small arrays (unless something like psyco becomes universal, but I think
there are a whole mess of problems to be solved for that kind of approach
to work well generically).

Perry
From: Todd M. <jm...@st...> - 2004-07-01 20:45:50
On Thu, 2004-07-01 at 15:58, Colin J. Williams wrote:
> Sebastian Haase wrote:
> > On Wednesday 30 June 2004 11:33 pm, ger...@gr... wrote:
> > > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote
[...]
> I feel lower on the understanding tree with respect to what is being
> proposed in the draft PEP, but would still like to offer my 2 cents
> worth. I get the feeling that numarray is being bent out of shape to
> fit Numeric.

Yes and no. The numarray team has over time realized the importance of
backward compatibility with the dominant array package, Numeric. A lot of
people use Numeric now. We're trying to make it as easy as possible to use
numarray.

> It was my understanding that Numeric had certain weakness which made it
> unacceptable as a Python component and that numarray was intended to
> provide the same or better functionality within a pythonic framework.

My understanding is that until there is a consensus on an array package,
neither numarray nor Numeric is going into the Python core.

> numarray has not achieved the expected performance level to date, but
> progress is being made and I believe that, for larger arrays, numarray
> has been shown to be superior to Numeric - please correct me if I'm
> wrong here.

I think that's a fair summary.

> The shock came for me when Todd Miller said:
>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> Wasn't it the intent of numarray to work towards the full use of the
> Python class structure to provide the benefits which it offers?

Ack. I wasn't trying to start a panic. The __init__ still exists, as does
__new__, they're just in C. Sorry if I was unclear.

> The Python class has two constructors and one destructor.

We're mostly on the same page.

> The constructors are __init__ and __new__, the latter only provides the
> shell of an instance which later has to be initialized. In version 0.9,
> which I use, there is no __new__,

It's there, but it's not very useful:

>>> import numarray
>>> numarray.NumArray.__new__
<built-in method __new__ of type object at 0x402fc860>
>>> a = numarray.NumArray.__new__(numarray.NumArray)
>>> a.info()
class: <class 'numarray.numarraycore.NumArray'>
shape: ()
strides: ()
byteoffset: 0
bytestride: 0
itemsize: 0
aligned: 1
contiguous: 1
data: None
byteorder: little
byteswap: 0
type: Any

I don't, however, recommend doing this.

> but there is a new function which has
> a functionality similar to that intended for __new__. Thus, with this
> change, numarray appears to be moving further away from being pythonic.

Nope. I'm talking about moving toward better speed with no change in
functionality at the Python level.

I also think maybe we've gotten list threads crossed here: the "Numarray
header PEP" thread is independent (but admittedly related) of the
"Speeding up wxPython/numarray" thread.

The Numarray header PEP is about making it easy for packages to write C
extensions which *optionally* support numarray (and now Numeric as well).
One aspect of the PEP is getting headers included in the Python core so
that extensions can be compiled even when numarray is not installed. The
other aspect will be illustrating a good technique for supporting both
numarray and Numeric, optionally and with choice, at the same time. Such
an extension would still run where there is numarray, Numeric, both, or
none installed. Gerard V. has already done some integration of numarray
and Numeric with PyQwt so he has a few good ideas on how to do the "good
technique" aspect of the PEP.

The Speeding up wxPython/numarray thread is about improving the
performance of a 50000 point wxPython drawlines which is 10x slower with
numarray than Numeric. Tim H. and Chris B. have nailed this down (mostly)
to the numarray sequence protocol and destructor, __del__.

Regards,
Todd
From: <ger...@gr...> - 2004-07-01 20:39:43
On Thu, 01 Jul 2004 15:58:11 -0400, Colin J. Williams wrote
> Sebastian Haase wrote:
> > On Wednesday 30 June 2004 11:33 pm, ger...@gr... wrote:
> > > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote
[...]
> I feel lower on the understanding tree with respect to what is being
> proposed in the draft PEP, but would still like to offer my 2 cents
> worth. I get the feeling that numarray is being bent out of shape
> to fit Numeric.

What we are discussing are methods to make it possible to import Numeric
and numarray in the same extension module. This can be done by separating
the colliding APIs of Numeric and numarray in separate *.c files. To
achieve this, no changes to Numeric and numarray itself are necessary. In
fact, this can be done by the author of the C-extension himself, but since
it is not obvious we discuss the best methods and we like to provide the
necessary glue code. It will make life easier for extension writers and
facilitate the transition to numarray.

Try to look at the problem from the other side: I am using Numeric (since
my life depends on SciPy) but have written an extension that can also
import numarray (hoping to get more users). I will never use the methods
proposed in the draft PEP, because it excludes importing Numeric.

> It was my understanding that Numeric had certain weakness which made
> it unacceptable as a Python component and that numarray was intended
> to provide the same or better functionality within a pythonic framework.
>
> numarray has not achieved the expected performance level to date,
> but progress is being made and I believe that, for larger arrays,
> numarray has been shown to be superior to Numeric - please
> correct me if I'm wrong here.

I think you are correct. I don't know why the __init__ has disappeared,
but I don't think it is because of the PEP and certainly not because of
the thread.

> The shock came for me when Todd Miller said:
>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> Wasn't it the intent of numarray to work towards the full use of the
> Python class structure to provide the benefits which it offers?
>
> The Python class has two constructors and one destructor.
>
> The constructors are __init__ and __new__, the latter only provides
> the shell of an instance which later has to be initialized. In
> version 0.9, which I use, there is no __new__, but there is a new
> function which has a functionality similar to that intended for
> __new__. Thus, with this change, numarray appears to be moving
> further away from being pythonic.

Gerard
From: Christopher T K. <squ...@WP...> - 2004-07-01 20:36:26
(I originally posted this in comp.lang.python and was redirected here)

In a quest to speed up numarray computations, I tried writing a 'threaded
array' class for use on SMP systems that would distribute its workload
across the processors. I hit a snag when I found out that since the Python
interpreter is not reentrant, this effectively disables parallel
processing in Python. I've come up with two solutions to this problem,
both involving numarray's C functions that perform the actual vector
operations:

1) Surround the C vector operations with Py_BEGIN_ALLOW_THREADS and
Py_END_ALLOW_THREADS, thus allowing the vector operations (which don't
access Python structures) to run in parallel with the interpreter. Python
glue code would take care of threading and locking.

2) Move the parallelization into the C vector functions themselves. This
would likely get poorer performance (a chain of vector operations couldn't
be combined into one threaded operation).

I'd much rather do #1, but will playing around with the interpreter state
like that cause any problems?

Update from original posting:

I've partially implemented method #1 for Float64s. Running on four 2.4GHz
Xeons (possibly two with hyperthreading?), I get about a 30% speedup while
dividing 10 million Float64s, but a small (<10%) slowdown doing addition
or multiplication. The operation was repeated 100 times, with the threads
created outside of the loop (i.e. the threads weren't recreated for each
iteration). Is there really that much overhead in Python? I can post the
code I'm using and the numarray patch if it's requested.
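A sketch of what the Python-side glue for approach #1 might look like (assuming the GIL-releasing C patch is in place, and assuming numarray ufuncs accept an output array as a third argument; the function name and chunking scheme are invented here):

import threading
import numarray as na

def parallel_divide(a, b, out, nthreads=4):
    # Sketch only: split the work into slices and run the (patched,
    # GIL-releasing) C ufunc on each slice in its own thread.  Slices of
    # numarray arrays are views, so each thread writes into its own
    # piece of 'out'.
    n = len(a)
    step = (n + nthreads - 1) // nthreads
    threads = []
    for i in range(nthreads):
        s = slice(i * step, min((i + 1) * step, n))
        t = threading.Thread(target=na.divide, args=(a[s], b[s], out[s]))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()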
From: Fernando P. <Fer...@co...> - 2004-07-01 20:27:25
Chris Barker wrote:
> Hi all,
>
> I'm looking for a way to read data from ascii text files quickly. I've
> found that using the standard python idioms like:
>
> data = array((M,N),Float)
> for i in range(N):
>     data.append(map(float,file.readline().split()))
>
> Can be pretty slow. What I'd like is something like Matlab's fscanf:
>
> data = fscanf(file, "%g", [M,N] )
>
> I may have the syntax a little wrong, but the gist is there. What Matlab
> does is keep recycling the format string until the desired number of
> elements have been read.
>
> It is quite flexible, and ends up being pretty fast.
>
> Has anyone written something like this for Numeric (or numarray, but I'd
> prefer Numeric at this point)?
>
> I was surprised not to find something like this in SciPy, maybe I didn't
> look hard enough.

scipy.io.read_array? I haven't timed it, because it's been 'fast enough'
for my needs.

For reading binary data files, I have this little utility which is
basically a wrapper around Numeric.fromstring (N below is Numeric imported
'as N'). Note that it can read binary .gz files directly, a _huge_ gain
for very sparse files representing 3d arrays (I can read a 400k gz file
which blows up to ~60MB when unzipped in no time at all, while reading the
unzipped file is very slow):

import gzip
import Numeric as N

def read_bin(fname,dims,typecode,recast_type=None,offset=0,verbose=0):
    """Read in a binary data file. Does NOT check for endianness issues.

    Inputs:
      fname - can be .gz
      dims (nx1,nx2,...,nxd)
      typecode
      recast_type
      offset=0: # of bytes to skip in file *from the beginning* before
        data starts
    """

    # config parameters
    item_size = N.zeros(1,typecode).itemsize()  # size in bytes
    data_size = N.product(N.array(dims))*item_size

    # read in data
    if fname.endswith('.gz'):
        data_file = gzip.open(fname)
    else:
        data_file = file(fname)
    data_file.seek(offset)
    data = N.fromstring(data_file.read(data_size),typecode)
    data_file.close()

    data.shape = dims
    if verbose:
        #print 'Read',data_size/item_size,'data points. Shape:',dims
        print 'Read',N.size(data),'data points. Shape:',dims
    if recast_type is not None:
        data = data.astype(recast_type)
    return data

HTH,
f
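For illustration, a call to read_bin might look like this (the file name, shape, and offset are invented; 'f' is Numeric's Float32 typecode):

# Hypothetical usage: a 256x256x64 Float32 volume behind a 1024-byte
# header, stored gzipped.
vol = read_bin('density.dat.gz', (256, 256, 64), 'f', offset=1024, verbose=1)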
From: Chris B. <Chr...@no...> - 2004-07-01 20:17:22
Hi all,

I'm looking for a way to read data from ascii text files quickly. I've
found that using the standard python idioms like:

data = array((M,N),Float)
for i in range(N):
    data.append(map(float,file.readline().split()))

Can be pretty slow. What I'd like is something like Matlab's fscanf:

data = fscanf(file, "%g", [M,N] )

I may have the syntax a little wrong, but the gist is there. What Matlab
does is keep recycling the format string until the desired number of
elements have been read.

It is quite flexible, and ends up being pretty fast.

Has anyone written something like this for Numeric (or numarray, but I'd
prefer Numeric at this point)?

I was surprised not to find something like this in SciPy, maybe I didn't
look hard enough.

If no one has done this, I guess I'll get started on it....

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT          (206) 526-6959 voice
7600 Sand Point Way NE    (206) 526-6329 fax
Seattle, WA 98115         (206) 526-6317 main reception
Chr...@no...
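Pending a real fscanf, a crude whole-file version can already beat the line-by-line loop (the function name is made up, and there is no format-string recycling here, just whitespace-separated floats):

import Numeric as N

def read_ascii_array(fobj, shape):
    # Crude fscanf substitute: slurp the whole file, convert every
    # whitespace-separated token to a float, then reshape.  No %-format
    # handling, so it only covers the plain-numbers case.
    values = map(float, fobj.read().split())
    data = N.array(values, N.Float)
    data.shape = shape
    return data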
From: Todd M. <jm...@st...> - 2004-07-01 20:02:09
On Thu, 2004-07-01 at 14:51, Tim Hochberg wrote:
> Todd Miller wrote:
> > On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote:
> > > > FYI, the issue with tp_dealloc may have to do with which mode Python
> > > > is compiled in, --with-pydebug, or not. One approach which seems
> > > > like it ought to work (just thought of this!) is to add an extra
> > > > reference in C to the NumArray instance __dict__ (from
> > > > NumArray.__init__ and stashed via a new attribute in the
> > > > PyArrayObject struct) and then DECREF it as the last part of the
> > > > tp_dealloc.
> > >
> > > That sounds promising.
> >
> > I looked at this some, and while INCREFing __dict__ may be the right
> > idea, I forgot that there *is no* Python NumArray.__init__ anymore.
> >
> > So the INCREF needs to be done in C without doing any getattrs; this
> > seems to mean calling a private _PyObject_GetDictPtr function to get a
> > pointer to the __dict__ slot which can be dereferenced to get the
> > __dict__.
>
> Might there be a simpler way? Since you're putting an extra attribute on
> the PyArrayObject structure anyway, wouldn't it be possible to just
> stash _shadows there instead of the reference to the dictionary?

_shadows is already in the struct. The root problem (I recall) is not the
loss of self->_shadows, it's the loss of self->__dict__ before self can be
copied onto self->_shadows.

The cause of the problem appeared to me to be the tear down order of self:
the NumArray part appeared to be torn down before the _numarray part, and
the tp_dealloc needs to do a Python callback where a half destructed
object just won't do.

To really know what the problem is, I need to stick tp_dealloc back in and
see what breaks. I'm pretty sure the problem was a missing instance
__dict__, but my memory is quite fallible.

Todd
From: Colin J. W. <cj...@sy...> - 2004-07-01 19:58:15
Sebastian Haase wrote:
> On Wednesday 30 June 2004 11:33 pm, ger...@gr... wrote:
> > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote
[...]
> Hi all,
> first, I would like to state that I don't understand much of this
> discussion; so the only comment I wanted to make is that IF this were
> possible, to make (C/C++) code that can live with both Numeric and
> numarray, then I think it would be used more and more - think:
> transition phase !! (e.g. someone could start making the FFTW part of
> scipy numarray friendly without having to switch everything at once
> [hint ;-)])
>
> These were just my 2 cents.
> Cheers,
> Sebastian Haase

I feel lower on the understanding tree with respect to what is being
proposed in the draft PEP, but would still like to offer my 2 cents worth.
I get the feeling that numarray is being bent out of shape to fit Numeric.

It was my understanding that Numeric had certain weakness which made it
unacceptable as a Python component and that numarray was intended to
provide the same or better functionality within a pythonic framework.

numarray has not achieved the expected performance level to date, but
progress is being made and I believe that, for larger arrays, numarray has
been shown to be superior to Numeric - please correct me if I'm wrong
here.

The shock came for me when Todd Miller said:

    I looked at this some, and while INCREFing __dict__ may be the right
    idea, I forgot that there *is no* Python NumArray.__init__ anymore.

Wasn't it the intent of numarray to work towards the full use of the
Python class structure to provide the benefits which it offers?

The Python class has two constructors and one destructor.

The constructors are __init__ and __new__, the latter only provides the
shell of an instance which later has to be initialized. In version 0.9,
which I use, there is no __new__, but there is a new function which has a
functionality similar to that intended for __new__. Thus, with this
change, numarray appears to be moving further away from being pythonic.

Colin W
From: Tim H. <tim...@co...> - 2004-07-01 18:51:57
Todd Miller wrote:
> On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote:
> > > FYI, the issue with tp_dealloc may have to do with which mode Python
> > > is compiled in, --with-pydebug, or not. One approach which seems like
> > > it ought to work (just thought of this!) is to add an extra reference
> > > in C to the NumArray instance __dict__ (from NumArray.__init__ and
> > > stashed via a new attribute in the PyArrayObject struct) and then
> > > DECREF it as the last part of the tp_dealloc.
> >
> > That sounds promising.
>
> I looked at this some, and while INCREFing __dict__ may be the right
> idea, I forgot that there *is no* Python NumArray.__init__ anymore.
>
> So the INCREF needs to be done in C without doing any getattrs; this
> seems to mean calling a private _PyObject_GetDictPtr function to get a
> pointer to the __dict__ slot which can be dereferenced to get the
> __dict__.

Might there be a simpler way? Since you're putting an extra attribute on
the PyArrayObject structure anyway, wouldn't it be possible to just stash
_shadows there instead of the reference to the dictionary? It appears that
the only time _shadows is accessed from Python is in __del__. If it were
instead an attribute on ndarray, the dealloc problem would go away since
the responsibility for deallocing it would fall to ndarray. Since
everything else accesses it from C, that shouldn't be much of a problem
and should speed that stuff up as well.

-tim
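The per-instance cost of having a Python-level __del__ at all is easy to see in isolation; a rough timing sketch (class names are invented, and the absolute numbers will vary by machine and Python build):

import time

class Plain:
    pass

class WithDel:
    def __del__(self):
        pass

def bench(cls, n=100000):
    # Create and immediately drop n instances; with __del__ present,
    # teardown goes through the slower, finalizer-aware path.
    start = time.clock()
    for i in xrange(n):
        cls()
    return time.clock() - start

print 'plain       :', bench(Plain)
print 'with __del__:', bench(WithDel)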
From: Gerard V. <ger...@gr...> - 2004-07-01 18:38:16
On 01 Jul 2004 12:43:31 -0400 Todd Miller <jm...@st...> wrote:
> A class of question which will arise for developers is this: "X works
> with Numeric, but X doesn't work with numarray." The reverse also
> happens occasionally. For this reason, being able to choose would be
> nice for developers.
>
> > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out
> > (NINO) Of course, Numeric or numarray output can be a user option if
> > NINO does not apply.
>
> When I first heard it, I thought NINO was a good idea, with the
> limitation that it doesn't apply when a function produces an array
> without consuming any. But... there is another problem with NINO that
> Perry Greenfield pointed out: with multiple arguments, there can be a
> mix of array types. For this reason, it makes sense to be able to
> coerce all the inputs to a particular array package. This form might
> look more like:
>
> switch(PyArray_Which(<no_parameter_at_all!>)) {
> case USE_NUMERIC:
>     result = Numeric_doit(a1, a2, a3); break;
> case USE_NUMARRAY:
>     result = Numarray_doit(a1, a2, a3); break;
> case USE_SEQUENCE:
>     result = Sequence_doit(a1, a2, a3); break;
> }
>
> One last thing: I think it would be useful to be able to drive the code
> into sequence mode with arrays. This would enable easy benchmarking of
> the performance improvement.
>
> > (explicit safe conversion between Numeric and numarray is possible
> > if really needed).

Yeah, when I wrote 'if really needed', I was hoping to shift the
responsibility of coercion (or conversion) to the Python programmer (my
lazy side telling me that it can be done in pure Python). You talked me
into doing it in C :-)

Regards -- Gerard
From: Todd M. <jm...@st...> - 2004-07-01 16:58:04
On Wed, 2004-06-30 at 19:00, Tim Hochberg wrote:
> By this do you mean the "#if PY_VERSION_HEX >= 0x02030000" that is
> wrapped around _ndarray_item? If so, I believe that it *is* getting
> compiled, it's just never getting called.
>
> What I think is happening is that the class NumArray inherits its
> sq_item from PyClassObject. In particular, I think it picks up
> instance_item from Objects/classobject.c. This appears to be fairly
> expensive and, I think, ends up calling tp_as_mapping->mp_subscript.
> Thus, _ndarray's sq_item slot never gets called. All of this is pretty
> iffy since I don't know this stuff very well and I didn't trace it all
> the way through. However, it explains what I've seen thus far.
>
> This is why I ended up using the horrible hack. I'm resetting NumArray's
> sq_item to point to _ndarray_item instead of instance_item. I believe
> that access at the python level goes through mp_subscript, so it
> shouldn't be affected, and only objects at the C level should notice and
> they should just get the faster sq_item. You will notice that there are
> an awful lot of "I think"s in the above paragraphs though...

Ugh... Thanks for explaining this.

> > > I then optimized _ndarray_item (code at end). This halved the
> > > execution time of my arbitrary benchmark. This trick may have
> > > horrible, unforeseen consequences so use at your own risk.
> >
> > Right now the sq_item hack strikes me as somewhere between completely
> > unnecessary and too scary for me! Maybe if python-dev blessed it.
>
> Yes, very scary. And it occurs to me that it will break subclasses of
> NumArray if they override __getitem__. When these subclasses are
> accessed from C they will see nd_array's sq_item instead of the
> overridden getitem. However, I think I also know how to fix it. But
> it does point out that it is very dangerous and there are probably dark
> corners of which I'm unaware. Asking on Python-List or PyDev would
> probably be a good idea.
>
> The nonscary, but painful, fix would be to rewrite NumArray in C.

Non-scary to whom?

> > This optimization looks good to me.
>
> Unfortunately, I don't think the optimization to sq_item will affect
> much since NumArray appears to override it with
>
> > > Finally I commented out the __del__ method in numarraycore. This
> > > resulted in an additional speedup of 64% for a total speed up of
> > > 240%. Still not close to 10x, but a large improvement. However, this
> > > is obviously not viable for real use, but it's enough of a speedup
> > > that I'll try to see if there's any way to move the shadow stuff
> > > back to tp_dealloc.
> >
> > FYI, the issue with tp_dealloc may have to do with which mode Python is
> > compiled in, --with-pydebug, or not. One approach which seems like it
> > ought to work (just thought of this!) is to add an extra reference in C
> > to the NumArray instance __dict__ (from NumArray.__init__ and stashed
> > via a new attribute in the PyArrayObject struct) and then DECREF it as
> > the last part of the tp_dealloc.
>
> That sounds promising.

I looked at this some, and while INCREFing __dict__ may be the right idea,
I forgot that there *is no* Python NumArray.__init__ anymore.

So the INCREF needs to be done in C without doing any getattrs; this seems
to mean calling a private _PyObject_GetDictPtr function to get a pointer
to the __dict__ slot which can be dereferenced to get the __dict__.

> [SNIP]
>
> > Well, be picking out your beer.
>
> I was only about half right, so I'm not sure I qualify...

We could always reduce your wages to a 12-pack...

Todd
From: Todd M. <jm...@st...> - 2004-07-01 16:43:39
On Thu, 2004-07-01 at 02:33, ger...@gr... wrote:
> On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote
> >
> > So... you use the "meta" code to provide package specific ordinary
> > (not-macro-fied) functions to keep the different versions of the
> > Present() and isArray() macros from conflicting.
> >
> > It would be nice to have a standard approach for using the same
> > "extension enhancement code" for both numarray and Numeric. The PEP
> > should really be expanded to provide an example of dual support for one
> > complete and real function, guts and all, so people can see the process
> > end-to-end; something like a simple arrayprint. That process needs
> > to be refined to remove as much tedium and duplication of effort as
> > possible. The idea is to make it as close to providing one
> > implementation to support both array packages as possible. I think it's
> > important to illustrate how to partition the extension module into
> > separate compilation units which correctly navigate the dual
> > implementation mine field in the easiest possible way.
> >
> > It would also be nice to add some logic to the meta-functions so that
> > which array package gets used is configurable. We did something like
> > that for the matplotlib plotting software at the Python level with
> > the "numerix" layer, an idea I think we copied from Chaco. The kind
> > of dispatch I think might be good to support configurability looks like
> > this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(PyArray_Which(what)) {
> >     USE_NUMERIC:
> >         result = Numeric_whatsThis(what); break;
> >     USE_NUMARRAY:
> >         result = Numarray_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > In the above, I'm picturing a separate .c file for Numeric_whatsThis
> > and for Numarray_whatsThis. It would be nice to streamline that to one
> > .c and a process which somehow (simply) produces both functions.
> >
> > Or, ideally, the above would be done more like this:
> >
> > PyObject *
> > whatsThis(PyObject *dummy, PyObject *args)
> > {
> >     PyObject *result, *what = NULL;
> >     if (!PyArg_ParseTuple(args, "O", &what))
> >         return 0;
> >     switch(Numerix_Which(what)) {
> >     USE_NUMERIX:
> >         result = Numerix_whatsThis(what); break;
> >     USE_SEQUENCE:
> >         result = Sequence_whatsThis(what); break;
> >     }
> >     Py_INCREF(Py_None);
> >     return Py_None;
> > }
> >
> > Here, a common Numerix implementation supports both numarray and
> > Numeric from a single simple .c. The extension module would do
> > "#include numerix/arrayobject.h" and "import_numerix()" and otherwise
> > just call PyArray_* functions.
> >
> > The current stumbling block is that numarray is not binary compatible
> > with Numeric... so numerix in C falls apart. I haven't analyzed
> > every symbol and struct to see if it is really feasible... but it
> > seems like it is *almost* feasible, at least for typical usage.
> >
> > So, in a nutshell, I think the dual implementation support you
> > demoed is important and we should work up an example and kick it
> > around to make sure it's the best way we can think of doing it.
> > Then we should add a section to the PEP describing dual support as
> > well.
>
> I would never apply numarray code to Numeric arrays and the inverse. It
> looks dangerous and I do not know if it is possible.

I think that's definitely the marching orders for now... but you gotta
admit, it would be nice.

> The first thing coming to mind is that numarray and Numeric arrays refer
> to different type objects (this is what my pep module uses to
> differentiate them). So, even if numarray and Numeric are binary
> compatible, any 'alien' code referring to the 'Python-standard part' of
> the type objects may lead to surprises. A PEP proposing hacks will raise
> eyebrows at least.

I'm a little surprised it took someone to talk me out of it... I'll just
concede that this was probably a bad idea.

> Secondly, most people use Numeric *or* numarray and not both.

A class of question which will arise for developers is this: "X works with
Numeric, but X doesn't work with numarray." The reverse also happens
occasionally. For this reason, being able to choose would be nice for
developers.

> So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out
> (NINO) Of course, Numeric or numarray output can be a user option if
> NINO does not apply.

When I first heard it, I thought NINO was a good idea, with the limitation
that it doesn't apply when a function produces an array without consuming
any. But... there is another problem with NINO that Perry Greenfield
pointed out: with multiple arguments, there can be a mix of array types.
For this reason, it makes sense to be able to coerce all the inputs to a
particular array package. This form might look more like:

switch(PyArray_Which(<no_parameter_at_all!>)) {
case USE_NUMERIC:
    result = Numeric_doit(a1, a2, a3); break;
case USE_NUMARRAY:
    result = Numarray_doit(a1, a2, a3); break;
case USE_SEQUENCE:
    result = Sequence_doit(a1, a2, a3); break;
}

One last thing: I think it would be useful to be able to drive the code
into sequence mode with arrays. This would enable easy benchmarking of the
performance improvement.

> (explicit safe conversion between Numeric and numarray is possible
> if really needed).
>
> I'll try to flesh out the demo with real functions in the way you
> indicated (going as far as I consider safe).
>
> The problem of coding the Numeric (or numarray) functions in more than
> a single source file has also been addressed.
>
> It may take 2 weeks because I am off to a conference next week.

Excellent. See you in a couple weeks.

Regards,
Todd
From: Sebastian H. <ha...@ms...> - 2004-07-01 16:04:30
|
On Wednesday 30 June 2004 11:33 pm, ger...@gr... wrote: > On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > > So... you use the "meta" code to provide package specific ordinary > > (not-macro-fied) functions to keep the different versions of the > > Present() and isArray() macros from conflicting. > > > > It would be nice to have a standard approach for using the same > > "extension enhancement code" for both numarray and Numeric. The PEP > > should really be expanded to provide an example of dual support for one > > complete and real function, guts and all, so people can see the process > > end-to-end; Something like a simple arrayprint. That process needs > > to be refined to remove as much tedium and duplication of effort as > > possible. The idea is to make it as close to providing one > > implementation to support both array packages as possible. I think it's > > important to illustrate how to partition the extension module into > > separate compilation units which correctly navigate the dual > > implementation mine field in the easiest possible way. > > > > It would also be nice to add some logic to the meta-functions so that > > which array package gets used is configurable. We did something like > > that for the matplotlib plotting software at the Python level with > > the "numerix" layer, an idea I think we copied from Chaco. The kind > > of dispatch I think might be good to support configurability looks like > > this: > > > > PyObject * > > whatsThis(PyObject *dummy, PyObject *args) > > { > > PyObject *result, *what = NULL; > > if (!PyArg_ParseTuple(args, "O", &what)) > > return 0; > > switch(PyArray_Which(what)) { > > case USE_NUMERIC: > > result = Numeric_whatsThis(what); break; > > case USE_NUMARRAY: > > result = Numarray_whatsThis(what); break; > > case USE_SEQUENCE: > > result = Sequence_whatsThis(what); break; > > } > > return result; > > } > > > > In the above, I'm picturing a separate .c file for Numeric_whatsThis > > and for Numarray_whatsThis. It would be nice to streamline that to one > > .c and a process which somehow (simply) produces both functions. > > > > Or, ideally, the above would be done more like this: > > > > PyObject * > > whatsThis(PyObject *dummy, PyObject *args) > > { > > PyObject *result, *what = NULL; > > if (!PyArg_ParseTuple(args, "O", &what)) > > return 0; > > switch(Numerix_Which(what)) { > > case USE_NUMERIX: > > result = Numerix_whatsThis(what); break; > > case USE_SEQUENCE: > > result = Sequence_whatsThis(what); break; > > } > > return result; > > } > > > > Here, a common Numerix implementation supports both numarray and Numeric > > from a single simple .c. The extension module would do "#include > > numerix/arrayobject.h" and "import_numerix()" and otherwise just call > > PyArray_* functions. > > > > The current stumbling block is that numarray is not binary compatible > > with Numeric... so numerix in C falls apart. I haven't analyzed > > every symbol and struct to see if it is really feasible... but it > > seems like it is *almost* feasible, at least for typical usage. > > > > So, in a nutshell, I think the dual implementation support you > > demoed is important and we should work up an example and kick it > > around to make sure it's the best way we can think of doing it. > > Then we should add a section to the PEP describing dual support as well. > > I would never apply numarray code to Numeric arrays and the inverse. It > looks dangerous and I do not know if it is possible. 
The first thing > coming to mind is that numarray and Numeric arrays refer to different type > objects (this is what my pep module uses to differentiate them). So, even > if numarray and Numeric are binary compatible, any 'alien' code referring > to the 'Python-standard part' of the type objects may lead to surprises. A > PEP proposing hacks will raise eyebrows at least. > > Secondly, most people use Numeric *or* numarray and not both. > > So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out > (NINO) Of course, Numeric or numarray output can be a user option if NINO > does not apply. (explicit safe conversion between Numeric and numarray is > possible if really needed). > > I'll try to flesh out the demo with real functions in the way you indicated > (going as far as I consider safe). > > The problem of coding the Numeric (or numarray) functions in more than > a single source file has also been addressed. > > It may take 2 weeks because I am off to a conference next week. > > Regards -- Gerard Hi all, first, I would like to state that I don't understand much of this discussion; so the only comment I wanted to make is that IF it were possible to write (C/C++) code that can live with both Numeric and numarray, then I think it would be used more and more - think: transition phase !! (e.g. someone could start making the FFTW part of scipy numarray-friendly without having to switch everything at once [hint ;-)] ) These were just my 2 cents. Cheers, Sebastian Haase |
From: Francesc A. <fa...@py...> - 2004-07-01 08:48:17
|
On Wednesday 30 June 2004 23:47, Todd Miller wrote: > > There were a couple of other things I tried that resulted in additional > > small speedups, but the tactics I used were too horrible to reproduce > > here. The main one of interest is that all of the calls to > > NA_updateDataPtr seem to burn some time. However, I don't have any idea > > what one could do about that. > > Francesc Alted had the same comment about NA_updateDataPtr a while ago. > I tried to optimize it then but didn't get anywhere. NA_updateDataPtr() > should be called at most once per extension function (more is > unnecessary but not harmful) but needs to be called at least once as a > consequence of the way the buffer protocol doesn't give locked > pointers. FYI I'm still refusing to call NA_updateDataPtr() in a specific part of my code that requires as much speed as possible. It works just fine from numarray 0.5 on (numarray 0.4 gave a segmentation fault on that). However, Todd already warned me about that and told me that this is unsafe. Nevertheless, I'm using the optimization for read-only purposes (i.e. these objects are not accessible to users) over numarray objects, and that *seems* to be safe (at least I did not have a single problem after numarray 0.5). I know that I'm walking on the cutting edge, but life is dangerous anyway ;). By the way, that optimization gives me a 70% improvement during element access to NumArray elements. It would be very nice if you can finally achieve additional performance with your recent bet :). Good luck!, -- Francesc Alted |
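For reference, the shortcut Francesc describes amounts to something like the sketch below: call NA_updateDataPtr() once, cache the raw pointer, and index through it in the hot loop. This is a sketch only; it assumes the numarray C-API header (libnumarray.h of that era) is included, that the array is contiguous, aligned, native-endian Float32, and that the Numeric-compatible a->data struct field is available, and it carries exactly the safety caveat Todd gave:

/* Unsafe in general (the buffer can move), but fast for read-only
   access to an array the caller controls. */
static double
sum_float32_readonly(PyArrayObject *a, long n)
{
    double total = 0.0;
    float *data;
    long i;

    if (!NA_updateDataPtr(a))      /* one refresh up front, then no more */
        return -1.0;               /* error handling simplified here */
    data = (float *) a->data;      /* Numeric-compatible struct field */
    for (i = 0; i < n; i++)
        total += data[i];          /* no per-access pointer update */
    return total;
}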
From: <ger...@gr...> - 2004-07-01 06:33:30
|
On 30 Jun 2004 17:54:19 -0400, Todd Miller wrote > > So... you use the "meta" code to provide package specific ordinary > (not-macro-fied) functions to keep the different versions of the > Present() and isArray() macros from conflicting. > > It would be nice to have a standard approach for using the same > "extension enhancement code" for both numarray and Numeric. The PEP > should really be expanded to provide an example of dual support for one > complete and real function, guts and all, so people can see the process > end-to-end; Something like a simple arrayprint. That process needs > to be refined to remove as much tedium and duplication of effort as > possible. The idea is to make it as close to providing one > implementation to support both array packages as possible. I think it's > important to illustrate how to partition the extension module into > separate compilation units which correctly navigate the dual > implementation mine field in the easiest possible way. > > It would also be nice to add some logic to the meta-functions so that > which array package gets used is configurable. We did something like > that for the matplotlib plotting software at the Python level with > the "numerix" layer, an idea I think we copied from Chaco. The kind > of dispatch I think might be good to support configurability looks like > this: > > PyObject * > whatsThis(PyObject *dummy, PyObject *args) > { > PyObject *result, *what = NULL; > if (!PyArg_ParseTuple(args, "O", &what)) > return 0; > switch(PyArray_Which(what)) { > case USE_NUMERIC: > result = Numeric_whatsThis(what); break; > case USE_NUMARRAY: > result = Numarray_whatsThis(what); break; > case USE_SEQUENCE: > result = Sequence_whatsThis(what); break; > } > return result; > } > > In the above, I'm picturing a separate .c file for Numeric_whatsThis > and for Numarray_whatsThis. It would be nice to streamline that to one > .c and a process which somehow (simply) produces both functions. > > Or, ideally, the above would be done more like this: > > PyObject * > whatsThis(PyObject *dummy, PyObject *args) > { > PyObject *result, *what = NULL; > if (!PyArg_ParseTuple(args, "O", &what)) > return 0; > switch(Numerix_Which(what)) { > case USE_NUMERIX: > result = Numerix_whatsThis(what); break; > case USE_SEQUENCE: > result = Sequence_whatsThis(what); break; > } > return result; > } > > Here, a common Numerix implementation supports both numarray and Numeric > from a single simple .c. The extension module would do "#include > numerix/arrayobject.h" and "import_numerix()" and otherwise just call > PyArray_* functions. > > The current stumbling block is that numarray is not binary compatible > with Numeric... so numerix in C falls apart. I haven't analyzed > every symbol and struct to see if it is really feasible... but it > seems like it is *almost* feasible, at least for typical usage. > > So, in a nutshell, I think the dual implementation support you > demoed is important and we should work up an example and kick it > around to make sure it's the best way we can think of doing it. > Then we should add a section to the PEP describing dual support as well. > I would never apply numarray code to Numeric arrays and the inverse. It looks dangerous and I do not know if it is possible. The first thing coming to mind is that numarray and Numeric arrays refer to different type objects (this is what my pep module uses to differentiate them). 
So, even if numarray and Numeric are binary compatible, any 'alien' code referring to the 'Python-standard part' of the type objects may lead to surprises. A PEP proposing hacks will raise eyebrows at least. Secondly, most people use Numeric *or* numarray and not both. So, I prefer: Numeric In => Numeric Out or Numarray In => Numarray Out (NINO) Of course, Numeric or numarray output can be a user option if NINO does not apply. (explicit safe conversion between Numeric and numarray is possible if really needed). I'll try to flesh out the demo with real functions in the way you indicated (going as far as I consider safe). The problem of coding the Numeric (or numarray) functions in more than a single source file has also been addressed. It may take 2 weeks because I am off to a conference next week. Regards -- Gerard |
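Spelled out, the meta file Gerard describes could look roughly like this. Signatures are guessed from the prototypes quoted in the thread; the point is that this is the only compilation unit touching the numarray C-API header, with a twin Numeric/meta.c doing the same for Numeric:

/* numarray/meta.c -- wraps the numarray macros in plain functions so
   no other .c file needs the numarray headers, and its PyArray_*
   macros cannot clash with Numeric's elsewhere. */
#include <Python.h>
#include "numarray/arrayobject.h"

int numarray_Present(void)
{
    return PyArray_Present();       /* macro hidden behind a function */
}

int numarray_isArray(PyObject *obj)
{
    return PyArray_isArray(obj);    /* ditto */
}

void import_numarray(void)
{
    import_array();                 /* one-time C-API table import */
}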
From: Tim H. <tim...@co...> - 2004-06-30 23:01:14
|
Todd Miller wrote: >On Wed, 2004-06-30 at 15:57, Tim Hochberg wrote: > > >>[SNIP] >> >>After futzing around some more I figured out a way to trick Python into >>using _ndarray_item. I added "type->tp_as_sequence->sq_item = >>_ndarray_item;" to _ndarray new. >> >> > >I'm puzzled why you had to do this. You're using Python-2.3.x, right? >There's conditionally compiled code which should be doing this >statically. (At least I thought so.) > > By this do you mean the "#if PY_VERSION_HEX >= 0x02030000 " that is wrapped around _ndarray_item? If so, I believe that it *is* getting compiled, it's just never getting called. What I think is happening is that the class NumArray inherits its sq_item from PyClassObject. In particular, I think it picks up instance_item from Objects/classobject.c. This appears to be fairly expensive and, I think, ends up calling tp_as_mapping->mp_subscript. Thus, _ndarray's sq_item slot never gets called. All of this is pretty iffy since I don't know this stuff very well and I didn't trace it all the way through. However, it explains what I've seen thus far. This is why I ended up using the horrible hack. I'm resetting NumArray's sq_item to point to _ndarray_item instead of instance_item. I believe that access at the Python level goes through mp_subscript, so it shouldn't be affected, and only objects at the C level should notice and they should just get the faster sq_item. You will notice that there are an awful lot of "I think"s in the above paragraphs though... >>I then optimized _ndarray_item (code >>at end). This halved the execution time of my arbitrary benchmark. This >>trick may have horrible, unforeseen consequences so use at your own risk. >> >> > >Right now the sq_item hack strikes me as somewhere between completely >unnecessary and too scary for me! Maybe if python-dev blessed it. > > Yes, very scary. And it occurs to me that it will break subclasses of NumArray if they override __getitem__. When these subclasses are accessed from C they will see nd_array's sq_item instead of the overridden getitem. However, I think I also know how to fix it. But it does point out that it is very dangerous and there are probably dark corners of which I'm unaware. Asking on Python-List or PyDev would probably be a good idea. The nonscary, but painful, fix would be to rewrite NumArray in C. >This optimization looks good to me. > > Unfortunately, I don't think the optimization to sq_item will affect much since NumArray appears to override it with instance_item. >>Finally I commented out the __del__ method in numarraycore. This resulted >>in an additional speedup of 64% for a total speedup of 240%. Still not >>close to 10x, but a large improvement. However, this is obviously not >>viable for real use, but it's enough of a speedup that I'll try to see >>if there's any way to move the shadow stuff back to tp_dealloc. >> >> > >FYI, the issue with tp_dealloc may have to do with which mode Python is >compiled in, --with-pydebug, or not. One approach which seems like it >ought to work (just thought of this!) is to add an extra reference in C >to the NumArray instance __dict__ (from NumArray.__init__ and stashed >via a new attribute in the PyArrayObject struct) and then DECREF it as >the last part of the tp_dealloc. > > That sounds promising. [SNIP] > >Well, be picking out your beer. > > I was only about half right, so I'm not sure I qualify... -tim |
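The extra-reference idea quoted above (and called promising here) might look something like this sketch; the struct layout and names are hypothetical, not numarray's actual PyArrayObject:

#include <Python.h>

/* Hypothetical layout: only the extra pinned-__dict__ slot matters. */
typedef struct {
    PyObject_HEAD
    PyObject *pinned_dict;   /* extra reference to the instance __dict__,
                                taken from __init__ via a C helper */
} SketchArrayObject;

static void
sketch_dealloc(SketchArrayObject *self)
{
    /* Tear down shadow/buffer state first, while the __dict__ and
       everything it keeps alive are still valid ... */

    /* ... then release the pinned __dict__ as the very last step. */
    Py_XDECREF(self->pinned_dict);
    self->ob_type->tp_free((PyObject *) self);
}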
From: Todd M. <jm...@st...> - 2004-06-30 21:54:29
|
On Tue, 2004-06-29 at 17:32, ger...@gr... wrote: > On 29 Jun 2004 15:09:43 -0400, Todd Miller wrote > > On Tue, 2004-06-29 at 13:44, Gerard Vermeulen wrote: > > > > > > > > The PEP is attached. It is formatted using the docutils package which > > > > can be used to generate HTML or PDF. Comments and corrections would be > > > > appreciated. > > > > > > > PyQwt is a Python extension which can be conditionally compiled against > > > Numeric and/or numarray (both, one of them or none). > > > > Well that's cool! I'll have to keep the PyQwt guys in mind as potential > > first users. > > > > > Your PEP excludes importing of Numeric and numarray in the same C-extension. > > > > This is true but I don't understand your solution so I'll blather on > > below. > > > > > All you need to do is to hide the macros PyArray_Present(), PyArray_isArray() > > > and import_array() into a few functions with numarray specific names, so > > > that the following becomes possible: > > > > > > #include <Numeric/meta.h> > > > /* defines the functions (no macros) > > > int Numeric_Present(); > > > int Numeric_isArray(); > > > void import_numeric(); > > > to hide the Numeric C-API stuff in a small Numeric/meta.c file. > > > */ > > > #include <numarray/meta.h> > > > /* defines the functions (no macros) > > > int numarray_Present(); > > > int numarray_isArray(); > > > void import_numarray(); > > > to hide the numarray C-API stuff in a small numarray/meta.c file. > > > */ > > > > > > > I may have come up with the wrong scheme for the Present() and > > isArray(). With my scheme, they *have* to be macros because the API > > functions are unsafe: when numarray or Numeric is not present, the API > > function table pointers are NULL so calling through them results in > > either a fatal error or a segfault. > > > The macros can be hidden from the module file scope by wrapping them > in a function (see attached demo). Your demo is very clear... nice! > > There is an additional problem that the "same functions" need to be > > called through different API pointers depending on whether numarray > > or Numeric is being supported. Redefinition of typedefs and enumerations > > > > (or perhaps conditional compilation short-circuited re-definitions) may > > also present a problem with compiling (or worse, running). > > > Tested and works. > > > > I certainly like the idea of supporting both in the same extension > > module, but don't see how to get there, other than with separate > > compilation units. With separate .c files, I'm not aware of a problem > > other than lots of global symbols. I haven't demoed that yet so I am > > interested if someone has made it work. > > > Yes, you cannot mix the numarray API and Numeric API in the same .c > file, but nevertheless you can hide the macros in small functions so > that the macros don't pollute. So... you use the "meta" code to provide package specific ordinary (not-macro-fied) functions to keep the different versions of the Present() and isArray() macros from conflicting. It would be nice to have a standard approach for using the same "extension enhancement code" for both numarray and Numeric. The PEP should really be expanded to provide an example of dual support for one complete and real function, guts and all, so people can see the process end-to-end; Something like a simple arrayprint. That process needs to be refined to remove as much tedium and duplication of effort as possible. The idea is to make it as close to providing one implementation to support both array packages as possible. 
I think it's important to illustrate how to partition the extension module into separate compilation units which correctly navigate the dual implementation mine field in the easiest possible way. It would also be nice to add some logic to the meta-functions so that which array package gets used is configurable. We did something like that for the matplotlib plotting software at the Python level with the "numerix" layer, an idea I think we copied from Chaco. The kind of dispatch I think might be good to support configurability looks like this:

PyObject *
whatsThis(PyObject *dummy, PyObject *args)
{
    PyObject *result, *what = NULL;
    if (!PyArg_ParseTuple(args, "O", &what))
        return 0;
    switch(PyArray_Which(what)) {
    case USE_NUMERIC:
        result = Numeric_whatsThis(what); break;
    case USE_NUMARRAY:
        result = Numarray_whatsThis(what); break;
    case USE_SEQUENCE:
        result = Sequence_whatsThis(what); break;
    }
    return result;
}

In the above, I'm picturing a separate .c file for Numeric_whatsThis and for Numarray_whatsThis. It would be nice to streamline that to one .c and a process which somehow (simply) produces both functions. Or, ideally, the above would be done more like this:

PyObject *
whatsThis(PyObject *dummy, PyObject *args)
{
    PyObject *result, *what = NULL;
    if (!PyArg_ParseTuple(args, "O", &what))
        return 0;
    switch(Numerix_Which(what)) {
    case USE_NUMERIX:
        result = Numerix_whatsThis(what); break;
    case USE_SEQUENCE:
        result = Sequence_whatsThis(what); break;
    }
    return result;
}

Here, a common Numerix implementation supports both numarray and Numeric from a single simple .c. The extension module would do "#include numerix/arrayobject.h" and "import_numerix()" and otherwise just call PyArray_* functions. The current stumbling block is that numarray is not binary compatible with Numeric... so numerix in C falls apart. I haven't analyzed every symbol and struct to see if it is really feasible... but it seems like it is *almost* feasible, at least for typical usage. So, in a nutshell, I think the dual implementation support you demoed is important and we should work up an example and kick it around to make sure it's the best way we can think of doing it. Then we should add a section to the PEP describing dual support as well. Regards, Todd |
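One conceivable shape for that numerix layer in C, for concreteness: NUMERIX_USE_NUMARRAY and import_numerix() are invented names here, and the binary incompatibility noted above is exactly what a real version would still have to solve:

/* numerix/arrayobject.h -- select one backend at compile time and hide
   it behind a common spelling, so extension code just does
   #include "numerix/arrayobject.h", calls import_numerix() once, and
   otherwise uses PyArray_* functions. */
#ifndef NUMERIX_ARRAYOBJECT_H
#define NUMERIX_ARRAYOBJECT_H

#ifdef NUMERIX_USE_NUMARRAY
#include "numarray/arrayobject.h"
#else
#include "Numeric/arrayobject.h"
#endif

#define import_numerix() import_array()

#endif /* NUMERIX_ARRAYOBJECT_H */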
From: Todd M. <jm...@st...> - 2004-06-30 21:47:43
|
On Wed, 2004-06-30 at 15:57, Tim Hochberg wrote: > I spent some time seeing what I could do in the way of speeding up > wxPoint_LIST_helper by tweaking the numarray code. My first suspect was > _universalIndexing by way of _ndarray_item. However, due to some > new-style machinations, _ndarray_item was never getting called. Instead, > _ndarray_subscript was being called. So, I added a special case to > _ndarray_subscript. This sped things up by 50% or so (I don't recall > exactly). The code for that is at the end of this message; it's not > guaranteed to be 100% correct; it's all experimental. > > After futzing around some more I figured out a way to trick Python into > using _ndarray_item. I added "type->tp_as_sequence->sq_item = > _ndarray_item;" to _ndarray new. I'm puzzled why you had to do this. You're using Python-2.3.x, right? There's conditionally compiled code which should be doing this statically. (At least I thought so.) > I then optimized _ndarray_item (code > at end). This halved the execution time of my arbitrary benchmark. This > trick may have horrible, unforeseen consequences so use at your own risk. Right now the sq_item hack strikes me as somewhere between completely unnecessary and too scary for me! Maybe if python-dev blessed it. This optimization looks good to me. > Finally I commented out the __del__ method in numarraycore. This resulted > in an additional speedup of 64% for a total speedup of 240%. Still not > close to 10x, but a large improvement. However, this is obviously not > viable for real use, but it's enough of a speedup that I'll try to see > if there's any way to move the shadow stuff back to tp_dealloc. FYI, the issue with tp_dealloc may have to do with which mode Python is compiled in, --with-pydebug, or not. One approach which seems like it ought to work (just thought of this!) is to add an extra reference in C to the NumArray instance __dict__ (from NumArray.__init__ and stashed via a new attribute in the PyArrayObject struct) and then DECREF it as the last part of the tp_dealloc. > In summary: > > Version Time Rel Speedup Abs Speedup > Stock 0.398 ---- ---- > _ndarray_item mod 0.192 107% 107% > del __del__ 0.117 64% 240% > > There were a couple of other things I tried that resulted in additional > small speedups, but the tactics I used were too horrible to reproduce > here. The main one of interest is that all of the calls to > NA_updateDataPtr seem to burn some time. However, I don't have any idea > what one could do about that. Francesc Alted had the same comment about NA_updateDataPtr a while ago. I tried to optimize it then but didn't get anywhere. NA_updateDataPtr() should be called at most once per extension function (more is unnecessary but not harmful) but needs to be called at least once as a consequence of the way the buffer protocol doesn't give locked pointers. > That's all for now. > > -tim Well, be picking out your beer. 
Todd

> static PyObject*
> _ndarray_subscript(PyArrayObject* self, PyObject* key)
> {
>     PyObject *result;
> #ifdef TAH
>     if (PyInt_CheckExact(key)) {
>         long ikey = PyInt_AsLong(key);
>         long offset;
>         if (NA_getByteOffset(self, 1, &ikey, &offset) < 0)
>             return NULL;
>         if (!NA_updateDataPtr(self))
>             return NULL;
>         return _simpleIndexingCore(self, offset, 1, Py_None);
>     }
> #endif
> #if _PYTHON_CALLBACKS
>     result = PyObject_CallMethod(
>         (PyObject *) self, "_universalIndexing", "(OO)", key, Py_None);
> #else
>     result = _universalIndexing(self, key, Py_None);
> #endif
>     return result;
> }
>
> static PyObject *
> _ndarray_item(PyArrayObject *self, int i)
> {
> #ifdef TAH
>     long offset;
>     if (NA_getByteOffset(self, 1, &i, &offset) < 0)
>         return NULL;
>     if (!NA_updateDataPtr(self))
>         return NULL;
>     return _simpleIndexingCore(self, offset, 1, Py_None);
> #else
>     PyObject *result;
>     PyObject *key = PyInt_FromLong(i);
>     if (!key) return NULL;
>     result = _universalIndexing(self, key, Py_None);
>     Py_DECREF(key);
>     return result;
> #endif
> }

-- |
From: Tim H. <tim...@co...> - 2004-06-30 19:57:55
|
I spent some time seeing what I could do in the way of speeding up wxPoint_LIST_helper by tweaking the numarray code. My first suspect was _universalIndexing by way of _ndarray_item. However, due to some new-style machinations, _ndarray_item was never getting called. Instead, _ndarray_subscript was being called. So, I added a special case to _ndarray_subscript. This sped things up by 50% or so (I don't recall exactly). The code for that is at the end of this message; it's not guaranteed to be 100% correct; it's all experimental.

After futzing around some more I figured out a way to trick Python into using _ndarray_item. I added "type->tp_as_sequence->sq_item = _ndarray_item;" to _ndarray new. I then optimized _ndarray_item (code at end). This halved the execution time of my arbitrary benchmark. This trick may have horrible, unforeseen consequences so use at your own risk.

Finally I commented out the __del__ method in numarraycore. This resulted in an additional speedup of 64% for a total speedup of 240%. Still not close to 10x, but a large improvement. However, this is obviously not viable for real use, but it's enough of a speedup that I'll try to see if there's any way to move the shadow stuff back to tp_dealloc.

In summary:

    Version            Time     Rel Speedup   Abs Speedup
    Stock              0.398    ----          ----
    _ndarray_item mod  0.192    107%          107%
    del __del__        0.117    64%           240%

There were a couple of other things I tried that resulted in additional small speedups, but the tactics I used were too horrible to reproduce here. The main one of interest is that all of the calls to NA_updateDataPtr seem to burn some time. However, I don't have any idea what one could do about that.

That's all for now.

-tim

static PyObject*
_ndarray_subscript(PyArrayObject* self, PyObject* key)
{
    PyObject *result;
#ifdef TAH
    if (PyInt_CheckExact(key)) {
        long ikey = PyInt_AsLong(key);
        long offset;
        if (NA_getByteOffset(self, 1, &ikey, &offset) < 0)
            return NULL;
        if (!NA_updateDataPtr(self))
            return NULL;
        return _simpleIndexingCore(self, offset, 1, Py_None);
    }
#endif
#if _PYTHON_CALLBACKS
    result = PyObject_CallMethod(
        (PyObject *) self, "_universalIndexing", "(OO)", key, Py_None);
#else
    result = _universalIndexing(self, key, Py_None);
#endif
    return result;
}

static PyObject *
_ndarray_item(PyArrayObject *self, int i)
{
#ifdef TAH
    long offset;
    if (NA_getByteOffset(self, 1, &i, &offset) < 0)
        return NULL;
    if (!NA_updateDataPtr(self))
        return NULL;
    return _simpleIndexingCore(self, offset, 1, Py_None);
#else
    PyObject *result;
    PyObject *key = PyInt_FromLong(i);
    if (!key) return NULL;
    result = _universalIndexing(self, key, Py_None);
    Py_DECREF(key);
    return result;
#endif
}
|
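For completeness, the slot trick described in this message, written out as a sketch: install_fast_sq_item is an invented helper, and intargfunc is the Python 2.3 signature of the sq_item slot:

/* _ndarray_item as defined in the listing above. */
static PyObject *_ndarray_item(PyArrayObject *self, int i);

/* Point the C-level sq_item slot at the fast path when the type is
   created, bypassing the slower inherited instance_item.  Subclasses
   that override __getitem__ will not be seen from C afterwards -- the
   hazard discussed in the follow-ups. */
static void
install_fast_sq_item(PyTypeObject *type)
{
    if (type->tp_as_sequence != NULL)
        type->tp_as_sequence->sq_item = (intargfunc) _ndarray_item;
}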
From: Perry G. <pe...@st...> - 2004-06-30 16:07:32
|
On Jun 30, 2004, at 11:55 AM, Russell E Owen wrote: > I agree. I have gotten numarray.records to handle multi-dimensional > arrays, but it's a terrible pain to create them, str(arry) fails and > setting elements of records arrays is painful. I hope at some point > they get a major redesign, as they don't actually seem to have been > designed to fit in with numarray. The resulting code was so ugly that > I gave up and used multiple identically shaped arrays instead. > I think we will be taking a look at that soon. I agree that they could be generalized to work better with numarray. Hopefully we will be soliciting comments in the next few weeks about the best way to do that. Perry |
From: Russell E O. <rowen@u.washington.edu> - 2004-06-30 15:55:59
|
At 7:49 PM -0400 2004-06-29, Todd Miller wrote: >On Tue, 2004-06-29 at 17:52, Sebastian Haase wrote: >> OK, >> I'm still trying to get a handle on these record arrays - because >>I think they >> are pretty cool, if I could get them to work... >> Following the code from yesterday (see that posting below) I >>discovered this: >> main.ring4ext[0][0] is not the same as main.ring4ext[0,0] >> is this intended ?? >> >> >>> main.ring4ext[0][0] >> (2308, 76, 272, 1088481152.0, 104.18000030517578, 1994.949951171875) >> >>> main.ring4ext[0,0] >> (array([2308, 2309]), array([76, 76]), array([272, 269]), array([ >>1.08848115e >> +09, 1.08848115e+09], type=Float32), array([ 104.18000031, 104.45999908], >> type=Float32), array([ 1994.94995117, 1994.95996094], type=Float32)) >> >>> main.ring4ext.shape # yesterday I had this different !!! (20,1) >> (20, 2) >> >> Any comments are appreciated, > >I talked to JC Hsu, the numarray.records author, and he explained that >we're probably looking at a limitation of numarray.records: it doesn't >yet handle multi-dimensional arrays of records. JC indicated he had >replied to Sebastian, but for the benefit of everyone else, that's the >deal. I agree. I have gotten numarray.records to handle multi-dimensional arrays, but it's a terrible pain to create them, str(arry) fails and setting elements of records arrays is painful. I hope at some point they get a major redesign, as they don't actually seem to have been designed to fit in with numarray. The resulting code was so ugly that I gave up and used multiple identically shaped arrays instead. -- Russell |