From: Travis O. <oli...@ie...> - 2006-08-29 18:57:42
|
Hi all,

Classes start for me next Tuesday, and I'm teaching a class for which I will be using NumPy / SciPy extensively. I need to have a release of these two (and hopefully matplotlib) that work with each other.

Therefore, I'm going to make a 1.0b5 release of NumPy over the weekend (probably Monday), and also get a release of SciPy out as well. At that point, I'll only be available for bug-fixes to 1.0. Therefore, I would like the next release after 1.0b5 to be 1.0rc1 (release candidate 1).

To facilitate that, after 1.0b5 there will be a feature freeze (except in the compatibility modules and the alter_code scripts, which can still be modified to ease the transition burden).

The 1.0rc1 release of NumPy will be in mid-September, I suspect.

Also, I recognize that the default-axis switch is a burden for those who have already transitioned code to use NumPy (for those just starting out it's not a big deal because of the compatibility layer).

As a result, I've added a module called fix_default_axis whose converttree method will walk a hierarchy and change all .py files to fix the default axis problem in those files. This can be done in one of two ways (depending on the boolean argument import_change).

If import_change is False:

a) Add an axis=<olddefault> keyword argument to any function whose default changed in 1.0b2 or 1.0b3 and which does not already have the axis argument. This method does not distinguish where the function came from, and so can do the wrong thing with similarly named functions from other modules (e.g. builtin sum and itertools.repeat).

If import_change is True:

b) Change the location where the function is imported from numpy to numpy.oldnumeric, where the default axis is the same as before. This approach looks for several flavors of the import statement and alters the import location for any function whose default axis argument changed. It can get confused if you use "from numpy import sum as mysum": it will not replace that usage of sum.

I used this script on the scipy tree in mode a) as a test (followed by a manual replacement of all(?) incorrect substitutions). I hope it helps.

I know it's annoying to have such things change. But it does make NumPy much more consistent with respect to the default axis argument. With a few exceptions (concatenate, diff, trapz, split, array_split), the rule is that you need to specify the axis if there is more than 1 dimension or it will ravel the input.

-Travis |
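The default-axis rule described above can be illustrated with a minimal sketch. It assumes a current NumPy install rather than the 1.0 betas discussed in the thread, and the array values are made up for illustration only.

```python
import numpy as np

# A small 2-D array used only for illustration.
a = np.array([[1, 2, 3],
              [4, 5, 6]])

# No axis given: the input is treated as if it were raveled,
# so the reduction runs over every element.
total = np.sum(a)

# Axis given explicitly: reduce along one dimension only.
col_sums = np.sum(a, axis=0)  # collapse the rows
row_sums = np.sum(a, axis=1)  # collapse the columns

# concatenate is one of the listed exceptions: without an axis
# argument it joins along the first axis instead of raveling.
stacked = np.concatenate([a, a])

print(total, col_sums, row_sums, stacked.shape)
```

Passing axis explicitly in code that handles arrays with more than one dimension avoids depending on which default a given function uses.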
From: Charles R H. <cha...@gm...> - 2006-08-29 19:06:41
|
Hi Travis,

On 8/29/06, Travis Oliphant <oli...@ie...> wrote:
> Hi all,
>
> Classes start for me next Tuesday, and I'm teaching a class for which I
> will be using NumPy / SciPy extensively. I need to have a release of
> these two (and hopefully matplotlib) that work with each other.
>
> Therefore, I'm going to make a 1.0b5 release of NumPy over the weekend
> (probably Monday), and also get a release of SciPy out as well. At that
> point, I'll only be available for bug-fixes to 1.0. Therefore, the next
> release after 1.0b5 I would like to be 1.0rc1 (release-candidate 1).
>
> To facilitate that, after 1.0b5 there will be a feature-freeze (except
> for in the compatibility modules and the alter_code scripts which can
> still be modified to ease the transition burden).

Speaking of features, I wonder if more of the methods should return references. For instance, it might be nice to write something like:

a.sort().searchsorted([...])

instead of making two statements out of it.

> The 1.0rc1 release of NumPy will be mid September I suspect.
>
> Also, I recognize that the default-axis switch is a burden for those who
> have already transitioned code to use NumPy (for those just starting out
> it's not a big deal because of the compatibility layer).

I am curious as to why you made this switch. Not complaining, mind.

Chuck |
From: Fernando P. <fpe...@gm...> - 2006-08-29 19:11:53
|
On 8/29/06, Charles R Harris <cha...@gm...> wrote:

> Speaking of features, I wonder if more of the methods should return
> references. For instance, it might be nice to write something like:
>
> a.sort().searchsorted([...])
>
> instead of making two statements out of it.

+1 for more 'return self' at the end of methods which currently don't return anything (well, we get the default None), as long as it's sensible. I really like this 'message chaining' style of programming, and it annoys me that much of the python stdlib gratuitously prevents it by NOT returning self in places where it would be a perfectly sensible thing to do.

I find it much cleaner to write

x = foo.bar().baz(param).frob()

than

foo.bar()
foo.baz(param)
x = foo.frob()

but perhaps others disagree.

Cheers,

f |
From: Rudolph v. d. M. <rud...@gm...> - 2006-08-29 19:17:24
|
This definitely gets my vote as well (for what it's worth). R. On 8/29/06, Fernando Perez <fpe...@gm...> wrote: > +1 for more 'return self' at the end of methods which currently don't > return anything (well, we get the default None), as long as it's > sensible. I really like this 'message chaining' style of programming, > and it annoys me that much of the python stdlib gratuitously prevents > it by NOT returning self in places where it would be a perfectly > sensible thing to do. > > I find it much cleaner to write > > x = foo.bar().baz(param).frob() > > than > > foo.bar() > foo.baz(param) > x = foo.frob() > > but perhaps others disagree. > > Cheers, > > f -- Rudolph van der Merwe Karoo Array Telescope / Square Kilometer Array - http://www.ska.ac.za |
From: Tim H. <tim...@ie...> - 2006-08-29 19:26:16
|
-0.5 from me if what we're talking about here is having mutating methods return self rather than None. Chaining stuff is pretty, but having methods that mutate self and return self looks like a source of elusive bugs to me. -tim Rudolph van der Merwe wrote: > This definitely gets my vote as well (for what it's worth). > > R. > > On 8/29/06, Fernando Perez <fpe...@gm...> wrote: > >> +1 for more 'return self' at the end of methods which currently don't >> return anything (well, we get the default None), as long as it's >> sensible. I really like this 'message chaining' style of programming, >> and it annoys me that much of the python stdlib gratuitously prevents >> it by NOT returning self in places where it would be a perfectly >> sensible thing to do. >> >> I find it much cleaner to write >> >> x = foo.bar().baz(param).frob() >> >> than >> >> foo.bar() >> foo.baz(param) >> x = foo.frob() >> >> but perhaps others disagree. >> >> Cheers, >> >> f >> > > |
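The trade-off between chaining and the "elusive bugs" concern can be sketched in plain Python. The Bag class and its method names are hypothetical, invented only for this illustration; they are not part of NumPy or of any proposal in the thread.

```python
class Bag:
    """Hypothetical container, used only to illustrate the two conventions."""

    def __init__(self, items):
        self.items = list(items)

    def sort_returning_self(self):
        # Mutates in place AND returns self, which enables chaining.
        self.items.sort()
        return self

    def sort_returning_none(self):
        # Mutates in place and returns None, like list.sort().
        self.items.sort()


a = Bag([3, 1, 2])
b = a.sort_returning_self()   # looks like it might produce a new object...
b.items.append(0)             # ...but this also changes a
alias_is_same_object = b is a
a_items_after = list(a.items)

c = Bag([3, 1, 2])
d = c.sort_returning_none()   # d is None, so misuse fails fast

print(alias_is_same_object, a_items_after, d)
```

With the None-returning convention, code that mistakes the call for a copy tends to fail at the next attribute access instead of silently sharing state.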
From: Charles R H. <cha...@gm...> - 2006-08-29 19:36:36
|
Hi,

On 8/29/06, Tim Hochberg <tim...@ie...> wrote:
> -0.5 from me if what we're talking about here is having mutating methods
> return self rather than None. Chaining stuff is pretty, but having
> methods that mutate self and return self looks like a source of elusive
> bugs to me.
>
> -tim

But how is that any worse than the current mutating operators? I think the operating principle is that methods generally work in place, functions make copies. The exceptions to this rule need to be noted.

Chuck |
From: Tim H. <tim...@ie...> - 2006-08-29 20:01:10
|
Charles R Harris wrote: > Hi, > > On 8/29/06, *Tim Hochberg* <tim...@ie... > <mailto:tim...@ie...>> wrote: > > > -0.5 from me if what we're talking about here is having mutating > methods > return self rather than None. Chaining stuff is pretty, but having > methods that mutate self and return self looks like a source of > elusive > bugs to me. > > -tim > > > But how is that any worse than the current mutating operators? I think > the operating principal is that methods generally work in place, > functions make copies. The exceptions to this rule need to be noted. Is that really the case? I was more under the impression that there wasn't much rhyme nor reason to this. Let's do a quick dir(somearray) and see what we get (I'll strip out the __XXX__ names): 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base', 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', 'item', 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', 'nonzero', 'prod', 'ptp', 'put', 'putmask', 'ravel', 'real', 'repeat', 'reshape', 'resize', 'round', 'searchsorted', 'setfield', 'setflags', 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', 'swapaxes', 'take', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', 'var', 'view' Hmmm. Without taking too much time to go through these one at a time, I'm pretty certain that they do not in general mutate things in place. Probably at least half return, or can return new arrays, sometimes with references to the original data, but new shapes, sometimes with completely new data. In fact, other than sort, I'm not sure which of these does mutate in place. -tim |
From: Charles R H. <cha...@gm...> - 2006-08-29 20:17:34
|
On 8/29/06, Tim Hochberg <tim...@ie...> wrote: > > Charles R Harris wrote: > > Hi, > > > > On 8/29/06, *Tim Hochberg* <tim...@ie... > > <mailto:tim...@ie...>> wrote: > > > > > > -0.5 from me if what we're talking about here is having mutating > > methods > > return self rather than None. Chaining stuff is pretty, but having > > methods that mutate self and return self looks like a source of > > elusive > > bugs to me. > > > > -tim > > > > > > But how is that any worse than the current mutating operators? I think > > the operating principal is that methods generally work in place, > > functions make copies. The exceptions to this rule need to be noted. > Is that really the case? I was more under the impression that there > wasn't much rhyme nor reason to this. Let's do a quick dir(somearray) > and see what we get (I'll strip out the __XXX__ names): > > 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base', > 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', > 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', > 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', 'item', > 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', > 'nonzero', 'prod', 'ptp', 'put', 'putmask', 'ravel', 'real', 'repeat', > 'reshape', 'resize', 'round', 'searchsorted', 'setfield', 'setflags', > 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', 'swapaxes', > 'take', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', 'var', > 'view' There are certainly many methods where inplace operations make no sense. But for such things as conjugate and clip I think it should be preferred. Think of them as analogs of the "+=" operators that allow memory efficient inplace operations. At the moment there are too few such operators, IMHO, and that makes it hard to write memory efficient code when you want to do so. 
If you need a copy, the functional form should be the preferred way to go and can easily be implemented by constructions like a.copy().sort().

> Hmmm. Without taking too much time to go through these one at a time,
> I'm pretty certain that they do not in general mutate things in place.
> Probably at least half return, or can return new arrays, sometimes with
> references to the original data, but new shapes, sometimes with
> completely new data. In fact, other than sort, I'm not sure which of
> these does mutate in place.
>
> -tim

Chuck |
From: Tim H. <tim...@ie...> - 2006-08-29 21:03:52
|
Charles R Harris wrote: > > > On 8/29/06, *Tim Hochberg* <tim...@ie... > <mailto:tim...@ie...>> wrote: > > Charles R Harris wrote: > > Hi, > > > > On 8/29/06, *Tim Hochberg* <tim...@ie... > <mailto:tim...@ie...> > > <mailto:tim...@ie... <mailto:tim...@ie...>>> > wrote: > > > > > > -0.5 from me if what we're talking about here is having mutating > > methods > > return self rather than None. Chaining stuff is pretty, but > having > > methods that mutate self and return self looks like a source of > > elusive > > bugs to me. > > > > -tim > > > > > > But how is that any worse than the current mutating operators? I > think > > the operating principal is that methods generally work in place, > > functions make copies. The exceptions to this rule need to be noted. > Is that really the case? I was more under the impression that there > wasn't much rhyme nor reason to this. Let's do a quick dir(somearray) > and see what we get (I'll strip out the __XXX__ names): > > 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base', > 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', > 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', > 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', > 'item', > 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', > 'nonzero', 'prod', 'ptp', 'put', 'putmask', 'ravel', 'real', > 'repeat', > 'reshape', 'resize', 'round', 'searchsorted', 'setfield', 'setflags', > 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', > 'swapaxes', > 'take', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', > 'var', 'view' > > > There are certainly many methods where inplace operations make no > sense. But for such things as conjugate and clip I think it should be > preferred. Think of them as analogs of the "+=" operators that allow > memory efficient inplace operations. 
> At the moment there are too few
> such operators, IMHO, and that makes it hard to write memory efficient
> code when you want to do so. If you need a copy, the functional form
> should be the preferred way to go and can easily be implement by
> constructions like a.copy().sort().

So let's make this clear: what you are proposing is more than just returning self for more operations. You are proposing changing the meaning of the existing methods to operate in place rather than return new objects. It seems awfully late in the day to be considering this, given that we're on the edge of 1.0 and this could break any existing numpy code that is out there.

Just for grins let's look at the operations that could potentially benefit from being done in place. I think they are:

byteswap
clip
conjugate
round
sort

Of these, clip, conjugate and round support an 'out' argument like that supported by ufuncs; byteswap has a boolean argument telling it whether to perform operations in place; and sort always operates in place. Noting that the ufunc-like methods (max, argmax, etc) appear to support the 'out' argument as well, although it's not documented for most of them, it looks to me as if the two odd methods are byteswap and sort. The method situation could be made more consistent by swapping the boolean inplace flag in byteswap for another 'out' argument and also having sort not operate in place by default, but also supply an out argument there. Thus:

b = a.sort()    # Returns a copy
a.sort(out=a)   # Sorts a in place
a.sort(out=c)   # Sorts a into c (probably just equivalent to c = a.sort() in this case since we don't want to rewrite the sort routines)

On the whole I think that this would be an improvement, but it may be too late in the day to actually implement it since 1.0 is coming up. There would still be a few methods (fill, put, etc) that modify the array in place and return None, but I haven't heard any complaints about those.

-tim

> Hmmm.
> Without taking too much time to go through these one at a time,
> I'm pretty certain that they do not in general mutate things in place.
> Probably at least half return, or can return new arrays, sometimes
> with
> references to the original data, but new shapes, sometimes with
> completely new data. In fact, other than sort, I'm not sure which of
> these does mutate in place.
>
> -tim
>
> Chuck |
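The behaviour Tim describes for the existing methods (an 'out' argument on clip, and a sort method that always works in place) can be checked with a short sketch. It assumes a current NumPy install, so it shows today's behaviour, not necessarily that of the 1.0 betas; the sort(out=...) variant proposed in the message is not shown because it is a proposal, not an existing API.

```python
import numpy as np

a = np.array([3.7, -1.2, 0.4, 2.9])

# Without 'out', clip returns a new array and leaves 'a' alone.
clipped = a.clip(0.0, 1.0)
unchanged = a.tolist()

# With 'out', the result is written into a caller-supplied buffer.
buf = np.empty_like(a)
a.clip(0.0, 1.0, out=buf)

# Passing the array itself as 'out' gives an in-place operation.
a.clip(0.0, 1.0, out=a)

# The sort method is the odd one out: always in place, returns None.
s = np.array([3, 1, 2])
ret = s.sort()

print(clipped, buf, a, s, ret)
```

Reusing a preallocated buffer through 'out' is the memory-saving pattern the thread is weighing against changing what the methods return.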
From: David M. C. <co...@ph...> - 2006-08-29 21:19:55
|
On Tue, 29 Aug 2006 14:03:39 -0700 Tim Hochberg <tim...@ie...> wrote: > Of these, clip, conjugate and round support an 'out' argument like that > supported by ufunces; byteswap has a boolean argument telling it > whether to perform operations in place; and sort always operates in > place. Noting that the ufunc-like methods (max, argmax, etc) appear to > support the 'out' argument as well although it's not documented for most > of them, it looks to me as if the two odd methods are byteswap and sort. > The method situation could be made more consistent by swapping the > boolean inplace flag in byteswapped with another 'out' argument and also > having sort not operate in place by default, but also supply an out > argument there. Thus: > > b = a.sort() # Returns a copy > a.sort(out=a) # Sorts a in place > a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort() > in this case since we don't want to rewrite the sort routines) Ugh. That's completely different semantics from sort() on lists, so I think it would be a source of bugs (at least, it would mean keeping two different ideas of .sort() in my head). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |co...@ph... |
From: Fernando P. <fpe...@gm...> - 2006-08-29 21:25:11
|
On 8/29/06, David M. Cooke <co...@ph...> wrote: > On Tue, 29 Aug 2006 14:03:39 -0700 > Tim Hochberg <tim...@ie...> wrote: > > b = a.sort() # Returns a copy > > a.sort(out=a) # Sorts a in place > > a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort() > > in this case since we don't want to rewrite the sort routines) > > Ugh. That's completely different semantics from sort() on lists, so I think > it would be a source of bugs (at least, it would mean keeping two different > ideas of .sort() in my head). Agreed. Except where very well justified (such as slicing returning views for memory reasons), let's keep numpy arrays similar to native lists in their behavior... Special cases aren't special enough to break the rules. and all that :) Cheers, f |
From: Charles R H. <cha...@gm...> - 2006-08-29 21:20:29
|
Hi Tim, On 8/29/06, Tim Hochberg <tim...@ie...> wrote: > > Charles R Harris wrote: > > > > > > On 8/29/06, *Tim Hochberg* <tim...@ie... > > <mailto:tim...@ie...>> wrote: > > > > Charles R Harris wrote: > > > Hi, > > > > > > On 8/29/06, *Tim Hochberg* <tim...@ie... > > <mailto:tim...@ie...> > > > <mailto:tim...@ie... <mailto:tim...@ie...>>> > > wrote: > > > > > > > > > -0.5 from me if what we're talking about here is having > mutating > > > methods > > > return self rather than None. Chaining stuff is pretty, but > > having > > > methods that mutate self and return self looks like a source > of > > > elusive > > > bugs to me. > > > > > > -tim > > > > > > > > > But how is that any worse than the current mutating operators? I > > think > > > the operating principal is that methods generally work in place, > > > functions make copies. The exceptions to this rule need to be > noted. > > Is that really the case? I was more under the impression that there > > wasn't much rhyme nor reason to this. Let's do a quick > dir(somearray) > > and see what we get (I'll strip out the __XXX__ names): > > > > 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base', > > 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', > 'copy', > > 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', > > 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', > > 'item', > > 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', > > 'nonzero', 'prod', 'ptp', 'put', 'putmask', 'ravel', 'real', > > 'repeat', > > 'reshape', 'resize', 'round', 'searchsorted', 'setfield', > 'setflags', > > 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', > > 'swapaxes', > > 'take', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', > > 'var', 'view' > > > > > > There are certainly many methods where inplace operations make no > > sense. But for such things as conjugate and clip I think it should be > > preferred. 
Think of them as analogs of the "+=" operators that allow > > memory efficient inplace operations. At the moment there are too few > > such operators, IMHO, and that makes it hard to write memory efficient > > code when you want to do so. If you need a copy, the functional form > > should be the preferred way to go and can easily be implement by > > constructions like a.copy().sort(). > So let's make this clear; what you are proposing is more that just > returning self for more operations. You are proposing changing the > meaning of the existing methods to operate in place rather than return > new objects. It seems awfully late in the day to be considering this > being that we're on the edge of 1.0 and this would could break any > existing numpy code that is out there. > > Just for grins let's look at the operations that could potentially > benefit from being done in place. I think they are: > byteswap > clip > conjugate > round > sort > > Of these, clip, conjugate and round support an 'out' argument like that > supported by ufunces; byteswap has a boolean argument telling it > whether to perform operations in place; and sort always operates in > place. Noting that the ufunc-like methods (max, argmax, etc) appear to > support the 'out' argument as well although it's not documented for most > of them, it looks to me as if the two odd methods are byteswap and sort. > The method situation could be made more consistent by swapping the > boolean inplace flag in byteswapped with another 'out' argument and also > having sort not operate in place by default, but also supply an out > argument there. Thus: > > b = a.sort() # Returns a copy > a.sort(out=a) # Sorts a in place > a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort() > in this case since we don't want to rewrite the sort routines) > > On the whole I think that this would be an improvement, but it may be > too late in the day to actually implement it since 1.0 is coming up. 
> There would still be a few methods (fill, put, etc) that modify the
> array in place and return None, but I haven't heard any complaints about
> those.
>
> -tim

That sounds like a good idea. One could keep the present behaviour in most cases by supplying a default value, although the out keyword might need a None value to indicate "copy" and a 'Self' value that means in place, or something like that, and then have all reasonable methods return values. That way the change would be transparent. The changes to the sort method would all be upper level; the low level sorting routines would remain unchanged. Methods are new, so code that needs to be changed is code specifically written for NumPy, and now is the time to make these sorts of decisions.

Chuck |
From: Alan G I. <ai...@am...> - 2006-08-29 21:38:28
|
On Tue, 29 Aug 2006, Tim Hochberg apparently wrote:
> b = a.sort() # Returns a copy

Given the extant Python vocabulary, this seems like a bad idea to me. (Better to call it 'sorted' in this case.)

fwiw,
Alan Isaac |
From: Alan G I. <ai...@am...> - 2006-08-29 19:40:55
|
On Tue, 29 Aug 2006, Tim Hochberg apparently wrote:
> -0.5 from me if what we're talking about here is having
> mutating methods return self rather than None. Chaining
> stuff is pretty, but having methods that mutate self and
> return self looks like a source of elusive bugs to me.

I believe this reasoning was the basis of sort (method, returns None) and sorted (function, returns new object) in Python.

I believe that was a long and divisive discussion ...

Cheers,
Alan Isaac |
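The split between the sort method and the sorted function mentioned above can be seen directly with a built-in list. This is a minimal sketch in plain Python with made-up data.

```python
data = [3, 1, 2]

# sorted() is a function: it returns a new list and leaves 'data' alone.
new_list = sorted(data)
snapshot = list(data)      # still in the original order

# list.sort() is a method: it mutates in place and returns None,
# which signals that no new object was created.
ret = data.sort()

print(new_list, snapshot, data, ret)
```

Current NumPy follows the same split: np.sort(a) returns a sorted copy, while a.sort() sorts in place and returns None.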
From: Travis O. <oli...@ee...> - 2006-08-29 20:43:32
|
Tim Hochberg wrote:
> -0.5 from me if what we're talking about here is having mutating methods
> return self rather than None. Chaining stuff is pretty, but having
> methods that mutate self and return self looks like a source of elusive
> bugs to me.

I'm generally +0 on this idea (it seems like the clarity in writing comes largely for interactive users), and don't see much difficulty in separating the constructs. On the other hand, I don't see much problem in returning a reference to self either.

I guess you are worried about the situation where you write

b = a.sort()

and think you have a new array, but in fact have a new reference to the already-altered 'a'?

Hmm... So, how is this different from the fact that

b = a[1:10:3]

already returns a reference to 'a'? (I suppose it differs in that it actually returns a new object, just one that happens to share the same data with a.)

However, I suppose that other methods don't return a reference to an already-altered object, do they?

Tim's argument has moved me from +0 to -0.

-Travis |
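The distinction Travis draws, a new object that shares data versus a second name for the same object, can be made concrete with a short sketch. It assumes a current NumPy install; np.shares_memory is a modern helper and was not part of the NumPy being discussed in the thread.

```python
import numpy as np

a = np.arange(12)

# Slicing returns a NEW array object that is a view on a's data.
b = a[1:10:3]                  # elements at indices 1, 4, 7

is_same_object = b is a        # different Python objects
shares = np.shares_memory(a, b)

# Writing through the view is visible in the original array.
b[0] = 100

print(is_same_object, shares, a[1], b.tolist())
```

A view is a distinct object with its own shape and strides, whereas a method that returns self hands back the very object that was just mutated.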
From: Travis O. <oli...@ee...> - 2006-08-29 20:36:25
|
Charles R Harris wrote:
> The 1.0rc1 release of NumPy will be mid September I suspect.
>
> Also, I recognize that the default-axis switch is a burden for
> those who
> have already transitioned code to use NumPy (for those just
> starting out
> it's not a big deal because of the compatibility layer).
>
> I am curious as to why you made this switch. Not complaining, mind.

Newcomers to NumPy asked why there were different conventions on the methods and the functions for the axis argument. The only reason was backward compatibility. Because we had already created a compatibility layer for code transitioning, that seemed like a weak reason to keep the current behavior.

The problem is it left early NumPy adopters (including me :-) ) in a bit of a bind when it comes to code (like SciPy) that had already been converted. Arguments like Fernando's ("it's better to have a bit of pain now than regrets later") were also convincing.

-Travis |
From: Charles R H. <cha...@gm...> - 2006-08-29 19:25:17
|
Hi Fernando,

On 8/29/06, Fernando Perez <fpe...@gm...> wrote:
> On 8/29/06, Charles R Harris <cha...@gm...> wrote:
>
> > Speaking of features, I wonder if more of the methods should return
> > references. For instance, it might be nice to write something like:
> >
> > a.sort().searchsorted([...])
> >
> > instead of making two statements out of it.
>
> +1 for more 'return self' at the end of methods which currently don't
> return anything (well, we get the default None), as long as it's
> sensible. I really like this 'message chaining' style of programming,
> and it annoys me that much of the python stdlib gratuitously prevents
> it by NOT returning self in places where it would be a perfectly
> sensible thing to do.

My pet peeve example: a.reverse()

I would also like to see simple methods for the "+=" operator and such. Then one could write

x = a.copy().add(10)

One could make a whole reverse polish translator out of such operations and a few parentheses. I have in mind some sort of code optimizer.

Chuck |
From: David M. C. <co...@ph...> - 2006-08-29 21:21:42
|
On Tue, 29 Aug 2006 13:25:14 -0600
"Charles R Harris" <cha...@gm...> wrote:

> Hi Fernando,
>
> On 8/29/06, Fernando Perez <fpe...@gm...> wrote:
> >
> > On 8/29/06, Charles R Harris <cha...@gm...> wrote:
> >
> > > Speaking of features, I wonder if more of the methods should return
> > > references. For instance, it might be nice to write something like:
> > >
> > > a.sort().searchsorted([...])
> > >
> > > instead of making two statements out of it.
> >
> > +1 for more 'return self' at the end of methods which currently don't
> > return anything (well, we get the default None), as long as it's
> > sensible. I really like this 'message chaining' style of programming,
> > and it annoys me that much of the python stdlib gratuitously prevents
> > it by NOT returning self in places where it would be a perfectly
> > sensible thing to do.

-1, for the same reasons l.sort() doesn't (for a list l). For lists, the reason .sort() returns None is that it makes it clear it's a mutation. Returning self would make it look like it was doing a copy.

> My pet peeve example: a.reverse()
>
> I would also like to see simple methods for "+=" operator and such. Then one
> could write
>
> x = a.copy().add(10)

There are:

x = a.copy().__add__(10)

or, for +=:

x.__iadd__(10)

> One could make a whole reverse polish translator out of such operations and
> a few parenthesis. I have in mind some sort of code optimizer.

It wouldn't be any more efficient than the other way. For a code optimizer, you'll either have to parse the Python code or use special objects (much like numexpr does), and then you might as well use the operators.

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph... |
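The special methods David mentions, and the ufunc 'out' argument discussed elsewhere in the thread, can be compared in a small sketch. It assumes a current NumPy install; the array values are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])

# __add__ is what the + operator calls: it builds a new array.
x = a.copy().__add__(10)       # same result as a.copy() + 10

# += (that is, __iadd__) updates the existing array in place.
y = a.copy()
y_before = y
y += 10
same_object_after_iadd = y is y_before

# The ufunc form with out= is another way to avoid a new allocation.
z = a.copy()
np.add(z, 10, out=z)

print(x, y, z, a, same_object_after_iadd)
```

In everyday code the operators are the readable spelling; the explicit method calls are shown here only because the message uses them.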
From: Christopher B. <Chr...@no...> - 2006-08-29 20:49:30
|
Fernando Perez wrote:
> more 'return self' at the end of methods which currently don't
> return anything (well, we get the default None), as long as it's
> sensible.

+1

Though I'm a bit hesitant: if it's really consistent that methods that alter the object in place NEVER return themselves, then there is something to be said for that.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no... |
From: Tim H. <tim...@ie...> - 2006-08-29 21:49:35
|
David M. Cooke wrote:
> On Tue, 29 Aug 2006 14:03:39 -0700
> Tim Hochberg <tim...@ie...> wrote:
>
>> Of these, clip, conjugate and round support an 'out' argument like that
>> supported by ufunces; byteswap has a boolean argument telling it
>> whether to perform operations in place; and sort always operates in
>> place. Noting that the ufunc-like methods (max, argmax, etc) appear to
>> support the 'out' argument as well although it's not documented for most
>> of them, it looks to me as if the two odd methods are byteswap and sort.
>> The method situation could be made more consistent by swapping the
>> boolean inplace flag in byteswapped with another 'out' argument and also
>> having sort not operate in place by default, but also supply an out
>> argument there. Thus:
>>
>> b = a.sort() # Returns a copy
>> a.sort(out=a) # Sorts a in place
>> a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort()
>> in this case since we don't want to rewrite the sort routines)
>
> Ugh. That's completely different semantics from sort() on lists, so I think
> it would be a source of bugs (at least, it would mean keeping two different
> ideas of .sort() in my head).

Thinking about it a bit more, I'd leave sort alone (returning None and all). I was (over)reacting to changing sort to return self, which makes the set of methods less consistent within itself, less consistent with Python, and more error prone IMO, which seems the worst possibility.

For the moment at least I do stand by the suggestion of changing byteswap to match the rest of the methods, as that would remove one outlier in the set of methods.

-tim |
From: Charles R H. <cha...@gm...> - 2006-08-29 22:42:25
|
On 8/29/06, Tim Hochberg <tim...@ie...> wrote:
>
> David M. Cooke wrote:
> > On Tue, 29 Aug 2006 14:03:39 -0700
> > Tim Hochberg <tim...@ie...> wrote:
> >
> >> Of these, clip, conjugate and round support an 'out' argument like that
> >> supported by ufuncs; byteswap has a boolean argument telling it
> >> whether to perform operations in place; and sort always operates in
> >> place. Noting that the ufunc-like methods (max, argmax, etc) appear to
> >> support the 'out' argument as well although it's not documented for most
> >> of them, it looks to me as if the two odd methods are byteswap and sort.
> >> The method situation could be made more consistent by swapping the
> >> boolean inplace flag in byteswap with another 'out' argument and also
> >> having sort not operate in place by default, but also supply an out
> >> argument there. Thus:
> >>
> >> b = a.sort() # Returns a copy
> >> a.sort(out=a) # Sorts a in place
> >> a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort()
> >> in this case since we don't want to rewrite the sort routines)
> >>
> >
> > Ugh. That's completely different semantics from sort() on lists, so I think
> > it would be a source of bugs (at least, it would mean keeping two different
> > ideas of .sort() in my head).
> >
> Thinking about it a bit more, I'd leave sort alone (returning None and
> all). I was (over)reacting to changing sort to return self, which
> makes the set of methods less consistent within itself, less
> consistent with Python and more error prone IMO, which seems the worst
> possibility.

Here is Guido on sort:

I'd like to explain once more why I'm so adamant that sort() shouldn't
return 'self'.

This comes from a coding style (popular in various other languages, I
believe especially Lisp revels in it) where a series of side effects
on a single object can be chained like this:

    x.compress().chop(y).sort(z)

which would be the same as

    x.compress()
    x.chop(y)
    x.sort(z)

I find the chaining form a threat to readability; it requires that the
reader must be intimately familiar with each of the methods. The
second form makes it clear that each of these calls acts on the same
object, and so even if you don't know the class and its methods very
well, you can understand that the second and third call are applied to
x (and that all calls are made for their side-effects), and not to
something else.

I'd like to reserve chaining for operations that return new values,
like string processing operations:

    y = x.rstrip("\n").split(":").lower()

There are a few standard library modules that encourage chaining of
side-effect calls (pstat comes to mind). There shouldn't be any new
ones; pstat slipped through my filter when it was weak.

So it seems you are correct in light of the Python philosophy. For those
operators that allow specification of out I would still like to see a
special value that means inplace; I think it would make the code clearer.
Of course, merely having the out flag violates Guido's intent. The idea
seems to be that we want some way to avoid allocating new memory. So maybe
byteswap should be inplace and return None, while a copyto method could be
added. Then one would do

    a.copyto(b)
    b.byteswap()

instead of

    b = a.byteswap()

> -tim

Chuck
|
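To make the distinction in the quoted text concrete, here is a small sketch with builtin types only. The sample string and list are made up for illustration, and the order of the string calls is adjusted from the quoted example so that every intermediate value is still a string (a list has no lower() method):

```python
# Chaining value-returning calls: each step yields a new object.
line = "Alpha:Beta:Gamma\n"
fields = line.rstrip("\n").lower().split(":")
print(fields)  # ['alpha', 'beta', 'gamma']

# Side-effect calls written one per line; all of them act on `items`.
items = [3, 1, 2]
items.append(0)
items.sort()
items.reverse()
print(items)  # [3, 2, 1, 0]

# Because these methods return None, trying to chain them fails loudly.
chain_error = None
try:
    [3, 1, 2].append(0).sort()
except AttributeError as exc:
    chain_error = exc
    print("chaining failed:", exc)
```

Returning None from the mutating methods is what turns an accidental chain into an immediate error rather than a quiet aliasing bug.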
From: Charles R H. <cha...@gm...> - 2006-08-29 23:17:37
|
On 8/29/06, Charles R Harris <cha...@gm...> wrote:
>
> On 8/29/06, Tim Hochberg <tim...@ie...> wrote:
> >
> > David M. Cooke wrote:
> > > On Tue, 29 Aug 2006 14:03:39 -0700
> > > Tim Hochberg <tim...@ie...> wrote:
> > >
> > >> Of these, clip, conjugate and round support an 'out' argument like that
> > >> supported by ufuncs; byteswap has a boolean argument telling it
> > >> whether to perform operations in place; and sort always operates in
> > >> place. Noting that the ufunc-like methods (max, argmax, etc) appear to
> > >> support the 'out' argument as well although it's not documented for most
> > >> of them, it looks to me as if the two odd methods are byteswap and sort.
> > >> The method situation could be made more consistent by swapping the
> > >> boolean inplace flag in byteswap with another 'out' argument and also
> > >> having sort not operate in place by default, but also supply an out
> > >> argument there. Thus:
> > >>
> > >> b = a.sort() # Returns a copy
> > >> a.sort(out=a) # Sorts a in place
> > >> a.sort(out=c) # Sorts a into c (probably just equivalent to c = a.sort()
> > >> in this case since we don't want to rewrite the sort routines)
> > >>
> > >
> > > Ugh. That's completely different semantics from sort() on lists, so I think
> > > it would be a source of bugs (at least, it would mean keeping two different
> > > ideas of .sort() in my head).
> > >
> > Thinking about it a bit more, I'd leave sort alone (returning None and
> > all). I was (over)reacting to changing sort to return self, which
> > makes the set of methods less consistent within itself, less
> > consistent with Python and more error prone IMO, which seems the worst
> > possibility.
>
> Here is Guido on sort:
>
> I'd like to explain once more why I'm so adamant that sort() shouldn't
> return 'self'.
>
> This comes from a coding style (popular in various other languages, I
> believe especially Lisp revels in it) where a series of side effects
> on a single object can be chained like this:
>
>     x.compress().chop(y).sort(z)
>
> which would be the same as
>
>     x.compress()
>     x.chop(y)
>     x.sort(z)
>
> I find the chaining form a threat to readability; it requires that the
> reader must be intimately familiar with each of the methods. The
> second form makes it clear that each of these calls acts on the same
> object, and so even if you don't know the class and its methods very
> well, you can understand that the second and third call are applied to
> x (and that all calls are made for their side-effects), and not to
> something else.
>
> I'd like to reserve chaining for operations that return new values,
> like string processing operations:
>
>     y = x.rstrip("\n").split(":").lower()
>
> There are a few standard library modules that encourage chaining of
> side-effect calls (pstat comes to mind). There shouldn't be any new
> ones; pstat slipped through my filter when it was weak.
>
> So it seems you are correct in light of the Python philosophy. For those
> operators that allow specification of out I would still like to see a
> special value that means inplace; I think it would make the code clearer.
> Of course, merely having the out flag violates Guido's intent. The idea
> seems to be that we want some way to avoid allocating new memory. So maybe
> byteswap should be inplace and return None, while a copyto method could be
> added. Then one would do
>
>     a.copyto(b)
>     b.byteswap()
>
> instead of
>
>     b = a.byteswap()
>

To expand on this a bit: Guido's philosophy, combined with a desire for
memory efficiency, means that methods like byteswap and clip, which use
the same memory, should operate inplace and return None. Thus, instead of

    b = a.clip(...)

use

    b = a.copy()
    b.clip(...)

Hey, it's a RISC machine. If we did this, then functions could always
return copies:

    b = clip(a,...)

Chuck
|
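The split proposed above (methods mutate and return None, functions return copies) can be tried out with the standard library's array module, whose byteswap() already works in place and returns None. This is only a sketch of the convention, not NumPy code: the array type is a stand-in for ndarray, and byteswapped is a hypothetical helper name invented for the example.

```python
from array import array


def byteswapped(src):
    """Hypothetical function-style helper: always returns a swapped copy."""
    dst = array(src.typecode, src)  # explicit copy of the data
    dst.byteswap()                  # array.byteswap() swaps in place, returns None
    return dst


a = array("H", [1, 2, 258])

# Method style: copy first, then mutate the copy in place.
b = array(a.typecode, a)
result = b.byteswap()
print(result)  # None

# Function style: the copy happens inside the helper.
c = byteswapped(a)

print(a.tolist() == [1, 2, 258])  # True: a is untouched either way
print(b == c)                     # True: both routes give the same data
```

The method route costs one extra line, but the allocation is visible at the call site, which seems to be the memory-efficiency point being made.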
From: Fernando P. <fpe...@gm...> - 2006-08-29 23:24:55
|
On 8/29/06, Travis Oliphant <oli...@ie...> wrote:
>
> Hi all,
>
> Classes start for me next Tuesday, and I'm teaching a class for which I
> will be using NumPy / SciPy extensively. I need to have a release of
> these two (and hopefully matplotlib) that work with each other.
>
> Therefore, I'm going to make a 1.0b5 release of NumPy over the weekend
> (probably Monday), and also get a release of SciPy out as well. At that
> point, I'll only be available for bug-fixes to 1.0. Therefore, the next
> release after 1.0b5 I would like to be 1.0rc1 (release-candidate 1).

What's the status of these 'overwriting' messages?

planck[/tmp]> python -c 'import scipy;scipy.test()'
Overwriting info=<function info at 0x40ba748c> from scipy.misc (was
<function info at 0x4080409c> from numpy.lib.utils)
Overwriting fft=<function fft at 0x430ae33c> from scipy.fftpack.basic
(was <module 'numpy.fft' from
'/home/fperez/tmp/local/lib/python2.3/site-packages/numpy/fft/__init__.pyc'>
from /home/fperez/tmp/local/lib/python2.3/site-packages/numpy/fft/__init__.pyc)
...

I was under the impression you'd decided to quiet them out, but they
seem to be making a comeback.

Cheers,

f
|
From: Darren D. <dd...@co...> - 2006-08-31 14:37:27
|
On Tuesday 29 August 2006 19:24, Fernando Perez wrote:
> On 8/29/06, Travis Oliphant <oli...@ie...> wrote:
> > Hi all,
> >
> > Classes start for me next Tuesday, and I'm teaching a class for which I
> > will be using NumPy / SciPy extensively. I need to have a release of
> > these two (and hopefully matplotlib) that work with each other.
> >
> > Therefore, I'm going to make a 1.0b5 release of NumPy over the weekend
> > (probably Monday), and also get a release of SciPy out as well. At that
> > point, I'll only be available for bug-fixes to 1.0. Therefore, the next
> > release after 1.0b5 I would like to be 1.0rc1 (release-candidate 1).
>
> What's the status of these 'overwriting' messages?
>
> planck[/tmp]> python -c 'import scipy;scipy.test()'
> Overwriting info=<function info at 0x40ba748c> from scipy.misc (was
> <function info at 0x4080409c> from numpy.lib.utils)
> Overwriting fft=<function fft at 0x430ae33c> from scipy.fftpack.basic
> (was <module 'numpy.fft' from
> '/home/fperez/tmp/local/lib/python2.3/site-packages/numpy/fft/__init__.pyc'>
> from
> /home/fperez/tmp/local/lib/python2.3/site-packages/numpy/fft/__init__.pyc)
> ...
>
> I was under the impression you'd decided to quiet them out, but they
> seem to be making a comeback.

Will these messages be included in NumPy-1.0?
|