From: Alexander S. <a.s...@gm...> - 2002-06-12 22:50:02
Rick White <rl...@st...> writes:

> Here is what I see as the fundamental problem with implementing slicing in numarray using copy-on-demand instead of views.
>
> Copy-on-demand requires the maintenance of a global list of all the active views associated with a particular array buffer. Here is a simple example:
>
> >>> a = zeros((5000,5000))
> >>> b = a[49:51,50]
> >>> c = a[51:53,50]
> >>> a[50,50] = 1
>
> The assignment to a[50,50] must trigger a copy of the array b; otherwise b also changes. On the other hand, array c does not need to be copied since its view does not include element 50,50. You could instead copy the array a -- but that means copying a 100 Mbyte array while leaving the original around (since b and c are still using it) -- not a good idea!

Sure, if one wants to perform only the *minimum* amount of copying, things can get rather tricky, but wouldn't it be satisfactory for most cases if attempted modification of the original triggered the delayed copying of the "views" (lazy copies)? In those cases where it isn't satisfactory, the user could still explicitly create real (i.e. alias-only) views.

> The bookkeeping can get pretty messy (if you care about memory usage, which we definitely do). Consider this case:
>
> >>> a = zeros((5000,5000))
> >>> b = a[0:-10,0:-10]
> >>> c = a[49:51,50]
> >>> del a
> >>> b[50,50] = 1
>
> Now what happens? Either we can copy the array for b (which means two

``b`` and ``c`` are copied and then ``a`` is deleted. What does numarray currently keep of a if I do something like the above, or:

>>> b = a.flat[::-10000]
>>> del a

?

> copies of the huge (5000,5000) array exist, one used by c and the new version used by b), or we can be clever and copy c instead.
>
> Even keeping track of the views associated with a buffer doesn't solve the problem of an array that is passed to a C extension and is modified in place. It would seem that passing an array into a C extension would always require all the associated views to be turned into copies. Otherwise we can't guarantee that views won't be modified.

Yes -- but only if the C extension is destructive. In that case the user might well be making a mistake in current Numeric if he has views and doesn't want them to be modified by the operation (of course he might know that the in-place operation does not affect the view(s) -- but wouldn't such cases be rather rare?). If he *does* want the views to be modified, he would obviously have to specify them explicitly as such in a copy-on-demand scheme, and in the other case he has most likely been prevented from making an error (and can still explicitly use real views if he knows that the in-place operation on the original will not have undesired effects on the "views").

> This kind of state information with side effects leads to a system that is hard to develop, hard to debug, and really messes up the behavior of the program (IMHO). It is *highly* desirable to avoid it if possible.

Sure, copy-on-demand is an optimization, and optimizations always mess things up. On the other hand, some optimizations also make "nicer" (e.g. less error-prone) semantics computationally viable, so it's often a question of ease and clarity of the implementation vs. ease and clarity of the code that uses it. I'm not denying that too much complexity in the implementation also adversely affects users in the form of bugs, and that in the particular case of delayed copying the user can also be affected directly by harder-to-understand resource usage behavior (e.g. a[0] = 1 triggering a monstrous copying operation).

Just out of curiosity, has someone already asked the Octave people how much trouble it has caused them to implement copy-on-demand, and whether Matlab/Octave users in practice experience difficulties because of the harder-to-predict runtime behavior (I think, like Matlab, Octave does copy-on-demand)?

> This is not to deny that copy-on-demand (with explicit views available on request) would have some desirable advantages for the behavior of the system. But we've worried these issues to death, and in the end were convinced that slices == views provided the best compromise between the desired behavior and a clean implementation.

If implementing copy-on-demand is too difficult and the resulting code would be too messy, then this is certainly a valid reason to compromise on the current slicing behavior (especially since people like me who'd like to see copy-on-demand are unlikely to volunteer to implement it :)

> Rick
>
> ------------------------------------------------------------------
> Richard L. White rl...@st... http://sundog.stsci.edu/rick/
> Space Telescope Science Institute
> Baltimore, MD

alex

--
Alexander Schmolck Postgraduate Research Student
Department of Computer Science
University of Exeter
A.S...@gm... http://www.dcs.ex.ac.uk/people/aschmolc/
From: Paul F D. <pa...@pf...> - 2002-06-12 22:42:28
The users of Numeric at PCMDI found the 'view' semantics so annoying that they insisted their CS staff write a separate version of Numeric just to avoid it. We have since gotten out of that mess, but that is the reason MA has copy semantics. Again, this is another issue where one is fighting over the right to 'own' the operator notation. I believe that copy semantics should win this one because it is a **proven fact** that scientists trip over it, and it is consistent with Python list semantics. People who really need view semantics could get it, as previously suggested by someone, with something like x.sub[10:12, :]. There are now dead horses all over the landscape, and I for one am going to shut up.

> -----Original Message-----
> From: num...@li... [mailto:num...@li...] On Behalf Of Paul Barrett
> Sent: Wednesday, June 12, 2002 8:54 AM
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] RE: default axis for numarray
>
> eric jones wrote:
> > I think the consistency with Python is less of an issue than it seems. I wasn't aware that add.reduce(x) would generate the same results as the Python version of reduce(add,x) until Perry pointed it out to me. There are some inconsistencies between Python the language and Numeric because of the needs of the Numeric community. For instance, slices create views instead of copies as in Python. This was a correct break with consistency in a very utilized area of Python because of efficiency.
>
> <Begin Rant>
>
> I think consistency is an issue, particularly for novices. You cite the issue of slices creating views instead of copies as being the correct choice. But this decision is based solely on the perception that views are 'inherently' more efficient than copies and not on reasons of consistency or usability. I (a seasoned user) find view behavior to be annoying and have been caught out on this several times. For example, reversing in-place the elements of an array using slices, i.e. A = A[::-1], will give the wrong answer unless you explicitly make a copy before doing the assignment. Whereas copy behavior will do the right thing. I suggest that many novices will be caught out by this and similar examples, as I have been. Copy behavior for slices can be just as efficient as view behavior, if implemented as copy-on-write.
>
> The beauty of Python is that it allows the developer to spend much more time on consistency and usability issues than on implementation issues. Sadly, I think much of Numeric development is based solely on implementation issues to the detriment of consistency and usability.
>
> I don't have enough experience to definitely say whether axis=0 should be preferred over axis=-1 or vice versa. But it does appear that for the most general cases axis=0 is probably preferred. This is the default for the APL and J programming languages on which Numeric is based. Should we not continue to follow their lead? It might be nice to see a list of examples where axis=0 is the preferred default and the same for axis=-1.
>
> <End Rant>
>
> --
> Paul Barrett, PhD Space Telescope Science Institute
> Phone: 410-338-4475 ESS/Science Software Group
> FAX: 410-338-4767 Baltimore, MD 21218
>
> _______________________________________________________________
> Sponsored by: ThinkGeek at http://www.ThinkGeek.com/
> _______________________________________________
> Numpy-discussion mailing list Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
From: Tim H. <tim...@ie...> - 2002-06-12 20:37:26
From: "Chris Barker" <Chr...@no...>

> I imagine there is a compelling reason that "and" and "or" have not been overridden like the comparison operators, but it sure would be nice!

Because it's not possible? "and" and "or" operate on the basis of the truth of their arguments, so the only way you can affect them is to override __nonzero__. Since this is a unary operation, there is no way to get the equivalent of logical_and out of it.

In practice I haven't found this to be much of a problem. Nearly every time I need to and two arrays together, "&" works just as well as logical_and. I can certainly imagine cases where this isn't true, I just haven't run into them in practice.

-tim
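Tim's point can be checked directly. The sketch below uses modern NumPy rather than the Numeric of 2002 (an assumption on my part); note that where old Numeric silently returned the second operand for ``a and b``, today's NumPy refuses to reduce a multi-element array to a single truth value and raises instead:

```python
import numpy as np

a = np.array([True, True])
b = np.array([True, False])

# Element-wise logical AND: these two agree, as Tim says "&" usually does.
print(np.logical_and(a, b))   # [ True False]
print(a & b)                  # [ True False]

# "a and b" first asks bool(a), which is ambiguous for a 2-element array,
# so modern NumPy raises rather than guessing (old Numeric returned b).
try:
    result = a and b
except ValueError as e:
    print("'and' raised:", e)
```

The underlying reason is exactly the one Tim gives: ``and`` is wired to truth-testing (``__nonzero__``/``__bool__``), a unary hook, so no binary element-wise meaning can be attached to it.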
From: Chris B. <Chr...@no...> - 2002-06-12 20:28:19
Reggie Dugard wrote:

> This is not, in fact, a bug although I've fallen prey to the same mistake myself. I'm assuming what you really wanted was to use logical_and:
> So the "and" is just returning its second argument, since both arguments are considered "True" (containing at least 1 "True" element).

I imagine there is a compelling reason that "and" and "or" have not been overridden like the comparison operators, but it sure would be nice!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chr...@no...
From: Rick W. <rl...@st...> - 2002-06-12 20:24:48
On 12 Jun 2002, Travis Oliphant wrote:

> I'd be interested to know what IDL does? Does it compare complex numbers?

Well, that was an interesting question with a surprising answer (at least to me, a long-time IDL user):

(1) IDL allows comparisons of complex numbers using equality and inequality, but attempts to compare using GT, LT, etc. cause an illegal exception.

(2) IDL sorts complex numbers by amplitude. It ignores the phase. Numbers with the same amplitude and different phases are randomly ordered depending on their positions in the original array.

> Matlab allows comparisons of complex numbers but just compares the real part. I think this is reasonable. Often during a calculation of limited precision one ends up with a complex number when the result is in a "mathematically pure sense" real.

So neither IDL nor Matlab has what I consider the desirable feature that the sort order be unique, at least to the extent that equal values wind up next to each other in the sorted array. (Sorting by real value and then, for equal real values, by imaginary value would accomplish that.) Since complex numbers can't be fully ordered, there is no single comparison function that can be plugged into a standard sort algorithm and give that result -- it would require a special complex sort algorithm.

I guess if neither of the major array processing systems (that I know about) has this property in its complex sorts, it must not be *that* important. And since I've been using IDL for 13 years without discovering that complex greater-than comparisons are illegal, I guess that must not be an important property either (at least to me :-).

My conclusion now is similar to Paul Dubois's suggestion -- we should allow equality comparisons and sorting. Beyond that I guess whatever other people want should carry the day, since it clearly doesn't matter to the sorts of things that I do with Numeric!

Rick
From: Travis O. <oli...@ie...> - 2002-06-12 19:02:24
I'd be interested to know what IDL does? Does it compare complex numbers?

Matlab allows comparisons of complex numbers but just compares the real part. I think this is reasonable. Often during a calculation of limited precision one ends up with a complex number when the result is in a "mathematically pure sense" real.

I guess I trust the user to realize that if they are comparing numbers they know what they mean --- (only real numbers are compared, so the complex part is ignored).

-Travis
From: Reggie D. <re...@me...> - 2002-06-12 18:55:35
This is not, in fact, a bug although I've fallen prey to the same mistake myself. I'm assuming what you really wanted was to use logical_and:

Python 2.2.1 (#1, Apr 29 2002, 15:21:53)
[GCC 3.0.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Numeric import *
>>> a = array((1,1), 'b')
>>> b = array((1,0), 'b')
>>> logical_and(a,b)
array([1, 0],'b')
>>> logical_and(b,a)
array([1, 0],'b')

From the python documentation: "The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned."

So the "and" is just returning its second argument, since both arguments are considered "True" (containing at least 1 "True" element).

On Tue, 2002-06-11 at 23:27, Geza Groma wrote:

> Using Numeric-21.0.win32-py2.2 I found this:
>
> Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Numeric import *
> >>> a = array((1, 1), 'b')
> >>> b = array((1, 0), 'b')
> >>> a and b
> array([1, 0],'b')
> >>> b and a
> array([1, 1],'b')
>
> It looks like a bug, or at least very weird. a&b and b&a work correctly.
>
> --
> Géza Groma
> Institute of Biophysics,
> Biological Research Center of Hungarian Academy of Sciences
> Temesvári krt. 62.
> 6726 Szeged
> Hungary
> phone: +36 62 432 232
> fax: +36 62 433 133

Reggie Dugard
Merfin, LLC
From: Konrad H. <hi...@cn...> - 2002-06-12 18:10:34
Paul Barrett <Ba...@st...> writes:

> I think consistency is an issue, particularly for novices. You cite ...

Finally a contribution that I can fully agree with :-)

> I don't have enough experience to definitely say whether axis=0 should be preferred over axis=-1 or vice versa. But it does appear that for the most general cases axis=0 is probably preferred. This is the default for the APL and J programming languages on which Numeric is based. Should we not continue to follow their lead? It might be nice to see

This is the internal logic I referred to briefly earlier, but I didn't have the time to explain it in more detail. Now I have :-)

The basic idea is that an array is seen as an array of array values. The N dimensions are split into two parts: the first N1 dimensions describe the shape of the "total" array, and the remaining N2 = N - N1 dimensions describe the shape of the array-valued elements of the array. I suppose some examples will help:

- A rank-1 array could be seen either as a vector of scalars (N1 = 1) or as a scalar containing a vector (N1 = 0); in practice there is no difference between these views.
- A rank-2 array could be seen as a matrix (N1 = 2), as a vector of vectors (N1 = 1), or as a scalar containing a matrix (N1 = 0). The first and the last come down to the same thing, but the middle one doesn't.
- A discretized vector field (i.e. one 3D vector value for each point on a 3D grid) is represented by a rank-6 array, with N1 = 3 and N2 = 3.

Array operations are divided into two classes, "structural" and "element" operations. Element operations do something on each individual element of an array, returning a new array with the same "outer" shape, although the element shape may be different. Structural operations work on the outer shape, returning a new array with a possibly different outer shape but the same element shape.

The most frequent element operations are addition, multiplication, etc., which work on scalar elements only. They need no axis argument at all. Element operations that work on rank-1 elements have a default axis of -1; I think FFT has been quoted as an example a few times. There are no element operations that work on higher-rank elements, but they are imaginable. A 2D FFT routine would default to axis=-2.

Structural operations, which are by far the most frequent after scalar element operations, default to axis=0. They include reduction and accumulation, sorting, selection (take, repeat, ...) and some others.

I hope this clarifies the choice of default axis arguments in the current NumPy. It is most definitely not arbitrary or accidental. If you follow the data layout principles explained above, you almost never need to specify an explicit axis argument.

Konrad.

--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------
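Konrad's split between structural and element operations can be sketched concretely. The example below uses modern NumPy (an assumption; the thread predates it, but NumPy kept the same defaults for these two classes of functions): a reduction defaults to the first axis, while a rank-1 element operation like the FFT defaults to the last:

```python
import numpy as np

# A discretized vector field: one 3-vector per point of a 2x2x2 grid.
# "Outer" shape is the grid (N1 = 3 dims); "element" shape is the
# 3-vector (N2 = 1 dim).
field = np.arange(2 * 2 * 2 * 3, dtype=float).reshape(2, 2, 2, 3)

# Structural operation: reduce over the first (grid) axis by default.
# The outer shape shrinks; the element shape (3,) is untouched.
total = np.add.reduce(field, axis=0)
print(total.shape)       # (2, 2, 3)

# Element operation on rank-1 elements: FFT of each vector, default axis=-1.
# The outer shape is unchanged; only the elements are transformed.
spectra = np.fft.fft(field)
print(spectra.shape)     # (2, 2, 2, 3)
```

With data laid out this way, neither call needs an explicit axis argument, which is exactly the point Konrad is making about the defaults.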
From: <co...@ph...> - 2002-06-12 17:47:06
At some point, Geza Groma <gr...@nu...> wrote:

> Using Numeric-21.0.win32-py2.2 I found this:
>
> Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Numeric import *
> >>> a = array((1, 1), 'b')
> >>> b = array((1, 0), 'b')
> >>> a and b
> array([1, 0],'b')
> >>> b and a
> array([1, 1],'b')
>
> It looks like a bug, or at least very weird. a&b and b&a work correctly.

Nope. From the Python language reference (5.10 Boolean operations):

The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.

Since in your case both a and b are true (they aren't zero-length sequences, etc.), the last value will be returned. It works for other types too, of course:

Python 2.1.3 (#1, May 23 2002, 09:00:41)
[GCC 3.1 (Debian)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> a = 'This is a'
>>> b = 'This is b'
>>> a and b
'This is b'
>>> b and a
'This is a'

--
|>|\/|<
David M. Cooke
co...@mc...
From: Perry G. <pe...@st...> - 2002-06-12 16:44:40
<Rick White writes>:

> This kind of state information with side effects leads to a system that is hard to develop, hard to debug, and really messes up the behavior of the program (IMHO). It is *highly* desirable to avoid it if possible.

Rick beat me to the punch. The requirement for copy-on-demand definitely leads to a far more complex implementation with much more potential for misunderstood memory usage. You could do one small thing and suddenly force a spate of copies (perhaps cascading). There is no way we would have taken on a redesign of Numeric with this requirement with the resources we have available.

> This is not to deny that copy-on-demand (with explicit views available on request) would have some desirable advantages for the behavior of the system. But we've worried these issues to death, and in the end were convinced that slices == views provided the best compromise between the desired behavior and a clean implementation.

Rick's explanation doesn't really address the other position, which is that slices should force immediate copies. This isn't a difficult implementation issue by itself, but it does raise some related implementation questions. Supposing one does feel that views are a feature one wants even though they are not the default, it turns out that it isn't all that simple to obtain views without sacrificing ordinary slicing syntax to obtain a view. It is simple to obtain copies of view slices, though.

Slicing views may not be important to everyone, but they are important to us (and others), and we do see a number of situations where forcing copies to operate on array subsets would be a serious performance problem. We did discuss this issue with Guido, and he did not indicate that having different behavior on slicing with arrays would be a show stopper for acceptance into the Standard Library. We are also aware that there is no great consensus on this issue (even internally at STScI :-).

Perry Greenfield
From: Benyang T. <bt...@pa...> - 2002-06-12 16:34:45
The sum of an Int32 array and a Float32 array is a Float64 array, as shown by the following code:

>>> a = Numeric.array([1,2,3,4],'i')
>>> a.typecode(), a.itemsize()
('i', 4)
>>> b = Numeric.array([1,2,3,4],'f')
>>> b.typecode(), b.itemsize()
('f', 4)
>>> c = a + b
>>> c.typecode(), c.itemsize()
('d', 8)

Why the upcasting? I am using Linux/Pentium/python2.1/numpy20.

Thanks,
Benyang Tang
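The same promotion still happens in modern NumPy (used here as an illustration, an assumption since the message is about old Numeric), and the usual rationale is precision: a 32-bit float has only a 24-bit significand, so it cannot represent every 32-bit integer exactly; the smallest float type that safely holds both operands is float64:

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int32)    # 4-byte integers
b = np.array([1, 2, 3, 4], dtype=np.float32)  # 4-byte floats

c = a + b
print(c.dtype)                                 # float64

# The promotion rule can be queried directly:
print(np.result_type(np.int32, np.float32))    # float64

# Why float32 is not enough: 2**24 + 1 fits in int32 but has no exact
# float32 representation, while float64 represents it exactly.
print(np.float32(2**24 + 1) == 2**24 + 1)      # False
print(np.float64(2**24 + 1) == 2**24 + 1)      # True
```

So the "upcast" is the type system refusing to lose integer precision silently, at the cost of doubling the itemsize.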
From: Rick W. <rl...@st...> - 2002-06-12 16:26:28
Here is what I see as the fundamental problem with implementing slicing in numarray using copy-on-demand instead of views.

Copy-on-demand requires the maintenance of a global list of all the active views associated with a particular array buffer. Here is a simple example:

>>> a = zeros((5000,5000))
>>> b = a[49:51,50]
>>> c = a[51:53,50]
>>> a[50,50] = 1

The assignment to a[50,50] must trigger a copy of the array b; otherwise b also changes. On the other hand, array c does not need to be copied since its view does not include element 50,50. You could instead copy the array a -- but that means copying a 100 Mbyte array while leaving the original around (since b and c are still using it) -- not a good idea!

The bookkeeping can get pretty messy (if you care about memory usage, which we definitely do). Consider this case:

>>> a = zeros((5000,5000))
>>> b = a[0:-10,0:-10]
>>> c = a[49:51,50]
>>> del a
>>> b[50,50] = 1

Now what happens? Either we can copy the array for b (which means two copies of the huge (5000,5000) array exist, one used by c and the new version used by b), or we can be clever and copy c instead.

Even keeping track of the views associated with a buffer doesn't solve the problem of an array that is passed to a C extension and is modified in place. It would seem that passing an array into a C extension would always require all the associated views to be turned into copies. Otherwise we can't guarantee that views won't be modified.

This kind of state information with side effects leads to a system that is hard to develop, hard to debug, and really messes up the behavior of the program (IMHO). It is *highly* desirable to avoid it if possible.

This is not to deny that copy-on-demand (with explicit views available on request) would have some desirable advantages for the behavior of the system. But we've worried these issues to death, and in the end were convinced that slices == views provided the best compromise between the desired behavior and a clean implementation.

Rick

------------------------------------------------------------------
Richard L. White rl...@st... http://sundog.stsci.edu/rick/
Space Telescope Science Institute
Baltimore, MD
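To make the bookkeeping concrete, here is a deliberately tiny copy-on-write sketch. Everything in it (the class name, the API) is hypothetical and covers only the easy half of the problem: a "view" that detaches itself when *it* is written to. Rick's hard cases are the other direction, where a write to the *parent* (or a C extension mutating the buffer) must find and detach every overlapping view, which is exactly the global state he objects to:

```python
import numpy as np

class COWSlice:
    """Toy lazy copy of a slice (hypothetical sketch, not Numeric or
    numarray API). Reads go through to the parent buffer until the
    first write, which makes a private copy."""

    def __init__(self, parent, key):
        self._parent = parent   # shared ndarray
        self._key = key         # slice into the parent
        self._copy = None       # private buffer, created on first write

    def _data(self):
        return self._copy if self._copy is not None else self._parent[self._key]

    def __getitem__(self, i):
        return self._data()[i]

    def __setitem__(self, i, value):
        if self._copy is None:
            # Detach lazily: copy only the slice, not the whole buffer.
            self._copy = np.array(self._parent[self._key])
        self._copy[i] = value

a = np.zeros((5, 5))
b = COWSlice(a, (slice(0, 2), 2))   # plays the role of "b = a[0:2, 2]"
a[0, 2] = 1.0
print(b[0])      # 1.0 -- still reading through the shared buffer
b[0] = 99.0      # first write detaches b
print(a[0, 2])   # 1.0 -- parent unaffected by the write to b
```

Note what is missing: ``a[0, 2] = 1.0`` above leaks through to ``b`` because nothing tells ``a`` that views exist. A full implementation needs a registry of live views per buffer plus hooks on every mutating operation, which is the "messy bookkeeping" the message describes.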
From: Paul B. <Ba...@st...> - 2002-06-12 15:54:57
eric jones wrote:

> I think the consistency with Python is less of an issue than it seems. I wasn't aware that add.reduce(x) would generate the same results as the Python version of reduce(add,x) until Perry pointed it out to me. There are some inconsistencies between Python the language and Numeric because of the needs of the Numeric community. For instance, slices create views instead of copies as in Python. This was a correct break with consistency in a very utilized area of Python because of efficiency.

<Begin Rant>

I think consistency is an issue, particularly for novices. You cite the issue of slices creating views instead of copies as being the correct choice. But this decision is based solely on the perception that views are 'inherently' more efficient than copies and not on reasons of consistency or usability. I (a seasoned user) find view behavior to be annoying and have been caught out on this several times. For example, reversing in-place the elements of an array using slices, i.e. A = A[::-1], will give the wrong answer unless you explicitly make a copy before doing the assignment. Whereas copy behavior will do the right thing. I suggest that many novices will be caught out by this and similar examples, as I have been. Copy behavior for slices can be just as efficient as view behavior, if implemented as copy-on-write.

The beauty of Python is that it allows the developer to spend much more time on consistency and usability issues than on implementation issues. Sadly, I think much of Numeric development is based solely on implementation issues to the detriment of consistency and usability.

I don't have enough experience to definitely say whether axis=0 should be preferred over axis=-1 or vice versa. But it does appear that for the most general cases axis=0 is probably preferred. This is the default for the APL and J programming languages on which Numeric is based. Should we not continue to follow their lead? It might be nice to see a list of examples where axis=0 is the preferred default and the same for axis=-1.

<End Rant>

--
Paul Barrett, PhD Space Telescope Science Institute
Phone: 410-338-4475 ESS/Science Software Group
FAX: 410-338-4767 Baltimore, MD 21218
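The pitfall Barrett describes can be demonstrated with modern NumPy, which inherited Numeric's slices-are-views choice (using today's library here is an assumption; recent NumPy also detects overlapping assignments, so the reversal case itself is less dangerous than it was in 2002, but the underlying view behavior is unchanged):

```python
import numpy as np

A = np.arange(5)
B = A[1:4]        # a view, not a copy -- this is the Numeric/NumPy choice
B[0] = 99
print(A)          # [ 0 99  2  3  4] -- the original changed too

# The explicit-copy version of Barrett's in-place reversal always works:
A = np.arange(5)
A[:] = A[::-1].copy()   # copy first, then assign back into A's buffer
print(A)                # [4 3 2 1 0]
```

With Python lists, by contrast, ``B = A[1:4]`` would have produced an independent copy, which is the consistency point being argued.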
From: Pearu P. <pe...@ce...> - 2002-06-12 15:54:28
On Wed, 12 Jun 2002, Konrad Hinsen wrote:

> > How do you sort an array of complex numbers if you can't compare them?
>
> You could for example sort by real part first and by imaginary part second. That would be a well-defined sort order, but not a useful definition of comparison in the mathematical sense.

Related discussion has also taken place on the scipy list; see the thread starting at http://www.scipy.org/site_content/mailman?fn=scipy-dev/2002-February/000364.html

But here I would like to draw your attention to the suggestion that the sort() function could take an optional argument that specifies the comparison method for complex numbers (for real numbers they are all equivalent). Here follows the relevant fragment of the message http://www.scipy.org/site_content/mailman?fn=scipy-dev/2002-February/000366.html :

... However, in different applications different conventions may be useful or reasonable for ordering complex numbers. Whatever the convention, its mathematical correctness is irrelevant, and this cannot be used as an argument for preferring one convention to another. I would propose providing a number of efficient comparison methods for complex (or any) numbers that users may use in sort functions as an optional argument. For example,

scipy.sort([2,1+2j],cmpmth='abs') -> [2,1+2j] # sorts by abs value
scipy.sort([2,1+2j],cmpmth='real') -> [1+2j,2] # sorts by real part
scipy.sort([2,1+2j],cmpmth='realimag') # sorts by real then by imag
scipy.sort([2,1+2j],cmpmth='imagreal') # sorts by imag then by real
scipy.sort([2,1+2j],cmpmth='absangle') # sorts by abs then by angle
etc.
scipy.sort([2,1+2j],cmpfunc=<user defined comparison function>)

Note that

scipy.sort([-1,1],cmpmth='absangle') -> [1,-1]

which also demonstrates the arbitrariness of sorting complex numbers. ...

Regards,
Pearu
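The proposed ``cmpmth`` options map naturally onto key functions in today's Python (the `scipy.sort` signature above was never adopted; the sketch below is an illustration of the same idea, not an existing API). Tuple keys give the lexicographic orderings, and the ``absangle`` key reproduces Pearu's ``sort([-1,1]) -> [1,-1]`` observation:

```python
import numpy as np

values = [2+0j, 1+2j, -1+0j, 1+0j]

# cmpmth='abs': sort by absolute value (ties keep input order -- sorted() is stable)
by_abs = sorted(values, key=abs)

# cmpmth='realimag': sort by real part, then imaginary part
by_realimag = sorted(values, key=lambda z: (z.real, z.imag))

# cmpmth='absangle': sort by magnitude, then phase
by_absangle = sorted(values, key=lambda z: (abs(z), np.angle(z)))

print(by_abs)       # [(-1+0j), (1+0j), (2+0j), (1+2j)]
print(by_realimag)  # [(-1+0j), (1+0j), (1+2j), (2+0j)]
print(by_absangle)  # [(1+0j), (-1+0j), (2+0j), (1+2j)]
```

The ``realimag`` key is exactly Konrad's "real part first, imaginary part second" order, and it has Rick White's desired property that equal values land next to each other; the disagreement between the three outputs is the arbitrariness Pearu points out.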
From: Paul F D. <pa...@pf...> - 2002-06-12 15:44:19
Using the term "comparison operators" is too loose and is causing a communication problem here. There are these comparison operators:

== and != (group 1)
<, >, <=, and >= (group 2)

For complex numbers it is easy to define the operators in group 1: x == y iff x.real == y.real and x.imag == y.imag. And x != y iff (not x == y). I hardly think any other definition would be conceivable. The utility of this definition is questionable, as in most instances one should be making these comparisons with a tolerance, but there are at least cases when it makes sense.

For group 2, there are a variety of possible definitions. Just to name three possible definitions of ">": the greater magnitude, the greater phase mod 2pi, or a radix-type order, e.g. x > y if x.real > y.real or (x.real == y.real and x.imag > y.imag).

A person can always define a function my_greater_than(c1, c2) to embody one of these definitions and use it as an argument to a sort routine that takes a function argument to tell it how to sort. What you are arguing about is whether some particular version of this comparison should be "blessed" by attaching it to the operator ">". I do not think one of the definitions is such a clear winner that it should be blessed -- it would mean a casual reader could not guess what the operator means, and ">" does not have a doc string. Therefore I oppose doing so.
From: Alexander S. <a.s...@gm...> - 2002-06-12 15:43:45
|
"eric jones" <er...@en...> writes: > > Couldn't one have both consistency *and* efficiency by implementing a > > copy-on-demand scheme (which is what matlab does, if I'm not entirely > > mistaken; a real copy gets only created if either the original or the > > 'copy' > > is modified)? > > Well, slices creating copies is definitely a bad idea (which is what I > have heard proposed before) -- finite difference calculations (and > others) would be very slow with this approach. Your copy-on-demand > suggestion might work though. Its implementation would be more complex, > but I don't think it would require cooperation from the Python core.? > It could be handled in the ufunc code. It would also require extension > modules to make copies before they modified any values. > > Copy-on-demand doesn't really fit with python's 'assignments are > references" approach to things though does it? Using foo = bar in > Python and then changing an element of foo will also change bar. So, I My suggestion wouldn't conflict with any standard python behavior -- indeed the main motivation would be to have numarray conform to standard python behavior -- ``foo = bar`` and ``foo = bar[20:30]`` would behave exactly as for other sequences in python. The first one creates an alias to bar and in the second one the indexing operation creates a copy of part of the sequence which is then aliased to foo. Sequences are atomic in python, in the sense that indexing them creates a new object, which I think is not in contradiction to python's nice and consistent 'assignments are references' behavior. > guess there would have to be a distinction made here. This adds a > little more complexity. > > Personally, I like being able to pass views around because it allows for > efficient implementations. The option to pass arrays into extension > function and edit them in-place is very nice. Copy-on-demand might > allow for equal efficiency -- I'm not sure. 
I don't know how much of a performance drawback copy-on-demand would have
compared to a view-based scheme -- I'd suspect it would not be
significant; the fact that the runtime behavior becomes a bit more
difficult to predict might be more of a drawback (but then I haven't heard
matlab users complain, and one could always force an eager copy). Another
reason why I think a copy-on-demand scheme for slicing operations might be
attractive is that I'd suspect one could gain significant benefits from
doing other operations in a lazy fashion (plus optionally caching some
results), too (transposing seems to cause in principle unnecessary copies,
at least in some cases, at the moment).

> I haven't found the current behavior very problematic in practice and
> haven't seen it as a major stumbling block to new users. I'm happy

From my experience not even all people who use Numeric quite a lot are
*aware* that the slicing behavior differs from python sequences. You might
be right that in practice aliasing doesn't cause too many problems (as
long as one sticks to arrays -- it certainly makes it harder to write code
that operates on slices of generic sequence types) -- I'd really be
interested to know whether there are cases where people have spent a long
time tracking down a bug caused by the view behavior.

> with status quo on this. But, if copy-on-demand is truly efficient and
> didn't make extension writing a nightmare, I wouldn't complain about the
> change either. I have a feeling the implementers of numarray would
> though. :-) And talk about having to modify legacy code...

Since the vast majority of slicing operations are currently not done to
create views that are dependently modified, the backward incompatibility
might not affect that much code. You are right though, that if Perry and
the other numarray implementors don't think copy-on-demand is worth the
bother then it's unlikely to happen.

> > forwards-compatibility). I would also suspect that this would make it
> > *a lot* easier to get numarray (or parts of it) into the core, but
> > this is just a guess.
>
> I think the two things Guido wants for inclusion of numarray is a
> consensus from our community on what we want, and (more importantly) a
> comprehensible code base. :-) If Numeric satisfied this 2nd condition,
> it might already be slated for inclusion... The 1st is never easy with
> such varied opinions -- I've about concluded that Konrad and I are
> anti-particles :-) -- but I hope it will happen.

As I said I can only guess about the politics involved, but I would think
that before a significant piece of code such as numarray is incorporated
into the core a relevant pep will be discussed in the newsgroup, and that
many people will feel more comfortable about incorporating something into
core-python that doesn't deviate significantly from standard behavior
(i.e. doesn't view-slice), especially if it mainly caters to a rather
specialized audience. But Guido obviously has the last word on those
issues, and if he doesn't have a problem either way, then as long as the
community is undivided it shouldn't be an obstacle for inclusion. I agree
that division of the community might pose the most significant problems --
MA for example *does* create copies on indexing, if I'm not mistaken, and
the (desirable) transition process from Numeric to numarray also poses not
insignificant difficulties and risks, especially since there are now quite
a few important projects (not least of them scipy) that are built on top
of Numeric and will have to be incorporated in the transition if numarray
is to take over. Everything seems in a bit of a limbo right now. I'm
currently working on a (fully-featured) matrix class that I'd like to work
with both Numeric and numarray (and also scipy where available) more or
less transparently for the user, which turns out to be much more difficult
than I would have thought.
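For concreteness, the lazy-copy scheme under discussion can be caricatured
in a few lines of pure Python. The class and attribute names here are
invented for illustration only -- this is not Numeric or numarray API, and
a real implementation would live in C and would also need the reverse
bookkeeping (detaching live slices before a write to the parent):

```python
class COWSlice:
    """Toy copy-on-demand slice of a list: aliases the parent's
    buffer until the first write through the slice."""

    def __init__(self, parent, start, stop):
        self._parent = parent          # shared buffer (a plain list)
        self._start, self._stop = start, stop
        self._own = None               # private copy, made on first write

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, i):
        if self._own is not None:              # already detached
            return self._own[i]
        return self._parent[self._start + i]   # zero-copy read

    def __setitem__(self, i, value):
        if self._own is None:
            # Copy-on-write: only now do we pay for the copy.
            self._own = self._parent[self._start:self._stop]
        self._own[i] = value
```

Reads through the slice cost nothing; only a mutation triggers the copy,
so the slice behaves like Python list slicing unless and until somebody
writes. The expensive part a real implementation must add -- and the part
debated in this thread -- is detecting writes to the parent so that
outstanding slices can be detached first.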
alex

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.S...@gm...           http://www.dcs.ex.ac.uk/people/aschmolc/
|
From: Konrad H. <hi...@cn...> - 2002-06-12 14:55:15
|
> The comparison operators could be defined to operate on the
> magnitudes only. In this case you would get the kind of ugly
> result that two complex numbers with the same magnitude but
> different phases would be equal.

If you want to compare magnitudes, you can do that explicitly without
much effort.

> How do you sort an array of complex numbers if you can't compare them?

You could for example sort by real part first and by imaginary part
second. That would be a well-defined sort order, but not a useful
definition of comparison in the mathematical sense.

Konrad
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
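Both orderings mentioned above are one-liners with a key function, which
is how a sort can sidestep the comparison operators entirely:

```python
zs = [2 + 1j, 1 + 3j, 2 - 5j, 1 + 0j]

# Real part first, imaginary part second -- a well-defined total order,
# though not a mathematically meaningful comparison.
by_parts = sorted(zs, key=lambda z: (z.real, z.imag))
print(by_parts)       # [(1+0j), (1+3j), (2-5j), (2+1j)]

# Or by magnitude, as in the phasor view -- note that distinct numbers
# with equal magnitude are ties under this key.
by_magnitude = sorted(zs, key=abs)
print(by_magnitude)   # [(1+0j), (2+1j), (1+3j), (2-5j)]
```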
From: Scott R. <ra...@ph...> - 2002-06-12 14:26:18
|
On Wed, Jun 12, 2002 at 10:32:12AM +0200, Konrad Hinsen wrote:
> Scott Ransom <ra...@ph...> writes:
>
> > On June 11, 2002 04:56 pm, you wrote:
> > > One can make a case for allowing == and != for complex arrays, but >
> > > just doesn't make sense and should not be allowed.
> >
> > It depends if you think of complex numbers in phasor form or not. In
> > phasor form, the amplitude of the complex number is certainly
> > something that you could compare with > or < -- and in my opinion,
> > that seems like a reasonable
>
> Sure, but that doesn't give a full order relation for complex numbers.
> Two different numbers with equal magnitude would be neither equal nor
> would one be larger than the other.

The comparison operators could be defined to operate on the magnitudes
only. In this case you would get the kind of ugly result that two complex
numbers with the same magnitude but different phases would be equal.
Complex comparisons of this type could be quite useful to those (like me)
who do lots of Fourier domain signal processing.

> I agree with Paul that complex comparison should not be allowed. On the
> other hand, Perry's argument about sorting makes sense as well. Is there
> anything that prevents us from permitting arraysort() on complex arrays
> but not the comparison operators?

How do you sort an array of complex numbers if you can't compare them?

Scott

--
Scott M. Ransom              Address: McGill Univ. Physics Dept.
Phone: (514) 398-6492                 3600 University St., Rm 338
email: ra...@ph...                    Montreal, QC Canada H3A 2T8
GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989
|
From: Konrad H. <hi...@cn...> - 2002-06-12 08:54:16
|
"eric jones" <er...@en...> writes:

> others) would be very slow with this approach. Your copy-on-demand
> suggestion might work though. Its implementation would be more complex,
> but I don't think it would require cooperation from the Python core.

It wouldn't, and I am not sure the implementation would be much more
complex, but then I haven't tried. Having both copy-on-demand and views is
difficult, both conceptually and implementation-wise, but with
copy-on-demand, views become less important.

> Copy-on-demand doesn't really fit with python's "assignments are
> references" approach to things though, does it? Using foo = bar in
> Python and then changing an element of foo will also change bar. So, I

That would be true as well with copy-on-demand arrays, as foo and bar
would be the same object. Semantically, copy-on-demand would be equivalent
to copying when slicing, which is exactly Python's behaviour for lists.

> So, how about add.reduce() keep axis=0 to match the behavior of Python,
> but sum() and friends defaulted to axis=-1 to match the rest of the

That sounds like the most arbitrary inconsistency - add.reduce and sum
are synonyms for me.

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
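The synonymy is literal for the first axis: reducing a list of rows with
the builtin reduce and an elementwise add is precisely an axis-0 sum. A
pure-Python sketch (plain nested lists, no Numeric needed):

```python
from functools import reduce

def add_rows(a, b):
    """Elementwise sum of two equal-length rows."""
    return [x + y for x, y in zip(a, b)]

x = [[1, 2, 3],
     [4, 5, 6]]

# The equivalent of add.reduce(x), i.e. summing over axis 0:
axis0 = reduce(add_rows, x)
print(axis0)       # [5, 7, 9]

# Summing over the last axis instead:
axis_last = [sum(row) for row in x]
print(axis_last)   # [6, 15]
```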
From: Konrad H. <hi...@cn...> - 2002-06-12 08:35:12
|
Scott Ransom <ra...@ph...> writes:

> On June 11, 2002 04:56 pm, you wrote:
> > One can make a case for allowing == and != for complex arrays, but >
> > just doesn't make sense and should not be allowed.
>
> It depends if you think of complex numbers in phasor form or not. In
> phasor form, the amplitude of the complex number is certainly something
> that you could compare with > or < -- and in my opinion, that seems like
> a reasonable

Sure, but that doesn't give a full order relation for complex numbers.
Two different numbers with equal magnitude would be neither equal nor
would one be larger than the other.

I agree with Paul that complex comparison should not be allowed. On the
other hand, Perry's argument about sorting makes sense as well. Is there
anything that prevents us from permitting arraysort() on complex arrays
but not the comparison operators?

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------
|
From: Geza G. <gr...@nu...> - 2002-06-12 06:29:34
|
Using Numeric-21.0.win32-py2.2 I found this:

Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from Numeric import *
>>> a = array((1, 1), 'b')
>>> b = array((1, 0), 'b')
>>> a and b
array([1, 0],'b')
>>> b and a
array([1, 1],'b')
>>>

It looks like a bug, or at least very weird. a&b and b&a work correctly.

--
Géza Groma
Institute of Biophysics,
Biological Research Center of Hungarian Academy of Sciences
Temesvári krt.62.
6726 Szeged
Hungary
phone: +36 62 432 232
fax: +36 62 433 133
|
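This is actually documented Python behavior rather than a Numeric bug:
`x and y` never performs elementwise logic. It evaluates the truth value
of `x` and returns `x` unchanged if it is false, otherwise it returns `y`
unchanged -- so for two operands that both test true, it simply hands back
the second one. The same thing happens with plain lists:

```python
a = [1, 1]
b = [1, 0]

# Both lists are non-empty, hence true, so `and` returns its second
# operand untouched -- no elementwise operation takes place.
print(a and b)   # [1, 0]
print(b and a)   # [1, 1]

# Elementwise logical-and has to be spelled explicitly, e.g. with &
# on integer arrays, or for plain lists:
print([x & y for x, y in zip(a, b)])   # [1, 0]
```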
From: eric j. <er...@en...> - 2002-06-12 05:27:55
|
> "eric jones" <er...@en...> writes:
>
> > I think the consistency with Python is less of an issue than it seems.
> > I wasn't aware that add.reduce(x) would generate the same results as
> > the Python version of reduce(add,x) until Perry pointed it out to me.
> > There are some inconsistencies between Python the language and Numeric
> > because of the needs of the Numeric community. For instance, slices
> > create views instead of copies as in Python. This was a correct break
> > with consistency in a very utilized area of Python because of
> > efficiency.
>
> Ahh, a loaded example ;) I always thought that Numeric's view-slicing is
> a fairly problematic deviation from standard Python behavior and I'm not
> entirely sure why it needs to be done that way.
>
> Couldn't one have both consistency *and* efficiency by implementing a
> copy-on-demand scheme (which is what matlab does, if I'm not entirely
> mistaken; a real copy gets only created if either the original or the
> 'copy' is modified)?

Well, slices creating copies is definitely a bad idea (which is what I
have heard proposed before) -- finite difference calculations (and others)
would be very slow with this approach. Your copy-on-demand suggestion
might work though. Its implementation would be more complex, but I don't
think it would require cooperation from the Python core. It could be
handled in the ufunc code. It would also require extension modules to make
copies before they modified any values.

Copy-on-demand doesn't really fit with python's "assignments are
references" approach to things though, does it? Using foo = bar in Python
and then changing an element of foo will also change bar. So, I guess
there would have to be a distinction made here. This adds a little more
complexity.

Personally, I like being able to pass views around because it allows for
efficient implementations. The option to pass arrays into extension
functions and edit them in-place is very nice.
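The zero-copy, in-place style being defended here has a pure-Python
analogue in a memoryview over a bytearray, which makes the appeal
concrete: a function can modify the caller's buffer through a slice that
copies nothing.

```python
def negate_in_place(view):
    """Replace each byte with its complement, writing through the view
    so the caller's underlying buffer is modified."""
    for i in range(len(view)):
        view[i] = 255 - view[i]

buf = bytearray([0, 10, 255, 200])
negate_in_place(memoryview(buf)[1:3])   # zero-copy slice of the middle
print(list(buf))                        # [0, 245, 0, 200]
```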
Copy-on-demand might allow for equal efficiency -- I'm not sure.

I haven't found the current behavior very problematic in practice and
haven't seen it as a major stumbling block to new users. I'm happy with
status quo on this. But, if copy-on-demand is truly efficient and didn't
make extension writing a nightmare, I wouldn't complain about the change
either. I have a feeling the implementers of numarray would though. :-)
And talk about having to modify legacy code...

> The current behavior seems not just problematic because it breaks
> consistency and hence user expectations, it also breaks code that is
> written with more pythonic sequences in mind (in a potentially hard to
> track down manner) and is, IMHO, generally undesirable and error-prone,
> for pretty much the same reasons that dynamic scope and global variables
> are generally undesirable and error-prone -- one can unwittingly create
> intricate interactions between remote parts of a program that can be
> very difficult to track down.
>
> Obviously there *are* cases where one really wants a (partial) view of
> an existing array. It would seem to me, however, that these cases are
> exceedingly rare (in all my Numeric code I'm only aware of one instance
> where I actually want the aliasing behavior, so that I can manipulate a
> large array by manipulating its views and vice versa). Thus rather than
> being the default behavior, I'd rather see those cases accommodated by a
> special syntax that makes it explicit that an alias is desired and that
> care must be taken when modifying either the original or the view (e.g.
> one possible syntax would be ``aliased_vector = m.view[:,1]``). Again I
> think the current behavior is somewhat analogous to having variables
> declared in global (or dynamic) scope by default, which is not only
> error-prone, it also masks those cases where global (or dynamic) scope
> *is* actually desired and necessary.
> It might be that the problems associated with a copy-on-demand scheme
> outweigh the error-proneness, the interface breakage that the deviation
> from standard python slicing behavior causes, but otherwise copying on
> slicing would be a backwards incompatibility in numarray I'd rather like
> to see (especially since one could easily add a view attribute to
> Numeric, for forwards-compatibility). I would also suspect that this
> would make it *a lot* easier to get numarray (or parts of it) into the
> core, but this is just a guess.

I think the two things Guido wants for inclusion of numarray is a
consensus from our community on what we want, and (more importantly) a
comprehensible code base. :-) If Numeric satisfied this 2nd condition, it
might already be slated for inclusion... The 1st is never easy with such
varied opinions -- I've about concluded that Konrad and I are
anti-particles :-) -- but I hope it will happen.

> > I don't see choosing axis=-1 as a break with Python --
> > multi-dimensional arrays are inherently different and used differently
> > than lists of lists in Python. Further, reduce() is a "corner" of the
> > Python language that has been superseded by list comprehensions.
> > Choosing an alternative
>
> Guido might nowadays think that adding reduce was a mistake, so in that
> sense it might be a "corner" of the python language (although some
> people, including me, still rather like using reduce), but I can't see
> how you can generally replace reduce with anything but a loop. Could you
> give an example?

You're right. You can't do it without a loop. List comprehensions only
supersede filter and map since they always return a list. I think reduce
is here to stay. And, like you, I would actually be disappointed to see it
go (I like lambda too...)

The point is that I wouldn't choose the definition of sum() or product()
based on the behavior of Python's reduce operator. Hmmm.
So I guess that is key -- it's really these *function* interfaces that I
disagree with.

So, how about add.reduce() keeps axis=0 to match the behavior of Python,
but sum() and friends default to axis=-1 to match the rest of the library
functions? It does break with consistency across the library, so I think
it is sub-optimal. However, the distinction is reasonably clear and much
less likely to cause confusion. It also allows FFT and future modules
(wavelets or whatever) to operate across the fastest axis by default while
conforming to an intuitive standard. take() and friends would also become
axis=-1 for consistency with all other functions.

Would this be a reasonable compromise?

eric

> alex
>
> --
> Alexander Schmolck     Postgraduate Research Student
>                        Department of Computer Science
>                        University of Exeter
> A.S...@gm...           http://www.dcs.ex.ac.uk/people/aschmolc/
|
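The compromise eric proposes can be made concrete for the 2-D case with a
toy function (the name `sum_along` is invented for illustration; this is
not actual Numeric or numarray API):

```python
def sum_along(a, axis=-1):
    """Toy 2-D sum with the proposed default of the last axis."""
    if axis == 0:
        return [sum(col) for col in zip(*a)]   # down the columns
    elif axis in (1, -1):
        return [sum(row) for row in a]         # across each row
    raise ValueError("toy version handles 2-D only")

x = [[1, 2, 3],
     [4, 5, 6]]

print(sum_along(x))          # [6, 15] -- proposed sum() default, axis=-1
print(sum_along(x, axis=0))  # [5, 7, 9] -- add.reduce's default, axis=0
```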
From: Reggie D. <re...@me...> - 2002-06-11 23:30:20
|
I vote for number 3 as well. As Paul already noted, his MA module already
does something similar to this and I've found that very handy while
working interactively.

On Tue, 2002-06-11 at 11:43, Perry Greenfield wrote:
> ...
> Yet on the other hand, it is undeniably convenient to use
> repr (by typing a variable) for small arrays interactively
> rather than using a print statement. This leads to 3 possible
> proposals for handling repr:
>
> 1) Do what is done now, always print a string that when
> eval'ed will recreate the array.
>
> 2) Only give summary information for the array regardless of
> its size.
>
> 3) Print the array if it has fewer than THRESHOLD number of
> elements, otherwise print a summary. THRESHOLD may be adjusted
> by the user.
>
> ...
|
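Proposal 3 amounts to a repr that switches to a summary past a size
cutoff. A minimal pure-Python sketch of the idea (the names are
hypothetical; the real hook would live in the array type's repr itself):

```python
THRESHOLD = 1000  # user-adjustable cutoff, as in proposal 3

def array_repr(data):
    """Full repr for small sequences, a short summary for large ones."""
    if len(data) <= THRESHOLD:
        return repr(data)
    head = ", ".join(repr(v) for v in data[:3])
    return "<array of %d elements: [%s, ...]>" % (len(data), head)

print(array_repr([1, 2, 3]))           # [1, 2, 3]
print(array_repr(list(range(5000))))   # <array of 5000 elements: [0, 1, 2, ...]>
```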
From: Travis O. <oli...@ee...> - 2002-06-11 23:25:20
|
> 3) Print the array if it has fewer than THRESHOLD number of
> elements, otherwise print a summary. THRESHOLD may be adjusted
> by the user.

I think this is best. I don't believe the convention of repr is critical
to numarray.

-Travis
|
From: Alexander S. <a.s...@gm...> - 2002-06-11 23:02:50
|
"eric jones" <er...@en...> writes:

> I think the consistency with Python is less of an issue than it seems.
> I wasn't aware that add.reduce(x) would generate the same results as
> the Python version of reduce(add,x) until Perry pointed it out to me.
> There are some inconsistencies between Python the language and Numeric
> because of the needs of the Numeric community. For instance, slices
> create views instead of copies as in Python. This was a correct break
> with consistency in a very utilized area of Python because of
> efficiency.

Ahh, a loaded example ;) I always thought that Numeric's view-slicing is a
fairly problematic deviation from standard Python behavior and I'm not
entirely sure why it needs to be done that way.

Couldn't one have both consistency *and* efficiency by implementing a
copy-on-demand scheme (which is what matlab does, if I'm not entirely
mistaken; a real copy gets only created if either the original or the
'copy' is modified)?

The current behavior seems not just problematic because it breaks
consistency and hence user expectations, it also breaks code that is
written with more pythonic sequences in mind (in a potentially hard to
track down manner) and is, IMHO, generally undesirable and error-prone,
for pretty much the same reasons that dynamic scope and global variables
are generally undesirable and error-prone -- one can unwittingly create
intricate interactions between remote parts of a program that can be very
difficult to track down.

Obviously there *are* cases where one really wants a (partial) view of an
existing array. It would seem to me, however, that these cases are
exceedingly rare (in all my Numeric code I'm only aware of one instance
where I actually want the aliasing behavior, so that I can manipulate a
large array by manipulating its views and vice versa).
Thus rather than being the default behavior, I'd rather see those cases
accommodated by a special syntax that makes it explicit that an alias is
desired and that care must be taken when modifying either the original or
the view (e.g. one possible syntax would be ``aliased_vector =
m.view[:,1]``). Again I think the current behavior is somewhat analogous
to having variables declared in global (or dynamic) scope by default,
which is not only error-prone, it also masks those cases where global (or
dynamic) scope *is* actually desired and necessary.

It might be that the problems associated with a copy-on-demand scheme
outweigh the error-proneness and the interface breakage that the deviation
from standard python slicing behavior causes, but otherwise copying on
slicing would be a backwards incompatibility in numarray I'd rather like
to see (especially since one could easily add a view attribute to Numeric,
for forwards-compatibility). I would also suspect that this would make it
*a lot* easier to get numarray (or parts of it) into the core, but this is
just a guess.

> I don't see choosing axis=-1 as a break with Python -- multi-dimensional
> arrays are inherently different and used differently than lists of lists
> in Python. Further, reduce() is a "corner" of the Python language that
> has been superseded by list comprehensions. Choosing an alternative

Guido might nowadays think that adding reduce was a mistake, so in that
sense it might be a "corner" of the python language (although some people,
including me, still rather like using reduce), but I can't see how you can
generally replace reduce with anything but a loop. Could you give an
example?

alex

--
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.S...@gm...           http://www.dcs.ex.ac.uk/people/aschmolc/
|