From: Paul D. <pfd...@gm...> - 2006-11-14 04:13:41
Unfortunately, where does not have the behavior of skipping evaluation of
the second argument where the condition is true. That would be nice (if
the speed were OK), but it isn't possible unless where is built into the
language, since where doesn't even get called until its arguments have
all been calculated. It was intended for a different use than avoiding
zero-divide. The ma package can calculate 1/a without problem, producing
masked results where a is 0.0. I put where into Numeric after it had
proved invaluable in Basis, even though it has this limitation; it takes
care of doing both merge and compress.

On 13 Nov 2006 20:02:31 -0800, Tim Hochberg <tim...@ie...> wrote:
> val...@bl... wrote:
> > Using numpy 1.0, why does
> >
> > >>> a = numpy.array([0.0,1.0,2.0],'d')
> > >>> numpy.where(a == 0.0,1,1/a)
> >
> > give the correct result, but with the warning "Warning: divide
> > by zero encountered in divide"?
> [...]
>
> Robert Kern has already pointed you to seterr. If you are using Python
> 2.5, you also have the option of using the with statement [...]
>
> Another little tidbit: this is not as general as where, and could
> probably be considered a little too clever to be clear, but:
>
>     b = 1 / (a + (a==0.0))
>
> is faster than using where in this particular case and sidesteps the
> divide-by-zero issue altogether.
>
> -tim
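A minimal sketch, added for illustration, of the ma route Paul describes;
masked_values and filled are standard numpy.ma calls, though exact output
formatting can vary by numpy version:

    import numpy
    import numpy.ma as ma

    a = numpy.array([0.0, 1.0, 2.0])
    masked_a = ma.masked_values(a, 0.0)   # mask out the zeros
    result = 1.0 / masked_a               # the zero slot stays masked
    print(result.filled(1))               # -> [1. 1. 0.5], like where(a==0.0, 1, 1/a)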
From: Tim H. <tim...@ie...> - 2006-11-14 03:10:31
val...@bl... wrote:
> Using numpy 1.0, why does
>
> >>> a = numpy.array([0.0,1.0,2.0],'d')
> >>> numpy.where(a == 0.0,1,1/a)
>
> give the correct result, but with the warning "Warning: divide
> by zero encountered in divide"? I thought that the point of where was
> that the second expression is never used for the elements where the
> condition evaluates true.
>
> If this is the desired behavior, is there a way to suppress
> the warning?

Robert Kern has already pointed you to seterr. If you are using Python
2.5, you also have the option of using the with statement, which is more
convenient if you want to temporarily change the error state. You'll need
a "from __future__ import with_statement" at the top of your file. Then
you can temporarily disable errors as shown:

>>> a = zeros([3])
>>> b = 1/a  # This will warn
Warning: divide by zero encountered in divide
>>> with errstate(divide='ignore'):  # But this will not
...     c = 1/a
...
>>> d = 1/a  # And this will warn again, since the error state is
...          # restored when we exit the block
Warning: divide by zero encountered in divide

Another little tidbit: this is not as general as where, and could
probably be considered a little too clever to be clear, but:

    b = 1 / (a + (a==0.0))

is faster than using where in this particular case and sidesteps the
divide-by-zero issue altogether.

-tim
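A quick check, added here, of Tim's tidbit: since a + (a == 0.0) turns
each zero denominator into 1.0 before the divide, those slots come out as
exactly 1 and no warning is raised:

    import numpy

    a = numpy.array([0.0, 1.0, 2.0])
    b = 1 / (a + (a == 0.0))   # denominators become [1., 1., 2.]
    print(b)                   # -> [1. 1. 0.5], no divide-by-zero warning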
From: Charles R H. <cha...@gm...> - 2006-11-14 01:54:24
On 11/13/06, Mathew Yeates <my...@jp...> wrote:
>
> Not sure. When I run "top" I see the line
> Memory: 6016M real, 2895M free, 4174M swap in use, 2427M swap free
>
> It's the second number that drops like a rock. Plus, it never comes
> back until I quit the program. This is a great way to turn my machine
> into a nice desk ornament!

Try free:

[charris@fedora ~]$ free
             total       used       free     shared    buffers     cached
Mem:       1034952     995212      39740          0     126616     328124
-/+ buffers/cache:      540472     494480
Swap:       979956        152     979804

The second line, under 'used', shows actual program use, i.e. used -
buffers - cached from the first line. But if your system is dying I don't
know what to say. My knowledge of these things is a bit sketchy.

Chuck
From: Robert K. <rob...@gm...> - 2006-11-14 01:44:09
val...@bl... wrote:
> I thought that the point of where was that the second expression is
> never used for the elements where the condition evaluates true.

It is not used, but the expression still gets evaluated. There's really
no way around that.

> If this is the desired behavior, is there a way to suppress
> the warning?

In [1]: from numpy import *

In [2]: a = zeros(3)

In [3]: 1/a
Warning: divide by zero encountered in divide
Warning: invalid value encountered in double_scalars
Out[3]: array([ inf,  inf,  inf])

In [4]: seterr(divide='ignore', invalid='ignore')
Out[4]: {'divide': 'print', 'invalid': 'print', 'over': 'print', 'under': 'ignore'}

In [5]: 1/a
Out[5]: array([ inf,  inf,  inf])

In [6]: seterr?
Type:           function
Base Class:     <type 'function'>
Namespace:      Interactive
File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.0.1.dev3432-py2.5-macosx-10.4-i386.egg/numpy/core/numeric.py
Definition:     seterr(all=None, divide=None, over=None, under=None, invalid=None)
Docstring:
    Set how floating-point errors are handled.

    Valid values for each type of error are the strings "ignore", "warn",
    "raise", and "call". Returns the old settings. If 'all' is specified,
    values that are not otherwise specified will be set to 'all';
    otherwise they will retain their old values.

    Note that operations on integer scalar types (such as int16) are
    handled like floating point, and are affected by these settings.

    Example:

    >>> seterr(over='raise')
    {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}
    >>> seterr(all='warn', over='raise')
    {'over': 'raise', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}
    >>> int16(32000) * int16(3)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    FloatingPointError: overflow encountered in short_scalars
    >>> seterr(all='ignore')
    {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
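Because seterr returns the previous settings, code that cannot use the
with statement can save and restore the error state by hand; a sketch of
that idiom (the try/finally wrapper is an addition, not from Robert's
message):

    import numpy

    old = numpy.seterr(divide='ignore', invalid='ignore')
    try:
        result = 1 / numpy.zeros(3)   # no warning printed here
    finally:
        numpy.seterr(**old)           # put back whatever the caller had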
From: <val...@bl...> - 2006-11-14 01:32:31
Using numpy 1.0, why does

>>> a = numpy.array([0.0,1.0,2.0],'d')
>>> numpy.where(a == 0.0,1,1/a)

give the correct result, but with the warning "Warning: divide by zero
encountered in divide"? I thought that the point of where was that the
second expression is never used for the elements where the condition
evaluates true.

If this is the desired behavior, is there a way to suppress the warning?

Thanks!
Michele
From: Mathew Y. <my...@jp...> - 2006-11-13 22:23:54
Not sure. When I run "top" I see the line

Memory: 6016M real, 2895M free, 4174M swap in use, 2427M swap free

It's the second number that drops like a rock. Plus, it never comes back
until I quit the program. This is a great way to turn my machine into a
nice desk ornament!

Mathew

Charles R Harris wrote:
> On 11/13/06, Mathew Yeates <my...@jp...> wrote:
>
>     I have a memory mapped array. When I try and assign data, my mem
>     usage goes through the roof.
>
> Is it cache memory or process memory? I think a memory mapped file
> will keep pages cached in memory until the space is needed so as to
> avoid unneeded io. At least that is what happens in linux.
>
> Chuck
From: Charles R H. <cha...@gm...> - 2006-11-13 22:03:43
On 11/13/06, Mathew Yeates <my...@jp...> wrote:
>
> I have a memory mapped array. When I try and assign data, my mem usage
> goes through the roof.

Is it cache memory or process memory? I think a memory mapped file will
keep pages cached in memory until the space is needed so as to avoid
unneeded io. At least that is what happens in linux.

Chuck
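A sketch, with a made-up filename and shape, of the memmap pattern under
discussion: numpy.memmap's flush() writes dirty pages to disk, after
which the OS may reclaim them, though whether top's "free" number
recovers is up to the kernel:

    import numpy

    # open a writable memmap backed by a hypothetical file
    outdat = numpy.memmap('out.dat', dtype='float64', mode='w+',
                          shape=(100, 512))
    outarr = numpy.zeros(512)
    outdat[0, :] = outarr   # assignment dirties pages in the OS page cache
    outdat.flush()          # write dirty pages out; memory use may then drop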
From: Stefan v. d. W. <st...@su...> - 2006-11-13 21:50:38
On Mon, Nov 13, 2006 at 02:29:11PM -0700, Tim Hochberg wrote:
> Erin Sheldon wrote:
> > On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
> >
> >> Here's one more approach that's marginally faster than the map based
> >> solution and also won't chew up extra memory, since it's based on
> >> fromiter:
> >>
> >> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
> >> count=len(results))
> >
> > Yes, this is what I need. BTW, there is no doc string for this.
>
> Yeah, I noticed that too. I swear I wrote one at one point; I'm not
> sure what happened to it. Sigh.

A typo slipped into add_newdocs.py. Fixed in SVN.

Cheers
Stéfan
From: Tim H. <tim...@ie...> - 2006-11-13 21:29:36
Erin Sheldon wrote:
> On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
>
>> Here's one more approach that's marginally faster than the map based
>> solution and also won't chew up extra memory, since it's based on
>> fromiter:
>>
>> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
>> count=len(results))
>
> Yes, this is what I need. BTW, there is no doc string for this.

Yeah, I noticed that too. I swear I wrote one at one point; I'm not sure
what happened to it. Sigh.

> I just added an example to the Numpy Example List.

Great.

-tim
From: Mathew Y. <my...@jp...> - 2006-11-13 21:23:37
I have a memory mapped array. When I try and assign data, my mem usage
goes through the roof.

Example:

    outdat[filenum,:] = outarr

where outdat is memory mapped. Anybody know how to avoid this?

Mathew
From: Erin S. <eri...@gm...> - 2006-11-13 19:43:28
On 11/13/06, Tim Hochberg <tim...@ie...> wrote:
> Here's one more approach that's marginally faster than the map based
> solution and also won't chew up extra memory, since it's based on
> fromiter:
>
> numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
> count=len(results))

Yes, this is what I need. BTW, there is no doc string for this. I just
added an example to the Numpy Example List.

Thanks,
Erin
From: Charles R H. <cha...@gm...> - 2006-11-13 19:13:25
On 11/13/06, Seweryn Kokot <sk...@po...> wrote:
>
> Hello,
>
> Why do ipython and the python interactive shell give two different
> pieces of information?
>
> [... ipython and plain python help(linalg.eig) sessions snipped; the
> full message appears below in this thread ...]
>
> Any idea?

I expect scipy.linalg and numpy.linalg are different modules containing
different functions. That said, the documentation in scipy.linalg looks
quite a bit more complete.

Chuck
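A quick way, added for illustration, to confirm which module's eig a name
actually resolved to is to inspect the function's __module__ attribute:

    from scipy import linalg

    # prints e.g. 'scipy.linalg.decomp' or 'numpy.linalg.linalg',
    # revealing which package supplied the eig that help() documented
    print(linalg.eig.__module__)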
From: Tim H. <tim...@ie...> - 2006-11-13 17:46:24
Tim Hochberg wrote:
> [SNIP]
>
> Just for completeness, I benchmarked the fromiter and map(tuple,
> results) solutions as well. Map is fastest, followed by fromiter, list
> comprehension and then fromrecords. The differences are pretty minor
> however, so I'd stick with whatever seems clearest.
>
> -tim

Here's one more approach that's marginally faster than the map based
solution and also won't chew up extra memory, since it's based on
fromiter:

    numpy.fromiter(itertools.imap(tuple, results), dtype=mydescriptor,
                   count=len(results))

[SNIP]

-tim
From: Seweryn K. <sk...@po...> - 2006-11-13 17:00:26
Hello,

Why do ipython and the python interactive shell give two different
pieces of information?

--- ipython
Python 2.4.4 (#2, Oct 20 2006, 00:23:25)
Type "copyright", "credits" or "license" for more information.

IPython 0.7.2 -- An enhanced Interactive Python.
?       -> Introduction to IPython's features.
%magic  -> Information about IPython's 'magic' % functions.
help    -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.

In [1]: from scipy import linalg

In [2]: help(linalg.eig)

Help on function eig in module numpy.linalg.linalg:

eig(a)
    eig(a) returns u,v where u is the eigenvalues and
    v is a matrix of eigenvectors with vector v[:,i] corresponds to
    eigenvalue u[i]. Satisfies the equation dot(a, v[:,i]) = u[i]*v[:,i]
--- ipython

while

--- python
Python 2.4.4 (#2, Oct 20 2006, 00:23:25)
[GCC 4.1.2 20061015 (prerelease) (Debian 4.1.1-16.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from scipy import linalg
>>> help(linalg.eig)

Help on function eig in module scipy.linalg.decomp:

eig(a, b=None, left=False, right=True, overwrite_a=False, overwrite_b=False)
    Solve ordinary and generalized eigenvalue problem
    of a square matrix.

    Inputs:

      a     -- An N x N matrix.
      b     -- An N x N matrix [default is identity(N)].
      left  -- Return left eigenvectors [disabled].
      right -- Return right eigenvectors [enabled].
      overwrite_a, overwrite_b -- save space by overwriting the a and/or
               b matrices (both False by default)

    Outputs:

      w       -- eigenvalues [left==right==False].
      w,vr    -- w and right eigenvectors [left==False,right=True].
      w,vl    -- w and left eigenvectors [left==True,right==False].
      w,vl,vr -- [left==right==True].

    Definitions:
    ...
--- python

Any idea?

regards
SK
From: Erin S. <eri...@gm...> - 2006-11-13 15:10:47
On 11/13/06, Francesc Altet <fa...@ca...> wrote:
> In any case, you can also use rec.fromrecords to build recarrays from
> lists of lists. This breaks the aforementioned rule, but Travis allowed
> this because rec.* had to mimic numarray behaviour as much as possible.
> Here is an example of use:
>
> In [46]: mydescriptor = {'names': ('gender','age','weight'),
>                          'formats': ('S1','f4','f4')}
> In [47]: results = [['M',64.0,75.0],['F',25.0,60.0]]
> In [48]: a = numpy.rec.fromrecords(results, dtype=mydescriptor)
> In [49]: b = numpy.array([tuple(row) for row in results],
>                          dtype=mydescriptor)
> In [50]: a==b
> Out[50]: recarray([True, True], dtype=bool)
>
> OTOH, it is said in the docs that fromrecords is discouraged because it
> is somewhat slow, but apparently it has similar performance to using
> list comprehensions:
>
> [... timings snipped; they appear in Francesc's message below ...]

I checked the code. For lists of lists it just creates the recarray and
runs a loop copying in the data row by row. The fact that they are of
similar speed is actually good news, because the list comprehension was
making an extra copy of the data in memory. For large memory usage, which
is my case, this 50% overhead would have been an issue.

Erin
From: Tim H. <tim...@ie...> - 2006-11-13 15:03:51
Francesc Altet wrote:
> El dl 13 de 11 del 2006 a les 02:07 -0500, en/na Erin Sheldon va
> escriure:
>> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
>>> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>>>> Hi all -
>>>>
>>>> Thanks to everyone for the suggestions.
>>>> I think map(tuple, list) is probably the most compact,
>>>> but the list comprehension also works well.
>>>>
>>>> Because map() is probably going to disappear someday, I'll
>>>> stick with the list comprehension.
>>>> array( [tuple(row) for row in result], dtype=dtype)
>>>>
>>>> That said, is there some compelling reason that the array
>>>> function doesn't support this operation?
>>>
>>> My understanding is that the array needs to be allocated up front.
>>> Since the list comprehension is iterative it is impossible to know
>>> how big the result is going to be.
>>
>> Isn't it the same with a list of tuples? But you can send that
>> directly to the array constructor. I don't see the fundamental
>> difference, except that the code might be simpler to write.
>
> I think that the correct explanation is that Travis has chosen a tuple
> as the way to refer to an inhomogeneous list of values (a record) and a
> list as the way to refer to a homogeneous list of values.

Just for the record, this is the officially blessed usage of tuples and
lists for all of Python (by Guido himself). On the other hand, it's
honored more in the breach than in the observance, because of other
factors such as mutability/immutability, or the mistaken belief that
using tuples everywhere will make code noticeably faster or more memory
frugal, or something.

> I'm not completely sure why he did this, but I guess the reason was to
> be able to distinguish the records in scenarios where nested records do
> appear.

I suspect that this could be made a little more forgiving without losing
rigor, as long as none of the fields are objects, in which case nearly
all bets are off. Then again, the rule that tuples designate records is a
lot simpler than something like: tuples designate records, but you can
use lists too, unless you have an object field in your array, in which
case you really need to use tuples, except sometimes lists will work
anyway, depending on where the object field is. So maybe it's best just
to keep it strict.

> In any case, you can also use rec.fromrecords to build recarrays from
> lists of lists. This breaks the aforementioned rule, but Travis allowed
> this because rec.* had to mimic numarray behaviour as much as possible.
> Here is an example of use:
> [SNIP]

Just for completeness, I benchmarked the fromiter and map(tuple, results)
solutions as well. Map is fastest, followed by fromiter, list
comprehension and then fromrecords. The differences are pretty minor
however, so I'd stick with whatever seems clearest.

-tim

from timeit import Timer

print Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)",
            "import numpy; results = [['M',64.0,75.0]]*100000; "
            "mydescriptor = {'names': ('gender','age','weight'), "
            "'formats': ('S1','f4','f4')}").repeat(3,10)

print Timer("numpy.array([tuple(row) for row in results], "
            "dtype=mydescriptor)",
            "import numpy; results = [['M',64.0,75.0]]*100000; "
            "mydescriptor = {'names': ('gender','age','weight'), "
            "'formats': ('S1','f4','f4')}").repeat(3,10)

print Timer("numpy.fromiter((tuple(x) for x in results), "
            "dtype=mydescriptor, count=len(results))",
            "import numpy; results = [['M',64.0,75.0]]*100000; "
            "mydescriptor = {'names': ('gender','age','weight'), "
            "'formats': ('S1','f4','f4')}").repeat(3,10)

print Timer("numpy.array(map(tuple, results), dtype=mydescriptor)",
            "import numpy; results = [['M',64.0,75.0]]*100000; "
            "mydescriptor = {'names': ('gender','age','weight'), "
            "'formats': ('S1','f4','f4')}").repeat(3,10)

===>

[1.3928521641717035, 1.3892659541925021, 1.3949996438094785]
[1.344854164425926, 1.3157404083479882, 1.3207066819944986]
[1.2768430065832401, 1.2742884919731416, 1.2736657871321633]
[1.2081393026208644, 1.2025276955590734, 1.205871416618594]
From: Sven S. <sve...@gm...> - 2006-11-13 10:03:38
Pierre GM schrieb:
> On Sunday 12 November 2006 17:08, A. M. Archibald wrote:
>> On 12/11/06, Keith Goodman <kwg...@gm...> wrote:
>>> Is anybody interested in making x.max() and nanmax() behave the same
>>> for matrices, except for the NaN part? That is, make
>>> numpy.matlib.nanmax return a matrix instead of an array.
>
> Or, you could use masked arrays... In the new implementation, you can
> add a mask to a subclassed array (such as matrix) to get a regular
> masked array. If you fill this masked array, you get an array of the
> same subclass.

That is very interesting, but I agree with Keith and would actually call
this a bug. (If still present in 1.0, that is; haven't checked, I think
Keith used some rc.) One proclaimed goal of numpy for the 1.0 release has
been to be as matrix-friendly as possible, for which I am very grateful.
Still, the use of masked arrays looks more attractive every time they're
mentioned...

-sven
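A short sketch of the masked-array approach Pierre describes, shown here
on a plain array for simplicity (an illustration, not from the thread):
masking the NaNs makes max() ignore them, much like nanmax.

    import numpy
    import numpy.ma as ma

    x = numpy.array([[1.0, numpy.nan], [3.0, 2.0]])
    masked = ma.masked_array(x, mask=numpy.isnan(x))
    print(masked.max())   # -> 3.0; the NaN is simply ignored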
From: Francesc A. <fa...@ca...> - 2006-11-13 08:19:24
El dl 13 de 11 del 2006 a les 02:07 -0500, en/na Erin Sheldon va
escriure:
> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
> >
> > On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > > Hi all -
> > >
> > > Thanks to everyone for the suggestions.
> > > I think map(tuple, list) is probably the most compact,
> > > but the list comprehension also works well.
> > >
> > > Because map() is probably going to disappear someday, I'll
> > > stick with the list comprehension.
> > > array( [tuple(row) for row in result], dtype=dtype)
> > >
> > > That said, is there some compelling reason that the array
> > > function doesn't support this operation?
> >
> > My understanding is that the array needs to be allocated up front.
> > Since the list comprehension is iterative it is impossible to know
> > how big the result is going to be.
>
> Isn't it the same with a list of tuples? But you can send that directly
> to the array constructor. I don't see the fundamental difference,
> except that the code might be simpler to write.

I think that the correct explanation is that Travis has chosen a tuple as
the way to refer to an inhomogeneous list of values (a record) and a list
as the way to refer to a homogeneous list of values. I'm not completely
sure why he did this, but I guess the reason was to be able to
distinguish the records in scenarios where nested records do appear.

In any case, you can also use rec.fromrecords to build recarrays from
lists of lists. This breaks the aforementioned rule, but Travis allowed
this because rec.* had to mimic numarray behaviour as much as possible.
Here is an example of use:

In [46]: mydescriptor = {'names': ('gender','age','weight'),
                         'formats': ('S1','f4','f4')}
In [47]: results = [['M',64.0,75.0],['F',25.0,60.0]]
In [48]: a = numpy.rec.fromrecords(results, dtype=mydescriptor)
In [49]: b = numpy.array([tuple(row) for row in results],
                         dtype=mydescriptor)
In [50]: a==b
Out[50]: recarray([True, True], dtype=bool)

OTOH, it is said in the docs that fromrecords is discouraged because it
is somewhat slow, but apparently it has similar performance to using list
comprehensions:

In [51]: Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)",
   ....:     "import numpy; results = [['M',64.0,75.0]]*10000; "
   ....:     "mydescriptor = {'names': ('gender','age','weight'), "
   ....:     "'formats': ('S1','f4','f4')}").repeat(3,10)
Out[51]: [0.44204592704772949, 0.43584394454956055, 0.50145101547241211]

In [52]: Timer("numpy.array([tuple(row) for row in results], dtype=mydescriptor)",
   ....:     "import numpy; results = [['M',64.0,75.0]]*10000; "
   ....:     "mydescriptor = {'names': ('gender','age','weight'), "
   ....:     "'formats': ('S1','f4','f4')}").repeat(3,10)
Out[52]: [0.49885106086730957, 0.4325258731842041, 0.43297886848449707]

HTH,

--
Francesc Altet   |  Be careful about using the following code --
Carabos Coop. V. |  I've only proven that it works,
www.carabos.com  |  I haven't tested it. -- Donald Knuth
From: Erin S. <eri...@gm...> - 2006-11-13 07:07:32
On 11/13/06, Charles R Harris <cha...@gm...> wrote:
>
> On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > Hi all -
> >
> > Thanks to everyone for the suggestions.
> > I think map(tuple, list) is probably the most compact,
> > but the list comprehension also works well.
> >
> > Because map() is probably going to disappear someday, I'll
> > stick with the list comprehension.
> > array( [tuple(row) for row in result], dtype=dtype)
> >
> > That said, is there some compelling reason that the array
> > function doesn't support this operation?
>
> My understanding is that the array needs to be allocated up front.
> Since the list comprehension is iterative it is impossible to know how
> big the result is going to be.

Isn't it the same with a list of tuples? But you can send that directly
to the array constructor. I don't see the fundamental difference, except
that the code might be simpler to write.

> BTW, it might be possible to use fromfile('name', dtype=dtype) to do
> what you want if the data is stored by rows in a file.

I'm reading from a database.

Erin
From: Charles R H. <cha...@gm...> - 2006-11-13 06:18:26
On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
>
> Hi all -
>
> Thanks to everyone for the suggestions.
> I think map(tuple, list) is probably the most compact,
> but the list comprehension also works well.
>
> Because map() is probably going to disappear someday, I'll
> stick with the list comprehension.
> array( [tuple(row) for row in result], dtype=dtype)
>
> That said, is there some compelling reason that the array
> function doesn't support this operation?

My understanding is that the array needs to be allocated up front. Since
the list comprehension is iterative it is impossible to know how big the
result is going to be.

BTW, it might be possible to use fromfile('name', dtype=dtype) to do what
you want if the data is stored by rows in a file.

Chuck
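A sketch of the fromfile route Chuck mentions, assuming the records were
already written to disk as packed binary rows; the filename
'records.bin' and the file's existence are hypothetical:

    import numpy

    mydescriptor = {'names': ('gender', 'age', 'weight'),
                    'formats': ('S1', 'f4', 'f4')}
    # each 9-byte record on disk becomes one ('S1', 'f4', 'f4') row
    data = numpy.fromfile('records.bin', dtype=mydescriptor)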
From: Erin S. <eri...@gm...> - 2006-11-13 05:56:31
Hi all -

Thanks to everyone for the suggestions.
I think map(tuple, list) is probably the most compact,
but the list comprehension also works well.

Because map() is probably going to disappear someday, I'll
stick with the list comprehension.

    array( [tuple(row) for row in result], dtype=dtype)

That said, is there some compelling reason that the array
function doesn't support this operation?

Thanks again,
Erin

On 11/12/06, Robert Kern <rob...@gm...> wrote:
> Pierre GM wrote:
> > On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
> >> Actually, there is a problem with that approach. It first converts
> >> the entire array to a single type, by default a floating type.
> >
> > As A.M. Archibald suggested, you can use list comprehension:
> > N.array([(a,b,c,d,) for (a,b,c,d) in yourlist], dtype=yourdesc)
> >
> > or
> >
> > N.fromiter(((a,b,c,d) for (a,b,c,d,) in yourlist), dtype=yourdesc)
> >
> > Would you mind trying that, and let us know which one works best?
> > That could be put on the wiki somewhere...
>
> N.array(map(tuple, yourlist), dtype=yourdesc)
>
> is probably the best option.
>
> --
> Robert Kern
From: Robert K. <rob...@gm...> - 2006-11-13 04:28:26
Pierre GM wrote:
> On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
>> Actually, there is a problem with that approach. It first converts
>> the entire array to a single type, by default a floating type.
>
> As A.M. Archibald suggested, you can use list comprehension:
> N.array([(a,b,c,d,) for (a,b,c,d) in yourlist], dtype=yourdesc)
>
> or
>
> N.fromiter(((a,b,c,d) for (a,b,c,d,) in yourlist), dtype=yourdesc)
>
> Would you mind trying that, and let us know which one works best?
> That could be put on the wiki somewhere...

N.array(map(tuple, yourlist), dtype=yourdesc)

is probably the best option.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
From: Pierre GM <pgm...@gm...> - 2006-11-13 04:20:51
On Sunday 12 November 2006 20:10, Erin Sheldon wrote:
> Actually, there is a problem with that approach. It first converts
> the entire array to a single type, by default a floating type.

As A.M. Archibald suggested, you can use list comprehension:

    N.array([(a,b,c,d,) for (a,b,c,d) in yourlist], dtype=yourdesc)

or

    N.fromiter(((a,b,c,d) for (a,b,c,d,) in yourlist), dtype=yourdesc)

Would you mind trying that, and let us know which one works best? That
could be put on the wiki somewhere...
From: Tim H. <tim...@ie...> - 2006-11-13 04:01:42
Erin Sheldon wrote:
> On 11/12/06, Tim Hochberg <tim...@ie...> wrote:
>
>> I haven't been following this too closely, but if you need to
>> transpose your data without converting all to one type, I can think
>> of a couple of different approaches:
>>
>> 1. zip(*yourlist)
>> 2. numpy.transpose(numpy.array(yourlist, dtype=object))
>>
>> I haven't tested them though (particularly the second one), so caveat
>> emptor, etc, etc.
>
> It's not that I want to transpose data.
>
> I'm trying to convert the output of a pgdb postgres query into an
> array with fields and types corresponding to the columns I have
> selected. The problem is pgdb does not return a list of tuples as it
> should according to DB 2.0, but instead a list of lists. So
> numpy.array(lol, dtype=) fails, and so will your solution #2. I don't
> want to copy the data more than once obviously, so I'm looking for a
> way to call array() with a list of lists.

In that case, I suggest just using a list comprehension or map,

    [tuple(x) for x in lol]

for example. It's probably pointless to worry about this. You are
already allocating 5*N Python objects (all those Python floats and
integers as well as the lists themselves). I believe the list
comprehension above is only going to allocate an additional N objects
(the new tuples). Admittedly, the objects aren't all the same size, but
in this case they are close enough that I doubt it'll matter.

-tim
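An end-to-end sketch of the fix Tim suggests for the pgdb case; the
descriptor and the rows standing in for a cursor's fetchall() output are
illustrative:

    import numpy

    mydescriptor = {'names': ('gender', 'age', 'weight'),
                    'formats': ('S1', 'f4', 'f4')}
    rows = [['M', 64.0, 75.0], ['F', 25.0, 60.0]]   # pgdb returns lists, not tuples
    # tuple each row so array() can build the record array directly
    arr = numpy.array([tuple(r) for r in rows], dtype=mydescriptor)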
From: Tim H. <tim...@ie...> - 2006-11-13 03:50:33
George Sakkis wrote:
> Tim Hochberg wrote:
>
>> George Sakkis wrote:
>>
>>> def index(a, value):
>>>     return N.where(a==value)[0][0]
>>
>> Or
>>
>>     def index(a, value):
>>         return argmax(a == value)
>
> That's a bit easier to write and a bit harder to grok; that's ok, I can
> live with it.

I'll add a little cautionary note as well: it won't work correctly if no
element of a equals value (in which case the max of a == value is 0 and
you'll just get the first index). It would be easy enough to make this
bulletproof:

    def index(a, value):
        arg = argmax(a == value)
        if a[arg] != value:
            raise IndexError("or whatever")
        return arg

Unfortunately, that's no longer two lines, and I wouldn't have been able
to use that Tim Peters quote, which needs to be thrown out from time to
time just on general principles.

>> [snip]
>
> I think the organizational problem is more serious in Numpy's case; any
> package with about 200 functions (according to
> http://www.scipy.org/Numpy_Example_List) in an almost flat namespace
> would feel the maintainability problem build up.

I agree with that for the most part.

>> In the case of this particular function, what are the use cases? Are
>> they common (not just with you but with other numpy users)? Are the
>> uses speed critical? Is it a good building block for other numeric
>> functions? I'm skeptical of the above implementation as it doesn't fit
>> well with other similar numpy functions (argmin and argmax for
>> example) and I personally don't see much in the way of compelling uses
>> for this, but feel free to try to convince me otherwise.
>>
>> -tim
>
> As I said, I would welcome the addition of this function and I would
> use it if it was there, but I'm not particularly fervent about it. If I
> were to convince you, I'd argue about how numpy arrays can be
> considered generalizations of (1-d) sequences in many ways, and how
> convenient it would be for users whose code expects python lists to
> work as is (thanks to duck typing) with 1-d numpy arrays. Another
> (indirect) argument would be the addition of functions that implement
> set-theoretic operations, and certainly lists are far more common than
> sets.

Well, if you were to try to convince me like that, I would try to
convince you of your errant ways by noting that numpy arrays are quite
dissimilar from lists in so many ways that making them closer would just
lead to confusion, as people asked for more and more list-like behaviors
(append, in, etc., etc.). Lists and arrays aren't close enough that
conflating them could be made to work in any halfway sensible manner.
Lists and arrays are both examples of Python sequences, and they both
fully implement the sequence interface (and the iterable interface for
that matter), but that's really the extent of their similarity.

Now, whether an object based on array that implemented list semantics but
kept a numpy array's memory frugality would be useful, I cannot say.
However, that seems to be the kind of thing that should be a separate
type of object, perhaps reusing much of the array code but having a
completely different interface. It also seems the kind of thing that
would likely have fairly limited use in practice, so if someone really
wants it I'd like to see it exist as a separate package for some period
of time, to verify that it actually has a user base, before it got
incorporated into numpy. Not that I control these things.

> I have to agree though that the flat namespace at its current size
> makes me uneasy too; I typically use "import numpy as N" instead of
> "from numpy import *".

Oh yeah, you should almost never use "from anything import *" IMO. 'N'
seems to be the favored abbreviation these days, although I've always
used 'np' (this helped when I was transitioning from Numeric to numarray
and then back to numpy, as I used 'na' for numarray and 'np' for numpy).

-tim
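A usage sketch of the guarded index() above; the sample array and the
error message are illustrative:

    import numpy
    from numpy import argmax

    def index(a, value):
        arg = argmax(a == value)
        if a[arg] != value:
            raise IndexError("value not found")
        return arg

    a = numpy.array([3, 1, 4, 1, 5])
    print(index(a, 4))   # -> 2, the position of the first 4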