From: Tim H. <tim...@co...> - 2006-07-18 16:13:42

Eric Emsellem wrote:
> thanks for the tips. (indeed your "add.reduce" is correct: I just wrote
> this down too quickly; in the script I have a "sum" included).
>
> And yes you are right about the memory issue, so I may just keep the
> loop in and try to make it work on a fast PC... (or use parallel
> processes)
>
> (is "sum" different than "add.reduce"?)
>
> thanks again to both Bill Baxter and Perry Greenfield for their fast
> (and helpful!) answers.

I just wanted to add that there are faster, but considerably more
complicated, ways to attack this class of problem. The one I've looked at
in the past was the fast multipole method, and I believe there are
others. I'm not sure whether these can be implemented efficiently in
numpy, but you may want to look into this kind of more
sophisticated/complicated approach if brute-forcing the calculation
doesn't work.

-tim

> cheers
>
> Eric
>
> [The rest of the quoted text duplicates Perry Greenfield's reply of
> 2006-07-18 15:23, reproduced in full below.]
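[A minimal sketch, for readers skimming this thread: with SciPy available,
scipy.spatial.distance.pdist gives all unique pair distances in one call.
As Perry notes below, any fully vectorized form needs O(N**2) memory, so
this only suits moderate N; names and sizes here are illustrative.]

    import numpy as np
    from scipy.spatial.distance import pdist

    pts = np.random.rand(2000, 3)        # illustrative data: N points in 3-D
    result = (1.0 / pdist(pts)).sum()    # sum of 1/r over all unique pairs

    # cross-check against the plain double loop on a small subset
    n = 50
    small = pts[:n]
    brute = sum(1.0 / np.sqrt(((small[i] - small[j]) ** 2).sum())
                for i in range(n) for j in range(i + 1, n))
    assert np.allclose(brute, (1.0 / pdist(small)).sum())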
From: Albert S. <fu...@gm...> - 2006-07-18 15:51:25

Hello all,

> -----Original Message-----
> From: num...@li... [mailto:numpy-dis...@li...] On Behalf
> Of Keith Goodman
> Sent: 18 July 2006 15:55
> To: Thomas Heller
> Cc: num...@li...
> Subject: Re: [Numpy-discussion] Cannot build numpy svn on Windows
>
> On 7/18/06, Thomas Heller <th...@py...> wrote:
> > When I change this line in the generated config.h file:
> >
> > #define NPY_ALLOW_THREADS WITH_THREAD
> >
> > to this one:
> >
> > #define NPY_ALLOW_THREADS 1
> >
> > then I can build.

This might be due to some changes Travis made:
http://projects.scipy.org/scipy/numpy/changeset/2833

I was able to build r2834 on Windows without problems. Are you still
seeing this error?

> What part of numpy is threaded?

This thread stuff is so that NumPy releases/doesn't release the GIL in
certain places.

Regards,
Albert
From: Eric E. <ems...@ob...> - 2006-07-18 15:37:54

thanks for the tips. (indeed your "add.reduce" is correct: I just wrote
this down too quickly; in the script I have a "sum" included).

And yes you are right about the memory issue, so I may just keep the loop
in and try to make it work on a fast PC... (or use parallel processes)

(is "sum" different than "add.reduce"?)

thanks again to both Bill Baxter and Perry Greenfield for their fast (and
helpful!) answers.

cheers

Eric

> [The quoted text duplicates Perry Greenfield's reply of 2006-07-18
> 15:23, reproduced in full below.]
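[On Eric's side question, "is 'sum' different than 'add.reduce'?": for a
1-D array the two agree (np.sum is built on add.reduce); Python's builtin
sum() also works but is much slower on arrays. A quick check:]

    import numpy as np

    a = np.arange(10.0)
    assert np.add.reduce(a) == a.sum() == np.sum(a)   # all give 45.0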
From: Perry G. <pe...@st...> - 2006-07-18 15:23:44

On Jul 18, 2006, at 10:23 AM, Eric Emsellem wrote:
> Hi,
>
> I have a specific quantity to derive from an array, and I am at the
> moment unable to do it for a too-large array because it just takes too
> long! So I am looking for advice on how to efficiently compute such a
> quantity:
>
> I have 3 arrays of N floats (x[...], y[...], z[...]) and I wish to do:
>
>     result = 0.
>     for i in range(N):
>         for j in range(i+1, N, 1):
>             result += 1. / sqrt((x[j] - x[i])**2 + (y[j] - y[i])**2 +
>                                 (z[j] - z[i])**2)
>
> Of course the procedure written above is very inefficient, and I
> thought of doing:
>
>     result = 0.
>     for i in range(N):
>         result += 1. / sqrt((x[i+1:] - x[i])**2 + (y[i+1:] - y[i])**2 +
>                             (z[i+1:] - z[i])**2)
>
> Still, this is quite slow and not workable for very large arrays
> (> 10^6 floats per array).
>
> Any hint on how to speed things up here?
>
> Thanks in advance!
>
> Eric

Perhaps I'm misunderstanding the last variant, but don't you want
something like:

    result = 0.
    for i in range(N):
        result += add.reduce(1. / sqrt((x[i+1:] - x[i])**2 +
                                       (y[i+1:] - y[i])**2 +
                                       (z[i+1:] - z[i])**2))

instead, since the expression yields an array of decreasing size each
iteration?

But besides that, it seems you are asking to do roughly 10^12 of these
computations for 10^6 points. I don't see any way to avoid that given
what you are computing. The solution Bill Baxter gives is fine (I think;
I haven't looked at it closely), but the usual problem with doing it
without any looping is that it requires an enormous amount of memory
(~10^12-element arrays) if I'm not mistaken. Since your second example is
iterating over large arrays (most of the time, not near the end), I'd be
surprised if you can do much better than that (the looping overhead
should be negligible for such large arrays). Do you have examples written
in other languages that run much faster? I guess I would be surprised to
see it possible to do more than a few times faster in any language
without some very clever optimizations.

Perry
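[Perry's suggestion transcribed into a self-contained function; a sketch,
assuming x, y, z are 1-D NumPy arrays of equal length. Only O(N)
temporaries are alive at any time:]

    import numpy as np

    def inverse_distance_sum(x, y, z):
        """Sum 1/r over all unique pairs, reducing inside each iteration."""
        result = 0.0
        for i in range(len(x) - 1):
            result += np.add.reduce(
                1.0 / np.sqrt((x[i + 1:] - x[i]) ** 2
                              + (y[i + 1:] - y[i]) ** 2
                              + (z[i + 1:] - z[i]) ** 2))
        return result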
From: Bill B. <wb...@gm...> - 2006-07-18 14:45:38

Maybe this will help -- it computes the squared distances between a bunch
of points:

    import numpy

    def dist2(x, c):
        """Calculates squared distance between two sets of points.

        n2 = dist2(x, c) takes two matrices of vectors and calculates the
        squared Euclidean distance between them. Both matrices must be of
        the same column dimension. If x has M rows and N columns, and c
        has L rows and N columns, then the result has M rows and L
        columns. The (i, j)th entry is the squared distance from the ith
        row of x to the jth row of c.
        """
        ndata, dimx = x.shape
        ncentres, dimc = c.shape
        if dimx != dimc:
            raise ValueError('Data dimension does not match dimension of centres')
        n2 = (x*x).sum(1)[:, numpy.newaxis] + (c*c).sum(1) - 2*numpy.dot(x, c.T)
        # Rounding errors occasionally cause negative entries in n2
        #if (n2 < 0).any():
        #    n2[n2 < 0] = 0
        return n2

Take 1.0/numpy.sqrt(dist2(V, V)) and from there you can probably get the
sum with sum() calls. It's obviously not as efficient as it could be,
since it'll be computing both halves of a symmetric matrix, but it might
be faster than nested Python loops. (The above was adapted from a routine
in Netlab.)

--bb

On 7/18/06, Eric Emsellem <ems...@ob...> wrote:
> [The quoted text duplicates Eric's original question of 2006-07-18
> 14:25, reproduced in full below.]
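[A hedged usage sketch for Bill's dist2: summing 1/r over the strict
upper triangle counts each pair once and skips the zero diagonal. The
name V and the point count are illustrative:]

    import numpy as np

    V = np.random.rand(500, 3)             # illustrative point set
    n2 = dist2(V, V)                       # squared distances, shape (500, 500)
    iu = np.triu_indices(len(V), k=1)      # indices strictly above the diagonal
    result = (1.0 / np.sqrt(n2[iu])).sum()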
From: Eric E. <ems...@ob...> - 2006-07-18 14:25:44

Hi,

I have a specific quantity to derive from an array, and I am at the
moment unable to do it for a too-large array because it just takes too
long! So I am looking for advice on how to efficiently compute such a
quantity:

I have 3 arrays of N floats (x[...], y[...], z[...]) and I wish to do:

    result = 0.
    for i in range(N):
        for j in range(i+1, N, 1):
            result += 1. / sqrt((x[j] - x[i])**2 + (y[j] - y[i])**2 +
                                (z[j] - z[i])**2)

Of course the procedure written above is very inefficient, and I thought
of doing:

    result = 0.
    for i in range(N):
        result += 1. / sqrt((x[i+1:] - x[i])**2 + (y[i+1:] - y[i])**2 +
                            (z[i+1:] - z[i])**2)

Still, this is quite slow and not workable for very large arrays
(> 10^6 floats per array).

Any hint on how to speed things up here?

Thanks in advance!

Eric
From: Keith G. <kwg...@gm...> - 2006-07-18 13:54:45

On 7/18/06, Thomas Heller <th...@py...> wrote:
> When I change this line in the generated config.h file:
>
> #define NPY_ALLOW_THREADS WITH_THREAD
>
> to this one:
>
> #define NPY_ALLOW_THREADS 1
>
> then I can build.

What part of numpy is threaded?
From: Tom D. <tom...@al...> - 2006-07-18 13:49:56

I suggest:

  1. lexsort
  2. itertools.groupby of the indices
  3. take

I think it would be really great if numpy had the first two as a function
or something like that. It is really useful to be able to take an array,
bucket it, and apply further numpy operations like accumulation functions.

On 7/18/06, Stephen Simmons <ma...@st...> wrote:
> [The quoted text duplicates Stephen's question of 2006-07-18 11:59,
> reproduced in full below, plus the list footer.]
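[A minimal sketch of the lexsort-based GROUP BY recipe Tom describes,
using np.add.reduceat for the reduction instead of itertools.groupby; all
names, keys, and shapes are illustrative assumptions:]

    import numpy as np

    months = np.array([0, 1, 0, 1, 0])      # group key 1
    tenure = np.array([12, 12, 3, 3, 12])   # group key 2
    values = np.random.rand(5, 4)           # the fields to aggregate

    order = np.lexsort((tenure, months))    # sort by months, ties by tenure
    k1, k2, v = months[order], tenure[order], values[order]

    # start index of each run of identical (k1, k2) pairs
    change = (np.diff(k1) != 0) | (np.diff(k2) != 0)
    starts = np.r_[0, np.nonzero(change)[0] + 1]

    sums = np.add.reduceat(v, starts, axis=0)   # per-group column sums
    counts = np.diff(np.r_[starts, len(v)])     # records per group
    means = sums / counts[:, np.newaxis]        # per-group averages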
From: Stephen S. <ma...@st...> - 2006-07-18 11:59:36

Hi,

Does anyone have any suggestions for summarising data in numpy?

The quick description is that I want to do something like the SQL
statement:

    SELECT sum(field1), sum(field2) FROM table GROUP BY field3;

The more accurate description is that my data is stored in PyTables HDF
format, with 24 monthly files, each with 4m records describing how
customers performed that month. Each record looks something like this:

    ('200604', 651404500000L, '800', 'K', 12L, 162.0, 2000.0, 0.054581,
     0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 1.0, 0.0, 0.0, 0.0,
     0.0, 0.0, 8.80, 0.86, 7.80, 17.46, 0.0, 70.0, 0.0, 70.0, -142.93,
     0.0, 2000.0, 2063.93, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
     -9.71, 7.75, 87.46, 77.75, -3.45, 0.22, -0.45, -0.57, 73.95)

The first 5 fields are status fields (month_string, account_number,
product_code, account_status, months_since_customer_joined). The
remaining 48 fields represent different aspects of the customer's
performance during that month. I read 100,000 of these records at a time
and turn them into a numpy recarray with:

    dat = hdf_table.read(start=pos, stop=pos+block_size)
    dat = numpy.asarray(dat._flatArray, dtype=dat.array_descr)

I'd like to reduce these 96m records x 53 fields down to monthly averages
for each tuple (month_string, months_since_customer_joined), which in the
case above is ('200604', 12L). This will let me compare the performance
of newly acquired customers at the same point in their lifecycle as
customers acquired 1 or 2 years ago.

The end result should be a dataset something like:

    res[month_index, months_since_customer_joined]
        = array([num_records, sum_field_5, sum_field_6, sum_field_7, ...
                 sum_field_52])

with a shape of (24, 24, 49).

I've played around with lexsort(), take(), sum(), etc., but get very
confused and end up feeling that I'm making things more complicated than
they need to be. So any advice from numpy veterans on how best to proceed
would be very welcome!

Cheers

Stephen
From: Gary R. <gr...@bi...> - 2006-07-18 10:30:51

Nick Fotopoulos wrote:
> I've been looking over the wiki and am not sure where the best place
> would be for such a snippet. Would it go with the numpy examples under
> vectorize or perhaps in a cookbook somewhere?

Yes. It seems to me like a cookbook example. In the utopian future, when
there are as many cookbook examples as O'Reilly have, it'll be time for a
reorganisation, but for now, make it a cookbook entry.

Gary R.
From: Travis O. <oli...@ie...> - 2006-07-18 08:46:44

Sebastian Haase wrote:
> On Monday 17 July 2006 12:38, Travis Oliphant wrote:
>
> Any idea on my main question? What is the dot product of a 2x2 and
> 3x2x3 supposed to look like? Why are numarray and numpy giving
> different answers??

I'm pretty sure the dot product in Numeric (and I guess numarray too) was
broken for more than 2 dimensions. This was fixed several months ago in
NumPy.

The NumPy dot product gives the following result: if

    a.shape is (I, L)
    b.shape is (J, L, K)

then c = dot(a, b) will have c.shape = (I, J, K), with

    c[i, j, k] = sum(a[i, :] * b[j, :, k])

I'm not even sure what Numeric is computing in this case.

-Travis
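[A quick check of the shape rule Travis states; the sizes are arbitrary:]

    import numpy as np

    a = np.random.rand(2, 4)       # shape (I, L)
    b = np.random.rand(3, 4, 5)    # shape (J, L, K)
    c = np.dot(a, b)

    assert c.shape == (2, 3, 5)    # (I, J, K)
    assert np.allclose(c[1, 2, 3], np.sum(a[1, :] * b[2, :, 3]))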
From: Thomas H. <th...@py...> - 2006-07-18 08:33:15

Building numpy from svn on Windows with Python 2.4.3 fails:

    c:\svn\numpy\numpy\core\include\numpy\arrayobject.h(986) :
    fatal error C1017: invalid integer constant expression

When I change this line in the generated config.h file:

    #define NPY_ALLOW_THREADS WITH_THREAD

to this one:

    #define NPY_ALLOW_THREADS 1

then I can build.

Thomas
From: Alan G I. <ai...@am...> - 2006-07-18 08:02:50

On Mon, 17 Jul 2006, John Lawless apparently wrote:
> >>> from scipy import *
> >>> a = array((1.2))
> >>> a += 1.3j
> >>> a
> array(1.2)
>
> Shouldn't this generate either an error or an up-cast, rather than
> silently discarding the imaginary part?

As I understand it: it cannot upcast, as the '+=' operation will use only
the memory initially allocated for a.

Cheers,
Alan Isaac
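[John's example in runnable form. Note the silent discard discussed here
is historical behaviour; recent NumPy versions raise a casting error for
the in-place case instead:]

    import numpy as np

    a = np.array(1.2)     # float64 zero-d array
    b = a + 1.3j          # the ordinary add upcasts: complex128, (1.2+1.3j)
    print(b.dtype, b)

    # a += 1.3j           # the in-place add must reuse a's float64 buffer:
    #                     # old NumPy silently dropped the imaginary part;
    #                     # modern NumPy raises a UFuncTypeError here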
From: Steffen L. <ste...@gm...> - 2006-07-18 07:24:50

> I also placed in hooks so you can replace the scalarmath (for int,
> float, and complex) with the Python version of math (this works because
> the int, float, and complex scalars are sub-classes of the
> corresponding Python object).

Just for completeness, some more tests using pythonmath/scalarmath for
int, float, or both (in usec per loop; columns are "sin - array" and
"mod - array"):

(a) no import of numpy.core.scalarmath:
    numpy 0.9.9.2800                  152    76.5
    numpy 0.9.9.2800 + math           50.2

(b) use_pythonmath(xx):
    numpy 0.9.9.2800 (int)            107    60.4
    numpy 0.9.9.2800 + math           32.7
    numpy 0.9.9.2800 (float)          148    43
    numpy 0.9.9.2800 + math           50.7
    numpy 0.9.9.2800 (int, float)     109    26.5
    numpy 0.9.9.2800 + math           32.4

(c) use_scalarmath(xx):
    numpy 0.9.9.2800 (int)            149    77.1
    numpy 0.9.9.2800 + math           50.7
    numpy 0.9.9.2800 (float)          147    74.3
    numpy 0.9.9.2800 + math           50.7
    numpy 0.9.9.2800 (int, float)     148    73.5
    numpy 0.9.9.2800 + math           50.8

Maybe use_pythonmath(int, float, complex) should be set as default?

Many thanks,
Steffen
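[The use_pythonmath/use_scalarmath hooks above belong to the 2006
development tree. As a rough, hedged analogue, Python's math can be timed
against numpy scalar math directly; the numbers are machine-dependent:]

    import timeit

    py = timeit.timeit("math.sin(0.5)", setup="import math", number=10**6)
    nq = timeit.timeit("numpy.sin(0.5)", setup="import numpy", number=10**6)
    print(f"math.sin: {py:.2f}s   numpy.sin: {nq:.2f}s  (per 10^6 calls)")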
From: Nick F. <nv...@MI...> - 2006-07-18 03:31:03

On Jul 16, 2006, at 12:01 AM, Travis Oliphant wrote:
> Thanks for the decorator. This should be put on the www.scipy.org wiki.

I've been looking over the wiki and am not sure where the best place
would be for such a snippet. Would it go with the numpy examples under
vectorize, or perhaps in a cookbook somewhere? This seems more
specialized than the basic numpy examples, but not worthy of its own
cookbook.

In general, what do you do with constructs that seem useful, but aren't
useful enough to just include somewhere in NumPy/SciPy? How would anyone
think to look for a tip like this?

Also, thanks for your helpful responses, and additional thanks to Travis
for the book update.

Take care,
Nick
From: Travis O. <oli...@ie...> - 2006-07-18 03:01:10

I'd like to make release 1.0beta on Thursday. Please submit bug reports
and fixes before then.

-Travis
From: Travis O. <oli...@ie...> - 2006-07-18 00:11:34

Bill Baxter wrote:
> On 7/18/06, Keith Goodman <kwg...@gm...> wrote:
>> On 7/17/06, Travis Oliphant <oli...@ie...> wrote:
>>> Keith Goodman wrote:
>>
>> Does anyone out there save the print defaults across sessions? How do
>> you do it?
>>
>> Does numpy look for any startup files (~/.numpyrc)?
>
> If you set a PYTHONSTARTUP environment variable then Python will run
> the script it points to at startup of interactive sessions.
>
> But I wonder if maybe some numpy-specific startup script would be
> good. My current PYTHONSTARTUP is importing numpy, just so I can
> define some numpy shortcuts for my interactive sessions. That's fine
> if you always use Numpy in your interactive sessions, but if not, then
> it's just dead weight.

The standard answer is to use ipython, which gives you a wealth of
startup options.

-Travis
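[A minimal sketch of the PYTHONSTARTUP approach from the quoted message;
the file name is an arbitrary choice. Point the environment variable at a
script, e.g. export PYTHONSTARTUP=~/.pystartup.py, containing:]

    # ~/.pystartup.py -- runs at the start of every interactive session
    import numpy as np

    # session-wide print defaults (see the set_printoptions thread below)
    np.set_printoptions(precision=4, threshold=2000)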
From: Travis O. <oli...@ie...> - 2006-07-18 00:10:38

John Lawless wrote:
> Travis,
>
> Thanks!
>
> 1). I haven't found any documentation on dtype='O'. (I purchased your
>     trelgol book but it hasn't arrived yet.) Does 'O' guarantee no
>     wrong answers?

The object data-type uses Python objects instead of low-level C types for
the calculations. So it gives the same calculations that Python would do
(but of course it's much slower).

> 2). My actual code was more complex than the example I posted. It was
>     giving correct answers until I increased the dataset size. Then,
>     luckily, the result became obviously wrong. I can go through a code
>     and try to coerce everything to double but, when debugging a large
>     code, how can one ever be sure that all types are coerced correctly
>     if no errors are generated?

NumPy uses C data-types for calculations. It is therefore *much* faster,
but you have to take precautions about overflowing on integer operations.

> 3). AFAIK, checking for overflows should take no CPU time whatsoever
>     unless an exception is actually generated.

This is true for floating-point operations, but you were doing integer
multiplication. There is no support for hardware multiply overflow in
NumPy (is there even such a thing?). Python checks for overflow on
integer arithmetic by doing some additional calculations. It would be
possible to add slower, integer overflow-checking ufuncs to NumPy if this
was desired, and you could replace the standard non-checking functions
pretty easily.

-Travis
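[The integer-versus-object trade-off Travis describes, in a short sketch;
the values are chosen to overflow a 32-bit int:]

    import numpy as np

    a = np.array([2_000_000_000], dtype=np.int32)
    print(a * 2)    # C int32 arithmetic wraps modulo 2**32: [-294967296]

    b = np.array([2_000_000_000], dtype=object)
    print(b * 2)    # Python integers, exact: [4000000000]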
From: Bill B. <wb...@gm...> - 2006-07-18 00:06:22

On 7/18/06, Keith Goodman <kwg...@gm...> wrote:
> On 7/17/06, Travis Oliphant <oli...@ie...> wrote:
>> Keith Goodman wrote:
>
> Does anyone out there save the print defaults across sessions? How do
> you do it?
>
> Does numpy look for any startup files (~/.numpyrc)?

If you set a PYTHONSTARTUP environment variable then Python will run the
script it points to at startup of interactive sessions.

But I wonder if maybe some numpy-specific startup script would be good.
My current PYTHONSTARTUP is importing numpy, just so I can define some
numpy shortcuts for my interactive sessions. That's fine if you always
use Numpy in your interactive sessions, but if not, then it's just dead
weight.

--bb
From: Keith G. <kwg...@gm...> - 2006-07-17 23:37:15

On 7/17/06, Travis Oliphant <oli...@ie...> wrote:
> Keith Goodman wrote:
>> How do you display all of the rows of a matrix?
>
> help(numpy.set_printoptions)

That is great! Now I can change the precision as well. Eight significant
figures is too precise for me.

Does anyone out there save the print defaults across sessions? How do you
do it?

Does numpy look for any startup files (~/.numpyrc)?
From: Travis O. <oli...@ie...> - 2006-07-17 23:00:48

Keith Goodman wrote:
> How do you display all of the rows of a matrix?

help(numpy.set_printoptions)

Look at the threshold keyword:

    numpy.set_printoptions(threshold=2000)

for your example.

-Travis
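[Applied to the matrix in question: 334*3 = 1002 elements, so any
threshold above that prints the array in full:]

    import numpy as np

    x = np.zeros((334, 3))
    np.set_printoptions(threshold=2000)
    print(x)    # no longer summarized with "..."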
From: Keith G. <kwg...@gm...> - 2006-07-17 22:52:30

How do you display all of the rows of a matrix?

    >> x = zeros((334,3))
    >> x
    matrix([[ 0.,  0.,  0.],
            [ 0.,  0.,  0.],
            [ 0.,  0.,  0.],
            ...,
            [ 0.,  0.,  0.],
            [ 0.,  0.,  0.],
            [ 0.,  0.,  0.]])
From: Sebastian H. <ha...@ms...> - 2006-07-17 20:48:06

On Monday 17 July 2006 12:38, Travis Oliphant wrote:
> Sebastian Haase wrote:
>> On Monday 17 July 2006 12:10, Travis Oliphant wrote:
>>> Sebastian Haase wrote:
>>>> Traceback (most recent call last):
>>>>   File "<input>", line 1, in ?
>>>> TypeError: array cannot be safely cast to required type
>>>>
>>>>>>> dd = d.astype(N.float32)
>>>>>>> N.dot(dd, ccc)
>>>> [[[ 1.  1.  1.]
>>>>   [ 1.  1.  1.]
>>>>   [ 1.  1.  1.]]
>>>>
>>>>  [[ 2.  2.  2.]
>>>>   [ 2.  2.  2.]
>>>>   [ 2.  2.  2.]]]
>>>>
>>>> The TypeError looks like a numpy bug!
>>>
>>> I don't see why this is a bug. You are trying to coerce a 32-bit
>>> integer to a 32-bit float. That is going to lose precision, and so
>>> you get the error indicated.
>>>
>>> -Travis
>>
>> In numarray I do not get an error. Would the error go away if I had
>> 64-bit float!? It seems though that having ones and twos in an int
>> array should fit just fine into a float32 array!?
>
> This could be considered a bug in numarray. It's force-casting the
> result. That isn't the normal behavior of mixed-type functions.
>
> Also, the policy on type-casting is not to search the array to see if
> it's possible to do the conversion on every element (that would be
> slow on large arrays). The policy is to base the decision only on the
> data-types themselves (i.e. whether it's *possible* to lose
> precision*).
>
> -Travis
>
> *There is one exception to this policy in NumPy: 64-bit integers are
> allowed to be cast to 64-bit doubles --- otherwise you would get a lot
> of non-standard long doubles showing up on 64-bit systems. This policy
> was decided after discussion last year.

OK - understood. Combining int32 with float64 proves to be less
cumbersome...

Any idea on my main question? What is the dot product of a 2x2 and 3x2x3
supposed to look like? Why are numarray and numpy giving different
answers??

Thanks,
Sebastian Haase
From: Travis O. <oli...@ie...> - 2006-07-17 19:38:33

Sebastian Haase wrote:
> On Monday 17 July 2006 12:10, Travis Oliphant wrote:
>> Sebastian Haase wrote:
>>> Traceback (most recent call last):
>>>   File "<input>", line 1, in ?
>>> TypeError: array cannot be safely cast to required type
>>>
>>>>>> dd = d.astype(N.float32)
>>>>>> N.dot(dd, ccc)
>>> [[[ 1.  1.  1.]
>>>   [ 1.  1.  1.]
>>>   [ 1.  1.  1.]]
>>>
>>>  [[ 2.  2.  2.]
>>>   [ 2.  2.  2.]
>>>   [ 2.  2.  2.]]]
>>>
>>> The TypeError looks like a numpy bug!
>>
>> I don't see why this is a bug. You are trying to coerce a 32-bit
>> integer to a 32-bit float. That is going to lose precision, and so
>> you get the error indicated.
>>
>> -Travis
>
> In numarray I do not get an error. Would the error go away if I had
> 64-bit float!? It seems though that having ones and twos in an int
> array should fit just fine into a float32 array!?

This could be considered a bug in numarray. It's force-casting the
result. That isn't the normal behavior of mixed-type functions.

Also, the policy on type-casting is not to search the array to see if
it's possible to do the conversion on every element (that would be slow
on large arrays). The policy is to base the decision only on the
data-types themselves (i.e. whether it's *possible* to lose precision*).

-Travis

*There is one exception to this policy in NumPy: 64-bit integers are
allowed to be cast to 64-bit doubles --- otherwise you would get a lot of
non-standard long doubles showing up on 64-bit systems. This policy was
decided after discussion last year.
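[The policy in Travis's footnote can be queried with np.can_cast, which,
as described, looks only at the dtypes and never at the array's values:]

    import numpy as np

    print(np.can_cast(np.int32, np.float32))   # False: may lose precision
    print(np.can_cast(np.int32, np.float64))   # True
    print(np.can_cast(np.int64, np.float64))   # True: the special exception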
From: Sebastian H. <ha...@ms...> - 2006-07-17 19:26:51

On Monday 17 July 2006 12:10, Travis Oliphant wrote:
> Sebastian Haase wrote:
>> Traceback (most recent call last):
>>   File "<input>", line 1, in ?
>> TypeError: array cannot be safely cast to required type
>>
>>>>> dd = d.astype(N.float32)
>>>>> N.dot(dd, ccc)
>> [[[ 1.  1.  1.]
>>   [ 1.  1.  1.]
>>   [ 1.  1.  1.]]
>>
>>  [[ 2.  2.  2.]
>>   [ 2.  2.  2.]
>>   [ 2.  2.  2.]]]
>>
>> The TypeError looks like a numpy bug!
>
> I don't see why this is a bug. You are trying to coerce a 32-bit
> integer to a 32-bit float. That is going to lose precision, and so you
> get the error indicated.
>
> -Travis

In numarray I do not get an error. Would the error go away if I had
64-bit float!? It seems though that having ones and twos in an int array
should fit just fine into a float32 array!?

- Sebastian Haase