From: Mathew Y. <my...@jp...> - 2006-11-15 20:28:43
|
Hi,

I'm running a 64-bit Python 2.5 on x86 Solaris. I have a function I call
over 2^32 times, and eventually I run out of memory. The function is:

    def make_B(deltadates):
        numcols = deltadates.shape[0]
        B = numpy.zeros((numcols, numcols))
        for ind in range(0, numcols):  # comment out this loop and all is good
            B[ind, 0:numcols] = deltadates[0:numcols]
        return B

If I comment out the loop lines, my memory is okay. I'm guessing that a
reference is being added to "deltadates" and that the reference count is
going above 2^32 and resetting. Anybody have any ideas about how I can
cure this? Is Numpy increasing the reference count here?

Mathew
|
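(The loop above assigns the same row `numcols` times; the same array can be built with no Python-level loop via broadcasting. A minimal sketch, not from the thread, shown only to illustrate the loop-free equivalent:)

```python
import numpy

def make_B(deltadates):
    # Every row of B is a copy of deltadates; one broadcast assignment
    # fills all rows, with no per-row slicing in Python.
    numcols = deltadates.shape[0]
    B = numpy.zeros((numcols, numcols))
    B[:, :] = deltadates  # broadcast shape (numcols,) across (numcols, numcols)
    return B

deltadates = numpy.arange(5, dtype=float)
B = make_B(deltadates)
```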
From: Robert K. <rob...@gm...> - 2006-11-15 20:34:53
|
Mathew Yeates wrote:
> Hi
> I'm running a 64-bit Python 2.5 on x86 Solaris. I have a function I
> call over 2^32 times and eventually I run out of memory.
>
> The function is
>
>     def make_B(deltadates):
>         numcols = deltadates.shape[0]
>         B = numpy.zeros((numcols, numcols))
>         for ind in range(0, numcols):  # comment out this loop and all is good
>             B[ind, 0:numcols] = deltadates[0:numcols]
>         return B
>
> If I comment out the loop lines, my memory is okay. I'm guessing that a
> reference is being added to "deltadates" and that the reference count is
> going above 2^32 and resetting. Anybody have any ideas about how I can
> cure this? Is Numpy increasing the reference count here?

Can you give us a small but complete and self-contained script that
demonstrates the problem?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
|
From: Mathew Y. <my...@jp...> - 2006-11-15 20:44:15
|
Robert Kern wrote:
> Mathew Yeates wrote:
>> [...]
>> If I comment out the loop lines, my memory is okay. I'm guessing that a
>> reference is being added to "deltadates" and that the reference count is
>> going above 2^32 and resetting. Anybody have any ideas about how I can
>> cure this? Is Numpy increasing the reference count here?
>
> Can you give us a small but complete and self-contained script that
> demonstrates the problem?

I'll try, but it's in a complex program. BTW - I tried

    B[ind, 0:numcols] = deltadates[0:numcols].copy()

but that didn't work either.

Mathew
|
From: Stefan v. d. W. <st...@su...> - 2006-11-15 20:52:20
|
On Wed, Nov 15, 2006 at 02:33:52PM -0600, Robert Kern wrote:
> Mathew Yeates wrote:
>> [...]
>> If I comment out the loop lines, my memory is okay. I'm guessing that a
>> reference is being added to "deltadates" and that the reference count is
>> going above 2^32 and resetting. Anybody have any ideas about how I can
>> cure this? Is Numpy increasing the reference count here?
>
> Can you give us a small but complete and self-contained script that
> demonstrates the problem?

I think this might be related to ticket #378:

http://projects.scipy.org/scipy/numpy/ticket/378

Cheers
Stéfan
|
From: Mathew Y. <my...@jp...> - 2006-11-15 21:35:34
Attachments:
memsuck.py
|
Stefan van der Walt wrote:
> [...]
> I think this might be related to ticket #378:
>
> http://projects.scipy.org/scipy/numpy/ticket/378

Okay, attached is the smallest program I could make. Before running, you
will need to create a file named "biggie" with 669009000 non-zero floats.

Mathew
|
From: Robert K. <rob...@gm...> - 2006-11-15 21:42:41
|
Mathew Yeates wrote:
> def delta2day1(delta):
>     return delta.days / 365.0
> deltas2days = numpy.frompyfunc(delta2day1, 1, 1)

If I had to guess where the problem is, it's here. frompyfunc() and
vectorize() have always been tricky beasts to get right.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
|
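(For context: `frompyfunc` always produces object-dtype arrays, which is one reason it interacts with Python reference counting in a way a plain float64 array does not. A minimal sketch of the call in question, assuming `timedelta` inputs to match `delta2day1`:)

```python
import numpy
from datetime import timedelta

def delta2day1(delta):
    return delta.days / 365.0

deltas2days = numpy.frompyfunc(delta2day1, 1, 1)

deltas = numpy.array([timedelta(days=365), timedelta(days=730)], dtype=object)
out = deltas2days(deltas)
# frompyfunc returns an object-dtype array: each element is a boxed
# Python float whose reference count numpy must manage, unlike the raw
# doubles stored in a float64 array.
```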
From: Tim H. <tim...@ie...> - 2006-11-15 22:15:48
|
Robert Kern wrote:
> Mathew Yeates wrote:
>> def delta2day1(delta):
>>     return delta.days / 365.0
>> deltas2days = numpy.frompyfunc(delta2day1, 1, 1)
>
> If I had to guess where the problem is, it's here. frompyfunc() and
> vectorize() have always been tricky beasts to get right.

<curmudgeon mode>

IMO, frompyfunc is an attractive nuisance. It doesn't magically make
scalar Python functions fast, as people seem to assume, and it prevents
people from figuring out how to write vectorized functions in the many
cases where that's practicable. And it sounds like it may be buggy to
boot. In this case, I don't know that you can easily vectorize this by
hand, but there are many ways that it could be rewritten to avoid
frompyfunc. For example:

    def deltas2days(seq):
        return numpy.fromiter((x.days for x in seq), dtype=float,
                              count=len(seq))

One line shorter, about equally opaque, and less likely to have
mysterious bugs.

</curmudgeon mode>

-tim
|
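(Tim's fromiter suggestion, filled out into runnable form. The division by 365.0 from `delta2day1` is folded into the generator here, which is an assumption: his one-liner returned raw day counts.)

```python
import numpy
from datetime import timedelta

def deltas2days(seq):
    # fromiter consumes the generator straight into a float64 buffer,
    # so no object-dtype intermediate array is ever created. count=
    # lets numpy preallocate the exact output size.
    return numpy.fromiter((x.days / 365.0 for x in seq),
                          dtype=float, count=len(seq))

days = deltas2days([timedelta(days=365), timedelta(days=730)])
```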
From: Mathew Y. <my...@jp...> - 2006-11-15 22:18:47
|
Robert Kern wrote:
> Mathew Yeates wrote:
>> def delta2day1(delta):
>>     return delta.days / 365.0
>> deltas2days = numpy.frompyfunc(delta2day1, 1, 1)
>
> If I had to guess where the problem is, it's here. frompyfunc() and
> vectorize() have always been tricky beasts to get right.

It appears the problem is, in fact, with frompyfunc. I'm still running,
but I'm not seeing the immediate loss of memory as I was before. I
replaced the frompyfunc with a simple loop.

Thanks for all the help.

Mathew
|
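(Mathew doesn't show his replacement loop; a plausible sketch of what "a simple loop" into a preallocated float array might look like. Hypothetical, not his actual code:)

```python
import numpy
from datetime import timedelta

def deltas2days(deltas):
    # Preallocate a float64 array and fill it element by element:
    # slower per element than a real ufunc, but it sidesteps
    # frompyfunc's object-array machinery entirely.
    out = numpy.empty(len(deltas), dtype=float)
    for i, d in enumerate(deltas):
        out[i] = d.days / 365.0
    return out

days = deltas2days([timedelta(days=365), timedelta(days=730)])
```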