From: Francesc A. <fa...@ca...> - 2006-08-11 20:42:01
|
Hi,

I was tracking down a memory leak in PyTables and it boiled down to a problem
in the array protocol. The issue is easily exposed by:

for i in range(1000000):
    numarray.array(numpy.zeros(dtype=numpy.float64, shape=3))

and looking at the memory consumption of the process. The same happens with:

for i in range(1000000):
    numarray.asarray(numpy.zeros(dtype=numpy.float64, shape=3))

However, the numpy<--numarray sense seems to work well.

for i in range(1000000):
    numpy.array(numarray.zeros(type="Float64", shape=3))

Using numarray 1.5.1 and numpy 1.0b1

I think this is a relatively important problem, because it somewhat prevents a
smooth transition from numarray to NumPy.

Thanks,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
|
From: Todd M. <jm...@st...> - 2006-08-11 21:13:42
|
Francesc Altet wrote:
> Hi,
>
> I was tracking down a memory leak in PyTables and it boiled down to a problem
> in the array protocol. The issue is easily exposed by:
>
> for i in range(1000000):
>     numarray.array(numpy.zeros(dtype=numpy.float64, shape=3))
>
> and looking at the memory consumption of the process. The same happens with:
>
> for i in range(1000000):
>     numarray.asarray(numpy.zeros(dtype=numpy.float64, shape=3))
>
> However, the numpy<--numarray sense seems to work well.
>
> for i in range(1000000):
>     numpy.array(numarray.zeros(type="Float64", shape=3))
>
> Using numarray 1.5.1 and numpy 1.0b1
>
> I think this is a relatively important problem, because it somewhat prevents a
> smooth transition from numarray to NumPy.
>
> Thanks,

I looked at this a little with a debug python and figure it's a bug in
numpy.zeros():

>>> numpy.zeros(dtype=numpy.float64, shape=3)
array([ 0., 0., 0.])
[147752 refs]
>>> numpy.zeros(dtype=numpy.float64, shape=3)
array([ 0., 0., 0.])
[147753 refs]
>>> numpy.zeros(dtype=numpy.float64, shape=3)
array([ 0., 0., 0.])
[147754 refs]

>>> numarray.array([1,2,3,4])
array([1, 2, 3, 4])
[147772 refs]
>>> numarray.array([1,2,3,4])
array([1, 2, 3, 4])
[147772 refs]
>>> numarray.array([1,2,3,4])
array([1, 2, 3, 4])
[147772 refs]

Regards,
Todd
|
From: Travis O. <oli...@ie...> - 2006-08-11 22:11:05
|
Todd Miller wrote:
> I looked at this a little with a debug python and figure it's a bug in
> numpy.zeros():

Hmmm. I thought of that, but could not get any memory leak by just creating
zeros in a for loop. In other words:

for i in xrange(10000000):
    numpy.zeros(dtype=numpy.float64, shape=3)

does not leak. So, it seems to be related to the array protocol. I have not
been able to spot what is going on, though. There does not seem to be any
reference-counting problem that I can see.

-Travis
|
From: Travis O. <oli...@ie...> - 2006-08-11 22:30:52
|
Francesc Altet wrote:
> Hi,
>
> I was tracking down a memory leak in PyTables and it boiled down to a problem
> in the array protocol. The issue is easily exposed by:
>
> for i in range(1000000):
>     numarray.array(numpy.zeros(dtype=numpy.float64, shape=3))

More data:

The following code does not leak:

import numpy
import sys
for i in xrange(10000000):
    a = numpy.zeros(dtype=numpy.float64,shape=3)
    b = a.__array_struct__

as verified by watching the memory growth.

As far as numpy knows this is all it's supposed to do. This seems to
indicate that something is going on inside numarray.array(a),
because once you add that line to the loop, memory consumption shows up.

In fact, you can just add the line

    a = _numarray._array_from_array_struct(a)

to see the memory growth problem.

-Travis
|
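Watching memory growth over a long loop, as described in this message, can be automated. The following is only a minimal sketch using the standard-library tracemalloc module, not what anyone in this thread ran. The names traced_growth, leaky_step and clean_step are hypothetical, and the two step functions are stand-ins for a leaking and a non-leaking conversion rather than real numarray or numpy calls. Note that tracemalloc only sees memory obtained through Python's own allocators, so a leak made with a raw C malloc() would not show up here.

```python
import gc
import tracemalloc


def traced_growth(step, iterations=10000):
    """Return the net growth, in bytes, of traced memory after calling
    step() `iterations` times. A value that scales with `iterations`
    suggests a leak; a value near zero suggests none."""
    tracemalloc.start()
    try:
        step()                      # warm-up call, so one-time caches are not counted
        gc.collect()
        before = tracemalloc.get_traced_memory()[0]
        for _count in range(iterations):
            step()
        gc.collect()
        after = tracemalloc.get_traced_memory()[0]
    finally:
        tracemalloc.stop()
    return after - before


_kept = []                          # hypothetical: simulates references never released


def leaky_step():
    # Stand-in for a conversion whose buffer is never released.
    _kept.append(bytearray(64))


def clean_step():
    # Stand-in for a conversion whose temporary is freed on every iteration.
    bytearray(64)


if __name__ == "__main__":
    print("leaky:", traced_growth(leaky_step), "bytes")
    print("clean:", traced_growth(clean_step), "bytes")
```

To probe a real case, step would wrap the single call under suspicion, so that adding or removing one line from the loop isolates the culprit in the same way as above.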
From: Todd M. <jm...@st...> - 2006-08-12 11:05:13
|
Travis Oliphant wrote:
> As far as numpy knows this is all it's supposed to do. This seems to
> indicate that something is going on inside numarray.array(a)
>
> because once you add that line to the loop, memory consumption shows up.
>
> In fact, you can just add the line
>
> a = _numarray._array_from_array_struct(a)

This does demonstrate a huge leak, which I'll look into. Thanks.

Regards,
Todd
|
From: Travis O. <oli...@ie...> - 2006-08-11 22:52:18
|
Francesc Altet wrote:
> Hi,
>
> I was tracking down a memory leak in PyTables and it boiled down to a problem
> in the array protocol. The issue is easily exposed by:
>
> for i in range(1000000):
>     numarray.array(numpy.zeros(dtype=numpy.float64, shape=3))
>
> and looking at the memory consumption of the process. The same happens with:
>
> for i in range(1000000):
>     numarray.asarray(numpy.zeros(dtype=numpy.float64, shape=3))
>
> However, the numpy<--numarray sense seems to work well.
>
> for i in range(1000000):
>     numpy.array(numarray.zeros(type="Float64", shape=3))
>
> Using numarray 1.5.1 and numpy 1.0b1
>
> I think this is a relatively important problem, because it somewhat prevents a
> smooth transition from numarray to NumPy.

I tracked the leak to the numarray function

    NA_FromDimsStridesDescrAndData

This function calls NA_NewAllFromBuffer with a brand-new buffer object
when data is passed in (like in the case with the array protocol). That
function then takes a reference to the buffer object but then the
calling function never releases the reference it already holds. This
creates the leak.

I added the line

    if (data) {Py_DECREF(buf);}

right after the call to NA_NewAllFromBuffer and the leak disappeared.

For what it's worth, I also think the base object for the new numarray
object should be the object passed in and not the C-object that is
created from it.

In other words in the NA_FromArrayStruct function

    a->base = cobj

should be replaced with

    Py_INCREF(obj)
    a->base = obj
    Py_DECREF(cobj)

Best,

-Travis
|
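The ownership mistake described in this message can be modelled in a few lines. What follows is a toy simulation of manual reference counting in plain Python, not numarray's C code: the class Counted and the functions incref, decref, new_array_from_buffer, destroy_array and from_data are all hypothetical names invented for illustration. The sketch only shows why the creator of the buffer has to drop its own reference once the new array holds one.

```python
class Counted:
    """Toy stand-in for a C-level object with a manual reference count."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1       # the creator holds the first reference
        self.freed = False


def incref(obj):
    obj.refcount += 1


def decref(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True        # models the deallocator running


def new_array_from_buffer(buf):
    """Models a constructor that takes its own reference to buf."""
    incref(buf)
    return {"data": buf}


def destroy_array(arr):
    """Models the array's deallocator releasing its buffer reference."""
    decref(arr["data"])


def from_data(release_callers_ref):
    """Models the calling function: it creates the buffer and hands it over."""
    buf = Counted("buf")                 # refcount == 1, held by this function
    arr = new_array_from_buffer(buf)     # refcount == 2, the array holds one too
    if release_callers_ref:
        decref(buf)                      # the fix: give up the creator's reference
    return arr, buf


if __name__ == "__main__":
    for fixed in (False, True):
        arr, buf = from_data(release_callers_ref=fixed)
        destroy_array(arr)
        print("fixed" if fixed else "buggy", "-> freed:", buf.freed,
              "refcount:", buf.refcount)
```

In the buggy variant the count never returns to zero after the array is destroyed, so the buffer is never freed. One such stranded buffer per conversion is consistent with the steady growth seen in the loop.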
From: Francesc A. <fa...@ca...> - 2006-08-12 15:54:01
|
On Saturday 12 August 2006 14:37, Todd Miller wrote:
> I agree with all of Travis' comments below and committed the suggested
> changes to numarray CVS. I found one other numarray change needed
> for Francesc's examples to run (apparently) leak-free:
>
> Py_INCREF(obj)
> Py_XDECREF(a->base)
> a->base = obj
> Py_DECREF(cobj)
>
> Thanks Travis!

Hey! I checked Travis' patch this morning and it seems to work well for me.
I'll add yours as well later on and see... BTW, where exactly do I have to
add the above lines?

Many thanks Travis and Todd. You are great!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
|
From: Todd M. <jm...@st...> - 2006-08-13 12:59:09
|
Francesc Altet wrote:
> On Saturday 12 August 2006 14:37, Todd Miller wrote:
>
>> I agree with all of Travis' comments below and committed the suggested
>> changes to numarray CVS. I found one other numarray change needed
>> for Francesc's examples to run (apparently) leak-free:
>>
>> Py_INCREF(obj)
>> Py_XDECREF(a->base)
>> a->base = obj
>> Py_DECREF(cobj)
>>
>> Thanks Travis!
>>
>
> Hey! I checked this morning Travis' patch and seems to work well for me. I'll
> add yours as well later on and see... BTW, where exactly I've to add the
> above lines?

The lines above are a modification to Travis' patch, so basically the
same place:

*******
      a = NA_FromDimsStridesTypeAndData(arrayif->nd, shape, strides, t,
                                        arrayif->data);
      if (!a) goto _fail;
!     a->base = cobj;
      return a;
-------
      a = NA_FromDimsStridesTypeAndData(arrayif->nd, shape, strides, t,
                                        arrayif->data);
      if (!a) goto _fail;
!     Py_INCREF(obj);
!     Py_XDECREF(a->base);
!     a->base = obj;
!     Py_DECREF(cobj);
      return a;

Todd
|
From: Todd M. <jm...@st...> - 2006-08-12 12:36:57
|
I agree with all of Travis' comments below and committed the suggested
changes to numarray CVS. I found one other numarray change needed
for Francesc's examples to run (apparently) leak-free:

    Py_INCREF(obj)
    Py_XDECREF(a->base)
    a->base = obj
    Py_DECREF(cobj)

Thanks Travis!

Regards,
Todd

Travis Oliphant wrote:
> Francesc Altet wrote:
>
>> Hi,
>>
>> I was tracking down a memory leak in PyTables and it boiled down to a problem
>> in the array protocol. The issue is easily exposed by:
>>
>> for i in range(1000000):
>>     numarray.array(numpy.zeros(dtype=numpy.float64, shape=3))
>>
>> and looking at the memory consumption of the process. The same happens with:
>>
>> for i in range(1000000):
>>     numarray.asarray(numpy.zeros(dtype=numpy.float64, shape=3))
>>
>> However, the numpy<--numarray sense seems to work well.
>>
>> for i in range(1000000):
>>     numpy.array(numarray.zeros(type="Float64", shape=3))
>>
>> Using numarray 1.5.1 and numpy 1.0b1
>>
>> I think this is a relatively important problem, because it somewhat prevents a
>> smooth transition from numarray to NumPy.
>>
>
> I tracked the leak to the numarray function
>
> NA_FromDimsStridesDescrAndData
>
> This function calls NA_NewAllFromBuffer with a brand-new buffer object
> when data is passed in (like in the case with the array protocol). That
> function then takes a reference to the buffer object but then the
> calling function never releases the reference it already holds. This
> creates the leak.
>
> I added the line
>
> if (data) {Py_DECREF(buf);}
>
> right after the call to NA_NewAllFromBuffer and the leak disappeared.
>
> For what it's worth, I also think the base object for the new numarray
> object should be the object passed in and not the C-object that is
> created from it.
>
> In other words in the NA_FromArrayStruct function
>
> a->base = cobj
>
> should be replaced with
>
> Py_INCREF(obj)
> a->base = obj
> Py_DECREF(cobj)
>
> Best,
>
> -Travis
|