From: Jonathan W. <jon...@gm...> - 2006-10-27 21:08:11
On 10/27/06, Travis Oliphant <oli...@ie...> wrote:
>
> Jonathan Wang wrote:
> > On 10/27/06, Travis Oliphant <oli...@ie...> wrote:
> >
> > > > If I redefine the string function, I encounter another, perhaps more
> > > > serious problem leading to a segfault. I've defined my string function
> > > > to be extremely simple:
> > > > >>> def printer(arr):
> > > > ...     return str(arr[0])
> > > >
> > > > Now, if I try to print an element of the array:
> > > > >>> mxArr[0]
> > > >
> > > > I get to this stack trace:
> > > > #0 scalar_value (scalar=0x814be10, descr=0x5079e0) at scalartypes.inc.src:68
> > > > #1 0x0079936a in PyArray_Scalar (data=0x814cf98, descr=0x5079e0, base=0x814e7a8) at arrayobject.c:1419
> > > > #2 0x007d259f in array_subscript_nice (self=0x814e7a8, op=0x804eb8c) at arrayobject.c:1985
> > > > #3 0x00d17dde in PyObject_GetItem (o=0x814e7a8, key=0x804eb8c) at Objects/abstract.c:94
> > > >
> > > > (Note: for some reason gdb claims that arrayobject.c:1985 is
> > > > array_subscript_nice, but looking at my source this line is actually
> > > > in array_item_nice. *boggle*)
> > > >
> > > > But scalar_value returns NULL for all non-native types. So, destptr in
> > > > PyArray_Scalar is set to NULL, and the call the copyswap segfaults.
> > > >
> > > > Perhaps scalar_value should be checking the scalarkind field of
> > > > PyArray_Descr, or using the elsize and alignment fields to figure out
> > > > the pointer to return if scalarkind isn't set?
> > >
> > > Hmmm... It looks like the modifications to scalar_value did not take
> > > into account user-defined types. I've added a correction so that
> > > user-defined types will use setitem to set the scalar value into the
> > > array. Presumably your setitem function can handle setting the array
> > > with scalars of your new type?
> > >
> > > I've checked the changes into SVN.
> >
> > Do there also need to be changes in scalartypes.inc.src to use getitem
> > if a user-defined type does not inherit from a Numpy scalar?
>
> This needs to be clarified. I don't think it's possible to do it
> without inheriting from a numpy scalar at this point (the void numpy
> scalar can be inherited from and is pretty generic). I know I was not
> considering that case when I wrote the code.
>
> > i.e. at scalartypes.inc.src:114 we should return some pointer
> > calculated from the PyArray_Descr's elsize and alignment field to get
> > the destination for the "custom scalar" type to be copied.
>
> I think this is a good idea. I doubt it's enough to fix all places that
> don't inherit from numpy scalars, but it's a start.
>
> It seems like we need to figure out where the beginning of the data is
> for the type which is assumed to be defined on alignment boundaries
> after a PyObject_HEAD (right)? This could actually be used for
> everything and all the switch and if statements eliminated.
>
> I think the alignment field is the only thing needed, though. I don't
> see how I would use the elsize field?

Hmm, yeah, I guess alignment would be sufficient. Worst case, you could
delegate to setitem, right?

It would be useful to support arbitrary types. Suppose, for example, that I
wanted to make an array of structs. In keeping with the date/time example, I
might want to store a long and a double, the long for days in the Gregorian
calendar and the double for seconds from midnight on that day.

> > Furthermore it seems like the scalar conversions prefer the builtin
> > types, but it seems to me that the user-defined type should be preferred.
>
> I'm not sure what this means.
>
> > i.e. if I try to get an element from my mxDateTime array, I get a
> > float back:
> > >>> mxArr[0] = DateTime.now()
> > >>> mxArr[0][0]
> > 732610.60691268521
>
> Why can you index mxArr[0]? What is mxArr[0]? If it's a scalar, then
> why can you index it? What is type(mxArr[0])?

Ah, I am mistaken here - I am correctly getting my mxNumpyDateTime type back:
mxArr is a 1x1 matrix:

>>> mxArr = numpy.empty((1,1), dtype = libMxNumpy.type)
>>> mxArr[0] = DateTime.now()
>>> type(mxArr)
<type 'numpy.ndarray'>
>>> type(mxArr[0])
<type 'numpy.ndarray'>
>>> type(mxArr[0][0])
<type 'mxNumpyDateTime'>
>>> mxArr.shape
(1, 1)

> > But what I really want is the mxDateTime, which, oddly enough, is what
> > happens if I use tolist():
> > >>> mxArr.tolist()[0]
> > [<DateTime object for '2006-10-27 14:33:57.25' at a73c60>]
>
> That's not surprising because tolist just calls getitem on each element
> in the array to construct the list.

I guess this is a degenerate case, since I have getitem returning a
mxDateTime while the actual type of the elements in the array is
mxNumpyDateTime (i.e. mxNumpyType). Would the correct behavior, then, be for
getitem to return a mxNumpyDateTime and register the object cast function to
return a mxDateTime?

If I try to do math on the array, it seems like the operation is performed
via object pointers (mxDateTime - mxDateTime returns a DateTimeDelta object,
and mxNumpyDateTime is a float):

>>> mxArr = numpy.empty((1,1), dtype = libMxNumpy.type)
>>> mxArr[0][0] = DateTime.now()
>>> mxArr2 = numpy.empty((1,1), dtype = libMxNumpy.type)
>>> mxArr2[0][0] = DateTime.DateTimeFrom('2006-01-01')
>>> type(mxArr[0][0])
<type 'mxNumpyDateTime'>
>>> type(mxArr2[0][0])
<type 'mxNumpyDateTime'>
>>> sub = mxArr - mxArr2
>>> type(sub[0][0])
<type 'DateTimeDelta'>

I'm guessing I need to register ufunc loops for all the basic math on my
types?
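
To make the alignment point from earlier in the thread concrete, here is a rough sketch (not NumPy's actual code) of how the start of the data could be located from the header size and the type's alignment alone, using the long-plus-double date/time record as the element type. It uses only the standard-library ctypes module. Header is merely a stand-in for PyObject_HEAD, and DateTimeRecord, ScalarObject, data_offset and to_record are hypothetical names made up for the illustration. Python's toordinal() stands in for the Gregorian day count and may differ by a constant from mx.DateTime's convention.

```python
import ctypes
import datetime


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


class Header(ctypes.Structure):
    # Assumption: stand-in for PyObject_HEAD (reference count + type pointer).
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t),
                ("ob_type", ctypes.c_void_p)]


class DateTimeRecord(ctypes.Structure):
    # Hypothetical element type: a long for the Gregorian day number and
    # a double for seconds since midnight on that day.
    _fields_ = [("days", ctypes.c_long),
                ("seconds", ctypes.c_double)]


class ScalarObject(ctypes.Structure):
    # Hypothetical scalar object: the header followed directly by the data.
    _fields_ = [("head", Header),
                ("value", DateTimeRecord)]


def data_offset(header_type, value_type):
    """Offset of the data, derived only from header size and alignment."""
    return align_up(ctypes.sizeof(header_type), ctypes.alignment(value_type))


def to_record(dt):
    """Split a datetime into (day number, seconds since midnight)."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    return DateTimeRecord(dt.toordinal(), (dt - midnight).total_seconds())


# The computed offset and the offset ctypes lays out should agree.
print(data_offset(Header, DateTimeRecord), ScalarObject.value.offset)

rec = to_record(datetime.datetime(2006, 10, 27, 14, 33, 57, 250000))
print(rec.days, rec.seconds)
```

The sketch never consults the element size when locating the data, which fits the observation that the alignment field should be sufficient. The size would only matter for how many bytes to copy afterwards.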