From: N. V. <mit...@we...> - 2006-01-25 07:02:05
Hello everyone on the list!
I have been playing around with the latest and greatest numpy 0.9.4 and
its dtype mechanism. I am especially interested in using the
record-array-like facilities, e.g. the following which is based on an
example from a mail of Travis to this list:
<--
import numpy

# define array with three "columns".
dtype = numpy.dtype( {'names': ['name', 'age', 'weight'],
'formats': ['U30', 'i2', numpy.float32]} )
a = numpy.array( [(u'Bill', 31, 260), (u'Fred', 15, 135)], dtype=dtype )
# specify column by key
print a['name']
print a['age']
print a['weight']
#print a['object']
# specify row by number
print a[0]
print a[1]
# 'age' item of row 1 (Fred's age)
print a[1]['age']
# first item of name column (name 'Bill')
print a['name'][0]
-->
I now have a few questions, maybe someone can help me with them:
1) When reading the sample chapter from Travis' documentation, I noticed
that there is also a type 'object' with the character 'O'. So I kind of
hoped that it would be possible to have arbitrary python objects in an
array. However, when I add a fourth "column" of type 'O', then numpy
will mem-fault. Is this not allowed or is this some implementation bug?
2) Is it possible to rename the type descriptors? For my application, I
need to treat these names as keys of dataset columns, so it should be
possible to rename these. More generally speaking: Is it possible to
alter parts of the dtype after instantiation? Of course it should be
possible to copy the dtype, modify it accordingly and create a new
array. However, maybe there is a suggested way of doing this?
3) When I use two identical entries in the names part of the dtype, I
get the message 'TypeError: expected a readable buffer object'. It makes
sense that it is not allowed to have two identical names, but I think
the error message should be more descriptive.
4) In the example above, printing any of the strings via 'print' yields
the actual characters followed by \x00 padding up to the full string
size, e.g.
u'Bill\x00\x00\x00\x00\x00\x00\x00.... (30 characters total)'
Why doesn't 'print' terminate the output when the first \x00 is reached?
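For comparison, the four questions above can be sketched against a recent NumPy running under Python 3. This is an assumption about later releases, not a description of the release discussed in this mail, and the names `fields`, `with_obj`, `b`, `new_dtype` and `renamed` are made up for illustration only.

```python
import numpy

# Assumption: a recent NumPy under Python 3, not the release from this mail.
fields = {'names': ['name', 'age', 'weight'],
          'formats': ['U30', 'i2', numpy.float32]}
a = numpy.array([('Bill', 31, 260), ('Fred', 15, 135)],
                dtype=numpy.dtype(fields))

# Question 1: a fourth "column" of type 'O' holding arbitrary Python objects.
with_obj = numpy.dtype({'names': fields['names'] + ['extra'],
                        'formats': fields['formats'] + ['O']})
b = numpy.array([('Bill', 31, 260, {'id': 1}),
                 ('Fred', 15, 135, [1, 2, 3])], dtype=with_obj)
extra_types = [type(x).__name__ for x in b['extra']]
print(extra_types)

# Question 2: "rename" a column by building a dtype with the same formats
# but different names, then viewing the existing data through it.
new_dtype = numpy.dtype({'names': ['name', 'years', 'weight'],
                         'formats': fields['formats']})
renamed = a.view(new_dtype)  # shares memory with a; only the names differ
print(renamed['years'])

# Question 3: duplicate names are rejected when the dtype is constructed.
# Both exception types are caught because the exact type may vary by version.
try:
    numpy.dtype({'names': ['name', 'name'], 'formats': ['U30', 'i2']})
    duplicate_error = None
except (TypeError, ValueError) as exc:
    duplicate_error = exc
print(duplicate_error)

# Question 4: a single string item comes back without trailing \x00 padding.
first_name = a['name'][0]
print(repr(first_name), len(first_name))
```

The view in the second step leaves the original array untouched, so both sets of names stay usable side by side; a copy would be needed if the two arrays should not share memory.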
Overall I am very much impressed by the new numpy and I thank everyone
who contributes to this!
Niklas Volbers.