From: Francesc A. <fa...@op...> - 2003-02-04 09:14:18
Hi,

It seems that recarray doesn't support more than 1-D numarray arrays as
fields. Is that a fundamental limitation? If not, do you plan to support
arbitrary dimensions in the future?

Thanks,

--
Francesc Alted
From: Todd M. <jm...@st...> - 2003-02-04 12:04:55
Francesc Alted wrote:

> Hi,
>
> It seems that recarray doesn't support more than 1-D numarray arrays as
> fields. Is that a fundamental limitation?

I don't think it is fundamental, merely a question of what is needed and
works easily. I see two problems with multi-d numarray fields, both
solvable:

1. Multidimensional numarrays must be described in the recarray spec.

2. Either numarray or recarray must be able to handle a (slightly) more
   complicated case of recomputing array strides from shape and
   (bytestride, record-length).

I didn't design or implement recarray so there may be other problems as
well.

> If not, do you plan to support
> arbitrary dimensions in the future?

I don't think it's a priority now. What do you need them for?

> Thanks,

Regards,
Todd
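To make Todd's point 2 concrete, here is a minimal sketch of how the strides of a multidimensional field could be recomputed from its shape and the record length. It is plain Python and does not use numarray or recarray; the function names, the 113-byte unpadded record, and the field offset of 1 are all assumptions chosen for illustration, not the actual recarray implementation.

```python
def contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered (row-major) block with the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def field_strides(field_shape, itemsize, record_length):
    """Strides for viewing one field of every record as a single array.

    The resulting array has shape (nrecords,) + field_shape: stepping to the
    next record jumps record_length bytes, while steps inside the field
    follow the usual contiguous layout.
    """
    return (record_length,) + contiguous_strides(field_shape, itemsize)


def element_offset(index, strides, field_offset=0):
    """Byte offset of one element, counted from the start of the buffer."""
    return field_offset + sum(i * s for i, s in zip(index, strides))


# Assumed layout: unpadded 113-byte records in which a (3, 4) block of
# 4-byte integers starts at byte 1 of each record.
strides = field_strides((3, 4), 4, 113)
print(strides)                                # (113, 16, 4)
print(element_offset((2, 1, 3), strides, 1))  # 2*113 + 1*16 + 3*4 + 1 = 255
```

The only difference from the 1-D case is the extra inner strides; the outer stride is still the record length, which is why the change looks modest.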
From: Francesc A. <fa...@op...> - 2003-02-04 13:00:49
On Tuesday 04 February 2003 13:19, Todd Miller wrote:
> I see two problems with multi-d numarray fields, both
> solvable:
>
> 1. Multidimensional numarrays must be described in the recarray spec.
>
> 2. Either numarray or recarray must be able to handle a (slightly) more
> complicated case of recomputing array strides from shape and
> (bytestride,record-length).
>
> I didn't design or implement recarray so there may be other problems as
> well.

I had a look at the code and it seems like you are right.

> I don't think it's a priority now. What do you need them for?

Well, I've adopted the recarray object (actually a slightly modified version
of it) as a fundamental building block in the next release of PyTables. If
arbitrary dimensionality were implemented, the resulting tables would be
more general. Moreover, I'm thinking about implementing support for arrays
with one unlimited dimension (just one axis), and having a degenerate
recarray whose single column is a multidimensional numarray object would
ease the implementation quite a lot.

Of course, I could implement my own recarray version with that support, but
I just don't want to diverge so much from the reference implementation.

--
Francesc Alted
From: Perry G. <pe...@st...> - 2003-02-04 15:40:21
> On Tuesday 04 February 2003 13:19, Todd Miller wrote:
> > I see two problems with multi-d numarray fields, both
> > solvable:
> >
> > 1. Multidimensional numarrays must be described in the recarray spec.
> >
> > 2. Either numarray or recarray must be able to handle a (slightly) more
> > complicated case of recomputing array strides from shape and
> > (bytestride,record-length).
> >
> > I didn't design or implement recarray so there may be other problems as
> > well.
>
> I had a look at the code and it seems like you are right.
>
> > I don't think it's a priority now. What do you need them for?
>
> Well, I've adopted the recarray object (actually a slightly modified
> version of it) as a fundamental building block in the next release of
> PyTables. If arbitrary dimensionality were implemented, the resulting
> tables would be more general. Moreover, I'm thinking about implementing
> support for arrays with one unlimited dimension (just one axis), and
> having a degenerate recarray whose single column is a multidimensional
> numarray object would ease the implementation quite a lot.
>
> Of course, I could implement my own recarray version with that support,
> but I just don't want to diverge so much from the reference
> implementation.
>
> --
> Francesc Alted

As Todd says, the initial implementation was to support only 1-d cases.
There is no fundamental reason why it shouldn't support the general case.
We'd like to work with you about how that should be best implemented.
Basically the issue is how we save the shape information for that field.
I don't think it would be hard to implement.

Perry
From: Francesc A. <fa...@op...> - 2003-02-04 18:05:15
On Tuesday 04 February 2003 16:40, Perry Greenfield wrote:
> We'd like to work with you about how that should be best implemented.
> Basically the issue is how we save the shape information for that field.
> I don't think it would be hard to implement.

Ok, great! Well, my proposals for an extended recarray syntax are:

1.- Extend the current formats to read something like:

    ['(1,)i1', '(3,4)i4', '(16,)a', '(2,3,4)i2']

    Pros:
    - It's the straightforward extension of the current format
    - Should be easy to implement
    - Note that the charcodes have been replaced by a slightly more
      verbose version ('i2' instead of 's', for example)
    - Short and simple

    Cons:
    - It is still string-code based
    - Implicit field order

2.- Make use of the syntax I suggested in past messages:

    class Particle(IsRecord):
        name     = Col(CharType, (16,), dflt="", pos=3)  # 16-character String
        ADCcount = Col(Int8, (1,), dflt=0, pos=1)        # signed byte
        TDCcount = Col(Int32, (3,4), dflt=0, pos=2)      # signed integer
        grid_i   = Col(Int16, (2,3,4), dflt=0, pos=4)    # signed short integer

    Pros:
    - It gets rid of charcodes or string codes
    - The map between name and type is visually clear
    - Explicit field order
    - The columns can be defined as __slots__ in the class constructor,
      making it impossible to assign (through __setattr__, for example)
      values to non-existing columns
    - Is elegant (IMO)

    Cons:
    - Requires more typing to define
    - Not as concise as 1) (but a short representation can be made inside
      IsRecord!)
    - Difficult to define dynamically

3.- Similar to 2), but with a dictionary like:

    Particle = {
        "name"     : Col(CharType, (16,), dflt="", pos=3),  # 16-character String
        "ADCcount" : Col(Int8, (1,), dflt=0, pos=1),        # signed byte
        "TDCcount" : Col(Int32, (3,4), dflt=0, pos=2),      # signed integer
        "grid_i"   : Col(Int16, (2,3,4), dflt=0, pos=4),    # signed short integer
    }

    Pros:
    - It gets rid of charcodes or string codes
    - The map between name and type is visually clear
    - Explicit field order
    - Easy to build dynamically

    Cons:
    - No possibility to define __slots__
    - Not as elegant as 2), but it looks fine

4.- List-based approach:

    Particle = [
        Col(Int8, (1,), dflt=0),        # signed byte
        Col(Int32, (3,4), dflt=0),      # signed integer
        Col(CharType, (16,), dflt=""),  # 16-character String
        Col(Int16, (2,3,4), dflt=0),    # signed short integer
    ]

    Pros:
    - Costs less to type (less verbose)
    - Easy to build dynamically

    Cons:
    - Implicit field order
    - Map between field names and contents not visually clear

Note: In the previous discussion explicit order was considered better than
implicit, following the Python mantra, and although some people may think
that this doesn't apply well here, I do (but, again, this is purely
subjective).

Of course, a combination of two alternatives may be the best. My current
experience tells me that a combination of 2 and 3 may be very good. That
way, users can define their recarrays as classes, but if they need to
define them dynamically, the recarray constructor can also accept a
dictionary like 3 (but, obviously, the same applies to case 4).

In the end, the recarray instance should have a variable that points to
this definition class, where the metadata is kept, but a shortcut in the
form of 1) can also be constructed for convenience.

IMO integrating options 2 and 3 (even 4) is not difficult to implement,
and in fact such a combination is already present in the PyTables CVS
version. I might even provide a recarray version with 2 & 3 integrated for
developers to evaluate.

Comments?

--
Francesc Alted
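As a rough illustration of how proposals 1 and 3 relate, the sketch below parses the string formats of option 1 and derives the same list from a dictionary of columns ordered by `pos`. It is plain Python with no numarray dependency. The `Col` named tuple is a hypothetical stand-in for the descriptor in the proposals and takes a string code instead of a type object such as Int8; treating a code without a size (such as 'a') as one byte per element is likewise an assumption.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for the proposed Col descriptor: the type is given
# as a string code ('i4') rather than a numarray type object (Int32).
Col = namedtuple("Col", ["code", "shape", "dflt", "pos"])

_FORMAT = re.compile(r"^\(([\d,\s]*)\)([A-Za-z])(\d*)$")


def parse_format(fmt):
    """Split a format such as '(3,4)i4' into ((3, 4), 'i', 4)."""
    match = _FORMAT.match(fmt.strip())
    if match is None:
        raise ValueError("unrecognized format: %r" % (fmt,))
    dims, code, size = match.groups()
    shape = tuple(int(d) for d in dims.split(",") if d.strip())
    # Assumption: a code without a size, such as 'a', is one byte per element.
    itemsize = int(size) if size else 1
    return shape, code, itemsize


def field_nbytes(fmt):
    """Bytes occupied by one field, ignoring any alignment padding."""
    shape, _, itemsize = parse_format(fmt)
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize


def format_shape(shape):
    """Render (16,) as '(16,)' and (3, 4) as '(3,4)'."""
    body = ",".join(str(d) for d in shape)
    return "(%s,)" % body if len(shape) == 1 else "(%s)" % body


def formats_from_columns(columns):
    """Turn a dict of Col entries (option 3) into names and formats (option 1)."""
    ordered = sorted(columns.items(), key=lambda item: item[1].pos)
    names = [name for name, _ in ordered]
    formats = [format_shape(col.shape) + col.code for _, col in ordered]
    return names, formats


particle = {
    "name": Col("a", (16,), "", 3),
    "ADCcount": Col("i1", (1,), 0, 1),
    "TDCcount": Col("i4", (3, 4), 0, 2),
    "grid_i": Col("i2", (2, 3, 4), 0, 4),
}

names, formats = formats_from_columns(particle)
print(names)    # ['ADCcount', 'TDCcount', 'name', 'grid_i']
print(formats)  # ['(1,)i1', '(3,4)i4', '(16,)a', '(2,3,4)i2']
print(sum(field_nbytes(f) for f in formats))  # 1 + 48 + 16 + 48 = 113
```

Because the explicit `pos` values fix the field order, the dictionary form can be reduced to the compact string list whenever the short representation is wanted.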