From: Francesc A. <fa...@op...> - 2003-02-04 18:05:15
On Tuesday 04 February 2003 16:40, Perry Greenfield wrote:
> We'd like to work with you about how that should be best implemented.
> Basically the issue is how we save the shape information for that field.
> I don't think it would be hard to implement.

Ok, great! Well, my proposals for extended recarray syntax are:

1.- Extend the current formats to read something like:

    ['(1,)i1', '(3,4)i4', '(16,)a', '(2,3,4)i2']

Pro's:
  - It's the straightforward extension of the current format
  - Should be easy to implement
  - Note that the charcodes have been replaced by a slightly more verbose
    version ('i2' instead of 's', for example)
  - Short and simple
Con's:
  - It is still string-code based
  - Implicit field order

2.- Make use of the syntax I suggested in past messages:

    class Particle(IsRecord):
        name     = Col(CharType, (16,), dflt="", pos=3)   # 16-character String
        ADCcount = Col(Int8, (1,), dflt=0, pos=1)         # signed byte
        TDCcount = Col(Int32, (3,4), dflt=0, pos=2)       # signed integer
        grid_i   = Col(Int16, (2,3,4), dflt=0, pos=4)     # signed short integer

Pro's:
  - It gets rid of charcodes or string codes
  - The map between name and type is visually clear
  - Explicit field order
  - The columns can be defined as __slots__ in the class constructor,
    making it impossible to assign values (through __setattr__, for
    example) to non-existing columns
  - Is elegant (IMO)
Con's:
  - Requires more typing to define
  - Not as concise as 1) (but a short representation can be made inside
    IsRecord!)
  - Difficult to define dynamically

3.- Similar to 2), but with a dictionary like:

    Particle = {
        "name"     : Col(CharType, (16,), dflt="", pos=3),   # 16-character String
        "ADCcount" : Col(Int8, (1,), dflt=0, pos=1),         # signed byte
        "TDCcount" : Col(Int32, (3,4), dflt=0, pos=2),       # signed integer
        "grid_i"   : Col(Int16, (2,3,4), dflt=0, pos=4),     # signed short integer
    }

Pro's:
  - It gets rid of charcodes or string codes
  - The map between name and type is visually clear
  - Explicit field order
  - Easy to build dynamically
Con's:
  - No possibility to define __slots__
  - Not as elegant as 2), but it looks fine

4.- List-based approach:

    Particle = [
        Col(Int8, (1,), dflt=0),          # signed byte
        Col(Int32, (3,4), dflt=0),        # signed integer
        Col(CharType, (16,), dflt=""),    # 16-character String
        Col(Int16, (2,3,4), dflt=0),      # signed short integer
    ]

Pro's:
  - Costs less to type (less verbose)
  - Easy to build dynamically
Con's:
  - Implicit field order
  - Map between field names and contents not visually clear

Note: In the previous discussion explicit order was considered better
than implicit, following the Python mantra, and although some people may
think that this doesn't apply well here, I do (but, again, this is purely
subjective).

Of course, a combination of two alternatives may be the best. My current
experience tells me that a combination of 2 and 3 may be very good. That
way, users can define their recarrays as classes, but if they need to
define them dynamically, the recarray constructor can also accept a
dictionary like 3 (but, obviously, the same applies to case 4).

In the end, the recarray instance should have a variable that points to
this definition class, where the metadata is kept, but a shortcut in the
form of 1) can also be constructed for convenience.

IMO integrating options 2 and 3 (even 4) is not difficult to implement
and, in fact, such a combination is already present in the PyTables CVS
version.
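
As a rough illustration of how the class form (2), the dictionary form (3) and the short form (1) could coexist, here is a minimal sketch. The Col class, the TYPE_CODES table and the helper names below are hypothetical stand-ins written only for this example; they are not the PyTables or recarray code, and type names are plain strings to keep the sketch self-contained.

```python
# Hypothetical stand-ins: this Col class and the TYPE_CODES table are
# illustrative assumptions, not the PyTables or recarray API.
TYPE_CODES = {"Int8": "i1", "Int16": "i2", "Int32": "i4", "CharType": "a"}


class Col:
    """Minimal column descriptor: type name, shape, default, position."""

    def __init__(self, type_name, shape, dflt=0, pos=None):
        self.type_name = type_name
        self.shape = tuple(shape)
        self.dflt = dflt
        self.pos = pos


def shape_to_str(shape):
    """Render a shape tuple without spaces, e.g. (3, 4) -> '(3,4)'."""
    if len(shape) == 1:
        return "(%d,)" % shape[0]
    return "(" + ",".join(str(n) for n in shape) + ")"


def description_from_class(cls):
    """Collect the Col attributes of a class (form 2) into a dict (form 3)."""
    return {name: value for name, value in vars(cls).items()
            if isinstance(value, Col)}


def to_formats(description):
    """Return (names, formats) in form 1, ordered by each column's pos."""
    ordered = sorted(description.items(), key=lambda item: item[1].pos)
    names = [name for name, _ in ordered]
    formats = [shape_to_str(col.shape) + TYPE_CODES[col.type_name]
               for _, col in ordered]
    return names, formats


class Particle:
    name = Col("CharType", (16,), dflt="", pos=3)
    ADCcount = Col("Int8", (1,), dflt=0, pos=1)
    TDCcount = Col("Int32", (3, 4), dflt=0, pos=2)
    grid_i = Col("Int16", (2, 3, 4), dflt=0, pos=4)


names, formats = to_formats(description_from_class(Particle))
print(names)
print(formats)
```

Because every column carries an explicit pos, the same to_formats helper works whether the description started out as a class or as a dictionary built at run time.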

I might even provide a recarray version with 2 and 3 integrated for the
developers' evaluation.

Comments?

-- 
Francesc Alted