From: Francesc A. <fa...@op...> - 2003-01-24 18:47:05
On Friday 24 January 2003 18:02, Todd Miller wrote:
>
> My [i.e. Todd's] thoughts about it:
>
> No. It shows you're thinking about it carefully. Having looked at all
> of the examples below, I have some comments:

I mostly agree with your comments, but let me point out some thoughts.

>
> 1. The sparseness and obscurity of the typecode "wordspace" are both
> demonstrated here. There are so few letters to choose from, they're
> often already used in some other context. Even given the large number
> of unused letters, it's often difficult to choose good ones and to
> remember what has been chosen. I think this is one of the reasons Perry
> chose to replace typecodes with true type objects which have rich,
> regular, and predictable symbolic names.

I completely agree that type objects are a brilliant idea.

> 3. STSCI has layered other software on top of numarray and recarray
> which astronomers use to do work. It is the friction of that interface
> which makes correcting these consistency problems more difficult than
> might be immediately apparent.

Yeah, I know...

>
> >I think it's important to agree on a definitive set of charcodes and use
> >them uniformly throughout numarray.
>
> I wish this were possible, but I'm thinking we should try to find an
> alternative approach altogether, one which may be more verbose but
> implicitly free of conflict.
>
> A means for specifying a recarray format might be created from tuples,
> type objects, and integer repetition factors.
>
> The verbosity of this approach might be a little tedious, but it would
> also be transparent, maintainable, and conflict free.

I think this is a very good idea. In fact, while working on PyTables I was
lately pondering what would be the best way to define record arrays, and I
also think that a verbose approach would be the best.
After considering metaclasses and tuples, I ended up with a compromise
between the two: dictionaries combined with some function or class to
refine the definition. My current thinking is something like:

    recarrDescr = {
        "name"        : defineType(CharType, 16, ""),   # 16-character String
        "TDCcount"    : defineType(UInt8, 1, 0),        # unsigned byte
        "ADCcount"    : defineType(Int16, 1, 0),        # signed short integer
        "grid_i"      : defineType(Int32, 1, 9),        # integer
        "grid_j"      : defineType(Int32, 1, 9),        # integer
        "pressure"    : defineType(Float32, 1, 1.),     # float (single-precision)
        "temperature" : defineType(Float64, 32, arange(32)), # double[32]
        "idnumber"    : defineType(Int64, 1, 0),        # signed long long
    }

where defineType is a class that accepts (type, shape, default) parameters.
It can be extended safely in the future if more needs appear.

A dictionary has the advantage over a tuple in that you can map column
names to their contents quite easily, and it is more flexible than defining
the fields with a metaclass descendant (see
http://pytables.sourceforge.net/html-doc/usersguide-html3.html#subsection3.1.2)
because dictionaries can be built up at run time (metaclass descendants
can be too, but in a more mysterious way that I think is not worth the
trouble). In addition, the dictionary object is available in all Python
versions, whereas metaclasses are only available from 2.2 on. However, I
regard metaclasses as the most elegant solution (but elegance is not always
equivalent to convenience :().

Perhaps you may want to consider this for use in recarray definitions.

>
> I think we should add an "obsolescent feature" warning to numarray and
> recarray which flags any use of character typecodes when the appropriate
> command line switches are set.

Well, I don't fully agree with that.
I do believe that class-based typecodes are a more meaningful way of
describing types, but charcodes can be quite advantageous in certain
situations, such as describing the contents of a record compactly, or
passing this information to C routines that deal with the data. For
example, consider the benefits of describing a recarray format as:

    "3s4i20d"

instead of

    ((Int16, 3),
     (Int32, 4),
     (Float64, 20),
    )

the former being handier in lots of situations. I certainly believe that
a coexistence of both can be very beneficial, especially for 3rd party
extension makers (like me :).

>
> >Suggestion: if recarray charcodes do not need to match the Numeric
> >ones, I propose that using the Python convention may be a good idea.
> >Look at the table in:
> >http://www.python.org/doc/current/lib/module-struct.html.
>
> This sounds good to me, except that it will break an existing interface
> that I don't have control over. Therefore, I suggest we correct the
> problem by coming up with something better.

Well, if charcodes finally stay in, this has an additional advantage in
that the Python crew has provided meaningful ways to express padding
(character "x"), endianness ("=", "<", ">") and alignment ("@"). So a
compact expression like "@3sx4i20d", apart from looking like Chinese to
Westerners, may give a lot of information in a handy way.

--
Francesc Alted
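
To illustrate the struct-style prefixes mentioned above, here is a minimal
sketch using only Python's standard struct module. It is not recarray or
numarray code, and the helper names (sizes, roundtrip) are made up for this
example. Note that under the struct convention "3s" means a 3-byte string
(three shorts would be "3h"), so the format string is interpreted that way
here. The native "@" size depends on the platform's alignment rules, so
only the standard-size results are fixed.

```python
import struct

# Body of the format: 3-byte string, 1 pad byte, 4 ints, 20 doubles
# (struct-module meaning of the characters, not the Numeric typecodes).
FMT_BODY = "3sx4i20d"


def sizes(body=FMT_BODY):
    """Return the packed size of `body` for each byte-order/alignment prefix."""
    return {prefix: struct.calcsize(prefix + body)
            for prefix in ("@", "=", "<", ">")}


def roundtrip(fmt="<" + FMT_BODY):
    """Pack one sample record with `fmt` and unpack it again."""
    values = (b"abc", 1, 2, 3, 4) + tuple(float(i) for i in range(20))
    packed = struct.pack(fmt, *values)
    return packed, struct.unpack(fmt, packed)


if __name__ == "__main__":
    # "=", "<" and ">" use standard sizes and no alignment padding;
    # "@" uses native sizes and alignment, so its size may be larger.
    for prefix, size in sizes().items():
        print(prefix, size)
```

With the standard-size prefixes the record takes 3 + 1 + 16 + 160 = 180
bytes; with "@" the compiler's alignment rules may add padding before the
doubles.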