From: Francesc A. <fa...@ca...> - 2006-11-13 08:19:24
On Mon, 13 Nov 2006 at 02:07 -0500, Erin Sheldon wrote:
> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
> >
> >
> > On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > > Hi all -
> > >
> > > Thanks to everyone for the suggestions.
> > > I think map(tuple, list) is probably the most compact,
> > > but the list comprehension also works well.
> > >
> > > Because map() is probably going to disappear someday, I'll
> > > stick with the list comprehension.
> > > array( [tuple(row) for row in result], dtype=dtype)
> > >
> > > That said, is there some compelling reason that the array
> > > function doesn't support this operation?
> >
> > My understanding is that the array needs to be allocated up front. Since the
> > list comprehension is iterative it is impossible to know how big the result
> > is going to be.
>
> Isn't it the same with a list of tuples? But you can send that directly to the
> array constructor. I don't see the fundamental difference, except that the
> code might be simpler to write.

I think that the correct explanation is that Travis has chosen a tuple as
the way to refer to an inhomogeneous list of values (a record) and a list as
the way to refer to a homogeneous list of values. I'm not completely sure why
he did this, but I guess the reason was to be able to distinguish the records
in scenarios where nested records appear.

In any case, you can also use rec.fromrecords to build recarrays from lists
of lists. This breaks the aforementioned rule, but Travis allowed it because
rec.* had to mimic numarray behaviour as much as possible.
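The tuple-versus-list convention described above can be sketched as follows. This is only an illustration under assumptions: the nested and sub-array dtypes and the field names 'body' and 'measures' are hypothetical and do not come from the thread, and the code is written for a current NumPy on Python 3 rather than the 2006 release under discussion.

```python
import numpy as np

# Flat record dtype, same fields as in the thread.
person = np.dtype([('gender', 'S1'), ('age', 'f4'), ('weight', 'f4')])

# The outer list is a homogeneous sequence of records;
# each tuple is one (inhomogeneous) record.
people = np.array([('M', 64.0, 75.0), ('F', 25.0, 60.0)], dtype=person)

# Hypothetical nested record: the inner tuple fills the nested 'body' record.
nested = np.dtype([('gender', 'S1'),
                   ('body', [('age', 'f4'), ('weight', 'f4')])])
people_nested = np.array([('M', (64.0, 75.0)), ('F', (25.0, 60.0))],
                         dtype=nested)

# Hypothetical sub-array field: the inner list fills a homogeneous
# field holding two floats.
with_subarray = np.dtype([('gender', 'S1'), ('measures', 'f4', (2,))])
people_sub = np.array([('M', [64.0, 75.0]), ('F', [25.0, 60.0])],
                      dtype=with_subarray)

print(people.shape)
print(people_nested['body']['age'])
print(people_sub['measures'].shape)
```

In the nested case the tuple is what tells the constructor where one record ends and the next begins, which matches the guess above about why tuples were chosen.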
Here is an example of use:

In [46]:mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}
In [47]:results=[['M',64.0,75.0],['F',25.0,60.0]]
In [48]:a = numpy.rec.fromrecords(results, dtype=mydescriptor)
In [49]:b = numpy.array([tuple(row) for row in results], dtype=mydescriptor)
In [50]:a==b
Out[50]:recarray([True, True], dtype=bool)

OTOH, the docs say that fromrecords is discouraged because it is somewhat
slow, but apparently its performance is similar to that of a list
comprehension:

In [51]:Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
Out[51]:[0.44204592704772949, 0.43584394454956055, 0.50145101547241211]

In [52]:Timer("numpy.array([tuple(row) for row in results], dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor = {'names': ('gender','age','weight'), 'formats':('S1','f4', 'f4')}").repeat(3,10)
Out[52]:[0.49885106086730957, 0.4325258731842041, 0.43297886848449707]

HTH,

--
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works,
www.carabos.com   |  I haven't tested it. -- Donald Knuth