From: Erin S. <eri...@gm...> - 2006-11-13 15:10:47
On 11/13/06, Francesc Altet <fa...@ca...> wrote:
> In any case, you can also use rec.fromrecords to build recarrays from
> lists of lists. This breaks the aforementioned rule, but Travis allowed
> this because rec.* had to mimic numarray behaviour as much as possible.
> Here is an example of use:
>
> In [46]:mydescriptor = {'names': ('gender','age','weight'),
>                         'formats':('S1','f4', 'f4')}
> In [47]:results=[['M',64.0,75.0],['F',25.0,60.0]]
> In [48]:a = numpy.rec.fromrecords(results, dtype=mydescriptor)
> In [49]:b = numpy.array([tuple(row) for row in results],
>                         dtype=mydescriptor)
> In [50]:a==b
> Out[50]:recarray([True, True], dtype=bool)
>
> OTOH, it is said in the docs that fromrecords is discouraged because it
> is somewhat slow, but apparently its performance is similar to that of
> list comprehensions:
>
> In [51]:Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)",
>     "import numpy; results = [['M',64.0,75.0]]*10000; mydescriptor =
>     {'names': ('gender','age','weight'), 'formats':('S1','f4',
>     'f4')}").repeat(3,10)
> Out[51]:[0.44204592704772949, 0.43584394454956055, 0.50145101547241211]
>
> In [52]:Timer("numpy.array([tuple(row) for row in results],
>     dtype=mydescriptor)", "import numpy; results = [['M',64.0,75.0]]*10000;
>     mydescriptor = {'names': ('gender','age','weight'),
>     'formats':('S1','f4', 'f4')}").repeat(3,10)
> Out[52]:[0.49885106086730957, 0.4325258731842041, 0.43297886848449707]

I checked the code. For lists of lists, fromrecords just creates the
recarray and then runs a loop that copies the data in row by row.

The fact that the two approaches run at similar speed is actually good
news, because the list comprehension was making an extra copy of the data
in memory. With large data sets, which is my case, that 50% overhead
would have been an issue.

Erin
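
P.S. For anyone reading this later, the row-by-row strategy described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual numpy.rec.fromrecords source. The helper names fill_row_by_row and via_comprehension are hypothetical, and the field layout is copied from the quoted example.

```python
import numpy as np

# Field layout and sample data taken from the quoted example.
descr = {'names': ('gender', 'age', 'weight'),
         'formats': ('S1', 'f4', 'f4')}
results = [['M', 64.0, 75.0], ['F', 25.0, 60.0]]


def fill_row_by_row(rows, dtype):
    """Hypothetical helper: preallocate the output, then copy rows in one at a time."""
    out = np.empty(len(rows), dtype=dtype)
    for i, row in enumerate(rows):
        # Only one temporary tuple exists at any moment.
        out[i] = tuple(row)
    return out


def via_comprehension(rows, dtype):
    """Hypothetical helper: build a complete list of tuples first, then convert it."""
    # The intermediate list holds a second copy of every row until
    # numpy.array has finished converting it.
    return np.array([tuple(row) for row in rows], dtype=dtype)


a = fill_row_by_row(results, descr)
b = via_comprehension(results, descr)
print(a.tolist() == b.tolist())
```

Both helpers produce the same structured array. The difference is only in peak memory: the second one keeps the whole intermediate list of tuples alive while the array is being built.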