Re: [Numpy-discussion] array from list of lists

On Mon, 13 Nov 2006 at 02:07 -0500, Erin Sheldon wrote:
> On 11/13/06, Charles R Harris <cha...@gm...> wrote:
> >
> >
> > On 11/12/06, Erin Sheldon <eri...@gm...> wrote:
> > > Hi all -
> > >
> > > Thanks to everyone for the suggestions.
> > > I think map(tuple, list) is probably the most compact,
> > > but the list comprehension also works well.
> > >
> > > Because map() is probably going to disappear someday, I'll
> > > stick with the list comprehension.
> > >   array( [tuple(row) for row in result], dtype=dtype)
> > >
> > > That said, is there some compelling reason that the array
> > > function doesn't support this operation?
> >
> > My understanding is that the array needs to be allocated up front. Since the
> > list comprehension is iterative it is impossible to know how big the result
> > is going to be.
> 
> Isn't it the same with a list of tuples?  But you can send that directly to the
> array constructor.  I don't see the fundamental difference, except that the
> code might be simpler to write.

I think the correct explanation is that Travis chose a tuple as the way
to represent an inhomogeneous list of values (a record) and a list as
the way to represent a homogeneous list of values. I'm not completely
sure why he did this, but I guess the reason was to be able to
distinguish records in scenarios where nested records appear.
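A minimal sketch of that convention, assuming a current NumPy on Python 3
(the variable names are only illustrative, and the field layout is the one
used in the example further down): nested lists are read as array
dimensions, while each tuple is read as one record.

```python
import numpy as np

# Structured dtype: one record = (gender, age, weight).
dt = np.dtype([("gender", "S1"), ("age", "f4"), ("weight", "f4")])

# Nested lists of homogeneous values become extra array dimensions.
grid = np.array([[64.0, 75.0], [25.0, 60.0]])
print(grid.shape)    # (2, 2)

# Tuples are read as records, so each row fills one structured element.
rows = [["M", 64.0, 75.0], ["F", 25.0, 60.0]]
recs = np.array([tuple(row) for row in rows], dtype=dt)
print(recs.shape)    # (2,)
print(recs["age"])   # the 'age' field of every record
```

The result is a one-dimensional array of two records, not a 2x3 array of
scalars, which is why the rows have to be converted to tuples first.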

In any case, you can also use rec.fromrecords to build recarrays from
lists of lists. This breaks the aforementioned rule, but Travis allowed
it because rec.* had to mimic numarray behaviour as closely as possible.
Here is an example:

In [46]:mydescriptor = {'names': ('gender','age','weight'),
                        'formats': ('S1','f4','f4')}
In [47]:results = [['M',64.0,75.0], ['F',25.0,60.0]]
In [48]:a = numpy.rec.fromrecords(results, dtype=mydescriptor)
In [49]:b = numpy.array([tuple(row) for row in results],
                        dtype=mydescriptor)
In [50]:a==b
Out[50]:recarray([True, True], dtype=bool)

OTOH, the docs say that fromrecords is discouraged because it is
somewhat slow, but apparently it performs about as well as a list
comprehension:

In [51]:Timer("numpy.rec.fromrecords(results, dtype=mydescriptor)",
              "import numpy; results = [['M',64.0,75.0]]*10000; "
              "mydescriptor = {'names': ('gender','age','weight'), "
              "'formats':('S1','f4','f4')}").repeat(3,10)
Out[51]:[0.44204592704772949, 0.43584394454956055, 0.50145101547241211]

In [52]:Timer("numpy.array([tuple(row) for row in results], "
              "dtype=mydescriptor)",
              "import numpy; results = [['M',64.0,75.0]]*10000; "
              "mydescriptor = {'names': ('gender','age','weight'), "
              "'formats':('S1','f4','f4')}").repeat(3,10)
Out[52]:[0.49885106086730957, 0.4325258731842041, 0.43297886848449707]
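For anyone who wants to repeat the comparison outside an interactive
session, here is a rough standalone sketch using timeit.repeat. It assumes
a current NumPy on Python 3; the names setup, stmts and timings are mine,
and both statements convert the rows to tuples first, because as far as I
know recent NumPy releases warn when fromrecords is handed a list of lists.
The numbers will of course differ from the ones above.

```python
import timeit

# Shared setup: 10000 identical rows and the structured dtype.
setup = (
    "import numpy as np\n"
    "results = [['M', 64.0, 75.0]] * 10000\n"
    "dt = np.dtype([('gender', 'S1'), ('age', 'f4'), ('weight', 'f4')])\n"
)

# The two construction paths being compared.
stmts = {
    "fromrecords": "np.rec.fromrecords([tuple(r) for r in results], dtype=dt)",
    "array": "np.array([tuple(r) for r in results], dtype=dt)",
}

# Best of 3 repeats, 10 calls per repeat, as in the session above.
timings = {
    label: min(timeit.repeat(stmt, setup=setup, repeat=3, number=10))
    for label, stmt in stmts.items()
}

for label, best in timings.items():
    print(f"{label}: {best:.4f} s for 10 calls")
```

Taking the minimum of the repeats is the usual way to reduce noise from
other processes on the machine.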

HTH,

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth
