From: Ivan V. i B. <iv...@ca...> - 2008-02-08 09:13:14
Ivan Vilata i Balaguer (on 2008-02-08 at 09:43:45 +0100) said::

> [...]
> Well, this may either be a problem of inherent slowness in ``VLArray``,
> or a problem with the particular way you rebuild your lists.  It'd be
> interesting to measure both times separately.
> [...]

I found the ``numpy.split()`` function, which may be what you need::

  In [36]: data
  Out[36]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

  In [37]: indices
  Out[37]: array([4, 7, 8])

  In [38]: numpy.split(data, indices)
  Out[38]: [array([0, 1, 2, 3]), array([4, 5, 6]), array([7]), array([8, 9])]

The resulting sub-arrays share the same ``data``, so it should be
memory-efficient.  Also, unless you expect empty sub-arrays, you won't
need to store the leading 0 index.  Then, to get pure Python lists::

  >>> for i in lrange(vlarray1.nrows):
  ...     data = vlarray1[i]
  ...     indices = vlarray2[i]
  ...     foo([s.tolist() for s in numpy.split(data, indices)])

But please remember to keep a ``numpy`` flavor for both ``VLArray``
nodes.  Do you still have such high read times with this approach?

::

    Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
           Cárabos Coop. V.  V  V   Enjoy Data
                              ""
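As a sketch of the full round trip described above, using plain NumPy only
(no PyTables involved; the ``rows`` sample data is made up for
illustration), the ragged lists can be flattened into the data/indices
pair and then recovered with ``numpy.split()``::

  import numpy

  # Hypothetical ragged data: the kind of list-of-lists one might store
  # as a pair of VLArray nodes (one for the flat data, one for indices).
  rows = [[0, 1, 2, 3], [4, 5, 6], [7], [8, 9]]

  # Flatten into one array plus the boundary indices.  numpy.cumsum on
  # the sub-list lengths gives the split points; numpy.split needs
  # neither a leading 0 nor the final total length, hence the [:-1].
  data = numpy.concatenate([numpy.asarray(r) for r in rows])
  indices = numpy.cumsum([len(r) for r in rows])[:-1]

  # Recover the original lists: the sub-arrays returned by numpy.split
  # are views into ``data``, so no copies are made until .tolist().
  recovered = [s.tolist() for s in numpy.split(data, indices)]
  assert recovered == rows

Here ``indices`` comes out as ``[4, 7, 8]``, matching the interactive
session quoted above.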