From: Anthony S. <sc...@gm...> - 2012-03-21 14:48:48
On the other hand, Tom, if you know that you will be doing < N insertions
in the future, you can always pre-allocate a Table / Array that is of
size N and pre-loaded with null values. You can then 'insert' by
over-writing the nth row. Furthermore, you can always append another
size-N chunk whenever you run out of room. For most of my problems this
has worked just fine, especially if you are dealing with sparse data.

Be Well
Anthony

On Wed, Mar 21, 2012 at 9:18 AM, Francesc Alted <fa...@gm...> wrote:

> On Mar 21, 2012, at 7:08 AM, Tom Diethe wrote:
>
> >>> I'm writing a wrapper for sparse matrices (CSR format) and therefore
> >>> need to store three vectors and three scalars:
> >>>
> >>> - data (float64 vector)
> >>> - indices (int32 vector)
> >>> - indptr (int32 vector)
> >>>
> >>> - nrows (int32 scalar)
> >>> - ncols (int32 scalar)
> >>> - nnz (int32 scalar)
> >>>
> >>> data and indices will always be the same length as each other (=nnz)
> >>> but the indptr vector is much shorter.
> >>>
> >>> I've written routines that allow you to insert/update/delete rows or
> >>> columns of the matrix by acting on these vectors only. However I'm
> >>> struggling to work out the best pytables structure to store these,
> >>> that allows me to append/insert/delete elements easily, and is
> >>> efficient.
> >>>
> >>> I was using a Group with an EArray for each vector. This works ok but
> >>> it seems like you are unable to delete items - is that correct?
> >>>
> >>> I also tried using a Group with a separate Table for each of the
> >>> vectors (I could possibly just have two - one for data and indices and
> >>> the other for indptr), but this seems to add a lot of overhead in
> >>> manipulating the arrays.
> >>>
> >>> Is there something simple I'm missing?
>
> Inserting on PyTables objects is not supported. The reason is that they
> are implemented on top of HDF5 datasets, which do not support this either.
> HDF5 is meant for dealing with large datasets, and implementing insertions
> (or deletions) is not an efficient operation (it requires a complete
> rewrite of the dataset). So, if you are going to need a lot of insertions
> or deletions, then PyTables / HDF5 is probably not what you want.
>
> HTH,
>
> -- Francesc Alted
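
For illustration, the pre-allocation idea from the top of this message
(fill N slots with nulls, 'insert' by over-writing the next unused slot,
append another N nulls when full) can be sketched in plain Python. This
is only an in-memory stand-in using the standard-library array module,
not PyTables code. The class name PreallocatedVector, its methods, and
the choice of 0.0 as the null value are all hypothetical. With PyTables
the grow step would presumably be an append of N nulls to an EArray and
the over-write an ordinary item assignment.

```python
from array import array


class PreallocatedVector:
    """Hypothetical helper: a float64 vector that grows in fixed-size chunks.

    Not a PyTables class; it only mimics the "pre-allocate, then
    over-write" pattern in memory.
    """

    def __init__(self, chunk_size, null=0.0):
        self.chunk_size = chunk_size
        self.null = null          # assumed null marker; pick one that suits the data
        self.used = 0             # number of slots that hold real values
        # Pre-load one chunk of null values up front.
        self.slots = array("d", [null] * chunk_size)

    def insert(self, value):
        """'Insert' by over-writing the next unused slot."""
        if self.used == len(self.slots):
            # Out of room: append one more chunk of nulls. Appending is the
            # only resize ever performed; nothing is shifted or rewritten.
            self.slots.extend([self.null] * self.chunk_size)
        self.slots[self.used] = value
        self.used += 1

    def values(self):
        """Return only the slots that have actually been written."""
        return self.slots[:self.used].tolist()


vec = PreallocatedVector(chunk_size=4)
for x in (1.5, 2.5, 3.5, 4.5, 5.5):
    vec.insert(x)   # the fifth insert triggers one chunk append
```

The count of used slots has to be stored alongside the data (for the CSR
case above, nnz could play that role), since the trailing nulls are
otherwise indistinguishable from real zeros.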