Re: [Pytables-users] Pytables file reading

Hi Antonio,

This is the piece of code I use to read the part of the table I need:

data = [case['loads'][i] for case in table]

where i is the index of the row that I need to read from the matrix (133x6)
stored in each cell of the column "loads".
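
A possible lower-memory alternative to the list comprehension above, as a minimal sketch rather than a tested recipe: read the column in slices with Table.read(start, stop, field=...) and copy only the wanted matrix row into a preallocated NumPy array. The helper name read_matrix_row and the chunk_rows default are hypothetical; the sketch assumes the table object exposes nrows and read() as PyTables tables do. If the cells hold 8-byte floats (an assumption), a slice of 10,000 rows of 133x6 cells is roughly 64 MB, and the final 8e6x6 result is roughly 384 MB.

```python
import numpy as np


def read_matrix_row(table, field, row_index, chunk_rows=10_000):
    """Collect row `row_index` of the matrix stored in `field` for every table row.

    `table` is assumed to behave like a PyTables Table: it has an `nrows`
    attribute and a `read(start, stop, field=...)` method returning a NumPy
    array of shape (stop - start, n_matrix_rows, n_matrix_cols).
    """
    nrows = table.nrows
    out = None
    for start in range(0, nrows, chunk_rows):
        stop = min(start + chunk_rows, nrows)
        # Only this slice of the column is held in memory at any time.
        block = table.read(start, stop, field=field)
        if out is None:
            # Allocate the result once, e.g. shape (nrows, 6) for 133x6 cells.
            out = np.empty((nrows,) + block.shape[2:], dtype=block.dtype)
        # Keep a single matrix row from every cell in the slice.
        out[start:stop] = block[:, row_index]
    if out is None:
        # Empty table: nothing was read, so return an empty array.
        out = np.empty((0,))
    return out
```

With an open table this would be called as data = read_matrix_row(table, 'loads', i). A smaller chunk_rows lowers the peak memory at the cost of more read calls.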

Juanma

2012/8/5 Antonio Valentino <ant...@ti...>

> Hi Juan Manuel,
>
> On 05/08/2012 22:28, Juan Manuel Vázquez Tovar wrote:
> > Hi Antonio,
> >
> > You are right, I don't need to load the entire table into memory.
> > The fourth column has multidimensional cells and when I read a single row
> > from every cell in the column, I almost fill the workstation memory.
> > I didn't expect that process to use so much memory, but the fact is that
> > it does.
> > Maybe I didn't explain it very well last time.
> >
> > Thank you,
> >
> > Juanma
> >
>
> Sorry, I still don't understand.
> Can you please post a short code snippet that shows exactly how you
> read data into your program?
>
> My impression is that somewhere you use some instruction that triggers
> loading of unnecessary data into memory.
>
>
>
> > 2012/8/5 Antonio Valentino <ant...@ti...>
> >
> >> Hi Juan Manuel,
> >>
> >> On 04/08/2012 01:55, Juan Manuel Vázquez Tovar wrote:
> >>> Hello all,
> >>>
> >>> I'm managing a file close to 26 GB in size. Its main structure is a table
> >>> with a bit more than 8 million rows. The table has four columns: the
> >>> first two columns store names, the 3rd one has a 53-item array in each
> >>> cell and the last column has a 133x6 matrix in each cell.
> >>> I usually work on a Linux workstation with 24 GB of memory. My usual way
> >>> of working with the file is to retrieve, from each cell in the 4th column
> >>> of the table, the same row from the 133x6 matrix.
> >>> I store the information in a NumPy array with shape 8e6x6. In this
> >>> process I use almost the whole workstation memory.
> >>> Is there any way to optimize the memory usage?
> >>
> >> I'm not sure I understand.
> >> My impression is that you do not actually need to have the entire 8e6x6
> >> matrix in memory at once, is that correct?
> >>
> >> In that case you could simply try to load less data using something like
> >>
> >> # read the column in slices smaller than the table, using integer row bounds
> >> data = table.read(0, 100000, field='name of the 4-th field')
> >> process(data)
> >> data = table.read(100000, 200000, field='name of the 4-th field')
> >> process(data)
> >>
> >> See also [1] and [2].
> >>
> >> Does that make sense to you?
> >>
> >>
> >> [1] http://pytables.github.com/usersguide/libref.html#table-methods-reading
> >> [2] http://pytables.github.com/usersguide/libref.html#tables.Table.read
> >>
> >>> If not, I have been thinking about splitting the file.
> >>>
> >>> Thank you,
> >>>
> >>> Juanma
> >>
> >>
> >> cheers
> >>
> >> --
> >> Antonio Valentino
> >>
>
> --
> Antonio Valentino
>
>