From: Juan M. V. T. <jmv...@gm...> - 2012-08-05 20:52:30
Hi Antonio,

This is the piece of code I use to read the part of the table I need:

    data = [case['loads'][i] for case in table]

where i is the index of the row that I need to read from the matrix (133x6) stored in each cell of the column "loads".

Juanma

2012/8/5 Antonio Valentino <ant...@ti...>

> Hi Juan Manuel,
>
> On 05/08/2012 22:28, Juan Manuel Vázquez Tovar wrote:
> > Hi Antonio,
> >
> > You are right, I don't need to load the entire table into memory.
> > The fourth column has multidimensional cells and when I read a single row
> > from every cell in the column, I almost fill the workstation memory.
> > I didn't expect that process to use so much memory, but the fact is that
> > it uses it.
> > Maybe I didn't explain it very well last time.
> >
> > Thank you,
> >
> > Juanma
> >
>
> Sorry, I still don't understand.
> Can you please post a short code snippet that shows exactly how you
> read data into your program?
>
> My impression is that somewhere you use some instruction that triggers
> loading of unnecessary data into memory.
>
> > 2012/8/5 Antonio Valentino <ant...@ti...>
> >
> >> Hi Juan Manuel,
> >>
> >> On 04/08/2012 01:55, Juan Manuel Vázquez Tovar wrote:
> >>> Hello all,
> >>>
> >>> I'm managing a file close to 26 GB in size. Its main structure is a table
> >>> with a bit more than 8 million rows. The table is made up of four columns:
> >>> the first two columns store names, the 3rd one has a 53-item array in each
> >>> cell and the last column has a 133x6 matrix in each cell.
> >>> I usually work on a Linux workstation with 24 GB. My usual way of working
> >>> with the file is to retrieve, from each cell in the 4th column of the
> >>> table, the same row from the 133x6 matrix.
> >>> I store the information in a NumPy array with shape 8e6x6. In this process
> >>> I use almost the whole workstation memory.
> >>> Is there any way to optimize the memory usage?
> >>
> >> I'm not sure I understand.
> >> My impression is that you do not actually need to have the entire 8e6x6
> >> matrix in memory at once, is that correct?
> >>
> >> In that case you could simply try to load less data using something like
> >>
> >> data = table.read(0, 5e7, field='name of the 4-th field')
> >> process(data)
> >> data = table.read(5e7, 1e8, field='name of the 4-th field')
> >> process(data)
> >>
> >> See also [1] and [2].
> >>
> >> Does it make sense for you?
> >>
> >>
> >> [1]
> >> http://pytables.github.com/usersguide/libref.html#table-methods-reading
> >> [2] http://pytables.github.com/usersguide/libref.html#tables.Table.read
> >>
> >>> If not, I have been thinking about splitting the file.
> >>>
> >>> Thank you,
> >>>
> >>> Juanma
> >>
> >>
> >> cheers
> >>
> >> --
> >> Antonio Valentino
> >>
>
> --
> Antonio Valentino
>
> _______________________________________________
> Pytables-users mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/pytables-users
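
For reference, the chunked read suggested in the quoted reply can be combined with the single-row selection from the list comprehension at the top of this message. The following is only a minimal sketch, not a tested PyTables recipe: chunk_bounds, extract_matrix_row, read_chunk and fake_read are hypothetical names, and a small in-memory list stands in for the real "loads" column so the example runs without the 26 GB file. With PyTables, read_chunk could presumably wrap a call such as table.read(start, stop, field='loads'), with start and stop kept as plain integers.

```python
def chunk_bounds(nrows, chunk_size):
    """Yield integer (start, stop) pairs that together cover range(nrows)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, nrows, chunk_size):
        yield start, min(start + chunk_size, nrows)


def extract_matrix_row(read_chunk, nrows, row_index, chunk_size):
    """Collect matrix row `row_index` from every table row, one chunk at a time.

    `read_chunk(start, stop)` is a hypothetical callable that returns the
    matrices stored in table rows start..stop-1 (assumption: with PyTables
    it could wrap Table.read(start, stop, field='loads')).
    """
    selected = []
    for start, stop in chunk_bounds(nrows, chunk_size):
        block = read_chunk(start, stop)  # only this slice of the column is loaded
        # Keep just the requested matrix row from each cell; the rest of the
        # block can be discarded before the next chunk is read.
        selected.extend(list(matrix[row_index]) for matrix in block)
    return selected


# Stand-in for the "loads" column: 10 cells, each holding a 3x2 matrix.
fake_column = [
    [[r * 100 + m * 10 + c for c in range(2)] for m in range(3)]
    for r in range(10)
]


def fake_read(start, stop):
    """Hypothetical reader that mimics reading rows start..stop-1."""
    return fake_column[start:stop]


result = extract_matrix_row(fake_read, nrows=10, row_index=1, chunk_size=4)
```

The result is gathered in a plain list only to keep the sketch dependency-free. For the real 8e6x6 case, writing each chunk's selection into a preallocated NumPy array would be far more compact, and chunk_size would need tuning so that one block of 133x6 matrices fits comfortably in memory.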