From: Anthony S. <sc...@gm...> - 2013-06-10 19:42:45
On Mon, Jun 10, 2013 at 2:28 PM, Edward Vogel <edw...@gm...> wrote:

> Yes, exactly. I'm pulling data out of C that has a one-to-many
> relationship and dumping it into PyTables for easier analysis. I'm
> creating extension classes in Cython to get access to the C structures.
> It looks like this (basically, each cv1 has several cv2s):
>
>     h5.create_table('/', 'cv1', schema_cv1)
>     h5.create_table('/', 'cv2', schema_cv2)
>     cv1_row = h5.root.cv1.row
>     cv2_row = h5.root.cv2.row
>     for cv in sf.itercv():
>         cv1_row['addr'] = cv['addr']
>         ...
>         cv1_row.append()
>         for cv2 in cv.itercv2():
>             cv2_row['cv1_addr'] = cv['addr']
>             cv2_row['foo'] = cv2['foo']
>             ...
>             cv2_row.append()
>         h5.root.cv2.flush()  # This fixes the issue
>
> Adding the flush after the inner loop does fix the issue. (Thanks!)

No problem! I am glad this worked.

> So, my follow-up question: why do I need a flush after the inner loop,
> but not when moving from the outer loop to the inner loop?

It has to do with when the write buffer gets created / filled / flushed.
These steps need to happen at the proper time or you can lose the data
you were writing, overflow memory, etc.

Be Well
Anthony

> Thanks!
>
> On Mon, Jun 10, 2013 at 2:48 PM, Anthony Scopatz <sc...@gm...> wrote:
>
>> Hi Ed,
>>
>> Are you inside of a nested loop? You probably just need to flush after
>> the innermost loop.
>>
>> Do you have some sample code you can share?
>>
>> Be Well
>> Anthony
>>
>> On Mon, Jun 10, 2013 at 1:44 PM, Edward Vogel <edw...@gm...> wrote:
>>
>>> I have a dataset that I want to split between two tables. But when I
>>> iterate over the data and append to both tables, I get a warning:
>>>
>>> /usr/local/lib/python2.7/site-packages/tables/table.py:2967:
>>> PerformanceWarning: table ``/cv2`` is being preempted from alive nodes
>>> without its buffers being flushed or with some index being dirty. This may
>>> lead to very ineficient use of resources and even to fatal errors in
>>> certain situations. Please do a call to the .flush() or .reindex_dirty()
>>> methods on this table before start using other nodes.
>>>
>>> However, if I flush after every append, I get awful performance.
>>> Is there a correct way to append to two tables without doing a flush?
>>> Note, I don't have any indices defined, so it seems reindex_dirty()
>>> doesn't apply.
>>>
>>> Thanks,
>>> Ed
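
A minimal, self-contained sketch of the pattern discussed above, assuming
PyTables 3.x: the CV1/CV2 descriptions and the itercv() generator are
hypothetical stand-ins for the Cython extension classes and sf.itercv()
from the original post; only the placement of the flush() calls reflects
the fix that resolved the warning.

    import tables

    # Hypothetical column descriptions standing in for schema_cv1 / schema_cv2.
    class CV1(tables.IsDescription):
        addr = tables.Int64Col()

    class CV2(tables.IsDescription):
        cv1_addr = tables.Int64Col()
        foo = tables.Float64Col()

    # Hypothetical stand-in for sf.itercv(): each cv1 record carries several cv2s.
    def itercv():
        for i in range(1000):
            yield {'addr': i, 'cv2s': [{'foo': float(j)} for j in range(4)]}

    h5 = tables.open_file('cv.h5', mode='w')
    cv1_table = h5.create_table('/', 'cv1', CV1)
    cv2_table = h5.create_table('/', 'cv2', CV2)
    cv1_row = cv1_table.row
    cv2_row = cv2_table.row

    for cv in itercv():
        cv1_row['addr'] = cv['addr']
        cv1_row.append()
        for cv2 in cv['cv2s']:
            cv2_row['cv1_addr'] = cv['addr']
            cv2_row['foo'] = cv2['foo']
            cv2_row.append()
        # Flush cv2's write buffer once per outer iteration rather than
        # once per append: this keeps unflushed cv2 rows from being
        # preempted when writing switches back to cv1 (the cause of the
        # PerformanceWarning), without paying the flush cost on every row.
        cv2_table.flush()

    cv1_table.flush()
    h5.close()

The placement follows Anthony's suggestion: flush after the innermost
loop, so the flush cost is paid once per outer record instead of once per
appended row.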