From: Anthony S. <sc...@gm...> - 2013-06-10 19:42:45
On Mon, Jun 10, 2013 at 2:28 PM, Edward Vogel <edw...@gm...> wrote:

> Yes, exactly. I'm pulling data out of C that has a one-to-many
> relationship and dumping it into PyTables for easier analysis. I'm
> creating extension classes in Cython to get access to the C structures.
> It looks like this (basically, each cv1 has several cv2s):
>
>     h5.create_table('/', 'cv1', schema_cv1)
>     h5.create_table('/', 'cv2', schema_cv2)
>     cv1_row = h5.root.cv1.row
>     cv2_row = h5.root.cv2.row
>     for cv in sf.itercv():
>         cv1_row['addr'] = cv['addr']
>         ...
>         cv1_row.append()
>         for cv2 in cv.itercv2():
>             cv2_row['cv1_addr'] = cv['addr']
>             cv2_row['foo'] = cv2['foo']
>             ...
>             cv2_row.append()
>         h5.root.cv2.flush()  # This fixes the issue
>
> Adding the flush after the inner loop does fix the issue. (Thanks!)

No problem! I am glad this worked.

> So, my follow-up question: why do I need a flush after the inner loop,
> but not when moving from the outer loop to the inner loop?

It has to do with when the write buffer gets created / filled / flushed.
These steps need to happen at the proper time or you can lose the data
you were writing, overflow memory, etc.

Be Well
Anthony

> Thanks!
>
> On Mon, Jun 10, 2013 at 2:48 PM, Anthony Scopatz <sc...@gm...> wrote:
>
>> Hi Ed,
>>
>> Are you inside of a nested loop? You probably just need to flush after
>> the innermost loop.
>>
>> Do you have some sample code you can share?
>>
>> Be Well
>> Anthony
>>
>> On Mon, Jun 10, 2013 at 1:44 PM, Edward Vogel <edw...@gm...> wrote:
>>
>>> I have a dataset that I want to split between two tables. But when I
>>> iterate over the data and append to both tables, I get a warning:
>>>
>>> /usr/local/lib/python2.7/site-packages/tables/table.py:2967:
>>> PerformanceWarning: table ``/cv2`` is being preempted from alive nodes
>>> without its buffers being flushed or with some index being dirty. This may
>>> lead to very ineficient use of resources and even to fatal errors in
>>> certain situations. Please do a call to the .flush() or .reindex_dirty()
>>> methods on this table before start using other nodes.
>>>
>>> However, if I flush after every append, I get awful performance.
>>> Is there a correct way to append to two tables without doing a flush?
>>> Note, I don't have any indices defined, so it seems reindex_dirty()
>>> doesn't apply.
>>>
>>> Thanks,
>>> Ed
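
A minimal, self-contained sketch of the pattern discussed above, assuming
PyTables 3.x: the CV1/CV2 descriptions and the itercv() generator are
hypothetical stand-ins for the Cython extension classes and sf.itercv()
from the original post; only the placement of the flush() calls reflects
the fix that resolved the warning.

    import tables

    # Hypothetical column descriptions standing in for schema_cv1 / schema_cv2.
    class CV1(tables.IsDescription):
        addr = tables.Int64Col()

    class CV2(tables.IsDescription):
        cv1_addr = tables.Int64Col()
        foo = tables.Float64Col()

    # Hypothetical stand-in for sf.itercv(): each cv1 record carries several cv2s.
    def itercv():
        for i in range(1000):
            yield {'addr': i, 'cv2s': [{'foo': float(j)} for j in range(4)]}

    h5 = tables.open_file('cv.h5', mode='w')
    cv1_table = h5.create_table('/', 'cv1', CV1)
    cv2_table = h5.create_table('/', 'cv2', CV2)
    cv1_row = cv1_table.row
    cv2_row = cv2_table.row

    for cv in itercv():
        cv1_row['addr'] = cv['addr']
        cv1_row.append()
        for cv2 in cv['cv2s']:
            cv2_row['cv1_addr'] = cv['addr']
            cv2_row['foo'] = cv2['foo']
            cv2_row.append()
        # Flush cv2's write buffer once per outer iteration rather than
        # once per append: this keeps unflushed cv2 rows from being
        # preempted when writing switches back to cv1 (the cause of the
        # PerformanceWarning), without paying the flush cost on every row.
        cv2_table.flush()

    cv1_table.flush()
    h5.close()

The placement follows Anthony's suggestion: flush after the innermost
loop, so the flush cost is paid once per outer record instead of once per
appended row.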