From: David W. <so...@av...> - 2007-08-12 05:56:19
Hola Francesc,

I appreciate your reply, thanks. The code is embedded in other
processes, so I will extract the relevant bits and make up a working
model.

In the meantime, I understand the point you make about block
row-flushes. However, the task in which this is taking place is
designed to accept real-time data which arrives row-at-a-time, and I
need to do multiple computations on parts of the database which
include the new row. Given that I might need to compute over a few
thousand rows (matrix outer-product, for example), and that I don't
know which of the 10 subgroups of the 3500+ groups a new row is to be
a leaf of until it arrives, I'm not sure it is more time-efficient to
pull that data in on receipt of a single new row than to write a
single row.

In any event, speed is not likely to be causing a SegFault. Are you
suggesting that even after a row is flushed, a pointer or whatever is
still retained (a la Zipf's law)? If so, is there a way of iterating
over something else, perhaps?

I'll get a working model up and post asap.

David

On 11/08/2007, at 6:25 PM, Francesc Altet wrote:

> Hello David,
>
> On Saturday 11 August 2007, David Worrall wrote:
>> Hello All,
>> I was glad to have found PyTables. Thanks to all involved. I hope
>> someone more experienced than I can offer some advice.
>> My setup: OS X 10.4.10, PyTables 2.0, Python 2.4.
>>
>> I'm getting a Segmentation Fault. I'll outline the structure FYI and
>> perhaps someone can suggest a better approach:
>> DB structure: parallel (3500+) group nodes off the root node, each
>> with 10 sub-group nodes from which hang data leaves defined with a
>> class (tables.IsDescription) as per the tutorial.
>>
>> The data (time-ordered & multiplexed) is read sequentially from a
>> flat file. The 3500+ nodes are created on-the-fly, before any leaf
>> data is appended. These nodes are created without error.
>>
>> The Seg. Fault occurs at some point in the leaf-creation process.
>> I'm using table = self.h5fileID.getNode() and then
>> table.row['xxx'] = data. table.row.append() and table.flush() are
>> executed after the addition of each row.
>
> Although your description is pretty accurate, in order to ease the
> life of us poor developers, it is always better if you can send a
> minimal script that reproduces the issue. That way we can focus
> quickly on the problem.
>
> Having said that, why are you saving just one row per table flush?
> This is very inefficient and will consume a lot of resources (not
> only memory and I/O but also CPU). It is always better to write rows
> in bunches and then do a flush. When doing this, it is also more
> efficient to use:
>
>     row = table.row
>     for i in xrange(1000):
>         row.append()
>     table.flush()
>
> than:
>
>     for i in xrange(1000):
>         table.row.append()
>     table.flush()
>
> because in the latter a new Row object is instantiated on each
> iteration, but in the former only once (see the tutorials).
>
> Cheers,
>
> --
> >0,0<   Francesc Altet     http://www.carabos.com/
> V   V   Cárabos Coop. V.   Enjoy Data
>  "-"
>
> _______________________________________________
> Pytables-users mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/pytables-users

_________________________________________________
experimental polymedia: www.avatar.com.au
Sonic Communications Research Group,
University of Canberra: www.canberra.edu.au/vc-forum/scrg
vip = Verbal Interactivity Project
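The batched-append pattern Francesc recommends can be sketched without
PyTables installed at all. The `ToyTable` class below is a made-up
stand-in (not the real `tables.Table` API) that only counts flushes,
to show why appending many rows and flushing once does the same work
as the per-row-flush loop from the original report, with a thousandth
of the flush calls:

```python
class ToyTable:
    """Made-up stand-in for a PyTables Table: append() buffers a
    record in memory; flush() moves the buffer to "disk"."""

    def __init__(self):
        self._buffer = []
        self.on_disk = []
        self.flush_count = 0

    def append(self, record):
        self._buffer.append(record)

    def flush(self):
        self.on_disk.extend(self._buffer)
        self._buffer = []
        self.flush_count += 1


# Per-row flushing, as in the original report: one flush per record.
slow = ToyTable()
for i in range(1000):
    slow.append(i)
    slow.flush()

# Batched writing, as Francesc suggests: append everything, flush once.
fast = ToyTable()
for i in range(1000):
    fast.append(i)
fast.flush()

print(slow.flush_count)  # 1000
print(fast.flush_count)  # 1
assert slow.on_disk == fast.on_disk  # same data ends up on "disk"
```

Both tables end up with identical data; only the number of flush
(and, in real PyTables, I/O and Row-instantiation) operations differs.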