From: Jim K. <jim...@sp...> - 2013-05-03 19:52:25
Speed is the problem. I am looking for the fastest possible way to do this. I was thinking of using Pandas and was able to achieve fair performance with that lib. It just seemed like I was using Pandas as a middleman, and it introduces some issues with the data types. Could it be faster to pull the data into a NumPy array in chunks and write it out?

From: Anthony Scopatz [mailto:sc...@gm...]
Sent: Friday, May 03, 2013 2:14 PM
To: Discussion list for PyTables
Subject: Re: [Pytables-users] Row.append()

On Fri, May 3, 2013 at 1:15 PM, Jim Knoll <jim...@sp...> wrote:

> I am trying to make this better / faster...
> Data comes in faster than I can store it on one box, so my thought was to
> have many boxes, each storing its own part in its own table. Later I would
> concatenate the tables together with something like this:
>
>     import tables as pt
>
>     dest_h5f = pt.openFile(path + 'big_mater.h5', 'a')
>     for source_path in source_h5_path_list:
>         h5f = pt.openFile(source_path, 'r')
>         for node in h5f.root:
>             dest_table = dest_h5f.getNode('/', name=node.name)
>             print node.nrows
>             # found I needed to limit the max size or I would crash
>             if node.nrows > 0 and node.nrows < 1000000:
>                 dest_table.append(node.read())
>                 dest_table.flush()
>         h5f.close()
>     dest_h5f.close()
>
> I could add the logic to iterate in chunks over the source data to overcome
> the crash, but I suspect there could be a better way.

Hi Jim,

You can just iterate over each row in the table (i.e. "for row in node"). This is slow, but would solve the problem.

> Take a table in one h5 file and append it to a table in another h5 file.
> It looked like Table.copy() would do the trick, but I don't see how to get it
> to append to an existing table.

You could append directly by using the where_append() method with the condition "'True'" to append the whole table. This will automatically do the chunking for you.

Be Well
Anthony

> My h5 files have 4 rec arrays all stored in root. Any suggestions?
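
For the "pull it into a NumPy array in chunks and write it out" idea, a minimal sketch might look like the following. It assumes the source and destination are PyTables Table objects with matching column layouts, and it relies only on Table.nrows, Table.read(start, stop), Table.append() and Table.flush(). The helper names chunk_bounds and append_in_chunks and the default chunk size are hypothetical and not part of PyTables. No timing claim is made here; whether this beats the other approaches in the thread would need to be measured.

```python
def chunk_bounds(nrows, chunksize):
    """Yield (start, stop) pairs that cover rows 0..nrows-1 in order."""
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    for start in range(0, nrows, chunksize):
        yield start, min(start + chunksize, nrows)


def append_in_chunks(src, dest, chunksize=500000):
    """Append every row of src to dest, at most chunksize rows at a time.

    src and dest are assumed to be table-like objects exposing nrows,
    read(start, stop), append(rows) and flush(), as PyTables Tables do.
    The default chunksize is an arbitrary illustrative value, not a
    tuned one. Returns the number of rows copied.
    """
    copied = 0
    for start, stop in chunk_bounds(src.nrows, chunksize):
        # Only one chunk is held in memory at any time.
        dest.append(src.read(start, stop))
        copied += stop - start
    # Flush once at the end rather than after every chunk.
    dest.flush()
    return copied
```

Inside the loop from the original script this would be called as append_in_chunks(node, dest_table) in place of dest_table.append(node.read()), which also removes the need for the row-count limit. On the built-in route Anthony mentions: as far as I know the method is spelled whereAppend() in PyTables 2.x and append_where() in 3.x, so check the name against the installed version.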
Jim Knoll
DBA/Developer II
Spot Trading L.L.C