From: Francesc A. <fa...@op...> - 2003-07-01 21:35:33
Hi Vineet,

On Tuesday 01 July 2003 01:22, Vineet Jain wrote:
> Couple of questions about pytables:
>
> I built two samples. One with pysqlite and one with pytables and I found
> pytables to be about 20 times faster than the pysqlite version and used
> a lot less space. Let me commend you on a great application.

20 times faster than pysqlite seems like too much, and besides, this
depends on what kind of benchmark you are doing. If it is for writing,
that seems reasonable, while for reading the difference should be a lot
smaller (see my EuroPython presentation at
http://pytables.sourceforge.net/doc/EuroPython.pdf for more details).
Can you explain a bit what kind of benchmark you ran? Anyway, I'm happy
to know that pytables works great for your specific application.

> 1. Update certain rows in a table and append to a table. The latter
> you handle but am not sure how to do the former. Will updating rows ever
> be supported?

Appending rows is not a problem, even between different Python sessions.
Updating is not yet supported, and I'm waiting for HDF5 1.6 to appear to
see if I can implement that feature. I'll try to release a new version
of pytables supporting deleting and updating rows as soon as the NCSA
folks release the 1.6 version (which should happen sooner rather than
later).

> 2. For arrays or rows returned from a table. How can you do the
> following:
>
> Row1 = table1.read()
> Row2 = table2.read()
> FinalRow = row1+row2
>
> Without having to loop through them.

First of all, let me point out that the read() method of a Table object
reads the whole table into memory and returns a recarray object, which
is the way the numarray package represents arrays of inhomogeneous data
(i.e. tables).
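Before going on, a minimal plain-Python sketch may help pin down what
row1+row2 could mean. It uses lists of tuples as a stand-in for the
recarrays returned by read(); the names table1_rows, table2_rows,
stack_rows and join_columns are hypothetical and are not part of
pytables or numarray, and the sample values are made up for
illustration.

```python
# Plain-Python stand-in for two small tables; each tuple is one row.
# These are NOT recarrays, only an illustration of the two possible meanings.
table1_rows = [("d: 1", 1, 1024.0), ("d: 2", 2, 2048.0)]
table2_rows = [("d: 3", 3, 3072.0), ("d: 4", 4, 4096.0)]


def stack_rows(a, b):
    """Row-wise combination: the result has len(a) + len(b) rows."""
    return list(a) + list(b)


def join_columns(a, b):
    """Column-wise combination: same row count, fields placed side by side."""
    if len(a) != len(b):
        raise ValueError("both tables need the same number of rows")
    return [row_a + row_b for row_a, row_b in zip(a, b)]


stacked = stack_rows(table1_rows, table2_rows)
joined = join_columns(table1_rows, table2_rows)

print(len(stacked), len(stacked[0]))  # 4 rows, 3 fields each
print(len(joined), len(joined[0]))    # 2 rows, 6 fields each
```

Note that the column-wise version still iterates over the rows
internally; the sketch only shows the shape of each result, not a way
to avoid the loop.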
Then, you did not specify whether by row1+row2 you meant appending the
rows of one table to the other to get a larger table with nrows1+nrows2
rows, or, in case nrows1 == nrows2, getting a table with the same number
of rows but with ncolumns1+ncolumns2 columns. For simplicity, I'll
assume that you meant the former case, as the latter seems more
complicated.

After these clarifications, it seems that you are trying to add two
recarray objects, not two tables, and this is not currently supported in
numarray. But supporting an __add__ special method would be a nice
thing, of course. I'll talk with the numarray crew to see if that can be
implemented.

> 3 Something useful found in pysqlite, and the postgress db driver
> is the ability to access field names directly:
>
> row = table.read()
> high = row[10000].high (where high is a field of the table)

Yeah, you can do that using some parameters of the read() method. For
example, let's suppose that we have the following Table object:

>>> file.root.detector.smalltable
/detector/smalltable (Table(10,)) 'Small table with 3 fields'
  description := {
    'var1': Col('CharType', (6,)),
    'var2': Col('Int32', (1,)),
    'var3': Col('Float64', (1,)) }
  byteorder = little

If you ask for help on its read() method:

>>> help(file.root.detector.smalltable.read)
Help on method read in module tables.Table:

read(self, start=None, stop=None, step=None, field=None, flavor=None)
method of tables.Table.Table instance
    Read a range of rows and return an in-memory object.

    If "start", "stop", or "step" parameters are supplied, a row
    range is selected. If "field" is specified, only this "field" is
    returned as a NumArray object. If "field" is not supplied all the
    fields are selected and a RecArray is returned. If both "field"
    and "flavor" are provided, an additional conversion to an object
    of this flavor is made. "flavor" must have any of the next
    values: "Numeric", "Tuple" or "List".
(END)

then you can, for example, do:

>>> file.root.detector.smalltable.read(start=1,stop=5, field="var2")
array([1, 2, 3, 4])

and it returns the "var2" column for rows 1 up to (but not including) 5.
It would be handy to provide a more pythonic way to access this data,
and that might come in the future.

> 4 Is there any way the rows returned from table can be treated as
> numarray objects?

As you have seen in the previous example, pytables always tries to
return numarray objects. It will be an Array object if the data is
homogeneous (all resulting elements have the same data type). If the
resulting elements are of different data types, a RecArray object will
be returned, as in:

>>> print file.root.detector.smalltable.read(start=1,stop=5)
RecArray[
('d: 1', 1, 1024.0),
('d: 2', 2, 2048.0),
('d: 3', 3, 3072.0),
('d: 4', 4, 4096.0)
]

Hope that helps to clear up some of your questions,

--
Francesc Alted
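P.S. For readers without pytables at hand, the row-range and field
selection described above can be mimicked in plain Python. This is only
a rough sketch under stated assumptions: read_rows, FIELD_NAMES and ROWS
are hypothetical names, not pytables API, and the rows that do not
appear in the output above are assumed to follow the same pattern as
the four that do.

```python
# Field order taken from the smalltable description in the message.
FIELD_NAMES = ("var1", "var2", "var3")

# Ten rows shaped like the rows printed in the message; the values of the
# rows not shown there are an assumption made for this sketch.
ROWS = [("d: %d" % i, i, 1024.0 * i) for i in range(10)]


def read_rows(rows, start=None, stop=None, step=None, field=None):
    """Mimic the selection logic only: a half-open row range, optional column.

    Without `field`, whole rows (tuples) are returned, loosely like a
    RecArray; with `field`, only that column is returned as a flat list.
    """
    selected = rows[slice(start, stop, step)]  # stop is excluded, as in Python
    if field is None:
        return selected
    column = FIELD_NAMES.index(field)  # raises ValueError for unknown fields
    return [row[column] for row in selected]


print(read_rows(ROWS, start=1, stop=5, field="var2"))  # [1, 2, 3, 4]
print(read_rows(ROWS, start=1, stop=5))                # four complete rows
```

The sketch returns ordinary lists rather than numarray objects, so it
illustrates which rows and fields are selected and nothing more.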