From: Francesco D. D. <ke...@li...> - 2005-09-02 14:09:10
Hi, I have an issue with PyTables performance. This is my Python test code:

---SNIP---
from tables import *

class PytTest(IsDescription):
    string = Col('CharType', 16)
    id = Col('Int32', 1)
    float = Col('Float64', 1)

h5file = openFile('probe.h5', 'a')
try:
    testGroup = h5file.root.testGroup
except NoSuchNodeError:
    testGroup = h5file.createGroup("/", "testGroup", "Test Group")
try:
    tbTest = testGroup.test
except NoSuchNodeError:
    tbTest = h5file.createTable(testGroup, 'test', PytTest, 'Test table')

import time
maxRows = 10**6

### TEST1 ###
startTime = time.time()
row = tbTest.row
for i in range(0, maxRows):
    row['string'] = '1234567890123456'
    row['id'] = 1
    row['float'] = 1.0/3.0
    row.append()
tbTest.flush()
diffTime = time.time() - startTime
print 'test1: %d rows in %s seconds (%s/s)' % (maxRows, diffTime, maxRows/diffTime)

### TEST2 ###
startTime = time.time()
row = tbTest.row
for i in range(0, maxRows):
    row['string'] = '1234567890123456'
    row['id'] = 1
    row['float'] = 1.0/3.0
diffTime = time.time() - startTime
print 'test2: %d rows in %s seconds (%s/s)' % (maxRows, diffTime, maxRows/diffTime)

### TEST3 ###
startTime = time.time()
row = tbTest.row
row['string'] = '1234567890123456'
row['id'] = 1
row['float'] = 1.0/3.0
for i in range(0, maxRows):
    row.append()
tbTest.flush()
diffTime = time.time() - startTime
print 'test3: %d rows in %s seconds (%s/s)' % (maxRows, diffTime, maxRows/diffTime)

h5file.close()
---SNIP---

This code tries to insert maxRows (10**6) rows into a table. The table is similar to the one in http://pytables.sourceforge.net/doc/PyCon.html#section4 (small table) used for benchmarking:

    class Small(IsDescription):
        var1 = Col("CharType", 16)
        var2 = Col("Int32", 1)
        var3 = Col("Float64", 1)

As you'll notice, there are 3 tests:

TEST 1: creation of rows and append() in the loop
TEST 2: creation of rows in the loop, no append (no disk use)
TEST 3: creation of the row before the loop, append() in the loop

flush() is always outside the loop, at the end.

The testbed is an AMD Athlon(tm) 64 Processor 2800+, 1 GB RAM, and a 5400 rpm disk. I've seen the same results on a dual Xeon machine, 1 GB RAM, SCSI disk.

testbed:~# python test.py
test1: 1000000 rows in 22.7905650139 seconds (43877.8064252/s)
test2: 1000000 rows in 20.3718218803 seconds (49087.4113211/s)
test3: 1000000 rows in 2.01304578781 seconds (496759.68925/s)

That throughput (40-50 krows/s) is about 10 times less than the one in http://pytables.sourceforge.net/doc/PyCon.html#section4 (small table). It seems that the row assignment

    row[fieldName] = value

takes a huge amount of time, and that the time for actually writing to disk is 10 times smaller than the assignment. Am I doing something wrong?

I've looked at the source code, and I've realized that in TableExtension.pyx, in __setitem__ of Row (called when I do row[...] = value), the line

    self._wfields[fieldName][self._unsavednrows] = value

is responsible for that slowness. self._wfields[fieldName] is a numarray.array, isn't it? Does assignment really take that much time compared to the disk? I can do a strace of the process if you need it.

I've tried with PyTables 1.1 and 1.2-b1 compiled from source, and numarray 1.1.1, 1.3.2 and 1.3.3 compiled from source, with the same results. Is this normal behaviour, in your opinion?

Thanks in advance,
kesko78
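A side note on the measurement above: since TEST2 (assignment only) is nearly as slow as TEST1 (assignment plus append), the per-field `row[name] = value` calls dominate, not the disk writes. One general way around that cost is to build finished records first and hand them over in a single bulk call, in the spirit of passing whole columns to Table.append(). The following is a pure-Python sketch of that pattern only — the names and containers are hypothetical, not the PyTables API:

```python
# Per-record field assignment: three __setitem__-style operations per row,
# mirroring the slow path measured in TEST1/TEST2.
def fill_per_field(dest, nrows):
    for _ in range(nrows):
        rec = {}
        rec['string'] = '1234567890123456'
        rec['id'] = 1
        rec['float'] = 1.0 / 3.0
        dest.append(rec)
    return dest

# Bulk path: build complete records up front, append them in one call.
def fill_bulk(dest, nrows):
    template = {'string': '1234567890123456', 'id': 1, 'float': 1.0 / 3.0}
    dest.extend(dict(template) for _ in range(nrows))
    return dest

a = fill_per_field([], 1000)
b = fill_bulk([], 1000)
assert a == b  # identical rows either way; only the call pattern differs
```

The rows produced are the same; the point is that the bulk path replaces a million small per-field operations with one pass over prepared records.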
From: Francesc A. <fa...@py...> - 2005-08-30 17:14:48
Hi Nicolas,

On Thursday 25 August 2005 at 18:52, you wrote:
> Hi Francesco,
>
> (Although a subscriber of the pytables-users mailing list, I couldn't
> post to it because the smtp server of my ISP is listed in a spam list,
> if I understood correctly. I hope you'll receive this mail directly...)

Most probably your ISP has assigned you an IP that is on a black list (i.e. a list of IPs that sent spam recently). The solution is not easy; the best option is to ask your ISP for a fixed IP (one that is not on a black list!) and send your messages from there.

> I need to store attributes (integers, floats, arrays of floats) via
> pytables, that must be read from a fortran program.
> Let's say I need to store/read from node n the following attributes:
[stripped out...]
> Is it possible to store all my attributes using the simplest code, such
> as "code version 1", and to be able to read them in C or fortran? Could
> you tell me how?

Yes, this has been discussed earlier on the list; see for example:

https://sourceforge.net/mailarchive/message.php?msg_id=12493677

In particular, read the node labeled "Caveat Emptor" in the AttributeSet section of the reference chapter of the PyTables manual:

http://pytables.sourceforge.net/html-doc/usersguide4.html#section4.15

HTH,

--
Francesc Altet
From: Francesc A. <fa...@ca...> - 2005-08-30 17:00:52
Hi John,

On Sunday 14 August 2005 at 03:18, John Pitney wrote:
> Hi,
>
> I'm having trouble installing PyTables 1.1 from the source tarball on my
> Fedora Core 3 x86_64 machine with Python 2.3.4.
>
> To get the build to find my HDF5 libs, I had to change 'lib/lib' in
> setup.py to 'lib64/lib' to reflect the location of my HDF5, zlib, etc.
> libs. The build seems to go OK, but when I try running the tests, I get
> the following:

Mmm, I think it's time to add more directories to the search path list in setup.py. Would you mind sending me your modifications to setup.py so that I can figure out how to check for the new library dirs?

> $ python test_all.py
> Traceback (most recent call last):
>   File "test_all.py", line 166, in ?
>     import tables
>   File "/home/johnp/Desktop/pytables-1.1/tables/__init__.py", line 33, in ?
>     from tables.utilsExtension import \
> ImportError: /usr/lib64/libhdf5.so.0: undefined symbol: inflate
>
> I tried running h5ls and h5dump installed from a binary HDF5 RPM on the
> example HDF5 files, and they work OK. According to ldd, they are linked
> to /usr/lib64/libhdf5.so.0.

inflate is a function from the zlib library. Please run ldd over libhdf5.so and check that all the shared libraries it depends on are accessible.

> Maybe unrelated:
>
> If I do this in the tests directory:
>
> $ ( for f in *.h5 ; do echo $f ; h5dump $f 2>&1 ; done ) | less
>
> I see a message saying "h5dump error: unable to print data" after the
> "DATA {" line on every dataset with a name starting with "tuple". Is
> that normal?

Yes, this is unrelated. The problem here is that some of the files in the test directory are compressed with the LZO and UCL compressors, which are unsupported by native HDF5 as of now. However, PyTables can take care of them (try using ptdump, for example), so don't worry.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-08-30 16:49:47
Hi Tim,

On Sunday 14 August 2005 at 03:14, Tim Churches wrote:
> The following code used to work with PyTables 0.9 but now fails with
> PyTables 1.1, and I don't understand why. Can anyone provide some clues?

This is a bug in PyTables 1.1 (as well as in PyTables 1.2-b1). The attached patch offers a *very preliminary* cure; however, performance is very low for the kind of operations you are trying to do. This is mainly due to some naive code in the factory functions for the NestedRecord object introduced in PyTables 1.1. If you don't need nested records, please use PyTables 1.0 for benchmarking. PyTables 1.1 and higher will eventually achieve similar performance (if not better), once this issue is properly addressed.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-08-30 12:45:41
Hi Peter,

On Thursday 11 August 2005 at 21:09, Peter Dobcsanyi wrote:
> While installing pytables some of the numarray related tests failed on
> an Opteron based machine. I installed both 1.1 and 1.2-b1 with the same
> result. On the i686 platform all tests succeeded for both versions.

You have discovered a bug on 64-bit platforms. Please apply the attached patch (it is for 1.2-b1, but it should work just fine with 1.1). This will be included in forthcoming releases of PyTables.

Thanks for noting this!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-08-30 10:11:48
Hi Philippe (and others),

[I'm just back from vacation, so I'll try to address the questions that have been posed on the pytables list.]

On Thursday 04 August 2005 at 11:37, phi...@ho... wrote:
> Is there a lot of work done when we execute
> tables.openFile(filename, mode)?

Well, this mainly depends on how many nodes your file has. Up to version 1.1, PyTables was able to open nodes at a speed of roughly 1000/second (on modern CPUs). If your files have far fewer nodes than, say, 1000, then you can close and reopen your file without too much latency. On the contrary, if you have a lot of nodes in your files, then it's better not to close/reopen your file, if you can afford that.

In the forthcoming PyTables 1.2, the opening of files has been accelerated quite a bit by not opening all the nodes in the file but just the ones that are being used (in fact, a completely new object tree cache has been implemented). The new cache also improves memory consumption. See the following report if this process is critical for you:

http://pytables.sourceforge.net/doc/NewObjectTreeCache.pdf

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: John P. <jo...@pi...> - 2005-08-14 01:19:07
Hi,

I'm having trouble installing PyTables 1.1 from the source tarball on my Fedora Core 3 x86_64 machine with Python 2.3.4.

To get the build to find my HDF5 libs, I had to change 'lib/lib' in setup.py to 'lib64/lib' to reflect the location of my HDF5, zlib, etc. libs. The build seems to go OK, but when I try running the tests, I get the following:

$ python test_all.py
Traceback (most recent call last):
  File "test_all.py", line 166, in ?
    import tables
  File "/home/johnp/Desktop/pytables-1.1/tables/__init__.py", line 33, in ?
    from tables.utilsExtension import \
ImportError: /usr/lib64/libhdf5.so.0: undefined symbol: inflate

I tried running h5ls and h5dump installed from a binary HDF5 RPM on the example HDF5 files, and they work OK. According to ldd, they are linked to /usr/lib64/libhdf5.so.0.

Any suggestions for how to fix this? Is it a problem with my HDF5 libraries?

Maybe unrelated: if I do this in the tests directory:

$ ( for f in *.h5 ; do echo $f ; h5dump $f 2>&1 ; done ) | less

I see a message saying "h5dump error: unable to print data" after the "DATA {" line on every dataset with a name starting with "tuple". Is that normal?

Thanks!

John
From: Tim C. <tc...@op...> - 2005-08-14 01:14:09
The following code used to work with PyTables 0.9 but now fails with PyTables 1.1, and I don't understand why. Can anyone provide some clues?

Tim C

#####################################
import time
import sys
import numarray
import numarray.random_array as ra
from numarray import memmap
from tables import *

nor = 100000
#bnor = nor*100
bnor = nor*300
filters = None

# Create a Table
starttime = time.time()
fileh = openFile("array1.h5", mode="w")
# Get the root group
root = fileh.root

# define table class
class TestTable(IsDescription):
    mycol = FloatCol(pos=1)

# create table
mytable = fileh.createTable(root, 'testtable', TestTable,
                            "Very Big Test table", filters=filters)
rowsinbuf = mytable._v_expectedrows

# load table with data
for i in xrange(0, nor, rowsinbuf):
    mytable.append([numarray.arange(i, i+rowsinbuf, type="Float64")])
fileh.flush()

print "Creating a %s element table and saving in PyTables took %.3f seconds" % (nor, time.time() - starttime)
print
fileh.close()
########################################

When run, it gives:

Traceback (most recent call last):
  File "test2.py", line 28, in ?
    mytable.append([numarray.arange(i, i+rowsinbuf, type="Float64")])
  File "/usr/local/lib/python2.4/site-packages/tables/Table.py", line 1252, in append
    raise ValueError, \
ValueError: rows parameter cannot be converted into a recarray object compliant with table '/testtable (Table(0,)) 'Very Big Test table''. The error was: <The row structure doesn't match that provided by the format specification>
From: Peter D. <pe...@cs...> - 2005-08-11 19:09:32
Hi,

I just started playing with pytables. Nice product, thank you for it. I would like to use it for storing a huge and ever growing collection of combinatorial/statistical designs. At the moment they are in bzip2-ed XML files at http://designtheory.org.

While installing pytables some of the numarray related tests failed on an Opteron based machine. I installed both 1.1 and 1.2-b1 with the same result. On the i686 platform all tests succeeded for both versions. Here is the failed test output:

---> % python test_all.py
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
PyTables version:    1.2-beta1
Extension version:   $Id: utilsExtension.pyx 1135 2005-07-29 11:49:51Z ivilata $
HDF5 version:        1.6.2
numarray version:    1.3.3
Zlib version:        1.2.2
LZO version:         1.08 (Jul 12 2002)
UCL version:         1.03 (Jul 20 2004)
BZIP2 version:       1.0.2 (30-Dec-2001)
Python version:      2.4.1 (#2, Mar 30 2005, 20:41:35)
                     [GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)]
Platform:            linux2-x86_64
Byte-ordering:       little
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Performing only a light (yet comprehensive) subset of the test suite.
If you have a big system and lots of CPU to waste and want to do a more
complete test, try passing the --heavy flag to this script. The whole
suite will take more than 10 minutes to complete on a relatively modern
CPU and around 100 MB of main memory.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Numeric (version 23.7) is present. Adding the Numeric test suite.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
...................................FF...........................................
[progress dots elided]

======================================================================
FAIL: None (test_attributes.CloseTypesTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/peter/src/languages/python/pytables/pytables-1.2-b1/test/test_attributes.py", line 522, in test01c_setIntAttributes
    numarray.array([1,2], type=stype))
AssertionError

======================================================================
FAIL: None (test_attributes.CloseTypesTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/peter/src/languages/python/pytables/pytables-1.2-b1/test/test_attributes.py", line 550, in test01d_setIntAttributes
    numarray.array([[1,2],[2,3]], type=stype))
AssertionError

----------------------------------------------------------------------
Ran 1991 tests in 77.615s

FAILED (failures=2)
<---

Regards,

Peter
From: <phi...@ho...> - 2005-08-04 09:37:18
Hi list,

Is there a lot of work done when we execute tables.openFile(filename, mode)?

In my program, I need to use the file object at the beginning and at the end of its use. For the moment, I open the file, retrieve attributes and close the file object. When the user needs to modify data, I open a new file object, save the data and close the file. Finally, I need to retrieve all the data, so I open the file one more time and close it afterwards. That was to avoid wasting memory.

Do you think it's better to open the file at the beginning, keep the file object, and reuse it for saving data and retrieving all the data?

Thanks a lot for your suggestions,

Philippe Collet
From: Francesc A. <fa...@ca...> - 2005-07-29 17:42:04
Hi List,

Before leaving on vacation (don't worry, we will be back in September), the Carabos crew has made available preliminary versions of PyTables 1.1.1 and PyTables 1.2-beta1. You can find them at:

http://www.carabos.com/downloads/pytables/preliminary/

PyTables 1.1.1 is just like 1.1, but with several optimizations included, so that the opening of files with a lot of nodes is between 1.5x and more than 2x faster.

PyTables 1.2-beta1 is a new beast wearing a complete replacement of the classic object tree: a new object tree featuring an LRU (Least Recently Used) cache. While keeping full backward compatibility (you will still be able to use the object tree as you are used to, including, for example, name completion through the TAB key), this allows files with large numbers of nodes to be opened more than 100x faster (typically) than with PyTables 1.1. The new object tree with the LRU cache will typically save you memory as well, especially for files with lots of nodes.

Although PyTables 1.2-beta1 passes all the tests flawlessly on Linux platforms, it still has some problems on Windows and, unfortunately, they are not 100% reproducible (that means a certain test can pass in one run and fail in the next!). If anybody with enough knowledge of Windows issues is willing to have a look at that, the PyTables community will be very grateful.

Enjoy your summer time!

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-07-28 11:52:34
On Wednesday 27 July 2005 at 13:51, travlr wrote:
> Although my coding experience is still fairly green, and time is pretty
> tight, it'd be a pleasure to contribute to pytables. I'll try to see if I
> can muster up the know-how to implement this for us. Much is appreciated
> of the Carabos crew.

Yes, no problem. My personal advice is that you look at how readCoordinates makes use of H5Sselect_elements, and apply the same to arrays. Read the excellent HDF5 docs available at:

http://hdf.ncsa.uiuc.edu/HDF5/doc/

Do not hesitate to ask on this list if you get stuck. The Cárabos crew will be on vacation for most of the month of August, but some of us will check e-mail from time to time. Also, I know there are quite a few other people on the list with the necessary skills to answer your questions.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Francesc A. <fa...@ca...> - 2005-07-28 11:44:47
On Wednesday 27 July 2005 at 17:37, Marcus Mendenhall wrote:
> Please note that this message will contain a full copy of the comment
> thread, including the initial issue submission, for this request,
> not just the latest update.
>
> Category: None
> Group: None
> Status: Open
> Resolution: None
> Priority: 5
> Submitted By: Marcus Mendenhall (mendenhall)
> Assigned to: Nobody/Anonymous (nobody)
> Summary: numeric attribute format is painful
>
> Initial Comment:
> Somewhere along the line, pytables has changed from writing
> numeric attributes as real HDF5 numbers to what appear to be
> pickled strings. Although this is very convenient for the python
> community, it makes the tables written by pytables very hard to use
> by other HDF5-reading software, since numbers are not stored
> numerically.
>
> I'm not sure if this is a bug, since I assume the behavior is
> intentional, but it seems sufficiently idiosyncratic that I would like to
> see it reverted (if possible) to writing numbers in native HDF5 format.

Yes, the new behaviour was introduced in PyTables 1.1 and is completely intentional. This is a consequence of supporting native HDF5 multidimensional arrays as attributes and the desire to map numarray objects directly to native attributes. That includes mapping a numarray scalar (and not a Python scalar) to an HDF5 scalar.

> Pytables provides a lot of nice extensions to the HDF5 format to
> make it more pythonish, but it seems that the goal should be to only
> use special python constructs when an object is written which really
> cannot be converted to native HDF5. Then, if the user is careful in
> selecting reasonable base types, the resulting HDF5 files are highly
> portable, which is the goal of HDF5.

If you want to continue writing attributes as native HDF5 scalars, please use numarray scalars to do that. For example:

In [3]:f = tables.openFile("/tmp/test.h5", "w")
In [5]:import numarray
In [6]:f.root._v_attrs.test1 = numarray.array(1)
In [8]:f.root._v_attrs.test2 = numarray.array(2, type=numarray.Float64)
In [9]:f.root._v_attrs.test3 = numarray.array(3, type=numarray.Int8)
In [10]:f.close()

$ h5dump /tmp/test.h5
[...]
   ATTRIBUTE "test1" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SCALAR
      DATA {
         1
      }
   }
   ATTRIBUTE "test2" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SCALAR
      DATA {
         2
      }
   }
   ATTRIBUTE "test3" {
      DATATYPE  H5T_STD_I8LE
      DATASPACE  SCALAR
      DATA {
         3
      }
   }

As you can see, this new method has the advantage of being able to completely specify the type of the native HDF5 attribute.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: travlr <vel...@gm...> - 2005-07-27 11:51:47
On 7/26/05, Francesc Altet <fa...@ca...> wrote:
>
> On Monday 25 July 2005 at 16:44, Bernard KAPLAN wrote:
> > In hdf5 you can define progressively a selected region (adding elements
> > one by one) by use of either "H5Sselect_elements" with H5S_SELECT_APPEND
> > or "H5Sselect_hyperslab" with H5S_SELECT_OR. Then you can read or write
> > the selected elements at once with "H5Dread" or "H5Dwrite".
>
> That's correct. In fact, H5Sselect_elements is already used by
> Table.readCoordinates. Applying it to support indexing in your
> suggested way was already proposed by Pete on this list a few days
> ago, but just for Table objects; expanding its use to *Array
> objects would be nice (although not high priority for us right now).
>
> Cheers,

Although my coding experience is still fairly green, and time is pretty tight, it'd be a pleasure to contribute to pytables. I'll try to see if I can muster up the know-how to implement this for us. Much is appreciated of the Carabos crew.

regards,
Pete
From: Francesc A. <fa...@ca...> - 2005-07-26 10:21:05
On Monday 25 July 2005 at 16:44, Bernard KAPLAN wrote:
> In hdf5 you can define progressively a selected region (adding elements
> one by one) by use of either "H5Sselect_elements" with H5S_SELECT_APPEND
> or "H5Sselect_hyperslab" with H5S_SELECT_OR. Then you can read or write
> the selected elements at once with "H5Dread" or "H5Dwrite".

That's correct. In fact, H5Sselect_elements is already used by Table.readCoordinates. Applying it to support indexing in the way you suggest was already proposed by Pete on this list a few days ago, but just for Table objects; expanding its use to *Array objects would be nice (although not high priority for us right now).

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From: Antonio V. <val...@co...> - 2005-07-25 15:33:00
At 16:44 on Monday 25 July 2005, Bernard KAPLAN wrote:
> Dear Antonio,
>
> In hdf5 you can define progressively a selected region (adding elements
> one by one) by use of either "H5Sselect_elements" with H5S_SELECT_APPEND
> or "H5Sselect_hyperslab" with H5S_SELECT_OR. Then you can read or write
> the selected elements at once with "H5Dread" or "H5Dwrite".

Yes, you are right. The selection operation is not as immediate as in your example, but after selection you can read all the data at once.

> Is there a function in pytables that uses this functionality? How would
> you do to extract a submatrix in pytables?

I think that at the moment this functionality is not available. Surely Francesc or Ivan can give you more explanations.

Usually I get an entire block of data from arrays:

data = h5f.myarray[0:100, 0:100]

I never encountered a situation like yours, sorry :((

> Antonio Valentino wrote:
> >At 11:36 on Monday 25 July 2005, Bernard KAPLAN wrote:
> >>Dear all,
> >
> >hi Bernard
> >
> >[...]
> >
> >>x=array([1,2,5,8])  # row index
> >>y=array([0,3,7])    # column index
> >>a=data[x,y]   # returns an array of size [4,3], 'data' is a pytables 2-d array
> >>data[x,y]=0   # set the submatrix defined by x rows and y columns to 0
> >>data[x,y]=rand([4,3])  # update of a submatrix
> >>data[x,y]=data2[x,y]   # copy of a submatrix from one array into another
[...]

--
Antonio Valentino
From: Bernard K. <ber...@be...> - 2005-07-25 14:44:50
Dear Antonio,

In hdf5 you can define progressively a selected region (adding elements one by one) by use of either "H5Sselect_elements" with H5S_SELECT_APPEND or "H5Sselect_hyperslab" with H5S_SELECT_OR. Then you can read or write the selected elements at once with "H5Dread" or "H5Dwrite".

Is there a function in pytables that uses this functionality? How would you do to extract a submatrix in pytables?

Bernard

Antonio Valentino wrote:
>At 11:36 on Monday 25 July 2005, Bernard KAPLAN wrote:
>>Dear all,
>
>hi Bernard
>
>[...]
>
>>x=array([1,2,5,8])  # row index
>>y=array([0,3,7])    # column index
>>a=data[x,y]   # returns an array of size [4,3], 'data' is a pytables 2-d array
>>data[x,y]=0   # set the submatrix defined by x rows and y columns to 0
>>data[x,y]=rand([4,3])  # update of a submatrix
>>data[x,y]=data2[x,y]   # copy of a submatrix from one array into another
>
>I think that what you are asking for can't be done in an "elegant" way.
>HDF5 supports hyperslab selection, which is powerful and elegant, but it
>requires regular spacing in the selection; see
>
>http://hdf.ncsa.uiuc.edu/HDF5/doc/RM_H5S.html#Dataspace-SelectHyperslab
>
>In order to use hyperslabs you have to be able to express the sub-matrix
>selection in terms of "start, stride, count, and block".
>It seems to me that it is not your case.
From: Antonio V. <val...@co...> - 2005-07-25 10:57:39
At 11:36 on Monday 25 July 2005, Bernard KAPLAN wrote:
> Dear all,

hi Bernard

[...]

> x=array([1,2,5,8])  # row index
> y=array([0,3,7])    # column index
> a=data[x,y]   # returns an array of size [4,3], 'data' is a pytables 2-d array
> data[x,y]=0   # set the submatrix defined by x rows and y columns to 0
> data[x,y]=rand([4,3])  # update of a submatrix
> data[x,y]=data2[x,y]   # copy of a submatrix from one array into another

I think that what you are asking for can't be done in an "elegant" way. HDF5 supports hyperslab selection, which is powerful and elegant, but it requires regular spacing in the selection; see

http://hdf.ncsa.uiuc.edu/HDF5/doc/RM_H5S.html#Dataspace-SelectHyperslab

In order to use hyperslabs you have to be able to express the sub-matrix selection in terms of "start, stride, count, and block". It seems to me that this is not your case.

ciao

--
Antonio Valentino
From: Bernard K. <ber...@be...> - 2005-07-25 09:36:37
Dear all,

I am using pytables mostly to store data in the form of simple two-dimensional arrays. For my calculations I often need to extract and update submatrices of these arrays. Unfortunately the row and column indexes I am using cannot be described as slices, because they can be quite random. Can anyone teach me a "good" (meaning memory efficient, fast and elegant) way to code submatrix extraction and update? The ideal for me would be code close to this:

x=array([1,2,5,8])  # row index
y=array([0,3,7])    # column index
a=data[x,y]   # returns an array of size [4,3], 'data' is a pytables 2-d array
data[x,y]=0   # set the submatrix defined by x rows and y columns to 0
data[x,y]=rand([4,3])  # update of a submatrix
data[x,y]=data2[x,y]   # copy of a submatrix from one array into another

Sincerely,

Bernard KAPLAN
From: travlr <vel...@gm...> - 2005-07-19 04:19:03
The patch worked fine, Francesc, and I'm going to work on the other parts as you suggested. I'll get back to you soon.

Pete

On 7/18/05, Francesc Altet <fa...@ca...> wrote:
>
> On Monday 18 July 2005 at 16:39, travlr wrote:
> > > However, your suggestion is quite good, and implementing it is a matter
> > > of adding a new case in the Column.__getitem__() special method.
> > > Something like:
> > >
> > > [...]
> > > elif isinstance(key, numarray):
> > >     return self.table.readCoordinates(key, self.name)
> >
> > This is terrific. I'd also like to mention two things... setting the attr
> > vals via file.root.group.table.cols.blah[idx] = array ...would also be
> > great. I believe this syntax congruency (a la numarray) should also be
> > extended to (py)tables.tables and (py)tables.array objects.
>
> Good suggestion. We will see what we can do. Nevertheless, it would be
> great if you could contribute the code (and docs) yourself.
>
> > I got an Error tossing in your patch:
> >
> > [...]
> > elif isinstance(key, numarray.numarraycore.NumArray):
> >     return self.table.readCoordinates(key, self.name)
> > [...]
>
> Yes. This is a bug in readCoordinates. Try applying the attached patch
> as well.
>
> Cheers,
From: Francesc A. <fa...@ca...> - 2005-07-18 18:05:51
|
On Monday 18 July 2005 16:39, travlr wrote:
> > However, your suggestion is quite good, and implementing it is a matter
> > of adding a new case in the Column.__getitem__() special method.
> > Something like:
> >
> > [...]
> > elif isinstance(key, numarray):
> >     return self.table.readCoordinates(key, self.name)
>
> This is terrific. I'd also like to mention two things... setting the attr
> vals via file.root.group.table.cols.blah[idx] = array ...would also be
> great. I believe this syntax congruency (a la numarray) should also be
> extended to (py)tables.tables and (py)tables.array objects.

Good suggestion. We will see what we can do. Nevertheless, it would be
great if you could contribute the code (and docs) yourself.

> I got an error tossing in your patch:
>
> [...]
> elif isinstance(key, numarray.numarraycore.NumArray):
>     return self.table.readCoordinates(key, self.name)
> [...]

Yes. This is a bug in readCoordinates. Try applying the attached patch
as well.

Cheers,

--
>0,0<  Francesc Altet     http://www.carabos.com/
 V V   Cárabos Coop. V.   Enjoy Data
  "-"
|
From: Francesc A. <fa...@ca...> - 2005-07-18 12:43:27
|
On Monday, 18 July 2005 at 08:07 -0400, travlr wrote:
> Actually I'm aware of the "under the hood" indexing, but frankly I
> don't use it because it shows a nice improvement for some search result
> sizes and is quite latent for others.

Yes, that's perfectly possible. This kind of indexing is mainly aimed
at large tables, while with small ones (< 10**6 entries) your mileage
may vary.

> What I'm referring to, though, with slice indexing, is being able to
> use an array as the index, as numarray does:
> file.table.cols.blah[array], where the array can be non-sequential, as
> is produced by array = numarray.where(x > y)[0].

Ooops, I see. Well, this is in fact already implemented through the
Table.readCoordinates() method:

file.table.readCoordinates(array, field="blah")

However, your suggestion is quite good, and implementing it is a matter
of adding a new case in the Column.__getitem__() special method.
Something like:

[...]
elif isinstance(key, numarray):
    return self.table.readCoordinates(key, self.name)

would be enough.

> This type of behavior, if possible, would be less cumbersome than
> iterating, and less memory-intensive when brought in and bound to an
> actual numarray type in order to facilitate this behavior.

Definitely. I'll try to add such a feature for the next release.

Cheers,

Francesc
|
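The isinstance-based dispatch Francesc sketches can be mimicked outside PyTables; the toy class below (all names hypothetical, written with NumPy rather than the old numarray) shows the branching in __getitem__, with an in-memory array standing in for the on-disk column and fancy indexing standing in for readCoordinates():

```python
import numpy as np

class ToyColumn:
    """Illustrative stand-in for a PyTables Column; not the real API."""

    def __init__(self, values):
        self._values = np.asarray(values)

    def __getitem__(self, key):
        if isinstance(key, (int, np.integer)):
            return self._values[key]       # single row
        elif isinstance(key, slice):
            return self._values[key]       # slice read, a la table[2:300:3]
        elif isinstance(key, np.ndarray):
            # The case proposed in the thread: a coordinate array, such as
            # where() returns; readCoordinates() would do this against disk.
            return self._values[key]
        raise TypeError("unsupported index type: %r" % type(key))

col = ToyColumn(np.arange(10) * 10)
coords = np.where(np.arange(10) % 3 == 0)[0]   # non-sequential coordinates
picked = col[coords]
```

The point of the dispatch is that the caller gets one uniform `col[...]` syntax while each branch can use a very different read path underneath.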
From: Francesc A. <fa...@ca...> - 2005-07-18 11:06:04
|
Hi Peter,

On Monday, 18 July 2005 at 03:06 -0400, travlr wrote:
> Having received the 1.1 release notification, I also noticed the
> "support for indexes" in the 1.0 release. Does this happen to include
> slicing with an index array (a la numarray)? An example would be the
> non-sequential return of index = numarray.where(...)[0], then applied
> to further pytables retrieval. Also, I didn't notice any mention of
> the index improvements in the documentation.

Let me explain some points first. Unfortunately, the verb "indexing"
carries many meanings in computer science. When I first announced the
PyTables support for indexing, I meant that it can sort the columns of
tables in order to do faster searches (i.e. find the values that fulfill
some condition). This kind of indexation was actually implemented back
in 0.9, although it was improved in 1.0 to allow the indexation of
tables with more than 2**31 rows. In fact, from 1.0 on, all the objects
in PyTables are supposed to support up to 2**62 entries. This kind of
support is mentioned in the User's Manual as the first item in the
"Main Features" section:

http://pytables.sourceforge.net/html-doc/usersguide1.html#section1.1

The other kind of indexation used on a regular basis in PyTables is
what you called "slicing" indexation (e.g. table[2:300:3], i.e. a la
numarray, with the exception that negative values for the step are not
supported). This kind of index support has been in PyTables since well
before 0.9.

Hope I have clarified this issue somewhat,

Francesc
|
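The slice semantics Francesc describes (start:stop:step, normalized against the table length, with no negative steps) can be sketched with Python's own slice.indices(); the helper name below is hypothetical:

```python
def slice_to_coords(s, nrows):
    """Expand a slice into explicit row numbers, roughly as a
    table[start:stop:step] read would, rejecting negative steps."""
    start, stop, step = s.indices(nrows)   # clamp to [0, nrows)
    if step < 0:
        raise ValueError("negative step values are not supported")
    return list(range(start, stop, step))

# Rows that a table[2:300:3] read would touch on a 1000-row table.
rows = slice_to_coords(slice(2, 300, 3), 1000)
```

slice.indices() is what makes negative start/stop values ("count from the end") work without any extra code, while the explicit step check mirrors the restriction mentioned above.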
From: Francesc A. <fa...@ca...> - 2005-07-18 10:28:32
|
Hola Antonio,

On Saturday, 16 July 2005 at 12:40 +0200, Antonio Valentino wrote:
> RPM build errors:
> Bad exit status from /var/tmp/rpm-tmp.16679 (%build)
> error: command 'rpmbuild' failed with exit status 1
>
> $
>
> I solved it by adding VERSION to the MANIFEST.in file, but I'm not an
> expert in this kind of question, so maybe this is not the best way to
> fix the problem.

Yes, you are right. This is fixed now in SVN trunk, along with a few
additional embellishments. I'm attaching the new MANIFEST.in.

Thanks,

Francesc
|
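For context, the fix Antonio found amounts to one line in MANIFEST.in so that the source distribution ships the VERSION file the build reads; only the `include VERSION` line is taken from the thread, the surrounding entries are hypothetical examples of what such a file might contain:

```
include VERSION
include MANIFEST.in
include LICENSE.txt
recursive-include src *.c *.h
```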
From: travlr <vel...@gm...> - 2005-07-18 07:07:09
|
Hi Francesc and company :)

Having received the 1.1 release notification, I also noticed the
"support for indexes" in the 1.0 release. Does this happen to include
slicing with an index array (a la numarray)? An example would be the
non-sequential return of index = numarray.where(...)[0], then applied
to further pytables retrieval. Also, I didn't notice any mention of
the index improvements in the documentation.

Thank you guys,

Peter
|