From: Francesc A. <fa...@ca...> - 2005-01-09 19:21:29
Yes, you have detected a bug in PyTables. You can solve it by replacing the
function H5TB_find_field in src/H5TB.c with the following one:

    int H5TB_find_field( const char *field, const char *field_list )
    {
     const char *start = field_list;
     const char *end;

     while ( (end = strstr( start, "," )) != 0 ) {
      if ( strncmp( start, field, end-start ) == 0 )
       return 1;
      start = end + 1;
     }

     if ( strcmp( start, field ) == 0 )
      return 1;

     return -1;
    }

I'll commit the necessary changes to the software repository soon.

Cheers,

A Divendres 07 Gener 2005 09:58, russ va escriure:
> Hello, I'm working with pytables 0.9.1 on windows 2000 and seem to have
> discovered a bug somewhere in the works. Essentially I have a table
> with two columns, "mmid" and "mmidn", and pytables can't read the
> "mmidn" column. Following is a short program that demonstrates the
> problem, and following that is the output of the program. The problem
> seems to be in the naming; changing "mmidn" to "abcd" fixes the
> problem. Similarly, changing the pair to "abc" and "abcd" recreates the
> problem. Any help would be appreciated.
> Thanks, -Russ
> ru...@ag...
> [quoted program and output trimmed; they appear in full in the original
> message below]

-- 
Francesc Altet   >qo<   http://www.carabos.com/
Cárabos Coop. V.   V  V   Enjoy Data
                          ""
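For clarity, the exact-match lookup that the corrected C helper implements can be sketched in pure Python (a hypothetical illustration, not part of PyTables; only the function name mirrors the C one):

```python
def find_field(field, field_list):
    """Return True only when `field` appears as a whole entry in the
    comma-separated `field_list`: the exact-match behaviour of the
    corrected H5TB_find_field, where a name must not match a longer
    name that merely starts with it ("mmid" vs "mmidn")."""
    return field in field_list.split(",")

# "mmid" and "mmidn" no longer collide:
print(find_field("mmid", "mmid,mmidn"))   # True
print(find_field("mmidn", "mmid,mmidn"))  # True
print(find_field("mmi", "mmid,mmidn"))    # False
```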
From: russ <ru...@ag...> - 2005-01-07 08:58:27
Hello, I'm working with pytables 0.9.1 on windows 2000 and seem to have
discovered a bug somewhere in the works. Essentially I have a table with two
columns, "mmid" and "mmidn", and pytables can't read the "mmidn" column.
Following is a short program that demonstrates the problem, and following
that is the output of the program. The problem seems to be in the naming;
changing "mmidn" to "abcd" fixes the problem. Similarly, changing the pair
to "abc" and "abcd" recreates the problem. Any help would be appreciated.

Thanks, -Russ
ru...@ag...

--------------------------------------------------------------------------------

    import sys, os
    import tables as hdf

    class Consolidatedtable(hdf.IsDescription):
        mmid = hdf.Int32Col(pos=4)
        mmidn = hdf.Int32Col(pos=5)

    fname = 'tmp.h5'
    print 'pytables version:', hdf.__version__

    print 'writing', fname
    h5 = hdf.openFile(fname, mode='w', title='bwdata')
    t = h5.createTable('/', 'q', Consolidatedtable, 'data', expectedrows=10000)
    r = t.row
    r['mmid'] = 0
    r['mmidn'] = 0
    r.append()
    h5.close()

    print 'reading'
    h5 = hdf.openFile(fname)
    print 'this works:', h5.root.q['mmid']
    print 'this does not:', h5.root.q['mmidn']

--------------------------------------------------------------------------------

program output:

    $ /d/py/test_mmidn.py
    pytables version: 0.9.1
    writing tmp.h5
    reading
    this works: [0]
    this does not: HDF5-DIAG: Error detected in HDF5 library version: 1.6.3-patch thread 0.  Back trace follows.
      #000: D:\hdf5\src\H5Tcompound.c line 327 in H5Tinsert(): unable to insert member
        major(13): Datatype interface
        minor(45): Unable to insert object
      #001: D:\hdf5\src\H5Tcompound.c line 418 in H5T_insert(): member overlaps with another member
        major(13): Datatype interface
        minor(45): Unable to insert object

    Traceback (most recent call last):
      File "d:/util/mypython.py", line 15, in ?
        execfile(sys.argv[0])
      File "d:/py/test_mmidn.py", line 24, in ?
        print 'this does not:', h5.root.q['mmidn']
      File "D:\Python23\lib\site-packages\tables\Table.py", line 778, in __getitem__
        return self.read(field=key)
      File "D:\Python23\lib\site-packages\tables\Table.py", line 592, in read
        return self._read(start, stop, step, field, coords)
      File "D:\Python23\lib\site-packages\tables\Table.py", line 731, in _read
        self._read_field_name(result, start, stop, step, field)
      File "hdf5Extension.pyx", line 1831, in hdf5Extension.Table._read_field_name
    RuntimeError: Problems reading table column.
From: Francesc A. <fa...@ca...> - 2004-12-24 18:09:15
A Divendres 24 Desembre 2004 06:22, Andrew Straw va escriure:
> Should I be able to do something like the following?
>
>     for row in table:
>         row['y'] = row['x'] + 2
>
> If so, how do I commit these changes to disk? table.flush() doesn't seem
> to work.

Well, you can do something like:

    # Modify field 'y'
    table.cols.y[:] = table.cols.x[:] + 2
    # Modify rows 1 to the end with a stride of 2 in field 'y'
    table.cols.y[1::2] = table.cols.x[1::2] + 2

Which is equivalent to:

    # Modify field 'y'
    table.modifyColumns(start=1, columns=[[table.cols.x[1]+2]], names=["y"])
    # Modify rows 1 to the end with a stride of 2 in field 'y'
    columns = numarray.records.fromarrays([table.cols.x[1::2]+2], formats="i4")
    table.modifyColumns(start=1, step=2, columns=columns, names=["y"])

[Example taken from the pytables user's manual (Column.__setitem__ section),
just slightly adapted to your example]

> It seems like this simple example should be in the tutorial and the unit
> tests, but I can't find it anywhere. ("How to edit a row")

Maybe I should improve the manual, especially by adding a tutorial on value
modification.

Cheers,

-- 
Francesc Altet   >qo<   http://www.carabos.com/
Cárabos Coop. V.   V  V   Enjoy Data
                          ""
From: Andrew S. <str...@as...> - 2004-12-24 05:23:10
Hi,

Should I be able to do something like the following?

    for row in table:
        row['y'] = row['x'] + 2

If so, how do I commit these changes to disk? table.flush() doesn't seem to
work.

It seems like this simple example should be in the tutorial and the unit
tests, but I can't find it anywhere. ("How to edit a row") Or maybe rows of
tables can't be edited?

Cheers!
Andrew
From: Francesc A. <fa...@ca...> - 2004-12-16 07:49:41
A Dijous 16 Desembre 2004 03:16, Andrew Straw va escriure:
> Hi again,
>
> A question/suggestion-for-the-docs:
>
> in section "6.2.2. Indexed searches" you don't mention if adding rows to
> a table with a column that was declared indexed is particularly
> expensive. What are the speed tradeoffs involved with creating the
> index "on the fly" versus afterwards with .createIndex()?

Well, while I've not done extensive benchmarks on this, my impression is
that createIndex() would be a little bit faster, but not by a large margin.
However, you should take this as a first-glance feeling on my part. If you
have better benchmarks on this subject I'll be happy to include your
results as part of the docs.

Cheers,

-- 
Francesc Altet
Who's your data daddy?  PyTables
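The tradeoff being asked about has a familiar pure-Python analogue (a sketch for intuition only, not PyTables code): maintaining a sorted index on every single append versus building it once afterwards. Both strategies yield the same index; the bulk build typically does less work per element.

```python
import bisect
import random

random.seed(0)
data = [random.random() for _ in range(1000)]

def index_on_the_fly(values):
    # Analogue of indexing while rows are appended: keep the
    # index sorted after every single insertion.
    idx = []
    for x in values:
        bisect.insort(idx, x)
    return idx

def index_afterwards(values):
    # Analogue of calling createIndex() once at the end:
    # a single bulk sort over all the data.
    return sorted(values)

# Both strategies produce the identical index.
assert index_on_the_fly(data) == index_afterwards(data)
```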
From: Andrew S. <str...@as...> - 2004-12-16 02:16:56
Hi again,

A question/suggestion-for-the-docs:

In section "6.2.2. Indexed searches" you don't mention if adding rows to a
table with a column that was declared indexed is particularly expensive.
What are the speed tradeoffs involved with creating the index "on the fly"
versus afterwards with .createIndex()?

Cheers!
Andrew
From: Andrew S. <str...@as...> - 2004-12-15 19:08:40
> I'm afraid that compression/uncompression is driven by HDF5 itself, so,
> with the actual patch it will happen in the same thread as the one doing
> the real I/O.

This is actually good news from the perspective of a multi-CPU system (or
perhaps even one with HyperThreading). It means that substantial CPU
processing can occur simultaneously in two threads, because one of them
(the thread doing compression) does not need the Python GIL, thus allowing
the other thread to continue along in Python.
From: Francesc A. <fa...@ca...> - 2004-12-15 18:43:51
A Dimecres 15 Desembre 2004 18:31, Andrew Straw va escriure:
> It may be more clear to say this patch improves "interactivity" rather
> than "performance". (Even on single CPU systems.) To explain, imagine
[...]
> usually I/O bound, not CPU bound). In fact, Python's GIL is a major
> obstacle to writing multi-threaded programs that make use of
> multi-CPUs, so I envision this patch having perhaps more significant
> impact on single CPU machines.

Aha! Very good explanation. That's been very interesting to know about.

> Note that pytable's realtime compression does make "disk access" much
> more CPU bound. Thus, releasing the GIL may not result in speed
> increases on single CPUs with compression turned on. I haven't dug
> into the pytables internals very much, but if the (de)compression can
> be done after releasing the GIL, I predict this would improve
> performance on a multi-CPU system. In that case, one CPU can busy
> itself with compressing data, while another CPU can perform other
> tasks.

I'm afraid that compression/uncompression is driven by HDF5 itself, so,
with the actual patch, it will happen in the same thread as the one doing
the real I/O. However, as you already said, pytables online compression is
more "disk access" bound than CPU bound, especially when fast compressors
(i.e. LZO) or fast CPUs are used. With the advent of newer CPUs the
bottleneck will be more and more on "disk access", so it may not be worth
the effort trying to make pure I/O and compression happen in different
threads.

> So, I hope this example is relatively clear. As I said, I will endeavor
> to write a simple (as possible) example.

That would be great.

Thanks!

-- 
Francesc Altet
Who's your data daddy?  PyTables
From: Andrew S. <str...@as...> - 2004-12-15 17:31:19
On Dec 15, 2004, at 5:53 AM, Francesc Altet wrote:
> Hi Andrew,
>
> A Dimecres 15 Desembre 2004 03:49, Andrew Straw va escriure:
>> I enclose a small patch (against CVS HEAD) which demonstrates how to
>> release the GIL during potentially long operations.
> The patched version is already in CVS HEAD.

Super!

>> For example purposes, I've only tracked down this single operation. If
>> you'd like any more explanation/justification/etc., please let me know
>> and I'll do my best.
>
> I think it would be interesting if you can provide some (small) example
> on how you can speed-up your I/O by using threading. I can manage to
> put that example (or similar) in the chapter of "Optimization Tips" in
> PyTables manual. By the way, do you think that doing multithreading on
> a single-CPU machine would made some processes more efficient? I mean,
> if you have a thread for reading and other for processing the read
> data, perhaps you can get some speed-up (while I doubt it, that would
> be just great).

I will endeavor to write a demo WhenIHaveFreeTime (DubiousWiki link ;).

It may be clearer to say this patch improves "interactivity" rather than
"performance". (Even on single CPU systems.) To explain, imagine a case
where the program must A) acquire data from an instrument and provide a
low latency data stream to something else (e.g. via ethernet), and B) save
all this data to a file. (One could further elaborate by including a
discussion of a GUI, but that's extraneous to the fundamental point.) Even
if task A is performed in a thread separate from the thread for task B,
the disk writing will block A's thread (unless the GIL is released),
resulting in possible dropped data and unnecessary latency increases.
Because writing to disk takes a potentially long time to complete but
doesn't require the Python interpreter, it's better to avoid this
situation and let A continue while B is ongoing by releasing the GIL in
B's thread.

All of this holds true for a single CPU system, because often the blocking
involved with saving to disk is not related to CPU usage but to things
like waiting for the drive (disk access is usually I/O bound, not CPU
bound). In fact, Python's GIL is a major obstacle to writing
multi-threaded programs that make use of multiple CPUs, so I envision this
patch having perhaps more significant impact on single CPU machines.

Note that pytables' realtime compression does make "disk access" much more
CPU bound. Thus, releasing the GIL may not result in speed increases on
single CPUs with compression turned on. I haven't dug into the pytables
internals very much, but if the (de)compression can be done after
releasing the GIL, I predict this would improve performance on a multi-CPU
system. In that case, one CPU can busy itself with compressing data, while
another CPU can perform other tasks.

So, I hope this example is relatively clear. As I said, I will endeavor to
write a simple (as possible) example.

Thanks again for PyTables!

Cheers!
Andrew
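Andrew's interactivity argument can be demonstrated with a stdlib-only sketch (illustrative, not PyTables code): `time.sleep()` releases the GIL, standing in here for the patched HDF5 write, so the acquisition thread keeps making progress while the writer thread blocks.

```python
import threading
import time

progress = []

def acquire_data():
    # Task A: low-latency acquisition loop that must not stall.
    for i in range(5):
        progress.append(i)
        time.sleep(0.01)

def write_to_disk():
    # Task B: a long blocking "write". time.sleep releases the GIL,
    # just as the Py_BEGIN/END_ALLOW_THREADS brackets let the real
    # HDF5 call do.
    time.sleep(0.1)

a = threading.Thread(target=acquire_data)
b = threading.Thread(target=write_to_disk)
a.start()
b.start()
a.join()
b.join()

# All five acquisition steps completed even though B was blocking
# for the whole run.
print(len(progress))  # 5
```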
From: Francesc A. <fa...@ca...> - 2004-12-15 13:53:44
Hi Andrew,

A Dimecres 15 Desembre 2004 03:49, Andrew Straw va escriure:
> I enclose a small patch (against CVS HEAD) which demonstrates how to
> release the GIL during potentially long operations. I think all
> potentially long operations (basically most of the C H5* functions)
> should be bracketed by these statements.

Done. I've applied your patch and, besides, looked at the places where
potentially long operations would happen and bracketed them with
BEGIN_ALLOW_THREADS/END_ALLOW_THREADS. After some work (some variables
passed to the C functions were Python objects), all the unit tests pass
fine. The patched version is already in CVS HEAD.

> For example purposes, I've only tracked down this single operation. If
> you'd like any more explanation/justification/etc., please let me know
> and I'll do my best.

I think it would be interesting if you can provide some (small) example on
how you can speed up your I/O by using threading. I can manage to put that
example (or similar) in the "Optimization Tips" chapter of the PyTables
manual. By the way, do you think that doing multithreading on a single-CPU
machine would make some processes more efficient? I mean, if you have one
thread for reading and another for processing the read data, perhaps you
can get some speed-up (while I doubt it, that would be just great).

Thanks for your contribution!

-- 
Francesc Altet
Who's your data daddy?  PyTables
From: Francesc A. <fa...@ca...> - 2004-12-15 12:12:03
Hi Norbert,

A Dimarts 14 Desembre 2004 19:04, Norbert Nemec va escriure:
> I understand that pytables buffers data until flush() is called. I also
> understand that this buffer may also be flushed automatically when it
> is full.
>
> Would it be possible to prohibit this automatic flushing?

No, the flushing mechanism works exactly as you have described.

> The idea behind this is that I would like to have the h5-file intact,
> at least with a good probability, if the program is interrupted or
> breaks down. If it would be possible to defer all writing actions to an
> explicit call of flush() (or the closing of the file), I could simply
> write to pytables in some random order, and then just call flush() when
> the data is in a coherent state. Even though this will not give us real
> atomic operations, it would still be good enough for many applications
> that continuously write to a file.
>
> For performance reasons, this should probably be an option.

Well, if you are interested in providing your own buffers to Table
objects, you can always create a RecArray (or a list of
Numeric/NumArray/CharArray objects) on your own, fill it with your info
and then call Table.append(recarray) to write it to disk. Table.append()
forces an immediate write to disk and doesn't touch the internal table
buffers at all. This would be a nice approach for doing what you are
interested in.

> It should also be clear which operations are buffered at all.

Normally, the only buffered operations in PyTables are those related to
the extension class Row (Row.append(), Row.__getitem__() and
Row.__setitem__()). All the rest do not use buffering at all (unless you
write big chunks of data on your own as described above).

-- 
Francesc Altet
Who's your data daddy?  PyTables
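The deferred-write pattern Norbert describes can be sketched in a few lines of plain Python (an illustration of the idea only; `DeferredWriter` is hypothetical and not a PyTables API): rows accumulate in memory and reach the backing store only on an explicit flush(), so an interruption leaves the store in the last coherent state.

```python
class DeferredWriter:
    """Buffer rows in memory; write them out only on explicit flush()."""

    def __init__(self, store):
        self.store = store     # a list standing in for the .h5 file
        self._pending = []

    def append(self, row):
        self._pending.append(row)  # buffered; nothing hits the store yet

    def flush(self):
        # One handover point: the store only ever sees data that was
        # coherent at the moment of the explicit flush.
        self.store.extend(self._pending)
        self._pending = []

store = []
w = DeferredWriter(store)
w.append({'x': 1})
w.append({'x': 2})
print(store)   # [] -- still empty before flush()
w.flush()
print(store)   # [{'x': 1}, {'x': 2}]
```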
From: Ivan V. i B. <iv...@ca...> - 2004-12-15 08:17:00
On Tue, Dec 14, 2004 at 06:50:43PM +0100, Norbert Nemec wrote:
> [...]
> I don't think that there is any point in giving that warning. If one is
> able to create a node with a non-pythonic name, he obviously knows how
> to work around natural naming, so he can access it afterwards as well.

I still believe that the warning would be useful. Imagine the following
statement:

    >>> t = h5f.createEArray('/measurements', '1st', ...)

That doesn't need setattr() voodoo and still creates a non-pythonic id.
Other strange characters may also slip in (i.e. dots in file names when
using FileNode), so issuing a warning is justified. And remember that
using it gives us for free the possibility of turning it into an exception
(or completely off), if the user wishes so.

-- 
Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
Cárabos Coop. V.   V  V   Enjoy Data
                          ""
From: Andrew S. <str...@as...> - 2004-12-15 02:49:15
Dear Francesc (and others),

I enclose a small patch (against CVS HEAD) which demonstrates how to
release the GIL during potentially long operations. I think all
potentially long operations (basically most of the C H5* functions) should
be bracketed by these statements.

For example purposes, I've only tracked down this single operation. If
you'd like any more explanation/justification/etc., please let me know and
I'll do my best.

It's worth checking the Pyrex-outputted C file to ensure that there are no
calls to the C Python API between the BEGIN_ALLOW_THREADS and the
END_ALLOW_THREADS. I did this for this example. Other cases may require
that the C return value (e.g. an integer) is stored in an intermediate
variable (as in this case) so that the GIL can be acquired again before
raising a Python exception upon error return from C.

Hoping this (and similar for other potentially long operations) will make
it into PyTables,
Andrew

    RCS file: /cvsroot/pytables/pytables/src/hdf5Extension.pyx,v
    retrieving revision 1.150
    diff -c -r1.150 hdf5Extension.pyx
    *** src/hdf5Extension.pyx	9 Dec 2004 13:01:58 -0000	1.150
    --- src/hdf5Extension.pyx	15 Dec 2004 02:35:44 -0000
    ***************
    *** 116,121 ****
    --- 116,125 ----
        char *PyString_AsString(object string)
        object PyString_FromString(char *)

      + # To release global interpreter lock (GIL) for threading
      + void Py_BEGIN_ALLOW_THREADS()
      + void Py_END_ALLOW_THREADS()
      +
        # To access to str and tuple structures. This does not work with Pyrex 0.8
        # This is not necessary, though
        # ctypedef class __builtin__.str [object PyStringObject]:
    ***************
    *** 1658,1666 ****
    --- 1662,1677 ----
          if not self._open:
            self._open_append(recarr)

      +   # release GIL (allow other threads to use the Python interpreter)
      +   Py_BEGIN_ALLOW_THREADS
      +
          # Append the records:
          ret = H5TBOappend_records(&self.dataset_id, &self.mem_type_id,
                                    nrecords, self.totalrecords, self.rbuf)

      +   # acquire GIL (disallow other threads from using the Python interpreter)
      +   Py_END_ALLOW_THREADS
      +
          if ret < 0:
            raise RuntimeError("Problems appending the records.")
From: Norbert N. <No...@ne...> - 2004-12-14 18:04:57
Hi there,

I understand that pytables buffers data until flush() is called. I also
understand that this buffer may also be flushed automatically when it is
full.

Would it be possible to prohibit this automatic flushing?

The idea behind this is that I would like to have the h5-file intact, at
least with a good probability, if the program is interrupted or breaks
down. If it would be possible to defer all writing actions to an explicit
call of flush() (or the closing of the file), I could simply write to
pytables in some random order, and then just call flush() when the data is
in a coherent state. Even though this will not give us real atomic
operations, it would still be good enough for many applications that
continuously write to a file.

For performance reasons, this should probably be an option.

It should also be clear which operations are buffered at all.

Greetings,
Norbert

-- 
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
From: Francesc A. <fa...@ca...> - 2004-12-14 17:52:03
A Dimarts 14 Desembre 2004 16:39, Ivan Vilata i Balaguer va escriure:
> Allowing non-pythonic ids in nodes would make node naming a lot
> easier for the user, specially when working with visual interfaces
> like ViTables, where interactive scripting is not required. This
> switch in PyTables behaviour should not cause backwards compatibility
> problems, I believe. On the other hand, since PyTables interactive
> ease of use is one of its distinctive features, I totally agree with
> issuing a warning when creating a non-pythonically named node. A new
> warning class (for instance, NodeNameWarning) should be used for that.
> In this way, if the user wants only natural naming, she can use the
> warning filter to turn these warnings into exceptions.

Ok. So I think the best would be to add a new file, say 'PTExceptions.py',
where one can define the new exceptions and warnings like NodeNameWarning
and AttributeNameWarning (although probably NaturalNameWarning, which
would include both, would be better), and to replace the NameError with,
say, NaturalNameWarning. That would allow the user to deal with both
compliant and non-compliant natural naming names, but in the latter case a
warning will be issued. I think this is a well-balanced solution. In this
file one can also put other exceptions that would better describe the
reasons for failure.

-- 
Francesc Altet
Who's your data daddy?  PyTables
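The proposal above can be sketched as follows (a hypothetical illustration of the idea; the class name NaturalNameWarning comes from the thread, but the code is not the actual PyTables implementation):

```python
import keyword
import re
import warnings

class NaturalNameWarning(UserWarning):
    """Issued for node or attribute names unusable via natural naming."""

_pythonic_re = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')

def check_natural_name(name):
    # Warn (instead of raising NameError) when the name is not a valid
    # Python identifier, so any string is still accepted as a node name.
    if not _pythonic_re.match(name) or keyword.iskeyword(name):
        warnings.warn("name %r cannot be reached through natural naming"
                      % (name,), NaturalNameWarning)

# A user who wants strict natural naming can escalate the warning:
# warnings.simplefilter('error', NaturalNameWarning)
check_natural_name("data1")   # silent
check_natural_name("E=1.5")   # emits a NaturalNameWarning
```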
From: Norbert N. <Nor...@gm...> - 2004-12-14 17:50:53
Am Dienstag, 14. Dezember 2004 14:23 schrieb Francesc Altet:
> What about replacing the NameError exception with an UserWarning? The
> other possibility is allowing any string as node or attribute name,
> even if they cannot be used by natural naming, without any warning at
> all. What do you think is best? Other people would chime in?

I don't think that there is any point in giving that warning. If one is
able to create a node with a non-pythonic name, he obviously knows how to
work around natural naming, so he can access it afterwards as well.

-- 
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
From: Ivan V. i B. <iv...@ca...> - 2004-12-14 15:39:51
Well, now I must say that the little piece of Python that Norbert sent has
opened my eyes quite a lot! Wow! Thank you!

I've always had the feeling that the imposition of natural naming for easy
interactive access was in a way hindering some possible usage scenarios
for PyTables. That became clearer to me while developing the
tables.nodes.FileNode module, when I realized that files could be stored
in a PyTables file easily, but their names would be a pain to keep (one
would have to use the node's title or some other attribute). Wait, we even
have a filesystem path-like naming scheme!

Allowing non-pythonic ids in nodes would make node naming a lot easier for
the user, especially when working with visual interfaces like ViTables,
where interactive scripting is not required. This switch in PyTables
behaviour should not cause backwards compatibility problems, I believe. On
the other hand, since PyTables interactive ease of use is one of its
distinctive features, I totally agree with issuing a warning when creating
a non-pythonically named node. A new warning class (for instance,
NodeNameWarning) should be used for that. In this way, if the user wants
only natural naming, she can use the warning filter to turn these warnings
into exceptions.

Well, this is only my opinion... import disclaimer

-- 
Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
Cárabos Coop. V.   V  V   Enjoy Data
                          ""
From: Ivan V. i B. <iv...@ca...> - 2004-12-14 15:17:27
On Mon, Dec 13, 2004 at 07:31:15PM +0100, Francesc Altet wrote:
> A Dilluns 13 Desembre 2004 18:53, Norbert Nemec va escriure:
> [...]
> > In any case, the correct error to raise in the first case would seem
> > to be ValueError.
> >
> > I'm really confused now. Anybody has better ideas?
>
> Well, IMO NameError is the exception name that better fits in this case
> (much better than SyntaxError). However, I guess that something like a
> new NodenameError would be the best.
> [...]

I agree with Norbert in that the kind of exception that should be raised
is ValueError, since the type of the argument is right (no TypeError), the
Python syntax is right (no SyntaxError), and there is no reference to a
non-existent name (no NameError). In the same spirit as previous mails,
maybe a NodeNameError or something like that, inheriting from ValueError,
may do the job.

(More on non-natural naming later. ;)

-- 
Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
Cárabos Coop. V.   V  V   Enjoy Data
                          ""
From: Norbert N. <Nor...@gm...> - 2004-12-14 14:59:38
As I just read from the documentation, hdf5 node-names may contain any
character except '/' and '\0'. I think pytables should follow this
principle. Of course natural naming will not work for that kind of
identifier, so you have to use getattr/setattr manually if you want to
access nodes with funny names.

Am Dienstag, 14. Dezember 2004 08:57 schrieb Norbert Nemec:
> Am Montag, 13. Dezember 2004 19:31 schrieb Francesc Altet:
> > A Dilluns 13 Desembre 2004 18:53, Norbert Nemec va escriure:
> > > Using SyntaxError for non-pythonic identifiers is a nice idea, even
> > > though it should be noted that the following code:
> > >
> > > ----------
> > > class A:
> > >     pass
> > > a = A()
> > > setattr(a, "!$%#**", 42)
> > > print a.__dict__
> > > ----------
> > >
> > > works fine and gives:
> > >
> > > {'!$%#**': 42}
> > >
> > > This attribute is therefore inaccessible by natural naming but
> > > still legal. Shouldn't pytables behave the same?
> >
> > I don't think so. Allowing that kind of identifiers would break
> > natural naming and that should be avoided for the sake of usability.
>
> Of course, natural naming is broken by that. On the other hand, natural
> naming may not always be feasible anyway.
>
> In my case, for example, I produce one .h5 file containing several
> groups. The core distinction between them is just one floating point
> parameter, say E, that runs over several values, say {0.0, 1.5, 2.3}.
> Most natural would be to call the groups "E=0.0", "E=1.5" and "E=2.3".
> Every other name I could give is artificial and meaningless.
> (Currently, I'm creating names like "data1", "data2" and "data3".)
> Natural naming is not an option anyway, since I'm looping over the
> groups all the time and even create them with a loop.
>
> Considering all of this, I would suggest that any kind of name should
> be allowed for nodes (are there any limitations given by HDF5?) Even
> reserved prefixes need not necessarily be prohibited. Everyone has to
> understand that they cannot be accessed by natural naming, but if you
> don't use that, why should you be unnecessarily restricted?
>
> I think this should be cleared first, since it might mean that we don't
> need any Exceptions for that case at all.
>
> Greetings,
> Norbert

-- 
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
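Norbert's point, that non-pythonic names remain fully accessible, can be seen with getattr/setattr on a plain object (a stand-in sketch; the Group class here is hypothetical, not the PyTables one):

```python
class Group(object):
    """Minimal stand-in for a container of named child nodes."""
    pass

g = Group()
setattr(g, "E=1.5", "some node")   # legal attribute, non-pythonic name

# "g.E=1.5" would be a SyntaxError, but getattr reaches it fine:
print(getattr(g, "E=1.5"))         # some node
print("E=1.5" in vars(g))          # True
```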
From: Francesc A. <fa...@ca...> - 2004-12-14 13:23:34
A Dimarts 14 Desembre 2004 13:01, vareu escriure:
> In my case, for example, I produce one .h5 file containing several
> groups. The core distinction between them is just one floating point
> parameter, say E, that runs over several values, say {0.0, 1.5, 2.3}.
> Most natural would be to call the groups "E=0.0", "E=1.5" and "E=2.3".
> Every other name I could give is artificial and meaningless.
> (Currently, I'm creating names like "data1", "data2" and "data3".)
> Natural naming is not an option anyway, since I'm looping over the
> groups all the time and even create them with a loop.
>
> Considering all of this, I would suggest that any kind of name should
> be allowed for nodes (are there any limitations given by HDF5?) Even
> reserved prefixes need not necessarily be prohibited. Everyone has to
> understand that they cannot be accessed by natural naming, but if you
> don't use that, why should you be unnecessarily restricted?

Good observation. And no, HDF5 does not put any limitations (that I'm
aware of) on node names. Mmmm... on the other hand, I really find natural
naming to be *really* useful, especially in interactive mode. However, I
agree that this should be a decision for the user. What about replacing
the NameError exception with a UserWarning? The other possibility is
allowing any string as node or attribute name, even if it cannot be used
by natural naming, without any warning at all. What do you think is best?
Would other people chime in?

> I think this should be cleared first, since it might mean that we don't
> need any Exceptions for that case at all.

Agreed.

-- 
Francesc Altet
Who's your data daddy?  PyTables
From: Norbert N. <Nor...@gm...> - 2004-12-14 07:57:18
|
On Monday, 13 December 2004 19:31, Francesc Altet wrote:
> On Monday 13 December 2004 18:53, Norbert Nemec wrote:
> > Using SyntaxError for non-pythonic identifiers is a nice idea, even
> > though it should be noted that the following code:
> >
> > ----------
> > class A:
> >     pass
> > a = A()
> > setattr(a, "!$%#**", 42)
> > print a.__dict__
> > ----------
> >
> > works fine and gives:
> >
> > {'!$%#**': 42}
> >
> > This attribute is therefore inaccessible by natural naming but still
> > legal. Shouldn't pytables behave the same?
>
> I don't think so. Allowing that kind of identifier would break natural
> naming, and that should be avoided for the sake of usability.

Of course, natural naming is broken by that. On the other hand, natural
naming may not always be feasible anyway.

In my case, for example, I produce one .h5 file containing several groups.
The core distinction between them is just one floating point parameter, say
E, that runs over several values, say {0.0, 1.5, 2.3}. Most natural would be
to call the groups "E=0.0", "E=1.5" and "E=2.3". Every other name I could
give is artificial and meaningless. (Currently, I'm creating names like
"data1", "data2" and "data3".) Natural naming is not an option anyway, since
I'm looping over the groups all the time and even create them with a loop.

Considering all of this, I would suggest that any kind of name should be
allowed for nodes (are there any limitations given by HDF5?) Even reserved
prefixes need not necessarily be prohibited. Everyone has to understand that
they cannot be accessed by natural naming, but if you don't use that, why
should you be unnecessarily restricted?

I think this should be cleared first, since it might mean that we don't need
any Exceptions for that case at all.

Greetings,
Norbert

--
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
|
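The naming problem Norbert describes can be demonstrated in plain Python, independently of PyTables. A name such as "E=0.0" is a perfectly legal key in an instance `__dict__`, but it is not a valid Python identifier, so dotted ("natural naming") access can never reach it. The `Group` class below is just an illustrative stand-in, not the PyTables class:

```python
# Sketch (plain Python, no PyTables): a group name such as "E=0.0" is a
# legal attribute key, but not a valid identifier, so it is reachable
# only through getattr(), never through dotted access.
class Group:
    pass

g = Group()
for E in (0.0, 1.5, 2.3):
    setattr(g, "E=%.1f" % E, "data for E=%.1f" % E)

print(getattr(g, "E=1.5"))  # works fine
# "g.E=1.5" would parse as an assignment to g.E, so natural naming fails.
```

This is exactly the trade-off under discussion: looping over such nodes with `getattr` is easy, but the convenience of tab-completing `g.data1` in an interactive session is lost.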
From: Francesc A. <fa...@ca...> - 2004-12-13 18:31:27
|
On Monday 13 December 2004 18:53, Norbert Nemec wrote:
> Using SyntaxError for non-pythonic identifiers is a nice idea, even though
> it should be noted that the following code:
>
> ----------
> class A:
>     pass
> a = A()
> setattr(a, "!$%#**", 42)
> print a.__dict__
> ----------
>
> works fine and gives:
>
> {'!$%#**': 42}
>
> This attribute is therefore inaccessible by natural naming but still
> legal. Shouldn't pytables behave the same?

I don't think so. Allowing that kind of identifier would break natural
naming, and that should be avoided for the sake of usability.

> In any case, the correct error to raise in the first case would seem to be
> ValueError.
>
> I'm really confused now. Does anybody have better ideas?

Well, IMO NameError is the exception name that fits best in this case (much
better than SyntaxError). However, I guess that something like a new
NodenameError would be best.

We have to decide, however, what to do with attribute names. Should we allow
beasts like "!$%#**", or should they be avoided altogether? And should we
issue a different exception in case of bad attribute names, like
AttrnameError? My opinion is that "!$%#**" beasts should be avoided and that
introducing an AttrnameError would be nice. Perhaps Ivan has more ideas on
that.

Cheers,

--
Francesc Altet
Who's your data daddy? PyTables
|
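The NodenameError idea floated here could look roughly like the sketch below. Both exception names, the regex, and the validity rules are assumptions drawn from this discussion, not the actual PyTables implementation (the real `utils.checkNameValidity` also rejects reserved prefixes, which this sketch omits):

```python
import keyword
import re

# Hypothetical exception names from the discussion -- not actual
# PyTables classes at the time of this thread.
class NodenameError(ValueError):
    """Invalid name for a node (group, table, array)."""

class AttrnameError(ValueError):
    """Invalid name for a node attribute."""

_identifier = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def check_node_name(name):
    # Sketch in the spirit of utils.checkNameValidity; the real
    # function's rules differ (e.g. it also checks reserved prefixes).
    if not _identifier.match(name) or keyword.iskeyword(name):
        raise NodenameError("not a valid Python identifier: %r" % name)

check_node_name("data1")    # passes silently
# check_node_name("E=0.0")  # would raise NodenameError
```

Deriving both classes from `ValueError` matches Norbert's observation that a bad value passed to a function conventionally raises `ValueError`, while still giving callers a specific class to catch.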
From: Norbert N. <Nor...@gm...> - 2004-12-13 17:53:52
|
Hi there,

starting to sort out some of the raised exceptions, I hit another case that
I don't know how to solve:

utils.checkNameValidity currently raises a NameError when a name uses a
reserved prefix or is not a legal Python identifier. In contrast to that,
the documentation states in several places that the latter case causes a
SyntaxError.

Following the Python documentation, NameError is meant for local and global
symbols only. Despite the appealing name, I'm therefore not sure whether it
is a good idea to use it for this purpose.

Using SyntaxError for non-pythonic identifiers is a nice idea, even though
it should be noted that the following code:

----------
class A:
    pass
a = A()
setattr(a, "!$%#**", 42)
print a.__dict__
----------

works fine and gives:

{'!$%#**': 42}

This attribute is therefore inaccessible by natural naming but still legal.
Shouldn't pytables behave the same?

In any case, the correct error to raise in the first case would seem to be
ValueError.

I'm really confused now. Does anybody have better ideas?

Greetings,
Norbert

--
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
|
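A small complement to the snippet in the message above (plain Python, nothing PyTables-specific): the oddly named attribute is not lost, it is merely unreachable by dotted access, which is the point of "inaccessible by natural naming but still legal". The same code in modern print-function syntax:

```python
class A:
    pass

a = A()
setattr(a, "!$%#**", 42)     # legal: an instance __dict__ accepts any string key
print(a.__dict__)            # {'!$%#**': 42}
print(getattr(a, "!$%#**"))  # 42 -- reachable via getattr, never via a.!$%#**
```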
From: Francesc A. <fa...@ca...> - 2004-12-11 11:57:00
|
On Friday 10 December 2004 15:42, Norbert Nemec wrote:
> OK, doing a quick grep through the sources, I realize that there is quite
> some work to do. I don't know enough about the general state of exceptions
> in pytables to really estimate what has to be done. I would suggest that a
> better solution than "abusing" KeyError, LookupError and NameError would be
> to define new exceptions somewhere (NodeError was suggested by Ivan, maybe
> additional ones, unless you want to merge all node errors together) and do
> a plain search and replace. With NodeError inheriting AttributeError, this
> would have the immediate advantage of fixing hasattr (which should
> definitely work for nodes as well, since they are defined using
> __setattr__/__getattr__).

Good enough. So let's work on implementing a NodeError exception first, in
order to solve those inconsistencies, and then progress towards a better
exception system.

If you are willing to plan that, no problem :) Otherwise, I'll try myself to
carry out the job.

Cheers,

--
Francesc Altet
Who's your data daddy? PyTables
|
From: Norbert N. <Nor...@gm...> - 2004-12-10 14:42:15
|
On Friday, 10 December 2004 09:39, Francesc Altet wrote:
> On Friday 10 December 2004 08:23, Norbert Nemec wrote:
> > > However, at a Python level, nodes are accessed as attributes (or
> > > members), so the right exception should be AttributeError. If we
> > > want to give this kind of access an additional meaning in PyTables,
> > > then maybe a new subclass of AttributeError (e.g. NodeError) should be
> > > defined and used instead. In addition, this would not break
> > > compatibility with existing code where AttributeError was caught.
> > > Well, this is just an alternative. Bye!
> >
> > I like this idea.
>
> Yes, me too. However, that implies a bit more work, and perhaps a better
> redesign of *many* other exceptions in PyTables. While this is a task
> that should be done, I'm still very tempted to use a KeyError, just to
> differentiate behaviours. Although perhaps it is still better to let the
> current LookupError (__getattr__ and __delattr__) and NameError
> (__setattr__) happen until the big exception redesign happens.

OK, doing a quick grep through the sources, I realize that there is quite
some work to do. I don't know enough about the general state of exceptions
in pytables to really estimate what has to be done. I would suggest that a
better solution than "abusing" KeyError, LookupError and NameError would be
to define new exceptions somewhere (NodeError was suggested by Ivan, maybe
additional ones, unless you want to merge all node errors together) and do a
plain search and replace. With NodeError inheriting AttributeError, this
would have the immediate advantage of fixing hasattr (which should
definitely work for nodes as well, since they are defined using
__setattr__/__getattr__).

--
_________________________________________
Norbert Nemec
Bernhardstr. 2 ... D-93053 Regensburg
Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
eMail: <No...@Ne...>
|
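The `hasattr` point can be shown with a minimal sketch; the class names below are hypothetical, not the PyTables implementation. (One caveat on the era: Python 2's `hasattr` swallowed any exception, while Python 3 only catches `AttributeError`, which is precisely why a `NodeError` deriving from `AttributeError` cooperates with it.)

```python
# Minimal sketch of the NodeError proposal (hypothetical classes).
# Because NodeError derives from AttributeError, hasattr() treats a
# missing node like a missing attribute and returns False, instead of
# letting a LookupError escape.
class NodeError(AttributeError):
    """Raised when a node is not found in a group."""

class Group:
    def __init__(self, children=()):
        self._children = dict(children)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, i.e. for
        # node access by natural naming.
        try:
            return self._children[name]
        except KeyError:
            raise NodeError("group has no child node %r" % name)

g = Group({"data1": "some table"})
print(hasattr(g, "data1"))  # True
print(hasattr(g, "data9"))  # False -- NodeError is an AttributeError
```

Existing user code that catches `AttributeError` around node access keeps working unchanged, which is the compatibility argument Ivan raised.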