From: Francesc A. <fa...@gm...> - 2012-10-31 20:02:23

On 10/31/12 10:12 AM, Andrea Gavana wrote:
> Thank you, I have tried different approaches and they all seem to run
> more or less at the same speed (see below). I had to slightly modify
> your code from:
>
>     table[i] = myrow
>
> to
>
>     table[i] = [myrow]
>
> to avoid exceptions.
> [...]
> Maybe I would be better off with a 4D array
> (NUM_OBJECTS, NUM_SIM, TSTEPS, 7) as a table, but then I will lose the
> ability to reference the "objects" by their names...

You should keep experimenting with different approaches until you
discover the one that works best for you.

Regarding using the 4D array as a table: I might be misunderstanding
your problem, but you can still reference objects by name with
something like:

    for row in table.where("name == '%s'" % my_name):
        table[row.nrow] = ...

You may want to index the 'name' column for better performance.

--
Francesc Alted
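A compact sketch of that lookup-and-update pattern, assuming the table
layout from this thread (a 'name' string column plus a 'results'
column); the file and node names are made up, and whether createIndex()
is available depends on the PyTables version/flavor in use:

    import numpy
    import tables

    h5file = tables.openFile('simulations.h5', mode='a')
    table = h5file.root.objects  # hypothetical table with 'name'/'results' columns

    # Indexing the 'name' column speeds up repeated name-based queries.
    table.cols.name.createIndex()

    my_name = 'KB0001'
    for nrow in table.getWhereList("name == '%s'" % my_name):
        newrow = table[nrow]
        newrow['results'][:] = numpy.random.random(newrow['results'].shape)
        table[nrow] = [newrow]  # whole-row assignment, as discussed above

    table.flush()
    h5file.close()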
From: Andrea G. <and...@gm...> - 2012-10-31 14:12:44

Hi Francesc & All,

On 31 October 2012 14:13, Francesc Alted wrote:
> For modifying row values you need to assign a complete row object.
> Something like:
>
>     for i in range(len(table)):
>         myrow = table[i]
>         myrow['results'][:NUM_SIM, :, :] = \
>             numpy.random.random(size=(NUM_SIM, len(ALL_DATES), 7))
>         table[i] = myrow
>
> You may also use Table.modifyColumn() for better efficiency. Look at
> the different modification methods here:
>
> http://pytables.github.com/usersguide/libref/structured_storage.html#table-methods-writing
>
> and experiment with them.

Thank you, I have tried different approaches and they all seem to run
more or less at the same speed (see below). I had to slightly modify
your code from:

    table[i] = myrow

to

    table[i] = [myrow]

to avoid exceptions.

In the newly attached file, I switched to Blosc for compression (with
compression level 1) and ran a few sensitivities. Calling the attached
script as:

    python pytables_test.py NUM_SIM

where "NUM_SIM" is an integer, I get the following timings and file
sizes:

    C:\MyProjects\Phaser\tests>python pytables_test.py 10
    Number of simulations   : 10
    H5 file creation time   : 0.879s
    Saving results for table: 6.413s
    H5 file size (MB)       : 193

    C:\MyProjects\Phaser\tests>python pytables_test.py 100
    Number of simulations   : 100
    H5 file creation time   : 4.155s
    Saving results for table: 86.326s
    H5 file size (MB)       : 1935

I don't think I will try the 1,000 simulations case :-) . I believe I
still don't understand what the best strategy would be for my problem.
I basically need to save all the simulation results for all the 1,200
"objects", each of which has a timeseries matrix of 600x7 size. In the
GUI I have, these 1,200 "objects" are grouped into multiple categories,
and multiple categories can reference the same "object", i.e.:

    Category_1: object_1, object_23, object_543, etc...
    Category_2: object_23, object_100, object_543, etc...

So my idea was to save all the "objects" results to disk and, upon the
user's choice, build the categories' results "on the fly", i.e. by
seeking the H5 file on disk for the "objects" belonging to that
specific category and summing up all their results over time (the 600
time-steps). Maybe I would be better off with a 4D array (NUM_OBJECTS,
NUM_SIM, TSTEPS, 7) as a table, but then I will lose the ability to
reference the "objects" by their names...

I welcome in advance any suggestion on how to improve my thinking on
this matter. Thanks for all the answers I received.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net

# ------------------------------------------------------------- #
def ask_mailing_list_support(email):

    if mention_platform_and_version() and include_sample_app():
        send_message(email)
    else:
        install_malware()
        erase_hard_drives()
# ------------------------------------------------------------- #
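For reference, the Blosc setup described above comes down to a single
Filters object; a minimal sketch (the file name is made up, and passing
filters= at file-open time only sets the default for newly created
nodes):

    import tables

    # Blosc at compression level 1, as in the test script above.
    filters = tables.Filters(complevel=1, complib='blosc')

    h5file = tables.openFile('pytables_test.h5', mode='w', filters=filters)
    # Tables and arrays created under h5file now inherit Blosc compression,
    # unless an explicit filters= argument overrides it per node.
    h5file.close()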
From: Francesc A. <fa...@gm...> - 2012-10-31 13:13:30

On 10/31/12 4:30 AM, Andrea Gavana wrote:
> Thank you for all your suggestions. I managed to slightly modify the
> script you attached and I am also experimenting with compression.
> However, in the newly attached script the underlying table is not
> modified, i.e., this assignment:
>
>     for p in table:
>         p['results'][:NUM_SIM, :, :] = numpy.random.random(
>             size=(NUM_SIM, len(ALL_DATES), 7))
>     table.flush()

For modifying row values you need to assign a complete row object.
Something like:

    for i in range(len(table)):
        myrow = table[i]
        myrow['results'][:NUM_SIM, :, :] = \
            numpy.random.random(size=(NUM_SIM, len(ALL_DATES), 7))
        table[i] = myrow

You may also use Table.modifyColumn() for better efficiency. Look at
the different modification methods here:

http://pytables.github.com/usersguide/libref/structured_storage.html#table-methods-writing

and experiment with them.

> Seems to be doing nothing (i.e., printing out the 'results' attribute
> for an object class prints a matrix full of zeros instead of random
> numbers...). Also, on my PC at work, the file creation time is
> tremendously slow (76 seconds for 100 simulations - a 1.9 GB file).
> [...]
> This is what my script is printing out:
>
>     H5 file creation time: 7.652

Hmm, on my modest Core2 laptop I'm getting this:

    H5 file creation time: 1.294

Also, by using compression with zlib level 1:

    H5 file creation time: 1.900

And using Blosc level 5:

    H5 file creation time: 0.244

HTH,

--
Francesc Alted
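A rough sketch of the Table.modifyColumn() route mentioned above (my
own illustration, not code from the thread; the node name is made up,
and the exact shape expected for the column argument of a
multidimensional column may vary by PyTables version):

    import numpy
    import tables

    NUM_SIM, N_DATES = 10, 600  # sizes taken from the thread

    h5file = tables.openFile('pytables_test.h5', mode='a')
    table = h5file.root.objects  # hypothetical table with a 'results' column

    for i in range(len(table)):
        block = numpy.random.random((1, NUM_SIM, N_DATES, 7)).astype('float32')
        # Rewrite only the 'results' column of row i; other columns are untouched.
        table.modifyColumn(start=i, stop=i + 1, colname='results', column=block)

    table.flush()
    h5file.close()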
From: Andrea G. <and...@gm...> - 2012-10-31 08:30:45

Hi Anthony & All,

On 30 October 2012 23:31, Anthony Scopatz wrote:
> Well, you can at least change the order of the loops and see if that
> helps. [...]
> My basic suggestion is to have all of your processes produce results
> which are then aggregated by a single master process. [...]
> But I think that this is the strategy that you want to pursue.
> Multiple compute processes, one write process.

Thank you for all your suggestions. I managed to slightly modify the
script you attached and I am also experimenting with compression.
However, in the newly attached script the underlying table is not
modified, i.e., this assignment:

    for p in table:
        p['results'][:NUM_SIM, :, :] = numpy.random.random(
            size=(NUM_SIM, len(ALL_DATES), 7))
    table.flush()

seems to be doing nothing (i.e., printing out the 'results' attribute
for an object class prints a matrix full of zeros instead of random
numbers...). Also, on my PC at work, the file creation time is
tremendously slow (76 seconds for 100 simulations - a 1.9 GB file).

In order to understand what's going on, I set the number of simulations
back to 10 (NUM_SIM=10), but I am still getting only zeros out of the
table. This is what my script is printing out:

    H5 file creation time: 7.652

    Saving results for table: 1.03400015831
    Results (should be random...)
    Object name   : KB0001
    Object results:
    [[[ 0.  0.  0. ...,  0.  0.  0.]
      [ 0.  0.  0. ...,  0.  0.  0.]
      [ 0.  0.  0. ...,  0.  0.  0.]
      ...,
      [ 0.  0.  0. ...,  0.  0.  0.]
      [ 0.  0.  0. ...,  0.  0.  0.]
      [ 0.  0.  0. ...,  0.  0.  0.]]

     [... further all-zero blocks elided ...]]

I am on Windows Vista, Python 2.7.2 64-bit from EPD 7.1-2, pytables
version '2.3b1.devpro'.

Any suggestion is really appreciated. Thank you in advance.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net

# ------------------------------------------------------------- #
def ask_mailing_list_support(email):

    if mention_platform_and_version() and include_sample_app():
        send_message(email)
    else:
        install_malware()
        erase_hard_drives()
# ------------------------------------------------------------- #
From: Anthony S. <sc...@gm...> - 2012-10-30 22:32:00

On Tue, Oct 30, 2012 at 6:20 PM, Andrea Gavana <and...@gm...> wrote:
> Thank you for your answer; indeed, I was timing it wrongly (I really
> need to go to sleep...). However, although I understand the need of
> "writing fewer", I am not sure I can actually do it in my situation.
> Let me explain:
>
> 1. I have a GUI which starts a number of parallel processes (up to 16,
>    depending on a user selection);
> 2. These processes actually do the computation/simulations - so, if I
>    have 1,000 simulations to run and 8 parallel processes, each process
>    gets 125 simulations (each of which holds 1,200 "objects" with a
>    600x7 timeseries matrix per object).

Well, you can at least change the order of the loops and see if that
helps. That is, rather than doing:

    for i in xrange():
        for p in table:

do the following instead:

    for p in table:
        for i in xrange():

I don't believe that this will help too much since you are still
writing every element individually.

> If I had to write out the results only at the end, it would mean for
> me to find a way to share the 1,200 "objects" matrices in all the
> parallel processes (and I am not sure if pytables is going to complain
> when multiple concurrent processes try to access the same underlying
> HDF5 file).

Reading in parallel works pretty well. Writing causes more headaches
but can be done.

> Or I could create one HDF file per process, but given the nature of
> the simulation I am running, every "object" in the 1,200 "objects"
> pool would need to keep a reference to a 125x600x7 matrix (assuming
> 1,000 simulations and 8 processes) around in memory *OR* I will need
> to write the results to the HDF5 file for every simulation. Although
> we have extremely powerful PCs at work, I am not sure it is the right
> way to go...
>
> As always, I am open to all suggestions on how to improve my approach.

My basic suggestion is to have all of your processes produce results
which are then aggregated by a single master process. This master is
the only one which has write access to the HDF5 file, and this will
allow you to create larger arrays and minimize the number of writes
that you do.

You'll probably want to take a look at this example:
https://github.com/PyTables/PyTables/blob/develop/examples/multiprocess_access_queues.py

I think that there might be a page in the docs about it now too...

But I think that this is the strategy that you want to pursue. Multiple
compute processes, one write process.

> Thank you again for your quick and enlightening answer.

No problem!

Be Well
Anthony
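A minimal sketch of that "multiple compute processes, one write
process" pattern (this is not the linked example, just an illustration
of the idea; the array sizes come from this thread, while the file
name, queue protocol, and worker count are assumptions):

    import multiprocessing
    import numpy
    import tables

    N_OBJECTS, N_STEPS, N_SERIES = 1200, 600, 7  # sizes taken from the thread

    def worker(task_queue, result_queue):
        # Compute processes never touch the HDF5 file; they only ship arrays back.
        for sim_id in iter(task_queue.get, None):
            results = numpy.random.random(
                (N_OBJECTS, N_STEPS, N_SERIES)).astype('float32')
            result_queue.put((sim_id, results))

    def writer(result_queue, n_sims):
        # A single process owns the file, so no concurrent writes ever happen.
        h5file = tables.openFile('simulations.h5', mode='w')
        earr = h5file.createEArray(h5file.root, 'results', tables.Float32Atom(),
                                   shape=(0, N_OBJECTS, N_STEPS, N_SERIES))
        for _ in range(n_sims):
            sim_id, results = result_queue.get()
            earr.append(results.reshape((1,) + results.shape))
        h5file.close()

    if __name__ == '__main__':
        n_sims, n_procs = 8, 4
        tasks, results = multiprocessing.Queue(), multiprocessing.Queue()
        workers = [multiprocessing.Process(target=worker, args=(tasks, results))
                   for _ in range(n_procs)]
        for w in workers:
            w.start()
        for sim_id in range(n_sims):
            tasks.put(sim_id)
        for _ in workers:
            tasks.put(None)          # sentinel: tell each worker to exit
        writer(results, n_sims)      # run the single writer in the main process
        for w in workers:
            w.join()

Note that each queued result here is roughly 20 MB; in a real run one
would probably batch or bound the queue, but the ownership idea (one
writer, many readers/computers) stays the same.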
From: Andrea G. <and...@gm...> - 2012-10-30 22:20:35

Hi Anthony,

On 30 October 2012 22:52, Anthony Scopatz wrote:
> Hi Andrea,
>
> Your problem is two fold.
>
> 1. Your timing wasn't reporting the time per data set, but rather the
>    total time since writing all data sets. You need to put the start
>    time in the loop to get the time per data set.
>
> 2. Your larger problem was that you were writing too many times.
>    Generally it is faster to write fewer, bigger sets of data than to
>    perform a lot of small write operations. Since you had data set
>    opening and writing in a doubly nested loop, it is not surprising
>    that you were getting terrible performance. You were basically
>    maximizing HDF5 overhead ;). Using slicing I removed the outermost
>    loop and saw timings like the following:
>
>        H5 file creation time: 7.406
>
>        Saving results for table: 0.0105440616608
>        Saving results for table: 0.0158948898315
>        [...]
>        Saving results for table: 0.00796294212341
>
> Please see the attached version, at around line 82. Additionally, if
> you need to focus on performance I would recommend reading the
> following (http://pytables.github.com/usersguide/optimization.html).
> PyTables can be blazingly fast when implemented correctly. I would
> highly recommend looking into compression.
>
> I hope this helps!

Thank you for your answer; indeed, I was timing it wrongly (I really
need to go to sleep...). However, although I understand the need of
"writing fewer", I am not sure I can actually do it in my situation.
Let me explain:

1. I have a GUI which starts a number of parallel processes (up to 16,
   depending on a user selection);
2. These processes actually do the computation/simulations - so, if I
   have 1,000 simulations to run and 8 parallel processes, each process
   gets 125 simulations (each of which holds 1,200 "objects" with a
   600x7 timeseries matrix per object).

If I had to write out the results only at the end, it would mean for me
to find a way to share the 1,200 "objects" matrices in all the parallel
processes (and I am not sure if pytables is going to complain when
multiple concurrent processes try to access the same underlying HDF5
file).

Or I could create one HDF file per process, but given the nature of the
simulation I am running, every "object" in the 1,200 "objects" pool
would need to keep a reference to a 125x600x7 matrix (assuming 1,000
simulations and 8 processes) around in memory *OR* I will need to write
the results to the HDF5 file for every simulation. Although we have
extremely powerful PCs at work, I am not sure it is the right way to
go...

As always, I am open to all suggestions on how to improve my approach.

Thank you again for your quick and enlightening answer.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net
From: Andrea G. <and...@gm...> - 2012-10-30 20:55:12

Hi All,

I am pretty new to pytables and I am facing a problem with actually
storing and retrieving data to/from a large dataset. My situation is
the following:

1. I am running stochastic simulations of a number of objects
   (typically between 100-1,000 simulations);
2. For every simulation, I have around 1,200 "objects", and for each of
   them I have 7 timeseries of 600 time-steps each.

I thought of using pytables to try and get some sense out of my
simulations, but I am failing to implement something intelligent (or
fast, which is important as well...). The attached script (modified
from the pytables tutorial) does the following:

1. Creates a table containing these "objects";
2. Adds 1,200 rows, one per "object": for each "object", I assign a 3D
   array defined as:

       results = Float32Col(shape=(NUM_SIM, len(ALL_DATES), 7))

   where NUM_SIM is the number of simulations and ALL_DATES are the
   timesteps.
3. For every simulation, I update the "object" results (using random
   numbers in the script).

The timings on my computer are as follows (in seconds):

    H5 file creation time: 22.510

    Saving results for simulation 1  : 3.33599996567
    Saving results for simulation 2  : 6.2429997921
    Saving results for simulation 3  : 9.15199995041
    Saving results for simulation 4  : 12.0759999752
    Saving results for simulation 5  : 15.2199997902
    Saving results for simulation 6  : 17.9159998894
    Saving results for simulation 7  : 21.0659999847
    Saving results for simulation 8  : 23.6459999084
    Saving results for simulation 9  : 26.5359997749
    Saving results for simulation 10 : 29.5579998493

As you can see, at every simulation the processing time increases by 3
seconds, so by the time I get to 100 or 1,000 simulations I will have
more than enough time for 15 coffees in the morning :-D Also, the file
creation time is somewhat on the slow side...

I am sure I am missing a lot of things here, so I would appreciate any
suggestion on how to implement my code in a better/more intelligent way
(and also suggestions on other approaches to do what I am trying to
do).

Thank you in advance for your suggestions.

Andrea.

"Imagination Is The Only Weapon In The War Against Reality."
http://www.infinity77.net
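For concreteness, a minimal sketch of the kind of table layout
described above (the 'name' column, its values, and the file/node
names are assumptions added for illustration):

    import tables

    NUM_SIM   = 10
    ALL_DATES = range(600)  # 600 time-steps, per the thread

    class SimObject(tables.IsDescription):
        name    = tables.StringCol(16)  # assumed identifier column
        results = tables.Float32Col(shape=(NUM_SIM, len(ALL_DATES), 7))

    h5file = tables.openFile('simulations.h5', mode='w')
    table = h5file.createTable(h5file.root, 'objects', SimObject)

    row = table.row
    for i in range(1200):
        row['name'] = 'KB%04d' % (i + 1)  # naming pattern guessed from the thread
        row.append()
    table.flush()
    h5file.close()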
From: Francesc A. <fa...@py...> - 2012-10-30 15:17:46

On 10/30/12 10:44 AM, Aquil H. Abdullah wrote:
> Hello All,
>
> I am querying a table that has a field with a string value. I would
> like to determine if the string matches a pattern. Is there a simple
> way to do that through readWhere and the condition syntax? None of
> the following work, but I was wondering if it were possible to do
> something similar:
>
>     table.readWhere('"CLZ" in field')
>     table.readWhere('symbol[:3] == "CLZ"')

As Anthony said, there is no support for this in in-kernel (or indexed)
queries, but you can always use a regular query for that, i.e.
something along the lines of:

    np.fromiter((r for r in table if 'CLZ' in r['symbol']),
                dtype=table.dtype)

--
Francesc Alted
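A self-contained version of that fallback, sketched under the
assumption of a 'quotes' table with a 'symbol' string column (the file
and node names are made up, and Row.fetch_all_fields() should be
checked against your PyTables version):

    import numpy as np
    import tables

    h5file = tables.openFile('quotes.h5', mode='r')  # hypothetical file
    table = h5file.root.quotes                       # table with a 'symbol' column

    # numexpr conditions cannot slice strings, so filter in plain Python:
    matches = np.fromiter(
        (row.fetch_all_fields() for row in table
         if row['symbol'].startswith('CLZ')),
        dtype=table.dtype)

    h5file.close()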
From: Aquil H. A. <aqu...@gm...> - 2012-10-30 14:54:19

Bah! Thanks for the quick reply!

--
Aquil H. Abdullah
"I never think of the future. It comes soon enough" - Albert Einstein

On Tuesday, October 30, 2012 at 10:49 AM, Anthony Scopatz wrote:
> Hello Aquil,
>
> Unfortunately, you currently cannot use indexing in queries (i.e.
> "symbol[:3] == x") and may only use the whole variable
> ("symbol == x"). This is a limitation of numexpr. Please file a ticket
> with them, if you would like to see this changed. Sorry!
>
> Be Well
> Anthony
From: Anthony S. <sc...@gm...> - 2012-10-30 14:50:20

Hello Aquil,

Unfortunately, you currently cannot use indexing in queries (i.e.
"symbol[:3] == x") and may only use the whole variable ("symbol == x").
This is a limitation of numexpr. Please file a ticket with them, if you
would like to see this changed. Sorry!

Be Well
Anthony

On Tue, Oct 30, 2012 at 10:44 AM, Aquil H. Abdullah <aqu...@gm...> wrote:
> I am querying a table that has a field with a string value. I would
> like to determine if the string matches a pattern. Is there a simple
> way to do that through readWhere and the condition syntax? [...]
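In other words, a condition may reference a column only as a whole; a
short illustration (the file, the table, and the "CLZ12" value are all
made up):

    import tables

    h5file = tables.openFile('quotes.h5', mode='r')  # hypothetical file
    table = h5file.root.quotes                       # table with a 'symbol' column

    # Supported: the condition references the whole 'symbol' column.
    rows = table.readWhere('symbol == "CLZ12"')

    # Not supported: numexpr cannot slice a column inside a condition
    # string, so prefix matching like this raises an exception:
    # rows = table.readWhere('symbol[:3] == "CLZ"')

    h5file.close()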
From: Aquil H. A. <aqu...@gm...> - 2012-10-30 14:45:09

Hello All,

I am querying a table that has a field with a string value. I would
like to determine if the string matches a pattern. Is there a simple
way to do that through readWhere and the condition syntax? None of the
following work, but I was wondering if it were possible to do something
similar:

    table.readWhere('"CLZ" in field')
    table.readWhere('symbol[:3] == "CLZ"')

Thanks!

--
Aquil H. Abdullah
"I never think of the future. It comes soon enough" - Albert Einstein
From: Anthony S. <sc...@gm...> - 2012-10-29 16:36:15

Hello Jack,

I am not really sure what is going wrong because you did not post the
full code where the exception is happening. However, this error seems
to occur because the pnts array is one dimensional (which is why
pnts.shape has a length of 1). You could verify this by printing out
pnts right before the line that fails.

Also, why are you using ctypes? This seems wrong...

Be Well
Anthony

On Sun, Oct 28, 2012 at 9:25 PM, JACK <you...@ya...> wrote:
> I am new to python and pytables. Currently I am writing a project
> about clustering and the KNN algorithm. [...]
> My problem now is that python is giving me a hard time by showing:
>
>     IndexError: tuple index out of range
>
> This error comes from this line:
>
>     D = ctypes.c_uint(pnts.shape[1])
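A small sketch of the kind of guard this diagnosis suggests (my own
illustration, not Anthony's code; the function mirrors the knn()
fragment from the question below):

    import ctypes
    import numpy

    def knn(pnts):
        pnts = numpy.ascontiguousarray(pnts)
        if pnts.ndim != 2:
            # A 1-D array has shape (n,), so pnts.shape[1] raises IndexError.
            raise ValueError("knn expects a 2-D (N, D) array, got shape %s"
                             % (pnts.shape,))
        N = ctypes.c_uint(pnts.shape[0])
        D = ctypes.c_uint(pnts.shape[1])
        return N, D  # placeholder: the real knn would call into C here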
From: JACK <you...@ya...> - 2012-10-29 01:30:09

Hi all,

I am new to python and pytables. Currently I am writing a project about
clustering and the KNN algorithm. This is what I have got:

    # ********************** code *******************************
    import ctypes
    import numpy
    import numpy.random as npr
    import numpy as np
    import tables

    # step0: obtain the cluster
    dtype = np.dtype('f4')

    pnts_inds = np.arange(100)
    npr.shuffle(pnts_inds)
    pnts_inds = pnts_inds[:10]
    pnts_inds = np.sort(pnts_inds)
    for i, ind in enumerate(pnts_inds):
        clusters[i] = pnts_obj[ind]

    # step1: save the result to a HDF5 file called clst_fn.h5
    filters = tables.Filters(complevel=1, complib='zlib')
    clst_fobj = tables.openFile('clst_fn.h5', 'w')
    clst_obj = clst_fobj.createCArray(clst_fobj.root, 'clusters',
                                      tables.Atom.from_dtype(dtype),
                                      clusters.shape, filters=filters)
    clst_obj[:] = clusters
    clst_fobj.close()

    # step2: other function
    # blabla

    # step3: load the cluster from clst_fn
    pnts_fobj = tables.openFile('clst_fn.h5', 'r')
    for pnts in pnts_fobj.walkNodes('/', classname='Array'):
        break

    # step4: evoke another function (called knn). The function input
    # argument is the data from pnts. I have checked the knn function
    # individually. This function works well if the input is
    # pnts = npr.rand(100, 128)

    def knn(pnts):
        pnts = numpy.ascontiguousarray(pnts)
        N = ctypes.c_uint(pnts.shape[0])
        D = ctypes.c_uint(pnts.shape[1])

    # evoke knn using the cluster from clst_fn (see step 3)
    knn(pnts)
    # ********************** end of code ************************

My problem now is that python is giving me a hard time by showing:

    IndexError: tuple index out of range

This error comes from this line:

    D = ctypes.c_uint(pnts.shape[1])

Obviously, there must be something wrong with the input argument. Any
thought about fixing the problem? Thank you in advance.
From: Francesc A. <fa...@gm...> - 2012-10-27 17:59:03

On 10/27/12 12:21 PM, Antonio Valentino wrote:
> Hi Francesc,
> congratulations!
>
> yes, the IPython notebook is fantastic!
>
> ... and the idea of saving tutorials into notebook files is very very
> nice :))
>
> Maybe we could provide notebook files for all tutorials in the
> official doc.

Yeah, that's a good idea. However, provided that you can just drop pure
Python code into an IPython notebook and it just works, I'm not sure
whether bothering about this is worth the effort.

--
Francesc Alted
From: Anthony S. <sc...@gm...> - 2012-10-27 16:23:06

On Sat, Oct 27, 2012 at 11:21 AM, Antonio Valentino <ant...@ti...> wrote:
> Maybe we could provide notebook files for all tutorials in the
> official doc.

+1
From: Antonio V. <ant...@ti...> - 2012-10-27 16:21:16

Hi Francesc,
congratulations!

On 27/10/2012 13:16, Francesc Alted wrote:
> You may be interested in my IPython notebooks and slides for the
> conference:
>
> http://pytables.org/download/PyData2012-NYC.tar.gz
> PyData-NYC-2012-v3.pptx
> http://www.pytables.org/docs/PyData2012-NYC.pdf
>
> [BTW this time I fell in love with the IPython notebook: it is great!]

Yes, the IPython notebook is fantastic!

... and the idea of saving tutorials into notebook files is very very
nice :))

Maybe we could provide notebook files for all tutorials in the official
doc.

ciao

--
Antonio Valentino
From: Anthony S. <sc...@gm...> - 2012-10-27 15:35:28

Great! Thanks Francesc!

On Sat, Oct 27, 2012 at 6:16 AM, Francesc Alted <fa...@gm...> wrote:
> You may be interested in my IPython notebooks and slides for the
> conference:
>
> http://pytables.org/download/PyData2012-NYC.tar.gz
> http://www.pytables.org/docs/PyData2012-NYC.pdf
> [...]
From: Francesc A. <fa...@gm...> - 2012-10-27 11:17:06

Hi,

You may be interested in my IPython notebooks and slides for the
conference:

    http://pytables.org/download/PyData2012-NYC.tar.gz
    PyData-NYC-2012-v3.pptx
    http://www.pytables.org/docs/PyData2012-NYC.pdf

[BTW this time I fell in love with the IPython notebook: it is great!]

Unfortunately, I had only 45 minutes for the presentation, so I have
not been able to show the PyTables sample files that some of you kindly
sent to me (but I'll keep them for the future, one never knows!).

--
Francesc Alted
From: Jason M. <moo...@gm...> - 2012-10-27 04:33:41

I just tried installing python-tables on a clean install of 12.10 on a
different machine and all went fine. So I've got something corrupted on
my machine... just a localized bug.

Jason
From: Jason M. <moo...@gm...> - 2012-10-26 21:41:15

I've posted a bug report here:
https://bugs.launchpad.net/ubuntu/+source/pytables/+bug/1071918

Maybe others could see if it is reproducible in Ubuntu 12.10.

Thanks,

Jason
From: Jason M. <moo...@gm...> - 2012-10-26 21:09:37

The symlink is a workaround for Ubuntu 12.10. It is certainly not the
long-term solution, but I don't see why it is a bad idea. python-tables
in the Ubuntu 12.10 repos cannot find the HDF5 library because it is
looking for libhdf5.so.6 on the path, but there is only libhdf5.so.7
(which is a symlink to libhdf5.so.7.0.2). This must be hard-coded in
the utilsExtensions.so binary that is included with PyTables. It seems
like these need to be recompiled for the distribution or something.

Jason

On Fri, Oct 26, 2012 at 1:12 PM, Antonio Valentino <ant...@ti...> wrote:
> Honestly I don't think it is a good idea.
From: Antonio V. <ant...@ti...> - 2012-10-26 20:13:06

Hi Jason,

On 26/10/2012 21:59, Jason Moore wrote:
> Solution was simple once I found it. Here is the workaround:
>
> https://bugs.launchpad.net/ubuntu/+source/octave/+bug/1005243
>
> Just make a symlink to the new file.

Honestly I don't think it is a good idea.

ciao

--
Antonio Valentino
From: Antonio V. <ant...@ti...> - 2012-10-26 20:05:55

Hi Jason,

On 26/10/2012 21:37, Jason Moore wrote:
> I'll post the bug report, but I'd like to get this working on my
> system. I've always had trouble compiling pytables from source due to
> the dependencies. Right now I just need to get this working because I
> can no longer use my software now that PyTables is broken.

If you use Ubuntu 12.04 you can install the build dependencies as
follows:

    $ sudo apt-get install libhdf5-dev python-dev cython python-numexpr \
          libbz2-dev zlib1g-dev liblzo2-dev

or simply:

    $ sudo apt-get build-dep pytables

Then you can build pytables by typing:

    $ python setup.py build

> Question 1:
>
> What are the exact commands for installing from source (including all
> flags)? I can't find this explicitly in the documentation, especially
> how to use the --hdf5 flag and other flags to point to where the
> dependencies are installed.

The use of the --hdf5 flag is explained in [1]; anyway, you should not
need it on Ubuntu.

[1] http://pytables.github.com/usersguide/installation.html

> Question 2:
>
> I tried your pytables2.4 ppa but it also can't find the hdf5 library.
> How can I install the old /usr/lib/libhdf5.so.6 file?

This is very strange. It seems to me a misconfiguration of the apt
system. Are you sure that all your apt sources point to quantal? Maybe
some of them still point to precise.

cheers

--
Antonio Valentino
From: Jason M. <moo...@gm...> - 2012-10-26 19:59:40

Solution was simple once I found it. Here is the workaround:

https://bugs.launchpad.net/ubuntu/+source/octave/+bug/1005243

Just make a symlink to the new file.

Jason
From: Jason M. <moo...@gm...> - 2012-10-26 19:37:28

I'll post the bug report, but I'd like to get this working on my
system. I've always had trouble compiling pytables from source due to
the dependencies. Right now I just need to get this working because I
can no longer use my software now that PyTables is broken.

Question 1:

What are the exact commands for installing from source (including all
flags)? I can't find this explicitly in the documentation, especially
how to use the --hdf5 flag and other flags to point to where the
dependencies are installed. I'm trying:

    sudo python setup.py build_ext --inplace --hdf5=/usr/lib/libhdf5.so.7

But having little luck. It still can't find my hdf5 library.

Question 2:

I tried your pytables2.4 ppa but it also can't find the hdf5 library.
How can I install the old /usr/lib/libhdf5.so.6 file? I also remember
it being painful to install the HDF5 libraries from source. Is this
file available in the Ubuntu repositories?

Jason

On Fri, Oct 26, 2012 at 10:16 AM, Antonio Valentino <ant...@ti...> wrote:
> of course you need a launchpad account, then you can follow the
> instructions on the ReportingBugs page of the Ubuntu wiki [1]
>
> [1] https://help.ubuntu.com/community/ReportingBugs

--
Personal Website <http://biosport.ucdavis.edu/lab-members/jason-moore>
Davis Bike Collective <http://www.davisbikecollective.org> Minister, Davis, CA
BikeDavis.info
Google Voice: +01 530-601-9791
Home: +01 530-753-0794
> >> > >> [1] > >> > https://launchpad.net/~a.valentino/+archive/eotools?field.series_filter=quantal > >> > >> best regards > >> > >> > >> -- > >> Antonio Valentino > > > > -- > Antonio Valentino > > > ------------------------------------------------------------------------------ > The Windows 8 Center > In partnership with Sourceforge > Your idea - your app - 30 days. Get started! > http://windows8center.sourceforge.net/ > what-html-developers-need-to-know-about-coding-windows-8-metro-style-apps/ > _______________________________________________ > Pytables-users mailing list > Pyt...@li... > https://lists.sourceforge.net/lists/listinfo/pytables-users > -- Personal Website <http://biosport.ucdavis.edu/lab-members/jason-moore> Davis Bike Collective <http://www.davisbikecollective.org> Minister, Davis, CA BikeDavis.info Google Voice: +01 530-601-9791 Home: +01 530-753-0794 |