From: Anthony S. <sc...@gm...> - 2013-06-04 22:38:31
On Tue, Jun 4, 2013 at 12:30 PM, Seref Arikan <ser...@gm...> wrote:
> I think I've seen this in the release notes of 3.0. This is actually
> something that I'm looking into as well, so any experience/feedback about
> creating files in memory would be much appreciated.

I think you want to set parameters.DRIVER to H5FD_CORE [1]. I haven't
used this personally, but it would be great to have an example script,
if someone wants to write one ;)

Be Well
Anthony

1. http://pytables.github.io/usersguide/parameter_files.html#hdf5-driver-management

> Best regards
> Seref
>
> On Tue, Jun 4, 2013 at 2:09 PM, Andreas Hilboll <li...@hi...> wrote:
>> On 04.06.2013 05:35, Tim Burgess wrote:
>>> My thoughts are:
>>>
>>> - Try it without any compression. Assuming 32-bit floats, your monthly
>>>   5760 x 2880 slab is only about 65 MB. Uncompressed data may perform
>>>   well, and at the least it will give you a baseline to work from, which
>>>   will help if you are investigating I/O tuning.
>>>
>>> - I have found that the automatic chunkshape for CArray works fairly
>>>   well. Experiment with that chunkshape and with some chunkshapes that
>>>   you think are more appropriate (maybe temporal rather than spatial in
>>>   your case).
>>>
>>> On Jun 03, 2013, at 10:45 PM, Andreas Hilboll <li...@hi...> wrote:
>>>
>>>> On 03.06.2013 14:43, Andreas Hilboll wrote:
>>>>> Hi,
>>>>>
>>>>> I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray
>>>>> (the last dimension represents time; once per month there'll be one
>>>>> more 5760 x 2880 array to add at the end).
>>>>>
>>>>> Now, extracting timeseries at one index location is slow; e.g., for
>>>>> four indices, it takes several seconds:
>>>>>
>>>>> In [19]: idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))
>>>>>
>>>>> In [20]: %time AA = np.vstack([_a[i,j] for i,j in zip(*idx)])
>>>>> CPU times: user 4.31 s, sys: 0.07 s, total: 4.38 s
>>>>> Wall time: 7.17 s
>>>>>
>>>>> I have the feeling that this performance could be improved, but I'm
>>>>> not sure how to properly use the `chunkshape` parameter in my case.
>>>>>
>>>>> Any help is greatly appreciated :)
>>>>>
>>>>> Cheers, Andreas.
>>>>
>>>> PS: If I could get significant performance gains by not using an EArray
>>>> and therefore re-creating the whole database each month, then this
>>>> would also be an option.
>>>>
>>>> -- Andreas.
>>
>> Thanks a lot, Anthony and Tim! I was able to bring the readout time down
>> considerably using chunkshape=(32, 32, 256) for my 5760 x 2880 x 150
>> array. Reading times are now about as fast as I expected.
>>
>> The downside is that building up the database now takes a lot of time,
>> because I get the data in chunks of 5760 x 2880 x 1. So I guess that
>> writing the data to disk like this causes a load of I/O operations ...
>>
>> My new question: Is there a way to create a file in memory? If possible,
>> I could then build up my database in memory and then, once it's done,
>> just copy the arrays to an on-disk file. Is that possible? If so, how?
>>
>> Thanks a lot for your help!
>>
>> -- Andreas.
>> _______________________________________________
>> Pytables-users mailing list
>> Pyt...@li...
>> https://lists.sourceforge.net/lists/listinfo/pytables-users