From: Jeffrey S W. <Jef...@no...> - 2004-11-09 21:10:21
Francesc Altet wrote:

> Hi Jeff,
>
> Yep, it seems that some rework on buffer sizes calculation in 0.9 has made
> the chunk sizes for compression much smaller, and hence the compression
> ratio. Please, try to apply the next patch and tell me if that works better:
>
> --- pytables-0.9/tables/EArray.py 2004-10-05 14:30:31.000000000 +0200
> +++ EArray.py 2004-11-09 21:51:11.000000000 +0100
> @@ -254,7 +254,7 @@
>          if maxTuples > 10:
>              # Yes. So the chunk sizes for the non-extendeable dims will be
>              # unchanged
> -            chunksizes[extdim] = maxTuples // 10
> +            chunksizes[extdim] = maxTuples
>          else:
>              # No. reduce other dimensions until we get a proper chunksizes
>              # shape
> @@ -268,7 +268,7 @@
>                      break
>                  chunksizes[j] = 1
>              # Compute the chunksizes correctly for this j index
> -            chunksize = maxTuples // 10
> +            chunksize = maxTuples
>              if j < len(chunksizes):
>                  # Only modify chunksizes[j] if needed
>                  if chunksize < chunksizes[j]:
>
>
> If works better, I'll have to double check that indexation performance won't
> suffer because of this change. To say the truth, I don't quite remember why
> I've reduced the chunksizes by a factor of 10, although I want to believe
> that there was a good reason :-/
>
> Cheers,
>
> On Tuesday 09 November 2004 17:02, Jeffrey S Whitaker wrote:
>
>>Hi:
>>
>>I just noticed that compression doesn't seem to be working right (for me
>>at least) in 0.9. Here's an example:
>>
>>with pytables 0.9
>>
>>[mac28:~/python] jsw% nctoh5 --complevel=6 -o test.nc test.h5
>>
>>[mac28:~/python] jsw% ls -l test.nc test.h5
>>-rw-r--r-- 1 jsw jsw 12089048 9 Nov 08:59 test.h5
>>-rw-r--r-- 1 jsw jsw 26355656 4 Nov 17:10 test.nc
>>
>>with pytables 0.8.1
>>
>>[mac28:~/python] jsw% ls -l test.nc test.h5
>>-rw-r--r-- 1 jsw jsw 5344279 9 Nov 09:00 test.h5
>>-rw-r--r-- 1 jsw jsw 26355656 4 Nov 17:10 test.nc
>>
>>No matter what netcdf file I use as input, the resulting h5 file is
>>about twice as large using 0.9 as it is in 0.8.1.
>>
>>BTW: the test.nc file I used here can be found at
>>ftp://ftp.cdc.noaa.gov/Public/jsw.
>>
>>
>>-Jeff
>>
>

Francesc:

That helped a little bit. Now I get

[mac28:~/python] jsw% ls -l test.nc test.h5
-rw-r--r-- 1 jsw jsw 9281104 9 Nov 14:04 test.h5
-rw-r--r-- 1 jsw jsw 26355656 4 Nov 17:10 test.nc

Still a long way from the 0.8.1 result of 5344279 though.

-Jeff

--
Jeffrey S. Whitaker         Phone  : (303)497-6313
Meteorologist               FAX    : (303)497-6449
NOAA/OAR/CDC  R/CDC1        Email  : Jef...@no...
325 Broadway                Web    : www.cdc.noaa.gov/~jsw
Boulder, CO, USA 80303-3328 Office : Skaggs Research Cntr 1D-124
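
For context on why the chunk size matters in this thread: HDF5 applies its compression filter to each chunk independently, so a smaller chunk gives the deflate compressor less data in which to find redundancy, and every chunk pays its own fixed overhead. The following is a minimal sketch of that effect using only Python's standard zlib module, not PyTables or HDF5 itself. The helper names (make_sample_data, chunked_compressed_size), the synthetic data, and the two chunk sizes (which differ by the same factor of 10 as in the patch) are assumptions made for illustration; they are not taken from the PyTables source.

```python
import random
import zlib


def make_sample_data(block_size=1024, repeats=64, seed=0):
    """Build synthetic bytes: one pseudo-random block repeated many times.

    Hypothetical test data, chosen so that the redundancy is only visible
    to a compressor that sees more than one block at a time.
    """
    rng = random.Random(seed)
    block = bytes(rng.randrange(256) for _ in range(block_size))
    return block * repeats


def chunked_compressed_size(data, chunk_bytes, complevel=6):
    """Return the total size when each chunk is deflated on its own.

    This mimics per-chunk filtering: no chunk can reuse matches found
    in any other chunk.
    """
    total = 0
    for start in range(0, len(data), chunk_bytes):
        chunk = data[start:start + chunk_bytes]
        total += len(zlib.compress(chunk, complevel))
    return total


if __name__ == "__main__":
    data = make_sample_data()
    # Compare a chunk size with one ten times larger.
    for chunk_bytes in (4096, 40960):
        size = chunked_compressed_size(data, chunk_bytes)
        print(chunk_bytes, size, round(len(data) / size, 2))
```

On this synthetic input the smaller chunk size yields the larger total. How large the effect is on real data such as test.nc depends on the data and on the actual chunk shapes, which this sketch does not model.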