From: Francesc A. <fa...@py...> - 2004-11-09 20:57:58
Hi Jeff,

Yep, it seems that some rework of the buffer size calculation in 0.9
has made the chunk sizes used for compression much smaller, and hence
lowered the compression ratio. Please try applying the following patch
and tell me whether it works better:

--- pytables-0.9/tables/EArray.py	2004-10-05 14:30:31.000000000 +0200
+++ EArray.py	2004-11-09 21:51:11.000000000 +0100
@@ -254,7 +254,7 @@
         if maxTuples > 10:
             # Yes. So the chunk sizes for the non-extendeable dims will be
             # unchanged
-            chunksizes[extdim] = maxTuples // 10
+            chunksizes[extdim] = maxTuples
         else:
             # No. reduce other dimensions until we get a proper chunksizes
             # shape
@@ -268,7 +268,7 @@
                     break
                 chunksizes[j] = 1
             # Compute the chunksizes correctly for this j index
-            chunksize = maxTuples // 10
+            chunksize = maxTuples
             if j < len(chunksizes):
                 # Only modify chunksizes[j] if needed
                 if chunksize < chunksizes[j]:

If it works better, I'll have to double-check that indexing performance
won't suffer because of this change. To tell the truth, I don't quite
remember why I reduced the chunk sizes by a factor of 10, although I
want to believe that there was a good reason :-/

Cheers,

On Tuesday 09 November 2004 17:02, Jeffrey S Whitaker wrote:
> Hi:
>
> I just noticed that compression doesn't seem to be working right (for me
> at least) in 0.9. Here's an example:
>
> with pytables 0.9
>
> [mac28:~/python] jsw% nctoh5 --complevel=6 -o test.nc test.h5
>
> [mac28:~/python] jsw% ls -l test.nc test.h5
> -rw-r--r--  1 jsw  jsw  12089048  9 Nov 08:59 test.h5
> -rw-r--r--  1 jsw  jsw  26355656  4 Nov 17:10 test.nc
>
> with pytables 0.8.1
>
> [mac28:~/python] jsw% ls -l test.nc test.h5
> -rw-r--r--  1 jsw  jsw   5344279  9 Nov 09:00 test.h5
> -rw-r--r--  1 jsw  jsw  26355656  4 Nov 17:10 test.nc
>
> No matter what netcdf file I use as input, the resulting h5 file is
> about twice as large using 0.9 as it is in 0.8.1.
>
> BTW: the test.nc file I used here can be found at
> ftp://ftp.cdc.noaa.gov/Public/jsw.
>
>
> -Jeff
>

--
Francesc Altet
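
For illustration only, and not PyTables or HDF5 code: HDF5 compresses each
chunk independently, so shrinking the chunks tends to lower the compression
ratio, which is the effect discussed in this message. The sketch below mimics
that with Python's standard zlib module at level 6. The helper name
compressed_size, the synthetic data, and the two chunk sizes (10,000 bytes and
a tenth of that, echoing maxTuples versus maxTuples // 10) are assumptions
made for the example; none of them come from the patch or from test.nc.

```python
import zlib


def compressed_size(data: bytes, chunk_bytes: int, level: int = 6) -> int:
    """Total size when every chunk is compressed independently of the others."""
    total = 0
    for start in range(0, len(data), chunk_bytes):
        chunk = data[start:start + chunk_bytes]
        total += len(zlib.compress(chunk, level))
    return total


# Hypothetical sample data: a byte pattern that repeats every 251 bytes.
data = bytes((i * 7) % 251 for i in range(200_000))

large_chunk = 10_000            # stands in for maxTuples
small_chunk = large_chunk // 10  # stands in for maxTuples // 10

size_large = compressed_size(data, large_chunk)
size_small = compressed_size(data, small_chunk)

print("raw bytes:            ", len(data))
print("large chunks, total:  ", size_large)
print("small chunks, total:  ", size_small)
```

With smaller chunks the compressor restarts more often and sees less
repetition per chunk, so the total compressed size grows. Real HDF5 files
also carry per-chunk metadata, which this sketch does not model.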