From: Huddleston, J. <Hud...@ci...> - 2011-10-10 12:08:22
Hi Arlindo,

The AMSR data was stored in HDF4 format, and the HDF4r3 version works best. I sent a report to the hdfgroup and they have started a help ticket. It took 1.2 seconds to process the AMSR HDF4 file with HDF4r3 and nearly two minutes using later HDF4 libraries.

GrADS-2.0.0 was faster than GrADS-2.0a9 and a8 on a Linux Ubuntu server. It was about 2X faster than Windows 7. The Ubuntu server has a single 2.8GHz processor with 3GB of RAM. The Windows 7 machine has two 3.6GHz processors with 8 GB RAM. You'd expect the Windows version to be faster.

I'm working on a MinGW native Windows build. All the HDF4, HDF5, and NetCDF applications under MinGW run 10x faster than equivalent Ubuntu versions (which is what you would expect from the hardware difference). E.g., see http://vista.cira.colostate.edu/nco/Grib2/ for the WGRIB2 executables I build for Wesley.

John
________________________________
From: mor...@gm... [mor...@gm...] On Behalf Of Arlindo da Silva [arl...@na...]
Sent: Sunday, October 09, 2011 4:41 PM
To: Huddleston, John
Cc: ope...@li...
Subject: Re: 4.1.1 vs. 4.1.2

On Fri, Oct 7, 2011 at 12:27 PM, Huddleston, John <Hud...@ci...> wrote:

Jennifer

While creating the AMSR CTL file I did some testing on the Ubuntu Linux build of GrADS and found that version 2.0.0 was faster than both 2.0a8 and 2.0a9 by some 20-30%. So, I did the same run on Windows and it was horribly slow. I re-compiled GrADS with the GNU profiler and found that the HDF5 HAPatom_object was the culprit.

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 99.21     85.51    85.51                             HAPatom_object
  0.15     85.64     0.13   475492     0.00     0.00  gahrow
  0.07     85.70     0.06   475544     0.00     0.00  gree
  0.06     85.75     0.05                             HTPinquire
  0.05     85.79     0.04    76373     0.00     0.00  gxfill
  0.03     85.82     0.03   475559     0.00     0.00  galloc
  0.03     85.85     0.03                             DFKconvert

I checked and I was using HDF5-1.8.6; so, I’m building HDF5-1.8.7 for Cygwin and will rebuild GrADS 2.0.0.

Please let me know what you find.
I have tried my Windows build of 2.0.a9 on this large netcdf-4 file (with compression)

ftp://gma...@ft.../fp/das/Y2011/M10/D01/DAS.ops.asm.inst3_3d_asm_Np.GEOS572.20111001_0000.V01.nc4

and found that

  lats4d -i DAS.ops.asm.inst3_3d_asm_Np.GEOS572.20111001_0000.V01.nc4 -format stats

ran slightly faster on Windows 7 than Mac OS X 10.6 (48.6s vs 51.6s) on the same hardware (different boots). This was based on HDF5 1.8.4 and NetCDF 4.1.1 that I used for the 2.0.a9 builds. I will now build 2.0.0 with new libraries and rerun the benchmark.

Arlindo

John

From: Jennifer Adams [mailto:jm...@co...]
Sent: Wednesday, October 05, 2011 3:39 PM
To: Arlindo da Silva
Cc: Brian Doty; Huddleston, John
Subject: Re: 4.1.1 vs. 4.1.2

I think netcdf-4.1.2 (with zlib-1.2.5 and hdf5-1.8.7) is what we should use for GrADS 2.0.0. We've been using 4.1.2 at COLA for just about a year. I have made new builds for darwin and CentOS and posted them to our FTP server, and I changed my supplibs.html page too. Arlindo, thanks for bringing this bug to my attention. John, I guess you have no extra work since your builds are already linked with 4.1.2.

--Jennifer

p.s. Here is what Dennis Heimbigner wrote about the performance bug:

"The performance bug has been in the code since almost the beginning of opendap support. It is possible (even probable) that you will not encounter it. It occurs when you read a large variable and specify it as a constraint in the url. E.g. http://x.com/y.nc?var[0:1] The bug is that the code ignores the [0:1] and reads the whole variable. The result is correct, but unnecessary data is read."

I don't exactly know whether that syntax he shows is different from the requests in the gds log that look like this:

"...GET /gfs.dods?ps.ps[0][0:360][0:719] ..."

Maybe they ARE different and that's why we don't encounter a performance change between old versions and the new snapshot. In any event, I think 4.1.2 is the way to go for now.
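To make the cost of the bug Dennis describes concrete, here is a minimal Python sketch (not taken from the netCDF, GDS, or GrADS sources) that counts how many elements a DAP2-style hyperslab selects. It assumes the usual [index], [start:stop] and [start:stride:stop] forms with inclusive stop indices. The function names and the full-variable shape (65, 361, 720) are hypothetical, chosen only to show how much extra data would be read if a constraint were ignored.

```python
import math
import re

# Matches one DAP2-style hyperslab: [index], [start:stop], or [start:stride:stop].
_SLAB = re.compile(r"\[(\d+)(?::(\d+))?(?::(\d+))?\]")

# Hypothetical full shape (time, lat, lon) of the variable on the server.
FULL_SHAPE = (65, 361, 720)


def parse_hyperslabs(projection):
    """Split 'name[..][..]' into (name, [(start, stride, stop), ...])."""
    name, bracket, rest = projection.partition("[")
    slabs = []
    for match in _SLAB.finditer(bracket + rest):
        nums = [int(g) for g in match.groups() if g is not None]
        if len(nums) == 1:        # [index]
            start, stride, stop = nums[0], 1, nums[0]
        elif len(nums) == 2:      # [start:stop]
            start, stride, stop = nums[0], 1, nums[1]
        else:                     # [start:stride:stop]
            start, stride, stop = nums
        slabs.append((start, stride, stop))
    return name, slabs


def selected_count(slabs):
    """Number of elements the hyperslabs select (stop is inclusive)."""
    total = 1
    for start, stride, stop in slabs:
        total *= (stop - start) // stride + 1
    return total


def full_count(shape):
    """Number of elements read if the constraint is ignored entirely."""
    return math.prod(shape)
```

For the request quoted from the gds log, selected_count(parse_hyperslabs("ps.ps[0][0:360][0:719]")[1]) is 361 * 720 elements, while full_count(FULL_SHAPE) under the assumed shape is 65 times larger. That ratio is the kind of overhead one would expect to see in timings if the constraint were being dropped.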
On Oct 5, 2011, at 1:20 PM, Arlindo da Silva wrote:

On Wed, Oct 5, 2011 at 11:51 AM, Jennifer Adams <jm...@co...> wrote:

Hi, Arlindo --
The release notes for 4.1.2 ( http://www.unidata.ucar.edu/software/netcdf/release-notes-4.1.2.html ) mention a lot of items that are relevant for GrADS, one especially that was pointed out by me (speedup in opening of large files). I will definitely not be going back to 4.1.1. We've been using 4.1.2 at COLA for a long time.

This is good to know. If you have been using 4.1.2 for some time, then this is proof of stability. I'll follow your lead.

Here are the 7 tests I ran with the two builds -- I could not detect any difference in the time it took for these tasks to run when comparing 4.1.2 and 4.2.snapshot.

I share your concern about 4.2. First, it is very new, and second, a large portion of the opendap codebase was apparently rewritten. Let me know when you make the final decision about 4.1.2 and I'll update my supplibs accordingly.

Cheers!
Arlindo

--Jennifer

'reinit'
'!date'

if (0)
* The original bug
'sdfopen http://opendap.gsfc.nasa.gov:9090/dods/GEOS-5/fp/0.25_deg/assim/inst1_2d_hwl_Nx'
'set x 1 1152'
'd slp'
say result
endif

if (0)
* NOMADS GDS
'sdfopen http://nomads.ncep.noaa.gov:9090/dods/gfs/gfs20111003/gfs_00z'
'set x 1 360'
'set y 1 181'
'set z 1 26'
'set t 1 65'
'define foo = hgtprs'
endif

if (0)
* COLA GDS, Ensembles
'sdfopen http://monsoondata.org:9090/dods/gfsens/gfsens.2011100400'
'set x 1 360'
'set y 1 181'
'set z 1 7'
'set e 1 22'
'define foo = z'
endif

if (0)
* Netcdf4 high res data behind GDS
'sdfopen http://monsoondata.org:9090/dods/nicam/sst'
'set x 1 5120'
'set y 1 2556'
'd ave(sst,t=1,t=8)'
endif

if (0)
* Netcdf4 high res data on local disk
'open /data/hdf5/grids/sst.ctl'
'set x 1 5120'
'set y 1 2556'
'd ave(sst,t=1,t=8)'
endif

if (0)
* Local file, PDEF'd
'open /data/netcdf/gswp/Albedo_mean_csu.ctl'
'set x 1 360'
'set y 1 150'
'set z 1'
'set t 1 12'
'define foo = albedo'
endif

if (1)
* Classic netcdf
'sdfopen /data/netcdf/rean/air.mon.mean.nc'
'set x 1 144'
'set y 1 73'
'set z 1 17'
'set t 1 349'
'define foo = air'
endif

'!date'

On Oct 5, 2011, at 10:26 AM, Arlindo da Silva wrote:

On Tue, Oct 4, 2011 at 5:06 PM, Jennifer Adams <jm...@co...> wrote:

Hi, Dennis --
Using the same GrADS source code, I have created two builds, one linked with netcdf-4.1.2 and the other netcdf-4.2-snapshot2011100320. Both use hdf5-1.8.7. The bug that showed up with 4.1.3 does not reproduce with either build. Furthermore, I am unable to detect any performance difference with these two builds, having tested some bulky I/O requests via OPeNDAP from several different GDS servers, and also from local classic netcdf files and very large compressed netcdf4 files. I am inclined to use 4.1.2 for my GrADS release. Can you give me an example for how to test the performance enhancements in the 4.2 snapshot?

I have a similar experience.
Comparing builds with older 4.1.1 and most recent 4.2 snapshot (with subsetting) gives essentially the same timing (if anything, 4.1.1 is a tiny bit faster: 49.69 vs 45.63, but this difference could be "natural variability").

Jennifer: BTW, here is my grads benchmark:

  lats4d -i http://opendap.gsfc.nasa.gov:9090/dods/GEOS-5/fp/0.25_deg/assim/inst1_2d_hwl_Nx -format stats -lat 0 30 -ntimes 10

Since 4.1.1 has been working for us very reliably for over a year, I am very tempted to stick with it in name of stability. Are there any major bug fixes going from 4.1.1 to 4.1.2?

Arlindo

--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jm...@co...
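Because the timing differences reported in this thread are small enough to be "natural variability", a repeatable harness helps separate signal from noise. The following is a minimal Python sketch, assuming each benchmark can be launched as a single command line. The function names and the mean-versus-spread rule are illustrative choices, not part of lats4d or GrADS.

```python
import statistics
import subprocess
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (mean, sample standard deviation); needs at least two timings."""
    return statistics.mean(timings), statistics.stdev(timings)


def differs_beyond_noise(a, b):
    """Crude check: do the means differ by more than the combined spread?"""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > (sd_a + sd_b)
```

One would call time_command once per build, passing the full benchmark command as a list of arguments (for example the lats4d invocation above, with whichever GrADS build is being tested first on the PATH), and then compare the two lists with differs_beyond_noise. A single pair of runs cannot distinguish a real difference from run-to-run variation, and for remote OPeNDAP tests network load adds further variation, which is why several repeats are taken.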