
help make this more efficient

k_knapp
2008-11-04
2013-10-17
  • k_knapp
    2008-11-04

    I have ~70000 netCDF files that I want to subset over Europe and then ncrcat together in time.
    Currently, each file is (time,lat,lon)=(1,715,1429) (where time is unlimited) [HEADER is below]

    I am running the script below, which uses two steps to do the concatenation. It is pretty slow.
    I suspect this could be done in one step?
    I'd appreciate any help to make this script more efficient.
    Thanks-
    -Ken

    SCRIPT=====================:
    #!/bin/bash
    # Destination for the full 1983-2006 France-box record.
    # (The original "~/$France.1983-2006.nc" dereferenced an unset
    # variable $France, which expands to an empty string.)
    HOVFILE=~/France.1983-2006.nc
    OUTFILE=temp.nc

    for YEAR in $(seq 1983 2006)
    do
      # The glob keeps the directory prefix; the original
      # `ls $YEAR/ALL/ | grep nc` returned bare filenames that
      # ncks could not find from the working directory.
      for file in "$YEAR"/ALL/*.nc
      do
        # -O overwrites the temp file each pass; the original -A
        # appended into the previous iteration's subset.
        ncks -O -v IRWIN -d lon,0.,5. -d lat,45.,50. "$file" "$OUTFILE"
        if [ ! -e "$HOVFILE" ]
        then
          ncrcat -h -O "$OUTFILE" -o "$HOVFILE"
        else
          ncrcat -h -O "$HOVFILE" "$OUTFILE" -o "$HOVFILE"
        fi
      done
    done

    SAMPLE FILE HEADER======================:
    netcdf HURSAT-BASIN.NA.1999.04.29.03.beta {
    dimensions:
            lon = 1429 ;
            lat = 715 ;
            Ngeo = 5 ;
            StrLen = 50 ;
            nts = 1 ;
            time = UNLIMITED ; // (1 currently)
            num_coef = 4 ;
    variables:
            float lat(lat) ;
                    lat:units = "degrees_north" ;
                    lat:long_name = "Latitude" ;
                    lat:actual_range = 0.f, 50.f ;
            float lon(lon) ;
                    lon:units = "degrees_east" ;
                    lon:long_name = "Longitude" ;
                    lon:actual_range = -100.f, 0.f ;
            int time(time) ;
                    time:units = "hours since 1970-01-01 00:00:00" ;
                    time:long_name = "time" ;
            short IRWIN(time, lat, lon) ;
                    IRWIN:scale_factor = 0.01f ;
                    IRWIN:add_offset = 200.f ;
                    IRWIN:long_name = "Brightness Temperature (~11um)" ;
                    IRWIN:units = "Kelvin" ;

    ...
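
    (Aside on the packing shown in the header: IRWIN is stored as a short with scale_factor and add_offset, so the physical value follows the usual netCDF convention unpacked = packed * scale_factor + add_offset. A minimal check with awk, using a made-up packed value:)

    ```shell
    # Unpack a hypothetical raw IRWIN value of 8345 using the header's
    # scale_factor (0.01) and add_offset (200): unpacked = p*0.01 + 200.
    packed=8345
    awk -v p="$packed" 'BEGIN { printf "%.2f\n", p * 0.01 + 200.0 }'
    # prints 283.45 (Kelvin)
    ```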

     
    • Dave Allured
      2008-11-04

      For 70000 files you might consider a custom program in NCL, Fortran, C, or your favorite netCDF interface language.  Estimate how much run time will be needed with the script-and-NCO method.  Then weigh that against how long it would take you to write a program, the anticipated run time of that program, and how long you are content to wait for NCO to chug along.  HTH.

      Dave Allured
      CU/CIRES Climate Diagnostics Center (CDC)
      http://cires.colorado.edu/science/centers/cdc/
      NOAA/ESRL/PSD, Climate Analysis Branch (CAB)
      http://www.cdc.noaa.gov/
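
      (Dave's back-of-the-envelope estimate can itself be scripted: time one representative per-file step and scale to ~70000 files. The ncks line below is commented out and replaced by a sleep stand-in, so the printed number is illustrative only.)

      ```shell
      # Time one per-file subset step, then scale to ~70000 files.
      start=$(date +%s)
      # ncks -O -v IRWIN -d lon,0.,5. -d lat,45.,50. sample.nc temp.nc
      sleep 1   # stand-in for one real ncks invocation in this sketch
      end=$(date +%s)
      per_file=$((end - start))
      echo "rough total: $((per_file * 70000 / 3600)) hours"
      ```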

       
    • Charlie Zender
      2008-11-05

      first read this:

      http://nco.sf.net/nco.html#lrg

      ncrcat could do this in one command with no loop
      if you fed it all 70000 file names on standard input.
      this would be quicker than what you have.
      ncrcat can do the hyperslabbing itself; there is no need for ncks.

      hope this helps,
      charlie
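
      (A sketch of what Charlie describes, assuming the YEAR/ALL/ layout from the original script: generate the file list once and pipe it to a single ncrcat, which does the subsetting and concatenation in one pass. The ncrcat line is shown commented out since the exact output path is up to you.)

      ```shell
      # Emit the input paths in chronological order...
      list_inputs() {
        for YEAR in $(seq 1983 2006); do
          ls "$YEAR"/ALL/*.nc 2>/dev/null
        done
      }
      # ...then feed them to one ncrcat on stdin, which hyperslabs and
      # concatenates in a single pass (no temp files, no repeated
      # copying of the growing output file):
      # list_inputs | ncrcat -v IRWIN -d lon,0.,5. -d lat,45.,50. -o ~/France.1983-2006.nc
      ```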

       
  • Kevin
    2013-02-04

    Does anybody know how to feed large numbers of files like that into ncecat using the Windows cmd prompt?  I've tried many combinations of piping the 'ls' and 'dir' commands into grep and then into ncecat, but I can't seem to get the syntax quite right.

     
  • Charlie Zender
    2013-02-05

    windows will not work with the standard unix filters like ls, grep, …
    speak up if you know a way to fool windows into using tools like this from the command line.
    instead, your best bet for large numbers of files is to use the automatic filename generation feature of NCO, aka the -n switch:
    http://nco.sf.net/nco.html#input
    cz
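
    (As a sketch of what -n does: the switch takes file_number,digit_number,numeric_increment and synthesizes the remaining input names from the first one given on the command line. The function below mimics that expansion with hypothetical names; see the manual link above for the authoritative description.)

    ```shell
    # Mimic NCO's -n fl_nbr,dgt_nbr,inc expansion: given first input
    # in001.nc and "-n 5,3,1", NCO would read in001.nc .. in005.nc.
    expand_n() {
      local fl_nbr=$1 dgt_nbr=$2 inc=$3 stem=$4
      local i=1 emitted=0
      while [ "$emitted" -lt "$fl_nbr" ]; do
        printf '%s%0*d.nc\n' "$stem" "$dgt_nbr" "$i"
        i=$((i + inc))
        emitted=$((emitted + 1))
      done
    }
    expand_n 5 3 1 in   # prints in001.nc through in005.nc
    # e.g. (hypothetical names): ncrcat -n 70000,7,1 in0000001.nc all.nc
    ```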