How to join multiple NetCDF files of adjacent

Help
2012-10-07
2013-10-17
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-07

    Hi Zender,

    I want to use DAYMET  data (multiple NetCDF files of adjacent regions)  to compare the Weather research and forecasting modelling system (WRF) output. Since the domain area of my model output cover more than one DAYMET tile so I need to merge multiple DAYMET tile into single continuous data set. 

    I  have tried to follow your suggestion in this post but getting error message:

    ncks -mk_rec_dmn lat in1.nc tmp1.nc           # make lat record dimension
    ncks -mk_rec_dmn lat in2.nc tmp2.nc           # make lat record dimension
    ncrcat tmp1.nc tmp2.nc out.nc                          # concatenate latitudes

    First two went well but for third line i got following error:  \"ncrcat: ERROR input file temp1.nc lacks a record dimension\"

    I am using  NCO 4.0.8

    I also tried to follow  another suggestion by Zader

    ncecat in1.nc tmp1.nc # add record dimension
    ncecat in2.nc tmp2.nc # add record dimension
    ncpdq -a lat,lon,time,record tmp1.nc tmp1.nc # make lat record dimension
    ncpdq -a lat,lon,time,record tmp2.nc tmp2.nc # make lat record dimension
    ncrcat tmp1.nc tmp2.nc out.nc # concatenate latitudes
    ncwa -a record out.nc out.nc # remove record dimension
    ncpdq -a time,lat,lon out.nc out.nc # restore original dimension ordering

    But  this time the first 4 instruction runs fine and give the same error as before on instruction line 5th.

    ==================================
    Here is the description of my 1st input file:
    ==================================

    netcdf \\11930_2002_prcp {
    dimensions:
            x = 181 ;
            y = 214 ;
            time = 365 ;
            nv = 2 ;
    and
    variables:
    float prcp(time, y, x)

    ==================================
    Here is the description of my 2nd input file:
    ================================== 

    netcdf \\11750_2002_prcp {
    dimensions:
            x = 193 ;
            y = 242 ;
            time = 365 ;
            nv = 2 ;
    and
    variables:

    float prcp(time, y, x) ;

    Thanks
    Kamal
    PhD candidate
    University of Waterloo, Canada

     
  • Charlie Zender

    Charlie Zender - 2012-10-08

    hi kamal,
    your email brought to my attention some issues that needed fixing. i have now done that. i feel like i walked a mile for a kamal :)

    first, your input files do not appear to have the dimension lat.
    ncks -mk_rec_dmn should have told you this and failed but instead it silently failed.
    the code currently in cvs, and that will be in nco 4.2.2 fixes this.
    all you need to do with method 1 is change lat to y.

    regarding the ncpdq method, i suspect that since lat is not a dimension, it failed too.
    anyway, go back and try the ncks method. it should work with correct arguments.
    cz

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-08

    Hi Zender,

    I have tried to follow you suggestion but get the following error message:

    $ ncks -mk_rec_dmn y 11930_2002_prcp.nc  tmp1.nc

    ncks: ERROR You defined the output record dimension to be "y". Yet in the input variable "prcp" the record dimension is dimension number 2. NCO (and the netCDF3 API) only supports the record dimension beging the first dimensi
    on. Consider using ncpdq to permute the location of the record dimension in the input file.

    Thanks
    Kamal

     
  • Charlie Zender

    Charlie Zender - 2012-10-08

    Ahhh yes. I forgot ncks cannot permute dimensions but ncpdq can. So try the ncpdq method with the correct dimension names.
    c

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-08

    Hi Zender,

    Then what should be the final command structure:

    Is it look like :

    ncpdq  -mk_rec_dmn y 11930_2002_prcp.nc  tmp1.nc

    Thanks
    Kamal

     
  • Charlie Zender

    Charlie Zender - 2012-10-08

    ncecat in1.nc tmp1.nc # add record dimension ncecat in2.nc tmp2.nc # add record dimension
    ncpdq -a y,x,time,record tmp1.nc tmp1.nc # make y record dimension
    ncpdq -a y,x,time,record tmp2.nc tmp2.nc # make y record dimension
    ncrcat tmp1.nc tmp2.nc out.nc # concatenate latitudes
    ncwa -a record out.nc out.nc # remove record dimension
    ncpdq -a time,y,x out.nc out.nc # restore original dimension ordering

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-08

    Hi Zender,

    This time I get the following error

    gpc-f101n084-$ ncpdq -O -a y,x,time,record tmp1.nc -o tmp1.nc
    ncpdq: INFO nco_var_dmn_rdr_mtd() for variable lat reports old input record dimension record is now ordinal dimension 2, new record dimension must be y
    ncpdq: ERROR nco_var_dmn_rdr_mtd() did not find record dimension in variable yearday which claims to be record variable

     
  • Dave Allured

    Dave Allured - 2012-10-09

    Kamal and Charlie,

    I think that Daymet tiles have overlaps along the X and Y edges.  This means that simple concatenation across either X or Y will probably not work.  This is not an easy data set.

    For starters, note that *both* X and Y dimension sizes differ between the two sample inputs above. I expect that ncrcat will signal an X dimension mismatch, if you try to concatenate along Y as shown in #6 above.

    To get two north-south adjacent tiles to align, you would need to add the right amount of padding (missing values) along the correct edges on the X dimension, probably to both input files.  Then you would delete an overlap area on the Y dimension from ONE of the input files.  A small number of non-overlap grid points on one end of the cropped area would be lost by this method.

    Carefully examine the X and Y coordinates in both input files to determine the exact offsets for padding and cropping.  For example, Daymet tiles 11737 and 11917 (Colorado and Wyoming) have the following 13 Y coordinates in common, so these would be cropped from ONE but not both of the input files:

       -23500, -24500, -25500, -26500, -27500, -28500, -29500, -30500,
       -31500, -32500, -33500, -34500, -35500       (meters)

    After that, ncrcat may work correctly.  Deleting the overlap can be done with -d (hyperslab), but I do not know how to add padding plus extra coordinates with NCO.  A crop-only solution would be easier, with more lost grid points.

    A better solution would be a program that would allocate one super array, then determine offsets and overlay the tile hyperslabs in the correct positions within the super array.

    Beyond that, it seems that there are already GIS oriented solutions that can do spatial merging of Daymet tiles.  Check out the Daymet Community Tools page, in particular step 4 in the document for the GDAL Utilities.  I do not know whether any can make Netcdf output.  HTH.

    -Dave

     
  • Dave Allured

    Dave Allured - 2012-10-09

    Here is the link for the Daymet Community Tools page mentioned above:
    http://daymet.ornl.gov/Tools

     
  • Dave Allured

    Dave Allured - 2012-10-09

    Sorry, I just found out that my pad and crop solution above will not work correctly, because the triangular overlap areas at the edges of the tiles are filled with missing values.  There may be a way to make NCO "join" such areas.  Otherwise, a specific program or GIS are the only ways to go.

    Figure 2 and the note at the bottom of page 5 of this PDF documentation explain the missing values:
    http://daymet.ornl.gov/sites/default/files/UserGuides/current/readme_dailygriddedsurfacedata.pdf

    -Dave

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-09

    Dear  Dave,

    I was able to merge adjacent DAYMET tile using GIS but it was really time consuming. It takes couple of days to get one month average of 8/10 tiles. That is why I am trying to look for an alternate solution.

    Kamal 

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-10-09

    Hi Zender,

    I have come up with a problematic solution. That is now I can add adjacent tile following the below command:

    ncecat -O 11930_2002_prcp.nc  tmp1.nc
    ncecat -O 11750_2002_prcp.nc  tmp2.nc
    ncpdq -O -a y,x,time tmp1.nc -o tmp1.nc
    ncpdq -O -a y,x,time tmp2.nc -o tmp2.nc
    ncrcat tmp1.nc tmp2.nc -o  out.nc

    ncwa -a record out.nc -o out.nc
    ncpdq -O -a time,y,x out.nc -o out.nc

    But the problem is that  the final output  shows overlapping tiles instead of adjacent one.  I do not know how to solve the problem???

    Thanks
    Kamal

       

     
  • Dave Allured

    Dave Allured - 2012-10-10

    Kamal, here is a working solution in the NCL programming language.  I had to deal with Daymet sooner or later, so here you go.

    This demo merges two adjacent Daymet tiles, including both data and coordinates.  The overlap areas are handled correctly, without erasing data.  Run time is 3 seconds to merge one year of data for two tiles, on Mac OS.  NCL is a free programming language with Netcdf capabilities, available from Unidata:

       http://www.ncl.ucar.edu

    Charlie, feel free to copy code from this demo.  I do not really think that hyperslab spatial overlays need to be within the realm of NCO, especially with all the Daymet quirks that need to be handled.  But you might want to do it anyway.

    -Dave

    ;---------------------------------------------------------------------------
    ;
    ; NCL demo program to spatially join two Netcdf Daymet tiles.
    ;
    ; 2012-oct-09   By Dave Allured, NOAA/PSD/CIRES Climate Analysis Branch.
    ;       Infill of missing 2-D coordinates is not yet provided.
    ;
    ; Notes:
    ;
    ; This version requires that the two input tiles have at least
    ; one grid point in common.
    ;
    ; This method can generally be extended in both X and Y to any
    ; number of tiles, by calculating the appropriate offsets.
    ;
    ;---------------------------------------------------------------------------
    load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
    begin
       var     = "prcp"
       file2   = "11737_1980/" + var + ".nc"
       file1   = "11917_1980/" + var + ".nc"
       outfile = var + ".1980.nc"
       f1 = addfile (file1, "r")        ; open input files
       f2 = addfile (file2, "r")
       x1 = f1->x               ; read 1-D coordinate vectors
       y1 = f1->y
       x2 = f2->x
       y2 = f2->y
       i1a = max ((/ 0, ind (x2 .eq. x1(0)) /))  ; compute starting offsets
       i2a = max ((/ 0, ind (x1 .eq. x2(0)) /))  ; uses missing value trick
       j1a = max ((/ 0, ind (y2 .eq. y1(0)) /))
       j2a = max ((/ 0, ind (y1 .eq. y2(0)) /))
       i1b = i1a + dimsizes (x1) - 1         ; compute ending offsets
       i2b = i2a + dimsizes (x2) - 1
       j1b = j1a + dimsizes (y1) - 1
       j2b = j2a + dimsizes (y2) - 1
       nxout = 1 + max ((/ i1b, i2b /)) ; combined grid size
       nyout = 1 + max ((/ j1b, j2b /))
       print ("nxout, nyout = " + nxout + ", " + nyout)
       ntimes = dimsizes (f1->time)     ; allocate output arrays
       dims2  = (/ nyout, nxout /)
       dims3  = (/ ntimes, nyout, nxout /)
       sample = f1->$var$(0,0,0)
       olat = new (dims2, typeof (f1->lat), "No_FillValue")
       olon = new (dims2, typeof (f1->lon), "No_FillValue")
       xout = new (dims3, typeof (sample),  sample@_FillValue)
       print ("Overlay first  tile into arrays.")
       olat(j1a:j1b,i1a:i1b)   = f1->lat    ; overlay 2-D coordinates
       olon(j1a:j1b,i1a:i1b)   = f1->lon
       xout(:,j1a:j1b,i1a:i1b) = f1->$var$  ; overlay main array
       print ("Overlay second tile into arrays.")
       olat(j2a:j2b,i2a:i2b)   = f2->lat    ; overlay 2-D coordinates
       olon(j2a:j2b,i2a:i2b)   = f2->lon
       xout(:,j2a:j2b,i2a:i2b) = (/ where (ismissing (f2->$var$), \
          xout(:,j2a:j2b,i2a:i2b), f2->$var$) /)
                            ; overlay main array; use mask
                        ; to preserve existing data
       xout&x(i2a:i2b) = (/ x2 /)       ; 1-D coordinates were excluded by
       xout&y(j2a:j2b) = (/ y2 /)       ; masking; must copy explicitly
       print ("Write output file: " + outfile)
       if (isfilepresent (outfile)) then    ; overwrite any previous file
          system ("rm " + outfile)
       end if
       out = addfile (outfile, "c")     ; create new output file
       copy_VarAtts (f1, out)       ; copy the global attributes
       time_stamp  = systemfunc ("date")    ; add history attribute
       out@history = time_stamp + ": Tiles joined by join.daymet.1009.ncl"
       delete (out@tileid)          ; fix up the tile ID attribute
       out@tileid = (/ f1@tileid, f2@tileid /)
    ; Write variables in approximately the same order as in the input files.
    ; Some variables are copied directly because they are not affected
    ; by the join process.
    ; Coordinate variables get copied automatically.
       out->lambert_conformal_conic = f1->lambert_conformal_conic
       out->lat = olat          ; write the merged 2-D coordinates
       out->lon = olon
       out->yearday   = f1->yearday     ; copy auxiliary time arrays
       out->time_bnds = f1->time_bnds
       out->$var$ = xout            ; write the main data array last,
                        ; to improve speed
       delete (out->x@_FillValue)       ; remove vestigial attributes
       delete (out->y@_FillValue)
    end
    exit
    
     
  • Charlie Zender

    Charlie Zender - 2012-10-10

    Hi Kamal and Dave,

    Thanks, Dave.

    Dave is right that if both dimensions have different sizes in both files then NCO will complain.
    In that case the index/array capabilities of ncap2 are called for.
    ncap2 can do all of the above in the NCL script except ncap2 only accepts one input file so using would require first merging both inputs into one file and renaming variables and the speed would not be worth the ugliness.
    So use the NCL script.

    By the way, your use case above of ncpdq failed unexpectedly with this command

    ncpdq -O -a y,x,time,record tmp1.nc -o tmp1.nc

    That was a ncpdq bug to which I have just committed the fix.
    So if you want to give it another try with NCO, then check-out and build the latest code first and the above command should work.
    cz

     
  • Ping Yang

    Ping Yang - 2012-10-24

    Hi Dave,

    Thanks for the NCL script for the joining files.

    Actually I tried this script with the actual Daymet files:

    and I am sure if it working properly:

    the file information I got(I use cdo sinfo **)
    the original file1:
    cdo sinfo 11390_1980_prcp.nc |more
       File format: netCDF
        -1 : Institut Source  Table Code   Time Typ  Grid Size Num  Levels Num
         1 : unknown  unknown     0   -1   var  I16         1   1       1   1
         2 : unknown  unknown     0   -2   var  F32     52430   2       1   1
       Horizontal grids :
         1 : generic      > size      : dim = 1
         2 : curvilinear  > size      : dim = 52430  nx = 214  ny = 245
                            lon       : min = -82.5079845  max = -79.4318514  degrees_east
                            lat       : min = 35.6549601  max = 38.339592  degrees_north
                            available : xvals yvals
       Vertical grids :
         1 : surface                 : 0
       Time axis :  365 steps
         RefTime =  1980-01-01 00:00:00  Units = days  Calendar = STANDARD
      YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss

    the original file2:
    cdo sinfo 11390_1980_prcp.nc |more
       File format: netCDF
        -1 : Institut Source  Table Code   Time Typ  Grid Size Num  Levels Num
         1 : unknown  unknown     0   -1   var  I16         1   1       1   1
         2 : unknown  unknown     0   -2   var  F32     52430   2       1   1
       Horizontal grids :
         1 : generic      > size      : dim = 1
         2 : curvilinear  > size      : dim = 52430  nx = 214  ny = 245
                            lon       : min = -82.5079845  max = -79.4318514  degrees_east
                            lat       : min = 35.6549601  max = 38.339592  degrees_north
                            available : xvals yvals
       Vertical grids :
         1 : surface                 : 0
       Time axis :  365 steps
         RefTime =  1980-01-01 00:00:00  Units = days  Calendar = STANDARD
      YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss

    the merged file:
    do sinfo 11389-90_1980_prcp.nc |more
       File format: netCDF
        -1 : Institut Source  Table Code   Time Typ  Grid Size Num  Levels Num
         1 : unknown  unknown     0   -1   var  I16         1   1       1   1
         2 : unknown  unknown     0   -2   var  F32    105840   2       1   1
       Horizontal grids :
         1 : generic      > size      : dim = 1
         2 : curvilinear  > size      : dim = 105840  nx = 378  ny = 280
                            lon       : min = -84.4456074  max = 0  degrees_east
                            lat       : min = 0  max = 38.339592  degrees_north
                            available : xvals yvals
       Vertical grids :
         1 : surface                 : 0
       Time axis :  365 steps
         RefTime =  1980-01-01 00:00:00  Units = days  Calendar = STANDARD
      YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss  YYYY-MM-DD hh:mm:ss
      1980-01-01 12:00:00  1980-01-02 12:00:00  1980-01-03 12:00:00  1980-01-04 12:00:00

    I saw the merged file with the max and min 0, Is this correct or not?

    Looking forward to hearing from you.

    Regards,

    Ping

     
  • Dave Allured

    Dave Allured - 2012-10-25

    Hello Ping.

    Your output above for the merged file looks okay to me.   In particular, the dimensions of the curvilinear grid look about right.  The merged dimensions should be slightly different than the simple concatenation of the two input grids, because of offsetting and overlapping that is characteristic for adjacent Dayment tiles.

    For the input files, I think you showed 11390 twice, and you did not show 11389, but that does not really matter right now.

    The reason for the max and min of zero in the 2-D coordinates is because the two input grids are offset when they are merged, and this results in gaps along the edges of the 2-D coordinate arrays.  NCL has apparently filled these gaps with zero.  However, since the gaps are only associated with missing data, the file is still valid with many software methods.

    The gaps in the 2-D coordinates are a deficiency in this method which I hope to fix some time soon.  The gaps will cause trouble for some software.  In particular, NCL's RasterFill mode (colored boxes) when plotting fails dramatically when there are gaps in the 2-D coordinates.  It is likely that some functions in CDO and other generic software will have trouble with these gaps.

    I have a newer version of the join script that fixes a couple problems and allows you to join more tiles.  SourceForge's spam filter prevented me from posting a full description of the new version.  I will try again after this is posted.

    -Dave

     
  • Dave Allured

    Dave Allured - 2012-10-25

    All,

    I am working on an upgrade to the NCL join script that will handle merged files as inputs.  This will allow you to join larger numbers of Daymet tiles by combining previous merges.  You can download my current working version here.  The file name is join.daymet.1018.ncl, and it fixes a couple of problems in the original version.  The problem with gaps in coordinates, mentioned previously, is NOT fixed yet:

       http://www.esrl.noaa.gov/psd/people/dave.allured/data/daymet/

    This version successfully joined 12 Daymet tiles by use of the driver script make-wyoming.1018.csh.  The idiot SourceForge spam filter is still preventing me from showing a plot of the result.  Grrr.  Just display wyoming.gif from the above directory.  To check your own merged files, try using this plot script with appropriate changes to display them.

    -Dave

     
  • Dave Allured

    Dave Allured - 2012-10-25

    Here is the plot of the 12 merged tiles for Wyoming:

    This was made by plot.wyoming.1024.ncl in the web directory mentioned above.

    -Dave

     
  • Ping Yang

    Ping Yang - 2012-10-26

    Hi Dave,

    I tried to run the script for join 5 files

    I have the code(only one place changed)

    #!/bin/sh(I chaned here from csh to sh)

    #2012-oct-25 Merge control script for Northeast U.S.

    set join  = 'MergeScript/join.daymet.1018.ncl'
    set base1 = 'prcp.nc'
    set base2 = 'prcp.1980.nc'

    # Modify NCL code for use with other years (I have a question here, how can I change the join.daymet.1018.ncl file to use a year as variable which can increasingly from 1980-2010?

    # Note: All joins must be touching on at least one corner

    # 1 tile to 2 tiles

    echo ncl $join file1=\"11389_1980/$base1\" file2=\"11390_1980/$base1\" join=1138990
    echo ncl $join file1=\"11391_1980/$base1\" file2=\"11392_1980/$base1\" join=1139192

    #I have 5 files for one row here

    echo ncl $join file1=\"1138990.$base2\" file2=\"1139192.$base2\" join=11380909192
    echo ncl $join file1=\"11389909192.$base2\" file2=\"11393_1980/$base1\" join=113

    I tried to run above code but ncl require me to input and I changed the code by adding 'echo' and I got the commandline showing

    ncl file1="11389_1980/" file2="11390_1980/" join=1138990
    ncl file1="11391_1980/" file2="11392_1980/" join=1139192
    ncl file1="1138990." file2="1139192." join=11380909192
    ncl file1="11389909192." file2="11393_1980/" join=113

    which showed that the 'join' parameter is not working.

    and I change the sh to csh but the it told me "-bash: ./TestMerge.NEUS.sh: /bin/csh: bad interpreter: No such file or directory"

    Could you please tell me where is the problem?

    Regards,

    Ping

     
  • Dave Allured

    Dave Allured - 2012-10-26

    Ping,

    The driver script is written in C shell, which is my preferred shell scripting language.  It appears that on your system, C shell (/bin/csh) is either not installed, or in a different path.  You can ask your system administrator for the correct path for C shell, or else you can convert the script to bash (/bin/sh).

    I do not know bash very well, but I think mainly you need to change the set commands as follows.  Remove the keyword "set", and also remove the spaces around the first equal sign.  E.g.:

    join='MergeScript/join.daymet.1018.ncl'

    Also, the "ll" command is nonstandard in either shell.  Use "ls -l" instead.

    -Dave

     
  • Mostofa Kamal

    Mostofa Kamal - 2012-11-23

    Hi Dave,

    I have been trying to Join adjacent Daymet tiles using the code "join.daymet.1027.ncl" written by you but getting the following error message:

    fatal:Variable (Join) is undefined
    fatal:{"Execute.c":7743}:Execute: Error occurred at or near line 47 in file join.daymet.1027.ncl

    As I have just start learning NCL so I could not figure out the problem. Could you please tell me what is the problem?

    Thanks
    Kamal

     
  • Dave Allured

    Dave Allured - 2012-11-23

    All,

    (1) Private reply sent to Kamal to help him get started.  I think he did not correctly include the argument join=\"prefix\" on the NCL command line, as shown in the included usage instructions, but there are other possibilities.

    (2) This is an NCO forum, not NCL.  Please send questions about general NCL programming and this NCL join script to the NCL user list.  There is already a long discussion on this topic.

    (3) I have started a web page, with a more thorough introduction:

       "Joining Daymet Tiles with NCL"
       http://www.esrl.noaa.gov/psd/people/dave.allured/data/daymet/daymet.html

    (4) Please check with that page for the latest version of my NCL join script.  I try to provide improvements and bug fixes, as time permits.

    -Dave

     

Log in to post a comment.