ncpdq is crashing when attempting to reorient the axes of a NetCDF file

Help
2013-12-05
2013-12-10
  • James Adams

    James Adams - 2013-12-05

    I have a NetCDF file which has variables in (time, lon, lat) and I want to reorient the variables to (lon, lat, time). I am using ncpdq for this, like so:

    $ ncpdq -a lon,lat,time grid_5km.nc grid_5km_lonlattime.nc

    When I run this on a Linux machine, it runs for a minute or so and then stops with a "Killed" message. When I run the same command on a Windows PC, I get an error telling me that the executable has stopped working. The NetCDF file seems OK in other programs, but it crashes ncpdq every time on both Linux and Windows. I am not sure yet whether this is an error in my file or in ncpdq.

    How can I go about determining what the issue is here?

    Below is a ncdump -h of the NetCDF file, in case that's helpful. Thanks in advance for any suggestions.

    --James

    $ ncdump -h grid_5km.nc
    netcdf grid_5km {
    dimensions:
            time = UNLIMITED ; // (1420 currently)
            lon = 1385 ;
            lat = 584 ;
    variables:
            int time(time) ;
                    time:units = "days since 1800-01-01 00:00:00" ;
                    time:calendar = "gregorian" ;
                    time:long_name = "time" ;
                    time:standard_name = "time" ;
            float lat(lat) ;
                    lat:units = "degrees_north" ;
                    lat:long_name = "latitude" ;
                    lat:standard_name = "latitude" ;
            float lon(lon) ;
                    lon:units = "degrees_east" ;
                    lon:long_name = "longitude" ;
                    lon:standard_name = "longitude" ;
            float prcp(time, lon, lat) ;
                    prcp:_FillValue = -9999.9f ;
                    prcp:units = "millimeters" ;
                    prcp:standard_name = "precipitation_amount" ;
                    prcp:long_name = "Terrestrial Precipitation, monthly total" ;
                    prcp:valid_max = 10000.f ;
                    prcp:valid_min = 0.f ;
                    prcp:missing_value = -9999.9f ;
                    prcp:cell_methods = "time: sum" ;
            float tmax(time, lon, lat) ;
                    tmax:_FillValue = -9999.9f ;
                    tmax:units = "degrees_celsius" ;
                    tmax:standard_name = "surface_temperature" ;
                    tmax:long_name = "Terrestrial Air Temperature, monthly maximum" ;
                    tmax:valid_max = 100.f ;
                    tmax:valid_min = -100.f ;
                    tmax:missing_value = -9999.9f ;
                    tmax:cell_methods = "time: maximum" ;
            float tmin(time, lon, lat) ;
                    tmin:_FillValue = -9999.9f ;
                    tmin:units = "degrees_celsius" ;
                    tmin:standard_name = "surface_temperature" ;
                    tmin:long_name = "Terrestrial Air Temperature, monthly minimum" ;
                    tmin:valid_max = 100.f ;
                    tmin:valid_min = -100.f ;
                    tmin:missing_value = -9999.9f ;
                    tmin:cell_methods = "time: minimum" ;
    
    // global attributes:
                    :geospatial_lat_min = 24.5625 ;
                    :Conventions = "CF-1.6" ;
                    :institution = "National Climatic Data Center, NESDIS, NOAA, U.S. Department of Commerce" ;
                    :geospatial_lon_max = -67.0209 ;
                    :geospatial_lon_min = -124.6875 ;
                    :geospatial_lat_max = 49.3542 ;
                    :title = "Gridded 5-km resolution CONUS climate data from NCDC, for internal use only" ;
                    :standard_name_vocabulary = "CF Standard Name Table (v26, 08 November 2013)" ;
                    :date_modified = "2013-12-04 16:56:47" ;
                    :summary = "Gridded 5-km resolution CONUS climate data from NCDC, for internal use only" ;
                    :source = "GHCN Daily, maxs/mins/means" ;
                    :date_created = "2013-12-04 16:56:47" ;
    
     
    Last edit: James Adams 2013-12-05
    • Pedro Vicente

      Pedro Vicente - 2013-12-05

      Is it possible to put the complete netCDF file in a place where we can get it?

      Also, can you post here the output of this command?

      ncks -r

      Pedro

       
  • James Adams

    James Adams - 2013-12-06

    Thanks, Pedro.

    The output of ncks -r is below:

    C:\nco>ncks -r
    NCO netCDF Operators version VERSION built Jun  8 2013 on HOSTNAME by USER
    ncks version VERSION
    Linked to netCDF library version 4.3.0, compiled Jun  2 2013 18:45:55
    Copyright (C) 1995--2013 Charlie Zender
    NCO is free software and comes with a BIG FAT KISS and ABOLUTELY NO WARRANTY
    License: GNU General Public License (GPL) Version 3
    Homepage: http://nco.sf.net
    User's Guide: http://nco.sf.net/nco.html
    Configuration Option:   Active? Meaning or Reference:
    Check _FillValue        Yes     http://nco.sf.net/nco.html#mss_val
    Check missing_value     No      http://nco.sf.net/nco.html#mss_val
    Compressed netCDF3      No      http://nco.sf.net/nco.html#znetcdf (pre-alpha)
    DAP clients (libdap)    No      http://nco.sf.net/nco.html#dap
    DAP clients (libnetcdf) No      http://nco.sf.net/nco.html#dap
    Debugging: Custom       No      Pedantic, bounds checking (slowest execution)
    Debugging: Symbols      No      Produce symbols for debuggers (e.g., dbx, gdb)
    GNU Scientific Library  Yes     http://nco.sf.net/nco.html#gsl
    Internationalization    No      http://nco.sf.net/nco.html#i18n (pre-alpha)
    MPI parallelization     No      http://nco.sf.net/nco.html#mpi (beta)
    netCDF3 64-bit files    Yes     http://nco.sf.net/nco.html#lfs
    netCDF4/HDF5 available  Yes     http://nco.sf.net/nco.html#nco4
    netCDF4/HDF5 enabled    Yes     http://nco.sf.net/nco.html#nco4
    OpenMP SMP threading    No      http://nco.sf.net/nco.html#omp
    Optimization: run-time  No      Fastest execution possible (slowest compilation)
    
    Parallel netCDF3        No      http://nco.sf.net/nco.html#pnetcdf (pre-alpha)
    Regular Expressions     No      http://nco.sf.net/nco.html#rx
    Shared libraries built  No      Small, dynamically linked executables
    Shell globbing          No      http://nco.sf.net/nco.html#glb
    Static libraries built  No      Large executables with private namespaces
    UDUnits conversions     No      http://nco.sf.net/nco.html#udunits
    UDUnits2 conversions    No      http://nco.sf.net/nco.html#udunits
    
    Aprés prp_axs
    

    I will try to find a place to post the NetCDF file, it is rather large ~12GB.

    --James

     
  • Pedro Vicente

    Pedro Vicente - 2013-12-08

    The purpose of the ncks -r output was for us to know which NCO version you are using.

    It prints the version in this format

    <quote>
    NCO netCDF Operators version "4.4.0" built Dec 8 2013 on glace by pvicente
    ncks version "4.4.0"
    Linked to netCDF library version 4.3.1-rc2, compiled Sep 30 2013 13:55:16
    </quote>

    but your build has a problem here, since it prints "VERSION" instead. Do you happen to know which version you are using?

    If it is not the current release, 4.3.9, you could try the current version to see whether the problem also happens there.

    <quote>
    I will try to find a place to post the NetCDF file, it is rather large ~12GB.
    </quote>

    On second thought, this is not needed, since we can generate some fake data for the variable arrays.

    Pedro

     
  • Charlie Zender

    Charlie Zender - 2013-12-09

    Please run the command with ncpdq -D 2 and post the screen output.
    thx,
    cz

     
  • James Adams

    James Adams - 2013-12-09

    The below runs for a few minutes before going kaput:

    $ ncpdq -D 2 -a lon,lat,time ~/data/grid_5km.nc grid_5km_lonlattime.nc
    ncpdq: INFO Build compiler lacked (or user turned off) OpenMP support. Code will execute with single thread in Uni-Processor (UP) mode.
    ncpdq: INFO Overwriting global attribute geospatial_lat_min
    ncpdq: INFO Overwriting global attribute Conventions
    ncpdq: INFO Overwriting global attribute institution
    ncpdq: INFO Overwriting global attribute geospatial_lon_max
    ncpdq: INFO Overwriting global attribute geospatial_lon_min
    ncpdq: INFO Overwriting global attribute geospatial_lat_max
    ncpdq: INFO Overwriting global attribute title
    ncpdq: INFO Overwriting global attribute standard_name_vocabulary
    ncpdq: INFO Overwriting global attribute date_modified
    ncpdq: INFO Overwriting global attribute summary
    ncpdq: INFO Overwriting global attribute source
    ncpdq: INFO Overwriting global attribute date_created
    ncpdq: INFO nco_cpy_var_dfn_trv() is defining dimension lon as record dimension in output file per user request
    ncpdq: INFO Input file 0 is /home/james.adams/data/grid_5km.nc
    ncpdq: TIMER Metadata setup and file layout before main loop took    0.07 s
    Killed
    

    The below is what I see when I run the same command on my PC, it dies after a few minutes as well:

    C:\nco>ncpdq.exe -D 2 -a lon,lat,time -v prcp,tmin,tmax N:\james\datasets\grid_5km\netcdf\grid_5km.nc N:\james\datasets\grid_5km\netcdf\grid_5km_lonlattime.nc
    ncpdq: INFO Build compiler lacked (or user turned off) OpenMP support. Code will execute with single thread in Uni-Processor (UP) mode.
    ncpdq: INFO Requested re-order will change record dimension from time to lon. netCDF3 allows only one record dimension. Hence ncpdq will make lon record (i.e., least rapidly varying) dimension in all variables that contain it.
    ncpdq: INFO Input file 0 is N:\james\datasets\grid_5km\netcdf\grid_5km.nc
    ncpdq: TIMER Metadata setup and file layout before main loop took    2.67 s
    
     
  • Charlie Zender

    Charlie Zender - 2013-12-09

    One thing this output makes clear is that ncpdq appears to copy all the global attributes twice (hence the "Overwriting" messages): nco_att_cpy() appears to be called twice for every global attribute. This is unnecessary, though probably not related to the larger issue, and we'll fix it. What puzzles me about the "Killed" issue is that I've never seen that message before. Normally when NCO dies with large files, it dies gracefully with a self-reported malloc() error. That is not happening here. It is almost as if your system has some supervising process that kills memory-intensive jobs....
    cz

     
  • James Adams

    James Adams - 2013-12-09

    Yes, this appears to be the case here, at least on the Linux machine, which is a virtual container with a fixed memory allocation limit. However, the same unceremonious death happens on my Windows box, so I am not sure whether that tells us anything...
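
For a sense of scale, the uncompressed size of the data can be estimated from the dimension sizes in the ncdump header earlier in the thread. This is only a rough sketch: it assumes 4-byte floats, no compression, and a shell with 64-bit integer arithmetic, and the variable names are illustrative rather than taken from the thread.

```shell
#!/bin/sh
# Dimension sizes copied from the ncdump -h output of grid_5km.nc.
time=1420
lon=1385
lat=584
bytes_per_float=4   # assumption: NC_FLOAT stored uncompressed
n_vars=3            # prcp, tmax, tmin

# Bytes needed to hold one full (time, lon, lat) variable.
per_var=$((time * lon * lat * bytes_per_float))
# Bytes needed for all three data variables together.
total=$((per_var * n_vars))

echo "bytes per variable:    $per_var"
echo "MiB per variable:      $((per_var / 1024 / 1024))"
echo "bytes for $n_vars variables: $total"
```

This comes to roughly 4.6 GB per variable and 13.8 GB for all three, in line with the ~12GB file size mentioned above. A process that must hold even one whole variable, plus a permuted copy of it, could plausibly exceed a fixed memory limit.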

     
  • Charlie Zender

    Charlie Zender - 2013-12-09

    OK, now I know what it looks like ("Killed") when NCO hits resource limits under a hypervisor. That's helpful to know. This is not an NCO bug. However, a workaround may help you get your work done. The best strategy for you now is to split your file into separate files for each input variable, permute those files, then rejoin them into an output file. Or split the original file along the time dimension, permute those subfiles, then rejoin them. Either way should work fine.
    cz
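
The per-variable workaround could be sketched roughly as follows. The intermediate file names, the helper functions, and the DRY_RUN switch are hypothetical. The NCO options used (-v, -a, -O, -A) are standard ones, but whether a single variable fits within the memory limit on this system is untested here.

```shell
#!/bin/sh
# Sketch of the per-variable workaround: extract, permute, and re-merge
# one variable at a time so that ncpdq never handles the whole file.
# File names and the DRY_RUN switch are illustrative assumptions.

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: permute_by_variable INPUT OUTPUT VAR [VAR...]
permute_by_variable() {
    src=$1
    dst=$2
    shift 2
    for v in "$@"; do
        # Extract a single variable (with its coordinates) to a temporary file.
        run ncks -O -v "$v" "$src" "tmp_${v}.nc"
        # Re-order the dimensions of that one variable.
        run ncpdq -O -a lon,lat,time "tmp_${v}.nc" "tmp_${v}_pdq.nc"
        # Append the permuted variable to the output file
        # (assumes ncks -A simply creates the file on the first pass).
        run ncks -A "tmp_${v}_pdq.nc" "$dst"
    done
}

permute_by_variable grid_5km.nc grid_5km_lonlattime.nc prcp tmax tmin
```

As written, the script only prints the nine commands it would run. Setting DRY_RUN=0 in the environment executes them, and the temporary files can be deleted afterwards.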

     
  • Pedro Vicente

    Pedro Vicente - 2013-12-09

    James, in addition to what Charlie mentions, you could also try the latest development code.

    I made some changes to the NCO memory requirements, and now NCO runs considerably faster.

    This requires building from source, if that's an option for you.

    This command retrieves the current (“bleeding edge”) development version of NCO into a local directory named nco:

    cvs -z3 -d:pserver:anonymous@nco.cvs.sf.net:/cvsroot/nco co -kk nco

    then to build

    cd nco
    ./configure --prefix="any path you want to install NCO"
    make
    make install

    this installs NCO to the example path "any path you want to install NCO"

    If you omit --prefix, and do just

    ./configure

    a default location is chosen

    Pedro

     
    Last edit: Pedro Vicente 2013-12-09
    • Charlie Zender

      Charlie Zender - 2013-12-10

      what does this mean?
      how did the ncpdq memory requirements change?

      On 09/12/2013 15:11, Pedro Vicente wrote:

      I made some changes to the NCO memory requirements, and now NCO runs
      considerably faster.

      --
      Charlie Zender, Earth System Sci. & Computer Sci.
      University of California, Irvine 949-891-2429 )'(

       
  • Pedro Vicente

    Pedro Vicente - 2013-12-10

    Two recent changes make the NCO regression tests run about one order of magnitude faster: the introduction of a hash table look-up, and the replacement of a static array for dimensions (sized to the netCDF maximum) with a dynamic array that is allocated only for the dimensions a variable actually has.

    The two tables below show the execution times before and after the changes.

    Before

    Test Results                       Seconds to complete
    ---------------------------------  --------------------------------------
    Test      Success  Failure  Total  WallClock   Real   User  System   Diff
    ncap2:         11              11       0.38   0.00   0.00    0.00   0.00
    ncatted:       11              11      10.11   0.00   0.00    0.00   0.00
    ncbo:          24              24      29.94   0.00   0.00    0.00   0.00
    ncflint:        8               8      13.22   0.00   0.00    0.00   0.00
    nces:          17              17      10.62   0.00   0.00    0.00   0.00
    ncecat:        10              10       6.74   0.00   0.00    0.00   0.00
    ncks:          72              72      33.23   0.00   0.00    0.00   0.00
    ncpdq:         47              47      40.50   0.00   0.00    0.00   0.00
    ncra:          30              30      25.05   0.00   0.00    0.00   0.00
    ncrcat:        25              25      26.29  10.00   0.00    0.00  10.00
    ncrename:      22              22       2.16   0.00   0.00    0.00   0.00
    ncwa:          51              51      45.69   0.00   0.00    0.00   0.00

    After

    Test Results                       Seconds to complete
    ---------------------------------  --------------------------------------
    Test      Success  Failure  Total  WallClock   Real   User  System   Diff
    ncap2:         11              11       0.36   0.00   0.00    0.00   0.00
    ncatted:       11              11       1.91   0.00   0.00    0.00   0.00
    ncbo:          24              24       2.15   0.00   0.00    0.00   0.00
    ncflint:        8               8       1.47   0.00   0.00    0.00   0.00
    nces:          17              17       2.38   0.00   0.00    0.00   0.00
    ncecat:        10              10       1.17   0.00   0.00    0.00   0.00
    ncks:          72              72       5.16   0.00   0.00    0.00   0.00
    ncpdq:         47              47       4.13   0.00   0.00    0.00   0.00
    ncra:          30              30       2.95   0.00   0.00    0.00   0.00
    ncrcat:        25              25       3.07  10.00   0.00    0.00  10.00
    ncrename:      22              22       1.48   0.00   0.00    0.00   0.00
    ncwa:          51              51       3.75   0.00   0.00    0.00   0.00

     
    Last edit: Pedro Vicente 2013-12-10
