
Similar data w/ different metadata causes SLOW process

Help
2019-03-12
2019-03-13
  • Ken Mankoff

    Ken Mankoff - 2019-03-12

    I have two NetCDF files from a regional climate model. When I post-process one variable w/ Python (xarray + dask) it takes 3 minutes. When I post-process the other it takes 3 days (!). The only difference appears to be the metadata. I'd like to clone the "fast" metadata to the "slow" file to see if that fixes things. Below is the fast metadata, then the slow, then the diff. Can you advise how to use nco to do this?

    Thanks,

    -k.

    FAST:

    dimensions:
        time = 90 ;
        x = 1496 ;
        y = 2700 ;
    variables:
        float time(time) ;
            time:units = "DAYS since 2017-01-01 00:00:00" ;
            time:long_name = "time" ;
            time:standard_name = "time" ;
        float x(x) ;
            x:units = "km" ;
            x:long_name = "x" ;
            x:standard_name = "x" ;
        float y(y) ;
            y:units = "km" ;
            y:long_name = "y" ;
            y:standard_name = "y" ;
        float LON(y, x) ;
            LON:units = "km" ;
            LON:long_name = "Easting" ;
            LON:standard_name = "Easting" ;
            LON:actual_range = 0.f, 0.f ;
            LON:missing_value = 0.f, -1.225254e+28f ;
        float LAT(y, x) ;
            LAT:units = "km" ;
            LAT:long_name = "Northing" ;
            LAT:standard_name = "Northing" ;
            LAT:actual_range = 0.f, 0.f ;
            LAT:missing_value = 0.f, -1.225254e+28f ;
        float runoffcorr(time, y, x) ;
            runoffcorr:units = "mm w.e. per day" ;
            runoffcorr:long_name = "Downscaled corrected snowmelt" ;
            runoffcorr:standard_name = "Downscaled_corrected_snowmelt" ;
            runoffcorr:actual_range = 0.f, 28.61563f ;
            runoffcorr:missing_value = -1.e+30f ;
    
    // global attributes:
            :grid = "Map Projection:Polar Stereographic Ellipsoid - Map Reference Latitude: 90.0 - Map Reference Longitude: -39.0 - Map Second Reference Latitude: 71.0 - Map Eccentricity: 0.081819190843 ;wgs84 - Map Equatorial Radius: 6378137.0 ;wgs84 meters - Grid Map Origin Column: 160 - Grid Map Origin Row: -120 - Grid Map Units per Cell: 5000 - Grid Width: 301 - Grid Height: 561" ;
            :netcdf = "4.4.1.1 of Nov 25 2017 10:57:26 $" ;
            :_Format = "classic" ;
    }
    

    SLOW:

    dimensions:
        time = 90 ;
        x = 1496 ;
        y = 2700 ;
    variables:
        float time(time) ;
            time:units = "DAYS since 2017-01-01 00:00:00" ;
            time:long_name = "time" ;
            time:standard_name = "time" ;
        float x(x) ;
            x:units = "km" ;
            x:long_name = "x" ;
            x:standard_name = "x" ;
        float y(y) ;
            y:units = "km" ;
            y:long_name = "y" ;
            y:standard_name = "y" ;
        float LON(y, x) ;
            LON:units = "Degree" ;
            LON:long_name = "Longitude" ;
            LON:standard_name = "Longitude" ;
            LON:actual_range = -639.4561f, 855.5441f ;
            LON:missing_value = -1.e+30f ;
        float LAT(y, x) ;
            LAT:units = "Degree" ;
            LAT:long_name = "Latitude" ;
            LAT:standard_name = "Latitude" ;
            LAT:actual_range = -3355.096f, -656.096f ;
            LAT:missing_value = -1.e+30f ;
        float precipcorr(time, y, x) ;
            precipcorr:units = "mm w.e. per day" ;
            precipcorr:long_name = "1km Topography precip" ;
            precipcorr:standard_name = "1km_Topography_precip" ;
            precipcorr:actual_range = -0.0154459f, 575.9426f ;
            precipcorr:missing_value = -1.e+30f ;
    
    // global attributes:
            :grid = "Map Projection:Polar Stereographic Ellipsoid - Map Reference Latitude: 90.0 - Map Reference Longitude: -39.0 - Map Second Reference Latitude: 71.0 - Map Eccentricity: 0.081819190843 ;wgs84 - Map Equatorial Radius: 6378137.0 ;wgs84 meters - Grid Map Origin Column: 160 - Grid Map Origin Row: -120 - Grid Map Units per Cell: 5000 - Grid Width: 301 - Grid Height: 561" ;
            :netcdf = "4.4.1 of Jun 27 2017 09:19:19 $" ;
            :_Format = "classic" ;
    }
    

    DIFF from

    diff <(ncdump -s -h precip/precip_WJB_int.2017_JFM.BN_RACMO2.3p2_FGRN055_1km.DD.nc) <(ncdump -s -h runoff/runoff_WJB_int.2017_JFM.BN_RACMO2.3p2_FGRN055_1km.DD.nc)
    

    produces:

    20,24c20,24
    <       LON:units = "Degree" ;
    <       LON:long_name = "Longitude" ;
    <       LON:standard_name = "Longitude" ;
    <       LON:actual_range = -639.4561f, 855.5441f ;
    <       LON:missing_value = -1.e+30f ;
    ---
    >       LON:units = "km" ;
    >       LON:long_name = "Easting" ;
    >       LON:standard_name = "Easting" ;
    >       LON:actual_range = 0.f, 0.f ;
    >       LON:missing_value = 0.f, -1.225254e+28f ;
    26,36c26,36
    <       LAT:units = "Degree" ;
    <       LAT:long_name = "Latitude" ;
    <       LAT:standard_name = "Latitude" ;
    <       LAT:actual_range = -3355.096f, -656.096f ;
    <       LAT:missing_value = -1.e+30f ;
    <   float precipcorr(time, y, x) ;
    <       precipcorr:units = "mm w.e. per day" ;
    <       precipcorr:long_name = "1km Topography precip" ;
    <       precipcorr:standard_name = "1km_Topography_precip" ;
    <       precipcorr:actual_range = -0.0154459f, 575.9426f ;
    <       precipcorr:missing_value = -1.e+30f ;
    ---
    >       LAT:units = "km" ;
    >       LAT:long_name = "Northing" ;
    >       LAT:standard_name = "Northing" ;
    >       LAT:actual_range = 0.f, 0.f ;
    >       LAT:missing_value = 0.f, -1.225254e+28f ;
    >   float runoffcorr(time, y, x) ;
    >       runoffcorr:units = "mm w.e. per day" ;
    >       runoffcorr:long_name = "Downscaled corrected snowmelt" ;
    >       runoffcorr:standard_name = "Downscaled_corrected_snowmelt" ;
    >       runoffcorr:actual_range = 0.f, 28.61563f ;
    >       runoffcorr:missing_value = -1.e+30f ;
    
     

    Last edit: Ken Mankoff 2019-03-12
    • Charlie Zender

      Charlie Zender - 2019-03-12

      I note that having multiple missing_value values is non-standard and may contribute to the slowness. That said, NCO has the ncatted operator, which can change multiple attributes with one command. That is the safest bet. An unsupported alternative would be to try appending the variable data from one file to the other while restricting the propagation of metadata (with -m -M) or of data (with -H).
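
As a rough illustration of the ncatted route, here is a minimal Python sketch (Python being the language used for the post-processing) that assembles one ncatted call to overwrite the LON attributes with the values from the "fast" ncdump listing. The file names `slow.nc` and `slow_fastmeta.nc` and the helper `ncatted_command` are hypothetical; the sketch only builds and prints the command and does not run it.

```python
import shlex

# Attributes of LON in the "fast" file, copied from the ncdump output in the
# question. Each entry is (attribute name, ncatted type code, value), where
# "c" means character and "f" means float.
FAST_LON_ATTS = [
    ("units", "c", "km"),
    ("long_name", "c", "Easting"),
    ("standard_name", "c", "Easting"),
    ("actual_range", "f", "0.0,0.0"),
    ("missing_value", "f", "0.0,-1.225254e+28"),
]


def ncatted_command(var, atts, src, dst):
    """Return an ncatted argument list that overwrites `atts` on `var`.

    Hypothetical helper: `src` and `dst` are placeholder file names.
    """
    cmd = ["ncatted", "-O"]  # -O: overwrite the output file if it exists
    for name, type_code, value in atts:
        # ncatted syntax: -a att_nm,var_nm,mode,att_type,att_val
        # mode "o" overwrites an existing attribute (or creates it).
        cmd += ["-a", f"{name},{var},o,{type_code},{value}"]
    return cmd + [src, dst]


cmd = ncatted_command("LON", FAST_LON_ATTS, "slow.nc", "slow_fastmeta.nc")
print(shlex.join(cmd))  # paste into a shell, or pass `cmd` to subprocess.run
```

The same helper could be called again with "LAT" and the Northing attributes. Note that this changes attributes only; the data stored in the variables are left untouched.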

       
      • Ken Mankoff

        Ken Mankoff - 2019-03-13

        Hi Charlie,

        The file with two missing_value values actually processes orders of magnitude faster. The other is the slow one.

        Digging further, I find that the fast file has empty LON and LAT arrays, but the slow one has populated LON and LAT arrays (with, for some reason, a weird range of values):

            float LON(y, x) ;
                    LON:units = "Degree" ;
                    LON:actual_range = -639.4561f, 855.5441f ;
        

        Can you advise what nco command can empty a variable so that ncdump shows:

        LON =
        , , , , , , , , , , , , , , , , , , , , , , , , ,
        , , , , , , , , , , , , , , , , , , , , , , , ,
        , , , , , , , , , , , , , , , , , , , , , , , _,

        Thanks,

        -k.

         
        • Charlie Zender

          Charlie Zender - 2019-03-13

          ncap2 can set any variable to any value, http://nco.sf.net/nco.html#ncap2
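
A minimal sketch of what such an ncap2 call might look like, again assembled in Python and only printed, not executed. The file names and the helper `ncap2_zero_command` are hypothetical. It overwrites LON and LAT with zeros, which matches the `actual_range = 0.f, 0.f` of the fast file; whether zeros or a fill value are needed to reproduce the `_` entries in the ncdump output is not settled in this thread, so treat the constant as an assumption.

```python
import shlex


def ncap2_zero_command(variables, src, dst):
    """Return an ncap2 argument list that overwrites each variable with zeros.

    Hypothetical helper: `src` and `dst` are placeholder file names.
    """
    # One ncap2 statement per variable. Multiplying by 0.0f keeps the
    # variable's shape and float type while discarding its values.
    # Caveat: ncap2 arithmetic skips elements equal to missing_value,
    # so those elements would stay as they are.
    script = " ".join(f"{name}=0.0f*{name};" for name in variables)
    # -O overwrites the output file, -s passes the script inline.
    return ["ncap2", "-O", "-s", script, src, dst]


cmd = ncap2_zero_command(["LON", "LAT"], "slow.nc", "slow_zeroed.nc")
print(shlex.join(cmd))  # paste into a shell, or pass `cmd` to subprocess.run
```

Running the post-processing on the resulting file would show whether the populated LON and LAT arrays are what makes the slow file slow.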

           
