Charlie Zender - 2001-04-04

Here is an interesting bug we need to solve.
Anyone interested in tackling it?
Read the description in the TODO list first.
The problem arises because running averages
are initialized to 0.0 and then incremented.
However, when the missing_value also = 0.0,
then the averagers appear to get confused.
So presumably the initialization of the
running average needs to be done more
carefully.

I am opposed to any solution which makes all
the operators slower or consume a lot more memory.
A fix which makes the initialization of the
running average take longer, but does not slow
down the succeeding increments is acceptable.
Probably the first step is to print a warning
when missing_value = 0.0 in an averager, just
so the user is aware of the problem.

Charlie

From: John Sheldon <jps@GFDL.NOAA.GOV>

Hi again Charlie-

A user here pointed out that NCRA appears to treat "0" as
something special, though it shouldn't.  When a variable has
a "missing_value" of "0", the average comes out 0, even though
there are no 0's in the data.

We have put two very simple netcdf files out on our anon-FTP
the demonstrate this:
  ftp://ftp.gfdl.noaa.gov/pub/pjk/sphum.nc
ftp://ftp.gfdl.noaa.gov/pub/pjk/sphum_missing_equals_zero.nc
Or, in case it's easier, I'll append them at the end of this mail.

We ran NCRA (from NCO 1.2.2) on the these like so:
  ncra -y avg sphum_missing_equals_zero.nc out0.nc
  ncra -y avg sphum.nc out1.nc

File out0.nc yields an average of 0, where out1.nc shows a proper
average.

Our current workaround is to change the value of the "missing_value"
attribute, but, obviously, this won't always work (eg, if there really
are missing data points).

Thanks in advance for any help!
John Sheldon
==========================================
netcdf sphum {
dimensions:
        lat = 1 ;
        lon = 1 ;
        pfull = 1 ;
        time = UNLIMITED ; // (2 currently)
variables:
        float lat(lat) ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_N" ;
                lat:cartesian_axis = "Y" ;
                lat:edges = "latb" ;
        float lon(lon) ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_E" ;
                lon:cartesian_axis = "X" ;
                lon:edges = "lonb" ;
        float pfull(pfull) ;
                pfull:long_name = "approx full pressure level" ;
                pfull:units = "hPa" ;
                pfull:cartesian_axis = "Z" ;
                pfull:positive = "down" ;
        float sphum(time, pfull, lat, lon) ;
                sphum:long_name = "specific humidity" ;
                sphum:units = "none" ;
                sphum:missing_value = -999.f ;
                sphum:time_avg_info = "average_T1,average_T2,average_DT" ;
        double time(time) ;
                time:long_name = "time" ;
                time:units = "days since 1979-01-01 00:00:00" ;
                time:cartesian_axis = "T" ;
                time:calendar_type = "JULIAN" ;

// global attributes:
                :filename = "1994jul01h00.atmos_daily.nc" ;
                :MPP_IO_VERSION = "$Id: mpp_io.F90,v 5.5 2000/07/28 20:17:19 fms Exp $" ;
                :title = "Diagnostics from an FMS integration" ;
                :history = "Mon Apr  2 14:55:48 2001: ncks -v sphum -d pfull,0 -d lat,0 -d lon,0 -d time,0,1 -O one_pfull.temp_sphum.nc sphum.nc\n",
    "Fri Mar 30 08:12:35 2001: ncks -d pfull,0 1994jul01h00.atmos_daily.temp_sphum.nc one_pfull.temp_sphum.nc\n",
    "Thu Mar  1 12:24:40 2001: /t90/jps/pub/netcdf/util/bin/ncks -v temp,sphum -d lat,0,2 -d lon,0,3 -d pfull,15,18 1994jul01h00.atmos_daily.nc temp_sphum.nc" ;
data:

lat = -89 ;

lon = 1.25 ;

pfull = 592.9865 ;

sphum =
  2.922699e-05,
  1.910843e-05 ;

time = 5631, 5632 ;
}
============================
netcdf sphum_missing_equals_zero {
dimensions:
        lat = 1 ;
        lon = 1 ;
        pfull = 1 ;
        time = UNLIMITED ; // (2 currently)
variables:
        float lat(lat) ;
                lat:long_name = "latitude" ;
                lat:units = "degrees_N" ;
                lat:cartesian_axis = "Y" ;
                lat:edges = "latb" ;
        float lon(lon) ;
                lon:long_name = "longitude" ;
                lon:units = "degrees_E" ;
                lon:cartesian_axis = "X" ;
                lon:edges = "lonb" ;
        float pfull(pfull) ;
                pfull:long_name = "approx full pressure level" ;
                pfull:units = "hPa" ;
                pfull:cartesian_axis = "Z" ;
                pfull:positive = "down" ;
        float sphum(time, pfull, lat, lon) ;
                sphum:long_name = "specific humidity" ;
                sphum:units = "none" ;
                sphum:missing_value = 0.f ;
                sphum:time_avg_info = "average_T1,average_T2,average_DT" ;
        double time(time) ;
                time:long_name = "time" ;
                time:units = "days since 1979-01-01 00:00:00" ;
                time:cartesian_axis = "T" ;
                time:calendar_type = "JULIAN" ;

// global attributes:
                :filename = "1994jul01h00.atmos_daily.nc" ;
                :MPP_IO_VERSION = "$Id: mpp_io.F90,v 5.5 2000/07/28 20:17:19 fms Exp $" ;
                :title = "Diagnostics from an FMS integration" ;
                :history = "Mon Apr  2 14:56:31 2001: ncatted -a missing_value,sphum,m,f,0 sphum.nc new_sphum.nc\n",
    "Mon Apr  2 14:55:48 2001: ncks -v sphum -d pfull,0 -d lat,0 -d lon,0 -d time,0,1 -O one_pfull.temp_sphum.nc sphum.nc\n",
    "Fri Mar 30 08:12:35 2001: ncks -d pfull,0 1994jul01h00.atmos_daily.temp_sphum.nc one_pfull.temp_sphum.nc\n",
    "Thu Mar  1 12:24:40 2001: /t90/jps/pub/netcdf/util/bin/ncks -v temp,sphum -d lat,0,2 -d lon,0,3 -d pfull,15,18 1994jul01h00.atmos_daily.nc temp_sphum.nc" ;
data:

lat = -89 ;

lon = 1.25 ;

pfull = 592.9865 ;

sphum =
  2.922699e-05,
  1.910843e-05 ;

time = 5631, 5632 ;
}