missing_value switch to _FillValue

Developers
2010-09-28
2013-10-17
  • Susan Bates
    2010-09-28

    With our group's update to version 4.0.3, we became aware of nco's switch to using _FillValue rather than missing_value. With this version, nco commands started to break since some of our variables only define a missing_value. I am the ocean model liaison at NCAR, and this will be a huge problem for us and our users. We have at least 6 years' worth of data that currently have variables with the missing_value attribute and no _FillValue attribute. From our experience, we know that data generated today may be requested and processed by users 5-10 years down the road. This raises quite a few issues we will have to resolve, and we wanted to bring you into the discussion to see if you have any other tips or ways to help us.

    Through some digging, we have found the ncrename work-around as well as the build-time flag CPPFLAGS='-DNCO_MSS_VAL_SNG=missing_value'. We do plan to change our model code so that it uses _FillValue, but this does not solve our problem of what to do with all of the existing data. As a side note, the ncrename command does not work for me. I have submitted that to the bug report page, but its use does affect potential solutions for our missing_value/_FillValue problem.

    We could build our nco version with the flag, but we have hundreds of users that we'll need to notify about this build option (if they read our help section for model output), so the "fix" may be fine for us, but we envision many emails from users about this problem. Also, how long will this build flag be able to work? Will you continue to include it with future versions? Is there a way to continue having the missing_value be backwards compatible, or are you firm on this issue?

     
  • Charlie Zender
    2010-09-29

    scbates,

    > With our group's update to version 4.0.3, we became aware of nco's switch to
    > using _FillValue rather than missing_value. With this version, nco commands
    > started to break since some of our variables only define a missing_value. I
    > am the ocean model liaison at NCAR, and this will be a huge problem for us and
    > our users. We have at least 6 years' worth of data that currently have variables
    > with the missing_value attribute and no _FillValue attribute. From our experience,
    > we know that data that is generated today may be requested and processed by
    > users 5-10 years down the road. This raises quite a few issues we will have
    > to resolve and we wanted to bring you into the discussion to see if you have
    > any other tips or ways to help us.

    > Through some digging, we have found the ncrename work-around as well as the
    > build-time flag CPPFLAGS='-DNCO_MSS_VAL_SNG=missing_value'. We do plan to
    > change our model code so that it uses _FillValue, but this does not solve our
    > problem of what to do with all of the existing data.

    > As a side note, the ncrename command does not work for me. I have
    > submitted that to the bug report page, but its use does affect
    > potential solutions for our missing_value/_FillValue problem.

    The ncrename bug you encountered was probably introduced in NCO 4.0.1
    and fixed in NCO 4.0.4, released last week, and is described here:

    http://nco.sf.net#bug_ncrename_dot

    Let us know if 4.0.4 does not solve this problem.

    > We could build our nco version with the flag, but we have hundreds of users
    > that we'll need to notify about this build option (if they read our help section
    > for model output), so the "fix" may be fine for us, but we envision many emails
    > from users about this problem. Also, how long will this build flag be able to
    > work?

    Probably "forever".

    > Will you continue to include it with future versions?

    Yes. The only difference between the old behavior and new behavior is
    the value of a string in a few places. Easy to keep this option
    "forever".
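A minimal sketch of how the build-time flag discussed above might be passed to the build. It assumes the NCO source tree has an autoconf-style configure script that accepts CPPFLAGS as an argument; configure_nco and the source path are hypothetical names for this example, and only the flag itself comes from this thread.

```shell
#!/bin/sh
# Flag quoted in this thread; note the plain ASCII quotes.
NCO_CPPFLAGS='-DNCO_MSS_VAL_SNG=missing_value'

# configure_nco is a hypothetical helper: $1 is the path to an unpacked
# NCO source tree (assumed to contain an autoconf ./configure script).
configure_nco() {
    ( cd "$1" && ./configure CPPFLAGS="$NCO_CPPFLAGS" )
}

# Usage (path is an assumption):
#   configure_nco "$HOME/src/nco" && make -C "$HOME/src/nco"
```

Keeping the flag in one variable makes it easy to record in site build notes, so users who ask why missing_value is honored can be pointed at a single line.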

    > Is there a way to continue having the missing_value be backwards
    > compatible, or are you firm on this issue?

    Both. Yes, we are firm in that the default will be _FillValue.
    The backwards compatibility issue can be solved if you do as many
    other GCM groups do and, in the future, define both "missing_value"
    and "_FillValue" to the same value for all variables. And run a script
    that loops through all your old files and executes

    ncatted -O -a _FillValue,,o,f,1.0e36 inout.nc

    with whatever you currently use for missing_value instead of 1.0e36.
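The loop described above could look like the following sketch. The ncatted call is the one given in this post; FILL, DRY_RUN, add_fill_value, and add_fill_value_all are names made up for the example. It assumes every variable that needs the attribute is of type float (the "f" in the command), so files with double or integer variables would likely need a different type code.

```shell
#!/bin/sh
# FILL is a hypothetical setting: use whatever the files store in missing_value.
FILL="${FILL:-1.0e36}"

# add_fill_value FILE: add (or overwrite) _FillValue on every variable in FILE.
# With DRY_RUN set, only print the command that would be run.
add_fill_value() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "ncatted -O -a _FillValue,,o,f,$FILL $1"
    else
        ncatted -O -a "_FillValue,,o,f,$FILL" "$1"
    fi
}

# add_fill_value_all FILE...: apply add_fill_value to each file, reporting failures.
add_fill_value_all() {
    for f in "$@"; do
        add_fill_value "$f" || echo "failed: $f" >&2
    done
}

# Usage (archive path is an assumption):
#   DRY_RUN=1 add_fill_value_all /path/to/archive/*.nc
```

Running once with DRY_RUN=1 shows exactly which files would be edited in place before anything is changed.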

    Note that recent versions (4.0.3+) of NCO warn the user whenever
    _only_ _FillValue or missing_value is defined for a variable _and_ NCO
    was compiled to pay attention to the other. Some users will ignore
    these warnings and bug you anyway. Practice non-violent communication
    with them.

    Installing 4.0.4 compiled with default switches, and defining both
    attributes for every variable, should solve the backwards
    compatibility problems with a minimum of effort for you, the data
    provider, while giving the user confidence that they will be warned
    whenever the operators are incompatible with older data they may have
    acquired.
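To find which variables in an existing file would trigger that warning, one option is to scan the header text. The sketch below reads "ncdump -h" output on standard input and prints variables that have a missing_value attribute but no _FillValue. The function name and the file name in the usage comment are hypothetical, and it assumes the usual "var:attribute = value ;" layout of ncdump headers.

```shell
#!/bin/sh
# vars_without_fill (hypothetical name): reads `ncdump -h` text on stdin and
# lists variables that define missing_value but not _FillValue.
# Assumes variable names consist of letters, digits, and underscores.
vars_without_fill() {
    awk '
        /^[ \t]*[A-Za-z0-9_]+:missing_value[ \t]*=/ { v = $1; sub(/:.*/, "", v); mv[v] = 1 }
        /^[ \t]*[A-Za-z0-9_]+:_FillValue[ \t]*=/    { v = $1; sub(/:.*/, "", v); fv[v] = 1 }
        END { for (v in mv) if (!(v in fv)) print v }
    ' | sort
}

# Usage (file name is an assumption):
#   ncdump -h testin.nc | vars_without_fill
```

An empty result means every variable with missing_value also carries _FillValue, which is the configuration recommended above.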

    Charlie

     
  • Susan Bates
    2010-09-29

    Charlie,
    Thanks for the quick and thorough reply. I will pass all of your suggestions on to our group; however, there is one issue I'd like to clear up with you first. You stated that you can still use the nco commands on files with only the missing_value or _FillValue and they will still work but give a warning. Using the ncra command, we find that the command does still work; however, it adds data for additional time steps filling them in with zeros. I have attached two files containing the output from an 'ncdump -h' command. The first (info) is the description from an original POP ocean model output file, called testin.nc. I then used the ncra command to try to average two such ocean output files to create the file named testout.nc. The second file (outinfo) is the description of this testout.nc file. Notice that the time descriptor for the testin.nc file says '(1 currently)' while the time for testout.nc says '(29 currently)'. In the file testout.nc, the first time step contains the data, which appears to have been created correctly, and the following time steps are filled with zeros. Is this because we haven't built the nco tool package with the flag to still allow the missing_value attribute or is this a bug?

    Thanks so much for your help,
    Susan

     
  • Susan Bates
    2010-09-29

    Oops, I didn't include the files. Since I can't attach them, I'll just post the first part describing the dimensions. If you'd like more of the file, or the netcdf files, let me know.
    Thanks again,
    Susan

    file titled "info":
    netcdf testin {
    dimensions:
    d2 = 2 ;
    time = UNLIMITED ; // (1 currently)
    nchar = 256 ;
    moc_comp = 3 ;
    transport_comp = 5 ;
    transport_reg = 2 ;
    z_t = 60 ;
    z_t_150m = 15 ;
    z_w = 60 ;
    z_w_top = 60 ;
    z_w_bot = 60 ;
    lat_aux_grid = 105 ;
    moc_z = 61 ;
    nlon = 100 ;
    nlat = 116 ;

    file titled "outinfo":
    netcdf testout {
    dimensions:
    d2 = 2 ;
    time = UNLIMITED ; // (29 currently)
    nchar = 256 ;
    moc_comp = 3 ;
    transport_comp = 5 ;
    transport_reg = 2 ;
    z_t = 60 ;
    z_t_150m = 15 ;
    z_w = 60 ;
    z_w_top = 60 ;
    z_w_bot = 60 ;
    lat_aux_grid = 105 ;
    moc_z = 61 ;
    nlon = 100 ;
    nlat = 116 ;

     
  • Charlie Zender
    2010-09-29

    Susan,

    What I mean by the "commands will still work" is that NCO will happily
    process (e.g., average) files that have metadata incompatible with the
    NCO build. For example, ncra built to adhere to the "_FillValue"
    convention will average a variable with only the "missing_value"
    attribute. And ncra built to pay attention to "missing_value" will
    average a variable with only the "_FillValue" attribute. Because of
    the incompatibility of the executable with the metadata, the operator
    will probably produce a numerically correct but unintended answer
    (because missing values will not be detected). However, NCO
    automatically WARNs the user about this. The onus to pay attention to
    these warnings and resolve the incompatibility (with, e.g., ncatted),
    and re-run the averaging command, is on the user.

    The following demonstrates this using http://dust.ess.uci.edu/nco/in.nc:

    ncks -C -v one_dmn_rec_var_missing_value,one_dmn_rec_var__FillValue ~/nco/data/in.nc
    ncra -O -v one_dmn_rec_var_missing_value,one_dmn_rec_var__FillValue ~/nco/data/in.nc ~/foo.nc
    ncks -C -H ~/foo.nc

    You will see that the WARNING message tries to get the user to behave.

    Regarding the behavior of the test files whose metadata you posted,
    the summary is incomplete and the files are too complex. Please
    reduce this problem to its simplest possible form using ncks to
    eliminate all the extraneous variables and attributes, leaving the
    fewest variables that demonstrate the problem, post the exact
    commands, and explain why the results may be erroneous.

    Thanks,
    Charlie

    zender@givre:~/nco$ ncks -C -v one_dmn_rec_var_missing_value,one_dmn_rec_var__FillValue ~/nco/data/in.nc
    one_dmn_rec_var__FillValue: type NC_FLOAT, 1 dimension, 2 attributes, chunked? no, compressed? no, packed? no, ID = 227
    one_dmn_rec_var__FillValue RAM size is 10*sizeof(NC_FLOAT) = 10*4 = 40 bytes
    one_dmn_rec_var__FillValue dimension 0: time, size = 10 NC_DOUBLE, dim. ID = 20 (CRD)(REC)
    one_dmn_rec_var__FillValue attribute 0: long_name, size = 124 NC_CHAR, value = One dimensional record variable with missing data indicated by _FillValue attribute only. No missing_value attribute exists.
    one_dmn_rec_var__FillValue attribute 1: _FillValue, size = 1 NC_FLOAT, value = 1e+36

    one_dmn_rec_var_missing_value: type NC_FLOAT, 1 dimension, 2 attributes, chunked? no, compressed? no, packed? no, ID = 226
    one_dmn_rec_var_missing_value RAM size is 10*sizeof(NC_FLOAT) = 10*4 = 40 bytes
    one_dmn_rec_var_missing_value dimension 0: time, size = 10 NC_DOUBLE, dim. ID = 20 (CRD)(REC)
    one_dmn_rec_var_missing_value attribute 0: long_name, size = 124 NC_CHAR, value = One dimensional record variable with missing data indicated by missing_value attribute only. No _FillValue attribute exists.
    one_dmn_rec_var_missing_value attribute 1: missing_value, size = 1 NC_FLOAT, value = 1e+36

    time=1 one_dmn_rec_var__FillValue=1
    time=2 one_dmn_rec_var__FillValue=2
    time=3 one_dmn_rec_var__FillValue=3
    time=4 one_dmn_rec_var__FillValue=4
    time=5 one_dmn_rec_var__FillValue=5
    time=6 one_dmn_rec_var__FillValue=6
    time=7 one_dmn_rec_var__FillValue=7
    time=8 one_dmn_rec_var__FillValue=8
    time=9 one_dmn_rec_var__FillValue=9
    time=10 one_dmn_rec_var__FillValue=1e+36

    time=1 one_dmn_rec_var_missing_value=1
    time=2 one_dmn_rec_var_missing_value=2
    time=3 one_dmn_rec_var_missing_value=3
    time=4 one_dmn_rec_var_missing_value=4
    time=5 one_dmn_rec_var_missing_value=5
    time=6 one_dmn_rec_var_missing_value=6
    time=7 one_dmn_rec_var_missing_value=7
    time=8 one_dmn_rec_var_missing_value=8
    time=9 one_dmn_rec_var_missing_value=9
    time=10 one_dmn_rec_var_missing_value=1e+36

    zender@givre:~/nco$ ncra -O -v one_dmn_rec_var_missing_value,one_dmn_rec_var__FillValue ~/nco/data/in.nc ~/foo.nc

    ncra: WARNING Variable one_dmn_rec_var_missing_value has attribute "missing_value" but not "_FillValue". To comply with netCDF conventions, NCO ignores values that equal the _FillValue attribute when performing arithmetic. Confusingly, values equal to the missing_value should also be neglected. However, it is tedious and (possibly) computationally expensive to check each value against multiple missing values during arithmetic on large variables. So NCO thinks that processing variables with a "missing_value" attribute and no "_FillValue" attribute may produce undesired arithmetic results (i.e., where values that were intended to be neglected were not, in fact, neglected). We suggest you rename all "missing_value" attributes to "_FillValue" or include both "missing_value" and "_FillValue" attributes (with the _same values_) for all variables that have either attribute. Because it is long, this message is only printed once per operator even though multiple variables may have the same attribute configuration. More information on missing values is given at:
    http://nco.sf.net/nco.html#mss_val
    Examples of renaming attributes are at:
    http://nco.sf.net/nco.html#xmp_ncrename
    Examples of creating and deleting attributes are at:
    http://nco.sf.net/nco.html#xmp_ncatted
    zender@givre:~/nco$ ncks -C -H ~/foo.nc
    time=5.5 one_dmn_rec_var__FillValue=5

    time=5.5 one_dmn_rec_var_missing_value=1e+35
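The two averages above can be reproduced with plain arithmetic. The sketch below (helper names are made up for the example) averages the ten values of the test variables twice: once skipping values equal to the fill value, as happens when the attribute name matches the NCO build, and once treating 1e36 as ordinary data.

```shell
#!/bin/sh
# mean_skip FILL v1 v2 ...: mean of the values, ignoring any equal to FILL.
mean_skip() {
    fill=$1; shift
    printf '%s\n' "$@" |
        awk -v fill="$fill" '($1 + 0) != (fill + 0) { s += $1; n++ }
                             END { if (n) printf "%g\n", s / n }'
}

# mean_all v1 v2 ...: mean of every value, fill value included.
mean_all() {
    printf '%s\n' "$@" |
        awk '{ s += $1; n++ } END { if (n) printf "%g\n", s / n }'
}

mean_skip 1e36 1 2 3 4 5 6 7 8 9 1e36   # fill value recognized and skipped
mean_all 1 2 3 4 5 6 7 8 9 1e36         # fill value averaged in as data
```

The first call prints 5 (the mean of 1 through 9) and the second prints 1e+35 (the sum is dominated by 1e36, divided by 10), matching the ncks output for the _FillValue and missing_value variables respectively.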

     
  • Susan Bates
    2010-10-04

    Hi Charlie,
    Well, the ncra problem has grown larger and is not straightforward to simplify. I am now convinced that it is unrelated to the missing_value/_FillValue issue. ncra is giving inconsistent and illogical output. For example, Adam Phillips used ncra on two of our ocean output files, each containing 1 time step. The resulting averaged file should have contained 1 time step, but it contained 29. He then tried various subsets of variables from these files, and the resulting ncra output had the expected 1 time step. Additionally, I extracted one variable (N_SALT) from the files and used ncra to average just the files with the N_SALT variable. I got a segmentation fault. However, if I extract TEMP and N_SALT into files and then use ncra on those, the ncra command works perfectly. Why would it give a segmentation fault on N_SALT alone but work (even for N_SALT) when both N_SALT and TEMP are present?

    These are just a couple of the many wacky examples that we have. As you can see, it's difficult to simplify the file when we don't know which variables or combinations are the problem. I spoke with Dennis today, who suggested that this might be a build-time error. He also thought you had access to our machines and might be willing to look at a couple of files if I pointed you to the directory. I don't want to waste your time if this is a build-time error. From this description, can you tell me if you suspect that this is the case? I would be willing to discuss other failures with you on the phone if that would help…or I can send the location of a couple of files if you are willing to go at this that way. Just let me know. We are really baffled here.

    Thanks,
    Susan

     
  • Charlie Zender
    2010-10-05

    Susan,

    Sorry to hear things have gotten weird…
    Remember that you can always downgrade NCO until we find a solution.
    Strange behavior can occur when there are multiple shared libnco.so's running around.
    I suggest you simplify a known problem like that mentioned above as much as possible,
    i.e., in size, number of variables, and names, then post the resulting files publicly, along with
    the exact commands and the expected results.
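One way to script that reduction is sketched below. The "ncks -v" extraction and the ncra average are standard NCO usage; the function name, the RUN hook, and the output file names are assumptions for the example. Note that ncks normally pulls in associated coordinate variables along with the requested ones.

```shell
#!/bin/sh
# RUN is a hypothetical hook: set RUN=echo to print the commands instead of
# running them; leave it empty to execute them.
RUN="${RUN:-}"

# reduce_case VARS IN1 IN2: extract the comma-separated VARS from both inputs,
# average the two small files, and show the header of the result.
reduce_case() {
    vars=$1
    $RUN ncks -O -v "$vars" "$2" small1.nc &&
    $RUN ncks -O -v "$vars" "$3" small2.nc &&
    $RUN ncra -O small1.nc small2.nc small_avg.nc &&
    $RUN ncdump -h small_avg.nc
}

# Usage (input file names are assumptions):
#   reduce_case N_SALT in1.nc in2.nc
```

The printed commands from a RUN=echo pass are also the "exact commands" to post alongside the reduced files.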

    The only account I have is on bluefire and that's too hard to debug on.
    Plus I don't have my cryptocard, so just post everything.

    C

     
  • Susan Bates
    2010-10-05

    Charlie,

    I have put one of our errors described above with the ncra command on the bug report page so that I could attach a file.
    Susan

     
  • Charlie Zender
    2010-10-05

    Susan,
    Thanks for giving me a test case.
    I can replicate the problem and will post here when fixed or if I need more info.
    C

     
  • Charlie Zender
    2010-10-06

    Hi Susan,

    Well, I think I've found and fixed the problem.

    http://nco.sf.net#bug_ncra_cf_crd

    It's due to my losing track of the complexity of handling
    metadata conventions. Your ocean model files triggered
    the bug because they follow CF-metadata conventions and include
    "time" in the "coordinates" attribute of some variables like N_HEAT.
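To see which variables in a file match that description, the header can be scanned for "coordinates" attributes that mention "time". This is a sketch only: the function name is hypothetical, it reads "ncdump -h" text on standard input, and it assumes the usual 'var:coordinates = "..." ;' header layout.

```shell
#!/bin/sh
# vars_with_time_coord (hypothetical name): reads `ncdump -h` text on stdin and
# prints variables whose "coordinates" attribute lists "time".
# Assumes variable names consist of letters, digits, and underscores.
vars_with_time_coord() {
    awk '
        /^[ \t]*[A-Za-z0-9_]+:coordinates[ \t]*=/ && /[" ]time[" ]/ {
            v = $1; sub(/:.*/, "", v); print v
        }
    '
}

# Usage (file name is an assumption):
#   ncdump -h testin.nc | vars_with_time_coord
```

Any variable it prints is a candidate for a reduced test case when checking an NCO build against this bug.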

    If you have a chance to re-build from source and test this
    on your problem, I would appreciate hearing any feedback before
    releasing it in NCO 4.0.5 which is supposed to fix this (and one
    other) bug.

    Charlie

     
  • Susan Bates
    2010-10-12

    Hi Charlie,

    Thanks for the quick fix for this. I am still waiting for it to be installed on our machines here. It was marked urgent last week. :-) I'll let you know how the testing goes as soon as the new version is installed.

    Susan

     
  • Charlie Zender
    2010-10-12

    Susan,

    Thanks for the update. NCAR is one of my best customers so I want to make sure NCO works as advertised on your files.
    Unfortunately, the only NCAR machine I have an account on is bluefire, which runs AIX, so I can't build a Linux executable there for you. However, I have upgraded my personal executables in ~/bin/AIX on bluefire if you want to test there.

    C

     
  • Susan Bates
    2010-10-13

    Hi Charlie,

    Your fix seems to have worked! We have retried many of the operations that were failing before and they are all working now. Thanks so much for your help. When will the next version with this fix be released?

    Susan

     
  • Charlie Zender
    2010-10-14

    Done.  4.0.5 is released.
    c

     
  • Susan Bates
    2010-10-14

    Thanks so much Charlie! We are in the process of having the new version installed. We really appreciate your quick response to this!
    Susan