"time since..." attribute

Developers
2003-03-20
2013-10-17
  • Rorik Peterson

    Rorik Peterson - 2003-03-20

    I'm looking into conditionally adding Unidata's UDUNITS library to NCO; in particular, I'd like to use the "time since ..." attribute with ncks, ncrcat, etc.  (It is also on the TODO list, what a bonus!) I like to cut out 1 or 2 days of NCEP's reanalysis data for my work, and they use that attribute with their files that cover an entire year.  Usually I know I want something like 6am to 6pm on Mar. 19, and then I have to find the Julian day, multiply by 4, add one, check whether it was a leap year, etc., which is a pain.  Instead, I'd like to use

    ncks -d time,"2001-03-19 06:00:0.0","2001-03-19 18:00:0.0" 2001.nc out.nc

    or maybe even

    ncrcat -d time,"2001-12-21 12:00:0.0","2002-1-1 06:00:0.0" 2001.nc 2002.nc out.nc
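
    For comparison, the manual index arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions: four records per day (00, 06, 12, 18 UTC), a file whose first record is Jan 1 00:00, and 0-based indices. The helper names are hypothetical and are not part of NCO.

```c
/* Return 1 if year is a leap year in the Gregorian calendar, else 0. */
int is_leap_year(int year)
{
  return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Day of year (1..366) for a calendar date. */
int day_of_year(int year, int month, int day)
{
  /* Days elapsed before the first of each month in a non-leap year. */
  static const int days_before[12] =
    {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
  int doy = days_before[month - 1] + day;
  if (month > 2 && is_leap_year(year)) doy++;
  return doy;
}

/* Hypothetical helper: 0-based record index, ASSUMING four records per day
   (00, 06, 12, 18 UTC) and a file that starts at Jan 1 00:00. */
long six_hourly_index(int year, int month, int day, int hour)
{
  return (long)(day_of_year(year, month, day) - 1) * 4 + hour / 6;
}
```

    Under those assumptions, 2001-03-19 06:00 maps to index 309 and 18:00 to index 311, so the hand-computed equivalent of the first command would be -d time,309,311.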

    I have this working right now with some hacks to nco_lmt.c, but I wonder if there are any standards I can assume for the strings sent with the -d flag.  Currently, I assume the 20-character format I show above.

    (1) can I assume it is 20 characters long?
    (2) can I assume there are dashes, colons, or spaces delimiting certain fields?

    I am going to have to assume something to discriminate these limits from dimension indexes or values.  Otherwise,

    ncks -d time,2001,2002 in.nc out.nc

    would be ambiguous: are those indexes or years?  This wouldn't be a problem if the above are invalid "time since..." strings, but should they be?

    What I think is there are three things you can send the -d flag: indexes, values, or these "time since..." strings. I wonder if you should have to prescribe the hour, minute, and second if the time data unit is "days since 2001-01-01"? 

    What about using "2001 01 01", without the dashes? 

    Then is "2001 01 01 16 14" equal to 4:14pm on Jan 1st? 

    Any input?

    rorik    

     
    • Charlie Zender

      Charlie Zender - 2003-03-20

      Hi Rorik,

      I love your plan. The way you end up treating the time coordinate
      should be consistent with the Climate/Forecast (CF) metadata convention

      http://www.cgd.ucar.edu/cms/eaton/cf-metadata/CF-working.html

      Section 4.4 is all about time. The CF convention describes many features
      that NCO might want to support. NCO does not have to support any
      CF feature to work generically, but there are some features, like
      UDunits time coordinates, that would be very useful to support.

      Metadata conventions (e.g., packing/unpacking) tend to interface
      with the low-level NCO routines. It sounds like you've already got
      something working which is great. We'll apply any of your patches
      that add CF functionality without sacrificing generic usability.
      I am not opposed to making UDunits a required component of NCO,
      if it comes to that, since UDunits is just as accessible as netCDF.
      But I'd prefer to keep it optional for now.

      > I have this working right now with some hacks to nco_lmt.c, but I wonder if
      > there are any standards I can assume for the strings sent with the -d flag.
      > Currently, I assume the 20-character format I show above.
      >
      > (1) can I assume it is 20 characters long?

      No. This seems too restrictive to me.

      > (2) can I assume there are dashes, colons, or spaces delimiting certain fields?

      Yes. I think this is the better strategy. I would shy away from using
      colons, if possible. We may use colons to delineate subscript ranges
      in future versions of ncap, as per Fortran90 standard. Spaces and
      perhaps dashes seem like reasonable indicators. Also, the presence or
      absence of certain metadata fields could be used. For example, CF
      specifies that time must have a "units" attribute formatted as per the
      recommendations in the UDunits package.

      Perhaps if you find special characters (space, dash, letters) in the
      -d argument then you can send the limit strings to UDunits and check
      the return code. If UDunits can interpret them then let it, otherwise
      send the limits string through the normal lmt_evl() procedure.

      > I am going to have to assume something to discriminate these limits from dimension
      > indexes or values.  Otherwise,
      >
      > ncks -d time,2001,2002 in.nc out.nc
      >
      > would be ambiguous: are those indexes or years?  This wouldn't be a problem
      > if the above are invalid "time since..." strings, but should they be?

      Yes, for now 2001,2002 must be interpreted as indices; otherwise it
      will break the generic functionality.
      That said, it may be that 2001,2002 is a valid UDunits specification
      of years 2001-2002 but in that case there must be ancillary metadata
      in the "units" attribute telling you that. Whether you wish to support
      the "units" attribute to this extent is your decision.

      > What I think is there are three things you can send the -d flag: indexes, values,
      > or these "time since..." strings.

      Yes. I would call them indices, simple values, and everything else is a
      complex coordinate limit that requires additional interpretation.
      In practice, that's just valid "time since" strings for now.
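
      The three-way split above can be sketched as a small classifier. This is a minimal sketch, not the code in nco_lmt.c: the enum and function names are hypothetical, and it assumes that a limit which parses entirely as an integer is an index, one which parses entirely as a floating-point number is a simple value, and anything else is a complex limit.

```c
#include <stdlib.h>

/* Hypothetical names for the three kinds of limit described above. */
typedef enum { LMT_INDEX, LMT_VALUE, LMT_COMPLEX } lmt_kind;

/* Classify one -d limit string.
   ASSUMPTION: whole string is an integer      -> index,
               whole string is a real number   -> simple coordinate value,
               anything else                   -> needs further interpretation. */
lmt_kind classify_limit(const char *s)
{
  char *end;

  /* NULL or empty strings are not handled here; they fall into the
     catch-all category in this sketch. */
  if (s == NULL || *s == '\0') return LMT_COMPLEX;

  (void)strtol(s, &end, 10);
  if (end != s && *end == '\0') return LMT_INDEX;

  (void)strtod(s, &end);
  if (end != s && *end == '\0') return LMT_VALUE;

  return LMT_COMPLEX;
}
```

      With this rule "2001" and "2002" stay indices, "1.5" is a simple value, and "2001-03-19" or "10 d" fall through to the complex case.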

      > I wonder if you should have to prescribe the hour, minute, and second
      > if the time data unit is "days since 2001-01-01"?

      I do not think you should have to supply H, M, and S if you are
      working in days.

      > What about using "2001 01 01", without the dashes? 

      I do not like this format but it is unambiguous in that everything is
      specified as whole numbers so it is fine if you want to add support
      for it.

      > Then is "2001 01 01 16 14" equal to 4:14pm on Jan 1st? 

      Yes. But in which time zone? Greenwich unless otherwise specified.

      > Any input?

      My suggestion is to add the generic hooks necessary to support more
      than you wish to code right now. Think about the formats _you_ have to
      read now. Think about CF. Decide what supports your minimal
      requirements. Think what supporting the next level of generality would
      take. Add the hooks that would support that level, even though you
      require only a subset of the hooks immediately. Basically you want to
      separate policy from implementation:
      Does current hyperslab limit require additional interpretation?
      If so, branch into special limits handler which converts complex
      coordinate values into indices.
      Does this complex coordinate limit appear to be a time coordinate?
      If so, branch into time coordinate handler.
      Special time limits handler may eventually support CF, NCEP, etc.
      Right now you just code the NCEP-specific support you need.
      Maybe you or someone else will fill in the rest of the hooks later.
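
      One way to picture the "hooks" idea is a table of handlers that the limit code tries in turn, falling back to the ordinary evaluation when none of them accepts the string. The sketch below is only an illustration: every name in it is hypothetical, and the single handler is a stand-in that understands just "days since YYYY-MM-DD" units and YYYY-MM-DD limits, where a real one would pass the strings to UDUNITS and check its return code.

```c
#include <stdio.h>

/* Days between 1970-01-01 and a proleptic Gregorian date. */
long days_from_civil(int y, int m, int d)
{
  long era, yoe, doy, doe;
  y -= (m <= 2);                       /* treat Jan/Feb as end of prior year */
  era = (y >= 0 ? y : y - 399) / 400;
  yoe = y - era * 400;
  doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
  doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + doe - 719468;
}

/* A handler returns 0 and fills *out if it understands the limit,
   nonzero if the limit is not of its kind. */
typedef int (*lmt_handler)(const char *lmt, const char *units, double *out);

/* Stand-in time handler (ASSUMPTION: date-only limits, "days since" units). */
int time_handler(const char *lmt, const char *units, double *out)
{
  int y0, m0, d0, y, m, d;
  if (sscanf(units, "days since %d-%d-%d", &y0, &m0, &d0) != 3) return 1;
  if (sscanf(lmt, "%d-%d-%d", &y, &m, &d) != 3) return 1;
  *out = (double)(days_from_civil(y, m, d) - days_from_civil(y0, m0, d0));
  return 0;
}

/* Hooks: CF- or NCEP-specific handlers could be appended here later. */
static const lmt_handler lmt_handlers[] = { time_handler };

/* Policy: try each special handler; a nonzero return tells the caller to
   fall back to the normal limit evaluation. */
int interpret_complex_limit(const char *lmt, const char *units, double *out)
{
  size_t i;
  for (i = 0; i < sizeof lmt_handlers / sizeof lmt_handlers[0]; i++)
    if (lmt_handlers[i](lmt, units, out) == 0) return 0;
  return 1;
}
```

      The point of the table is that the caller never changes when a new kind of limit is supported; only a new handler is added.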

      Good luck. Keep us posted!
      Charlie

       
    • Rorik Peterson

      Rorik Peterson - 2003-03-24

      I made some commits to add support for Unidata's udunits library.  Autoconf checks for the library, and if it is there, the support is compiled in.  The following strings are legal '-d' options:

      "2001-01-01 12:00:0.0"
      2001-01-01
      2001-01-01 1200

      The CF conventions appear to require the YYYY-MM-DD format, and Unidata's udunits library lexer does as well, so I decided that is how to look for the string.  In the end, I look for a dash anywhere other than the first character (a leading dash would indicate a negative coordinate value).  The colon should be optional.  The HHMM spec is OK, but HHMMSS is not, also following Unidata's lexer.
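
      The dash test described above is small enough to sketch directly. The function name is hypothetical and this is not the code that was committed; it only shows the rule as stated.

```c
#include <string.h>

/* Return 1 if the limit string contains a dash anywhere after its first
   character (the rule described above for spotting a date), else 0.
   A dash in the first position is taken to be a minus sign. */
int has_non_initial_dash(const char *s)
{
  if (s == NULL || s[0] == '\0') return 0;
  return strchr(s + 1, '-') != NULL;
}
```

      Note that in this naive form a value written with a negative exponent, such as 1e-06, also matches, so a stricter check would be needed if such values must remain plain coordinate values.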

      I moved the limit type enumeration from nco.h to nco_lmt.h because nco_lmt.c is the only thing that uses it.

       
      • Charlie Zender

        Charlie Zender - 2003-03-24

        Hi Rorik,

        Nice bit o'hackin' there! This is a really useful new feature.
        I'll start to use it in my own scripts quite soon.

        I've added UDUNITS=Y support to bld/Makefile (off by default).
        Seems to build fine when enabled but I have not tested it yet.

        Will you please add some new coordinate variable(s) (called anything
        but "time") with appropriate "units" attributes to data/in.cdl and
        associated tests to bld/nco_tst.sh so we can do regression testing
        with this feature? I suggest using variants of time or date, e.g.,
        time_udunits, time_YYYY-MM-DD, time_YYYY-MM-DD_HHMM.
        Perhaps they should be in their own block and of the "will not work
        unless UDUNITS-enabled" variety.

        Once we have some tests in then we can release this as 2.7.3.

        Thanks!
        Charlie

         
    • Rorik Peterson

      Rorik Peterson - 2003-03-25

      I added a little more udunits support for unit conversions.  Now the following works:

      ncks -H -C -d wvl,"0.5 micron","1.0 micron" -v wvl in.nc
      wvl[1]=1e-06

      ncks -H -C -d wvl,"1 picometer","1 furlong" -v wvl in.nc
      wvl[0]=5e-07
      wvl[1]=5e-06

      I'm not sure if it is particularly useful in most instances, but it wasn't much work to do.
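
      The arithmetic behind such a conversion can be sketched with a tiny lookup table. This is an illustration only: UDUNITS takes its factors from its own database, the table and function names here are hypothetical, and it is an assumption that wvl is stored in meters.

```c
#include <string.h>

/* Hypothetical lookup table: scale factor from a unit name to meters.
   Hard-coded here only for illustration. */
struct unit_scale { const char *name; double to_meter; };

static const struct unit_scale length_units[] = {
  { "micron",    1.0e-6  },
  { "picometer", 1.0e-12 },
  { "furlong",   201.168 },   /* 660 feet of 0.3048 m each */
  { "meter",     1.0     }
};

/* Convert value in the named unit to meters. Returns 0 on success,
   nonzero if the unit is not in the table. */
int to_meters(double value, const char *unit, double *meters)
{
  size_t i;
  for (i = 0; i < sizeof length_units / sizeof length_units[0]; i++) {
    if (strcmp(unit, length_units[i].name) == 0) {
      *meters = value * length_units[i].to_meter;
      return 0;
    }
  }
  return 1;
}
```

      Under that assumption, "0.5 micron" and "1.0 micron" become 5e-07 and 1e-06, and the picometer-to-furlong range runs from 1e-12 m to about 201 m.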

      I added a test for both the time/date strings and a unit conversion like above.

      rorik

       
      • Nobody/Anonymous

        Yes, this is a most elegant generalization. Take a bow!
        And it really shows the power of the autotools build
        that this can be made transparent to the user.
        Bonus coolness for user transparency!

        Three questions to make sure I understand the new features before creating a release:

        Do all the UDUnits features work with any operator
        that does hyperslabs, e.g., ncks, ncra, ncea, or is
        it restricted for some reason?

        Are all of the UDUnits features based on the "units"
        attribute? Is there any case where you can use these
        features on a variable without the "units" attribute?

        Does the time coordinate hyperslabbing work correctly
        across multiple input files for multi-file operators? e.g.,

        ncra -d time,"2001-03-19 06:00:0.0","2001-03-19 18:00:0.0" in1.nc in2.nc in3.nc out.nc

        Will this grab all the records from all three input files
        that meet these criteria?

        Thanks,
        Charlie

         
    • Rorik Peterson

      Rorik Peterson - 2003-03-26

      > Three questions to make sure I understand the new features before creating a
      > release:
      >
      > Do all the UDUnits features work with any operator
      > that does hyperslabs, e.g., ncks, ncra, ncea, or is
      > it restricted for some reason?

      It should work with any operator that uses the '-d' flag.  If units are supplied to -d, the new function converts the user-specified value into coordinate values.  Everything carries on the same from there.  Except for dates with no time (e.g., -d,2003-03-12), spaces are required between the value and the unit in order to discriminate things like -d junk,10d,20d.  In that example the d's are not treated as units; you need -d junk,"10 d","20 d".  If the user specifies a unit that is not in the udunits database, an error results (unless the specified unit is the same as the one in the file, in which case no conversion is needed and things carry on).  For dates, a non-initial dash is required, so -d,"2001 03 12" is not good enough, but the udunits library wouldn't accept that either, so I think it is an OK requirement.  The CCM stuff says the same.
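
      The space requirement can be sketched as a small splitter. The function name is hypothetical and this is not the committed code; it only shows why "10d" and "10 d" are treated differently.

```c
#include <ctype.h>
#include <stdlib.h>

/* Split a limit such as "10 d" into its number and its unit string.
   Returns 1 on success, with *unit pointing at the unit inside s.
   Returns 0 if there is no leading number or no space before the unit,
   so "10d" is NOT treated as ten of unit "d". */
int split_value_unit(const char *s, double *value, const char **unit)
{
  char *end;
  double v = strtod(s, &end);

  if (end == s) return 0;                        /* no leading number */
  if (!isspace((unsigned char)*end)) return 0;   /* no separating space */
  while (isspace((unsigned char)*end)) end++;    /* skip the blanks */
  if (*end == '\0') return 0;                    /* nothing after the blanks */

  *value = v;
  *unit = end;
  return 1;
}
```

      A date such as 2003-03-12 is rejected by this splitter (the number is followed by a dash, not a space), which is why dates need their own test.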

      > Are all of the UDUnits features based on the "units"
      > attribute? Is there any case where you can use these
      > features on a variable without the "units" attribute?

      It is based on the units attribute. I'm not sure how we could implement it for a dimension without a units attribute, because we would not know what the native units are in the file, so we couldn't do any conversion.  A dimension without units is sort of meaningless, anyway.  There are things like 'lat', where we assume the units are degrees north, but that list is pretty limited.  Maybe I don't understand the question entirely.

      > Does the time coordinate hyperslabbing work correctly
      > across multiple input files for multi-file operators? e.g.,
      >
      > ncra -d time,"2001-03-19 06:00:0.0","2001-03-19 18:00:0.0" in1.nc in2.nc in3.nc
      > out.nc
      >
      > Will this grab all the records from all three input files
      > that meet these criteria?

      Yes, I've tested ncra, ncrcat, ncea, and ncks and they all seem to work.

      >
      > Thanks,
      > Charlie

      One warning, however.  There seems to be a problem with libthread and libpthread on my Solaris machines when using the udunits library.  The problem does not exist with libpthread on Linux.  The problem is not in my code, however, because this program fails when compiled with -lthread on Solaris:

      #include <stdlib.h>
      #include <stdio.h>
      #include <udunits.h>

      int main(void) {
        utInit(""); /* initialize the udunits library */
        utTerm();   /* free memory allocated by the udunits library */
        return 0;
      }

      I don't know much about threads, so I'm having a little trouble deciphering the problem exactly.  Everything is fine without the thread library.

      rorik

       
      • Charlie Zender

        Charlie Zender - 2003-03-26

        > It should work with any operator that uses the '-d' flag. 

        Great, just making sure.
        Thanks for the explanation of the implementation.

        > > Are all of the UDUnits features based on the "units"
        > > attribute? Is there any case where you can use these
        > > features on a variable without the "units" attribute?

        > Maybe I don't understand the question entirely.

        You understood it perfectly.

        > > Does the time coordinate hyperslabbing work correctly
        > > across multiple input files for multi-file operators? e.g.,

        > Yes, I've tested ncra, ncrcat, ncea, and ncks and they all seem to work.

        Great.

        > One warning, however.  There seems to be a problem with libthread and libpthread
        > on my Solaris machines when using the udunits library.  The problem does not

        I'll post this question to Unidata.

        Charlie

         
