Menu

Multislabbing discussions archive

Developers
2002-12-28
2013-10-17
  • Charlie Zender

    Charlie Zender - 2002-12-28

    For those interested in the new "multislabbing" feature, here
    is a thread where I hope to archive the offline discussions
    between Henry and myself that led up to this.

    I did not save all the messages, but I just found one that
    will be useful reference material, and will post that first.
    Perhaps Henry will post what he has. Note that multi-slabbing,
    loosely defined, is simply a set of hyperslabbing operations.
    By collecting the operations into a set, rather than performing
    them sequentially, advantages in speed can sometimes be gained. Next post will discuss two multislabbing algorithms.

     
    • Charlie Zender

      Charlie Zender - 2002-12-28

      Sometime about November 16, 2002, I sent Henry this:

      Hi Henry,

      I was trying to figure out your new algorithm and I couldn't
      so I wrote down my algorithm below. After doing that then I
      figured out what you were saying. So here first is my algorithm,
      followed by some comparisons between mine and yours.

      The following multislabs are legal:

      #1. -d lon,0,360,10 -d lon,1,360
      #2. -d lon,0,360,10 -d lon,1,360 -d lat,0,180,10 -d lat,90,180,5

      In #1 and #2, the indices grabbed should be the union of the indices
      specified in the individual -d commands.
      It is important to know in advance the size of the full extracted
      multi-slab so that memory can be allocated and it can be written
      into memory as it is read.
      The method I proposed for reading in a previous email does not rely
      on any complicated algorithms for obtaining indices of elements.
      The whole procedure is doable in a simple loop over the existing
      retrieval algorithm.
      What is needed is a new algorithm that maps the hyperslabs into
      the continuous buffer from which they will be output.
      Here is the generic request to get a multislab of an N-dimensional
      variable, which is to be hyperslabbed along two of its N dimensions.

      ncks -d dim1,lmt11 ... -d dim1,lmt12 -d dim2,lmt21 -ddim2,lmt22 in.nc out.nc

      The read loop reads 4 slabs into memory where the 4 slabs are

      [lmt11,lmt21] = slab1
      [lmt11,lmt22] = slab2
      [lmt12,lmt21] = slab3
      [lmt12,lmt22] = slab4

      Existing infracture can be used for this. The new algorithm
      must decide how to map these 4 slabs into the final, contiguous
      buffer which will be used for output.
      The contiguous output buffer is
      sizeof(union(slab1,slab2,slab3,slab4)).
      Thus the algorithm must loop over each of slabs 1--4 and, for
      each element of the slab, determine what its location in the
      final output buffer will be.
      The location in the final output buffer is uniquely specifiable
      as an N element vector.
      Computing the values of each element of this vector is the tricky part. 
      It will involve creating map vectors as is already done in
      var_avg and in prn_var_val_lmt.
      In base-10 counting map vectors are the digits offset, e.g.,
      [1,10,100,...] and a number is specified as a list of coefficients
      times the digits offset, so that 737 = 7*100+3*10+1*1.
      The output hyperslabs needs a similar base, [C1,C2,C3,...CN] where
      C1=count{dim1},C2=count{dim2},....
      Once the base [C1,C2,...CN] is known the problem reduces to determining
      the coefficients of each base element [i.e., 7,3,7].
      These coefficients are what I call the map vector, [M1,M2,...MN].
      The M_o (where _o denotes the index in the output file) can be
      determined for each element by counting the position of the current
      dim_i (where _i denotes the index in the input file) in the output slab. 
      Thus if the multislab specifies that K elements of dim_n should
      be extracted from the input file, then 0 <= M_o < K.
      These K elements are indexed M_i1,M_i2,M_i3,...M_ik,...M_iK in the
      input file.
      Each output index M_ok uniquely corresponds to an input index M_ik,
      and, in fact, M_ok=k (whereas M_ik is the k'th index of that dimension
      specified by the multi-hyperslab).

      For efficiency the algorithm can do this after each slab is read,
      map that slab into the output buffer, then free the memory for the
      slab before the next slab is read. It does not matter if a subsequent
      slab is read and some of its elements overlap (map directly onto)
      elements of a previous slab already stored in the output buffer.
      After all input slabs have been read and mapped to the output
      buffer, the output buffer will be completely full and ready to
      write to the output file (or to screen).
      If it's being written to an output file, it can be written with
      a single write command, since its dimensions and count vectors are
      known.

      I've read your algorithm twice and it will give the same results
      as the algorithm that I just described.
      The basic difference is that your algorithm ensures that each
      value is read from disk at most one time, whereas, if the
      user specifies overlapping hyperslabs, my algorithm will retrieve
      the same value from disk as many times as the user specifies.
      I understand the bookkeeping required for my algorithm better
      than I do the bookkeeping required for your algorithm.
      Probably the algorithm that requires the fewest disk reads
      is superior in a time-of-execution sense.
      It is clear that my algorithm requires as many reads as the
      product of the number of hyperslabs specified for each dimension,
      e.g., 1*1*2*1*2=4 reads for the above example.

      It is possible that your algorithm _could_ reduce the total number
      of reads if the hyperslabs specified by the user contain a
      completely redundant specification, e.g., -d dim1,1,2 -d dim1,1,1
      would be reduced to one read by your algorithm but would be
      two reads with my algorithm.
      I think the chances of a user actually specifying a redundant
      hyperslab are very small.
      In any case, we should weigh the difficulty of writing your
      nco_calc_indices() algorithm against the difficulty of writing a loop
      around the current ncks algorithm which simply uses the user-specified
      hyperslab limits (and already does the appropriate recomputations
      when a hyperslab is wrapped).

      > I'm not sure yet how to deal with wrapped hyperslabs, If they are on their
      > own then their treatment should be as is, If they are with other limits then
      > break them into two separate ranges ????

      That is what the current lmt_evl() imlementation does: What is
      specified as a single hypslab is automatically broken into two
      separate reads.

      I havn't really considered wrapped multi-slabs because they bend my
      brain. All I know is that the current infrastructure handles wrapping
      correctly, and I _think_ that my algorithm, which simply loops over
      the current infrastructure, might therefore behave correctly with
      wrapped hyperslabs but I'm not sure...

      > Have to make sure that nco_calc_indicies() can cope with repeated dims in a
      > var
      > e.g bigvar[ time,lat,lat,lon,lon] or other such stuff

      In all the time I've written scientific software for netCDF this
      situation has never arisen. Is this even allowed by the netCDF
      standard? There are good physical grounds for hoping that it never
      does arise, but I can see situations where an array dimensioned, e.g.,
      var[dim_lon,dim_lon] would be useful, e.g., for cross-correlations
      of two variables that are functions of longitude.
      I have a feeling ncwa would misbehave if it ever encountered such a
      variable, because it's algorithms are based on comparing names of
      dimensions in variables to names of dimensions in weights...
      In any case, both your algorithm and my algorithm could be altered
      to handle such situations, but I would not worry about such
      variables because they have never been encountered.

      > For printing what about keeping the existing routine and calling it with
      > each of the hyperslab args ?

      Yes, something like that.

      Would you please post the full archive of our private discussions
      of this subject and any future followups to NCO developers so we have
      an archive of this discussion that others can refer to?
      BTW Sourceforge appears to be down at the present moment.

      Thanks,
      Charlie

       
      • Charlie Zender

        Charlie Zender - 2003-01-04

        Hi Henry,

        I was trying to figure out your new algorithm and I couldn't
        so I wrote down my algorithm below. After doing that then I
        figured out what you were saying. So here first is my algorithm,
        followed by some comparisons between mine and yours.

        The following multislabs are legal:

        #1. -d lon,0,360,10 -d lon,1,360
        #2. -d lon,0,360,10 -d lon,1,360 -d lat,0,180,10 -d lat,90,180,5

        In #1 and #2, the indices grabbed should be the union of the indices
        specified in the individual -d commands.
        It is important to know in advance the size of the full extracted
        multi-slab so that memory can be allocated and it can be written
        into memory as it is read.
        The method I proposed for reading in a previous email does not rely
        on any complicated algorithms for obtaining indices of elements.
        The whole procedure is doable in a simple loop over the existing
        retrieval algorithm.
        What is needed is a new algorithm that maps the hyperslabs into
        the continuous buffer from which they will be output.
        Here is the generic request to get a multislab of an N-dimensional
        variable, which is to be hyperslabbed along two of its N dimensions.

        ncks -d dim1,lmt11 ... -d dim1,lmt12 -d dim2,lmt21 -ddim2,lmt22 in.nc out.nc

        The read loop reads 4 slabs into memory where the 4 slabs are

        [lmt11,lmt21] = slab1
        [lmt11,lmt22] = slab2
        [lmt12,lmt21] = slab3
        [lmt12,lmt22] = slab4

        Existing infracture can be used for this. The new algorithm
        must decide how to map these 4 slabs into the final, contiguous
        buffer which will be used for output.
        The contiguous output buffer is
        sizeof(union(slab1,slab2,slab3,slab4)).
        Thus the algorithm must loop over each of slabs 1--4 and, for
        each element of the slab, determine what its location in the
        final output buffer will be.
        The location in the final output buffer is uniquely specifiable
        as an N element vector.
        Computing the values of each element of this vector is the tricky part.
        It will involve creating map vectors as is already done in
        var_avg and in prn_var_val_lmt.
        In base-10 counting map vectors are the digits offset, e.g.,
        [1,10,100,...] and a number is specified as a list of coefficients
        times the digits offset, so that 737 = 7*100+3*10+1*1.
        The output hyperslabs needs a similar base, [C1,C2,C3,...CN] where
        C1=count{dim1},C2=count{dim2},....
        Once the base [C1,C2,...CN] is known the problem reduces to determining
        the coefficients of each base element [i.e., 7,3,7].
        These coefficients are what I call the map vector, [M1,M2,...MN].
        The M_o (where _o denotes the index in the output file) can be
        determined for each element by counting the position of the current
        dim_i (where _i denotes the index in the input file) in the output slab.
        Thus if the multislab specifies that K elements of dim_n should
        be extracted from the input file, then 0 <= M_o < K.
        These K elements are indexed M_i1,M_i2,M_i3,...M_ik,...M_iK in the
        input file.
        Each output index M_ok uniquely corresponds to an input index M_ik,
        and, in fact, M_ok=k (whereas M_ik is the k'th index of that dimension
        specified by the multi-hyperslab).

        For efficiency the algorithm can do this after each slab is read,
        map that slab into the output buffer, then free the memory for the
        slab before the next slab is read. It does not matter if a subsequent
        slab is read and some of its elements overlap (map directly onto)
        elements of a previous slab already stored in the output buffer.
        After all input slabs have been read and mapped to the output
        buffer, the output buffer will be completely full and ready to
        write to the output file (or to screen).
        If it's being written to an output file, it can be written with
        a single write command, since its dimensions and count vectors are
        known.

        I've read your algorithm twice and it will give the same results
        as the algorithm that I just described.
        The basic difference is that your algorithm ensures that each
        value is read from disk at most one time, whereas, if the
        user specifies overlapping hyperslabs, my algorithm will retrieve
        the same value from disk as many times as the user specifies.
        I understand the bookkeeping required for my algorithm better
        than I do the bookkeeping required for your algorithm.
        Probably the algorithm that requires the fewest disk reads
        is superior in a time-of-execution sense.
        It is clear that my algorithm requires as many reads as the
        product of the number of hyperslabs specified for each dimension,
        e.g., 1*1*2*1*2=4 reads for the above example.

        It is possible that your algorithm _could_ reduce the total number
        of reads if the hyperslabs specified by the user contain a
        completely redundant specification, e.g., -d dim1,1,2 -d dim1,1,1
        would be reduced to one read by your algorithm but would be
        two reads with my algorithm.
        I think the chances of a user actually specifying a redundant
        hyperslab are very small.
        In any case, we should weigh the difficulty of writing your
        nco_calc_indices() algorithm against the difficulty of writing a loop
        around the current ncks algorithm which simply uses the user-specified
        hyperslab limits (and already does the appropriate recomputations
        when a hyperslab is wrapped).

        > I'm not sure yet how to deal with wrapped hyperslabs, If they are on their
        > own then their treatment should be as is, If they are with other limits then
        > break them into two separate ranges ????

        That is what the current lmt_evl() imlementation does: What is
        specified as a single hypslab is automatically broken into two
        separate reads.

        I havn't really considered wrapped multi-slabs because they bend my
        brain. All I know is that the current infrastructure handles wrapping
        correctly, and I _think_ that my algorithm, which simply loops over
        the current infrastructure, might therefore behave correctly with
        wrapped hyperslabs but I'm not sure...

        > Have to make sure that nco_calc_indicies() can cope with repeated dims in a
        > var
        > e.g bigvar[ time,lat,lat,lon,lon] or other such stuff

        In all the time I've written scientific software for netCDF this
        situation has never arisen. Is this even allowed by the netCDF
        standard? There are good physical grounds for hoping that it never
        does arise, but I can see situations where an array dimensioned, e.g.,
        var[dim_lon,dim_lon] would be useful, e.g., for cross-correlations
        of two variables that are functions of longitude.
        I have a feeling ncwa would misbehave if it ever encountered such a
        variable, because it's algorithms are based on comparing names of
        dimensions in variables to names of dimensions in weights...
        In any case, both your algorithm and my algorithm could be altered
        to handle such situations, but I would not worry about such
        variables because they have never been encountered.

        > For printing what about keeping the existing routine and calling it with
        > each of the hyperslab args ?

        Yes, something like that.

        Would you please post the full archive of our private discussions
        of this subject and any future followups to NCO developers so we have
        an archive of this discussion that others can refer to?
        BTW Sourceforge appears to be down at the present moment.

        Thanks,
        Charlie

         
    • Charlie Zender

      Charlie Zender - 2003-01-04

      Hi Charlie,

      Have thought up a neat method of calculating the indices of dimensions with
      multiple hyperslabs.
      Basically we have a function nco_calc_indices(lmt_all lmt_a, long srt,long
      cnt,long srd, bool end)
      The function has static variables , and returns hyperslab indices untill
      bool end is true.

      E.g suppose we have ncks -d time,0,4,2 -d time,3,3,1 -v time_test in.nc
      foo.nc

      The first hyperslab returned is 0,2,2
      second hyperslab                     3,4,1       ( end is true )

      The function nco_calc_indices() keeps an index for each hyperslap argument
      And searches for the lowest index from all the hyperslab arguments, it also
      keeps track of the difference between sucessive values.
      Another example

      ncks -d time 0,6,2 -d time 1,5,2

      The single hyperslab returned is  0,6,1      (end is true)

      Other combinations may not be so neat
      ncks -d time,0,10,3 ncks -d time 0,10,4

      First hyperslab --- 0,0,1
      second              --- 3,4,1
      third                      6,6,1
      four                       8,9,1        (end is true)

      I'm not sure yet how to deal with wrapped hyperslabs, If they are on their
      own then their treatment should be as is, If they are with other limits then
      break them into two separate ranges ????

      Have to make sure that nco_calc_indicies() can cope with repeated dims in a
      var
      e.g bigvar[ time,lat,lat,lon,lon] or other such stuff

      For printing what about keeping the existing routine and calling it with
      each of the hyperslab args ?

      Regards Henry

      Thanks for looking into this. I agree with your analysis in general.
      Specific comments below.

      Henry Butowsky wrote:

      > Hi Charlie,
      >
      > Have had a look at implementing hyperslab args with multiple args for a
      > single
      > dimension.
      >
      > e,g. ncks -d time,1,5,2 -d time,7,9 -d lon,90,180 in.nc foo.nc
      >
      > The task is certainly non-trivial. ncks is like the jewel in the crown of
      > the nco software, so the mods should not compromise existing

      functionality.

      >
      > I envisige the following
      >
      > Use nco_lmt_evl() to create multiple lmt_sct for each dimension and hold
      > them in
      > the following structure :-
      >
      > struct lmt_all {
      > char *dmn_nm; /* dimension name */
      > int dmn_cnt; /* total number of hyperslabs to extract */
      > int lmt_nbr; /* number of lmt args */
      > lmt_sct lmt*;
      > }
      >
      > Hold all lmt_sct for each dimension in lmt_all.

      Right, something like this is mandatory to solve the problem.

      >
      > We now require a function which can calculate the total number of

      hyberslabs

      > to extract
      > for each dimesion - Bear in mind that hyperslabs in lmt_all may overlap

      e.g

      > ncks -d time,1,6 -d time,3,10,4.
      > There are two types of hyperlab args those with stride and those with

      simply

      > a min and max index.
      >
      > With dmn_cnt defined we can call nco_cpy_var_dfn_lmt() to define the
      > variable and the
      > modified dimensions in the output.
      >
      >
      > If the variable contains only dimensions with O or 1 hyperslab args then

      use

      > the
      > existing nco_cpy_var_val_lmt() function to read/write the variable.

      Yes, nco_cpy_var_val_lmt() is already very complicated, and adding
      the ability to handle lists of dimensions should be done in a new
      routine.

      >
      >
      > Writing var with multiple hyperslap indices.
      > nco_cpy_var_val_lmt_all()
      >
      > Because of the possiblility of overlapping or multiple indices we cannot

      use

      > nco_get_varm () and nco_put_vara() to read/write.
      >
      > We have to read the whole variable into memory and manually loop though

      each

      > hyperslap.

      This conclusion should be re-examined because reading entire
      variables into memory should be avoided for large datasets.
      I envisage that reads could/should be done in a loop requiring
      the minimal number, which would be the product of the number of
      limits specified for each dimension, so if only one limit specified
      for each dimension, it should be done in one read, as currently,
      but if limits per dimension were, say, (1,1,2,1,3) for a five
      dimensional variable then it would require 1*1*2*1*3=6 reads.
      Each read does essentially what is done right now by ncks.

      The problem is knowing where to store the results in a contin
      but that's a problem that has to be tackled regardless of the read-in
      method. The algorithm to solve it would be similar to what is done in
      var_avg() already.

      >
      > We need a function: bool is_valid_index(lmt_all *lmt_1, int index)
      > which returns true if the index is in the hyperslab. ( this is fairly

      crude

      > and more
      > sophisticated optimizations may be possible by merging hyperslab info ,
      > or if the dimension is fairly small e.g < 1000 , producing a complete

      table

      > of valid
      > indices ) ( c.f What is the upper limit on a dimension ?)
      > I guess some kind of clever recursive function is called for. !
      > For a large multi-dimension variable there is gonna be a big performance

      hit

      > !

      I think all of that can be avoided.

      >
      > I have not yet thought about the special treatment of the record dimension
      > or the
      > printing of these hyperslabs.

      Yes, printing would be a real bear.

      I gather from today's message that you have had an inspired idea on
      how this all might be done in a cleaner way. I am interested to hear
      more.

      Thanks,
      Charlie

      >
      > Regards Henry
      >
      > .
      >

       
    • Charlie Zender

      Charlie Zender - 2003-01-04

      Hi Charlie,
      I have come up with a multi-slabbing algorithm that is a combination of mine
      and your's.

      My algorithim will be very inefficient for things like

      #1. -d lon,0,360,10 -d lon,0,360,6

      where the hyperslabs are interleaved. Your algorithm is superior when their
      are only a few overlapping indices.
      I could not fully comprehend your "book-keeping" ---- But realised that  can
      I use my nco_calc_indices() function to merge the output hyperlab info for
      each dimension

      We can also use nco_calc_indices to find the size of the output dimensions ,
      so that we can define the variables before writing out the data

      FYI
      I have noticed that tha ncks doesn't handle correctly hyperslabs with two or
      more wrapped hyperslabs

      Do you want me to preceed with coding ?
      I thing the entire project will require less than 2000 lines of code. :-)

      Regards Henry

       
    • Charlie Zender

      Charlie Zender - 2003-01-04

      Hi Henry,

      I was trying to figure out your new algorithm and I couldn't
      so I wrote down my algorithm below. After doing that then I
      figured out what you were saying. So here first is my algorithm,
      followed by some comparisons between mine and yours.

      The following multislabs are legal:

      #1. -d lon,0,360,10 -d lon,1,360
      #2. -d lon,0,360,10 -d lon,1,360 -d lat,0,180,10 -d lat,90,180,5

      In #1 and #2, the indices grabbed should be the union of the indices
      specified in the individual -d commands.
      It is important to know in advance the size of the full extracted
      multi-slab so that memory can be allocated and it can be written
      into memory as it is read.
      The method I proposed for reading in a previous email does not rely
      on any complicated algorithms for obtaining indices of elements.
      The whole procedure is doable in a simple loop over the existing
      retrieval algorithm.
      What is needed is a new algorithm that maps the hyperslabs into
      the continuous buffer from which they will be output.
      Here is the generic request to get a multislab of an N-dimensional
      variable, which is to be hyperslabbed along two of its N dimensions.

      ncks -d dim1,lmt11 ... -d dim1,lmt12 -d dim2,lmt21 -ddim2,lmt22 in.nc out.nc

      The read loop reads 4 slabs into memory where the 4 slabs are

      [lmt11,lmt21] = slab1
      [lmt11,lmt22] = slab2
      [lmt12,lmt21] = slab3
      [lmt12,lmt22] = slab4

      Existing infracture can be used for this. The new algorithm
      must decide how to map these 4 slabs into the final, contiguous
      buffer which will be used for output.
      The contiguous output buffer is
      sizeof(union(slab1,slab2,slab3,slab4)).
      Thus the algorithm must loop over each of slabs 1--4 and, for
      each element of the slab, determine what its location in the
      final output buffer will be.
      The location in the final output buffer is uniquely specifiable
      as an N element vector.
      Computing the values of each element of this vector is the tricky part.
      It will involve creating map vectors as is already done in
      var_avg and in prn_var_val_lmt.
      In base-10 counting map vectors are the digits offset, e.g.,
      [1,10,100,...] and a number is specified as a list of coefficients
      times the digits offset, so that 737 = 7*100+3*10+1*1.
      The output hyperslabs needs a similar base, [C1,C2,C3,...CN] where
      C1=count{dim1},C2=count{dim2},....
      Once the base [C1,C2,...CN] is known the problem reduces to determining
      the coefficients of each base element [i.e., 7,3,7].
      These coefficients are what I call the map vector, [M1,M2,...MN].
      The M_o (where _o denotes the index in the output file) can be
      determined for each element by counting the position of the current
      dim_i (where _i denotes the index in the input file) in the output slab.
      Thus if the multislab specifies that K elements of dim_n should
      be extracted from the input file, then 0 <= M_o < K.
      These K elements are indexed M_i1,M_i2,M_i3,...M_ik,...M_iK in the
      input file.
      Each output index M_ok uniquely corresponds to an input index M_ik,
      and, in fact, M_ok=k (whereas M_ik is the k'th index of that dimension
      specified by the multi-hyperslab).

      For efficiency the algorithm can do this after each slab is read,
      map that slab into the output buffer, then free the memory for the
      slab before the next slab is read. It does not matter if a subsequent
      slab is read and some of its elements overlap (map directly onto)
      elements of a previous slab already stored in the output buffer.
      After all input slabs have been read and mapped to the output
      buffer, the output buffer will be completely full and ready to
      write to the output file (or to screen).
      If it's being written to an output file, it can be written with
      a single write command, since its dimensions and count vectors are
      known.

      I've read your algorithm twice and it will give the same results
      as the algorithm that I just described.
      The basic difference is that your algorithm ensures that each
      value is read from disk at most one time, whereas, if the
      user specifies overlapping hyperslabs, my algorithm will retrieve
      the same value from disk as many times as the user specifies.
      I understand the bookkeeping required for my algorithm better
      than I do the bookkeeping required for your algorithm.
      Probably the algorithm that requires the fewest disk reads
      is superior in a time-of-execution sense.
      It is clear that my algorithm requires as many reads as the
      product of the number of hyperslabs specified for each dimension,
      e.g., 1*1*2*1*2=4 reads for the above example.

      It is possible that your algorithm _could_ reduce the total number
      of reads if the hyperslabs specified by the user contain a
      completely redundant specification, e.g., -d dim1,1,2 -d dim1,1,1
      would be reduced to one read by your algorithm but would be
      two reads with my algorithm.
      I think the chances of a user actually specifying a redundant
      hyperslab are very small.
      In any case, we should weigh the difficulty of writing your
      nco_calc_indices() algorithm against the difficulty of writing a loop
      around the current ncks algorithm which simply uses the user-specified
      hyperslab limits (and already does the appropriate recomputations
      when a hyperslab is wrapped).

      > I'm not sure yet how to deal with wrapped hyperslabs, If they are on their
      > own then their treatment should be as is, If they are with other limits then
      > break them into two separate ranges ????

      That is what the current lmt_evl() imlementation does: What is
      specified as a single hypslab is automatically broken into two
      separate reads.

      I havn't really considered wrapped multi-slabs because they bend my
      brain. All I know is that the current infrastructure handles wrapping
      correctly, and I _think_ that my algorithm, which simply loops over
      the current infrastructure, might therefore behave correctly with
      wrapped hyperslabs but I'm not sure...

      > Have to make sure that nco_calc_indicies() can cope with repeated dims in a
      > var
      > e.g bigvar[ time,lat,lat,lon,lon] or other such stuff

      In all the time I've written scientific software for netCDF this
      situation has never arisen. Is this even allowed by the netCDF
      standard? There are good physical grounds for hoping that it never
      does arise, but I can see situations where an array dimensioned, e.g.,
      var[dim_lon,dim_lon] would be useful, e.g., for cross-correlations
      of two variables that are functions of longitude.
      I have a feeling ncwa would misbehave if it ever encountered such a
      variable, because it's algorithms are based on comparing names of
      dimensions in variables to names of dimensions in weights...
      In any case, both your algorithm and my algorithm could be altered
      to handle such situations, but I would not worry about such
      variables because they have never been encountered.

      > For printing what about keeping the existing routine and calling it with
      > each of the hyperslab args ?

      Yes, something like that.

      Would you please post the full archive of our private discussions
      of this subject and any future followups to NCO developers so we have
      an archive of this discussion that others can refer to?
      BTW Sourceforge appears to be down at the present moment.

      Thanks,
      Charlie

       
    • henry Butowsky

      henry Butowsky - 2003-01-06

      Hi Charlie ,
      below is an explanation of how nco_msa_rec_clc() works.        nco_msa_rec_clc
              ---------------

      void *
      nco_msa_rec_clc /* [fnc] Multi slab algorithm (recursive routine, returns a single slab pointer */
      (int dpt_crr, /* current depth, we start at 0 */
      int dpt_crr_max, /* maximium depth (i.e the number of dims in variable (does not change)*/         
      lmt_sct **lmt, /* limits of the current hyperslabs these change as we recurse */
      lmt_all **lmt_lst, /* list of limits in each dimension (this remains STATIC as we recurse) */
      var_sct *vara) /* Info for routine to read var info and pass info between calls */

      Consider the example  

      ncks -v three_dmn_var_dbl  -d time,1,3,1 -d time,2,8,2 -d lon,3,1,1 in.nc foo3.nc

      where

      three_dmn_var_dbl(time, lat,lon)=
                       1, 2, 3, 4, 5, 6, 7, 8,
                       9,10,11,12,13,14,15,16,
                      17,18,19,20,21,22,23,24,
                      -99,-99,-99,-99,-99,-99,-99,-99,
                      33,34,35,36,37,38,39,40,
                      41,42,43,44,45,46,47,48,
                      49,50,51,52,53,54,55,56,
                      -99,58,59,60,61,62,63,64,
                      65,66,67,68,69,70,71,72,
                      -99,74,75,76,77,78,79,-99;

      In nco_cpy_var_val_mlt_lmt() , preparations are made for the call to
      call nco_msa_rec_clc()

      lmt_mult[] is populated with the approriate dimension limits.

      lmt_mult[0] ---- time limits {1,3,1} {2,8,2}   
      lmt_mult[1] ---- lat limit   {0,1,1}        ( This is the same as input dim )        
      lmt_mult[2] ---- lon limit   {3,1,1}        ( A  wrapped limit )

      Now call recursive routine :-
      void_ptr=nco_msa_rec_clc(0,nbr_dim,lmt,lmt_mult,&vara);

      in this case nbr_dmn=3, lmt= (0,0,0) .

      TIME LIMIT 
      Since lmt_lst[0]->lmt_dmn_nbr == 2 (or lmt_mult[0]->lmt_dmn_nbr==2 ) we move
      to the hyperslab section of the function.
      since we have 2 limits we require 2 hyperslabs. So we make the following calls to
      get them:-

      cp_wrp[0] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {1,3,1},0,0 )
      cp_wrp[1] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {2,8,2},0,0 )

      We now have two strands. Lets follow cp_wrp[0]

      LAT LIMIT
      Since the depth = 1 we are considering the lat limit.
      In this case lmt_lst[1]->lmt_dmn_nbr ==1 (lmt_mult[1]->lmt_dmn_nbr ==1)
      vp=nco_msa_rec_clc(2,3,lmt,lmt_lst,vara)    (lmt=  {1,3,1},{0,1,1}, 0 )
      There is no branching or wrapping here, we simply add the lat limit to the list and move on

      cp_wrp[0] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {1,3,1},0,0 )
      cp_wrp[1] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {2,8,2},0,0 )

      LON LIMIT
      depth = 2 so we are at lon limit
      This is a wrapped dimension . It broken into two pieces
      {3,1,1} ->  {3,3,1}  and {0,1,1} 
      cp_wrp[0] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={1,3,1},{0,1,1},{3,3,1} )
      cp_wrp[1] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={1,3,1},{0,1,1},{0,1,1} )

      READ SECTION
      Since depth =3 , this means lmt is completly populated and so hyperslabs are read from the
      file and pointers returned. This read section is called 4 times.

      Below are all the calls to nco_msa_rec_clc in the order they are called.

      void_ptr=nco_msa_rec_clc(0,3,lmt,lmt_mult,&vara) (lmt= 0,0,0)

          cp_wrp[0] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {1,3,1},0,0 )
              vp=nco_msa_rec_clc(2,3,lmt,lmt_lst,vara)    (lmt=  {1,3,1},{0,1,1}, 0 )
                  cp_wrp[0] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={1,3,1},{0,1,1},{3,3,1} ) READ
                  cp_wrp[1] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={1,3,1},{0,1,1},{0,1,1} ) READ

          cp_wrp[1] = nco_msa_rec_clc(1,3,lmt,lmt_lst,vara)  ( lmt = {2,8,2},0,0 )
              vp=nco_msa_rec_clc(2,3,lmt,lmt_lst,vara)    (lmt=  {2,8,2},{0,1,1}, 0 )
                  cp_wrp[0] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={2,8,2},{0,1,1},{3,3,1} ) READ
                  cp_wrp[1] = nco_msa_rec_clc(3,3,lmt,lmt_lst,vara) ( lmt={2,8,2},{0,1,1},{0,1,1} ) READ

      Merging Hyperlabs
      -----------------
      In the above example, at depth 0 we need to merge the two TIME hyperslabs. The function nco_msa_clc_idx() Provides
      us with the recipe to merge the hyperslabs:- This is

      slb_nbr=0 srt=1 end=3 cnt=3 srd=1
      slb_nbr=1 srt=4 end=8 cnt=3 srd=2

      Because the depth is 0 we simply concatenate the two slabs:

        12, 9, 10,
        16, 13, 14,
        20, 17, 18,
        24, 21, 22,
        -99, -99, -99,
        -99, -99, -99,

      and

        36, 33, 34,
        40, 37, 38,
        52, 49, 50,
        56, 53, 54,
        68, 65, 66,
        72, 69, 70 ;

      Consider a more complex example e.g  -d time,0,,2 -d time,0,3,5 -d time,3,,3 in.nc foo3.nc
      The recipe is now:-

      slb_nbr=0 srt=0 end=2 cnt=2 srd=2
      slb_nbr=2 srt=3 end=3 cnt=1 srd=1
      slb_nbr=0 srt=4 end=8 cnt=3 srd=2
      slb_nbr=2 srt=9 end=9 cnt=1 srd=1

      ( Note that because of overlapping the 3rd slab isn't used)

      Interleaved slabs with large dims can rapidly expand the recipe. e.g -d time,,,2 -d time,,,3

      slb_nbr=0 srt=0 end=2 cnt=2 srd=2
      slb_nbr=1 srt=3 end=3 cnt=1 srd=1
      slb_nbr=0 srt=4 end=8 cnt=3 srd=2
      slb_nbr=1 srt=9 end=9 cnt=1 srd=1

      ---------------------------------------------------------------------------------------------------------

      Lets look at the nco_msa_rec_clc call structure for:-

      ncks -v four_dmn_rec_var -d time,9,2,1 -d lev,0,0,1 -d lev,2,2,1 -d lon,2,3,1 in.nc foo3.nc

      ( float four_dmn_rec_var(time,lat,lev,lon); )

      void_ptr=nco_msa_rec_clc(0,4,...) (lmt= 0,0,0,0)

      cp_wrp[0]=nco_msa_rec_clc(1,4,..)                       lmt = {9,9,1},0,0,0
          vp =nco_msa_rec_clc(2,4,..)                     lmt = {9,9,1},{0,1,1},0,0
              cp_wrp[0]= nco_msa_rec_clc(3,4,...)     lmt = {9,9,1},{0,1,1},{0,0,1},0
                  vp = nco_msa_rec_clc(4,4,...)   lmt = {9,9,1},{0,1,1},{0,0,1},{2,3,1} READ
              cp_wrp[1]= nco_msa_rec_clc(3,4,...)     lmt = {9,9,1},{0,1,1},{2,2,1},0                
                  vp = nco_msa_rec_clc(4,4,...)   lmt = {9,9,1},{0,1,1},{2,2,1},{2,3,1} READ                
      cp_wrp[1]=nco_msa_rec_clc(1,4,..)                       lmt=  {0,2,1},0,0,0
          vp =nco_msa_rec_clc(2,4,..)                     lmt = {0,2,1},{0,1,1},0,0
              cp_wrp[0]= nco_msa_rec_clc(3,4,...)     lmt = {0,2,1},{0,1,1},{0,0,1},0
                  vp = nco_msa_rec_clc(4,4,...)   lmt = {0,2,1},{0,1,1},{0,0,1},{2,3,1} READ
              cp_wrp[1]= nco_msa_rec_clc(3,4,...)     lmt = {9,9,1},{0,1,1},{2,2,1},0                
                  vp = nco_msa_rec_clc(4,4,...)   lmt = {0,2,1},{0,1,1},{2,2,1},{2,3,1} READ                

      The number of reads a var requires is simply the multiple of the number of limits for each dimension.

      I hope this explanation is usefull

       
    • Nobody/Anonymous

      Hi Charlie,
      The print routine in ncks needs re-writing to take account of multi-slabbing etc -- Do you want to tackle it or shall I ? It is quite a complex routine and has many inputs.

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-01-22

        Hi Henry,

        Yes, please!  It would be great if you would do this.
        Let's keep prn_var_val_lmt() as is and use it as
        the basis for prn_var_val_msa() and then switch
        to the latter when you're done with it.

        Thanks,
        Charlie

         
    • henry Butowsky

      henry Butowsky - 2003-02-03

      Hi Charlie,
      Have started thinking about the printing problem.
      I am at a loss with the variable indicies.

      i.e
      ncks -H  -C  -v three_dmn_var_dbl  -d time,0,1  -d lon,0,,2 in.nc
      time[0]=1 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[0]=1
      time[0]=1 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[1]=3
      time[0]=1 lat[1]=90 lon[0]=0 three_dmn_var_dbl[2]=5
      time[0]=1 lat[1]=90 lon[2]=180 three_dmn_var_dbl[3]=7
      time[1]=2 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[4]=9
      time[1]=2 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[5]=11
      time[1]=2 lat[1]=90 lon[0]=0 three_dmn_var_dbl[6]=13
      time[1]=2 lat[1]=90 lon[2]=180 three_dmn_var_dbl[7]=15

      What do the variable indices( 0-7 ) refer to ? since lon has stride ,
      these values are not coninuous in the input file ?

      Another example ?
      ncks -H  -C  -v three_dmn_var_dbl  -d time,8,9  -d lat,0,0 -d lon,0,,2 in.nc
      time[8]=9 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[64]=65
      time[8]=9 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[65]=67
      time[9]=10 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[66]=-99
      time[9]=10 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[67]=75

      Obviously my routine needs to backwardly compatible ?

      In order to get the correct indices  for the dimensions I shall add a
      "mapping" array to the lmt_all structure. and a routine nco_msa_org_inx()
      to populate this array (what fun!)

      /* Container for limit structures */
      typedef struct {
      char *dmn_nm;        /* dimension name */
      long dmn_cnt;       /* total number of hyperslabs to extract */
      long *dmn_map       /* will be initialized to size dmn_cnt   */
                            /* dmn_map[i] will contain original index */
      int lmt_dmn_nbr;     /* number of lmt args */
      lmt_sct **lmt_dmn;   /* list of limit structures associated with each dimension */
      } lmt_all

      Have you informed Al Franz of my new algorithm ? It would be nice for it to
      checked out with real data ? If you want I can e-mail him !

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-02-03

        > Have started thinking about the printing problem.
        > I am at a loss with the variable indicies.
        >
        > i.e
        > ncks -H  -C  -v three_dmn_var_dbl  -d time,0,1  -d lon,0,,2 in.nc
        > time[0]=1 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[0]=1
        > time[0]=1 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[1]=3
        > time[0]=1 lat[1]=90 lon[0]=0 three_dmn_var_dbl[2]=5
        > time[0]=1 lat[1]=90 lon[2]=180 three_dmn_var_dbl[3]=7
        > time[1]=2 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[4]=9
        > time[1]=2 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[5]=11
        > time[1]=2 lat[1]=90 lon[0]=0 three_dmn_var_dbl[6]=13
        > time[1]=2 lat[1]=90 lon[2]=180 three_dmn_var_dbl[7]=15
        >
        > What do the variable indices( 0-7 ) refer to ? since lon has stride ,
        > these values are not coninuous in the input file ?

        > Another example ?
        > ncks -H  -C  -v three_dmn_var_dbl  -d time,8,9  -d lat,0,0 -d lon,0,,2 in.nc
        > time[8]=9 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[64]=65
        > time[8]=9 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[65]=67
        > time[9]=10 lat[0]=-90 lon[0]=0 three_dmn_var_dbl[66]=-99
        > time[9]=10 lat[0]=-90 lon[2]=180 three_dmn_var_dbl[67]=75

        ncks index printing only works as you expect it to when strides and
        wraps are not involved. The values printed are correct but the element
        number (the subscript on the variable whose values are being dumped)
        is an artifact of the simplicity of the current printing algorithm.
        I think the algorithm you are proposing has all the information
        necessary to print exactly what you want it to even when wraps and
        strides are involved.

        The question is, what should the algorithm print?
        It could either print the element number in the input file,
        or the element number in the hyperslab (output file).
        The dimension subscripts do refer to the offset in the input file,
        so to be most consistent, the element subscript should also refer
        to the element number in the output file.
        If the user wishes to know the element number in the hyperslab,
        they can simply run ncks -H on the output file.

        > Obviously my routine needs to backwardly compatible ?

        Backwards compatibility is not necessary in this case for a variety
        of reasons. Your new algorithm will be more flexible, capable, and
        consistent than the old algorithm, which should just go away
        eventually.

        > In order to get the correct indices  for the dimensions I shall add a
        > "mapping" array to the lmt_all structure. and a routine nco_msa_org_inx()
        > to populate this array (what fun!)
        >
        >
        > /* Container for limit structures */
        > typedef struct {
        >  char *dmn_nm;        /* dimension name */
        >  long dmn_cnt;       /* total number of hyperslabs to extract */
        >  long *dmn_map       /* will be initialized to size dmn_cnt   */
        >                       /* dmn_map[i] will contain original index */
        >  int lmt_dmn_nbr;     /* number of lmt args */
        >  lmt_sct **lmt_dmn;   /* list of limit structures associated with each dimension
        > */
        > } lmt_all

        Sounds good.

        > Have you informed Al Franz of my new algorithm ? It would be nice for it to
        > checked out with real data ? If you want I can e-mail him !

        Yes, I think you should e-mail him so he can test your stuff out
        and so he does not waste his time making an obsolete NCO contribution.

        Thanks,
        Charlie

         
    • henry Butowsky

      henry Butowsky - 2003-02-04

      hi Charlie,
      Have you got Al Franz's email ? My computer had a "meltdown" and I  lost all my old e-mails.
      I am a bit confused - you want the element subscript for the vars to be based on the output file -even though the dim subscripts are based on the input file ?

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-02-04

        Hi Henry,

        You misunderstood me. I think the indices from the
        input file should be printed by default. It might be
        nice to have a (long) option to print indices from
        the ouput hyperslab, but not necessary. Here is
        the relevant portion of what I said:

        The question is, what should the algorithm print?
        It could either print the element number in the input file,
        or the element number in the hyperslab (output file).
        The dimension subscripts do refer to the offset in the input file,
        so to be most consistent, the element subscript should also refer
        to the element number in the output file.
        If the user wishes to know the element number in the hyperslab,
        they can simply run ncks -H on the output file.

        So, despite what ncks does now, I think printing
        the element numbers from the input file is the way
        to go.

        Al Franz is catburglar at llnl dot gov

        Regards,
        Charlie

         
    • henry Butowsky

      henry Butowsky - 2003-02-18

      Hi Charlie,
      Have commited a replacement for nco_prn_var_val_lmt , its in
      nco_msa.c and is nco_msa_prn_var_val.
      Beware its still Beta !

      At the moment it doesn't do anything special with NC_CHAR variables.
      It would be great if you could summarise for me, what it should be doing. I look at the TODO list then your old code and I am lost !

      regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-02-21

        Hi Henry,

        I read through the TODO and found some contradictory
        statements concerning NC_CHAR which I attempted to
        reconcile. Basically, we want to ncks to be able to
        print out strings stored as NC_CHAR's. Since a lot
        of people NUL-terminate their strings, we want to take
        that as an indication that the string is ended, and not
        print and ugly stuff in the remainder of the variable.
        Since netCDF does not support a true string type,
        many files are forced to declare dimensions for NC_CHAR variables that are waaaaay too long,
        just to be sure they have enough room to store whatever comes along at runtime.

        The existing algorithm tries to do all this.
        However, I think you will find that it also tries
        to print the offset relative to the begining of the
        string, rather than the beginning of the variable.
        Imagine a 2-D NC_CHAR variable, e.g., a list of
        filenames. Ask yourself what you as a user would
        like to see if you did ncks -H on the file. I think you
        will agree that the answer is slightly different than
        what you would want to see for a two dimensional
        numeric array. This has been my thinking anyway.

        I know I havn't written down an exact policy here,
        but I do not want to contrain you too much.
        Try and stay backwards compatible where possible,
        but if it does not make sense, then break compatibility. I'm going away for the weekend soon.
        If you find any specific examples of problematic
        NC_CHAR printing, you might post them for discussion and votes as to what the users would prefer.

        Thanks!
        Charlie

         
        • Charlie Zender

          Charlie Zender - 2003-02-24

          Hi Henry,

          In preparation for 2.7.2, which will mainly be your
          ncks printing mods, would you please clean up the
          compiler warning from -ansi -pedantic?

          Thanks!
          Charlie

          gcc -std=c99 -ansi -pedantic -D_BSD_SOURCE  -DLINUX -DNO_NETCDF_2 -DVERSION='20030224' -DHOSTNAME='dust.ps.uci.edu' -DUSER='zender'  -DHAVE_GETOPT_H -DHAVE_MKSTEMP -DHAVE_STRCASECMP -DHAVE_STRDUP -Di386 -I./ -I../src/nco -I/usr/local/include -U_OPENMP -Wall -O -c ../src/nco/nco_msa.c -o /home/zender/obj/LINUX/nco_msa.o
          ../src/nco/nco_msa.c: In function `nco_msa_wrp_splt':
          ../src/nco/nco_msa.c:394: warning: `index' might be used uninitialized in this function
          ../src/nco/nco_msa.c: In function `nco_msa_prn_var_val':
          ../src/nco/nco_msa.c:667: warning: assignment discards qualifiers from pointer target type
          ../src/nco/nco_msa.c:681: warning: assignment of read-only location
          ../src/nco/nco_msa.c:683: warning: implicit declaration of function `sng_ascii_trn'
          ../src/nco/nco_msa.c:732: warning: implicit declaration of function `nco_typ_fmt_sng'
          ../src/nco/nco_msa.c:732: warning: format argument is not a pointer (arg 3)
          ../src/nco/nco_msa.c:739: warning: format argument is not a pointer (arg 3)
          ../src/nco/nco_msa.c:800: warning: implicit declaration of function `cast_void_nctype'
          ../src/nco/nco_msa.c:842: warning: format argument is not a pointer (arg 3)
          ../src/nco/nco_msa.c:865: warning: format argument is not a pointer (arg 3)
          ar rv /home/zender/lib/LINUX/libnco.a /home/zender/obj/LINUX/nco_msa.o

           
    • henry Butowsky

      henry Butowsky - 2003-02-25

      Hi Charlie,
      Have done a tidy up of nco_msa.c. 
      I have added some basic string handling in nxo_msa_prn_var_val .

      For string NOT null terminated I was thinking of displaying the actual escape sequences rather than printing the char. i.e will ned a function sng_trn_asci

      regards Henryi

       
      • Charlie Zender

        Charlie Zender - 2003-03-05

        Thanks. I also did some more tidying to make the builds cleaner.
        It now builds as clean as possible on IRIX, AIX, and LINUX.

        >For string NOT null terminated I was thinking of displaying the
        >actual escape sequences rather than printing the char. i.e will ned
        >a function sng_trn_asci

        We should be careful here. If the variable is type NC_CHAR, then I
        would assume that the information content of the variable is strings,
        i.e., words. Fortran is an example of a language with no NUL
        terminator in its strings, so Fortran apllications usually do not end
        their strings with NUL. Instead is garbage after the last valid
        character. Without NUL, there is no way to know the intended last
        valid character in a NC_CHAR variable (although you can guess that
        atoi(*cp) > 127 signifies you're in garbage-land.

        Here is the current ouput of ncks -H on the character and byte variables.
        I inserted some comments on formatting...

        zender@dust:~/nco/data$ ncks -C -H -v char_var,char_var_space,char_var_nul,char_var_multinul,fl_nm,fl_nm_arr,non_nul_trm_char_one_dim,non_nul_trm_char_two_dim,byte_var,one_dmn_rec_var_sng,two_dmn_rec_var_sng
        ~/nco/data/in.nc 
        byte_var = 122
        char_var='z'
        I like that non-NUL-terminated characters are enclosed in '' rather
        than "". To me '' means non-string, and "" means string (i.e.,
        NUL-terminated).
        lev(0)=100 char_var_multinul(0--2)="
        "
        Parentheses for index subscripts should be square brackets [] when
        they adhere to the C (0-based) formatting convention. If user
        specifies --fortran then offset indices by +1 and use (). This problem
        occurs in many of the following examples 

        char_var_nul=''
        I think char_var_nul is a valid 0-character string and so should be printed ""
        char_var_space=' '
        char_dmn_lng(0) fl_nm(0--28)="/home/zender/nco/data/in.cdl"

        fl_dim(0)=a char_dmn_lng(0) fl_nm_arr(0--41)="/data/zender/dstccm04/dstccm04_8589_01.nc"
        fl_dim(1)=b char_dmn_lng(0) fl_nm_arr(0--41)="/data/zender/dstccm04/dstccm04_8589_02.nc"
        fl_dim(2)=3 char_dmn_lng(0) fl_nm_arr(0--41)="/data/zender/dstccm04/dstccm04_8589_03.nc"
        Are you happy with this? Should fl_nm_arr use the total offsets or just the string length?
        I know we discussed this earlier, but what do you think is more useful?
        I think I like the following better:
        fl_dim[0]=a char_dmn_lng[0--41] fl_nm_arr[0--41]="/data/zender/dstccm04/dstccm04_8589_01.nc"
        fl_dim[1]=b char_dmn_lng[0--41] fl_nm_arr[42--81]="/data/zender/dstccm04/dstccm04_8589_02.nc"

        char_dmn_sml(0) non_nul_trm_char_one_dim(0--2)="ab"
        I would prefer that this be 'ab' (unless somehow netCDF wrote a NUL
        terminator into position 3, e.g., when it pre-filled the variable)

        fl_dim(0)=a char_dmn_sml(0) DMN SIZE= 4
        non_nul_trm_char_two_dim(0--3)=abcd
        fl_dim(1)=b char_dmn_sml(0) DMN SIZE= 4
        non_nul_trm_char_two_dim(0--3)=efgh
        fl_dim(2)=3 char_dmn_sml(0) DMN SIZE= 4
        non_nul_trm_char_two_dim(0--3)=ijkm
        I think these should have single quotes, e.g., 'abcd'
        Also, what's with the "DMN SIZE = X" portion?
        Do you intend to keep that?
        I like the following output format better
        fl_dim[0]=a char_dmn_sml[0--3] non_nul_trm_char_two_dim[0--3]='abcd'
        fl_dim[1]=a char_dmn_sml[0--3] non_nul_trm_char_two_dim[4--7]='efgh'
        What do you think?

        time(0)=1 one_dmn_rec_var_sng(0--9)="Hello Wor"

        time(0)=1 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=abc
        time(1)=2 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=bcd
        time(2)=3 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=cde
        time(3)=4 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=def
        time(4)=5 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=efg
        time(5)=6 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=fgh
        time(6)=7 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=ghi
        time(7)=8 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=hij
        time(8)=9 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=jkl
        time(9)=10 lev(0)=100 DMN SIZE= 3
        two_dmn_rec_var_sng(0--2)=klm
        Same comments as above, I think I would prefer, e.g.,
        time[1]=2 lev[0]=100 two_dmn_rec_var_sng(3--5)='bcd'

        Charlie

         
        • Nobody/Anonymous

          Hi Charlie,
          II've digested your comments. - I've got quite a bit of paid work on at the moment and shall not get a chance to do the mods untill possibly the end of next week. The printing of "DMN_SIZE" is just for debugging and can safely be removed.

          Regards Henry

           
    • henry Butowsky

      henry Butowsky - 2003-03-14

      Hi Charlie,
      Have sorted out the FORTRAN/C indices printing. Sorry for the mix up.
      Non strings are now enclosed within ' ' ,
      Dimensionless nulls and variables hyperslabed to null are  printed   ""

      Non strings indices are offests from lmn i.el_dim[0]=a char_dmn_sml[0--3] non_nul_trm_char_two_dim[0--3]='abcd'
      fl_dim[1]=a char_dmn_sml[0--3] non_nul_trm_char_two_dim[4--7]='efgh'

      I have done the same with strings the offset size is the dmn_sz
      If we used your method then the indices could end up being identical to the form when the nc_char dimension has limits

      i.e
      ncks -H -C -v fl_nm_arr in.nc  would be the same as

      nkcs -H -C -v fl_nm_arr -d char_dmn_lng,0,41 in.nc

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-03-14

        Hi Henry,

        Thanks. I agree with all of the mods you've made except the
        following one, which I'm not sure about. The current behavior seems to
        be keying the indices XXX--YYY in fl_nm_arr[XXX--YYY] to the buffer
        in memory rather than to the file on disk. Is this intentional?
        It is what ncks used to do, but I thought we discussed this and ended
        up agreeing that XXX--YYY should refer to the variable on disk. If the
        user wants to see the XXX--YYY refer to the slab in memory, they can
        use ncks twice, once to put the slab in a new file, and the second
        time to print the contents of the new file. To me this makes more
        sense. What do you think?

        zender@dust:~/nco/data$ ncks -H -C -v fl_nm_arr -d char_dmn_lng,0,41 in.nc
        fl_dmn[0]=a char_dmn_lng[0] fl_nm_arr[0--41]="/data/zender/dstccm04/dstccm04_8589_01.nc"
        fl_dmn[1]=b char_dmn_lng[0] fl_nm_arr[42--83]="/data/zender/dstccm04/dstccm04_8589_02.nc"
        fl_dmn[2]=3 char_dmn_lng[0] fl_nm_arr[84--125]="/data/zender/dstccm04/dstccm04_8589_03.nc"

        zender@dust:~/nco/data$ ncks -H -C -v fl_nm_arr -d char_dmn_lng,1,41 in.nc
        fl_dmn[0]=a char_dmn_lng[1] fl_nm_arr[0--40]="data/zender/dstccm04/dstccm04_8589_01.nc"
        fl_dmn[1]=b char_dmn_lng[1] fl_nm_arr[41--81]="data/zender/dstccm04/dstccm04_8589_02.nc"
        fl_dmn[2]=3 char_dmn_lng[1] fl_nm_arr[82--122]="data/zender/dstccm04/dstccm04_8589_03.nc"

        I suggest that the latter command print the following XXX--YYY instead:

        zender@dust:~/nco/data$ ncks -H -C -v fl_nm_arr -d char_dmn_lng,1,41 in.nc
        fl_dmn[0]=a char_dmn_lng[1] fl_nm_arr[1--41]="data/zender/dstccm04/dstccm04_8589_01.nc"
        fl_dmn[1]=b char_dmn_lng[1] fl_nm_arr[43--83]="data/zender/dstccm04/dstccm04_8589_02.nc"
        fl_dmn[2]=3 char_dmn_lng[1] fl_nm_arr[85--125]="data/zender/dstccm04/dstccm04_8589_03.nc"

        Note that the behavior (XXX--YYY referring to on disk values)
        should be the same for in this regard for floats, doubles, etc.

        Your printing mods are the only new feature for 2.7.2.
        If you agree with my suggested change, then we will release 2.7.2
        after you supply the patch. Otherwise, I'll ready 2.7.2 for release now.

        It's great to hear that you're getting some consulting work!
        I will try to get an "SDO" proposal submitted to NASA this year.

        Thanks,
        Charlie

         
    • henry Butowsky

      henry Butowsky - 2003-03-17

      Hi Charlie, sorry about the confusion over the indices for char
      Have commited a new nco_msa.c
      Indices are now relative to disk.

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2003-03-17

        Great, I'll make a 2.7.2 release out of this soon.

        Thanks!
        Charlie

         

Log in to post a comment.