
ncap top-level loop re-design

Developers
2005-03-30
2013-10-17
  • Charlie Zender

    Charlie Zender - 2005-03-30

    Hi All,

    Henry and I are thinking about re-designing the high level structure
    of ncap so that it (once again) does two passes through the parser.
    Here's what we're currently thinking:

    The remaining benchmark sluggishness at startup appears to be caused by the
    ncap-specific behavior of opening and closing the output file to define each
    variable. Because adding a variable can incur a complete file re-write, this
    penalty kills ncap performance on scripts that define many variables.
    One solution is to make two passes through the script:
    Pass 1 collects the metadata and does some dependency analysis.
    The metadata must be sufficient to define each output variable,
    so the type, size, and order of dimensions must be determined.
    At the end of pass 1, ncap defines every variable for which the
    dependency analysis shows that all the necessary metadata was
    available during that pass.
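
    For concreteness, here is a minimal sketch of how such a two-pass driver
    might be organized. The helper names pass_one_parse() and pass_two_parse()
    and the var_mtd_sct structure are illustrative placeholders, not existing
    ncap code; only the netCDF calls are real.

    /* Hypothetical sketch of the proposed two-pass driver */
    #include <netcdf.h> /* nc_def_var(), nc_redef(), nc_enddef() */

    typedef struct {    /* Metadata gathered during pass 1 */
      char *nm;         /* Variable name */
      nc_type typ;      /* netCDF external type */
      int dmn_nbr;      /* Rank (number of dimensions) */
      int *dmn_id;      /* Dimension IDs, in storage order */
      int dfn_pss_1;    /* True if metadata was fully determined in pass 1 */
    } var_mtd_sct;

    /* Hypothetical parser entry points, one per pass */
    int pass_one_parse(const char *spt_fl, var_mtd_sct **mtd, int *mtd_nbr);
    int pass_two_parse(const char *spt_fl, int out_id);

    void
    ncap_two_pass(const char *spt_fl, int out_id)
    {
      var_mtd_sct *mtd = NULL;
      int mtd_nbr = 0, idx, var_id;

      /* Pass 1: collect metadata only, no RHS evaluation, no data writes */
      pass_one_parse(spt_fl, &mtd, &mtd_nbr);

      /* Define every variable whose type, rank, and dimension order are known,
         so the output file is (re-)written here at most once rather than once
         per defined variable */
      (void)nc_redef(out_id);
      for (idx = 0; idx < mtd_nbr; idx++)
        if (mtd[idx].dfn_pss_1)
          (void)nc_def_var(out_id, mtd[idx].nm, mtd[idx].typ,
                           mtd[idx].dmn_nbr, mtd[idx].dmn_id, &var_id);
      (void)nc_enddef(out_id);

      /* Pass 2: evaluate the statements and populate all variables */
      pass_two_parse(spt_fl, out_id);
    }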

    Pass 2 evaluates the statements and populates all variables.
    In exceptional cases, some variables, such as string variables, might
    need to be both defined and populated in pass 2.
    LHS-defined attributes pose a special challenge because they
    are metadata, yet their length or values may be unknown until
    evaluation. However, it would be simple to accumulate a list
    of aed structures during pass 2 and to process them all at once
    at the end of the pass.
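
    A minimal sketch of that deferred attribute list, using a simplified
    aed-like record (not the actual NCO aed_sct) and hypothetical helper
    names:

    /* Accumulate LHS-defined attributes during pass 2, flush them in one
       batch at the end of the pass.  Structure and helpers are simplified
       placeholders. */
    #include <stdlib.h>  /* realloc() */
    #include <netcdf.h>

    typedef struct {
      char *var_nm;     /* Owning variable name */
      char *att_nm;     /* Attribute name */
      nc_type typ;      /* Type, known only after evaluation */
      size_t sz;        /* Number of elements, known only after evaluation */
      void *val;        /* Evaluated attribute value */
    } aed_itm_sct;

    static aed_itm_sct *aed_lst = NULL;
    static int aed_nbr = 0;

    /* Called each time pass 2 evaluates an LHS attribute assignment */
    void aed_lst_add(aed_itm_sct itm)
    {
      aed_lst = realloc(aed_lst, (aed_nbr + 1) * sizeof(aed_itm_sct));
      aed_lst[aed_nbr++] = itm;
    }

    /* Called once at the end of pass 2: write all accumulated attributes */
    void aed_lst_flush(int out_id)
    {
      int idx, var_id;
      (void)nc_redef(out_id); /* Adding new attributes needs define mode */
      for (idx = 0; idx < aed_nbr; idx++) {
        if (nc_inq_varid(out_id, aed_lst[idx].var_nm, &var_id) != NC_NOERR)
          var_id = NC_GLOBAL; /* Fall back to a global attribute */
        (void)nc_put_att(out_id, var_id, aed_lst[idx].att_nm,
                         aed_lst[idx].typ, aed_lst[idx].sz, aed_lst[idx].val);
      }
      (void)nc_enddef(out_id);
    }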

    For proof of concept, pass 1 could define only LHS-casted variables
    since determining their metadata is trivial (it's all in the LHS
    cast expression and needs no RHS evaluation).
    Determining the rank and dimension ordering of a general RHS
    expression is non-trivial, though not impossible.
    Nevertheless, it's probably wiser to start by defining only
    the simplest variables during pass 1.
    This will, by the way, have the advantage of speeding up the benchmark
    file-creation script, which is virtually _all_ LHS-casting.
    This is a good opportunity for others to comment on this ncap
    top-level loop re-design.
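
    For illustration, pass-1 handling of a single LHS-cast variable might look
    roughly like the sketch below; the helper name dfn_lhs_cst() is
    hypothetical, and only the netCDF calls are real.

    /* Define one LHS-cast variable, e.g. "prs_mdp[time,lat,lev]=...",
       from the cast dimension list alone: the dimension names fix the rank
       and ordering, the input file supplies the sizes, so no RHS evaluation
       is needed.  Sketch only, most error handling omitted. */
    #include <netcdf.h>

    int
    dfn_lhs_cst(int in_id, int out_id, const char *var_nm, nc_type typ,
                char **dmn_nm, int dmn_nbr)
    {
      int idx, var_id;
      int dmn_id[NC_MAX_VAR_DIMS];
      size_t dmn_sz;

      for (idx = 0; idx < dmn_nbr; idx++) {
        int in_dmn_id;
        if (nc_inq_dimid(in_id, dmn_nm[idx], &in_dmn_id) != NC_NOERR)
          return -1; /* Unknown dimension name */
        (void)nc_inq_dimlen(in_id, in_dmn_id, &dmn_sz);
        /* Copy the dimension to the output file if not already there */
        if (nc_inq_dimid(out_id, dmn_nm[idx], &dmn_id[idx]) != NC_NOERR)
          (void)nc_def_dim(out_id, dmn_nm[idx], dmn_sz, &dmn_id[idx]);
      }
      return nc_def_var(out_id, var_nm, typ, dmn_nbr, dmn_id, &var_id);
    }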

    Thanks,
    Charlie

     
    • Charlie Zender

      Charlie Zender - 2005-04-18

      Hi Henry,

      What is the memory cost of this algorithm?
      Does it allocate all the RAM to hold the variables in the first pass?
      Or does it just allocate space for metadata in the first pass?

      > What I've done to optimize ncap is to parse the
      > script twice. In the first parse variables are "written to memory"
      > rather than to disk. After this the list of variables is defined in
      > the output. In the second parse these variables are populated with
      > real values. To realize this means, in practice, writing code around
      > ncap_var_init() and ncap_var_write(). I'm so close to completion it
      > hurts -- I just need to solve some memory problems -- ncap.in almost
      > runs.

      > The advantage of my approach is that if/when we add dimension-
      > altering functions the code will still be able to calculate the
      > dims/size of the final vars. One problem is double the memory leakage.

      > (This is different from the old code, where the script was scanned
      > once then parsed once.)

      I'm particularly interested in how memory usage scales.
      The file creation part of the benchmarking script, for example,
      uses LHS casting to create ipcc_T85 files of size ~ 1 GB.
      Does your algorithm require 1 GB RAM (or twice that or whatever)
      to run this script because it holds buffers for all the variables
      in memory?

      Thanks,
      Charlie

       
    • Nobody/Anonymous

      Hi Charlie,
      On the first pass only the variable metadata is saved.
      If a newly defined variable is required later in the first pass then it is temporarily populated with nulls (using calloc) and the memory is freed after use.

      After the first pass, the variables in the list (additional variables in prs_arg) are defined in the output. On the second pass these variables are then populated. The advantage of my approach is that almost all LHS vars can be defined, even if we introduce dimension-altering functions. The if/then structure will require some tinkering but should be possible.

      Later on, an interesting additional optimization may be possible on the first pass. If we modify the arithmetic functions so they work with vars with no data, then there will be no need to temporarily populate them. For example, we call ncap_var_var_add() and the returned variable has no data (i.e. val.vp=NULL) but has the dimension structure we need...
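
      Something like the following sketch, where the structure is a simplified
      stand-in for NCO's variable structure (the real var_sct differs in
      detail) and the function name is illustrative:

      /* Var-var addition that tolerates shape-only operands.
         Conformance checks and type dispatch omitted for brevity;
         doubles assumed. */
      #include <stddef.h>

      typedef union { void *vp; double *dp; } ptr_unn_smp;

      typedef struct {
        char *nm;          /* Variable name */
        int nbr_dim;       /* Rank */
        long sz;           /* Total number of elements */
        ptr_unn_smp val;   /* Data buffer; vp == NULL means "shape only" */
      } var_smp_sct;

      var_smp_sct *                 /* O [sct] Result, possibly shape-only */
      var_var_add_smp(var_smp_sct *var_1, var_smp_sct *var_2)
      {
        long idx;

        /* First-parse case: either operand carries no data, so return the
           shape of var_1 without allocating or touching any values */
        if (var_1->val.vp == NULL || var_2->val.vp == NULL) {
          var_1->val.vp = NULL;     /* Result is shape-only too */
          return var_1;
        }

        /* Second-parse case: both operands hold data, do the arithmetic */
        for (idx = 0; idx < var_1->sz; idx++)
          var_1->val.dp[idx] += var_2->val.dp[idx];
        return var_1;
      }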

      Anyway, the code is a bit rough -- it's been a while since I have done a major project on NCO.
      I need to add code to handle LHS attributes on the first pass (modify the attribute code in the lexer so it looks for vars in the define list). We need to do this because the operation (var op scalar) can change the variable type.

      The proof is in the pudding -- so benchmark away (I'm dying to hear the results).

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2005-04-19

        Hi Henry,

        Sounds like a promising approach.
        Builds fine for me.
        It seems to fail on ncap.in when writing val_half_half.
        Is this the known problem with LHS attributes to which
        you referred?

        Also, you should be able to run the benchmarks yourself
        to check performance, with something like

        cd nco/bm;nco_bm.pl

        It will take a while (twenty minutes?).

        Thanks,
        Charlie

        zender@elnino:~/nco/data$ ncap -O -v -S ncap.in in.nc foo.nc
        defined in output a2
        defined in output a3
        defined in output a4
        defined in output a5
        defined in output a6
        defined in output prs_mdp
        defined in output a7
        defined in output a8
        defined in output nine
        defined in output one
        defined in output two
        defined in output (null)
        Segmentation fault

         
      • Charlie Zender

        Charlie Zender - 2005-04-20

        Hey Henry,

        The new code smokes the old code.
        ncap IPCC file creation time is down to about 6 minutes.
        It's as fast on my laptop as the ESMF, too.
        This makes running the benchmarks much less painful.
        This speedup is about 200-250% relative to the old code, if memory serves.

        Do you think that further significant improvements in speed
        are possible at this point?

        Thanks,
        Charlie

         
    • Nobody/Anonymous

      Hi Charlie & Harry,
      I have committed changes to all the ncap code.

      Regards Henry

       
    • Nobody/Anonymous

      Hi Charlie,
      Very excited about the benchmark!
      In theory it may be possible to nearly halve the current time.
      Doing two parses is a complicated business; I would like to expand on two points I made in my post.

      1) Arithmetic operations with zero data
         The idea is to get the function families
        ncap_var_var_op() & ncap_var_scv_op(),
        ncap_var_retype(),
        ncap_var_cnf_dmn(),
        byte(), char(), short(), etc.
        to behave so that when there is zero data (var->val.vp=NULL) they simply return an empty var with the correct shape. If it is not possible to get a function to behave like this (e.g. pack) then we have a workaround -- zulu functions and undefined (point 2 below).

        With no large data chunks to define and very little writing to disk, the first pass should take at least 40% off the current time.

      2) Zulu functions and undefined
         Consider the script below:

      aa[lat]=1L;
      bb[lat,lev]=2.0f;
      cc[lat,lev,lon]=3.0d;

      cc@min=9.2;
      cc@max=zulu1(aa); /* cc@max is now undefined in the first pass */
      d=zulu2(cc,10.0); /* d is also now undefined in the first pass */

      f=d+cc@min;
      g=cc+cc@max;
      g@long_att=aa; /* putting a 1D var into an attribute */

      A zulu function cannot handle an empty var (val.vp=NULL), so on the first parse it returns a var or attribute that is undefined (in the above script "cc@max" and "d" are both declared but undefined).
      If these undefined vars are subsequently used in an expression, then that expression also becomes undefined (so in the first parse "f" and "g" are also undefined); i.e. "undefined" percolates through an expression.

      Obviously, undefined variables are defined and populated in the second parse.
      With this mechanism we can choose which functions to optimize in the first parse.
      The variables & attributes in an if/then will, I think, be zulus.
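
      A minimal sketch of how that "undefined" flag might percolate during the
      first parse; the field and function names here are illustrative only, not
      the actual ncap code:

      /* First-parse evaluation with an "undefined" marker */
      typedef struct {
        char *nm;
        int undefined;   /* True if shape/type cannot be known until pass 2 */
        void *val;       /* NULL in the first parse when only shape is kept */
        /* ... dimension and type fields omitted ... */
      } var_pss1_sct;

      /* A zulu-style function cannot work on an empty var, so in the first
         parse it just marks its result undefined and defers work to pass 2 */
      var_pss1_sct *zulu1_pss1(var_pss1_sct *var)
      {
        var->undefined = 1;
        return var;
      }

      /* Any binary operation propagates the flag: if either operand is
         undefined the result is undefined too, which is why "f" and "g"
         in the script above stay undefined until the second parse */
      var_pss1_sct *bin_op_pss1(var_pss1_sct *var_1, var_pss1_sct *var_2)
      {
        var_1->undefined = var_1->undefined || var_2->undefined;
        return var_1;
      }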

      I hope this makes sense

      Regards Henry

       
