
ncap top-level loop re-design

Developers
2005-03-30
2013-10-17
  • Charlie Zender

    Charlie Zender - 2005-03-30

    Hi All,

    Henry and I are thinking about re-designing the high level structure
    of ncap so that it (once again) does two passes through the parser.
    Here's what we're currently thinking:

    The remaining benchmark sluggishness at startup appears to be caused by the
    ncap-specific behavior of opening and closing the output file to define each
    variable. Because adding a variable can incur a complete file re-write, this
    penalty kills ncap performance on scripts that define many variables.
    One solution is to make two passes through the script:
    Pass 1 collects the metadata and does some dependency analysis.
    The metadata must be sufficient to define each output variable,
    so the type, size, and order of dimensions must be determined.
    At the end of pass 1, ncap defines every variable for which the
    dependency analysis shows that all the necessary metadata was
    available during that pass.
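
    For concreteness, here is a minimal sketch of how such a two-pass driver
    might be organized. The helper names pass_one_parse() and pass_two_parse()
    and the var_mtd_sct structure are illustrative placeholders, not existing
    ncap code; only the netCDF calls are real.

    /* Hypothetical sketch of the proposed two-pass driver */
    #include <netcdf.h> /* nc_def_var(), nc_redef(), nc_enddef() */

    typedef struct {    /* Metadata gathered during pass 1 */
      char *nm;         /* Variable name */
      nc_type typ;      /* netCDF external type */
      int dmn_nbr;      /* Rank (number of dimensions) */
      int *dmn_id;      /* Dimension IDs, in storage order */
      int dfn_pss_1;    /* True if metadata was fully determined in pass 1 */
    } var_mtd_sct;

    /* Hypothetical parser entry points, one per pass */
    int pass_one_parse(const char *spt_fl, var_mtd_sct **mtd, int *mtd_nbr);
    int pass_two_parse(const char *spt_fl, int out_id);

    void
    ncap_two_pass(const char *spt_fl, int out_id)
    {
      var_mtd_sct *mtd = NULL;
      int mtd_nbr = 0, idx, var_id;

      /* Pass 1: collect metadata only, no RHS evaluation, no data writes */
      pass_one_parse(spt_fl, &mtd, &mtd_nbr);

      /* Define every variable whose type, rank, and dimension order are known,
         so the output file is (re-)written here at most once rather than once
         per defined variable */
      (void)nc_redef(out_id);
      for (idx = 0; idx < mtd_nbr; idx++)
        if (mtd[idx].dfn_pss_1)
          (void)nc_def_var(out_id, mtd[idx].nm, mtd[idx].typ,
                           mtd[idx].dmn_nbr, mtd[idx].dmn_id, &var_id);
      (void)nc_enddef(out_id);

      /* Pass 2: evaluate the statements and populate all variables */
      pass_two_parse(spt_fl, out_id);
    }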

    Pass 2 evaluates the statements and populates all variables.
    In exceptional cases, some variables, such as string variables, might
    need to be both defined and populated in pass 2.
    LHS-defined attributes pose a special challenge because they
    are metadata, yet their length or values may be unknown until
    evaluation. However, it would be simple to accumulate a list
    of aed structures during pass 2 and to process them all at once
    at the end of the pass.
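
    A minimal sketch of that deferred attribute list, using a simplified
    aed-like record (not the actual NCO aed_sct) and hypothetical helper
    names:

    /* Accumulate LHS-defined attributes during pass 2, flush them in one
       batch at the end of the pass.  Structure and helpers are simplified
       placeholders. */
    #include <stdlib.h>  /* realloc() */
    #include <netcdf.h>

    typedef struct {
      char *var_nm;     /* Owning variable name */
      char *att_nm;     /* Attribute name */
      nc_type typ;      /* Type, known only after evaluation */
      size_t sz;        /* Number of elements, known only after evaluation */
      void *val;        /* Evaluated attribute value */
    } aed_itm_sct;

    static aed_itm_sct *aed_lst = NULL;
    static int aed_nbr = 0;

    /* Called each time pass 2 evaluates an LHS attribute assignment */
    void aed_lst_add(aed_itm_sct itm)
    {
      aed_lst = realloc(aed_lst, (aed_nbr + 1) * sizeof(aed_itm_sct));
      aed_lst[aed_nbr++] = itm;
    }

    /* Called once at the end of pass 2: write all accumulated attributes */
    void aed_lst_flush(int out_id)
    {
      int idx, var_id;
      (void)nc_redef(out_id); /* Adding new attributes needs define mode */
      for (idx = 0; idx < aed_nbr; idx++) {
        if (nc_inq_varid(out_id, aed_lst[idx].var_nm, &var_id) != NC_NOERR)
          var_id = NC_GLOBAL; /* Fall back to a global attribute */
        (void)nc_put_att(out_id, var_id, aed_lst[idx].att_nm,
                         aed_lst[idx].typ, aed_lst[idx].sz, aed_lst[idx].val);
      }
      (void)nc_enddef(out_id);
    }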

    For proof of concept, pass 1 could define only LHS-casted variables
    since determining their metadata is trivial (it's all in the LHS
    cast expression and needs no RHS evaluation).
    Determining the rank and dimension ordering of a general RHS
    expression is non-trivial, though not impossible.
    Nevertheless, it's probably wiser to start by defining only
    the simplest variables during pass 1.
    This will, by the way, have the advantage of speeding up the benchmark
    file-creation script, which is virtually _all_ LHS-casting.
    This is a good opportunity for others to comment on this ncap
    top-level loop re-design.
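
    For illustration, pass-1 handling of a single LHS-cast variable might look
    roughly like the sketch below; the helper name dfn_lhs_cst() is
    hypothetical, and only the netCDF calls are real.

    /* Define one LHS-cast variable, e.g. "prs_mdp[time,lat,lev]=...",
       from the cast dimension list alone: the dimension names fix the rank
       and ordering, the input file supplies the sizes, so no RHS evaluation
       is needed.  Sketch only, most error handling omitted. */
    #include <netcdf.h>

    int
    dfn_lhs_cst(int in_id, int out_id, const char *var_nm, nc_type typ,
                char **dmn_nm, int dmn_nbr)
    {
      int idx, var_id;
      int dmn_id[NC_MAX_VAR_DIMS];
      size_t dmn_sz;

      for (idx = 0; idx < dmn_nbr; idx++) {
        int in_dmn_id;
        if (nc_inq_dimid(in_id, dmn_nm[idx], &in_dmn_id) != NC_NOERR)
          return -1; /* Unknown dimension name */
        (void)nc_inq_dimlen(in_id, in_dmn_id, &dmn_sz);
        /* Copy the dimension to the output file if not already there */
        if (nc_inq_dimid(out_id, dmn_nm[idx], &dmn_id[idx]) != NC_NOERR)
          (void)nc_def_dim(out_id, dmn_nm[idx], dmn_sz, &dmn_id[idx]);
      }
      return nc_def_var(out_id, var_nm, typ, dmn_nbr, dmn_id, &var_id);
    }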

    Thanks,
    Charlie

     
    • Charlie Zender

      Charlie Zender - 2005-04-18

      Hi Henry,

      What is the memory cost of this algorithm?
      Does it allocate all the RAM to hold the variables in the first pass?
      Or does it just allocate space for metadata in the first pass?

      > What I've done to optimize ncap is to parse the
      > script twice. In the first parse variables are "written to memory"
      > rather than to disk. After this the list of variables is defined in
      > the output. In the second parse these variables are populated with
      > real values. To realize this means, in practice, writing code around
      > ncap_var_init() and ncap_var_write(). I'm so close to completion it
      > hurts -- I just need to solve some memory problems -- ncap.in almost
      > runs.

      > The advantage of my approach is that if/when we add dimension-
      > altering functions the code will still be able to calculate the
      > dims/size of the final vars. One problem is double the memory leakage.

      > (This is different from the old code, where the script was scanned
      > once then parsed once.)

      I'm particularly interested in how memory usage scales.
      The file creation part of the benchmarking script, for example,
      uses LHS casting to create ipcc_T85 files of size ~ 1 GB.
      Does your algorithm require 1 GB RAM (or twice that or whatever)
      to run this script because it holds buffers for all the variables
      in memory?

      Thanks,
      Charlie

       
    • Nobody/Anonymous

      Hi Charlie,
      On the first pass only the variable metadata is saved.
      If a newly defined variable is required later in the first pass then it is temporarily populated with nulls (using calloc) and the memory is freed after use.

      After the first pass, the variables in the list (additional variables in prs_arg) are defined in the output. On the second pass these variables are then populated. The advantage of my approach is that almost all LHS vars can be defined, even if we introduce dimension-altering functions. The if/then structure will require some tinkering but should be possible.

      Later on, an interesting additional optimization may be possible on the first pass. If we modify the arithmetic functions so they work with vars with no data, then there will be no need to temporarily populate them. For example, we call ncap_var_var_add() and the returned variable has no data (i.e. val.vp=NULL) but has the dimension structure we need...
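
      Something like the following sketch, where the structure is a simplified
      stand-in for NCO's variable structure (the real var_sct differs in
      detail) and the function name is illustrative:

      /* Var-var addition that tolerates shape-only operands.
         Conformance checks and type dispatch omitted for brevity;
         doubles assumed. */
      #include <stddef.h>

      typedef union { void *vp; double *dp; } ptr_unn_smp;

      typedef struct {
        char *nm;          /* Variable name */
        int nbr_dim;       /* Rank */
        long sz;           /* Total number of elements */
        ptr_unn_smp val;   /* Data buffer; vp == NULL means "shape only" */
      } var_smp_sct;

      var_smp_sct *                 /* O [sct] Result, possibly shape-only */
      var_var_add_smp(var_smp_sct *var_1, var_smp_sct *var_2)
      {
        long idx;

        /* First-parse case: either operand carries no data, so return the
           shape of var_1 without allocating or touching any values */
        if (var_1->val.vp == NULL || var_2->val.vp == NULL) {
          var_1->val.vp = NULL;     /* Result is shape-only too */
          return var_1;
        }

        /* Second-parse case: both operands hold data, do the arithmetic */
        for (idx = 0; idx < var_1->sz; idx++)
          var_1->val.dp[idx] += var_2->val.dp[idx];
        return var_1;
      }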

      Anyway, the code is a bit rough -- it's been a while since I have done a major project on NCO.
      I need to add code to handle LHS attributes on the first pass (modify the attribute code in the lexer so it looks for vars in the define list). We need to do this because the operation (var op scalar) can change the variable type.

      The proof is in the pudding -- so benchmark away (I'm dying to hear the results).

      Regards Henry

       
      • Charlie Zender

        Charlie Zender - 2005-04-19

        Hi Henry,

        Sounds like a promising approach.
        Builds fine for me.
        It seems to fail on ncap.in when writing val_half_half.
        Is this the known problem with LHS attributes to which
        you referred?

        Also, you should be able to run the benchmarks yourself
        to check performance, with something like

        cd nco/bm;nco_bm.pl

        It will take a while (twenty minutes?).

        Thanks,
        Charlie

        zender@elnino:~/nco/data$ ncap -O -v -S ncap.in in.nc foo.nc
        defined in output a2
        defined in output a3
        defined in output a4
        defined in output a5
        defined in output a6
        defined in output prs_mdp
        defined in output a7
        defined in output a8
        defined in output nine
        defined in output one
        defined in output two
        defined in output (null)
        Segmentation fault

         
      • Charlie Zender

        Charlie Zender - 2005-04-20

        Hey Henry,

        The new code smokes the old code.
        ncap IPCC file creation time is down to about 6 minutes.
        It's as fast on my laptop as the ESMF, too.
        This makes running the benchmarks much less painful.
        This speedup is about 200-250% relative to the old code, if memory serves.

        Do you think that further significant improvements in speed
        are possible at this point?

        Thanks,
        Charlie

         
    • Nobody/Anonymous

      Hi Charlie & Harry,
      I have committed changes to all the ncap code.

      Regards Henry

       
    • Nobody/Anonymous

      Hi Charlie,
      Very excited about the benchmark!
      In theory it may be possible to nearly halve the current time.
      Doing two parses is a complicated business; I would like to expand on two points I made in my post.

      1) Arithmetic operations with zero data
         The idea is to get the function families
        ncap_var_var_op() & ncap_var_scv_op(),
        ncap_var_retype(),
        ncap_var_cnf_dmn(),
        byte(), char(), short(), etc.
        to behave so that when there is zero data (var->val.vp=NULL) they simply return an empty var with the correct shape. If it is not possible to get a function to behave like this (e.g. pack) then we have a workaround -- zulu functions and undefined (point 2 below).

        With no large data chunks to define and very little writing to disk, the first pass should take at least 40% off the current time.

      2) Zulu functions and undefined
         Consider the script below:

      aa[lat]=1L;
      bb[lat,lev]=2.0f;
      cc[lat,lev,lon]=3.0d;

      cc@min=9.2;
      cc@max=zulu1(aa); /* cc@max is now undefined in the first pass */
      d=zulu2(cc,10.0); /* d is also now undefined in the first pass */

      f=d+cc@min;
      g=cc+cc@max;
      g@long_att=aa; /* putting a 1D var into an attribute */

      A zulu function cannot handle an empty var (val.vp=NULL), so on the first parse it returns a var or attribute that is undefined (in the above script "cc@max" and "d" are both declared but undefined).
      If these undefined vars are subsequently used in an expression, then that expression also becomes undefined (so in the first parse "f" and "g" are also undefined); i.e. "undefined" percolates through an expression.

      Obviously, undefined variables are defined and populated in the second parse.
      With this mechanism we can choose which functions to optimize in the first parse.
      The variables & attributes in an if/then will, I think, be zulus.
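
      A minimal sketch of how that "undefined" flag might percolate during the
      first parse; the field and function names here are illustrative only, not
      the actual ncap code:

      /* First-parse evaluation with an "undefined" marker */
      typedef struct {
        char *nm;
        int undefined;   /* True if shape/type cannot be known until pass 2 */
        void *val;       /* NULL in the first parse when only shape is kept */
        /* ... dimension and type fields omitted ... */
      } var_pss1_sct;

      /* A zulu-style function cannot work on an empty var, so in the first
         parse it just marks its result undefined and defers work to pass 2 */
      var_pss1_sct *zulu1_pss1(var_pss1_sct *var)
      {
        var->undefined = 1;
        return var;
      }

      /* Any binary operation propagates the flag: if either operand is
         undefined the result is undefined too, which is why "f" and "g"
         in the script above stay undefined until the second parse */
      var_pss1_sct *bin_op_pss1(var_pss1_sct *var_1, var_pss1_sct *var_2)
      {
        var_1->undefined = var_1->undefined || var_2->undefined;
        return var_1;
      }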

      I hope this makes sense

      Regards Henry

       
