From: <ha...@no...> - 2002-03-19 17:31:15
> > > Clubmask batch spooling can run scripts via BProc
> >
> > I must correct myself (looked to docs between) - it runs scripts on
> > master only, right?
> Well the scripts start off there -- but it doesnt need to be a script that
> is run, you can specify a binary that uses all of the bproc commands to do
> the process invocation.

OK, I looked at your docs for the third time and again understood a bit
more... :-)

There are two types of scripts one could care about:

1) Scripts preparing environments for parallel jobs (e.g. using MPI).
   Mostly written by the cluster administrator, these scripts are more
   or less part of the computing system and can contain things like
   'getnodes' and 'bpsh'. Predefined example scripts play this role in
   Clubmask. They are part of "Parallel Environment" definitions in
   Grid Engine. They should execute on the master in BProc-based
   spooling systems.

2) Scripts for non-parallel jobs. Each such script requires one
   processor only, does some housekeeping on entry and exit, and most
   likely runs a few heavy executables (or just one) to do the hard
   work. Instead of message passing inside MPI, these jobs read and
   write files, and they are synchronized using job dependencies
   (start job 11 when jobs 1-10 are finished). These jobs are written
   by users and are expected to be the same across various
   implementations of batch spooling systems.

Some sites seem to care about 1) and MPI (or PVM) only. But in some
areas (e.g. our speech recognizer training) problems are best solved
using 2). This is where things start to conflict:

- I want to use BProc because it makes cluster administration easy
- I want to let my users install some standard spooling system at
  home, on laptops etc., read standard documentation, prepare standard
  job scripts, learn and debug
- I want them to carry unchanged scripts to the cluster and just see
  the job done much more quickly (well, and also finish debugging)

Standard spooling systems like GE or PBS expect job scripts to be
executed on slave nodes.
BProc does not quite like it.

The best solution I see so far is to mark heavy executables with a
prefix which expands to nothing on laptops and home computers and
expands to 'bpsh right-node' on the cluster (where all these scripts
would execute on the master, off-loading heavy executables to nodes).

Probably it is a good compromise. But I am not exactly happy to teach
users that they should mark heavy executables, and to risk master node
overload when they do not. Moving the user's script as a whole is still
tempting...

Best Regards
Vaclav
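
A rough sketch of the prefix idea, just to make it concrete. It assumes
the prefix is computed once per job from a site setting, and that the
cluster form is 'bpsh <node> <command>' as in the 'bpsh right-node'
wording above. The function names, the on_cluster flag, the node number
and the './train_model' executable are all hypothetical and not taken
from Clubmask, GE or PBS.

```python
import shlex


def heavy_prefix(on_cluster, node=None):
    """Return the argv prefix for a heavy executable.

    Off the cluster (laptop, home machine) the prefix is empty, so the
    command runs locally. On the cluster it becomes ['bpsh', '<node>'],
    so the master off-loads the command to that node.
    """
    if not on_cluster:
        return []
    if node is None:
        raise ValueError("a node must be given when running on the cluster")
    return ["bpsh", str(node)]


def wrap_heavy(argv, on_cluster, node=None):
    """Prepend the prefix to the argv of one heavy executable."""
    return heavy_prefix(on_cluster, node) + list(argv)


if __name__ == "__main__":
    # Hypothetical heavy executable and input file, for illustration only.
    job = ["./train_model", "data.in"]
    for on_cluster, node in [(False, None), (True, 3)]:
        cmd = wrap_heavy(job, on_cluster, node)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The user's script stays the same in both places; only the site setting
that decides the prefix differs. An unmarked executable still runs on
the master, which is exactly the overload risk mentioned above.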