From: Erik A. H. <er...@he...> - 2002-03-25 01:48:25
On Wed, Mar 20, 2002 at 11:54:40AM +0100, ha...@no... wrote:
> > > Moving the user's script as a whole is still tempting...
> >
> > That is what we tend to do for the embarrassingly parallel batch jobs: the
> > wrapper (master node bpsh... script) just executes a ton of the individual
> > scripts. What we have is the binaries mounted on GigE NFS/PVFS so that we
> > don't have to move the binaries over BProc, just the script that will call
> > the programs. It seems to work pretty well.
>
> Ah, here we are finally. One could think that dirs like /bin and
> /usr/bin network-mounted on nodes are insane (and running scripts on nodes
> is insane) once you have BProc, but real-life statistics say otherwise:
>
> - it happens in Clubmask, as indicated above
> - it happens in my system; I did not know otherwise
> - the commercial Scyld port of PBS is likely to do it because there is
>   probably no other way to run standard PBS user script jobs (even more
>   likely given the old version of BProc they use)
> - Clustermatic does not do it (?) (yet? - no batch spooling there yet)

We have our own effort to create a simple and hopefully elegant
scheduler specifically for BProc-type environments. The guy writing it
tells me it's pretty much ready to test on some real boxes, so I'm going
to be taking a closer look at it this week. The internals are based on
the Distributed Job Manager (DJM) from the CM-5. No attempt has been
made to provide an interface that's compatible with anything out there
right now.

It does not directly start anything on BProc nodes. It simply does
bproc_chown, etc. on nodes and then runs a script on the front end. An
environment variable contains a list of node numbers that mpirun or
whatever should run processes on. I'm trying to get people out of the
habit of running some script on every node.
> On the other side, the nice unified PID space still works with
> network-mounted binaries as long as all processes which are to belong
> to the unified PID space are descendants of processes BProc-moved to
> nodes, right?

Yup.

> Given this situation, what are the future strategic directions for
> BProc to go?
>
> I can imagine this: a quite general file cache in RAM and on local
> hard disks, not only for libraries but also for executables and
> data. This cache may or may not be part of BProc but should
> collaborate closely with it.
>
> This cache could behave as a filesystem which mimics the whole file
> namespace on the master, and a script chrooted to it could be
> perfectly happy.
>
> I would quite enjoy a data file cache on slave hard disks. We use huge
> sets of files and our script jobs use repeated subsets of them. Local
> hard disks are big and much quicker than our network interconnect.
> (And I hate copying files around by hand or creating scripts for it;
> the file cache should take care of that, not me.)

What you've described here is a network file system with caching
abilities. I think that problem is independent of BProc, and I have no
plans to try to solve it well within BProc. The library caching stuff
that's part of BProc right now is basically a really crude network file
system with no support for coherency at all.

- Erik