From: Erik A. H. <er...@he...> - 2002-03-19 12:33:14
On Mon, Mar 18, 2002 at 06:32:57PM +0100, ha...@no... wrote:
> Clubmask batch spooling can run scripts via BProc and commercial PBS
> also can. BProc 3.1.6+ can exec() on slave getting image from master,
> but this most likely is not the PBS way of running scripts. We are
> somewhere near to scripts transparently executing on slaves, but not
> yet there.
>
> How exactly is script run? What's where during script execution?
>
> I can imagine these scenarios:
>
> 1) Slave nodes have NFS mounted not only /home but also /bin, /usr/bin
>    etc. Execution server is on master. Script is moved to slave somehow like
>
>      bpsh NODE bash script
>
>    and whenever bash executes command from script, it
>
>    a) just gets the binary over NFS mount and does normal local exec()
>
>    b) uses NFS just to look around, when it comes to exec(), binary is
>       fetched from master via BProc transparently to bash which does
>       not know that exec() does something unusual
>
> 2) Already execution server is BProc-moved to slave. Everything is
>    NFS-mounted as in 1), server runs on slave and gets everything
>    via NFS.
>
> 3) Execution server is on master, scripts are run on master via
>    modified bash which BProc-moves just certain heavy-duty executables
>    to slaves. Just /home is NFS-mounted.
>
>
> I do not quite believe that /bin and /usr/bin are NFS mounted on all
> slaves as in 1) and 2) and I do not believe that there is modified
> bash as in 3). So Clubmask and PBS probably use some method which is
> over my imagination. Please tell me what it is.

PBS is not a good match for a BProc based system. It comes with a lot
of baggage for managing remote machines that you just don't need with
BProc. We don't use PBS (haven't even tried) or any other existing
scheduler on our systems here. The scheduler has been one of the holes
in our environment. We have a student working on an entirely new (and
very simple) scheduler for BProc based systems. I believe Scyld has
one too, although it's not open source.

As far as scripts go, I try to discourage people from trying to run
scripts on nodes. BProc is really not designed for it. That being
said, there are some facilities for running scripts. There is a gross
hack to make #! style execs work with execmove (bpsh). There's also an
Aexecve() hook which uses the ghost process on the front end to exec()
binaries that don't exist on the slave. There's no caching of binaries
on the slave nodes though.

> I also wonder about performance comparison of BProc and NFS. For
> dynamically linked executable, overhead of 1)a) and 1)b) should be
> comparable:
>
>   1)a) executable is moved via BProc
>        libraries are cached on slave
>
>   1)b) executable got via NFS
>        libraries got via NFS, next time probably cached in filesystem cache
>
> so the only advantage of BProc would be common PID space. However many
> docs imply BProc move is better. So what is wrong with comparison
> above? NFS is flaky and does not move data as quickly as BProc?
> Something else?

BProc should always be faster for the reason you mention. The number I
like to quote is 3ms: that is pretty much the baseline overhead for a
BProc move on Myrinet (i.e. time to send your process size + 3ms).

> The reason for all this is that I'd like to use Grid Engine with
> BProc, though I will also consider Clubmask (and Clustermatic? does it
> have batch spooling?).

No spooler in Clustermatic yet.

- Erik
--
Erik Arjan Hendriks            Printed On 100 Percent Recycled Electrons
er...@he...                    Contents may settle during shipment
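
For anyone who wants to plug numbers into the "time to send your process
size + 3ms" rule of thumb above, here is a minimal sketch. The function
name and parameters are hypothetical, and the link bandwidth is something
you have to supply yourself, since the message gives no figure for it. It
only restates the arithmetic and does not measure BProc in any way.

```python
def estimate_move_time_ms(process_bytes, bandwidth_bytes_per_s, baseline_ms=3.0):
    """Rough estimate of a BProc move: transfer time plus a fixed baseline.

    process_bytes         -- size of the process image being moved
    bandwidth_bytes_per_s -- assumed usable link bandwidth (not from the post)
    baseline_ms           -- fixed overhead; 3 ms is the figure quoted for Myrinet
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    transfer_ms = process_bytes / bandwidth_bytes_per_s * 1000.0
    return transfer_ms + baseline_ms


if __name__ == "__main__":
    # Placeholder inputs: a 10 MB process over an assumed 100 MB/s link.
    print(estimate_move_time_ms(10_000_000, 100_000_000))
```

With a very small process the estimate approaches the 3ms baseline, which
is the point of quoting that number.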