From: <er...@he...> - 2004-04-08 17:11:17
On Mon, Apr 05, 2004 at 04:49:12AM +0900, Kimitoshi Takahashi wrote:
> er...@he... wrote:
> >On Thu, Apr 01, 2004 at 03:04:29AM +0900, Kimitoshi Takahashi wrote:
> >> Hi all,
> >>
> >> From what I read from Bproc documents, the process migration is voluntary,
> >> meaning bproc_move() must be called from the process to be moved.
> >>
> >> The lovely bpsh seems to wrap non-bproc program, and cause the program to move involuntarily,
> >> using bproc_vexecmove(), only at the beginning.
> >>
> >> I'm wondering if there is any way to cause a non-bproc process to move involuntarily any time
> >> at user's will.
> >>
> >> My colleague uses a heterogeneous cluster where the memory sizes on nodes vary.
> >> He sometimes wants to move small process on a large memory machine
> >> before he starts obviously huge process. He is only using bpsh to start processes.
> >>
> >> Is it technically feasible to write a like of bpsh which always wraps a process on slave nodes,
> >> and handles a "move now to where" signal ?
> >>
> >> How would you deal with the situation my colleague has ?
> >
> >I think a "wrapper" could take the form of a shared library. You
> >could manually LD_PRELOAD yourself or you could modify bpsh to
> >automatically set LD_PRELOAD for the child processes.
>
> I'm afraid I don't fully understand what you meant,
> probably I need to learn more about basics of C programming ....
> My guess is that signal handler is in libc and you suggested to
> preload a signal handler which calls bproc_move() when it gets certain signal.
> Is that what you meant ?

Not exactly. LD_PRELOAD instructs the dynamic linker to load a library
that it wouldn't otherwise load. It also loads it before the libraries
that it would normally load. This allows the preloaded library to
override functions in the other libraries. The Electric Fence malloc
debugging tool is a nice example of this kind of thing.

By default, an application doesn't have signal handlers. If it wants
to handle a signal it sets a signal handler. This amounts to telling
the kernel to call a function when the signal arrives. The library
could set up a signal handler without telling the application about it.

> >A signal seems like a good way to get the process's attention but you
> >still need another way to tell it where to move to. I can't think of
> >anything easy for that off the top of my head.
>
> How about making it a two step process:
> 1. When a process gets certain signal, it VMAdumps itself to the network stream
>    and bpmaster stores it into a file on the master.
> 2. You can then manually restart the process explicitly specifying where to move.
>
> It's not cool in that the process migration is not peer to peer,
> rather it is origin-master-target.
>
> This could be also used as a general check point/restarting functionality.

Yeah. I think what you've described here is just a simple
checkpointing mechanism. The only snag is that you'll have to re-open
files after restoring the checkpoint.

- Erik
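
To make the LD_PRELOAD idea above a bit more concrete, here is a minimal
sketch of a preloadable library that quietly installs a signal handler.
It is only an illustration under stated assumptions, not BProc code: the
choice of SIGUSR1 and the names on_move_signal, install_move_handler,
preload_init and preload_move_requested are all made up for this example.
The handler only records that a move was requested. Whether it would be
safe to call bproc_move() directly from a handler is not something this
sketch settles, and it does not address how to tell the process where to
move.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler. sig_atomic_t can be written safely from a
 * signal handler. */
static volatile sig_atomic_t move_requested = 0;

/* Hypothetical handler: it only records the request. A real library
 * would arrange for the migration (e.g. a call to bproc_move()) to
 * happen from here or from a safer point later on. */
static void on_move_signal(int signo)
{
    (void)signo;
    move_requested = 1;
}

/* Install the handler for SIGUSR1 (an arbitrary choice).
 * Returns 0 on success and -1 on failure, as sigaction() does. */
static int install_move_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_move_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* GCC/Clang extension: run this function when the library is loaded,
 * before the application's main(). The application never has to know
 * that the handler exists. */
__attribute__((constructor))
static void preload_init(void)
{
    (void)install_move_handler();
}

/* Lets the rest of the library check whether the signal has arrived. */
int preload_move_requested(void)
{
    return move_requested != 0;
}
```

Assuming the file is saved as preload_sketch.c, it could be built with
something like "gcc -shared -fPIC -o libmovehook.so preload_sketch.c",
run with "LD_PRELOAD=./libmovehook.so ./app", and poked with
"kill -USR1 <pid>". One caveat: if the application installs its own
SIGUSR1 handler later, it replaces this one.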