From: <er...@he...> - 2003-07-01 15:14:40
On Tue, Jul 01, 2003 at 09:31:23AM -0400, Nicholas Henke wrote:
> Hey Erik~
> I am again faced with pesky Java users who are wanting to use bpsh to
> farm out their tasks. I am running low on ammunition to kill them, so I
> figured I would take a stab at getting the 'clone' system call working
> in bproc. First -- is this going to be possible ? Second - can you give
> me a rough overview of what needs to be done ?

I believe clone works. Most of the interesting stuff with clone is
local to the node and BProc doesn't get involved at all. So, in theory,
it should be possible to make Java work. I think there are two things
which you are likely to have trouble with:

1 - Some of the thread group stuff (CLONE_THREAD) may not work. This
    stuff has been kind of fluid in the 2.4.x kernels, so it seems
    unlikely that many things use it.

2 - You cannot migrate a multi-threaded task. Some of the guys at LBL
    are working on some extensions to VMADump to handle multi-threaded
    tasks for some checkpointing work they're doing, but none of this
    has been combined with BProc at this point. BProc would also have
    to become aware of these situations. Migration will end up
    creating copies of the program. Also, on x86, vmadump isn't aware
    of funky LDT stuff, which will also hamper migration.

Note that this doesn't mean you can't bpsh a multi-threaded program.

The other possible funny bit that you're likely to run into is that
fork/clone is much slower than normal because it involves the front
end. This could lead to new/interesting races or just poor performance
in apps that create/clean up threads a lot.

In terms of what needs to be done, that depends entirely on what you're
trying to run. I've done some simple pthreads things on nodes w/o
problems. The first place to look is probably the strace output of a
program that fails. Then try to figure out how what the app is seeing
differs from what it's expecting.

- Erik
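
A rough way to see how much the slower fork/clone path costs an app that creates and cleans up threads a lot is to time a tight create/join loop on the front end and again on a node. The sketch below is only an illustration of that idea, not a BProc tool: the function name `time_thread_churn` and the default count are made up for this example, and it assumes a Python interpreter is usable on the node, which may not be true on a given cluster.

```python
import threading
import time


def time_thread_churn(count=200):
    """Create and join `count` short-lived threads.

    Returns (threads_run, average_seconds_per_create_join).
    Hypothetical helper, named only for this example.
    """
    ran = []  # each worker appends exactly once

    def worker():
        ran.append(1)

    start = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=worker)
        t.start()  # CPython creates an OS thread here (pthread_create on Linux)
        t.join()   # wait for it to finish before creating the next one
    elapsed = time.perf_counter() - start
    return len(ran), elapsed / count


if __name__ == "__main__":
    n, avg = time_thread_churn()
    print(f"{n} threads, {avg * 1e6:.1f} us per create/join")
```

Comparing the per-thread figure from the two runs gives a feel for the overhead. If the loop hangs or fails on a node, running it under `strace -f` (which follows child threads) is one way to get the kind of trace suggested above.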