From: Nicholas H. <he...@se...> - 2003-07-09 15:43:21
For the love of Pete -- sorry bproc'ers -- I posted to the wrong list :(

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Nicholas H. <he...@se...> - 2003-07-09 15:40:50
On Wed, 2003-07-09 at 11:26, Thomas Clausen wrote:
> Hi,
>
> root@betty:~# telnet localhost 2709
> Trying 127.0.0.1...
> Connected to here.
> Escape character is '^]'.
> S
> (
> )
> ^]
> telnet> quit
> Connection closed.
> root@betty:~#
>
> So supermon reports nothing.

Hmm. Oops... bug on my part. Can you try the attached patch for
lib/python/resource_manager/BprocSupermon.py? You will need to re-run
'python2.2 setup.py install' again. I was taking empty data to mean bad data,
but that is not always the case -- especially when there are no nodes :/
If this fixes it, I will cut 0.5b81 with the fixes.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
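The patch itself is not included in the archive; purely as an illustration of the fix described above (treating an empty supermon sample as "no nodes" rather than as bad data), here is a minimal Python sketch. The function and exception names are assumptions, not the actual BprocSupermon.py code.

#!/usr/bin/env python
# Hypothetical sketch only: distinguishes "empty but valid" supermon output
# from genuinely malformed output. Names are assumed, not from BprocSupermon.py.

class SupermonError(Exception):
    """Raised only when the supermon stream is genuinely unusable."""
    pass

def parse_supermon_sample(raw):
    """Parse one supermon S-expression sample into a list of per-node records."""
    raw = raw.strip()
    if raw in ("", "()", "( )"):
        # An empty sample is valid: it simply means no nodes are reporting.
        return []
    if not (raw.startswith("(") and raw.endswith(")")):
        # Only malformed data is treated as an error.
        raise SupermonError("malformed supermon data: %r" % raw[:40])
    # ... real parsing of the per-node records would go here ...
    return [{"node": 0, "load": 0.0}]

if __name__ == "__main__":
    print(parse_supermon_sample("( )"))   # [] -- empty cluster, not an error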
From: <er...@he...> - 2003-07-07 19:31:05
On Wed, Jul 02, 2003 at 12:34:12PM -0400, Nicholas Henke wrote:
> On Tue, 2003-07-01 at 18:39, er...@he... wrote:
> >
> > P.S. I've attached a quick port of the 3.2.3 patch to 2.4.20. I
> > think it should work.
>
> Same S@#$t, Different Kernel.

Nic: I have a hunch about what might be going on here. There's some potential
for badness in exit_notify with BProc. kill_pg and is_orphaned_pgrp might end
up setting the process state back to RUNNING instead of ZOMBIE. They could
then get hung up because the ghost is already gone, having exited. I've
attached a revised patch which I think should fix that. Can you try it and
see if it helps?

- Erik
From: Nicholas H. <he...@se...> - 2003-07-03 14:56:29
On Thu, 2003-07-03 at 10:34, Thomas Clausen wrote:
> Hi all,
>
> I'm trying to compile bproc 3.2.5 using kernel 2.4.20. I had this up and
> running, then patched my kernel with the clubmask linux-2.4.17-avenrun.patch
> and linux-2.4.19-mem-swap-syms.patch patches. Now I get the following
> unresolved symbols:
>
> root@betty~ modprobe bproc
> /lib/modules/2.4.20/bproc/bproc.o: unresolved symbol irq_stat_Rsmp_6b40ff0b
> /lib/modules/2.4.20/bproc/bproc.o: insmod /lib/modules/2.4.20/bproc/bproc.o failed
> /lib/modules/2.4.20/bproc/bproc.o: insmod bproc failed

Did you recompile the bproc module after applying the patches? It looks like
it may just be a modversions problem. Apologies if that is the fix -- I should
have noted it in the docs.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Thomas C. <tcl...@we...> - 2003-07-03 14:35:24
Hi all,

I'm trying to compile bproc 3.2.5 using kernel 2.4.20. I had this up and
running, then patched my kernel with the clubmask linux-2.4.17-avenrun.patch
and linux-2.4.19-mem-swap-syms.patch patches. Now I get the following
unresolved symbols:

root@betty~ modprobe bproc
/lib/modules/2.4.20/bproc/bproc.o: unresolved symbol irq_stat_Rsmp_6b40ff0b
/lib/modules/2.4.20/bproc/bproc.o: insmod /lib/modules/2.4.20/bproc/bproc.o failed
/lib/modules/2.4.20/bproc/bproc.o: insmod bproc failed
root@betty~

Any help is appreciated.

Thanks,
Thomas
--
 .^.    Thomas Clausen, post doc
 /V\    Physics Department, Wesleyan University, CT
// \\   Tel 860-685-2018, fax 860-685-2031
/( )\
 ^^-^^  Use Linux
From: Nicholas H. <he...@se...> - 2003-07-02 16:34:25
On Tue, 2003-07-01 at 18:39, er...@he... wrote:
>
> P.S. I've attached a quick port of the 3.2.3 patch to 2.4.20. I
> think it should work.

Same S@#$t, Different Kernel.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Nicholas H. <he...@se...> - 2003-07-02 14:37:28
On Tue, 2003-07-01 at 18:39, er...@he... wrote:
> I think user-land back-traces are probably useless since this is some
> kind of weird kernel-land problem - and judging by the message
> traces you've sent me before, the procs are getting caught somewhere
> in exit (i.e. signal received and *trying* to exit).

Ahhh.. that would make sense.

> It doesn't look like much changed to me between 2.4.18 and 2.4.19, but
> some of the process tree handling code in exit did. The examples
> you sent me a while back all show several threads/processes being
> killed at once. I have a sneaking suspicion that this is somehow a
> race related to many things exiting and getting re-parented at the
> same time.

Ew -- and that is my official opinion of that.

> I have no idea how that's getting hung up but maybe we can determine
> if it's really such a race or not. To make a long story short, can
> you try the following:

Sure -- I have attached a text file with the results -- slightly more
readable than limiting it to 80 chars in email.

> Kill the threads one at a time and see if they still get hung up in that
> weird state. Half a second in between kills should be more than
> enough. Then maybe bottom->top or top->bottom might be interesting.

Basically:
top->bottom: screwed.
bottom->top + sleep: ok.
bottom->top + no sleep: screwed.

> I apologize if I've asked this before: when the threads are hung, does
> the system seem healthy otherwise? Specifically, no problems creating
> or killing other processes?

Yes it does -- I have no problems ssh'ing or bpsh'ing in and running anything.

> P.S. I've attached a quick port of the 3.2.3 patch to 2.4.20. I
> think it should work.

Thanks! I will see what this produces as well.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
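The staggered-kill experiment Erik suggests (send SIGKILL to the hung threads one at a time, with a short pause, in either order) is easy to script; here is a minimal Python sketch purely for illustration. The PID list is a placeholder taken from a ps -jxf listing like the one later in this thread, not anything prescribed by the thread.

#!/usr/bin/env python
# Sketch of the kill-ordering experiment: SIGKILL each PID in turn, optionally
# sleeping between kills, top->bottom or bottom->top. PIDs below are placeholders.

import errno, os, signal, time

def staggered_kill(pids, delay=0.5, bottom_up=True):
    """Kill each pid in turn, pausing `delay` seconds between signals."""
    order = list(pids)
    if bottom_up:
        order.reverse()                 # deepest child first
    for pid in order:
        try:
            os.kill(pid, signal.SIGKILL)
            print("sent SIGKILL to %d" % pid)
        except OSError as e:
            if e.errno == errno.ESRCH:
                print("%d already gone" % pid)   # process exited on its own
            else:
                raise
        time.sleep(delay)

if __name__ == "__main__":
    # Parent-to-child chain, e.g. the rpsblast thread group from the ps output.
    staggered_kill([17152, 17153, 17154], delay=0.5, bottom_up=True)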
From: <er...@he...> - 2003-07-01 22:48:29
On Tue, Jul 01, 2003 at 04:03:28PM -0400, Nicholas Henke wrote:
> Ok -- So I have managed to find the change in versions that isolates the
> problem; unfortunately, it is a kernel version change that triggers it,
> not a bproc one.
>
> FYI -- The working combination is 2.4.18 patched for bproc 3.2.3 -- I
> used the diff in the patches to backport the 2.4.19 patch for 3.2.3 to
> 2.4.18.
>
> The 'bad' combination is 2.4.19 with bproc 3.2.3.
>
> So, the behavior that I am seeing now is that a program is bpsh'd to a
> node, where it uses pthreads to create a few threads to do the work. At
> some point the threads hang, and it takes a 'kill -9' to kill them.
> Most of the time this will work, but I have noticed that I will have to
> go to the node and 'kill -9' them there for the process to die all of
> the way; if not, and I kill -9 from the front-end, the processes will be
> removed from the front-end ps output, but when I ssh to the remote node,
> it is still there and needs another kill -9 to kill it. There is also
> the case where the process on the remote node just refuses to die --
> kill -9 will not pull it out of wherever it is stuck.
>
> What else can I provide? Would it be possible to get a patch for bproc
> 3.2.3 for kernel 2.4.20 to see if I get the same behavior there?
>
> Here is a traceback for when the threads hang. This is the same traceback
> as when the process ignores the kill -9.

I think user-land back-traces are probably useless since this is some kind of
weird kernel-land problem - and judging by the message traces you've sent me
before, the procs are getting caught somewhere in exit (i.e. signal received
and *trying* to exit).

It doesn't look like much changed to me between 2.4.18 and 2.4.19, but some of
the process tree handling code in exit did. The examples you sent me a while
back all show several threads/processes being killed at once. I have a
sneaking suspicion that this is somehow a race related to many things exiting
and getting re-parented at the same time.

I have no idea how that's getting hung up, but maybe we can determine whether
it's really such a race or not. To make a long story short, can you try the
following:

Kill the threads one at a time and see if they still get hung up in that weird
state. Half a second in between kills should be more than enough. Then maybe
bottom->top or top->bottom ordering might be interesting.

I apologize if I've asked this before: when the threads are hung, does the
system seem healthy otherwise? Specifically, no problems creating or killing
other processes?

- Erik

P.S. I've attached a quick port of the 3.2.3 patch to 2.4.20. I think it
should work.
From: Nicholas H. <he...@se...> - 2003-07-01 20:44:05
On Tue, 2003-07-01 at 11:05, er...@he... wrote:
> I believe clone works. Most of the interesting stuff with clone is
> local to the node and BProc doesn't get involved at all. So, in
> theory, it should be possible to make Java work.

Ok

> I think there are two things which you are likely to have trouble with:
>
> 1 - Some of the thread group stuff (CLONE_THREAD) may not work. This
>     stuff has been kind of fluid in the 2.4.x kernels so it seems
>     unlikely that many things use it.

Why does it seem likely that Java uses it then --- friggin' Java!

> 2 - You cannot migrate a multi-threaded task. Some of the guys at LBL
>     are working on some extensions to VMADump to handle multi-threaded
>     tasks for some checkpointing work they're doing, but none of this
>     has been combined with BProc at this point. BProc would also have
>     to become aware of these situations.

That would be very cool.

> Migration will end up creating copies of the program. Also, on
> x86, vmadump isn't aware of funky LDT stuff which will also hamper
> migration. Note that this doesn't mean you can't bpsh a
> multi-threaded program.
>
> The other possible funny bit that you're likely to run into is that
> fork/clone is much slower than normal because it involves the front
> end. This could lead to new/interesting races or just poor
> performance in apps that create/clean-up threads a lot.
>
> In terms of what needs to be done, that depends entirely on what
> you're trying to run. I've done some simple pthreads things on nodes
> w/o problems. The first place to look is probably strace output of a
> program that fails. Then try and figure out how what the app is
> seeing differs from what it's expecting.

We have several programs that use pthreads here as well -- and they seem to
run fine (apart from the sigsuspend issue in 2.4.19); it is just that java
seems a bit confused -- I will put together a complete bug report, and I guess
we can go from there.

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Nicholas H. <he...@se...> - 2003-07-01 20:09:23
Ok -- So I have managed to find the change in versions that isolates the
problem; unfortunately, it is a kernel version change that triggers it, not a
bproc one.

FYI -- The working combination is 2.4.18 patched for bproc 3.2.3 -- I used the
diff in the patches to backport the 2.4.19 patch for 3.2.3 to 2.4.18.

The 'bad' combination is 2.4.19 with bproc 3.2.3.

So, the behavior that I am seeing now is that a program is bpsh'd to a node,
where it uses pthreads to create a few threads to do the work. At some point
the threads hang, and it takes a 'kill -9' to kill them. Most of the time this
will work, but I have noticed that I will have to go to the node and 'kill -9'
them there for the process to die all of the way; if not, and I kill -9 from
the front-end, the processes will be removed from the front-end ps output, but
when I ssh to the remote node, it is still there and needs another kill -9 to
kill it. There is also the case where the process on the remote node just
refuses to die -- kill -9 will not pull it out of wherever it is stuck.

What else can I provide? Would it be possible to get a patch for bproc 3.2.3
for kernel 2.4.20 to see if I get the same behavior there?

Here is a traceback for when the threads hang. This is the same traceback as
when the process ignores the kill -9.

[root@test6 root]# gdb genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast 17154
GNU gdb Red Hat Linux (5.2-2)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast: No such file or directory.
Attaching to process 17154
Reading symbols from /mnt/io1/genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast...done.
Reading symbols from /lib/i686/libm.so.6...done.

[henken@test6 henken]$ ps -jxf
 PPID   PID  PGID   SID TTY   TPGID STAT   UID   TIME COMMAND
17156 17157 17157 17157 pts/0 17200 S    27659   0:00 -bash
17157 17200 17200 17157 pts/0 17200 R    27659   0:00 ps -jxf
  568 17024   568   568 ?        -1 S    27659   0:00 /bin/sh /proc/self/fd/3 /scratch/user/henken/slot_1/result /genomics/share/testsuite/tests/blastSim
17024 17025   568   568 ?        -1 S    27659   0:00 /usr/bin/perl /genomics/share/testsuite/test_software/gus/gushome_06-05-03/bin/blastSimilarity --bl
17025 17151   568   568 ?        -1 S    27659   0:00  \_ sh -c /genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast -d /scratch/user/he
17151 17152   568   568 ?        -1 S    27659   0:00      \_ /genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast -d /scratch/user/henk
17152 17153   568   568 ?        -1 S    27659   0:00          \_ /genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast -d /scratch/user/
17153 17154   568   568 ?        -1 S    27659   0:00              \_ /genomics/share/testsuite/test_software/ncbiblast_2000-10-31/rpsblast -d /scratch/u

[henken@test6 henken]$ strace -p 17154
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
[henken@test6 henken]$ su -
Password:
[root@test6 root]# strace -p 17154
[root@test6 root]# strace -p 17153
getppid()                               = 511
poll([{fd=7, events=POLLIN}], 1, 2000)  = 0
getppid()                               = 511
poll( <unfinished ...>
[root@test6 root]# strace -p 17154

Loaded symbols for /lib/i686/libm.so.6
Reading symbols from /lib/i686/libpthread.so.0...done.
[New Thread 1024 (LWP 511)]
Error while reading shared library symbols:
Can't attach LWP 511: No such process
Reading symbols from /lib/i686/libc.so.6...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
0x40080bb5 in __sigsuspend (set=0x597697bc) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
45      ../sysdeps/unix/sysv/linux/sigsuspend.c: No such file or directory.
        in ../sysdeps/unix/sysv/linux/sigsuspend.c
(gdb) bt
#0  0x40080bb5 in __sigsuspend (set=0x597697bc) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400461d9 in __pthread_wait_for_restart_signal (self=0x59769be0) at pthread.c:971
#2  0x40047f49 in __pthread_alt_lock (lock=0x8297ab0, self=0x0) at restart.h:34
#3  0x40044d26 in __pthread_mutex_lock (mutex=0x8297aa0) at mutex.c:120
#4  0x0804b7aa in s_MutexLock ()
#5  0x0804b83d in NlmMutexLockEx ()
#6  0x0817794c in Nlm_GetAppParam ()
#7  0x0817583f in GetAppErrInfo ()
#8  0x08174ba1 in Nlm_ErrSetLogfile ()
#9  0x0804abfa in NlmThreadWrapper ()
#10 0x40043c6f in pthread_start_thread (arg=0x59769be0) at manager.c:284

--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: <er...@he...> - 2003-07-01 15:14:40
On Tue, Jul 01, 2003 at 09:31:23AM -0400, Nicholas Henke wrote:
> Hey Erik~
> I am again faced with pesky Java users who are wanting to use bpsh to
> farm out their tasks. I am running low on ammunition to kill them, so I
> figured I would take a stab at getting the 'clone' system call working
> in bproc. First -- is this going to be possible? Second -- can you give
> me a rough overview of what needs to be done?

I believe clone works. Most of the interesting stuff with clone is local to
the node and BProc doesn't get involved at all. So, in theory, it should be
possible to make Java work.

I think there are two things which you are likely to have trouble with:

1 - Some of the thread group stuff (CLONE_THREAD) may not work. This stuff
    has been kind of fluid in the 2.4.x kernels so it seems unlikely that
    many things use it.

2 - You cannot migrate a multi-threaded task. Some of the guys at LBL are
    working on some extensions to VMADump to handle multi-threaded tasks for
    some checkpointing work they're doing, but none of this has been combined
    with BProc at this point. BProc would also have to become aware of these
    situations.

    Migration will end up creating copies of the program. Also, on x86,
    vmadump isn't aware of funky LDT stuff, which will also hamper migration.
    Note that this doesn't mean you can't bpsh a multi-threaded program.

The other possible funny bit that you're likely to run into is that fork/clone
is much slower than normal because it involves the front end. This could lead
to new/interesting races or just poor performance in apps that create/clean-up
threads a lot.

In terms of what needs to be done, that depends entirely on what you're trying
to run. I've done some simple pthreads things on nodes w/o problems. The first
place to look is probably strace output of a program that fails. Then try and
figure out how what the app is seeing differs from what it's expecting.

- Erik
From: Nicholas H. <he...@se...> - 2003-07-01 13:31:37
Hey Erik~

I am again faced with pesky Java users who are wanting to use bpsh to farm out
their tasks. I am running low on ammunition to kill them, so I figured I would
take a stab at getting the 'clone' system call working in bproc. First -- is
this going to be possible? Second -- can you give me a rough overview of what
needs to be done?

Thanks!
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: <ha...@no...> - 2003-06-27 08:59:26
> I have a simple python batch/queuing system that up until now has worked for
> me. I looked at sge+bproc - but as far as I can tell you have to manually
> reconfigure sge when nodes become unavailable. It can probably be set up to
> automatically recognize cluster reconfigurations but it's not obvious to me
> how to do it.

It should be easy to do dynamic configuration of SGE when nodes become
available/unavailable. With my approach, where a node looks like a queue on
the master node, one just has to call "qmod -e $N" to enable the queue when
node N becomes available, so this command should probably go at the end of
bproc's node_up script (where N=$1), and "qmod -d $N" to disable the queue
when the node becomes unavailable. I am not sure there is anything like a
node_down script in bproc (I thought there was, but I do not see it in my
cluster just now); if there is, it should start with "qmod -d $N".

We could also test the node's sanity in SGE's prolog and epilog scripts (run
before and after the job) and call "qmod -d $N" there when needed. (The epilog
script could even re-schedule the job when the node died while running the
job, if the job is re-runnable.)

Another simple approach is to run a script doing "bpstat" and then
"qmod -d ..." every 30 seconds or so (on the master).

If all the jobs are written as re-runnable (can be aborted at any moment and
run again on a different node; this usually means that the job does not change
any of its input files), it should be easy to create a node-fault-tolerant
system.

All this is untested, please let me know if you try it.

Best Regards
Vaclav Hanzl
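As a rough illustration of the last approach described above (poll bpstat and enable/disable SGE queues accordingly), here is a minimal Python sketch. The bpstat column layout and the queue-per-node naming passed to qmod are assumptions, not taken from the thread; adjust them to the actual SGE setup.

#!/usr/bin/env python
# Sketch of the "poll bpstat every 30 seconds" idea. Assumes bpstat prints
# "<node> <state> ..." per line and that SGE queues are named after node numbers.

import os, time

POLL_INTERVAL = 30

def node_states():
    """Return {node_number: state} parsed from `bpstat` output."""
    states = {}
    for line in os.popen("bpstat").readlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit():
            states[int(fields[0])] = fields[1]     # e.g. 'up', 'down', 'boot'
    return states

def sync_sge(states):
    """Enable SGE queues for nodes that are up, disable the rest."""
    for node, state in states.items():
        if state == "up":
            os.system("qmod -e %d" % node)
        else:
            os.system("qmod -d %d" % node)

if __name__ == "__main__":
    while True:
        sync_sge(node_states())
        time.sleep(POLL_INTERVAL)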
From: Chong C. <cc...@pl...> - 2003-06-26 22:51:38
The LSF batch system has an integration with bproc. Besides serial jobs, it
also supports parallel jobs natively (mpirun). Additionally, it can handle the
node-unavailable issue automatically -- no need to reconfigure the system. The
mechanism is transparent to the end user. From the user's point of view, when
a node is down, the batch system just decreases the number of available slots.

Chong

-----Original Message-----
From: Thomas Clausen [mailto:tcl...@we...]
Sent: Thursday, June 26, 2003 2:19 PM
To: bpr...@li...
Subject: [BProc] Re: is this a good candidate for bproc?

Hi Russell,

bproc works great. I use it for running batch jobs on a 90 cpu cluster with 20
dual CPU and the rest single CPU machines. I have an occasional mpi job but
mostly it's single processes.

I have a simple python batch/queuing system that up until now has worked for
me. I looked at sge+bproc - but as far as I can tell you have to manually
reconfigure sge when nodes become unavailable. It can probably be set up to
automatically recognize cluster reconfigurations but it's not obvious to me
how to do it.

Thomas
--
 .^.    Thomas Clausen, post doc
 /V\    Physics Department, Wesleyan University, CT
// \\   Tel 860-685-2018, fax 860-685-2031
/( )\
 ^^-^^  Use Linux
From: Thomas C. <tcl...@we...> - 2003-06-26 18:19:57
Hi Russell,

bproc works great. I use it for running batch jobs on a 90 cpu cluster with 20
dual CPU and the rest single CPU machines. I have an occasional mpi job but
mostly it's single processes.

I have a simple python batch/queuing system that up until now has worked for
me. I looked at sge+bproc - but as far as I can tell you have to manually
reconfigure sge when nodes become unavailable. It can probably be set up to
automatically recognize cluster reconfigurations but it's not obvious to me
how to do it.

Thomas
--
 .^.    Thomas Clausen, post doc
 /V\    Physics Department, Wesleyan University, CT
// \\   Tel 860-685-2018, fax 860-685-2031
/( )\
 ^^-^^  Use Linux
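The "simple python batch/queuing system" mentioned above is not shown anywhere in the thread; purely as a hypothetical sketch of that kind of setup, here is a minimal Python farmer that polls bpstat for 'up' nodes and dispatches one pending command per node with bpsh. The bpstat parsing and the one-job-per-node policy are assumptions, not Thomas's actual code.

#!/usr/bin/env python
# Hypothetical minimal bproc job farmer: not the system described above, just
# an illustration. Assumes bpstat prints "<node> <state> ..." per line.

import os, time

def up_nodes():
    """Node numbers that bpstat currently reports as 'up'."""
    nodes = []
    for line in os.popen("bpstat").readlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == "up":
            nodes.append(int(fields[0]))
    return nodes

def run_batch(commands):
    """Run each command (an argv list) on some up node, one job per node."""
    pending = list(commands)
    busy = {}                                   # child pid -> node number
    while pending or busy:
        # Reap any finished jobs without blocking.
        while busy:
            pid, status = os.waitpid(-1, os.WNOHANG)
            if pid == 0:
                break
            busy.pop(pid, None)
        # Dispatch new jobs to idle 'up' nodes.
        idle = [n for n in up_nodes() if n not in busy.values()]
        while pending and idle:
            node = idle.pop(0)
            argv = ["bpsh", str(node)] + pending.pop(0)
            pid = os.spawnvp(os.P_NOWAIT, "bpsh", argv)
            busy[pid] = node
        time.sleep(5)

if __name__ == "__main__":
    run_batch([["hostname"], ["uptime"]])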
From: <er...@he...> - 2003-06-26 17:12:21
On Thu, Jun 26, 2003 at 12:50:43PM -0400, Nicholas Henke wrote:
> When running vfork, it appears that only the first node in the nodelist
> has the stdio redirected correctly. The rest of the nodes' output appears
> to go to /dev/null. Is this the expected behavior?
>
> BTW -- I am running 3.2.0

Umm... Yeah. It's certainly a quirk and it should probably be fixed, but
that's normal. Here's what's going on:

Normally, when you move to a node, bproc will set up a socket connection
between the two processes to move the process information. As a nice hack to
provide basic support for printf, the socket is kept around and attached to
the process's STDOUT and STDERR. This works out because the socket connection
is usually back to the front end, where bproc can do some mostly sane IO
forwarding.

In the vrfork case, only the first process gets the process image from the
front end. The rest of the processes get their process image from one of the
previous processes. This adds parallelism which makes it go faster and blah
blah blah... The upshot is that the sockets which were used for the built-in
forwarding don't go back to the front end anymore, so it doesn't work.

If you look at the bpsh source, you'll see that it provides explicit
instructions to vexecmove on how to wire up STDIN, STDOUT, STDERR. bpsh itself
becomes the IO forwarder in that case. This makes things like bpsh much more
complicated than they might otherwise be. On the bright side, bpsh is a MUCH
better IO forwarder than what's built into BProc at this point.

I've been wanting to get rid of the IO forwarding daemon in BProc since the
very first version. It's one of those things that's lingered because it's a
nice crutch which does an ok job for simple prints.

- Erik
From: Nicholas H. <he...@se...> - 2003-06-26 16:53:30
When running vfork, it appears that only the first node in the nodelist has
the stdio redirected correctly. The rest of the nodes' output appears to go to
/dev/null. Is this the expected behavior?

BTW -- I am running 3.2.0

Cheers!
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: <er...@he...> - 2003-06-26 14:50:18
On Wed, Jun 25, 2003 at 09:41:43AM -0400, gor...@ph... wrote:
> Has anyone tried bproc on 2.4.21 yet?

The port should be straightforward. It looks like they just renamed
get_empty_inode to new_inode. I'm working some memory handling kinks out of a
new patch now.

- Erik
From: Dale H. <ro...@ma...> - 2003-06-25 23:22:43
On Wed, Jun 25, 2003 at 05:39:54PM -0400, Gregory Shakhnarovich elucidated:
>
> Hi,
>
> We are working on our new BProc-based cluster. The initial setup has been
> nice and smooth, but one major hole we still have is the scheduler/queuing
> system (which is quite important for the intended use of the cluster).

FWIW... you might check out:

http://noel.feld.cvut.cz/magi/sge+bproc.html

SGE and bproc integrated.

--
Dale Harris  ro...@ma...
/.-)
From: Gregory S. <gr...@ai...> - 2003-06-25 22:37:41
Hi,

I failed to mention a detail that may be relevant: we are running a Debian
2.4.18 kernel on our cluster.

Thanks,
--
Greg Shakhnarovich
AI Lab, MIT  NE43-V611
Cambridge, MA 02139
tel (617) 253-8170      fax (617) 258-6287
From: Chong C. <cc...@pl...> - 2003-06-25 22:30:36
Hi,

Just to let you know, we have an LSF integration with bproc. It provides the
full LSF scheduler ability, including fairshare, preemption, ...

Chong

-----Original Message-----
From: Gregory Shakhnarovich [mailto:gr...@ai...]
Sent: Wednesday, June 25, 2003 5:40 PM
To: bpr...@li...
Subject: [BProc] Queueing & scheduling

Hi,

We are working on our new BProc-based cluster. The initial setup has been nice
and smooth, but one major hole we still have is the scheduler/queuing system
(which is quite important for the intended use of the cluster).

I am aware of Clubmask, but as people here have pointed out, the need to do a
full node install makes that solution very undesirable. It looks like the only
(other) immediately available solution is BJS. So, I am trying to figure out
the following (and will appreciate any tips):

1) We have 32 dual-CPU nodes, and would like them to be treated as 64 nodes
for scheduling purposes. How can this be conveyed to BJS?

2) How do we tell BJS not to include the head node in the pool? A related
question - what is the semantics of the indices in the 'nodes' directive?

3) Has anyone implemented any policy modules in addition to 'simple' and
'shared', which could be shared with us?

4) Is there any way to introduce priorities with the existing policies? What
we want ideally is to have 2-3 priority levels (low/med/high) so that jobs get
scheduled and suspended/restarted dynamically based on priority, in addition
to node availability. I.e., if all the nodes are taken by a job L with low
priority, and a job H with high priority arrives, then L is suspended until H
is done. (*Ideally* there would be some anti-starvation mechanism as well,
likely upgrading L after it's been unfinished for a while, but for now we
would be happy without it.)

I will much appreciate any suggestions on how this could be accomplished with
Bproc.

Thanks,
--
Greg Shakhnarovich
AI Lab, MIT  NE43-V611
Cambridge, MA 02139
tel (617) 253-8170      fax (617) 258-6287
From: Nicholas H. <he...@se...> - 2003-06-25 22:12:36
On Wed, 2003-06-25 at 17:39, Gregory Shakhnarovich wrote:
> Hi,
>
> We are working on our new BProc-based cluster. The initial setup has been
> nice and smooth, but one major hole we still have is the scheduler/queuing
> system (which is quite important for the intended use of the cluster).
>
> I am aware of Clubmask, but as people here have pointed out, the need to
> do a full node install makes that solution very undesirable. It looks like
> the only (other) immediately available solution is BJS. So, I am trying to
> figure out the following (and will appreciate any tips):

<clubmask author> Not anymore. I am working on a release now that does away
with Clubmask as an entire cluster installation/management/feed-your-dog
solution. I think it would be pretty easy to put Clubmask on a Clustermatic
cluster, as clubmask is just a simple RPM now. The only requirements we have
for the nodes are that they run a custom mond (from supermon), which can just
be started from node_up.

That said, the release is honestly a month off, as I have a ton of
documentation to write, but the software itself is currently running and
working fine on 3 separate clusters, and we are installing the rest of our
clusters with it during July.

I would be more than happy to try and get you running Clubmask on your
Clustermatic setup, and I will be working with a Clustermatic cluster here in
the near future. Feel free to email me if you would be willing to put in a bit
of leg work. The issues I see cropping up are:

1) Need to patch the kernel with a few symbol exports to make supermon happy.
We can do without this, but you will not get the supermon2ganglia translator
functionality. (Supermon2ganglia is a 'fake' gmond that translates supermon
data into ganglia XML so that you can view the data using the standard Ganglia
web interface. See http://www.liniac.upenn.edu/ganglia for a live example.)
This would be pretty easy, as we have all of the SRPMs and patches that should
be necessary.

2) Recompiling the ZODB, IndexedCatalog, Clubmask, Python 2.2.2, etc. SRPMs
for your target platform. Not really an issue, but it would need to be done.

3) Sanity checking -- well, I guess this goes for any software.

Now that I am done with the scary stuff :P, here are a few questions for you:

1) Would you need ssh access or control to the nodes?
2) What platform would you be running on? RH 9? 8?
3) Timeframe?

Cheers!
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Gregory S. <gr...@ai...> - 2003-06-25 21:40:07
Hi,

We are working on our new BProc-based cluster. The initial setup has been nice
and smooth, but one major hole we still have is the scheduler/queuing system
(which is quite important for the intended use of the cluster).

I am aware of Clubmask, but as people here have pointed out, the need to do a
full node install makes that solution very undesirable. It looks like the only
(other) immediately available solution is BJS. So, I am trying to figure out
the following (and will appreciate any tips):

1) We have 32 dual-CPU nodes, and would like them to be treated as 64 nodes
for scheduling purposes. How can this be conveyed to BJS?

2) How do we tell BJS not to include the head node in the pool? A related
question - what is the semantics of the indices in the 'nodes' directive?

3) Has anyone implemented any policy modules in addition to 'simple' and
'shared', which could be shared with us?

4) Is there any way to introduce priorities with the existing policies? What
we want ideally is to have 2-3 priority levels (low/med/high) so that jobs get
scheduled and suspended/restarted dynamically based on priority, in addition
to node availability. I.e., if all the nodes are taken by a job L with low
priority, and a job H with high priority arrives, then L is suspended until H
is done. (*Ideally* there would be some anti-starvation mechanism as well,
likely upgrading L after it's been unfinished for a while, but for now we
would be happy without it.)

I will much appreciate any suggestions on how this could be accomplished with
Bproc.

Thanks,
--
Greg Shakhnarovich
AI Lab, MIT  NE43-V611
Cambridge, MA 02139
tel (617) 253-8170      fax (617) 258-6287
From: <gor...@ph...> - 2003-06-25 13:42:09
Some of the oopses people have been seeing in the 2.4.20 kernels may have been
due to an RPC race condition. Here's a kernel thread on the topic, and a
couple of ksymoops from my systems:

http://www.ussg.iu.edu/hypermail/linux/kernel/0302.0/1146.html

These oopses are from systems running 3.2.5, but they also occurred under
3.2.4. They appear to be precipitated by simultaneous spikes in CPU and
network (NFS?) load, which happens regularly on compute clusters. I've
contacted the original poster, who hasn't seen the problem since upgrading to
2.4.21-rc6. Has anyone tried bproc on 2.4.21 yet?

======================================================================================
ksymoops 2.4.8 on i686 2.4.20.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20/ (default)
     -m /boot/System.map (specified)

Unable to handle kernel NULL pointer dereference at virtual address 00000058
c0303206
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c0303206>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 0000002c   ebx: 00000000   ecx: 00000008   edx: 00000001
esi: f7611078   edi: e7bc2480   ebp: f7611000   esp: c2837edc
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c2837000)
Stack: e7bc2480 c03031c0 00000020 00000000 c0304442 e7bc2480 e7bc24d4 c03043c0
       c0123b47 e7bc2480 c2837f0c 00000001 c882a0e4 f7bfbd78 00000000 00000001
       00000020 00000000 c011fafb c0433660 c011f9a1 00000000 00000001 c04095e0
Call Trace: [<c03031c0>] [<c0304442>] [<c03043c0>] [<c0123b47>] [<c011fafb>]
            [<c011f9a1>] [<c011f72b>] [<c010a8ad>] [<c0106e60>] [<c0106e60>]
            [<c0106e60>] [<c0106e60>] [<c0106e8c>] [<c0106f12>] [<c011af6b>]
Code: 8b 40 2c 83 f8 09 0f 4c c8 b8 01 00 00 00 d3 e0 39 c2 7d 16

>>EIP; c0303206 <xprt_timer+46/e0>   <=====
>>esi; f7611078 <_end+371a0bfc/384b3b84>
>>edi; e7bc2480 <_end+27752004/384b3b84>
>>ebp; f7611000 <_end+371a0b84/384b3b84>
>>esp; c2837edc <_end+23c7a60/384b3b84>

Trace; c03031c0 <xprt_timer+0/e0>
Trace; c0304442 <rpc_run_timer+82/90>
Trace; c03043c0 <rpc_run_timer+0/90>
Trace; c0123b47 <timer_bh+2b7/3f0>
Trace; c011fafb <bh_action+4b/80>
Trace; c011f9a1 <tasklet_hi_action+61/a0>
Trace; c011f72b <do_softirq+7b/e0>
Trace; c010a8ad <do_IRQ+dd/f0>
Trace; c0106e60 <default_idle+0/40>
Trace; c0106e60 <default_idle+0/40>
Trace; c0106e60 <default_idle+0/40>
Trace; c0106e60 <default_idle+0/40>
Trace; c0106e8c <default_idle+2c/40>
Trace; c0106f12 <cpu_idle+52/70>
Trace; c011af6b <call_console_drivers+eb/100>

Code;  c0303206 <xprt_timer+46/e0>
00000000 <_EIP>:
Code;  c0303206 <xprt_timer+46/e0>   <=====
   0:   8b 40 2c          mov    0x2c(%eax),%eax   <=====
Code;  c0303209 <xprt_timer+49/e0>
   3:   83 f8 09          cmp    $0x9,%eax
Code;  c030320c <xprt_timer+4c/e0>
   6:   0f 4c c8          cmovl  %eax,%ecx
Code;  c030320f <xprt_timer+4f/e0>
   9:   b8 01 00 00 00    mov    $0x1,%eax
Code;  c0303214 <xprt_timer+54/e0>
   e:   d3 e0             shl    %cl,%eax
Code;  c0303216 <xprt_timer+56/e0>
  10:   39 c2             cmp    %eax,%edx
Code;  c0303218 <xprt_timer+58/e0>
  12:   7d 16             jge    2a <_EIP+0x2a> c0303230 <xprt_timer+70/e0>

==============================================================================================
<0>Kernel panic: Aiee, killing interrupt handler!

ksymoops 2.4.8 on i686 2.4.20.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20/ (default)
     -m /boot/System.map (specified)

Unable to handle kernel NULL pointer dereference at virtual address 00000058
c0303206
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c0303206>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010202
eax: 0000002c   ebx: 00000000   ecx: 00000008   edx: 00000001
esi: f75d82b8   edi: f423ee40   ebp: f75d8000   esp: f53f3f24
ds: 0018   es: 0018   ss: 0018
Process blastall (pid: 2246, stackpage=f53f3000)
Stack: f423ee40 c03031c0 00000000 00000000 c0304442 f423ee40 f423ee94 c03043c0
       c0123b47 f423ee40 f53f3f54 00000086 c35a7a98 d901a0e4 00000000 00000001
       00000000 00000000 c011fafb c0433660 c011f9a1 00000000 00000001 c04095e0
Call Trace: [<c03031c0>] [<c0304442>] [<c03043c0>] [<c0123b47>] [<c011fafb>]
            [<c011f9a1>] [<c011f72b>] [<c010a8ad>]
Code: 8b 40 2c 83 f8 09 0f 4c c8 b8 01 00 00 00 d3 e0 39 c2 7d 16

>>EIP; c0303206 <xprt_timer+46/e0>   <=====
>>esi; f75d82b8 <_end+37167e3c/384b3b84>
>>edi; f423ee40 <_end+33dce9c4/384b3b84>
>>ebp; f75d8000 <_end+37167b84/384b3b84>
>>esp; f53f3f24 <_end+34f83aa8/384b3b84>

Trace; c03031c0 <xprt_timer+0/e0>
Trace; c0304442 <rpc_run_timer+82/90>
Trace; c03043c0 <rpc_run_timer+0/90>
Trace; c0123b47 <timer_bh+2b7/3f0>
Trace; c011fafb <bh_action+4b/80>
Trace; c011f9a1 <tasklet_hi_action+61/a0>
Trace; c011f72b <do_softirq+7b/e0>
Trace; c010a8ad <do_IRQ+dd/f0>

Code;  c0303206 <xprt_timer+46/e0>
00000000 <_EIP>:
Code;  c0303206 <xprt_timer+46/e0>   <=====
   0:   8b 40 2c          mov    0x2c(%eax),%eax   <=====
Code;  c0303209 <xprt_timer+49/e0>
   3:   83 f8 09          cmp    $0x9,%eax
Code;  c030320c <xprt_timer+4c/e0>
   6:   0f 4c c8          cmovl  %eax,%ecx
Code;  c030320f <xprt_timer+4f/e0>
   9:   b8 01 00 00 00    mov    $0x1,%eax
Code;  c0303214 <xprt_timer+54/e0>
   e:   d3 e0             shl    %cl,%eax
Code;  c0303216 <xprt_timer+56/e0>
  10:   39 c2             cmp    %eax,%edx
Code;  c0303218 <xprt_timer+58/e0>
  12:   7d 16             jge    2a <_EIP+0x2a> c0303230 <xprt_timer+70/e0>

<0>Kernel panic: Aiee, killing interrupt handler!
From: Nicholas H. <he...@se...> - 2003-06-24 12:04:16
On Tue, 2003-06-24 at 07:55, Nicholas Henke wrote:
>
> You may wish to try LAM/MPI ( www.lam-mpi.org ) The beta releases of
> 7.0, and 7.0 when it is release finally all have very nice support for
> Bproc.

Wow -- one should really read their email before hitting 'Send' :) Apparently
English is not my best thing this early in the morning. What I meant to say
was that the 7.0 branch of LAM/MPI has very nice bproc support, including such
features as marking the bpmaster node as 'no-schedule' so MPI processes are
not automatically scheduled on the front end machine.

Cheers!
Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania