| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2001 | | | | | | | | | | 25 | | 22 |
| 2002 | 13 | 22 | 39 | 10 | 26 | 23 | 38 | 20 | 27 | 76 | 32 | 11 |
| 2003 | 8 | 23 | 12 | 39 | 1 | 48 | 35 | 15 | 60 | 27 | 9 | 32 |
| 2004 | 8 | 16 | 40 | 25 | 12 | 33 | 49 | 39 | 26 | 47 | 26 | 36 |
| 2005 | 29 | 15 | 22 | 1 | 8 | 32 | 11 | 17 | 9 | 7 | 15 | |
From: Dale H. <ro...@ma...> - 2003-06-09 22:37:51

On Thu, May 29, 2003 at 03:25:39PM -0700, Michael Madore elucidated:
> Hi,
>
> On the Maui homepage there is mention of support for bproc. Specifically:
>
> BProc Scheduling API - BProc (Scyld)
>
> Is this a feature present in the standard bproc, or is it an extension
> provided by Scyld?

Has anyone tried Sun Grid Engine and bproc? GridEngine can already use Maui. Or is that too many layers of indirection? I was under the impression that bproc was a queuing system, per se.

--
Dale Harris
ro...@ma...
/.-)
From: Joshua A. <lu...@li...> - 2003-06-09 20:50:06

An even easier method is to run rpm -tb on the .tar.gz file directly. Note that on newer versions of RH you may need to use the rpmbuild command instead of rpm.

    # rpmbuild -tb cmtools-1.2.tar.gz
    # rpm -Uvh /usr/src/redhat/RPMS/i386/cmtools*rpm

Josh

On Mon, Jun 09, 2003 at 12:08:58PM -0700, Michael Madore wrote:
> Hi Larry,
>
> I have done this recently:
>
> 1) Copy the .spec file to /usr/src/redhat/SPECS
> 2) Copy the tar file to /usr/src/redhat/SOURCES
> 3) cd /usr/src/redhat/SPECS
> 4) rpm -ba cmtools.spec
> 5) rpm -Uvh ../RPMS/i386/cmtools-*1.2.1.i386.rpm
>
> You may have to modify the spec file slightly to reflect the new version
> number. The same process can be applied to the other Clustermatic packages.
>
> Mike
>
> Larry Baker wrote:
>
> >I want to apply the updates from bproc.sourceforge.net to my
> >Clustermatic 3 CD installation. Newer versions of beoboot, bproc, and
> >cmtools are available. However, they are not packaged as RPMs.
> >
> >I un"tar"ed cmtools and did a "make", which worked. Now I want to get
> >the files moved where they belong (with soft links setup) and have the
> >RPM database updated to reflect the new version. I see a .spec file
> >that looks like it has the information -- how can I make use of it?
> >
> >1. What commands should be used to apply the updates to an existing
> >Clustermatic 3 installation?
> >
> >2. What order should the updates be applied and tested?
> >
> >Larry Baker
> >US Geological Survey
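The two routes in this thread (rpm -ba on a spec file, or -tb on the tarball) differ mainly in which front-end command is available. The following is a minimal sketch, not a tested upgrade procedure: it only prints the build command rather than running it, and the function names `pick_rpm_builder` and `build_command` are made up for illustration.

```shell
#!/bin/sh
# Pick the build front end: newer Red Hat releases ship rpmbuild,
# older ones build through rpm itself.
pick_rpm_builder() {
    if command -v rpmbuild >/dev/null 2>&1; then
        echo rpmbuild
    else
        echo rpm
    fi
}

# Print (do not run) the command that builds binary RPMs from a tarball
# that carries its own .spec file.
build_command() {
    echo "$(pick_rpm_builder) -tb $1"
}

build_command cmtools-1.2.tar.gz
```

Printing first makes it easy to check the command before running it as root; the resulting package would then be installed with rpm -Uvh as shown in the message above.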
From: Michael M. <mm...@as...> - 2003-06-09 19:10:00

Hi Larry,

I have done this recently:

1) Copy the .spec file to /usr/src/redhat/SPECS
2) Copy the tar file to /usr/src/redhat/SOURCES
3) cd /usr/src/redhat/SPECS
4) rpm -ba cmtools.spec
5) rpm -Uvh ../RPMS/i386/cmtools-*1.2.1.i386.rpm

You may have to modify the spec file slightly to reflect the new version number. The same process can be applied to the other Clustermatic packages.

Mike

Larry Baker wrote:

> I want to apply the updates from bproc.sourceforge.net to my
> Clustermatic 3 CD installation. Newer versions of beoboot, bproc, and
> cmtools are available. However, they are not packaged as RPMs.
>
> I un"tar"ed cmtools and did a "make", which worked. Now I want to get
> the files moved where they belong (with soft links setup) and have the
> RPM database updated to reflect the new version. I see a .spec file
> that looks like it has the information -- how can I make use of it?
>
> 1. What commands should be used to apply the updates to an existing
> Clustermatic 3 installation?
>
> 2. What order should the updates be applied and tested?
>
> Larry Baker
> US Geological Survey
From: Larry B. <ba...@us...> - 2003-06-09 18:49:07

I want to apply the updates from bproc.sourceforge.net to my Clustermatic 3 CD installation. Newer versions of beoboot, bproc, and cmtools are available. However, they are not packaged as RPMs.

I un"tar"ed cmtools and did a "make", which worked. Now I want to get the files moved where they belong (with soft links setup) and have the RPM database updated to reflect the new version. I see a .spec file that looks like it has the information -- how can I make use of it?

1. What commands should be used to apply the updates to an existing Clustermatic 3 installation?

2. What order should the updates be applied and tested?

Larry Baker
US Geological Survey
From: steven j. <py...@li...> - 2003-06-07 16:14:59

Greetings,

ksyscall is no longer a separate module. It gets linked in with bproc now.

G'day,
sjames

On Fri, 6 Jun 2003, cscserver wrote:

> Hello
> When I try to load the ksyscall.o I receive a message
> saying that is impossible to determine the kernel
> version that the module was compiled for.
> I'm using bproc-3.2.5 and linux-2.4.20.
> What should I do?
>
> Thanx
> Rodrigo Schmidt Nürmberg
>
> __________________________________________________________________________
> UOL Software Selection.
> 10 software titles chosen by UOL for you and your family.
> http://www.uol.com.br/selecao
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Etnus, makers of TotalView, The best
> thread debugger on the planet. Designed with thread debugging features
> you've never dreamed of, try TotalView 6 free at www.etnus.com.
> _______________________________________________
> BProc-users mailing list
> BPr...@li...
> https://lists.sourceforge.net/lists/listinfo/bproc-users

--
-------------------------steven james, director of research, linux labs
... ........ ..... ....
230 peachtree st nw ste 2701
the original linux labs
atlanta.ga.us 30303
-since 1995
http://www.linuxlabs.com
office 404.577.7747 fax 404.577.7743
-----------------------------------------------------------------------
From: cscserver <csc...@bo...> - 2003-06-06 19:51:12

> Step by step:
>
>     tar -zxvf bproc-3.2.5.tar.gz -C /root
>     tar -jxvf linux-2.4.20.tar.bz2 -C /usr/src
>     ln -s /usr/src/linux-2.4.20 /usr/src/linux
>     cd /usr/src/linux
>     patch -p1 < /root/bproc-3.2.5/patches/bproc-patch-2.4.20
>     make mrproper
>     make menuconfig
>
> I enabled "Beowulf Distributed Process Space" under General Setup in
> the kernel configuration.
>
>     make dep
>     make bzImage
>     make modules
>     make modules_install
>     cp System.map /boot
>     cp arch/i386/boot/bzImage /boot/vmlinuz-cow
>
> I updated lilo.conf and ran:
>
>     lilo
>
> I booted my new kernel.
>
>     cd /root/bproc-3.2.5
>     make LINUX=/usr/src/linux
>     make LINUX=/usr/src/linux install
>
> When I try to load the ksyscall.o module (insmod kernel/ksyscall.o)
> I receive an error message:
>
>     kernel/ksyscall.o: couldn't find the kernel version the module was compiled for
>
> That is it.
>
> I hope that you can help me.
> Thanx
> Rodrigo Schmidt Nürmberg
>
> > Hi,
> >
> > Can you give more detailed information about the steps you are using to
> > install Bproc?
> >
> > Mike Madore
> >
> > cscserver wrote:
> >
> > >Hello
> > >When I try to load the ksyscall.o I receive a message
> > >saying that is impossible to determine the kernel
> > >version that the module was compiled for.
> > >I'm using bproc-3.2.5 and linux-2.4.20.
> > >What should I do?
> > >
> > >Thanx
> > >Rodrigo Schmidt Nürmberg
From: Michael M. <mm...@as...> - 2003-06-06 19:14:02

Hi,

Can you give more detailed information about the steps you are using to install Bproc?

Mike Madore

cscserver wrote:

>Hello
>When I try to load the ksyscall.o I receive a message
>saying that is impossible to determine the kernel
>version that the module was compiled for.
>I'm using bproc-3.2.5 and linux-2.4.20.
>What should I do?
>
>Thanx
>Rodrigo Schmidt Nürmberg
From: cscserver <csc...@bo...> - 2003-06-06 19:00:05

Hello

When I try to load the ksyscall.o I receive a message saying that is impossible to determine the kernel version that the module was compiled for.
I'm using bproc-3.2.5 and linux-2.4.20.
What should I do?

Thanx
Rodrigo Schmidt Nürmberg
From: Michael M. <mm...@as...> - 2003-06-02 19:45:52

er...@he... wrote:

>On Thu, May 29, 2003 at 03:25:39PM -0700, Michael Madore wrote:
>
>>Hi,
>>
>>On the Maui homepage there is mention of support for bproc. Specifically:
>>
>> BProc Scheduling API - BProc (Scyld)
>>
>>Is this a feature present in the standard bproc, or is it an extension
>>provided by Scyld?
>
>I have no idea. You should probably ask the maui people. The last
>Maui I saw made no mention of BProc.
>
>- Erik

Erik,

It turns out that there is a project called Clubmask (clubmask.sf.net) that has integrated Maui and bproc. I assume they have written a resource manager that talks to Maui. Unfortunately, they are performing a full Linux install to each node in the cluster, which kind of undermines one of the nicest features of bproc (IMO).

Mike
From: <er...@he...> - 2003-06-02 19:33:20

On Thu, May 29, 2003 at 03:25:39PM -0700, Michael Madore wrote:
> Hi,
>
> On the Maui homepage there is mention of support for bproc. Specifically:
>
> BProc Scheduling API - BProc (Scyld)
>
> Is this a feature present in the standard bproc, or is it an extension
> provided by Scyld?

I have no idea. You should probably ask the maui people. The last Maui I saw made no mention of BProc.

- Erik
From: Michael M. <mm...@as...> - 2003-05-29 22:26:40

Hi,

On the Maui homepage there is mention of support for bproc. Specifically:

    BProc Scheduling API - BProc (Scyld)

Is this a feature present in the standard bproc, or is it an extension provided by Scyld?

Thanks

Mike Madore
From: Mike S. <mik...@al...> - 2003-04-22 14:39:03

I am using kernel 2.4.20. After loading the stage 2 boot image, when kmonte loads the stage 2 kernel, I get

    "Couldn't find symbol real_mode_conf"

In the beoboot 1.5/monte dir I noticed a real_mode_conf patch for 2.4.17. Is there a patch available for 2.4.20?

Mike Sullivan

--
Mike Sullivan
Director Performance Computing
@lliance Technologies
18 Wynford Dr, Suite 407
Toronto, ON, Canada, M3C-3S2
Voice: (416) 385-3255 x 228
Fax: (416) 385-1774
Toll Free: 1-877-216-3199
http://www.alltec.com
From: <er...@he...> - 2003-04-21 23:41:38

On Mon, Apr 21, 2003 at 05:27:38PM -0400, Mike Sullivan wrote:
> I discovered that I was using a stage 1 kernel/ramdisk from an earlier
> version of Bproc which was causing the stage 2 load to hang. However,
> upon updating to the stage 1 kernel/initrd that comes with beoboot-1.5,
> I now get a message of the form below. I am ignorant of what happens
> at this stage but it looks to me like the root filesystem on the ram
> disk was mounted and then a second filesystem failed to mount. Anyone
> have any ideas?
>
> VFS: Mounted root ( romfs filesystem ) readonly
> Freeing unused kernel memory: 72 K Freed
> boot: mount -t tmpfs none /mnt: Invalid argument
> Failed to setup root filesystem
>
> A fatal error has occured
> Kernel Panic: Attempted to kill init.

My guess is no tmpfs support in the phase 1 kernel. You need that now so that the boot program can have a writable root file system.

- Erik
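One way to check the tmpfs guess is to look at the kernel configuration used for the phase 1 kernel. This is a minimal sketch under assumptions: the option name `CONFIG_TMPFS` is what mainline 2.4 kernels use, while the default path `/usr/src/linux/.config` and the function names are placeholders for illustration.

```shell
#!/bin/sh
# Succeeds if the given kernel .config builds tmpfs into the kernel.
config_has_tmpfs() {
    grep -q '^CONFIG_TMPFS=y' "$1"
}

# Succeeds if the running kernel lists tmpfs as a known filesystem.
running_kernel_has_tmpfs() {
    grep -qw tmpfs /proc/filesystems
}

# Placeholder path to the phase 1 kernel tree; pass your own as $1.
cfg=${1:-/usr/src/linux/.config}
if [ -r "$cfg" ]; then
    if config_has_tmpfs "$cfg"; then
        echo "tmpfs enabled in $cfg"
    else
        echo "tmpfs missing in $cfg"
    fi
fi
```

`running_kernel_has_tmpfs` only says something about the machine it runs on, so it is useful on the master or on a node that has already booted, not for a node stuck in phase 1.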
From: Mike S. <mik...@al...> - 2003-04-21 21:27:48

I discovered that I was using a stage 1 kernel/ramdisk from an earlier version of Bproc which was causing the stage 2 load to hang. However, upon updating to the stage 1 kernel/initrd that comes with beoboot-1.5, I now get a message of the form below. I am ignorant of what happens at this stage but it looks to me like the root filesystem on the ram disk was mounted and then a second filesystem failed to mount. Anyone have any ideas?

    VFS: Mounted root ( romfs filesystem ) readonly
    Freeing unused kernel memory: 72 K Freed
    boot: mount -t tmpfs none /mnt: Invalid argument
    Failed to setup root filesystem

    A fatal error has occured
    Kernel Panic: Attempted to kill init.

--
Mike Sullivan
Director Performance Computing
@lliance Technologies
18 Wynford Dr, Suite 407
Toronto, ON, Canada, M3C-3S2
Voice: (416) 385-3255 x 228
Fax: (416) 385-1774
Toll Free: 1-877-216-3199
http://www.alltec.com
From: <er...@he...> - 2003-04-17 16:15:36

On Wed, Apr 16, 2003 at 08:53:50PM -0400, Mike Sullivan wrote:
> I am using the latest version of bproc 3.2.5 and beoboot 1.5, cmtools
> 1.2. The slave will hang at the point
>
> Requesting: /var/beowulf/boot.img

Did you update the phase1 image as well? The download protocol changed. (See ReleaseNotes + ChangeLog)

- Erik
From: Mike S. <mik...@al...> - 2003-04-17 00:53:44

I am using the latest version of bproc 3.2.5 and beoboot 1.5, cmtools 1.2. The slave will hang at the point

    Requesting: /var/beowulf/boot.img

Any ideas?

Thanks
Mike

--
Mike Sullivan
Director Performance Computing
@lliance Technologies
18 Wynford Dr, Suite 407
Toronto, ON, Canada, M3C-3S2
Voice: (416) 385-3255 x 228
Fax: (416) 385-1774
Toll Free: 1-877-216-3199
http://www.alltec.com
From: <er...@he...> - 2003-04-16 23:02:01

On Thu, Apr 17, 2003 at 12:24:23AM +0200, J.A. Magallon wrote:
>
> On 04.16, gor...@ph... wrote:
> >
> > Odd thing. Attempting to "bpsh" the following csh or tcsh script fails.
> >
> > #!/bin/csh
> > echo "Hello"
> >
> > Results in:
> >
> > /proc/self/fd/3: No such file or directory
> >
>
> It works for me. Running bproc-3.2.4 on 2.4.21-pre5.

Seems to work for me too.

Just FYI, /proc/self/fd/3 is a wacky file descriptor that my horrible shell script hack sets up. Basically, it squirrels away the script file in the process's memory space and then produces this weird FD that just references the process's own memory space. It's gross but it gets around the problem of not having the script on the slave node.

- Erik
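The idea of running a script that has no usable path on the node can be illustrated on an ordinary Linux box. This is only an analogy, not the BProc mechanism: Erik describes keeping the script in process memory, whereas this sketch uses an unlinked temporary file that stays reachable through descriptor 3. The function name `run_script_via_fd` is made up for illustration, and the sketch assumes /proc is mounted.

```shell
#!/bin/sh
# Run a script through an inherited file descriptor instead of a path.
# The body runs in a subshell so descriptor 3 does not leak to the caller.
run_script_via_fd() (
    tmp=$(mktemp) || exit 1
    printf '%s\n' "$1" > "$tmp"
    exec 3< "$tmp"      # keep the script open on descriptor 3
    rm -f "$tmp"        # the path is gone; only the descriptor remains
    /bin/sh /proc/self/fd/3
)

run_script_via_fd 'echo "Hello"'
```

The interpreter is handed /proc/self/fd/3 as its script name, which matches the command line visible in process listings on a node (`/bin/sh /proc/self/fd/3 ...`). An interpreter that no longer has descriptor 3 open by the time it opens that name would fail with the "No such file or directory" error reported for csh.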
From: J.A. M. <jam...@ab...> - 2003-04-16 22:24:35

On 04.16, gor...@ph... wrote:
>
> Odd thing. Attempting to "bpsh" the following csh or tcsh script fails.
>
> #!/bin/csh
> echo "Hello"
>
> Results in:
>
> /proc/self/fd/3: No such file or directory
>

It works for me. Running bproc-3.2.4 on 2.4.21-pre5.

--
J.A. Magallon <jam...@ab...> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.2 (Cooker) for i586
Linux 2.4.21-pre7-jam1 (gcc 3.2.2 (Mandrake Linux 9.2 3.2.2-5mdk))
From: <gor...@ph...> - 2003-04-16 18:49:09

Odd thing. Attempting to "bpsh" the following csh or tcsh script fails.

    #!/bin/csh
    echo "Hello"

Results in:

    /proc/self/fd/3: No such file or directory

Bourne or ksh scripts have no such problems. "bpsh -N" eliminates the error message, but also discards the output. Using the bproc_rfork() call also eliminates the problem.

Using bproc 3.2.0 on 2.4.19.

Anyone have experience with this?

Goran
From: <er...@he...> - 2003-04-16 17:34:49

On Thu, Apr 10, 2003 at 01:23:02PM -0400, Nicholas Henke wrote:
> Hey Erik -- Ok it looks like there may be >=2 problems combining here to
> make life difficult.
>
> 1) Odd ass-ed SIGSTOP problem. I think this may have been solved. We
> talked to a few of the other guys here, and apparently the latest
> versions of glibc 2.2.4-{31,32} from redhat for 7.2 are SCREWED for
> pthreads. Something to do with each thread getting 64k for their stack
> when there is a total of 64K of total stack for their threads. We
> reverted to glibc-2.2.4-30, and I have not seen the SIGSTOP problem
> again.

That's interesting. Still, it shouldn't be possible to get a SIGSTOP to the slave daemon no matter how screwed up a thing you do.

> 2) kill -9 being ignored. Dunno here -- I am going to try the patch
> suggested, and see what happens there. Hopefully the trace information
> can shed some light on this.

I believe the patch that J.A. Magallon suggested is aimed at problem 1. I'm pretty sure #2 is purely a BProc problem.

- Erik
From: <er...@he...> - 2003-04-16 17:27:46

On Mon, Apr 07, 2003 at 04:51:04PM -0700, rwillis wrote:
> Hi,
>
> I made a simple test program for MPI to test it on my cluster. The program
> simply sends, from any non-zero node to node-0 a simple message, node-0 then
> prints it. Anyway, I compile it (using mpicc), and invoke it by typing
> <path>/mpirun -d -G -p 2 ./hello
>
> The program does not run, but I get this back;
>
> rank 1 pid=6681 exited with signal 13
> [0] Error: inconsistancy in collected data!
> rank 0 pid=6680 exited with signal 13
>
> I get the same thing when trying to run Netpipe (make mpi). I don't know
> where the inconsistancy error is coming from, I have not found it in any
> source.
>
> Any ideas?
> Has anyone seen this before?
> is exiting with signal 13 bad or good?

It's definitely bad. Most likely you're having a problem with the GM IDs, etc. in the nodeinfo file: the GM IDs stored there probably don't match what's on the nodes. The bit of code you're having trouble with is:

    if ((gmpi.port_ids[MPID_MyWorldRank] != port_id) ||
        (gmpi.board_ids[MPID_MyWorldRank] != board_id) ||
        (gmpi.node_ids[MPID_MyWorldRank] != gmpi.my_node_id)) {
        fprintf (stderr, "[%d] Error: inconsistency in collected data !\n",
                 MPID_MyWorldRank);
        gmpi_abort (0);
    }

> BTW, I put in some debugging into mpirun to look at parameters going into
> bproc_vexecmove_io() and a NULL is being passed into the function for the
> program name when I use 'hello' instead of './hello' as an argument to
> mpirun. I don't know if this is a bug or not, but I thought I would report
> it anyways.

Sort of. It's on the list of things to fix.

- Erik
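On the "signal 13" part of the question: on Linux, signal 13 is SIGPIPE, which a process receives when it writes to a pipe or socket that has no reader. That would be consistent with one side of the job having already gone away, although the thread does not confirm this. A small sketch for decoding signal numbers with the shell's kill builtin (the helper name `signal_name` is made up for illustration):

```shell
#!/bin/sh
# Translate a signal number into its name using the shell's kill builtin.
signal_name() {
    kill -l "$1"
}

signal_name 13
```

The same helper works for any number that appears in an "exited with signal N" message.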
From: Mike S. <mik...@al...> - 2003-04-16 15:25:57

I am setting up Lam on a bproc cluster and it seems to hang after starting on the master node. Below is the last line from the startup output

    n0<3760> ssi:boot:bproc: execmoving /usr/local/bin/lamd -H 10.0.4.1 -P 32839 -n 0 -o 0 -d to -1

which I read as lam starts on the master and then does an execmove to the master, which does result in a running binary but does not seem to return. I am not sure if this is a lam problem, but I have noticed that commands such as

    bpsh -1 ls

will hang, but

    bpsh -1 hostname

will work. Should you be able to use bpsh to run any command on the master?

--
Mike Sullivan
Director Performance Computing
@lliance Technologies
18 Wynford Dr, Suite 407
Toronto, ON, Canada, M3C-3S2
Voice: (416) 385-3255 x 228
Fax: (416) 385-1774
Toll Free: 1-877-216-3199
http://www.alltec.com
From: Kimitoshi T. <kt...@cl...> - 2003-04-14 13:12:32

Hello all

I want to measure the network performance on my bproc system, using programs like netpipe. My system uses TCP/IP over gigabit Ethernet, so I tried NPtcp. However, when I tried to run NPtcp in receiver mode, it said

    $ bpsh 1 NPtcp -r -b 0 -u 1048576
    NetPIPE: protocol 'tcp' unknown!

The Netpipe source code says

    if(!(proto = getprotobyname("tcp"))){
        printf("NetPIPE: protocol 'tcp' unknown!\n");
        exit(555);
    }

So my guess was to copy /etc/protocols and whichever libnss_ library implements getprotobyname to the slave nodes, by using /etc/beowulf/node_up.conf and /etc/beowulf/config.

Which one of the libnss libraries should I copy to the slave nodes? Or am I wrong in the first place? Is there a better/easier way to measure network performance (peer to peer throughput) on a bproc system?

Thank you,

Kimitoshi Takahashi
Cluster Computing Inc., Japan
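The lookup that fails here reads a small name-to-number table. A minimal sketch of that lookup against a protocols-format file, useful for checking that a copied /etc/protocols actually contains a tcp entry. The helper name `proto_number` is made up for illustration. Which libnss library is needed depends on the `protocols` line in /etc/nsswitch.conf; with the common `files` setting it would be libnss_files, but that is an assumption and is not confirmed in this thread.

```shell
#!/bin/sh
# Print the protocol number for a name from a protocols-format file
# (columns: name, number, aliases; '#' starts a comment).
proto_number() {
    awk -v name="$1" '$1 == name { print $2; exit }' "$2"
}

# On a node, the file to inspect would be /etc/protocols.
if [ -r /etc/protocols ]; then
    proto_number tcp /etc/protocols
else
    echo "/etc/protocols is missing; getprotobyname(\"tcp\") would likely fail" >&2
fi
```

This only checks the data file. It does not exercise the C library's name service switch, so a correct file is necessary but may not be sufficient.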
From: Nicholas H. <he...@se...> - 2003-04-10 17:18:46

Hey Erik -- Ok it looks like there may be >=2 problems combining here to make life difficult.

1) Odd ass-ed SIGSTOP problem. I think this may have been solved. We talked to a few of the other guys here, and apparently the latest versions of glibc 2.2.4-{31,32} from redhat for 7.2 are SCREWED for pthreads. Something to do with each thread getting 64k for their stack when there is a total of 64K of total stack for their threads. We reverted to glibc-2.2.4-30, and I have not seen the SIGSTOP problem again.

2) kill -9 being ignored. Dunno here -- I am going to try the patch suggested, and see what happens there. Hopefully the trace information can shed some light on this.

Nic

--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
From: Nicholas H. <he...@se...> - 2003-04-10 17:08:01

Again -- head node:

    12867 ? S 0:00 bpsh -n 57 subtaskInvoker /scratch/user/sfischer/slot_1/result /genomics/binf/scratch/dotsBuilds/nicTest/mus/similarity/fin
    12869 ? SW 0:00 \_ [subtaskInvoker]
    12870 ? SW 0:01 \_ [blastSimilarity]
    498 ? SW 0:00 \_ [sh]
    499 ? SW 0:00 \_ [blastx]
    512 ? SW 0:00 \_ [blastx]
    513 ? SW 0:00 \_ [blastx]

Node57:

    568 ? S 0:09 /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    569 ? S 0:00 \_ /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    623 ? S 0:00 \_ mond -d
    15333 ? S 0:00 \_ /bin/sh /proc/self/fd/3 /scratch/user/sfischer/slot_1/result /genomics/binf/scratch/dotsBuilds/nicTest/mus/similarity/f
    15334 ? S 0:01 \_ /usr/bin/perl /home/sfischer/gushome/bin/blastSimilarity --blastBinDir /genomics/share/pkg/bio/wu-blast/current --d
    17625 ? S 0:00 \_ sh -c /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask=seg+xn
    17626 ? S 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask seg+xnu
    17639 ? S 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask seg+
    17640 ? S 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask

After the kill -9 17626 17639 17640:

    568 ? S 0:09 /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    569 ? S 0:00 \_ /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    623 ? S 0:00 \_ mond -d
    17640 ? S 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask seg+xnu W 3 T 1000 B
    15333 ? S 0:00 \_ /bin/sh /proc/self/fd/3 /scratch/user/sfischer/slot_1/result /genomics/binf/scratch/dotsBuilds/nicTest/mus/similarity/f
    15334 ? S 0:01 \_ /usr/bin/perl /home/sfischer/gushome/bin/blastSimilarity --blastBinDir /genomics/share/pkg/bio/wu-blast/current --d
    17625 ? Z 0:00 \_ [sh <defunct>]

Now again kill -9 17640:

    568 ? S 0:09 /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    569 ? S 0:00 \_ /usr/sbin/bpslave -m /scratch/bpslave_new.strace -r 192.168.0.223 2223
    623 ? S 0:00 \_ mond -d
    15333 ? S 0:00 \_ /bin/sh /proc/self/fd/3 /scratch/user/sfischer/slot_1/result /genomics/binf/scratch/dotsBuilds/nicTest/mus/similarity/f
    15334 ? S 0:01 \_ /usr/bin/perl /home/sfischer/gushome/bin/blastSimilarity --blastBinDir /genomics/share/pkg/bio/wu-blast/current --d
    17792 ? S 0:00 \_ sh -c /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask=seg+xn
    17793 ? R 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask seg+xnu
    17806 ? S 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask seg+
    17807 ? R 0:00 \_ /genomics/share/pkg/bio/wu-blast/current/blastx /scratch/user/sfischer/prodom.fsa seqTmp -wordmask

-- This shows that it died, and the user's program started a new blast on the node.

Trace: http://www.liniac.upenn.edu/~henken/bproc/node57.trace

Ok -- hope these help. If you need anything else -- please holler at me.

Nic

--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania