From: Nicholas H. <he...@se...> - 2003-04-08 21:30:37
More info -- I get a ping timeout on the master to node30, so I ssh to node30 to see what is going on. bpslave is still running 2 threads, and there is one process running under bproc. If I run strace on the lower of the 2 processes, this is what I get. After strace exits, the bpslave process is dead.

< from ps auxwwww >
root 577 0.0 0.0 1416 532 ? T 14:46 0:01 /usr/sbin/bpslave -r 192.168.0.223 2223

[root@node30 root]# strace -p 577
--- SIGSTOP (Stopped (signal)) ---
wait4(-1, NULL, WNOHANG, NULL) = 0
time(NULL) = 1049833543
time(NULL) = 1049833543
select(5, [3 4], [3], NULL, {15, 0}) = 3 (in [3 4], out [3], left {15, 0})
read(4, "\323B\0\0\17\0\1\3\276\f\0\0\276\f\0\0\0\0\0\0\0\0\0\0"..., 152) = 152
write(3, "\0\0\0\0\27\0\2\2\0\0\0\0\377\377\377\377\0\0\0\0\4\20"..., 152) = 152
write(3, "\323B\0\0\17\0\1\3\276\f\0\0\276\f\0\0\0\0\0\0\0\0\0\0"..., 152) = 152
read(3, "\270\341\2\0\4\0\2\1\377\377\377\377\271\f\0\0\0\0\0\0"..., 152) = 152
time(NULL) = 1049833543
wait4(-1, NULL, WNOHANG, NULL) = 0
time(NULL) = 1049833543
time(NULL) = 1049833543
select(5, [3 4], [4], NULL, {15, 0}) = 3 (in [3 4], out [4], left {15, 0})
write(4, "\270\341\2\0\4\0\2\1\377\377\377\377\271\f\0\0\0\0\0\0"..., 152) = 152
read(4, "\324B\0\0\r\0\1\3\275\f\0\0\275\f\0\0\0\0\0\0\0\0\0\0\0"..., 152) = 152
read(3, "\271\341\2\0\4\0\2\1\377\377\377\377\271\f\0\0\0\0\0\0"..., 152) = 152
time(NULL) = 1049833543
wait4(-1, NULL, WNOHANG, NULL) = 0
time(NULL) = 1049833543
time(NULL) = 1049833543
select(5, [3 4], [3 4], NULL, {15, 0}) = 4 (in [3 4], out [3 4], left {15, 0})
write(4, "\271\341\2\0\4\0\2\1\377\377\377\377\271\f\0\0\0\0\0\0"..., 152) = 152
read(4, "\325B\0\0\17\0\1\3\275\f\0\0\275\f\0\0\0\0\0\0\0\0\0\0"..., 152) = 152
write(3, "\324B\0\0\r\0\1\3\275\f\0\0\275\f\0\0\0\0\0\0\0\0\0\0\0"..., 152) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) ---
[root@node30 root]# strace -p 577
attach: ptrace(PTRACE_ATTACH, ...): No such process

I find it a bit curious that bpslave is getting SIGSTOP -- ??

Nic
--
Nicholas Henke
Penguin Herder & Linux Cluster System Programmer
Liniac Project - Univ. of Pennsylvania
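
For anyone reading the tail of that trace: a write() that fails with EPIPE and is immediately followed by SIGPIPE, after which the process no longer exists, is what you would expect when a process writes to a descriptor whose other end has gone away while SIGPIPE is left at its default action. The sketch below reproduces only that generic pattern. It is not bpslave code: the pipe is a stand-in for whatever fd 3 is on the node, the helper name run_child and the exit code 42 are made up for the illustration, and whether bpslave really leaves SIGPIPE at the default is an assumption drawn from the trace alone.

```python
import signal
import subprocess
import sys

# Child program: write to a pipe that has no reader, under a chosen
# SIGPIPE disposition. Hypothetical stand-in for the failing write(3, ...).
CHILD = r"""
import errno, os, signal, sys
mode = sys.argv[1]
signal.signal(signal.SIGPIPE,
              signal.SIG_DFL if mode == "default" else signal.SIG_IGN)
r, w = os.pipe()
os.close(r)                       # the reading end is gone
try:
    os.write(w, b"x" * 152)       # same size as the writes in the trace
except OSError as e:
    # Only reached when SIGPIPE is ignored: write() fails with EPIPE.
    sys.exit(42 if e.errno == errno.EPIPE else 1)
sys.exit(0)
"""

def run_child(mode):
    """Run the child with SIGPIPE set to 'default' or 'ignore'.

    Returns the child's return code; on POSIX a negative value -N
    means the child was terminated by signal N.
    """
    return subprocess.run([sys.executable, "-c", CHILD, mode]).returncode

if __name__ == "__main__":
    print("default SIGPIPE:", run_child("default"))
    print("ignored SIGPIPE:", run_child("ignore"))
```

With the default disposition the child is killed by the signal; with SIGPIPE ignored the write simply fails with EPIPE and the process can handle the error and carry on.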