From: Erik A. H. <er...@he...> - 2002-01-25 17:47:44
On Wed, Jan 23, 2002 at 04:13:33PM -0500, henken wrote:
> Here is the latest error message. I am getting this with 3.1.5 + patch.
> Feb 4 17:55:30 master /usr/sbin/bpmaster: write(ghost): missing process
> for message type 14 req; to=3,7576 from=1,7576 result=0
>
> Here is the relevant ps output
> [henken@master henken]$ ps -jx
>  PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
> 22401  7573 22385 22385 ?           -1 S    27659   0:00 bpsh 0
> /home/henken/cvs/jobs/bin/noop
>  7573  7576 22385 22385 ?           -1 RW   27659   0:00 [noop]
> 20876  7624  7624 20826 pts/2     7624 R    27659   0:00 ps -jx
>
>
> You can see that the message is trying to get to the process that bpsh
> started on the remote node.

Actually, it's an EXIT message from the remote process back to the ghost
on the front end.

message type 14 req; to=3,7576 from=1,7576 result=0

14   = EXIT
3    = route to ghost
7576 = pid of ghost
1    = route to real (from real in this case)
7576 = pid of real process
0    = result - in this case indicating exit status 0 - normal exit.

That's perfectly normal since "noop" exits on the remote node and needs
to tell the ghost to do the same. The question is why the process on
the front end doesn't seem to be a ghost.

Looking at your ps output, it does look like it's probably ghosted,
since it has no memory space swapped in. It's in the W state. Does
"noop" stay in that state pretty much forever? What's in
/proc/<pid>/maps for noop at this point?

I really wish I could see a message trace for this. A trace of just
the move and exit messages would be pretty useful here if possible.
Hopefully that would get the message bulk down far enough to be
manageable. You can filter by doing something like:

    bpmaster -d -m - | egrep 'move|exit' > file

Basically, I want to know what the result on the move response for
"noop" was in this case: whether it was non-zero and the remote went on
anyway for some reason, or whether there was no error and the ghost
screwed up somehow.
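
For anyone who wants to decode these log lines without doing it by hand, here is a rough Python sketch of the field breakdown given above. It is not taken from the BProc sources. The names `BprocMessage`, `parse_message` and `describe` are made up for illustration, and the lookup tables contain only the codes mentioned in this message (14 = EXIT, 1 = real, 3 = ghost). Anything else is printed as a raw number.

```python
import re
from typing import NamedTuple, Optional

# Only the codes explained in this thread; other values are left undecoded.
MESSAGE_TYPES = {14: "EXIT"}
ROUTES = {1: "real", 3: "ghost"}


class BprocMessage(NamedTuple):
    """Hypothetical container for the fields of one logged message."""
    msg_type: int
    kind: str        # e.g. "req", copied verbatim from the log line
    to_route: int
    to_pid: int
    from_route: int
    from_pid: int
    result: int


# Matches the "message type N kind; to=R,PID from=R,PID result=N" portion.
PATTERN = re.compile(
    r"message type (?P<msg_type>\d+) (?P<kind>\w+); "
    r"to=(?P<to_route>\d+),(?P<to_pid>\d+) "
    r"from=(?P<from_route>\d+),(?P<from_pid>\d+) "
    r"result=(?P<result>-?\d+)"
)


def parse_message(line: str) -> Optional[BprocMessage]:
    """Return the decoded fields, or None if the line has no message part."""
    m = PATTERN.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    kind = fields.pop("kind")
    numbers = {name: int(value) for name, value in fields.items()}
    return BprocMessage(kind=kind, **numbers)


def describe(msg: BprocMessage) -> str:
    """Render one message in words, falling back to raw numbers."""
    type_name = MESSAGE_TYPES.get(msg.msg_type, f"type {msg.msg_type}")
    src = ROUTES.get(msg.from_route, f"route {msg.from_route}")
    dst = ROUTES.get(msg.to_route, f"route {msg.to_route}")
    return (f"{type_name} {msg.kind} from {src} pid {msg.from_pid} "
            f"to {dst} pid {msg.to_pid}, result {msg.result}")


if __name__ == "__main__":
    log_line = ("Feb 4 17:55:30 master /usr/sbin/bpmaster: write(ghost): "
                "missing process for message type 14 req; "
                "to=3,7576 from=1,7576 result=0")
    parsed = parse_message(log_line)
    if parsed is not None:
        print(describe(parsed))
```

The regular expression assumes the log line has been joined back onto a single line, since syslog output pasted into mail often wraps in the middle of the message.
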

I don't think it should be a problem, but I could imagine this being
caused by the exit and move response getting out of order.

- Erik
--
Erik Arjan Hendriks    Printed On 100 Percent Recycled Electrons
er...@he...            Contents may settle during shipment