From: Gordan B. <go...@bo...> - 2009-06-16 14:53:10
|
What is the difference between these two files? I noticed that /etc/xkillallprocs got clobbered after a reboot, and the two lines I added to it (glusterfs and glusterfsd) got removed. On shutdown, with the file edited to add those two, shutdown with glusterfs still locks up immediately after "sending all processes the TERM signal".

Any ideas on how to debug this further? My gut feeling is that glusterfs ends up getting killed and the machine locks up because the rootfs went away, but it's quite hard to investigate a system in such a hung state.

Gordan |
From: Marc G. <gr...@at...> - 2009-07-01 11:07:55
|
Hi Gordan,
sorry for taking that long.

On Tuesday 16 June 2009 16:53:02 Gordan Bobic wrote:
> What is the difference between these two files? I noticed that
> /etc/xkillallprocs got clobbered after a reboot, and the two lines I added
> to it (glusterfs and glusterfsd) got removed. On shutdown, with the file

Yes, they got removed. Basically they should be built automatically.
The procs come from a function called {rootfs}_get_userspace_procs. In your case it should be glusterfs_get_userspace_procs. For gfs (rhel5) it looks as follows:

function gfs_get_userspace_procs {
  local clutype=$1
  local rootfs=$2

  echo -e "aisexec \n\
ccsd \n\
fenced \n\
gfs_controld \n\
dlm_controld \n\
groupd \n\
qdiskd \n\
clvmd"
}

> edited to add those two, shutdown with glusterfs still locks up immediately
> after "sending all processes the TERM signal". Any ideas on how to debug
> this further? My gut feeling is that glusterfs ends up getting killed and
> the machine locks up because the rootfs went away, but it's quite hard
> to investigate a system in such a hung state.

Yes. It is. I always add /bin/bash(s) at every step in the relevant initscripts. But I would say if you get that xkillallprocs right it should work.

You also need the /usr/comoonics/sbin/killall binary, which does not kill _ALL_ user processes but can exclude the ones listed in, e.g., /etc/xkillallprocs.

For a little background see:
https://bugzilla.redhat.com/show_bug.cgi?id=496843
https://bugzilla.redhat.com/show_bug.cgi?id=496854
https://bugzilla.redhat.com/show_bug.cgi?id=496857
https://bugzilla.redhat.com/show_bug.cgi?id=496861

Again, sorry for the late response. But I still hope that helps.
Marc.
--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/ |
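Following the {rootfs}_get_userspace_procs naming pattern and the gfs example in Marc's message, a GlusterFS counterpart might look roughly like the sketch below. The two process names are the ones Gordan added to /etc/xkillallprocs by hand. Where such a function has to live in the comoonics boot libraries is not stated in this thread, so the placement and the final process list are assumptions.

```shell
# Sketch only: modelled on the gfs_get_userspace_procs example in the thread.
# The name follows the {rootfs}_get_userspace_procs pattern with rootfs=glusterfs.
glusterfs_get_userspace_procs() {
  local clutype="$1"   # cluster type; unused here, kept for the same signature
  local rootfs="$2"    # root filesystem type; unused here

  # One process name per line, like the gfs variant.
  # These two names are the ones Gordan added manually.
  printf '%s\n' glusterfs glusterfsd
}
```

Calling `glusterfs_get_userspace_procs gluster glusterfs` prints the two names, one per line. The argument values in that call are placeholders.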
From: Gordan B. <go...@bo...> - 2009-07-03 12:31:24
|
On Wed, 1 Jul 2009 13:07:02 +0200, Marc Grimme <gr...@at...> wrote:
> Hi Gordan,
> sorry for taking that long.

No problem. This particular thing is only an issue at shutdown and I don't down my servers very often. And even then it's not a problem with functioning fencing devices. ;)

>> What is the difference between these two files? I noticed that
>> /etc/xkillallprocs got clobbered after a reboot, and the two lines I
>> added to it (glusterfs and glusterfsd) got removed. On shutdown, with
>> the file
>
> Yes, they got removed. Basically they should be built automatically.
> The procs come from a function called {rootfs}_get_userspace_procs. In
> your case it should be glusterfs_get_userspace_procs.

Aha! That's what I'm missing! Thank you!

>> edited to add those two, shutdown with glusterfs still locks up
>> immediately after "sending all processes the TERM signal". Any ideas on
>> how to debug this further? My gut feeling is that glusterfs ends up
>> getting killed and the machine locks up because the rootfs went away,
>> but it's quite hard to investigate a system in such a hung state.
>
> Yes. It is. I always add /bin/bash(s) at every step in the relevant
> initscripts. But I would say if you get that xkillallprocs right it should
> work.

I was thinking about something similar, but with double-wrapping init so that there is an init for the base root that can run gettys, and have a base root shell available to investigate things when they get going. It was sufficiently complicated to implement to deter me, at least for now, though. The bash-at-every-line idea has more short-term merit. :)

> You also need the /usr/comoonics/sbin/killall binary, which does not kill
> _ALL_ user processes but can exclude the ones listed in, e.g.,
> /etc/xkillallprocs.

Last I checked, that was in the halt patch that gets applied automatically. Has that changed recently?

> For a little background see:
> https://bugzilla.redhat.com/show_bug.cgi?id=496843
> https://bugzilla.redhat.com/show_bug.cgi?id=496854
> https://bugzilla.redhat.com/show_bug.cgi?id=496857
> https://bugzilla.redhat.com/show_bug.cgi?id=496861

Indeed, I'm aware of the background. I was just failing to figure out where the exclusion list gets set. Having said that, if I manually modify /etc/xkillallprocs, should that not be honoured at least in the next shutdown? I've found that the shutdown hangs even when I add the glusterfs processes to it.

Thanks.

Gordan |
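The "/bin/bash at every step" approach discussed in this message can be made less intrusive with a small helper that only drops into a shell when a flag file exists. This is a minimal sketch, not something shipped with comoonics: the function name debug_step, the flag file path /etc/debug-shutdown and the two environment variables are all hypothetical.

```shell
# Hypothetical helper for the relevant initscripts; every name here is made up.
debug_step() {
    local label="$1"
    # Hypothetical flag file: create it before the shutdown you want to inspect.
    local flag="${DEBUG_SHELL_FLAG:-/etc/debug-shutdown}"

    # Without the flag file, behave as a no-op so normal shutdowns are unaffected.
    [ -e "$flag" ] || return 0

    echo "debug_step: reached '$label'; exit the shell to continue"
    # The shell inherits the script's stdin/stdout; in a real initscript this
    # may need redirecting to the console device.
    "${DEBUG_SHELL:-/bin/bash}"
}
```

Placing a call such as `debug_step "before sending TERM"` ahead of each suspicious step shows which step was the last one reached before the hang.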
From: gordan <go...@bo...> - 2009-07-03 14:39:48
|
How about adding a debug (-d) switch to killall that makes it report the name of the process it is killing?

Gordan |
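Until such a switch exists, a rough dry run can be approximated from the shell by listing every process whose name is not in the exclusion file. The sketch below is only a name-based approximation: it does not reproduce how the comoonics killall binary actually selects processes, and the function name list_kill_candidates is hypothetical.

```shell
# Hypothetical dry-run helper: reads "pid name" pairs on stdin and prints the
# ones whose name is NOT listed in the exclusion file given as $1.
list_kill_candidates() {
    local exclude_file="$1" pid comm
    while read -r pid comm; do
        # Skip names found in the exclusion file; trailing blanks in the file
        # are stripped before the whole-line comparison.
        if [ -r "$exclude_file" ] &&
           sed 's/[[:space:]]*$//' "$exclude_file" | grep -qxF -- "$comm"; then
            continue
        fi
        echo "would kill: $pid $comm"
    done
}
```

A typical invocation would be `ps -eo pid=,comm= | list_kill_candidates /etc/xkillallprocs`. If glusterfs or glusterfsd shows up in the output, the exclusion list is not what it should be at that point.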
From: Marc G. <gr...@at...> - 2009-07-03 15:01:14
|
On Wednesday 01 July 2009 13:47:28 Gordan Bobic wrote:
> On Wed, 1 Jul 2009 13:07:02 +0200, Marc Grimme <gr...@at...> wrote:
> > Hi Gordan,
> > sorry for taking that long.
>
> No problem. This particular thing is only an issue at shutdown and I don't
> down my servers very often. And even then it's not a problem with
> functioning fencing devices. ;)

But it should work though.

> >> What is the difference between these two files? I noticed that
> >> /etc/xkillallprocs got clobbered after a reboot, and the two lines I
> >> added to it (glusterfs and glusterfsd) got removed. On shutdown, with
> >> the file
> >
> > Yes, they got removed. Basically they should be built automatically.
> > The procs come from a function called {rootfs}_get_userspace_procs. In
> > your case it should be glusterfs_get_userspace_procs.
>
> Aha! That's what I'm missing! Thank you!

Let me know if it works.

> >> edited to add those two, shutdown with glusterfs still locks up
> >> immediately after "sending all processes the TERM signal". Any ideas on
> >> how to debug this further? My gut feeling is that glusterfs ends up
> >> getting killed and the machine locks up because the rootfs went away,
> >> but it's quite hard to investigate a system in such a hung state.
> >
> > Yes. It is. I always add /bin/bash(s) at every step in the relevant
> > initscripts. But I would say if you get that xkillallprocs right it
> > should work.
>
> I was thinking about something similar, but with double-wrapping init so
> that there is an init for the base root that can run gettys, and have a
> base root shell available to investigate things when they get going. It was
> sufficiently complicated to implement to deter me, at least for now,
> though. The bash-at-every-line idea has more short-term merit. :)

Yes, I don't like it either.

> > You also need the /usr/comoonics/sbin/killall binary, which does not kill
> > _ALL_ user processes but can exclude the ones listed in, e.g.,
> > /etc/xkillallprocs.
>
> Last I checked, that was in the halt patch that gets applied automatically.
> Has that changed recently?

No, it is still in SysVinit-comoonics, found in the comoonics repo.

> > For a little background see:
> > https://bugzilla.redhat.com/show_bug.cgi?id=496843
> > https://bugzilla.redhat.com/show_bug.cgi?id=496854
> > https://bugzilla.redhat.com/show_bug.cgi?id=496857
> > https://bugzilla.redhat.com/show_bug.cgi?id=496861
>
> Indeed, I'm aware of the background. I was just failing to figure out where
> the exclusion list gets set. Having said that, if I manually modify
> /etc/xkillallprocs, should that not be honoured at least in the next
> shutdown? I've found that the shutdown hangs even when I add the glusterfs
> processes to it.

As I said, you need /usr/comoonics/sbin/killall5 for it. This allows a killall5 -x <process> + init u. We are trying to get this upstream, but until now only init u got accepted.

Marc.

> Thanks.
>
> Gordan
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Open-sharedroot-devel mailing list
> Ope...@li...
> https://lists.sourceforge.net/lists/listinfo/open-sharedroot-devel

--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/ |
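Since the exclusion only works when both the patched binary and the right entries in the exclusion file are present, a quick check before the next shutdown can rule out the obvious causes. The helper below is a hypothetical sketch (the name check_killall_setup is made up); it only tests that the binary is executable and that each given process name appears in the file, and it does not invoke killall5 at all.

```shell
# Hypothetical pre-shutdown check; not part of comoonics.
# Usage: check_killall_setup BINARY EXCLUDE_FILE NAME...
check_killall_setup() {
    local killall_bin="$1" exclude_file="$2" status=0 name
    shift 2

    if [ ! -x "$killall_bin" ]; then
        echo "missing or not executable: $killall_bin"
        status=1
    fi

    for name in "$@"; do
        # Whole-line match, ignoring trailing blanks in the exclusion file.
        if ! sed 's/[[:space:]]*$//' "$exclude_file" 2>/dev/null |
             grep -qxF -- "$name"; then
            echo "not excluded: $name"
            status=1
        fi
    done
    return "$status"
}
```

For the setup in this thread the call would be `check_killall_setup /usr/comoonics/sbin/killall5 /etc/xkillallprocs glusterfs glusterfsd`. Running it late in the shutdown sequence matters, because the file is regenerated and manual edits may already be gone.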