From: Marc G. <gr...@at...> - 2011-12-04 13:16:11
|
Hello everyone,

just a short note to make you aware that we are also posting information on Twitter. Anybody who is interested can follow https://twitter.com/#!/OpenSharedroot. We'll try to give more background information on developments in both the com.oonics project and Open-Sharedroot. We'll post community information as well as information on professional usage of the com.oonics Enterprise IT Platform. So just follow to get more information.

Have fun
Marc

______________________________________________________________________________
Marc Grimme
E-Mail: gr...@at...
ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 | 85716 Unterschleissheim | www.atix.de
Enterprise Linux einfach online kaufen: www.linux-subscriptions.com
Registergericht: Amtsgericht München, Registernummer: HRB 168930, USt.-Id.: DE209485962 | Vorstand: Thomas Merz (Vors.), Marc Grimme, Mark Hlawatschek, Jan R. Bergrath | Vorsitzender des Aufsichtsrats: Dr. Martin Buss
From: Marc G. <gr...@at...> - 2011-11-25 15:43:44
|
Hello again,

after being silent for such a long time, I should bring everybody up to speed. As Olaf already stated, in the past year we decided to integrate the Open-Sharedroot project into its parent project, com.oonics. The technical background is simple: besides the shared-root use case, we have also seen many non-shared-root installations, mainly where the focus was on an installation capable of booting on different types of hardware (of the same architecture). We call this feature flexboot. In addition, some other projects came to life that added functionality not directly connected to, or necessary for, a shared root. That is basically why we decided to move the Open-Sharedroot projects inside the com.oonics project. This changes nothing about how a shared-root cluster works, only the naming.

Besides the name change, we also decided to move the code to an officially, openly available git repository. You can now download the latest code from https://github.com/comoonics. You'll find four repositories:

* git://github.com/comoonics/comoonics-initrd-ng.git: the initrd used for booting flexible Linux installations.
* git://github.com/comoonics/comoonics-cluster-suite.git: the Python libraries needed as dependencies for some of the flexible Linux installation tools (see http://comoonics.org/development/ and the projects called comoonics-*-py).
* git://github.com/comoonics/comoonics-release.git: the release files.
* git://github.com/comoonics/comoonics-Dracut.git: the dracut modules to use Open-Sharedroot with dracut (the initrd of RHEL6 and Fedora).

Bugs and feature requests should be filed at https://bugzilla.comoonics.org/. In the next weeks and months we'll also bring the information on http://comoonics.org and http://open-sharedroot.org up to date. We're happy to get any kind of feedback.

Have fun
Marc

--
Marc Grimme
ATIX Informationstechnologie und Consulting AG
From: Olaf R. <bri...@ol...> - 2011-11-19 19:55:46
|
Hello everybody,

I'm Olaf from Munich, Germany, and I'm new here. First of all, apologies for my imperfect English! I am the author of the German Fedora wiki article (https://fedoraproject.org/wiki/Features/Opensharedroot/de). On 2011-05-31, "OPEN-Sharedroot has grown up" was announced (http://www.comoonics.org/news-archive/open-sharedroot-has-grown-up/), so this project was renamed to "com.oonics". That is fine! As I understand it, we have some to-dos now. Here are my proposed topics:

* Rename the SourceForge project page, or open a new project.
* We now need a new com.oonics roadmap (http://www.comoonics.org/development/osr-development-roadmap/major-os-platforms).
* Looking at the email archives (http://sourceforge.net/projects/open-sharedroot/support), I think one list is enough: three posts in one year across two lists! So let's merge the open-sharedroot-users and open-sharedroot-devel lists.
* Move the bug tracker from https://bugzilla.open-sharedroot.org/ to https://bugzilla.comoonics.org/.
* Some source code files are missing a license note in the header. That is legal uncertainty for users and contributors.
* I think we need a coding convention, either as a file in the code or as an article on the website.
* Nice to have: a nightly build service. That is important for early testers.
* Another important thing is a clear release management and/or software development process, and, correspondingly, clear git workflow conventions. That is important for new contributors.

If that is okay, I will open a feature-request ticket for these topics.

Blessings,
Olaf
From: Marc G. <gr...@at...> - 2011-11-14 21:32:04
|
Hello,

I'm very happy to announce the availability of version 5.0 of the open-sharedroot project. It is now possible to build diskless "shared" root clusters for the following configurations:

- RHEL5: Ext3, Ext4, NFS, GFS, GFS2*, GlusterFS*
- RHEL6: Ext3, Ext4, NFS, GFS2*, GlusterFS*
- SLES11: Ext3, Ext4, NFS, OCFS2
- Fedora*
- OpenSuSE*
- Now with and without configuration via /etc/cluster/cluster.conf.

* not yet or not officially supported

We are looking forward to your feedback. See also http://comoonics.org/news-archive/release-of-com-oonics-5-0-for-rhel5-nfs-and-rhel6-nfs-sharedroot

Have fun.
Marc

--
Marc Grimme
ATIX Informationstechnologie und Consulting AG
From: Marc G. <gr...@at...> - 2010-10-20 19:48:42
|
Hi Gordan,

no, we didn't make any effort on porting the initrd to BusyBox. With RHEL6 we will provide dracut plugins wherever possible. We are currently following up on this project, so we expect to have something by the end of this year.

Hope that helps.

Regards,
Marc

----- "Gordan Bobic" <go...@bo...> wrote:
> Hi,
>
> Has there been any progress on dieting the initrd using BusyBox?
>
> Also, has any effort thus far gone into updating the initrd building
> to make it work on RHEL6 (dracut based)? A lot of my new deployments
> are pre-emptively based on RHEL6b to avoid an upgrade in a few months'
> time, so being able to build OSR on RHEL6 would be quite useful.
>
> Gordan

--
Marc Grimme
ATIX Informationstechnologie und Consulting AG
From: Gordan B. <go...@bo...> - 2010-10-20 07:46:14
|
Hi,

Has there been any progress on dieting the initrd using BusyBox?

Also, has any effort thus far gone into updating the initrd building to make it work on RHEL6 (dracut based)? A lot of my new deployments are pre-emptively based on RHEL6b to avoid an upgrade in a few months' time, so being able to build OSR on RHEL6 would be quite useful.

Gordan
From: Gordan B. <go...@bo...> - 2009-12-29 11:06:32
|
Marc Grimme wrote:
>>> Hm. I like that awk, grep, etc are going with busybox on the one
>>> hand. That would make things easier. I'll think about that.
>>
>> OK. I'm more than happy to help with the testing/implementation of
>> this. As far as I can tell, all that would be required is creating
>> the symlinks - doable with something like:
>>
>> ./busybox --help |\
>>   grep 'Currently defined functions:' -A30 \
>>   | grep '^\s.*,' \
>>   | tr , '\n' \
>>   | xargs -n1 -i{} ln -s busybox {}
>>
>> (that's off the top of my head, don't quote me on it, and -A30 may
>> be too low)
>>
>> and pruning out the packages that are made redundant by it. If taking
>> it to the extreme and dropping bash as well, then all the shell
>> scripts would need to be checked to reference /bin/sh instead of
>> /bin/bash, and they'd need testing to make sure that they aren't
>> broken by some bash vs. sh obscurity.
>
> Right. But that's much work. There are many bash deps inside the
> code. I'm currently cleaning it up a little bit. But I don't see us
> getting rid of bash any time soon.

Are there really that many bash-specific extensions in use in there that aren't supported by vanilla Bourne sh?

> As we are also going to support dracut as the initrd for future
> clusters - on distros where it will be available (FC12, RHEL6,
> SLES???) - all this is done there. They are using dash (POSIX shell).
> Currently we have ported the NFS parts, but not the cluster filesystem
> parts, to dracut.

Interesting. I didn't realize there was such a major change introduced recently into the way initrd works.

> So the bottom line is we'll stay with bash in the boot scripts, as
> it's too much work to move so far. Especially when I expect dracut to
> overpower our initrd.

Indeed, I see where you're going with this. It does sound like come FC13/RHEL6 it may be time for a major change. And that puts a limit on the life expectancy of the current setup. :-/

>>> BTW: I'm currently developing an initrd without the need of python
>>> and friends. That will really spare a lot of space ;-) .
>>
>> Now that IS interesting. What is the python stuff actually used for?
>> In all my modding and patching of OSR I've never actually bumped into
>> any python code.
>
> com-queryclusterconf is python. And this queries the cluster
> configuration in a general way. This is required right now.
>
> AND most importantly, for GFS clusters we need fence agents in the
> initrd, and most are based on python.

I've heard rumours to this extent, but all the ones I ever needed to use were always written in perl.

> Although we could add those later on via bootsr.

Ouch. That's not ideal. Mind you, python vs. perl isn't that big a difference in terms of disk footprint.

> But when we are getting rid of the com-queryclusterconf python dep we
> could build initrds without python/perl for NFS, OCFS2, glusterfs and
> localfs.

That is an interesting point. But doesn't OCFS2 need fencing agents? I would expect it to, considering it is functionally nearly identical to GFS.

Speaking of which, I've been looking into getting Veritas Cluster to work, and when I manage to get my hands on a trial licence I'll look into adding support for it into OSR. ;)

Gordan
From: Gordan B. <go...@bo...> - 2009-12-29 10:55:33
|
Marc Grimme wrote:
>> For the sake of education and diversity I have just finished the
>> first attempt at this, purely to see if there is an inherent problem
>> in OpenVZ that might shoot this down.
>>
>> Well - I haven't found any such problems! :D
>
> I don't see them either. As far as I understand them.

My big concern was that I didn't know what OpenVZ guests would do and how the OpenVZ host would handle a situation where another OpenVZ host (and its OpenVZ guest) started modifying the directory trees concurrently. Thankfully, at least thus far, it hasn't blown up horribly, which is a good start. :)

>> The basic setup was this:
>>
>> Two identical host machines (virtual, because it was easier, but
>> fully virtualized with KVM, CentOS 5.4). The host OS was stripped
>> down to a bare minimum (a "mere" 850MB...) since I didn't feel
>> applying the more sensibly sized OSR init root was vital for now and
>> modding it would require extra work.
>>
>> Shared virtual disk (shared image, presented as an IDE device), with
>> GFS on top. So far, so standard.
>>
>> The shared GFS device was mounted under /vz/private (where OpenVZ
>> keeps the VM fs trees).
>>
>> A CentOS 5.4 guest template was initialized in there. The guest
>> config file was the same on both hosts, except for the IP address
>> (the IP is configured on the host rather than the guest, but each
>> guest can have independent iptables rules if required).
>>
>> Thus - the two guests were running on a GFS shared root. The one
>> thing remaining would be to set up the entries in fstab to make sure
>> that /cdsl.local gets bind mounted correctly at boot-up time, but
>> other than that, I'd say the preliminary prototype test has passed. :)
>>
>> The basic thing I wanted to achieve is to have a cleaner separation
>> between the host-provided shared rootfs and the guest so that there
>> are no issues during shutdown with unmounting file systems, etc. This
>> prototype appears to have completely met those requirements.
>>
>> There is also an added bonus benefit that I hadn't thought of before -
>> the guest can be cleanly rebooted without rebooting the host (which
>> also means without triggering fencing and suchlike).
>
> Ok. Correct me if I'm wrong.
> My current understanding of your concept is as follows:
>
> You want to create an OpenVZ cluster that just does OpenVZ without
> anything else.
>
> So no "Sharedroot" within OpenVZ. But it's small.
>
> Then you'll boot the SharedRoot as a "Guest" for your apps.

I'm not sure if you're saying what I think you're saying or the reverse of it, so let me clarify. Let's use the terms "init-root" for the non-shared-root bootstrap part (what the initrd does in OSR) and "real-root" for the actual shared root.

The init-root runs the OpenVZ-patched kernel and is responsible for setting up the clustering in order to bring up the shared root file systems. In essence, its task is pretty similar to what OSR does at the moment.

The real-root is on the shared/replicated file system, and an OpenVZ guest gets booted into it (OpenVZ uses a chroot on the host's fs rather than a disk image like all the other virtualization solutions). So shared root is obtained by the guest OpenVZ VMs, while the init-root provides all the usual things that OSR has to provide as usual (clustering start-up, shared-root fs setup).

> Advantages:
> * Better debugging (Console, ..)
> * Easier to maintain
> * Virtualization perspective: Very low resource overhead, thanks to
>   OpenVZ containers.

It also means that multiple SR clusters could be had on the same set of physical hosts with negligible overheads. This is a new feature. :)

> Disadvantages:
> * Tainted kernel (no support from RedHat, SAP etc.)

Not quite - IIRC OpenVZ is what Virtuozzo uses (http://www.parallels.com/uk/products/pvc45/), so commercial support is available. And the OpenVZ project does provide patched versions of RHEL kernels.

> * Local installations for the "Host" nodes needed.

Not necessarily - there is no reason why this couldn't fit into the initrd. My prototype ended up with an 850MB host install, but that was purely because I just dropped CentOS 5.4 on it and pruned it down a bit, rather than going the route of squeezing the OpenVZ tools into the OSR initrd (which would yield a much more sensible footprint - although at the cost of not having two of the three advantages you listed above).

> Questions:
> * This sounds like the approach the oVirt people are doing. They are
>   using stateless Linux with an RO root as image and then mount /var rw?!

Funny you should mention this. I was actually looking for some RHEV-H distro SRPMS so I could rebuild them and set up a RHEV-H host in some vague hope that it might be thinner than a full-fat RHEL, but 1) I failed to find any and 2) upon consideration I don't think it is likely to get that much thinner. OpenVZ have a "minimal" CentOS 5 template that comes in at about 46MB (not including the kernel, though!), so this _could_ be made pretty thin at a push. Of course, there is the trade-off between dieting and functionality.

How the host installs would be kept in (partial) sync is a different question - if the install can be trimmed down enough, it could be rolled into the initrd. If that ends up not being plausible, it all rapidly turns into a can of worms, because the next thing you know you'll end up needing shared root to bootstrap the shared root. :^)

> * Complexity: How to handle bonding/multipathing and other enterprise
>   configurations?

That is really down to a question of whether OpenVZ can handle it. NIC bonding would be done on the host, and the guest would simply have a venet (virtual network) interface that magically has traffic routed via the multi-path or bonded host NICs. The main point of this would be to separate the clustering aspect from the shared root aspect, and abstract it away from the guest. The guest would need minimal awareness of the fact that it is running in a shared root, over and above bind-mounting some paths to /cdsl.local and /cluster in its fstab. The NIC configuration is done from the host (although this is a little dodgy in places WRT how it's implemented from the guest side; it's one of the things I'm looking into).

> Did I understand your concept right?

I hope so, but I re-explained above just to make sure. :)

> The biggest problem I see is the support one. This is a very nice
> concept for people who don't care about support.

As I mentioned above, commercial support for OpenVZ is available, and the OpenVZ project does appear to also specifically provide RHEL kernels patched with OpenVZ. Parallels/Virtuozzo do provide commercial support.

Gordan
From: Marc G. <gr...@at...> - 2009-12-29 10:36:10
|
----- "Gordan Bobic" <go...@bo...> wrote:
> Marc Grimme wrote:
>> Hm. I like that awk, grep, etc are going with busybox on the one
>> hand. That would make things easier. I'll think about that.
>
> OK. I'm more than happy to help with the testing/implementation of
> this. As far as I can tell, all that would be required is creating
> the symlinks - doable with something like:
>
> ./busybox --help |\
>   grep 'Currently defined functions:' -A30 \
>   | grep '^\s.*,' \
>   | tr , '\n' \
>   | xargs -n1 -i{} ln -s busybox {}
>
> (that's off the top of my head, don't quote me on it, and -A30 may be
> too low)
>
> and pruning out the packages that are made redundant by it. If taking
> it to the extreme and dropping bash as well, then all the shell
> scripts would need to be checked to reference /bin/sh instead of
> /bin/bash, and they'd need testing to make sure that they aren't
> broken by some bash vs. sh obscurity.

Right. But that's much work. There are many bash deps inside the code. I'm currently cleaning it up a little bit. But I don't see us getting rid of bash any time soon.

As we are also going to support dracut as the initrd for future clusters - on distros where it will be available (FC12, RHEL6, SLES???) - all this is done there. They are using dash (POSIX shell). Currently we have ported the NFS parts, but not the cluster filesystem parts, to dracut.

So the bottom line is we'll stay with bash in the boot scripts, as it's too much work to move so far. Especially when I expect dracut to overpower our initrd.

>> BTW: I'm currently developing an initrd without the need of python
>> and friends. That will really spare a lot of space ;-) .
>
> Now that IS interesting. What is the python stuff actually used for?
> In all my modding and patching of OSR I've never actually bumped into
> any python code.

com-queryclusterconf is python. And this queries the cluster configuration in a general way. This is required right now.

AND most importantly, for GFS clusters we need fence agents in the initrd, and most are based on python. Although we could add those later on via bootsr. But when we are getting rid of the com-queryclusterconf python dep, we could build initrds without python/perl for NFS, OCFS2, glusterfs and localfs.

> Gordan
>
> P.S.
> We seem to have drifted off the list again... :-/

This is a development list. And I'm only forgetting to CC it all the time ;-) . It might be interesting to other people as well.

--
Marc Grimme
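The `--help`-parsing one-liner quoted above is fragile (it depends on the exact wording and width of busybox's help text). A less brittle sketch, assuming a busybox build that supports the `--list` option (one applet name per line), is shown below. To stay self-contained, the example generates a stub `busybox` that fakes `--list`; on a real initrd you would point `BUSYBOX` at the actual binary instead.

```shell
# Hedged sketch: one symlink per busybox applet, driven by `--list`.
# The stub "busybox" below exists only so this demo runs anywhere;
# replace it with the real binary on an actual initrd build.
set -eu
dir=/tmp/osr-busybox-demo
rm -rf "$dir"
mkdir -p "$dir"
cd "$dir"
# Create a stub that prints a fixed applet list, standing in for
# a real `busybox --list` invocation.
printf '#!/bin/sh\nprintf "ls\\ncat\\ngrep\\nmount\\n"\n' > busybox
chmod +x busybox
BUSYBOX=${BUSYBOX:-./busybox}
for applet in $($BUSYBOX --list); do
    ln -sf busybox "$applet"    # each applet name becomes a symlink
done
ls -l "$dir"
```

Real busybox builds also offer `--install -s <dir>`, which creates the symlink farm in one call; whether either option is available depends on how the binary was configured.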
From: Marc G. <gr...@at...> - 2009-12-29 10:20:33
|
----- "Gordan Bobic" <go...@bo...> wrote:
> For the sake of education and diversity I have just finished the
> first attempt at this, purely to see if there is an inherent problem
> in OpenVZ that might shoot this down.
>
> Well - I haven't found any such problems! :D

I don't see them either. As far as I understand them.

> The basic setup was this:
>
> [...]
>
> There is also an added bonus benefit that I hadn't thought of before -
> the guest can be cleanly rebooted without rebooting the host (which
> also means without triggering fencing and suchlike).
>
> Gordan

Ok. Correct me if I'm wrong. My current understanding of your concept is as follows:

You want to create an OpenVZ cluster that just does OpenVZ without anything else.

So no "Sharedroot" within OpenVZ. But it's small.

Then you'll boot the SharedRoot as a "Guest" for your apps.

Advantages:
* Better debugging (Console, ..)
* Easier to maintain
* Virtualization perspective: Very low resource overhead, thanks to OpenVZ containers.

Disadvantages:
* Tainted kernel (no support from RedHat, SAP etc.)
* Local installations for the "Host" nodes needed.

Questions:
* This sounds like the approach the oVirt people are doing. They are using stateless Linux with an RO root as image and then mount /var rw?!
* Complexity: How to handle bonding/multipathing and other enterprise configurations?

Did I understand your concept right?

The biggest problem I see is the support one. This is a very nice concept for people who don't care about support.

What do you think?

--
Marc Grimme
From: Gordan B. <go...@bo...> - 2009-12-29 08:28:06
|
For the sake of education and diversity I have just finished the first attempt at this, purely to see if there is an inherent problem in OpenVZ that might shoot this down.

Well - I haven't found any such problems! :D

The basic setup was this:

Two identical host machines (virtual, because it was easier, but fully virtualized with KVM, CentOS 5.4). The host OS was stripped down to a bare minimum (a "mere" 850MB...) since I didn't feel applying the more sensibly sized OSR init root was vital for now, and modding it would require extra work.

Shared virtual disk (shared image, presented as an IDE device), with GFS on top. So far, so standard.

The shared GFS device was mounted under /vz/private (where OpenVZ keeps the VM fs trees).

A CentOS 5.4 guest template was initialized in there. The guest config file was the same on both hosts, except for the IP address (the IP is configured on the host rather than the guest, but each guest can have independent iptables rules if required).

Thus - the two guests were running on a GFS shared root. The one thing remaining would be to set up the entries in fstab to make sure that /cdsl.local gets bind mounted correctly at boot-up time, but other than that, I'd say the preliminary prototype test has passed. :)

The basic thing I wanted to achieve is to have a cleaner separation between the host-provided shared rootfs and the guest, so that there are no issues during shutdown with unmounting file systems, etc. This prototype appears to have completely met those requirements.

There is also an added bonus benefit that I hadn't thought of before - the guest can be cleanly rebooted without rebooting the host (which also means without triggering fencing and suchlike).

Gordan
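The remaining step mentioned above - getting /cdsl.local bind-mounted in the guest at boot - could be expressed as an fstab fragment along the following lines. The source path is hypothetical; the actual host-dependent directory depends entirely on how the shared root tree was laid out.

```
# Hypothetical guest /etc/fstab entries - adjust paths to your layout.
# /cluster/cdsl/<nodeid> is assumed to hold this node's private files.
/cluster/cdsl/1   /cdsl.local   none   bind   0 0
```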
From: Gordan B. <go...@bo...> - 2009-12-29 04:28:12
|
I notice that in:

/etc/comoonics/bootimage/rpms.initrd.d/baselibs.list

there is coreutils - unfiltered. This is about 9MB on RHEL5.

What about replacing this with busybox? That's only 2.4MB, and it seems to include everything I can think of that the OSR init root might require. Granted, 9MB -> 2.4MB isn't as big a saving as some of the other recent ones, but it might be worth having if it doesn't end up being too involved to switch.

Thoughts?

Gordan
From: Gordan B. <go...@bo...> - 2009-12-29 02:25:17
|
Marc Grimme wrote:
>> The reason I ask is two-fold:
>>
>> 1) I'm not sure if having it symlinked to /proc/mounts confuses the
>> process of unmounting the file systems in the shared root during
>> shutdown. /proc/mounts shows all mounted file systems, including
>> paths that aren't mounted in the OSR root (e.g. /mnt/tmproot for the
>> glfs backing fs is mounted only in the initroot, but still shows up
>> in /proc/mounts).
>>
>> 2) It seems to interfere with the checks to see whether a bind-mount
>> is already mounted. Under OSR, I'm seeing all bind-mounts in fstab
>> getting mounted twice, once when mounting local file systems, and
>> once when mounting "other" file systems.
>
> This can be changed. Just add _netdev as an option to the fstab entry
> and the bind mounts should only be mounted once.

I thought about that, but it's a bit of a bodge. It can end up interfering with the unmounting order on shutdown, since it makes all of those file systems unmount via the netfs init script rather than halt.

>> Is anything likely to break if /etc/mtab is symlinked to
>> /cdsl.local/etc/mtab?
>
> Yes.
> First see:
> https://partner-bugzilla.redhat.com/show_bug.cgi?id=214891
> Then, when mount <path> is executed with RHEL5, the following is done
> (if I recall it right): all changes are done to a tmp file
> /etc/mtab.tmp, and this version gets newly created in /etc and
> afterwards copied back to /etc/mtab. The symlink to /cdsl.local does
> not survive. But if /etc/mtab is a symlink to /proc/mounts, all of
> the above is ignored.
>
> That's how I recall the background of this topic.
>
> As this is in the phase of being changed, it might be that there are
> changes already in the latest RHEL5 update, but it is not very likely.

I see. That is, indeed, a problem. :-/ Of course, going to an OpenVZ guest root as mentioned in my other post would work around this (if it works at all, but that's something I'm planning to find out relatively soon). *whistles innocently* ;)

Gordan
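For reference, the `_netdev` suggestion discussed above corresponds to an fstab entry roughly like the one below (the paths are hypothetical). The option moves the mount out of the "local filesystems" pass and into the netfs pass, so it is attempted only once - at the cost, as noted, of also changing which init script unmounts it at shutdown.

```
# Hypothetical bind mount tagged _netdev: mounted once, by netfs only.
/cluster/shared/var   /var   none   bind,_netdev   0 0
```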
From: Marc G. <gr...@at...> - 2009-12-28 20:03:54
|
----- "Gordan Bobic" <go...@bo...> wrote:
> Is there a particular functional reason why /etc/mtab should be
> symlinked to /proc/mounts instead of /cdsl.local/etc/mtab?

Yes, there is.

> The reason I ask is two-fold:
>
> 1) I'm not sure if having it symlinked to /proc/mounts confuses the
> process of unmounting the file systems in the shared root during
> shutdown. /proc/mounts shows all mounted file systems, including
> paths that aren't mounted in the OSR root (e.g. /mnt/tmproot for the
> glfs backing fs is mounted only in the initroot, but still shows up
> in /proc/mounts).
>
> 2) It seems to interfere with the checks to see whether a bind-mount
> is already mounted. Under OSR, I'm seeing all bind-mounts in fstab
> getting mounted twice, once when mounting local file systems, and
> once when mounting "other" file systems.

This can be changed. Just add _netdev as an option to the fstab entry and the bind mounts should only be mounted once.

> Is anything likely to break if /etc/mtab is symlinked to
> /cdsl.local/etc/mtab?

Yes. First see:

https://partner-bugzilla.redhat.com/show_bug.cgi?id=214891

Then, when mount <path> is executed on RHEL5, the following is done (if I recall it right): all changes are done to a tmp file /etc/mtab.tmp, and this version gets newly created in /etc and afterwards copied back to /etc/mtab. The symlink to /cdsl.local does not survive. But if /etc/mtab is a symlink to /proc/mounts, all of the above is ignored.

That's how I recall the background of this topic. As this is in the phase of being changed, it might be that there are changes already in the latest RHEL5 update, but it is not very likely.

It is easy to test:

rm /etc/mtab; touch /cdsl.local/etc/mtab; ln -s /cdsl.local/etc/mtab /etc/mtab; mount <somepath>; ls -l /etc/mtab

Hope this helps
Marc

--
Marc Grimme
ATIX Informationstechnologie und Consulting AG
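The tmpfile-and-rename behaviour described above can be reproduced in a sandbox without touching the real /etc/mtab. This hedged sketch (all paths are scratch directories invented for the demo) shows why the symlink does not survive: renaming the temp file over the link path replaces the symlink itself, rather than writing through to its target.

```shell
# Reproduce, in a sandbox, how a tmpfile+rename update destroys a symlink.
set -eu
dir=/tmp/mtab-demo
rm -rf "$dir"
mkdir -p "$dir/cdsl"
cd "$dir"
touch cdsl/mtab
ln -s cdsl/mtab mtab                   # mtab starts out as a symlink
printf '/dev/sda1 / ext3 rw 0 0\n' > mtab.tmp
mv mtab.tmp mtab                       # rename(2) over the symlink path
ls -l mtab                             # now a regular file; the link is gone
```

After the `mv`, `mtab` is a plain file and `cdsl/mtab` is left untouched - the same effect Marc describes for mount's update of /etc/mtab on RHEL5.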
From: Gordan B. <go...@bo...> - 2009-12-28 13:06:20
|
Marc Grimme wrote:
> Gordan,
> thanks for finding this out.
> I think you are right with the analysis.
> I'll check this and then let this patch go upstream ASAP.

OK. Please let me know if you find that I made a mistake somewhere.

> P.S. Congratulations on getting the "2009 Gluster Hacker Award".

LOL! You don't miss anything, do you! ;) Thanks! :)

Gordan |
From: Marc G. <gr...@at...> - 2009-12-28 11:29:11
|
Gordan,
thanks for finding this out. I think you are right with the analysis. I'll check this and then let this patch go upstream ASAP.

Thanks again.
Marc

P.S. Congratulations on getting the "2009 Gluster Hacker Award".

----- "Gordan Bobic" <go...@bo...> wrote:
> Gordan Bobic wrote:
> > Gordan Bobic wrote:
> >> Marc Grimme wrote:
> >>
> >>>> And just to confirm, I used the same binary on another machine
> >>>> (standalone, no OSR or clustering), and it works exactly as expected
> >>>> (prints out what processes it is killing). That means that whatever
> >>>> causes killall5 to go away and never return is specific to glfs+OSR
> >>>> (since killall5 works fine on my gfs+OSR clusters). I'm not sure where
> >>>> to even begin debugging this, though, so any ideas would be welcome.
> >>
> >>> You might want to try to start it with strace. I recall something that
> >>> under some environments the browsing through /proc which is done by
> >>> killall5 freezes. And I think this is done before killing. Somehow what
> >>> does not work is a stat call on some /proc files within /proc/<pid>. I
> >>> don't recall exactly but I have something like this in mind.
> >>>
> >>> If you have found the pid that causes the problem perhaps we get some
> >>> new ideas on how to handle this behaviour.
> >> OK, I have straced killall5, and the last few things it does is stat
> >> /proc/version (twice, it seems) and set up SIGTERM, SIGSTOP and SIGKILL
> >> signals. This appears to correspond to lines 682-692 in killall5.c:
> >>
> >> mount_proc();
> >> ...
> >> signal(SIGTERM, SIG_IGN); > >> signal(SIGSTOP, SIG_IGN); > >> signal(SIGKILL, SIG_IGN); > >> > >> The last thing strace reports is: > >> > >> kill(4294967295,SIGSTOP > >> > >> (note - no closing bracket) > >> > >> which seems to correspond to line 695: > >> > >> if (TEST == 0) kill(-1, SIGSTOP); > >> > >> Reading what "man 2 kill" says: > >> POSIX.1-2001 requires that kill(-1,sig) send sig to all processes > that > >> the current process may send signals to, except possibly for some > >> implementation-defined system processes. > >> > >> I have a suspicion that this may well be the cause of the problems. > > >> killall5 doesn't iterate through all the processes to kill! > According to > >> this, sending "kill(-1, <signal>)" sends the signal to all the > processes > >> that we have permissions to terminate without explicitly specifying > the > >> processes to terminate! Since killall5 is running as root at this > point, > >> this means all processes, with the possible exception of "some > >> implementation-defined system processes". Right now my bet would be > on > >> this killing glusterfsd (which is in fact running in userspace, and > thus > >> is extremely unlikely to be exempt). > >> > >> This brings up another issue - it sounds like the -x option may be > > >> ineffective, too, even on the normal GFS related processes. If the > > >> signals get sent to all processes, then this would include the the > > >> processes specified by -x, regardless. This leads me to suspect > that > >> unless these processes are explicitly excluded in the kernel > >> implementation, they are not spared the killing at this stage. > Looking > >> at the ps output - fenced, groupd, aisexec and ccsd, for example, > don't > >> show up in square brackets, which implies they aren't running in > kernel > >> space (although that isn't really definitive, only indicative, > AFAIK). 
> >> So, this may be affected by the bug, too - but this may not be > obvious > >> because once they die, the node will get fenced by the other nodes, > > >> which will end up doing something similar. Or maybe these processes > > >> simply catch and ignore the signals if they are being used (e.g. if > gfs > >> is mounted), or something like that. Anyway, that is just > hypothesis at > >> this point, but it's probably worth checking if you have a suitable > test > >> environment handy (I don't have a non-production gfs cluster handy > at > >> the moment). > >> > >> Anyway, I'm going to comment out line 695 and see how that goes. In > > >> theory, this seems superfluous anyway, since the iteration through > /proc > >> for processes to kill should catch everything anyway, and in fact, > it is > >> this iteration that -x relies on for it's functionality! Otherwise > > >> kill(-1) will just blow everything away and preempt anything -x > might do > >> in the first place! > >> > >> Am I missing something obvious here? Is there a flaw in my > analysis? > > > > Sorry, small ammendment - line 695 only sends SIGSTOP. Since it > resumes > > the processes afterwards, this may not affect all processes, e.g. > those > > required by gfs. But if it sends a stop to glusterfsd, it's almost > > certain that rootfs will in fact block, so it is definitely an issue > for > > that. Since SIGSTOP cannot be caught or ignored by the process > itself, > > killall5 will have to be explicitly modified to do this differently, > > > e.g. using a double-pass through /proc, specifically without > including > > glusterfsd in the list of processes to signal. > > Attached is a proposed patch that tries to work around this specific > issue. It seems to work the machine no longer locks up on killall5, > which is a good sign, and a definitive improvement. :^) > > Please review. 
> > Now the problem is that md devices get stopped shortly afterwards, > just > after the "INIT: no more processes left in this run-level" message. > Now > I have to figure out what does that, since these must remain running > until the shutdown sequence reaches the OSR initroot... But that's > something for a separate thread. > > Gordan > > > [Text File:killall5.c.patch] -- Marc Grimme Tel: +49 89 4523538-14 Fax: +49 89 9901766-0 E-Mail: gr...@at... ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 | 85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930, USt.-Id.: DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.) | Vorsitzender des Aufsichtsrats: Dr. Martin Buss |
From: Gordan B. <go...@bo...> - 2009-12-28 08:21:52
|
Last new thread for today, I promise. ;)

I've been thinking about this for a while, and I think it's worth sharing. OSR is conceptually quite similar to virtualization - there is an init-root "hypervisor" that boot-straps the shared storage and then starts up the "guest" root chrooted to the shared storage.

OpenVZ (http://en.wikipedia.org/wiki/OpenVZ) virtualization is very similar to this (very much like Solaris "zones" or FreeBSD "jails"). It starts up the "guest" installation in its own chroot on the local file system, without actually having a disk image file container, and the virtualization abstraction layer is paper thin because the only things being virtualized for the guest are the process IDs (since some things are allegedly sensitive to init not being pid 1) and the networking (so that each guest can have its completely independent network configuration). All this means that the performance penalty is negligible. The guest VM doesn't run its own kernel - the host kernel does all the kernel tasks, and the guest's lowest level is its init.

What I'm thinking about is coming up with an OSR modification that takes advantage of this - making the init root slightly more fully featured (useful for debug purposes!) so that it boots into its own init, has its own console login, and sets up the disk volumes (of whatever description) for the shared root guest. It then simply starts up the shared root guest VM.

Now, I know this is a lot to take in (and yes, I know it sounds like a mad idea at first), and it is conceptually a pretty big change. But do you think it makes any sense to even look into going in this direction?

The benefits would be:

1) A more fully featured standalone init-root host would allow for easier debugging.

2) The "guest" wouldn't need any modification or tweaks - this would have avoided a number of problems, e.g. the issue on the killall5 thread, volume mounting above/below the guest's init line (/etc/mtab thread), the need for patches to the guest's halt/network init scripts, and possibly other things that show up the fragility of the initrd.

3) The guest wouldn't need any awareness of the file system it lives on or of any daemons required to sustain that file system - the host would take care of all of that with complete transparency. This means no need to worry about killing a process upon which things like the rootfs depend.

The reasons against this I can think of:

1) A tiny performance hit due to the networking stack and PID virtualization. (I don't think this would be measurable considering the inevitable cluster fs overheads.)

2) The initrd would end up being a bit bigger. (If it ends up having its own init and gettys it would be doing more, so it's inevitable for it to grow slightly, but it would almost certainly grow by a lot less than the savings yielded recently by the pruning of the unused kernel modules and the pyc/pyo files. ;)

3) Any other unforeseen things that show up only once the prototype is built. This is a big one. There has been a whole array of bugs I tripped in glfs because nobody ever considered the use-case of using it for a rootfs during testing (the biggest one off the top of my head was a massive memory leak stemming from mmap()-induced memory fragmentation that only arose when shared libraries were kept on glfs). I suspect this would likely expose similar problems - but I guess that is inevitable when straying off the straight and narrow.

Gordan |
From: Gordan B. <go...@bo...> - 2009-12-28 07:31:05
|
Is there a particular functional reason why /etc/mtab should be symlinked to /proc/mounts instead of /cdsl.local/etc/mtab?

The reason I ask is two-fold:

1) I'm not sure if having it symlinked to /proc/mounts confuses the process of unmounting the file systems in the shared root during shutdown. /proc/mounts shows all mounted file systems, including paths that aren't mounted in the OSR root (e.g. /mnt/tmproot for the glfs backing fs is mounted only in the initroot, but still shows up in /proc/mounts).

2) It seems to interfere with the checks to see whether a bind-mount is already mounted. Under OSR, I'm seeing all bind-mounts in fstab getting mounted twice, once when mounting local file systems, and once when mounting "other" file systems.

Is anything likely to break if /etc/mtab is symlinked to /cdsl.local/etc/mtab?

Gordan |
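One way to see why such "already mounted?" checks misfire is to look at what /proc/mounts actually exposes. A small Python sketch of parsing it (illustrative only, not OSR code - a bind mount shows up here with the underlying device, indistinguishable from an ordinary mount of that device):

```python
def parse_proc_mounts(path="/proc/mounts"):
    """Parse the kernel's mount table into (device, mountpoint, fstype)
    tuples.  Whitespace in paths is octal-escaped by the kernel
    (e.g. "\\040" for a space), so we decode those escapes."""
    entries = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 3:
                dev, mnt, fstype = (field.encode().decode("unicode_escape")
                                    for field in fields[:3])
                entries.append((dev, mnt, fstype))
    return entries
```

A check that asks "is this source device mounted anywhere?" against this table will answer yes for the bind-mount's origin as well, which is consistent with the double-mount symptom described above.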
From: Gordan B. <go...@bo...> - 2009-12-28 01:49:31
|
Gordan Bobic wrote: > Gordan Bobic wrote: >> Marc Grimme wrote: >> >>>> And just to confirm, I used the same binary on another machine >>>> (standalone, no OSR or clustering), and it works exactly as expected >>>> (prints out what processes it is killing). That means that whatever >>>> causes killall5 to go away and never return is specific to glfs+OSR >>>> (since killall5 works fine on my gfs+OSR clusters). I'm not sure where >>>> to even begin debugging this, though, so any ideas would be welcome. >> > >>> You might want to try to start it with strace. I recall something that >>> under some environments the browsing through /proc which is done by >>> killall5 freezes. And I think this is done before killing. Somehow what >>> does not work is a stat call on some /proc files within /proc/<pid>. I >>> don't recall exactly but I have something like this in mind. >>> >>> If you have found the pid that causes the problem perhaps we get some >>> new ideas on how to handle this behaviour. >> OK, I have straced killall5, and the last few things it does is stat >> /proc/version (twice, it seems) and set up SIGTERM, SIGSTOP and SIGKILL >> signals. This appears to correspond to lines 682-692 in killall5.c: >> >> mount_proc(); >> ... >> signal(SIGTERM, SIG_IGN); >> signal(SIGSTOP, SIG_IGN); >> signal(SIGKILL, SIG_IGN); >> >> The last thing strace reports is: >> >> kill(4294967295,SIGSTOP >> >> (note - no closing bracket) >> >> which seems to correspond to line 695: >> >> if (TEST == 0) kill(-1, SIGSTOP); >> >> Reading what "man 2 kill" says: >> POSIX.1-2001 requires that kill(-1,sig) send sig to all processes that >> the current process may send signals to, except possibly for some >> implementation-defined system processes. >> >> I have a suspicion that this may well be the cause of the problems. >> killall5 doesn't iterate through all the processes to kill! 
According to >> this, sending "kill(-1, <signal>)" sends the signal to all the processes >> that we have permissions to terminate without explicitly specifying the >> processes to terminate! Since killall5 is running as root at this point, >> this means all processes, with the possible exception of "some >> implementation-defined system processes". Right now my bet would be on >> this killing glusterfsd (which is in fact running in userspace, and thus >> is extremely unlikely to be exempt). >> >> This brings up another issue - it sounds like the -x option may be >> ineffective, too, even on the normal GFS related processes. If the >> signals get sent to all processes, then this would include the the >> processes specified by -x, regardless. This leads me to suspect that >> unless these processes are explicitly excluded in the kernel >> implementation, they are not spared the killing at this stage. Looking >> at the ps output - fenced, groupd, aisexec and ccsd, for example, don't >> show up in square brackets, which implies they aren't running in kernel >> space (although that isn't really definitive, only indicative, AFAIK). >> So, this may be affected by the bug, too - but this may not be obvious >> because once they die, the node will get fenced by the other nodes, >> which will end up doing something similar. Or maybe these processes >> simply catch and ignore the signals if they are being used (e.g. if gfs >> is mounted), or something like that. Anyway, that is just hypothesis at >> this point, but it's probably worth checking if you have a suitable test >> environment handy (I don't have a non-production gfs cluster handy at >> the moment). >> >> Anyway, I'm going to comment out line 695 and see how that goes. In >> theory, this seems superfluous anyway, since the iteration through /proc >> for processes to kill should catch everything anyway, and in fact, it is >> this iteration that -x relies on for it's functionality! 
Otherwise
>> kill(-1) will just blow everything away and preempt anything -x might do
>> in the first place!
>>
>> Am I missing something obvious here? Is there a flaw in my analysis?
>
> Sorry, small amendment - line 695 only sends SIGSTOP. Since it resumes
> the processes afterwards, this may not affect all processes, e.g. those
> required by gfs. But if it sends a stop to glusterfsd, it's almost
> certain that rootfs will in fact block, so it is definitely an issue for
> that. Since SIGSTOP cannot be caught or ignored by the process itself,
> killall5 will have to be explicitly modified to do this differently,
> e.g. using a double-pass through /proc, specifically without including
> glusterfsd in the list of processes to signal.

Attached is a proposed patch that tries to work around this specific issue. It seems to work: the machine no longer locks up on killall5, which is a good sign, and a definitive improvement. :^)

Please review.

Now the problem is that md devices get stopped shortly afterwards, just after the "INIT: no more processes left in this run-level" message. Now I have to figure out what does that, since these must remain running until the shutdown sequence reaches the OSR initroot... But that's something for a separate thread.

Gordan |
From: Gordan B. <go...@bo...> - 2009-12-28 01:02:39
|
Gordan Bobic wrote: > Marc Grimme wrote: > >>> And just to confirm, I used the same binary on another machine >>> (standalone, no OSR or clustering), and it works exactly as expected >>> (prints out what processes it is killing). That means that whatever >>> causes killall5 to go away and never return is specific to glfs+OSR >>> (since killall5 works fine on my gfs+OSR clusters). I'm not sure where >>> >>> to even begin debugging this, though, so any ideas would be welcome. > > >> You might want to try to start it with strace. I recall something that >> under some environments the browsing through /proc which is done by >> killall5 freezes. And I think this is done before killing. Somehow what >> does not work is a stat call on some /proc files within /proc/<pid>. I >> don't recall exactly but I have something like this in mind. >> >> If you have found the pid that causes the problem perhaps we get some >> new ideas on how to handle this behaviour. > > OK, I have straced killall5, and the last few things it does is stat > /proc/version (twice, it seems) and set up SIGTERM, SIGSTOP and SIGKILL > signals. This appears to correspond to lines 682-692 in killall5.c: > > mount_proc(); > ... > signal(SIGTERM, SIG_IGN); > signal(SIGSTOP, SIG_IGN); > signal(SIGKILL, SIG_IGN); > > The last thing strace reports is: > > kill(4294967295,SIGSTOP > > (note - no closing bracket) > > which seems to correspond to line 695: > > if (TEST == 0) kill(-1, SIGSTOP); > > Reading what "man 2 kill" says: > POSIX.1-2001 requires that kill(-1,sig) send sig to all processes that > the current process may send signals to, except possibly for some > implementation-defined system processes. > > I have a suspicion that this may well be the cause of the problems. > killall5 doesn't iterate through all the processes to kill! 
According to > this, sending "kill(-1, <signal>)" sends the signal to all the processes > that we have permissions to terminate without explicitly specifying the > processes to terminate! Since killall5 is running as root at this point, > this means all processes, with the possible exception of "some > implementation-defined system processes". Right now my bet would be on > this killing glusterfsd (which is in fact running in userspace, and thus > is extremely unlikely to be exempt). > > This brings up another issue - it sounds like the -x option may be > ineffective, too, even on the normal GFS related processes. If the > signals get sent to all processes, then this would include the the > processes specified by -x, regardless. This leads me to suspect that > unless these processes are explicitly excluded in the kernel > implementation, they are not spared the killing at this stage. Looking > at the ps output - fenced, groupd, aisexec and ccsd, for example, don't > show up in square brackets, which implies they aren't running in kernel > space (although that isn't really definitive, only indicative, AFAIK). > So, this may be affected by the bug, too - but this may not be obvious > because once they die, the node will get fenced by the other nodes, > which will end up doing something similar. Or maybe these processes > simply catch and ignore the signals if they are being used (e.g. if gfs > is mounted), or something like that. Anyway, that is just hypothesis at > this point, but it's probably worth checking if you have a suitable test > environment handy (I don't have a non-production gfs cluster handy at > the moment). > > Anyway, I'm going to comment out line 695 and see how that goes. In > theory, this seems superfluous anyway, since the iteration through /proc > for processes to kill should catch everything anyway, and in fact, it is > this iteration that -x relies on for it's functionality! 
Otherwise
> kill(-1) will just blow everything away and preempt anything -x might do
> in the first place!
>
> Am I missing something obvious here? Is there a flaw in my analysis?

Sorry, small amendment - line 695 only sends SIGSTOP. Since it resumes the processes afterwards, this may not affect all processes, e.g. those required by gfs. But if it sends a stop to glusterfsd, it's almost certain that rootfs will in fact block, so it is definitely an issue for that. Since SIGSTOP cannot be caught or ignored by the process itself, killall5 will have to be explicitly modified to do this differently, e.g. using a double-pass through /proc, specifically without including glusterfsd in the list of processes to signal.

Gordan |
From: Gordan B. <go...@bo...> - 2009-12-28 00:55:30
|
Marc Grimme wrote:
>> And just to confirm, I used the same binary on another machine
>> (standalone, no OSR or clustering), and it works exactly as expected
>> (prints out what processes it is killing). That means that whatever
>> causes killall5 to go away and never return is specific to glfs+OSR
>> (since killall5 works fine on my gfs+OSR clusters). I'm not sure where
>> to even begin debugging this, though, so any ideas would be welcome.
>
> You might want to try to start it with strace. I recall something that
> under some environments the browsing through /proc which is done by
> killall5 freezes. And I think this is done before killing. Somehow what
> does not work is a stat call on some /proc files within /proc/<pid>. I
> don't recall exactly but I have something like this in mind.
>
> If you have found the pid that causes the problem perhaps we get some
> new ideas on how to handle this behaviour.

OK, I have straced killall5, and the last few things it does is stat /proc/version (twice, it seems) and set up SIGTERM, SIGSTOP and SIGKILL signals. This appears to correspond to lines 682-692 in killall5.c:

  mount_proc();
  ...
  signal(SIGTERM, SIG_IGN);
  signal(SIGSTOP, SIG_IGN);
  signal(SIGKILL, SIG_IGN);

The last thing strace reports is:

  kill(4294967295,SIGSTOP

(note - no closing bracket)

which seems to correspond to line 695:

  if (TEST == 0) kill(-1, SIGSTOP);

Reading what "man 2 kill" says:

  POSIX.1-2001 requires that kill(-1,sig) send sig to all processes that
  the current process may send signals to, except possibly for some
  implementation-defined system processes.

I have a suspicion that this may well be the cause of the problems. killall5 doesn't iterate through all the processes to kill! According to this, sending "kill(-1, <signal>)" sends the signal to all the processes that we have permissions to terminate without explicitly specifying the processes to terminate!
Since killall5 is running as root at this point, this means all processes, with the possible exception of "some implementation-defined system processes". Right now my bet would be on this killing glusterfsd (which is in fact running in userspace, and thus is extremely unlikely to be exempt).

This brings up another issue - it sounds like the -x option may be ineffective, too, even on the normal GFS related processes. If the signals get sent to all processes, then this would include the processes specified by -x, regardless. This leads me to suspect that unless these processes are explicitly excluded in the kernel implementation, they are not spared the killing at this stage. Looking at the ps output - fenced, groupd, aisexec and ccsd, for example, don't show up in square brackets, which implies they aren't running in kernel space (although that isn't really definitive, only indicative, AFAIK). So, this may be affected by the bug, too - but this may not be obvious because once they die, the node will get fenced by the other nodes, which will end up doing something similar. Or maybe these processes simply catch and ignore the signals if they are being used (e.g. if gfs is mounted), or something like that. Anyway, that is just a hypothesis at this point, but it's probably worth checking if you have a suitable test environment handy (I don't have a non-production gfs cluster handy at the moment).

Anyway, I'm going to comment out line 695 and see how that goes. In theory, this seems superfluous anyway, since the iteration through /proc for processes to kill should catch everything anyway, and in fact, it is this iteration that -x relies on for its functionality! Otherwise kill(-1) will just blow everything away and preempt anything -x might do in the first place!

Am I missing something obvious here? Is there a flaw in my analysis?

Gordan |
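The alternative to kill(-1) discussed here - enumerating /proc explicitly so that exclusions can actually be honoured - can be sketched in Python. This is only an illustration of the logic, not the killall5.c patch; reading /proc/<pid>/comm for the process name is an assumption of this sketch (it exists on newer kernels, while killall5 itself parses /proc/<pid>/stat):

```python
import os

def pids_to_signal(exclude_names=(), exclude_pids=()):
    """Walk /proc and collect candidate (pid, name) pairs the way an
    explicit killall5 pass has to: skip our own process, init (pid 1),
    kernel threads (empty cmdline), and any explicitly excluded names
    or pids.  A bare kill(-1, sig) bypasses this list entirely, which
    is exactly the problem analysed above."""
    me = os.getpid()
    victims = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        if pid in (me, 1) or pid in exclude_pids:
            continue
        try:
            with open("/proc/%d/comm" % pid) as f:
                name = f.read().strip()
            with open("/proc/%d/cmdline" % pid, "rb") as f:
                cmdline = f.read()
        except OSError:
            continue  # raced with a process that just exited
        if not cmdline:
            continue  # kernel thread: no userspace command line
        if name in exclude_names:
            continue  # the -x style exclusion kill(-1) cannot honour
        victims.append((pid, name))
    return victims
```

A real fix would then make a second pass sending the signal per collected pid (os.kill(pid, sig) here, kill(pid, sig) in C), e.g. with exclude_names=("glusterfsd",), so the rootfs daemon is never stopped.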
From: Marc G. <gr...@at...> - 2009-12-27 10:18:46
|
----- "Gordan Bobic" <go...@bo...> wrote: > Gordan Bobic wrote: > > On 26/12/2009 21:44, Marc Grimme wrote: > > > And just to confirm, I used the same binary on another machine > (standalone, no OSR or clustering), and it works exactly as expected > (prints out what processes it is killing). That means that whatever > causes killall5 to go away and never return is specific to glfs+OSR > (since killall5 works fine on my gfs+OSR clusters). I'm not sure where > > to even begin debugging this, though, so any ideas would be welcome. You might want to try to start it with strace. I recall something that under some environments the browsing through /proc which is done by killall5 freezes. And I think this is done before killing. Somehow what does not work is a stat call on some /proc files within /proc/<pid>. I don't recall exactly but I have something like this in mind. If you have found the pid that causes the problem perhaps we get some new ideas on how to handle this behaviour. Marc. |
From: Gordan B. <go...@bo...> - 2009-12-27 03:00:34
|
Gordan Bobic wrote: > On 26/12/2009 21:44, Marc Grimme wrote: > >>>> On output of killall5: >>>> It should output to the console. Which means you should see your >>> printfs >>>> on console. >>> That's what I thought - only it doesn't. killall5 starts, and nothing >>> >>> happens after that. I wonder if the problem actually occurs before >>> that, >> Try to first add an echo before killall5 is issued to see what would be called. >> I often added a bash/sh call afterwards to test what's happening: >> action $"Sending all processes the TERM signal..." /usr/comoonics/killall5 -15 ${KILLALL_OPTS} >> sleep 5 >> action $"Sending all processes the KILL signal..." /usr/comoonics/killall5 -9 ${KILLALL_OPTS} >> -> >> echo "Sending all processes the TERM signal..." /usr/comoonics/killall5 -15 ${KILLALL_OPTS} >> bash >> action $"Sending all processes the TERM signal..." /usr/comoonics/killall5 -15 ${KILLALL_OPTS} >> sleep 5 >> action $"Sending all processes the KILL signal..." /usr/comoonics/killall5 -9 ${KILLALL_OPTS} >> >> Then you know what happens. I don't think bootsr does any harm see below. >> In the shell you could also try your command. > > I just did that, and the result is as expected. killall5 starts and > never returns. No output. > >>>> You could >>>> output the killall opts to console in order to see if the programs >>> get >>>> excluded within killing: >>>> echo "/usr/comoonics/killall5 -15 ${KILLALL_OPTS}" >>> > just before it is called and you should see if it is called >>> properly. >>> >>> Did that, too, and that looks fine. But killall5 never returns, and >>> never prints any output. > > >> Never returns is suspicious. Try to call it yourself. > > I did - and it really doesn't return. > I'm a bit stumped at this, especially since the effect is the same both > on the original killall5 that ships with OSR sysvinit and my modified > killall5, so I'm reasonably sure that killall5 isn't broken or miscompiled. 
And just to confirm, I used the same binary on another machine (standalone, no OSR or clustering), and it works exactly as expected (prints out what processes it is killing). That means that whatever causes killall5 to go away and never return is specific to glfs+OSR (since killall5 works fine on my gfs+OSR clusters). I'm not sure where to even begin debugging this, though, so any ideas would be welcome. Gordan |