From: <go...@bo...> - 2008-04-16 09:52:22
|
Hi,

Does anyone think that adding support for this would be a good idea? I'm working with GlusterFS at the moment, so I could try to add the relevant init stuff when I've ironed things out a bit. Maybe as a contrib package, like DRBD?

On a separate note, am I correct in presuming that the diet version of the initrd, with the kernel drivers pruned and additional package filtering added as per the patch I sent a while back, was not deemed a good idea?

Gordan |
From: Marc G. <gr...@at...> - 2008-04-16 12:47:36
|
Hi Gordan,

On Wednesday 16 April 2008 11:52:18 go...@bo... wrote:
> Hi,
>
> Does anyone think that adding support for this would be a good idea? I'm
> working with GlusterFS at the moment, so I could try to add the relevant
> init stuff when I've ironed things out a bit. Maybe as a contrib package,
> like DRBD?

After going roughly over the features and concepts of GlusterFS, I doubt it would be an easy task to build an open-sharedroot cluster with it, but why not. It still sounds quite promising, and if you like, you are welcome to contribute. We'll support you as best we can.

> On a separate note, am I correct in presuming that the diet version of the
> initrd with the kernel drivers pruned and additional package filtering
> added as per the patch I sent a while back was not deemed a good idea?

Thanks for reminding me. I forgot to answer, sorry.

The idea itself is good. But originally, and by concept, the initrd is designed to be used for different hardware configurations. That implies we need different kernel modules and tools on the same cluster. Say you use a combination of virtualized and unvirtualized nodes in a cluster. As of now, that is possible. Or just different servers. This would not be possible with your diet patch, would it?

I thought of using it as a special option to mkinitrd (--diet or the like). Could you provide a patch for this?

Thanks and regards,
Marc.

--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/ |
From: <go...@bo...> - 2008-04-16 13:13:17
|
On Wed, 16 Apr 2008, Marc Grimme wrote:
>> Does anyone think that adding support for this would be a good idea? I'm
>> working with GlusterFS at the moment, so I could try to add the relevant
>> init stuff when I've ironed things out a bit. Maybe as a contrib package,
>> like DRBD?
>
> After going roughly over the features and concepts of GlusterFS, I doubt it
> would be an easy task to build an open-sharedroot cluster with it, but why not.

It shouldn't be too different from the OSR NFS setup. There are two options:

1) diskless client
2) mirror client/server

In the diskless client case it would be pretty much the same as NFS.

In the mirror client/server case, it would be similar to a DRBD+GFS setup, only scalable to more than 2-3 nodes (IIRC, DRBD only supports up to 2-3 nodes at the moment). Each node would mount its local mirror as OSR (as it does with DRBD).

The upshot is that, as far as I can make out, split-brains are less of an issue in FS terms than with GFS - GlusterFS would sort that out, so in theory we could have an n-node cluster with a quorum of 1.

The only potential issue with that would be migration of IPs - if it split-brains, it would, in theory, cause an IP resource clash. But I think the scope for FS corruption would be removed. There might still be file clobbering on resync, but the FS certainly wouldn't get totally destroyed like with split-brain GFS on shared SAN storage (DRBD+GFS also has the benefit that you get to keep at least one version of the FS after a split-brain).

> It still sounds quite promising, and if you like, you are welcome to contribute.

OK, thanks. I just wanted to float the idea and see if there are any strong objections to it first. :-)

>> On a separate note, am I correct in presuming that the diet version of the
>> initrd with the kernel drivers pruned and additional package filtering
>> added as per the patch I sent a while back was not deemed a good idea?
>
> Thanks for reminding me. I forgot to answer, sorry.
>
> The idea itself is good. But originally, and by concept, the initrd is
> designed to be used for different hardware configurations.

Same initrd for multiple configurations? Why is this useful? Different configurations could also run different kernels, which would invalidate the shared initrd concept...

> That implies we need different kernel modules and tools on the same
> cluster.

Sure - but clusters like this, at least in my experience, generally tend to be homogeneous, when there is a choice.

The way I made the patch makes some allowance for this, in that both the loaded modules and all the ones listed in /etc/modprobe.conf get included - just in case. So modprobe.conf could be (ab)used to load additional modules. But I accept this is potentially a somewhat cringeworthy hack when used to force additional modules into the initrd.

> Say you use a combination of virtualized and unvirtualized nodes in a
> cluster. As of now, that is possible. Or just different servers. This would
> not be possible with your diet patch, would it?

No, probably not - but then again, virtualized and non-virtualized nodes would be running different kernels (e.g. 2.6.18-53 physical and 2.6.18-53xen virtual), so the point is somewhat moot. You'd need different initrds anyway. And the different nodes would use different modprobe.conf files if their hardware is different. So the only extra requirement with my patch would be that the initrd is built on the same node that will be running the initrd image. In the case of virtual vs. non-virtual hardware (or just different kernel versions), it would still be a case of running mkinitrd with different kernel versions and a different modprobe.conf file, as AFAIK this gets included in the initrd.

> I thought of using it as a special option to mkinitrd (--diet or the like).
> Could you provide a patch for this?

That could be arranged. I think that's a reasonably good idea. But as I mentioned above, I'm not sure the full-fat initrd actually gains much in terms of node/hardware compatibility.

I'll send the --diet optioned patch. I'll leave the choice of whether --diet should be the default to you guys. :-)

Gordan |
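To make the mirror client/server option a bit more concrete, here is a rough sketch of what a two-node replicated client volfile could look like, assuming GlusterFS 1.3-style translator syntax; the host names, volume names and the exported "brick" subvolume are purely illustrative, and each node's server side would additionally have to export a storage/posix brick:

    # Sketch only: each node mounts this volfile; cluster/afr mirrors writes
    # across both bricks, so every node keeps a full local copy of the root.
    volume node1
      type protocol/client
      option transport-type tcp/client
      option remote-host node1.example.com
      option remote-subvolume brick
    end-volume

    volume node2
      type protocol/client
      option transport-type tcp/client
      option remote-host node2.example.com
      option remote-subvolume brick
    end-volume

    volume mirror
      type cluster/afr
      subvolumes node1 node2
    end-volume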
From: Marc G. <gr...@at...> - 2008-04-17 07:48:59
|
On Wednesday 16 April 2008 15:13:15 go...@bo... wrote:
> On Wed, 16 Apr 2008, Marc Grimme wrote:
> >> Does anyone think that adding support for this would be a good idea? I'm
> >> working with GlusterFS at the moment, so I could try to add the relevant
> >> init stuff when I've ironed things out a bit. Maybe as a contrib
> >> package, like DRBD?
> >
> > After going roughly over the features and concepts of GlusterFS, I doubt
> > it would be an easy task to build an open-sharedroot cluster with it,
> > but why not.
>
> It shouldn't be too different from the OSR NFS setup. There are two options:
> 1) diskless client
> 2) mirror client/server
>
> In the diskless client case it would be pretty much the same as NFS.

Agreed. I forgot about NFS. ;-)

> In the mirror client/server case, it would be similar to a DRBD+GFS setup,
> only scalable to more than 2-3 nodes (IIRC, DRBD only supports up to 2-3
> nodes at the moment). Each node would mount its local mirror as OSR (as it
> does with DRBD).
>
> The upshot is that, as far as I can make out, split-brains are less of an
> issue in FS terms than with GFS - GlusterFS would sort that out, so in
> theory we could have an n-node cluster with a quorum of 1.
>
> The only potential issue with that would be migration of IPs - if it
> split-brains, it would, in theory, cause an IP resource clash. But I think
> the scope for FS corruption would be removed. There might still be file
> clobbering on resync, but the FS certainly wouldn't get totally destroyed
> like with split-brain GFS on shared SAN storage (DRBD+GFS also has the
> benefit that you get to keep at least one version of the FS after a
> split-brain).

And wouldn't the IP thing, if appropriate, be handled via a cluster manager (rgmanager)?

> > It still sounds quite promising, and if you like, you are welcome to
> > contribute.
>
> OK, thanks. I just wanted to float the idea and see if there are any
> strong objections to it first. :-)
>
> >> On a separate note, am I correct in presuming that the diet version of
> >> the initrd with the kernel drivers pruned and additional package
> >> filtering added as per the patch I sent a while back was not deemed a
> >> good idea?
> >
> > Thanks for reminding me. I forgot to answer, sorry.
> >
> > The idea itself is good. But originally, and by concept, the initrd is
> > designed to be used for different hardware configurations.
>
> Same initrd for multiple configurations? Why is this useful? Different
> configurations could also run different kernels, which would invalidate
> the shared initrd concept...

Not necessarily. It was a design idea, it still is a kind of USP and, most importantly, something that other customers use.

Just a small example why. Suppose you have servers from HP in the same product branch (like the HP DL38x), but of different generations. The onboard NICs on older ones would use the tg3/bcm5700 driver, whereas newer generations use the bnx2 driver for their onboard NICs. When you then bring in IBM/Sun/Dell or whatever other servers, it becomes more complicated. And all of this should be handled by one single shared boot.

Did this explain the problem?

> > That implies we need different kernel modules and tools on the same
> > cluster.
>
> Sure - but clusters like this, at least in my experience, generally tend
> to be homogeneous, when there is a choice.

Not in our experience. See above.

> The way I made the patch makes some allowance for this, in that both the
> loaded modules and all the ones listed in /etc/modprobe.conf get included
> - just in case. So modprobe.conf could be (ab)used to load additional
> modules. But I accept this is potentially a somewhat cringeworthy hack
> when used to force additional modules into the initrd.
>
> > Say you use a combination of virtualized and unvirtualized nodes in a
> > cluster. As of now, that is possible. Or just different servers. This
> > would not be possible with your diet patch, would it?
>
> No, probably not - but then again, virtualized and non-virtualized nodes
> would be running different kernels (e.g. 2.6.18-53 physical and
> 2.6.18-53xen virtual), so the point is somewhat moot. You'd need different
> initrds anyway. And the different nodes would use different modprobe.conf
> files if their hardware is different. So the only extra requirement with
> my patch would be that the initrd is built on the same node that will be
> running the initrd image. In the case of virtual vs. non-virtual hardware
> (or just different kernel versions), it would still be a case of running
> mkinitrd with different kernel versions and a different modprobe.conf
> file, as AFAIK this gets included in the initrd.

You got that point (but keep in mind that it only holds for Xen). But see above.

> > I thought of using it as a special option to mkinitrd (--diet or the
> > like). Could you provide a patch for this?
>
> That could be arranged. I think that's a reasonably good idea. But as I
> mentioned above, I'm not sure the full-fat initrd actually gains much in
> terms of node/hardware compatibility.
>
> I'll send the --diet optioned patch. I'll leave the choice of
> whether --diet should be the default to you guys. :-)

;-) Thanks Gordan.

Regards,
Marc.

--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/ |
From: <go...@bo...> - 2008-04-17 09:37:11
|
On Thu, 17 Apr 2008, Marc Grimme wrote:
>> In the mirror client/server case, it would be similar to a DRBD+GFS setup,
>> only scalable to more than 2-3 nodes (IIRC, DRBD only supports up to 2-3
>> nodes at the moment). Each node would mount its local mirror as OSR (as it
>> does with DRBD).
>>
>> The upshot is that, as far as I can make out, split-brains are less of an
>> issue in FS terms than with GFS - GlusterFS would sort that out, so in
>> theory we could have an n-node cluster with a quorum of 1.
>>
>> The only potential issue with that would be migration of IPs - if it
>> split-brains, it would, in theory, cause an IP resource clash. But I think
>> the scope for FS corruption would be removed. There might still be file
>> clobbering on resync, but the FS certainly wouldn't get totally destroyed
>> like with split-brain GFS on shared SAN storage (DRBD+GFS also has the
>> benefit that you get to keep at least one version of the FS after a
>> split-brain).
>
> And wouldn't the IP thing, if appropriate, be handled via a cluster manager
> (rgmanager)?

Indeed it would, but that would still be susceptible to split-brain IP clashes. But fencing should, hopefully, stop that from ever happening.

>>>> On a separate note, am I correct in presuming that the diet version of
>>>> the initrd with the kernel drivers pruned and additional package
>>>> filtering added as per the patch I sent a while back was not deemed a
>>>> good idea?
>>>
>>> Thanks for reminding me. I forgot to answer, sorry.
>>>
>>> The idea itself is good. But originally, and by concept, the initrd is
>>> designed to be used for different hardware configurations.
>>
>> Same initrd for multiple configurations? Why is this useful? Different
>> configurations could also run different kernels, which would invalidate
>> the shared initrd concept...
>
> Not necessarily. It was a design idea, it still is a kind of USP and, most
> importantly, something that other customers use.
>
> Just a small example why. Suppose you have servers from HP in the same
> product branch (like the HP DL38x), but of different generations. The
> onboard NICs on older ones would use the tg3/bcm5700 driver, whereas newer
> generations use the bnx2 driver for their onboard NICs. When you then bring
> in IBM/Sun/Dell or whatever other servers, it becomes more complicated. And
> all of this should be handled by one single shared boot.
>
> Did this explain the problem?

How do you work around the fact that each node needs a different modprobe.conf for the different NIC/driver bindings?

Gordan |
From: Marc G. <gr...@at...> - 2008-04-17 09:59:51
|
On Thursday 17 April 2008 11:15:42 go...@bo... wrote:
> On Thu, 17 Apr 2008, Marc Grimme wrote:
> >> In the mirror client/server case, it would be similar to a DRBD+GFS
> >> setup, only scalable to more than 2-3 nodes (IIRC, DRBD only supports up
> >> to 2-3 nodes at the moment). Each node would mount its local mirror as
> >> OSR (as it does with DRBD).
> >>
> >> The upshot is that, as far as I can make out, split-brains are less of
> >> an issue in FS terms than with GFS - GlusterFS would sort that out, so
> >> in theory we could have an n-node cluster with a quorum of 1.
> >>
> >> The only potential issue with that would be migration of IPs - if it
> >> split-brains, it would, in theory, cause an IP resource clash. But I
> >> think the scope for FS corruption would be removed. There might still be
> >> file clobbering on resync, but the FS certainly wouldn't get totally
> >> destroyed like with split-brain GFS on shared SAN storage (DRBD+GFS also
> >> has the benefit that you get to keep at least one version of the FS
> >> after a split-brain).
> >
> > And wouldn't the IP thing, if appropriate, be handled via a cluster
> > manager (rgmanager)?
>
> Indeed it would, but that would still be susceptible to split-brain IP
> clashes. But fencing should, hopefully, stop that from ever happening.

Yes. The rgmanager, or even heartbeat or any other HA cluster software, has its own way of detecting and resolving split-brain scenarios. The rgmanager uses the same functionality as GFS does.

> >>>> On a separate note, am I correct in presuming that the diet version of
> >>>> the initrd with the kernel drivers pruned and additional package
> >>>> filtering added as per the patch I sent a while back was not deemed a
> >>>> good idea?
> >>>
> >>> Thanks for reminding me. I forgot to answer, sorry.
> >>>
> >>> The idea itself is good. But originally, and by concept, the initrd is
> >>> designed to be used for different hardware configurations.
> >>
> >> Same initrd for multiple configurations? Why is this useful? Different
> >> configurations could also run different kernels, which would invalidate
> >> the shared initrd concept...
> >
> > Not necessarily. It was a design idea, it still is a kind of USP and,
> > most importantly, something that other customers use.
> >
> > Just a small example why. Suppose you have servers from HP in the same
> > product branch (like the HP DL38x), but of different generations. The
> > onboard NICs on older ones would use the tg3/bcm5700 driver, whereas
> > newer generations use the bnx2 driver for their onboard NICs. When you
> > then bring in IBM/Sun/Dell or whatever other servers, it becomes more
> > complicated. And all of this should be handled by one single shared boot.
> >
> > Did this explain the problem?
>
> How do you work around the fact that each node needs a different
> modprobe.conf for the different NIC/driver bindings?

The hardware detection takes place in the initrd, and the "generated" initrd is copied onto the root disk during the boot process.

Marc.
-- Gruss / Regards, Marc Grimme http://www.atix.de/ http://www.open-sharedroot.org/ |
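For illustration, here is a rough sketch of how such in-initrd hardware detection can be done on a 2.6 kernel: walk the PCI devices in sysfs and let modprobe resolve each device's modalias, so the same shared initrd ends up loading tg3/bcm5700 on one hardware generation and bnx2 on another. This only sketches the general technique and is not the actual open-sharedroot detection code:

    #!/bin/sh
    # Sketch only: probe PCI devices from inside the initrd and load a
    # driver for each one via its modalias string.
    for dev in /sys/bus/pci/devices/*; do
        [ -r "$dev/modalias" ] || continue
        # modprobe matches the modalias against modules.alias, picking the
        # right NIC/HBA driver for whatever hardware this node actually has.
        modprobe "$(cat "$dev/modalias")" 2>/dev/null
    done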
From: <go...@bo...> - 2008-04-17 11:18:40
Attachments:
chroot-lib.sh.patch
create-gfs-initrd-generic.sh.patch
|
On Thu, 17 Apr 2008, Marc Grimme wrote:
>>>>>> On a separate note, am I correct in presuming that the diet version of
>>>>>> the initrd with the kernel drivers pruned and additional package
>>>>>> filtering added as per the patch I sent a while back was not deemed a
>>>>>> good idea?
>>>>>
>>>>> Thanks for reminding me. I forgot to answer, sorry.
>>>>>
>>>>> The idea itself is good. But originally, and by concept, the initrd is
>>>>> designed to be used for different hardware configurations.
>>>>
>>>> Same initrd for multiple configurations? Why is this useful? Different
>>>> configurations could also run different kernels, which would invalidate
>>>> the shared initrd concept...
>>>
>>> Not necessarily. It was a design idea, it still is a kind of USP and,
>>> most importantly, something that other customers use.
>>>
>>> Just a small example why. Suppose you have servers from HP in the same
>>> product branch (like the HP DL38x), but of different generations. The
>>> onboard NICs on older ones would use the tg3/bcm5700 driver, whereas
>>> newer generations use the bnx2 driver for their onboard NICs. When you
>>> then bring in IBM/Sun/Dell or whatever other servers, it becomes more
>>> complicated. And all of this should be handled by one single shared boot.
>>>
>>> Did this explain the problem?
>>
>> How do you work around the fact that each node needs a different
>> modprobe.conf for the different NIC/driver bindings?
>
> The hardware detection takes place in the initrd, and the "generated"
> initrd is copied onto the root disk during the boot process.

Ah, OK. I didn't realize that this bit of NIC detection logic happens in the initrd. I thought it just went by modprobe.conf.

Anyway - attached is the updated patch for create-gfs-initrd-generic.sh. mkinitrd now takes the -l parameter (l for "light"). I thought about using getopt instead of getopts for long parameters, but since the current implementation uses the getopts bash builtin, I decided to stick with it.

When -l is passed, only the modules currently loaded and those listed in modprobe.conf are put into the initrd. This reduces the initrd image by about 10MB (in my case, from 53MB to 43MB).

I have also attached the patch for chroot-lib.sh. This is the same patch I sent previously, which adds optional exclusion filtering in the rpm list files. The old format is still supported - if the 2nd filtering parameter (the one for excluding files) is omitted, it is ignored and the exclusion is not performed.

Gordan |
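For reference, a minimal sketch of the kind of module selection the -l ("light") option describes: take the modules currently loaded on the build node plus anything explicitly bound in /etc/modprobe.conf. The option letter matches the mail above, but the function and variable names are illustrative and not taken from the actual create-gfs-initrd-generic.sh patch:

    # Sketch only: decide which kernel modules go into a "light" initrd.
    light=0
    while getopts "l" opt; do
        case "$opt" in
            l) light=1 ;;   # -l: prune to loaded + modprobe.conf modules
        esac
    done

    get_light_module_list() {
        # Modules currently loaded on the node building the initrd...
        awk '{print $1}' /proc/modules
        # ...plus modules explicitly referenced in modprobe.conf
        # (alias and install lines), just in case.
        awk '$1 == "alias" {print $3} $1 == "install" {print $2}' \
            /etc/modprobe.conf 2>/dev/null
    }

    if [ "$light" -eq 1 ]; then
        MODULES=$(get_light_module_list | sort -u)
    fi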