From: Don Z. <dz...@ho...> - 2004-06-17 15:38:47
Yes, I know. He is part of our group. It seems as though SAF is not
good enough for our work. It all boils down to a split partition event
vs. a node down event. SAF/CCM cannot reliably tell the two apart, hence
we are looking at implementing a quorum disk, which is an extra check
that differentiates the two events. Just another side project with
OpenDLM while we tie up our performance testing with CCCP.

Best Regards,
Don

>From: "Cahill, Ben M" <ben...@in...>
>Reply-To: ope...@li...
>To: <ope...@li...>
>Subject: RE: [Opendlm-devel] More SAF ??s
>Date: Thu, 17 Jun 2004 08:20:49 -0700
>
>Here's some more info from linux-ha list, pasted in below ... same
>problem??
>
>-- Ben --
>Opinions are mine, not Intel's
>
>
>-----Original Message-----
>From: lin...@li...
>[mailto:lin...@li...] On Behalf Of Alan Robertson
>Sent: Thursday, June 17, 2004 9:44 AM
>To: General Linux-HA mailing list
>Subject: Re: [Linux-HA] ccm_testclient
>
>Ram Pai wrote:
> > On Wed, 2004-06-16 at 09:06, Salman, Basith wrote:
> >
> >>Hi All,
> >>
> >>On running the ccm_testclient application on both nodes of a two node
> >>cluster, and disconnecting the ethernet cable (connecting the two
> >>nodes) I get the output message from the test program on both nodes as
> >>below (at the bottom).
> >>
> >>And on RECONNECTING back the cable I get the message on one of the
> >>nodes as:
> >>./ccm_testclient: info: event=EVICTED
> >>./ccm_testclient: ERROR: terminating
> >>
> >>Is this the correct behaviour? Should I not get the evicted message on
> >>one of the nodes as soon as the cable is pulled out?
> >
> >
> > When you pull out the cable, each node loses connectivity with the
> > other. I assume stonith is not configured and hence both the nodes
> > continue to exist. As a result they form their own individual cluster
> > with no quorum.
> >
> > When the cable is connected back, they gain back connectivity.
> > And when they try to form a membership, they realize that they are
> > out of sync with each other's membership. So one of the nodes evicts.
> >
> > So this is by design. Ideally we would like CCM to be able to merge
> > gracefully. But that is a tougher problem.
>
>And, unfortunately, it's necessary for it to do so, even though it
>doesn't do that currently. Ram is the original owner of this code, but
>no longer has time to maintain it. Forrest Zhao is the current
>maintainer of CCM.
>
>IMHO, it's a bug in the API that fails to inform you of the membership
>without quorum. The newer version (which fixes that) isn't yet
>implemented.
>
>Ram: Is it right that he doesn't get notified of the loss of quorum?
>
>
>
>--
> Alan Robertson <al...@un...>
>
>
> > -----Original Message-----
> > From: ope...@li...
> > [mailto:ope...@li...] On Behalf
> > Of Stanley Wang
> > Sent: Wednesday, June 16, 2004 12:49 AM
> > To: OpenDLM Dev Mail List
> > Subject: Re: [Opendlm-devel] More SAF ??s
> >
> > yes, you are right.
> > It seems the SAF AIS membership API doesn't consider cluster
> > partition. There is no message type used for informing the client of
> > a cluster partition event.
> > You can go ahead and submit a bug for the AIS spec, they may fix it
> > in the next revision. Actually I did something similar when I was
> > coding for the SAF AIS lock API.
> >
> > Best Regards,
> > Stan
> >
> > On Mon, 2004-06-14 at 18:03, Zickus II, Don wrote:
> > > Stan,
> > >
> > > After testing the SAF interface and reading the spec and header
> > > file, I noticed SAF has no concept of a split brain cluster? Is
> > > this true??? It seems the only return codes supported are
> > > SAF_NO_CHANGE, SAF_JOINED, and SAF_LEFT. Unless the logic for
> > > SAF_LEFT is broken and doesn't translate the OCF protocol for
> > > leaving the cluster properly, we will not be able to use the SAF
> > > module. We would have to go back to writing a module that
> > > interfaces directly with the OCF piece of CCM.
> > > I hope you can tell me otherwise. :) Please let me know if this is
> > > true or not.
> > >
> > > Best Regards,
> > > Don
> > --
> > Opinions expressed are those of the author and do not represent Intel
> > Corporation
> > "gpg --recv-keys --keyserver wwwkeys.pgp.net E1390A7F"
> > {E1390A7F:3AD1 1B0C 2019 E183 0CFF 55E8 369A 8B75 E139 0A7F}
> >
> > -------------------------------------------------------
> > This SF.Net email is sponsored by The 2004 JavaOne(SM) Conference
> > Learn from the experts at JavaOne(SM), Sun's Worldwide Java Developer
> > Conference, June 28 - July 1 at the Moscone Center in San
> > Francisco, CA
> > REGISTER AND SAVE! http://java.sun.com/javaone/sf Priority
> > Code NWMGYKND
> > _______________________________________________
> > Opendlm-devel mailing list
> > Ope...@li...
> > https://lists.sourceforge.net/lists/listinfo/opendlm-devel
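
For readers following the thread: the quorum disk idea in Don's message
at the top (an extra check that tells a split partition apart from a
node down event) can be illustrated with a small sketch. This is not
OpenDLM, CCM, or SAF code. It assumes each node periodically writes a
timestamp into its own slot on a shared disk; the names QuorumDisk,
classify_peer, and stale_after are hypothetical, and the disk is
simulated in memory. A real implementation would also need raw shared
disk I/O and fencing, which are left out here.

```python
import enum
from dataclasses import dataclass, field
from typing import Dict, Optional


class PeerEvent(enum.Enum):
    HEALTHY = "healthy"                  # network heartbeat still arriving
    SPLIT_PARTITION = "split_partition"  # silent on network, alive on disk
    NODE_DOWN = "node_down"              # silent on network and on disk


@dataclass
class QuorumDisk:
    """In-memory stand-in for a shared disk with one timestamp slot per node."""
    slots: Dict[str, float] = field(default_factory=dict)

    def write_stamp(self, node: str, now: float) -> None:
        # A live node refreshes its own slot on every heartbeat interval.
        self.slots[node] = now

    def read_stamp(self, node: str) -> Optional[float]:
        # Returns None if the node has never written to the disk.
        return self.slots.get(node)


def classify_peer(disk: QuorumDisk, peer: str, network_alive: bool,
                  now: float, stale_after: float) -> PeerEvent:
    """Decide what happened to `peer` using the network and the disk."""
    if network_alive:
        return PeerEvent.HEALTHY
    stamp = disk.read_stamp(peer)
    # A fresh disk stamp means the peer is still running: only the
    # network link between the nodes is gone.
    if stamp is not None and now - stamp <= stale_after:
        return PeerEvent.SPLIT_PARTITION
    # No stamp, or a stale one: the peer stopped writing, so treat it as down.
    return PeerEvent.NODE_DOWN


if __name__ == "__main__":
    disk = QuorumDisk()
    disk.write_stamp("node_b", now=100.0)
    # Cable pulled, but node_b wrote to the disk 2 seconds ago.
    print(classify_peer(disk, "node_b", False, now=102.0, stale_after=5.0).value)
    # Cable pulled, and node_b has not written for 10 seconds.
    print(classify_peer(disk, "node_b", False, now=110.0, stale_after=5.0).value)
```

The point of the second channel is that a membership layer with only
network heartbeats sees the same thing in both cases (the peer left),
which is the limitation discussed in the quoted messages above.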