From: Sean G. <sea...@no...> - 2010-07-02 15:57:36
Sure, sounds like a plan. I've been trying to follow your checkins in the
branch, so I'm not completely disconnected. Let me know when you're ready
for me to take a look.

Sean

On 07/02/2010 11:53 AM, ext Bryan Thompson wrote:
> Sean,
>
> Great info. I am making good progress on the zookeeper / quorum
> integration and I should have the basic unit tests for that integration
> running early next week. Once I get there, would you be interested in
> taking a look at the implementation and the unit tests?
>
> Thanks,
> Bryan
>
>
>> -----Original Message-----
>> From: Sean Gossard [mailto:sea...@no...]
>> Sent: Thursday, July 01, 2010 1:29 PM
>> To: Bryan Thompson
>> Cc: big...@li...
>> Subject: Re: [Bigdata-developers] zookeeper message delivery
>> guarantees?
>>
>>
>> If an established connection breaks, the client immediately
>> returns the CONNECTION_LOSS error code for any pending
>> request without a full server reply. This includes writes
>> that were never sent, in flight, or finished on the server
>> side but the connection broke before the acknowledgment was
>> received. This clearly leads to a situation where a client
>> won't know if their writes made it.
>>
>> If the client does not have an active connection, all
>> requests are queued until contact is made with any random
>> server. After a connection, the sessionID is validated with
>> the server. If the session is valid, the queued requests are
>> processed as mentioned above. If the session has expired,
>> all the pending requests immediately return SESSION_EXPIRED,
>> the client is closed, and any future requests to the client
>> return SESSION_EXPIRED.
>>
>> If someone calls close() on the client, all pending requests
>> return CONNECTION_LOSS, and all future calls return SESSION_EXPIRED.
>> ----------------------------------------
>> So...... if you are doing a write operation and the call returns:
>>
>> OK : You are guaranteed the write completed in full.
>>
>> SESSION_EXPIRED : The server never accepted the operation
>> because the session is invalid. The servers will eventually
>> have torn down all ephemeral nodes previously created by this
>> session. You will need to create a new client object.
>>
>> CONNECTION_LOSS : The write may or may not have succeeded.
>> For non-sequential creates you know the path so things are easy, use
>> exists() or retry the create() and handle the exception if it
>> existed.
>> For sequential nodes, you don't know the final path of the
>> node and the call isn't idempotent, so you'll have to scan
>> the children of the parent node looking for the write. The
>> three options are prefix naming convention, trackable user
>> data, or Stat.getEphemeralOwner().
>>
>> ANYTHING_ELSE : For anything else, you retained a valid
>> connection to a server, but it didn't return OK, so it is
>> unlikely the write succeeded, but the most paranoid stance is
>> to assume the same behavior of CONNECTION_LOSS for any status
>> code you don't recognize or aren't 100% on.
>> ----------------------------------------
>>
>> As a side note, a registered client watch only fires once,
>> and only while connected, so depending on your write
>> patterns, missed events may be the norm. Take care your
>> algorithms don't require detection of every znode state
>> change and are self righting.
>>
>>
>> Sean
>>
>>
>> On 06/30/2010 10:32 AM, ext Bryan Thompson wrote:
>>
>>> Hello,
>>>
>>> I am wondering if anyone (SeanG?, BrianM?) has looked into
>>> the guarantees which zookeeper provides for reliable
>>> messaging for operation return codes in the context of
>>> ephemeral znode creates. Basically, I would like to know
>>> whether a zookeeper client can rely on observing the return
>>> code for a successful operation which creates an ephemeral
>>> (or ephemeral sequential) znode -or- have a guarantee that
>>> its session was timed out and the ephemeral znode destroyed.
>>> That is, does zookeeper provide guaranteed delivery of the
>>> operation return code unless the session is invalidated by a timeout?
>>>
>>> Thanks,
>>> Bryan
>>>
>>> _______________________________________________
>>> Bigdata-developers mailing list
>>> Big...@li...
>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
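
For anyone reading this thread later, the CONNECTION_LOSS advice for sequential nodes (the "prefix naming convention" option) could be sketched roughly as below. This is only an illustration under stated assumptions: FakeClient is a hypothetical in-memory stand-in, not the ZooKeeper client API, and the names create_sequential, get_children, create_sequential_once, and the "n-<tag>-" prefix are all made up for the example. A real client would also have to cope with the children scan itself failing.

```python
import uuid


class ConnectionLoss(Exception):
    """Outcome unknown: the write may or may not have been applied."""


class SessionExpired(Exception):
    """Session is gone; the caller needs a brand new client."""


class FakeClient:
    """Hypothetical in-memory stand-in for a client (NOT the real API).

    `faults` scripts one outcome per create call:
      "write_lost" -> request never reached the server
      "ack_lost"   -> server applied the write, reply was lost
    """

    def __init__(self, faults=None):
        self._children = {}  # parent path -> list of child names
        self._counter = 0
        self._faults = list(faults or [])

    def create_sequential(self, parent, prefix):
        fault = self._faults.pop(0) if self._faults else None
        if fault == "write_lost":
            raise ConnectionLoss()
        name = "%s%010d" % (prefix, self._counter)
        self._counter += 1
        self._children.setdefault(parent, []).append(name)
        if fault == "ack_lost":
            raise ConnectionLoss()
        return parent + "/" + name

    def get_children(self, parent):
        return list(self._children.get(parent, []))


def create_sequential_once(client, parent, tag=None, max_attempts=3):
    """Create one sequential node, tolerating CONNECTION_LOSS.

    A unique tag is embedded in the node name so that, after an
    ambiguous failure, we can scan the parent's children and tell
    whether our write already landed before retrying.
    """
    tag = tag or uuid.uuid4().hex
    prefix = "n-" + tag + "-"
    for _ in range(max_attempts):
        try:
            return client.create_sequential(parent, prefix)
        except ConnectionLoss:
            # The create is not idempotent, so look before retrying.
            mine = [c for c in client.get_children(parent)
                    if c.startswith(prefix)]
            if mine:
                return parent + "/" + sorted(mine)[0]
        # SessionExpired is deliberately not caught: retrying on the
        # same client cannot help once the session is invalid.
    raise ConnectionLoss()
```

With FakeClient(["ack_lost"]) the first create is applied but reports CONNECTION_LOSS, and the scan finds the node instead of creating a second one. With FakeClient(["write_lost"]) the scan finds nothing and the retry creates it. The unique tag is what makes the scan safe when several clients create nodes under the same parent.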