|
From: Bryan T. <br...@sy...> - 2010-06-30 14:32:37
|
Hello,

I am wondering if anyone (SeanG?, BrianM?) has looked into the guarantees which zookeeper provides for reliable messaging for operation return codes in the context of ephemeral znode creates. Basically, I would like to know whether a zookeeper client can rely on observing the return code for a successful operation which creates an ephemeral (or ephemeral sequential) znode -or- have a guarantee that its session was timed out and the ephemeral znode destroyed. That is, does zookeeper provide guaranteed delivery of the operation return code unless the session is invalidated by a timeout?

Thanks,
Bryan
|
|
From: Sean G. <sea...@no...> - 2010-07-01 17:31:02
|
If an established connection breaks, the client immediately returns the CONNECTION_LOSS error code for every pending request that has not received a full server reply. This includes writes that were never sent, writes that were in flight, and writes that finished on the server side but whose acknowledgment was lost when the connection broke. As a result, a client cannot always know whether its writes made it.

If the client does not have an active connection, all requests are queued until contact is made with any random server. Once connected, the sessionID is validated with the server. If the session is valid, the queued requests are processed as described above. If the session has expired, all the pending requests immediately return SESSION_EXPIRED, the client is closed, and any future requests to the client return SESSION_EXPIRED.

If someone calls close() on the client, all pending requests return CONNECTION_LOSS, and all future calls return SESSION_EXPIRED.
----------------------------------------
So, if you are doing a write operation and the call returns:

OK : You are guaranteed the write completed in full.

SESSION_EXPIRED : The server never accepted the operation because the session is invalid. The servers will eventually have torn down all ephemeral nodes previously created by this session. You will need to create a new client object.

CONNECTION_LOSS : The write may or may not have succeeded. For non-sequential creates you know the path, so things are easy: use exists(), or retry the create() and handle the exception if the node already exists. For sequential nodes, you don't know the final path of the node and the call isn't idempotent, so you'll have to scan the children of the parent node looking for the write. The three options for recognizing your own node are a prefix naming convention, trackable user data, or Stat.getEphemeralOwner().

ANYTHING_ELSE : For anything else, you retained a valid connection to a server, but it didn't return OK, so it is unlikely the write succeeded. The most paranoid stance is to assume the same behavior as CONNECTION_LOSS for any status code you don't recognize or aren't 100% sure about.
----------------------------------------

As a side note, a registered client watch only fires once, and only while connected, so depending on your write patterns, missed events may be the norm. Take care that your algorithms don't require detection of every znode state change and are self-righting.

Sean
|
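A minimal sketch of the return-code table above, for anyone who wants to see it as code. It is written in plain Java and deliberately does not use the ZooKeeper client library, so it runs on its own. Everything named here (the `Code` and `Action` enums, `actionFor`, `prefixFor`, `findOurNode`) is hypothetical and only illustrates the "prefix naming convention" option for sequential nodes. In a real client the child names would come from getChildren() on the parent znode, and the return codes from the client library itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical helper: models the decision table without the ZooKeeper library. */
final class EphemeralCreateRecovery {

    /** Simplified stand-ins for the client's return codes (names follow the message above). */
    enum Code { OK, SESSION_EXPIRED, CONNECTION_LOSS, OTHER }

    /** What the caller should do after a create() returns. */
    enum Action { DONE, NEW_CLIENT_AND_RETRY, VERIFY_THEN_RETRY }

    /** Maps a return code to the next step, taking the paranoid stance for unknown codes. */
    static Action actionFor(Code code) {
        switch (code) {
            case OK:
                return Action.DONE;                 // write completed in full
            case SESSION_EXPIRED:
                return Action.NEW_CLIENT_AND_RETRY; // session is gone, ephemerals get torn down
            default:
                return Action.VERIFY_THEN_RETRY;    // CONNECTION_LOSS or anything unrecognized
        }
    }

    /** Builds a name prefix carrying a per-attempt token, e.g. "member-<token>-". */
    static String prefixFor(String base, String token) {
        return base + "-" + token + "-";
    }

    /** Scans the parent's child names for a node created with our prefix. */
    static Optional<String> findOurNode(List<String> children, String prefix) {
        return children.stream().filter(c -> c.startsWith(prefix)).findFirst();
    }

    public static void main(String[] args) {
        String token = UUID.randomUUID().toString();
        String prefix = prefixFor("member", token);
        // Pretend the server appended a sequence number before the connection dropped.
        List<String> children = Arrays.asList("member-other-0000000007", prefix + "0000000008");
        System.out.println(actionFor(Code.CONNECTION_LOSS));
        System.out.println(findOurNode(children, prefix).isPresent());
    }
}
```

The trailing dash in the prefix matters: without it, a token that happens to be a prefix of another token would match the wrong node. If the scan finds nothing, the create did not land and it is safe to retry it.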
|
From: Bryan T. <br...@sy...> - 2010-07-02 15:54:33
|
Sean,

Great info. I am making good progress on the zookeeper / quorum integration and I should have the basic unit tests for that integration running early next week. Once I get there, would you be interested in taking a look at the implementation and the unit tests?

Thanks,
Bryan
|
|
From: Sean G. <sea...@no...> - 2010-07-02 15:57:36
|
Sure, sounds like a plan. I've been trying to follow your checkins in the branch, so I'm not completely disconnected. Let me know when you're ready for me to take a look.

Sean
|