From: Austin S. <au...@co...> - 2008-07-22 09:12:01
|
We are using ZooKeeper to implement consistent hashing, and need to be able to add or remove hosts from the ZooKeeper service without interrupting its functionality.

According to http://zookeeper.wiki.sourceforge.net/ZooKeeperConfiguration :

"Every machine that is part of the ZooKeeper service needs to know about every other machine. So, we need to have a list of machines in the configuration file."

Can we add a single ZooKeeper server to the service with an expanded host list without interrupting the proper functioning of the service? The old servers will each have host list s_1, ..., s_k, and the new server s_{k+1} will have host list s_1, ..., s_k, s_{k+1}. Now we need to restart the rest of the servers one by one from s_1 to s_k with the new host list s_1, ..., s_k, s_{k+1}.

Is this a violation of the service specification?

Thanks,
Austin |
From: Patrick H. <ph...@gm...> - 2008-07-16 16:27:14
|
Thanks for the report, I created a new JIRA for this:
https://issues.apache.org/jira/browse/ZOOKEEPER-77

Patrick

Martin Schaaf wrote:
> Hi,
>
> FYI today I found this NPE in the logs. After this Exception an Error
> event was thrown.
>
> java.lang.NullPointerException
>     at com.yahoo.jute.Utils.toCSVString(Utils.java:128)
>     at com.yahoo.jute.CsvOutputArchive.writeString(CsvOutputArchive.java:95)
>     at com.yahoo.zookeeper.proto.WatcherEvent.toString(WatcherEvent.java:60)
>     at net.sf.katta.zk.ZKClient.process(ZKClient.java:404)
>     at com.yahoo.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:264)
>
> Bye,
> martin. |
From: Benjamin R. <br...@ya...> - 2008-07-16 16:00:12
|
Writes are processed serially inside of ZooKeeper. Things are pipelined though, so there can be many writes outstanding.

ben

Avinash Lakshman wrote:
> Hi All
>
> I understand writes to Zookeeper are probably serialized. But suppose
> I have two znodes /A/A1 and /A/A2 can modifications to these znodes
> happen concurrently?
>
> Avinash |
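As a rough analogy for "processed serially, but pipelined", the sketch below submits several writes without waiting for each one to finish, while a single worker thread applies them strictly in submission order. This is only an illustration built on java.util.concurrent; the class PipelinedWrites and its method are hypothetical names and say nothing about how the ZooKeeper server is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only: not ZooKeeper code.
class PipelinedWrites {
    /** Submits every write without waiting, then returns the order in which they were applied. */
    static List<String> applyAll(List<String> paths) {
        // One worker thread means writes are applied one at a time, in submission order.
        ExecutorService writer = Executors.newSingleThreadExecutor();
        List<String> applied = new ArrayList<>(); // mutated only by the writer thread
        List<CompletableFuture<Void>> outstanding = new ArrayList<>();
        try {
            for (String path : paths) {
                // The caller does not block here, so many writes can be outstanding at once.
                outstanding.add(CompletableFuture.runAsync(() -> applied.add(path), writer));
            }
            // Wait for the whole pipeline to drain before reading the result.
            CompletableFuture.allOf(outstanding.toArray(new CompletableFuture[0])).join();
        } finally {
            writer.shutdown();
        }
        return applied;
    }

    public static void main(String[] args) {
        System.out.println(applyAll(List.of("/A/A1", "/A/A2", "/A/A1")));
    }
}
```

In this model, writes to /A/A1 and /A/A2 never run at the same instant, yet the submitting side is free to issue the second write before the first has completed.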
From: Martin S. <ms...@10...> - 2008-07-16 14:52:37
|
Hi,

FYI today I found this NPE in the logs. After this Exception an Error event was thrown.

java.lang.NullPointerException
    at com.yahoo.jute.Utils.toCSVString(Utils.java:128)
    at com.yahoo.jute.CsvOutputArchive.writeString(CsvOutputArchive.java:95)
    at com.yahoo.zookeeper.proto.WatcherEvent.toString(WatcherEvent.java:60)
    at net.sf.katta.zk.ZKClient.process(ZKClient.java:404)
    at com.yahoo.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:264)

Bye,
martin. |
From: Avinash L. <avi...@gm...> - 2008-07-16 05:01:52
|
Hi All

I understand writes to ZooKeeper are probably serialized. But suppose I have two znodes, /A/A1 and /A/A2: can modifications to these znodes happen concurrently?

Avinash |
From: Benjamin R. <br...@ya...> - 2008-07-16 02:59:44
|
Jacob answered your first question. With respect to your last question, yes, it is safe to share ZooKeeper objects. The ZooKeeper object is thread safe, and it is much more efficient for all threads to share the same object rather than having an object per thread.

ben

Anthony Urso wrote:
> I have a distributed server cluster and each of the many threads on
> each server has a ZooKeeper java client used to communicate to other
> machines. What kind of scalability can I expect from the zookeeper
> server? Can I just add more QuorumServers infinitely?
>
> Also, assuming I can coordinate the default watcher behavior of these
> various threads, would it be safe to share the ZooKeeper objects
> between them? |
From: Jacob L. <jy...@ya...> - 2008-07-15 22:26:03
|
Hi Anthony

Scalability depends on the mix of reads and writes you do with ZooKeeper. If all communication is going through ZooKeeper, then you're probably expecting 1 or more reads per write. In that case (about equal numbers of reads and writes), your scalability is limited by the number of writes to ZooKeeper. Adding more quorum servers will likely *hurt* performance, not help, because a majority needs to be in agreement about the order of all writes.

If you're only going to use ZooKeeper for very occasional updates and mostly read values from the znodes, then adding more quorum servers will allow you to scale to more clients.

In practice we've found that the aggregate read throughput scales well with the number of quorum servers, whereas the aggregate write throughput scales inversely (i.e. declines) with the number of quorum servers. The sweet spot where good read and write throughput is obtained seems to be five quorum servers, with today's networking and hardware technology.

--Jacob

-----Original Message-----
From: zoo...@li... [mailto:zoo...@li...] On Behalf Of Anthony Urso
Sent: Tuesday, July 15, 2008 3:13 PM
To: zoo...@li...; zoo...@ha...
Subject: [Zookeeper-user] Scalability and ZooKeeper java client thread safety

I have a distributed server cluster and each of the many threads on each server has a ZooKeeper java client used to communicate to other machines. What kind of scalability can I expect from the zookeeper server? Can I just add more QuorumServers infinitely?

Also, assuming I can coordinate the default watcher behavior of these various threads, would it be safe to share the ZooKeeper objects between them? |
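To make the "a majority needs to be in agreement" point concrete, here is a small sketch of the arithmetic. It assumes a simple strict-majority rule (more than half of the servers); the class QuorumMath and its method names are made up for this illustration and are not part of ZooKeeper.

```java
// Hypothetical helper: strict-majority arithmetic for an ensemble of n servers.
class QuorumMath {
    /** Smallest number of servers that is more than half of the ensemble. */
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** How many servers can be unavailable while a majority can still be formed. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.printf("servers=%d majority=%d tolerated failures=%d%n",
                    n, majority(n), tolerableFailures(n));
        }
    }
}
```

Under this rule the number of servers that must agree on each write grows with the ensemble (3 of 5, 4 of 7), which is consistent with write throughput declining as servers are added. Going from an odd size to the next even size raises the majority without tolerating any more failures.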
From: Anthony U. <ant...@gm...> - 2008-07-15 22:13:06
|
I have a distributed server cluster and each of the many threads on each server has a ZooKeeper java client used to communicate to other machines. What kind of scalability can I expect from the zookeeper server? Can I just add more QuorumServers infinitely? Also, assuming I can coordinate the default watcher behavior of these various threads, would it be safe to share the ZooKeeper objects between them? |
From: Avinash L. <avi...@gm...> - 2008-07-14 17:34:44
|
Thanks for all suggestions. I have gotten it to work.

Avinash |
From: Benjamin R. <br...@ya...> - 2008-07-14 15:22:55
|
The easiest way to fix this code is to move the Collections.sort(values) to right after the zk.getChildren() and then use the following Comparator with Collections.sort() and Collections.binarySearch():

    import java.util.Comparator;

    /* This Comparator defines an ordering such that the strings with
     * the lowest sequence numbers are first in sequence sorted order,
     * followed by strings without sequence numbers in lexicographical
     * order. This class assumes that a '-' precedes the sequence
     * number. */
    class SequenceComparator implements Comparator<String> {

        @Override
        public int compare(String o1, String o2) {
            int s1 = getSequence(o1);
            int s2 = getSequence(o2);
            if (s1 == -1 && s2 == -1) {
                return o1.compareTo(o2);
            }
            /* Strings without a sequence number sort after those with one. */
            return s1 == -1 ? 1 : s2 == -1 ? -1 : s1 - s2;
        }

        /* Returns the sequence suffix of s. This method assumes that
         * the sequence number is prefixed with a '-'. */
        private int getSequence(String s) {
            int i = s.lastIndexOf('-');
            if (i != -1) {
                try {
                    return Integer.parseInt(s.substring(i + 1));
                } catch (NumberFormatException e) {
                    /* We misdetected a sequence suffix, so fall
                     * through and return -1. */
                }
            }
            return -1;
        }
    }

ben |
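The sort-then-search step suggested above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions: the names SequenceOrderDemo, BY_SEQUENCE and predecessorOf are invented here, the child names are plain strings rather than results of a real getChildren() call, and the ordering follows the same rule as the comparator in the message (sequence number after the last '-').

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; it does not talk to a ZooKeeper server.
class SequenceOrderDemo {
    /** Parses the digits after the last '-', or returns -1 if there is no numeric suffix. */
    static int sequenceOf(String name) {
        int i = name.lastIndexOf('-');
        if (i == -1) {
            return -1;
        }
        try {
            return Integer.parseInt(name.substring(i + 1));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Sequenced names first by ascending number, then unsequenced names lexicographically. */
    static final Comparator<String> BY_SEQUENCE = (a, b) -> {
        int s1 = sequenceOf(a);
        int s2 = sequenceOf(b);
        if (s1 == -1 && s2 == -1) {
            return a.compareTo(b);
        }
        if (s1 == -1) {
            return 1;
        }
        if (s2 == -1) {
            return -1;
        }
        return Integer.compare(s1, s2);
    };

    /** Returns the child just before myName, or null if myName sorts first. */
    static String predecessorOf(List<String> children, String myName) {
        List<String> sorted = new ArrayList<>(children);
        // Sort first, then search with the SAME comparator; binarySearch requires both.
        Collections.sort(sorted, BY_SEQUENCE);
        int index = Collections.binarySearch(sorted, myName, BY_SEQUENCE);
        if (index < 0) {
            throw new IllegalArgumentException(myName + " is not among the children");
        }
        return index > 0 ? sorted.get(index - 1) : null;
    }

    public static void main(String[] args) {
        List<String> children = List.of("L-10", "L-9", "L-2");
        System.out.println(predecessorOf(children, "L-10"));
        System.out.println(predecessorOf(children, "L-2"));
    }
}
```

The important detail is that the list is sorted before anything is read from position 0, and that the search uses the same ordering as the sort. With the default String ordering, "L-10" would sort before "L-2" and "L-9".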
From: Patrick H. <ph...@ap...> - 2008-07-11 22:17:00
|
I don't see leader election documented on the ZK "recipes" wiki:
http://zookeeper.wiki.sourceforge.net/ZooKeeperRecipes

Jacob, would you mind updating the wiki page, documenting this recipe?

Thanks!

Patrick |
From: Jacob L. <jy...@ya...> - 2008-07-11 17:44:59
|
Avinash

The following protocol will help you fix the observed misbehavior. As Flavio points out, you cannot rely on the order of nodes in getChildren; you must use an intrinsic property of each node to determine who is the leader. The protocol devised by Runping Qi and described here will do that.

First of all, when you create child nodes of the node that holds the leadership bids, you must create them with the EPHEMERAL and SEQUENCE flags. ZooKeeper guarantees to give you an ephemeral node named uniquely and with a sequence number larger by at least one than any previously created node in the sequence. You provide a prefix, like "L_" or your own choice, and ZooKeeper creates nodes named "L_23", "L_24", etc. The sequence number starts at 0 and increases monotonically.

Once you've placed your leadership bid, you search backwards from the sequence number of *your* node to see if there are any preceding (in terms of the sequence number) nodes. When you find one, you place a watch on it and wait for it to disappear. When you get the watch notification, you search again, until you do not find a preceding node; then you know you're the leader.

This protocol guarantees that there is at any time only one node that thinks it is the leader. But it does not disseminate information about who is the leader. If you want everyone to know who is the leader, you can have an additional znode whose value is the name of the current leader (or some identifying information on how to contact the leader, etc.). Note that this cannot be done atomically, so by the time other nodes find out who the leader is, the leadership may already have passed on to a different node.

Flavio

Might it make sense to provide a standardized implementation of leader election in the library code in Java?

--Jacob |
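The bid, watch-your-predecessor, and search-again steps of the protocol above can be simulated in memory. The sketch below is an assumption-laden toy: ElectionSimulation and its methods are hypothetical, a TreeMap stands in for the EPHEMERAL and SEQUENCE children of the election znode, and leave() stands in for an ephemeral node disappearing when its session ends. No ZooKeeper API is called.

```java
import java.util.TreeMap;

// Hypothetical in-memory model of the leadership bids; not ZooKeeper code.
class ElectionSimulation {
    // sequence number -> participant, standing in for the sequenced ephemeral children
    private final TreeMap<Integer, String> bids = new TreeMap<>();
    private int nextSequence = 0; // sequence numbers start at 0 and only grow

    /** Places a leadership bid and returns the sequence number assigned to it. */
    int bid(String participant) {
        int seq = nextSequence++;
        bids.put(seq, participant);
        return seq;
    }

    /** Models a participant's ephemeral node disappearing. */
    void leave(int seq) {
        bids.remove(seq);
    }

    /** The bid to watch: the closest smaller sequence number, or null if none exists. */
    Integer predecessor(int seq) {
        return bids.lowerKey(seq);
    }

    /** A participant leads when its bid exists and no smaller sequence number remains. */
    boolean isLeader(int seq) {
        return bids.containsKey(seq) && predecessor(seq) == null;
    }

    public static void main(String[] args) {
        ElectionSimulation election = new ElectionSimulation();
        int a = election.bid("a");
        int b = election.bid("b");
        int c = election.bid("c");
        System.out.println("c watches bid " + election.predecessor(c));
        election.leave(b); // c's watch would fire here, so it searches again
        System.out.println("c now watches bid " + election.predecessor(c));
        election.leave(a);
        System.out.println("c is leader: " + election.isLeader(c));
    }
}
```

Each participant watches only its immediate predecessor, so the departure of one bidder wakes a single waiter, which then repeats the search as the message describes.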
From: Flavio J. <fpj...@ya...> - 2008-07-11 08:02:10
|
Hi Avinash,

getChildren returns a list in lexicographic order, so if you are updating the children of the election node concurrently, then you may get a different first node with different clients. If you are using the sequence flag to create nodes, then you may consider stripping the prefix of the node name and using the suffix value to determine order.

Hope it helps.

-Flavio |
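The suggestion to strip the prefix and order by the numeric suffix can be illustrated with a short comparison. This sketch assumes unpadded suffixes such as "L-9" and "L-10", chosen for illustration; whether real sequence suffixes are zero-padded depends on the ZooKeeper version in use. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo: plain string ordering versus ordering by numeric suffix.
class SuffixOrdering {
    /** Strips the prefix and parses what remains as the sequence number. */
    static int suffix(String name, String prefix) {
        return Integer.parseInt(name.substring(prefix.length()));
    }

    /** Returns a copy sorted by ordinary String comparison. */
    static List<String> lexicographic(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    /** Returns a copy sorted by the numeric value of the suffix. */
    static List<String> bySuffix(List<String> names, String prefix) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.comparingInt((String n) -> suffix(n, prefix)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = List.of("L-9", "L-10", "L-2");
        System.out.println(lexicographic(names));   // [L-10, L-2, L-9]
        System.out.println(bySuffix(names, "L-"));  // [L-2, L-9, L-10]
    }
}
```

With string comparison the first element is "L-10", even though "L-2" holds the lowest sequence number, so a client that trusts position 0 of a string-sorted list can pick the wrong leader.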
From: Avinash L. <avi...@gm...> - 2008-07-11 05:19:58
|
Hi

I am trying to elect leader among 50 nodes. There is always one odd guy who seems to think that someone else distinct from what some other nodes see as leader. Could someone please tell me what is wrong with the following code for leader election:

    public void electLeader()
    {
        ZooKeeper zk = StorageService.instance().getZooKeeperHandle();
        String path = "/Leader";
        try
        {
            String createPath = path + "/L-";
            LeaderElector.createLock_.lock();
            while( true )
            {
                /* Get all znodes under the Leader znode */
                List<String> values = zk.getChildren(path, false);
                /*
                 * Get the first znode and if it is the
                 * pathCreated created above then the data
                 * in that znode is the leader's identity.
                 */
                if ( leader_ == null )
                {
                    leader_ = new AtomicReference<EndPoint>( EndPoint.fromBytes( zk.getData(path + "/" + values.get(0), false, null) ) );
                }
                else
                {
                    leader_.set( EndPoint.fromBytes( zk.getData(path + "/" + values.get(0), false, null) ) );
                    /* Disseminate the state as to who the leader is. */
                    onLeaderElection();
                }
                logger_.debug("Elected leader is " + leader_ + " @ znode " + ( path + "/" + values.get(0) ) );
                Collections.sort(values);
                /* We need only the last portion of this znode */
                String[] peices = pathCreated_.split("/");
                int index = Collections.binarySearch(values, peices[peices.length - 1]);
                if ( index > 0 )
                {
                    String pathToCheck = path + "/" + values.get(index - 1);
                    Stat stat = zk.exists(pathToCheck, true);
                    if ( stat != null )
                    {
                        logger_.debug("Awaiting my turn ...");
                        condition_.await();
                        logger_.debug("Checking to see if leader is around ...");
                    }
                }
                else
                {
                    break;
                }
            }
        }
        catch ( InterruptedException ex )
        {
            logger_.warn(LogUtil.throwableToString(ex));
        }
        catch ( KeeperException ex )
        {
            logger_.warn(LogUtil.throwableToString(ex));
        }
        finally
        {
            LeaderElector.createLock_.unlock();
        }
    }
    }

Thanks
Avinash |
From: Benjamin R. <br...@ya...> - 2008-07-06 16:54:03
|
This is a great FAQ topic! There are two kinds of connection problems:

1) Disconnections: this callback says that we have disconnected: KeeperStateDisconnected. This state is usually due to a server failure or transient communication error that will hopefully be followed up by a reconnected callback. The basic idea is that when disconnected from ZooKeeper the process will not have a clear idea of changes that are happening, so it should be conservative and assume the worst.

2) Expired session: this callback says that there was a problem, usually a network outage, that prevented the client from keeping its session alive, so the session timed out. This state is not recoverable. This is game over: a new ZooKeeper object needs to be created, and the state stored in ZooKeeper needs to be re-queried and set up again.

Here is the best practice for handling these two states:

1) For disconnections, the server should suspend operations that relied on information in ZooKeeper. For example, a leader should suspend operations that assume it is a leader. Operations resume once the connection is reestablished.

2) For expired sessions, the server should relinquish any rights it received from ZooKeeper and rerun the ZooKeeper initialization operations. For example, a leader will need to give up leadership, create a new ZooKeeper object and rerun the leader election protocol. Restarting the application is a very easy way to do this.

Of course there are always exceptions to these practices. For example, given a leader that is established with ZooKeeper and behaves conservatively by suspending operations on disconnects, even if a process is disconnected from ZooKeeper it could still send requests to the leader process. (A partial network partition may cause one process to not be able to connect to ZooKeeper and still be able to connect to another process that can connect to ZooKeeper.) Personally, I would still write my applications to behave conservatively in these situations, since these kinds of partial partitionings are difficult to test.

ben

----- Original Message ----
From: Anthony Urso <ant...@gm...>
To: zoo...@li...; zoo...@ha...
Sent: Thursday, July 3, 2008 7:17:32 PM
Subject: [Zookeeper-user] Recipes for dealing with disconnection and connection expiration

Anyone have examples of the right way to deal with ZooKeeper disconnection or connection expiration? Currently I am exiting and starting fresh, but hopefully there is a more efficient pattern.

Cheers,
Anthony |
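The two best practices above (suspend on disconnect, start over on expiration) can be sketched as a small state handler. Everything here is hypothetical: the ConnectionEvent enum is a stand-in for whatever state values the client library delivers to a watcher, and the strings recorded in actions stand in for real application work such as creating a new ZooKeeper object and rerunning leader election.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of conservative handling; it does not use the ZooKeeper API.
class SessionAwareLeader {
    /** Stand-in for the connection states delivered to a watcher. */
    enum ConnectionEvent { CONNECTED, DISCONNECTED, EXPIRED }

    private boolean leader = true;     // assume an earlier election made us leader
    private boolean suspended = false; // true while our view of ZooKeeper may be stale
    final List<String> actions = new ArrayList<>();

    void onEvent(ConnectionEvent event) {
        switch (event) {
            case DISCONNECTED:
                // We cannot see changes right now, so assume the worst and pause.
                suspended = true;
                actions.add("suspend leader operations");
                break;
            case CONNECTED:
                if (suspended) {
                    suspended = false;
                    actions.add("resume operations");
                }
                break;
            case EXPIRED:
                // Not recoverable: give up any rights and start from scratch.
                leader = false;
                suspended = true;
                actions.add("relinquish leadership");
                actions.add("create new client and rerun election");
                break;
        }
    }

    /** Leader-only work is allowed only with leadership and a live connection. */
    boolean mayActAsLeader() {
        return leader && !suspended;
    }

    public static void main(String[] args) {
        SessionAwareLeader node = new SessionAwareLeader();
        node.onEvent(ConnectionEvent.DISCONNECTED);
        node.onEvent(ConnectionEvent.CONNECTED);
        node.onEvent(ConnectionEvent.EXPIRED);
        System.out.println(node.actions);
        System.out.println("may act as leader: " + node.mayActAsLeader());
    }
}
```

The asymmetry is deliberate: a disconnect only pauses leader-only work until the connection returns, while an expiration clears the leadership flag, so the node cannot act as leader again until it has won a fresh election.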
From: Anthony U. <ant...@gm...> - 2008-07-04 02:17:25
|
Anyone have examples of the right way to deal with ZooKeeper disconnection or connection expiration? Currently I am exiting and starting fresh, but hopefully there is a more efficient pattern. Cheers, Anthony |
From: Benjamin R. <br...@ya...> - 2008-06-30 14:31:46
|
> The session is expired and I try now to catch this state with
> event.getState() == Watcher.Event.KeeperStateExpired. Is there an
> example how to renew the session or do I have to create a new
> ZooKeeper object?

When a session expires the ZooKeeper object becomes invalid. You must create a new ZooKeeper object.

ben |
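One way to act on this answer is to keep the client handle behind a small holder that swaps in a new object when the expired state is reported. The sketch below is an assumption-laden illustration: `HandleHolder` is a hypothetical name, `T` stands in for the ZooKeeper client type, and the factory would wrap whatever constructor call the application already uses.

```java
import java.util.function.Supplier;

/** Sketch: holds the current client handle and replaces it after expiry. */
final class HandleHolder<T> {
    private final Supplier<T> factory; // hypothetical: builds a fresh client
    private T current;

    HandleHolder(Supplier<T> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    /** Returns the handle that callers should use right now. */
    synchronized T get() {
        return current;
    }

    /** Call from the watcher when the expired state is reported. */
    synchronized void onExpired() {
        // The old handle is invalid after expiry, so it is never reused.
        current = factory.get();
    }
}
```

Callers should fetch the handle through `get()` for each operation rather than caching it, so that they pick up the replacement after an expiry.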
From: Flavio J. <fpj...@ya...> - 2008-06-27 19:20:39
|
Hi Martin,

We have actually found a bug that might be the cause of your problem. As we are now in the process of migrating to Apache, the issue is on JIRA:

https://issues.apache.org/jira/browse/ZOOKEEPER-57

Please feel free to add comments.

Thanks,
-Flavio

----- Original Message ----
From: Martin Schaaf <ms...@10...>
To: zoo...@li...
Sent: Friday, June 27, 2008 4:07:00 PM
Subject: Re: [Zookeeper-user] Lost connection

On 26.06.2008 at 15:09, Flavio Junqueira wrote:
> It looks like your client session is expiring. You can try
> increasing the session timeout value when creating the ZooKeeper
> object on your client. In any case, I would say it is good practice
> to have code on your client to deal with session expirations. Upon a
> session expiration, there is a call to your Watcher.process
> implementation indicating that it has expired.

It happened again. This time I found the following in one of the server log files:

Failed to update data in zookeeper
net.sf.katta.util.KattaException: unable to check path: /katta/nodes/1
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:301)
    at net.sf.katta.node.Node$StatusUpdater.run(Node.java:730)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Caused by: com.yahoo.zookeeper.KeeperException: KeeperErrorCode = ConnectionLoss
    at com.yahoo.zookeeper.ZooKeeper.exists(ZooKeeper.java:357)
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:299)
    ... 3 more
Failed to update data in zookeeper
net.sf.katta.util.KattaException: unable to check path: /katta/nodes/1
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:301)
    at net.sf.katta.node.Node$StatusUpdater.run(Node.java:730)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Caused by: com.yahoo.zookeeper.KeeperException: KeeperErrorCode = SessionExpired
    at com.yahoo.zookeeper.ZooKeeper.exists(ZooKeeper.java:357)
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:299)
    ... 3 more

The session is expired, and I now try to catch this state with event.getState() == Watcher.Event.KeeperStateExpired. Is there an example of how to renew the session, or do I have to create a new ZooKeeper object?

Bye,
martin. |
From: Martin S. <ms...@10...> - 2008-06-27 14:07:00
|
On 26.06.2008 at 15:09, Flavio Junqueira wrote:
> It looks like your client session is expiring. You can try
> increasing the session timeout value when creating the ZooKeeper
> object on your client. In any case, I would say it is good practice
> to have code on your client to deal with session expirations. Upon a
> session expiration, there is a call to your Watcher.process
> implementation indicating that it has expired.

It happened again. This time I found the following in one of the server log files:

Failed to update data in zookeeper
net.sf.katta.util.KattaException: unable to check path: /katta/nodes/1
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:301)
    at net.sf.katta.node.Node$StatusUpdater.run(Node.java:730)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Caused by: com.yahoo.zookeeper.KeeperException: KeeperErrorCode = ConnectionLoss
    at com.yahoo.zookeeper.ZooKeeper.exists(ZooKeeper.java:357)
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:299)
    ... 3 more
Failed to update data in zookeeper
net.sf.katta.util.KattaException: unable to check path: /katta/nodes/1
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:301)
    at net.sf.katta.node.Node$StatusUpdater.run(Node.java:730)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Caused by: com.yahoo.zookeeper.KeeperException: KeeperErrorCode = SessionExpired
    at com.yahoo.zookeeper.ZooKeeper.exists(ZooKeeper.java:357)
    at net.sf.katta.zk.ZKClient.exists(ZKClient.java:299)
    ... 3 more

The session is expired, and I now try to catch this state with event.getState() == Watcher.Event.KeeperStateExpired. Is there an example of how to renew the session, or do I have to create a new ZooKeeper object?

Bye,
martin. |
From: Martin S. <ms...@10...> - 2008-06-27 06:09:20
|
On 26.06.2008 at 15:09, Flavio Junqueira wrote:
> It looks like your client session is expiring. You can try
> increasing the session timeout value when creating the ZooKeeper
> object on your client. In any case, I would say it is good practice
> to have code on your client to deal with session expirations. Upon a
> session expiration, there is a call to your Watcher.process
> implementation indicating that it has expired.

Ok. I will check this.

> Do you also see repeatedly: "Renewing 11aba706af6001e"? Basically,
> I'm trying to determine if your client is trying to revalidate a
> session repeatedly.

No. I see the expiration for different sessions, not repeatedly for the same one. Sorry for not being clear enough.

Bye,
martin. |
From: Martin S. <ms...@10...> - 2008-06-27 06:04:02
|
On 26.06.2008 at 18:16, Benjamin Reed wrote:
> Is there anything that characterizes the failures you are seeing?
> For example, is it always after the clients have been running for a
> while? Were the clients idle just before the timeout? etc... I would
> really like to reproduce this in our lab.

I think after about 1 to 2 days. |
From: Benjamin R. <br...@ya...> - 2008-06-26 16:18:38
|
Martin,

Is there anything that characterizes the failures you are seeing? For example, is it always after the clients have been running for a while? Were the clients idle just before the timeout? etc... I would really like to reproduce this in our lab.

thanx
ben

Martin Schaaf wrote:
> On 26.06.2008 at 12:53, Flavio Junqueira wrote:
>
>> Hi Martin, Have you observed any session expiration in your logs?
>> What is the value of tickTime you are using?
>
> I also find this very often:
>
> 2008-06-26 02:50:36,003 WARN com.yahoo.zookeeper.server.SessionTrackerImpl: Expiring 11aba706af6001e
>
> Is this what you mean? |
From: Flavio J. <fpj...@ya...> - 2008-06-26 13:09:16
|
It looks like your client session is expiring. You can try increasing the session timeout value when creating the ZooKeeper object on your client. In any case, I would say it is good practice to have code on your client to deal with session expirations. Upon a session expiration, there is a call to your Watcher.process implementation indicating that it has expired.

Do you also see repeatedly: "Renewing 11aba706af6001e"? Basically, I'm trying to determine if your client is trying to revalidate a session repeatedly.

Thanks,
-Flavio

----- Original Message ----
From: Martin Schaaf <ms...@10...>
To: zoo...@li...
Sent: Thursday, June 26, 2008 1:44:01 PM
Subject: Re: [Zookeeper-user] Lost connection

On 26.06.2008 at 12:53, Flavio Junqueira wrote:
> Hi Martin, Have you observed any session expiration in your logs?
> What is the value of tickTime you are using?

I also find this very often:

2008-06-26 02:50:36,003 WARN com.yahoo.zookeeper.server.SessionTrackerImpl: Expiring 11aba706af6001e

Is this what you mean? |
From: Martin S. <ms...@10...> - 2008-06-26 11:44:00
|
On 26.06.2008 at 12:53, Flavio Junqueira wrote:
> Hi Martin, Have you observed any session expiration in your logs?
> What is the value of tickTime you are using?

I also find this very often:

2008-06-26 02:50:36,003 WARN com.yahoo.zookeeper.server.SessionTrackerImpl: Expiring 11aba706af6001e

Is this what you mean? |
From: Martin S. <ms...@10...> - 2008-06-26 11:27:43
|
On 26.06.2008 at 12:53, Flavio Junqueira wrote:
> Hi Martin, Have you observed any session expiration in your logs?
> What is the value of tickTime you are using?
>

tickTime is 2000

I found this in the logs:

2008-06-25 07:26:42,003 WARN com.yahoo.zookeeper.server.PrepRequestProcessor: Processed session termination request for id: 11aba706af60019
2008-06-25 07:50:42,161 WARN com.yahoo.zookeeper.server.ZooKeeperServer: Closing: java.io.IOException: TIMED OUT
    at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
2008-06-25 07:50:42,723 WARN com.yahoo.zookeeper.server.ZooKeeperServer: Closing: java.io.IOException: TIMED OUT
    at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)

and very often this:

2008-06-25 07:50:43,970 WARN com.yahoo.zookeeper.server.NIOServerCnxn: Connected to /10.201.8.204:50229 lastZxid 6
2008-06-25 07:50:43,970 WARN com.yahoo.zookeeper.server.NIOServerCnxn: Finished init of 11aba706af60000: true
2008-06-25 07:50:43,970 WARN com.yahoo.zookeeper.server.NIOServerCnxn: Renewing session 11aba706af60000

Bye,
Martin Schaaf |
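Since the thread suggests raising the session timeout and reports a tickTime of 2000, a small worked example may help show how the two values interact. It assumes the bounds documented for later Apache ZooKeeper releases (3.3.0 and up), where the server by default limits the negotiated session timeout to between 2 and 20 times tickTime. Whether the com.yahoo.zookeeper build discussed in this thread applies the same bounds is not confirmed here. `SessionTimeouts` and `negotiated` are hypothetical names used only for this sketch.

```java
/** Sketch of session-timeout clamping under the assumed 2x/20x tickTime bounds. */
final class SessionTimeouts {
    private SessionTimeouts() {}

    /**
     * Returns the timeout the server would grant for a requested value,
     * assuming default bounds of 2 * tickTime and 20 * tickTime.
     */
    static int negotiated(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;   // assumed default lower bound
        int max = 20 * tickTimeMs;  // assumed default upper bound
        return Math.max(min, Math.min(max, requestedMs));
    }
}
```

Under these assumptions, with tickTime 2000 a request for 1000 ms would be raised to 4000 ms, a request for 30000 ms would be granted as asked, and a request for 60000 ms would be capped at 40000 ms.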