You can subscribe to this list here.
2007 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(37) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2008 |
Jan
(27) |
Feb
(34) |
Mar
(30) |
Apr
(151) |
May
(184) |
Jun
(55) |
Jul
(2) |
Aug
(6) |
Sep
|
Oct
(1) |
Nov
|
Dec
|
From: SourceForge.net <no...@so...> - 2008-06-03 18:48:53
|
Patches item #1982992, was opened at 2008-06-02 21:38 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Benjamin Reed (breed) Summary: concurrency issue in the zookeeper leader code. Initial Comment: The Datatree.ephemerals has a hashset of ephermal nodes for a given session. This list is not synchronized. This was a part of the bug that one of our users found. One of the sessions was getting expired and some other session was deleting ephemeral nodes from the session getting expired. This caused a concurrency modification excpetion while we called hashlist.clone() and datatree.getephermals() --- The fix is simple to put synchronized blocks around the accesses to the list. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-03 11:49 Message: Logged In: YES user_id=154690 Originator: NO +1 thanx mahadev! ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-03 11:35 Message: Logged In: YES user_id=1926680 Originator: YES thanks for catching that Ben :). Attaching a patch that fixes the problem. File Added: concurrency_2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-06-03 11:05 Message: Logged In: YES user_id=154690 Originator: NO We need synchronized blocks around all accesses. Since the reads are not in synchronized blocks, you will still get concurrent modification exceptions. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-02 21:39 Message: Logged In: YES user_id=1926680 Originator: YES ben can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 18:44:18
|
Bugs item #1981340, was opened at 2008-06-01 08:05 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Benjamin Reed (breed) Summary: Child watches are not triggered when the node is deleted Initial Comment: If a client sets a watch on "/foo" with a getChildren("/foo" ...) and "/foo" gets deleted, no watch event is triggered. (Ever.) It could be argued that this is correct behavior since the children did not change, but it can lead to confusion since "/foo" could be recreated and children added and still the watch would not get triggered. (We trace watches by znode instance and deleting and creating "/foo" causes a new znode instance.) The correct behavior should be that when a znode is removed there are three events triggered: parent child watches, znode data watches, and znode child watches. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-03 11:44 Message: Logged In: YES user_id=154690 Originator: YES I don't think we want another list, that will just make things more complicated. Option 2 is best since triggerWatch already builds a HashSet that we can simply return. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-06-03 11:27 Message: Logged In: YES user_id=12853 Originator: NO This is going to be a bit more complicated that it seems. Let me know what you think of the following: The server has 2 watcher lists; data & child. Currently watchers are triggered exclusively to one or the other list (all datachanged are on data list, all children changed on child list,etc...) Merely adding a "childwatches.trigger(path, nodedeleted)" to datatree will result in (potentially) two nodedeleted events being sent to a watcher (if client did a exists(path) & getchildren(path) for example). I don't believe we want to do this. There are two options afaict; add a new watch list "deletedNodeWatches" to datatree that contains both data & child watches to be triggered for nodedelete events only, or continue to have 2 lists and filter out any duplicate events during the trigger phase. Adding more lists doesn't seem like a great option. Option 2 would require a change to watchmanager.triggerwatches to return the set of watchers triggered, which would be used to filter out duplicate triggers. pseudo for datatree.deletenode: processedwatches = datawatches.triggerwatches(path, nodedeleted) childwatches.triggerwatches(path, nodedeleted, processedwatches) triggerwatches would be updated to use processwatches parameter to filter out duplicate nodedeleted events from being generated (the child watch would be cleared on the server) Secondly the client (current patch) also has two lists - data and child watches. The client will have to use the same trick -- trigger both data and child watches, eliminate duplicates by uniquely triggering based on path/watcher pairs. (process the data watches, process the child watches filtering out any dups from being generated). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 18:35:46
|
Patches item #1982992, was opened at 2008-06-03 04:38 Message generated for change (Comment added) made by mahadevkonar You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Benjamin Reed (breed) Summary: concurrency issue in the zookeeper leader code. Initial Comment: The Datatree.ephemerals has a hashset of ephermal nodes for a given session. This list is not synchronized. This was a part of the bug that one of our users found. One of the sessions was getting expired and some other session was deleting ephemeral nodes from the session getting expired. This caused a concurrency modification excpetion while we called hashlist.clone() and datatree.getephermals() --- The fix is simple to put synchronized blocks around the accesses to the list. ---------------------------------------------------------------------- >Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-03 18:35 Message: Logged In: YES user_id=1926680 Originator: YES thanks for catching that Ben :). Attaching a patch that fixes the problem. File Added: concurrency_2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-06-03 18:05 Message: Logged In: YES user_id=154690 Originator: NO We need synchronized blocks around all accesses. Since the reads are not in synchronized blocks, you will still get concurrent modification exceptions. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-03 04:39 Message: Logged In: YES user_id=1926680 Originator: YES ben can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 18:27:47
|
Bugs item #1981340, was opened at 2008-06-01 08:05 Message generated for change (Comment added) made by phunt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) >Assigned to: Benjamin Reed (breed) Summary: Child watches are not triggered when the node is deleted Initial Comment: If a client sets a watch on "/foo" with a getChildren("/foo" ...) and "/foo" gets deleted, no watch event is triggered. (Ever.) It could be argued that this is correct behavior since the children did not change, but it can lead to confusion since "/foo" could be recreated and children added and still the watch would not get triggered. (We trace watches by znode instance and deleting and creating "/foo" causes a new znode instance.) The correct behavior should be that when a znode is removed there are three events triggered: parent child watches, znode data watches, and znode child watches. ---------------------------------------------------------------------- >Comment By: Patrick Hunt (phunt) Date: 2008-06-03 11:27 Message: Logged In: YES user_id=12853 Originator: NO This is going to be a bit more complicated that it seems. Let me know what you think of the following: The server has 2 watcher lists; data & child. Currently watchers are triggered exclusively to one or the other list (all datachanged are on data list, all children changed on child list,etc...) Merely adding a "childwatches.trigger(path, nodedeleted)" to datatree will result in (potentially) two nodedeleted events being sent to a watcher (if client did a exists(path) & getchildren(path) for example). I don't believe we want to do this. There are two options afaict; add a new watch list "deletedNodeWatches" to datatree that contains both data & child watches to be triggered for nodedelete events only, or continue to have 2 lists and filter out any duplicate events during the trigger phase. Adding more lists doesn't seem like a great option. Option 2 would require a change to watchmanager.triggerwatches to return the set of watchers triggered, which would be used to filter out duplicate triggers. pseudo for datatree.deletenode: processedwatches = datawatches.triggerwatches(path, nodedeleted) childwatches.triggerwatches(path, nodedeleted, processedwatches) triggerwatches would be updated to use processwatches parameter to filter out duplicate nodedeleted events from being generated (the child watch would be cleared on the server) Secondly the client (current patch) also has two lists - data and child watches. The client will have to use the same trick -- trigger both data and child watches, eliminate duplicates by uniquely triggering based on path/watcher pairs. (process the data watches, process the child watches filtering out any dups from being generated). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 18:05:30
|
Patches item #1982992, was opened at 2008-06-02 21:38 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Benjamin Reed (breed) Summary: concurrency issue in the zookeeper leader code. Initial Comment: The Datatree.ephemerals has a hashset of ephermal nodes for a given session. This list is not synchronized. This was a part of the bug that one of our users found. One of the sessions was getting expired and some other session was deleting ephemeral nodes from the session getting expired. This caused a concurrency modification excpetion while we called hashlist.clone() and datatree.getephermals() --- The fix is simple to put synchronized blocks around the accesses to the list. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-03 11:05 Message: Logged In: YES user_id=154690 Originator: NO We need synchronized blocks around all accesses. Since the reads are not in synchronized blocks, you will still get concurrent modification exceptions. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-02 21:39 Message: Logged In: YES user_id=1926680 Originator: YES ben can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 15:51:32
|
Patches item #1913998, was opened at 2008-03-13 18:52 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1913998&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: 3.0.0 >Status: Closed >Resolution: Fixed Priority: 5 Private: No Submitted By: Andrew Kornev (akornev) Assigned to: Benjamin Reed (breed) Summary: JMX instrumentation Initial Comment: This patch adds support for management and monitoring via JMX. It uses the MXBean feature made available in Java 6. The instrumentation code relies on the new lifecycle event notification mechanism to dynamically register and unregister MBeans. The ant build file has NOT been modified to build this new feature yet. There are no javadocs for the new classes either (but this will be fixed, of course). ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-03 08:51 Message: Logged In: YES user_id=154690 Originator: NO Committed revision 170. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-06-03 08:49 Message: Logged In: YES user_id=154690 Originator: NO Updated the patch slightly so that: 1) everything goes under the jmx directory 2) DataTreeBean.getDatasize always returns -1 until we get a more efficient way of getting the data size 3) HiddenNameBean was removed since it is unused File Added: jmx-5.patch ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 13:14 Message: Logged In: YES user_id=1926680 Originator: NO assigning to ben since he is reviewing the code. ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-09 11:34 Message: Logged In: YES user_id=1926652 Originator: YES Fixed compilation errors introduced as a result of [1956480] patch. Now is really a good time to start reviewing this patch. File Added: jmx-4.patch ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-07 11:25 Message: Logged In: YES user_id=1926652 Originator: YES Moved around ManagedZooKeeperServer and ManagedQuorumPeer classes back to the jmx source tree to make it possible to build the src source tree whithout the jmx tree. Now would be a good time to start reviewing this patch. File Added: jmx-3.patch ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-01 18:06 Message: Logged In: YES user_id=1926652 Originator: YES Added javadocs, updated the code to use log4j, moved common classes to the src/ directory tree. File Added: jmx-2.patch ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-17 13:44 Message: Logged In: YES user_id=1926680 Originator: NO that is primarily my concern. Meaning if jmx is to be enabled in production then creating an ojbect per connection would deteriorate the scalability of zookeeper. What I mean is that do we actually need per connection management in jmx? or we could have just connection stats in jmx and some other means to debug problems with connections (other than jmx)? ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-17 13:14 Message: Logged In: YES user_id=1926652 Originator: YES JMX is primarily for management and monitoring, and my hope is to have JMX enabled in production by default. JMX instrumentation code will eventually rely on the /proc data tree (when it becomes available) to get connection-related information. But we're still gonna have to have a bean per connection registered with MBeanServer. Dynamic registering/unregistering of connection beans may indeed incur some (insignificant) overhead, but I'm not aware of any other way of making a dynamic resource avilable to JMX. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-17 12:50 Message: Logged In: YES user_id=1926680 Originator: NO 3) for this what I meant was that .. we c annot run with jmx on in production .. and if connection tracking is for debugging purpose then jmx just does not sound the right tool for debugging... 4) thats true... but lets have a better process in line :) ... sorry to have started with you. ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-17 12:23 Message: Logged In: YES user_id=1926652 Originator: YES 1) yes! 2) updated the code to use logError() for logging errors only and logTraceMessage() for logging debug/info. 3) not sure I understand the concern. Connection tracking using /proc FS is orthogonal to JMX. JMX Beans would have to be instantiated for each new connection anyway if we want to be able to manage connections thru JMX. 4) this may take a while. Historically, however, lacking and/or misleading javadocs have never been a showstopper for the existing ZK code ;-) ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-15 11:02 Message: Logged In: YES user_id=1926680 Originator: NO 1) dataTreeBean = new com.yahoo.zookeeper.jmx.server.DataTreeBean( + server.dataTree); can we just replace this with DataTreeBean(datatree)? 2) also i wonder if we should have logging levels on the server side like debug and info , warn, fatal right now we just have logerror and logwarn... and we end up using logError for everything .. 3) am really unconfortable with the idea that we keep track of all the connections using jmx beans? I thought we were going to put all this info in the /proc file system? We end up creating a new object connection bean for every connection (as per my understanding) ... 4) can you upload a patch with javadocs? :) Its easier to review :). ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-14 10:11 Message: Logged In: YES user_id=1926652 Originator: YES Are you sure it's this patch that has tabs in it? Or, maybe, you meant the other one (the refactoring patch)? Because that one does have tabs (which I've already fixed in locally). But this one does not. Thanks for reviewing! ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-13 19:20 Message: Logged In: YES user_id=1926680 Originator: NO this patch has tabs in it... i havent finished reviewing... just getting my review comments out... ill let you know when I am done fully reviewing the patch so that you dont have to upload it again and again. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1913998&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 15:49:47
|
Patches item #1913998, was opened at 2008-03-13 18:52 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1913998&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Andrew Kornev (akornev) Assigned to: Benjamin Reed (breed) Summary: JMX instrumentation Initial Comment: This patch adds support for management and monitoring via JMX. It uses the MXBean feature made available in Java 6. The instrumentation code relies on the new lifecycle event notification mechanism to dynamically register and unregister MBeans. The ant build file has NOT been modified to build this new feature yet. There are no javadocs for the new classes either (but this will be fixed, of course). ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-03 08:49 Message: Logged In: YES user_id=154690 Originator: NO Updated the patch slightly so that: 1) everything goes under the jmx directory 2) DataTreeBean.getDatasize always returns -1 until we get a more efficient way of getting the data size 3) HiddenNameBean was removed since it is unused File Added: jmx-5.patch ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 13:14 Message: Logged In: YES user_id=1926680 Originator: NO assigning to ben since he is reviewing the code. ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-09 11:34 Message: Logged In: YES user_id=1926652 Originator: YES Fixed compilation errors introduced as a result of [1956480] patch. Now is really a good time to start reviewing this patch. File Added: jmx-4.patch ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-07 11:25 Message: Logged In: YES user_id=1926652 Originator: YES Moved around ManagedZooKeeperServer and ManagedQuorumPeer classes back to the jmx source tree to make it possible to build the src source tree whithout the jmx tree. Now would be a good time to start reviewing this patch. File Added: jmx-3.patch ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-05-01 18:06 Message: Logged In: YES user_id=1926652 Originator: YES Added javadocs, updated the code to use log4j, moved common classes to the src/ directory tree. File Added: jmx-2.patch ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-17 13:44 Message: Logged In: YES user_id=1926680 Originator: NO that is primarily my concern. Meaning if jmx is to be enabled in production then creating an ojbect per connection would deteriorate the scalability of zookeeper. What I mean is that do we actually need per connection management in jmx? or we could have just connection stats in jmx and some other means to debug problems with connections (other than jmx)? ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-17 13:14 Message: Logged In: YES user_id=1926652 Originator: YES JMX is primarily for management and monitoring, and my hope is to have JMX enabled in production by default. JMX instrumentation code will eventually rely on the /proc data tree (when it becomes available) to get connection-related information. But we're still gonna have to have a bean per connection registered with MBeanServer. Dynamic registering/unregistering of connection beans may indeed incur some (insignificant) overhead, but I'm not aware of any other way of making a dynamic resource avilable to JMX. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-17 12:50 Message: Logged In: YES user_id=1926680 Originator: NO 3) for this what I meant was that .. we c annot run with jmx on in production .. and if connection tracking is for debugging purpose then jmx just does not sound the right tool for debugging... 4) thats true... but lets have a better process in line :) ... sorry to have started with you. ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-17 12:23 Message: Logged In: YES user_id=1926652 Originator: YES 1) yes! 2) updated the code to use logError() for logging errors only and logTraceMessage() for logging debug/info. 3) not sure I understand the concern. Connection tracking using /proc FS is orthogonal to JMX. JMX Beans would have to be instantiated for each new connection anyway if we want to be able to manage connections thru JMX. 4) this may take a while. Historically, however, lacking and/or misleading javadocs have never been a showstopper for the existing ZK code ;-) ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-15 11:02 Message: Logged In: YES user_id=1926680 Originator: NO 1) dataTreeBean = new com.yahoo.zookeeper.jmx.server.DataTreeBean( + server.dataTree); can we just replace this with DataTreeBean(datatree)? 2) also i wonder if we should have logging levels on the server side like debug and info , warn, fatal right now we just have logerror and logwarn... and we end up using logError for everything .. 3) am really unconfortable with the idea that we keep track of all the connections using jmx beans? I thought we were going to put all this info in the /proc file system? We end up creating a new object connection bean for every connection (as per my understanding) ... 4) can you upload a patch with javadocs? :) Its easier to review :). ---------------------------------------------------------------------- Comment By: Andrew Kornev (akornev) Date: 2008-03-14 10:11 Message: Logged In: YES user_id=1926652 Originator: YES Are you sure it's this patch that has tabs in it? Or, maybe, you meant the other one (the refactoring patch)? Because that one does have tabs (which I've already fixed in locally). But this one does not. Thanks for reviewing! ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-03-13 19:20 Message: Logged In: YES user_id=1926680 Originator: NO this patch has tabs in it... i havent finished reviewing... just getting my review comments out... ill let you know when I am done fully reviewing the patch so that you dont have to upload it again and again. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1913998&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 04:39:09
|
Patches item #1982992, was opened at 2008-06-03 04:38 Message generated for change (Comment added) made by mahadevkonar You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) >Assigned to: Benjamin Reed (breed) Summary: concurrency issue in the zookeeper leader code. Initial Comment: The Datatree.ephemerals has a hashset of ephermal nodes for a given session. This list is not synchronized. This was a part of the bug that one of our users found. One of the sessions was getting expired and some other session was deleting ephemeral nodes from the session getting expired. This caused a concurrency modification excpetion while we called hashlist.clone() and datatree.getephermals() --- The fix is simple to put synchronized blocks around the accesses to the list. ---------------------------------------------------------------------- >Comment By: Mahadev Konar (mahadevkonar) Date: 2008-06-03 04:39 Message: Logged In: YES user_id=1926680 Originator: YES ben can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-03 04:38:25
|
Patches item #1982992, was opened at 2008-06-03 04:38 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: server Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Nobody/Anonymous (nobody) Summary: concurrency issue in the zookeeper leader code. Initial Comment: The Datatree.ephemerals has a hashset of ephermal nodes for a given session. This list is not synchronized. This was a part of the bug that one of our users found. One of the sessions was getting expired and some other session was deleting ephemeral nodes from the session getting expired. This caused a concurrency modification excpetion while we called hashlist.clone() and datatree.getephermals() --- The fix is simple to put synchronized blocks around the accesses to the list. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1982992&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-02 14:00:07
|
Bugs item #1937084, was opened at 2008-04-07 14:23 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1937084&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None >Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: fpj (fpj) Summary: Incompatible client and server list detection Initial Comment: Bad things may happen if the lists of hosts that make up ZooKeeper at the client doesn't match the actual servers that make up ZooKeeper particularly when the client's list includes servers from different ZooKeeper instances. The ZooKeeper server should validate the clients list when the clients connect to ZooKeeper. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-02 07:00 Message: Logged In: YES user_id=154690 Originator: YES I realize this is a bit irrelevant since we are moving to Apache, but we should not move issues between bugs and patches even though sourceforge allows it. Patches really need to originate as patches because there is an explicit agreement of contribution. Also, multiple patches may result from a single bug. For example, for this bug we should probably do a patch to fix the jute int list rather than create another class. ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-31 03:46 Message: Logged In: YES user_id=1926444 Originator: NO Sorry, this is what I should have said... =) I have moved this request to the patches queue as I'm proposing a patch. The patch attached creates a new jute packet type, called ChkServersPacket, that clients send to servers and servers recognize as containing the list of ZK servers the client is aware of. Note that jute has a bug that does not allowed me to create a vector<int> field, so I had to add a new type called Port. This is basically a hack to bypass the jute bug. The main files affected are ClientCnxn.java and NIOServerCnxn.java. This patch does not implement a modification to the C interface. A modification to the C client is necessary with the modifications to the server in the current patch as the server drops the connection to the client if the list doesn't match, and it doesn't accept the client connection if the client doesn't send a list of servers. It is a possibility (not implemented in this patch, though) to enable and disable the check through a configuration flag. -Flavio ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-31 03:36 Message: Logged In: YES user_id=1926444 Originator: NO I have moved this request to the patches queue as I'm proposing a patch. The patch attached creates a new jute packet type, called ChkServersPacket, that clients send to servers and servers recognize as containing the list of ZK servers the client is aware of. The main files affected are ClientCnxn.java and NIOServerCnxn.java. This patch thus not implement a modification to the C interface. A modification to the C client is necessary with the modifications to the server in the current patch as the server drops the connection to the client if the list doesn't match, and it doesn't accept the client coonection if the client doesn't send a list of servers. It is a possibility (not implemented in this patch, though) to enable and disable the check through a configuration flag. -Flavio File Added: patch-chkservers.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-29 11:45 Message: Logged In: YES user_id=1926444 Originator: NO File Added: patch-srv-chk.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-28 10:20 Message: Logged In: YES user_id=1926444 Originator: NO My plan is to add a check on the server side to NIOServerCNXN.readRequest() similar to the one for OpCode.auth. I would then add a new OpCode "chkservers" and a new packet "ChkServersPacket". Another way would be to piggyback this information on some other message, but I find such things confusing when trying to understand the code later. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-19 19:03 Message: Logged In: YES user_id=154690 Originator: YES I think it should be IP address/port pairs that are compared. It is possible that there could be a backend network used by the servers for the quorum protocols and a frontend network used to talk to clients. In that case we should be able to specify those address/port pairs, but for now lets just assume they are the same. (It's a valid assumption in all our current deployments.) You get extra credit for supporting both IPV4 and V6 addresses :) ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 00:17 Message: Logged In: YES user_id=12853 Originator: NO Are you saying that the client should pass the list of addresses to the Server? and the server should verify this list against the quorum members? I don't see this in the current code - what would have to change, the protocol and client initialization (server connect)? Seems like comparing ip addresses could be an issue since the ip seen by the client could differ from the ips used by the servers. Comparing host names seems like it could also have issues... What happens if a quorum member is temporarily offline (not active member, say network went down temporarily) and a client connects? Also -- seems like the host:port would need to be compared, not just host, correct? (multiple zk clusters running on a single host). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1937084&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-01 15:05:53
|
Patches item #1937078, was opened at 2008-04-07 14:21 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: java client Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Benjamin Reed (breed) Summary: Passing a watch object to read requests Initial Comment: When using a libraries for higher order synchronization primitives it gets hard to coordinate interests in watch events. It would be much more convenient if a Watcher object could be passed to the getData, getChildren, or exists methods. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-06-01 08:06 Message: Logged In: YES user_id=154690 Originator: YES Looks good! You are still missing 3 and 4. 3) Since there is currently only one watch, yes all watches are notified of a disconnect. 4) Yes, this is a bug in the server. The current behaviors leave untriggered watches which is not right. I've opened a bug for this: [ 1981340 ] Child watches are not triggered when the node is deleted ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 12:40 Message: Logged In: YES user_id=12853 Originator: NO I implemented all of your (ben) suggestions except for item 3. This keeps the behavior as close to the original as possible - the default watcher will be called when not sync'd. I register the watcher in finish now - at the very beginning of the function. Please verify this in particular is correct. File Added: watcherobj3.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 10:55 Message: Logged In: YES user_id=12853 Originator: NO regarding: 4) as part of reworking the patch I looked closely at the code for triggering watches and based processwatchevent on that, unless I'm missing something what I have is correct behavior -- here is deletenode code on the server: dataWatches.triggerWatch(path, Event.EventNodeDeleted); childWatches.triggerWatch(parentName.equals("")?"/":parentName, Event.EventNodeChildrenChanged); childwatches get childrenchanged on the parent of path while datawatches gets nodedeleted on the path itself 2) so the assumption is that two sets of code won't register the same watch obj? This seems reasonable, but I will make explicit in the docs 3) is that what happens currently? all watches are triggered on disconnect? Otw I'll incorporate the feedback you provided. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-30 09:40 Message: Logged In: YES user_id=154690 Originator: YES This looks much better. There are still a couple of issues: 1) The XXXWatches hashmaps should be final not volatile 2) The elements of the watch hashmaps should be hashsets, not lists. If the same watch is added twice, we only need to invoke once. 3) In processWatchEvent, on disconnect I think all the watches need to be invoked not cleared. This may not really be an issue since I'll be changing this with the auto reregister of watches. 4) In processWatchEvent, the nodeDeleted event gets delivered to both data and child watchers. (If you are watching for children of a node, and that node goes away we trigger a watch.) 5) In processWatchEvent, the remove from the hashmap needs to be done in a synchronized block. Otherwise, you could have concurrent additions to the Hashset, currently ArrayList, going on. 6) You cannot add the watches in the completion function and the blocks of the synchronous calls. If you do you end up with a race condition: you are interested in a change, it comes in right after you get the response but before you actually process it. They all need to be done in one place: finishPacket(); the next packet will not be processed before finishPacket is complete. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-28 15:17 Message: Logged In: YES user_id=12853 Originator: NO I updated the patch based on feedback on original. File Added: watcherobj2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:31 Message: Logged In: YES user_id=154690 Originator: YES As far as the implementation goes, we can't do it in the clientcnxn EventThread since only async events go there. The easiest thing would be to do it in finishPacket(). You should probably add members in Packet to hold the Watcher object put it in watches in finishPacket(). ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:26 Message: Logged In: YES user_id=154690 Originator: YES 'Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively?' If a create(node) succeeds, that means the node did not exists, which means that getData got a NoNode error, which means that no watch was set. So, you will only see a nodecreated event and it would be delivered to the Watcher of the exists() call. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:48 Message: Logged In: YES user_id=12853 Originator: NO What would you suggest regarding async read requests and watcher registration - we wrap the callback with code that registers the watcher then calls the client callback? I think this would be cleaner than embedding into the clientcnxn EventThread code. What do you think? ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:40 Message: Logged In: YES user_id=12853 Originator: NO Hm. I re-read the docs and I'm still a bit confused by your statements regarding watch types. Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively? The watches wiki http://zookeeper.wiki.sourceforge.net/ZooKeeperWatches should really be beefed up with more detail on which event types are generated based on write ops. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:07 Message: Logged In: YES user_id=12853 Originator: NO I'll rework the patch. Regarding the c vs java - I only implemented java due to the category being set to "java client". Perhaps we should add a "client API" category? Or maybe create two requests instead of 1 -- would be better for me since I'm much more familiar with java than C, you could assign to someone better able to execute. :-) ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 11:22 Message: Logged In: YES user_id=154690 Originator: YES We do also need to implement this in the C client as well, but for simplicity we should make that a separate patch. We shouldn't add watches to the watches hashmap until we get a successful result back from the server. Otherwise, we might trigger watches for things that the client isn't expecting. There is also a problem with type watch types. In the server we have a watch list for data and child watches. It might be good to keep that same division in the client. (Either way would work though.) a watch on getChildren() can result in a childrenChanged or nodeDeleted watch event. getData() can result in dataChanged or nodeDeleted event. exist() can result in dataChanged, nodeCreated, or nodeDeleted event. getChildren() and getData() will not get nodeCreated events, because non existence is an error condition for those two calls and a watch will not be set. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-22 14:09 Message: Logged In: YES user_id=12853 Originator: NO Here's the patch, assigning to Ben for further review. Please pay particular attention to the handling of state changes - I clear the watch table when we are not sync'd to the server. Verify my workflow. File Added: watcherobj.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 23:18 Message: Logged In: YES user_id=12853 Originator: NO I've implemented this for Java. I overloaded the respective methods with signatures that take watcher object instead of watch boolean. I added new methods rather than replacing -- this maintains b/w compatibility. 1) should I mark the old methods as deprecated or should we just support both method types (newer signature w/ explicit watcher object && old watch boolean which directs to the "global" watcher.) 2) I see this is listed as category "java client" so I'm assuming there is no need for this feature in c code interface. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 17:52 Message: Logged In: YES user_id=12853 Originator: NO I was specifically referring to the client api, not the protocol -- should I assume that that last watcher on a particular node "wins". What I mean is if a client calls "getChildren(path, watcherobj1); getChildren(path, watcherobj2);" (notice same patch) what should the client lib do? I'm assuming use the last watcher, but wanted to verify. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-19 19:05 Message: Logged In: YES user_id=154690 Originator: YES The client will need to track the multiple watcher objects to invoke when it gets the callback. The client server protocol doesn't change, this is just a client side thing. We should add the ability to remove watches as a separate feature request. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 00:21 Message: Logged In: YES user_id=12853 Originator: NO What should we do if the user calls getData multiple times with different watcher objects? Remove previous watch and replace with new watcher, throw error, or allow multiple watchers on particular node? How about removing a watch by calling method on particular node with null watcher? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-01 15:05:14
|
Bugs item #1981340, was opened at 2008-06-01 08:05 Message generated for change (Settings changed) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None >Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Nobody/Anonymous (nobody) Summary: Child watches are not triggered when the node is deleted Initial Comment: If a client sets a watch on "/foo" with a getChildren("/foo" ...) and "/foo" gets deleted, no watch event is triggered. (Ever.) It could be argued that this is correct behavior since the children did not change, but it can lead to confusion since "/foo" could be recreated and children added and still the watch would not get triggered. (We trace watches by znode instance and deleting and creating "/foo" causes a new znode instance.) The correct behavior should be that when a znode is removed there are three events triggered: parent child watches, znode data watches, and znode child watches. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-01 15:04:54
|
Bugs item #1981340, was opened at 2008-06-01 08:05 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Nobody/Anonymous (nobody) Summary: Child watches are not triggered when the node is deleted Initial Comment: If a client sets a watch on "/foo" with a getChildren("/foo" ...) and "/foo" gets deleted, no watch event is triggered. (Ever.) It could be argued that this is correct behavior since the children did not change, but it can lead to confusion since "/foo" could be recreated and children added and still the watch would not get triggered. (We trace watches by znode instance and deleting and creating "/foo" causes a new znode instance.) The correct behavior should be that when a znode is removed there are three events triggered: parent child watches, znode data watches, and znode child watches. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981340&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-06-01 12:16:12
|
Bugs item #1981288, was opened at 2008-06-01 14:16 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981288&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: fpj (fpj) Assigned to: Nobody/Anonymous (nobody) Summary: Vector of Integers with Jute Initial Comment: Currently, Jute generates a "List<int>" for the Java code, which is broken. I have tried to simply replace "int" with "Integer" in the third parameter of the call to the constructor of JType. It doesn't seem to work, though. One option would be to hack into JVector, and force it to use the type "Integer" whenever the Jute type is "int". If possible, I'd rather opt for a cleaner option, though. -Flavio ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1981288&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-31 10:46:45
|
Patches item #1937084, was opened at 2008-04-07 23:23 Message generated for change (Comment added) made by fpj You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937084&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: fpj (fpj) Summary: Incompatible client and server list detection Initial Comment: Bad things may happen if the lists of hosts that make up ZooKeeper at the client doesn't match the actual servers that make up ZooKeeper particularly when the client's list includes servers from different ZooKeeper instances. The ZooKeeper server should validate the clients list when the clients connect to ZooKeeper. ---------------------------------------------------------------------- >Comment By: fpj (fpj) Date: 2008-05-31 12:46 Message: Logged In: YES user_id=1926444 Originator: NO Sorry, this is what I should have said... =) I have moved this request to the patches queue as I'm proposing a patch. The patch attached creates a new jute packet type, called ChkServersPacket, that clients send to servers and servers recognize as containing the list of ZK servers the client is aware of. Note that jute has a bug that does not allowed me to create a vector<int> field, so I had to add a new type called Port. This is basically a hack to bypass the jute bug. The main files affected are ClientCnxn.java and NIOServerCnxn.java. This patch does not implement a modification to the C interface. A modification to the C client is necessary with the modifications to the server in the current patch as the server drops the connection to the client if the list doesn't match, and it doesn't accept the client connection if the client doesn't send a list of servers. It is a possibility (not implemented in this patch, though) to enable and disable the check through a configuration flag. -Flavio ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-31 12:36 Message: Logged In: YES user_id=1926444 Originator: NO I have moved this request to the patches queue as I'm proposing a patch. The patch attached creates a new jute packet type, called ChkServersPacket, that clients send to servers and servers recognize as containing the list of ZK servers the client is aware of. The main files affected are ClientCnxn.java and NIOServerCnxn.java. This patch thus not implement a modification to the C interface. A modification to the C client is necessary with the modifications to the server in the current patch as the server drops the connection to the client if the list doesn't match, and it doesn't accept the client coonection if the client doesn't send a list of servers. It is a possibility (not implemented in this patch, though) to enable and disable the check through a configuration flag. -Flavio File Added: patch-chkservers.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-29 20:45 Message: Logged In: YES user_id=1926444 Originator: NO File Added: patch-srv-chk.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-28 19:20 Message: Logged In: YES user_id=1926444 Originator: NO My plan is to add a check on the server side to NIOServerCNXN.readRequest() similar to the one for OpCode.auth. I would then add a new OpCode "chkservers" and a new packet "ChkServersPacket". Another way would be to piggyback this information on some other message, but I find such things confusing when trying to understand the code later. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-20 04:03 Message: Logged In: YES user_id=154690 Originator: YES I think it should be IP address/port pairs that are compared. It is possible that there could be a backend network used by the servers for the quorum protocols and a frontend network used to talk to clients. In that case we should be able to specify those address/port pairs, but for now lets just assume they are the same. (It's a valid assumption in all our current deployments.) You get extra credit for supporting both IPV4 and V6 addresses :) ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 09:17 Message: Logged In: YES user_id=12853 Originator: NO Are you saying that the client should pass the list of addresses to the Server? and the server should verify this list against the quorum members? I don't see this in the current code - what would have to change, the protocol and client initialization (server connect)? Seems like comparing ip addresses could be an issue since the ip seen by the client could differ from the ips used by the servers. Comparing host names seems like it could also have issues... What happens if a quorum member is temporarily offline (not active member, say network went down temporarily) and a client connects? Also -- seems like the host:port would need to be compared, not just host, correct? (multiple zk clusters running on a single host). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937084&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-31 10:36:44
|
Patches item #1937084, was opened at 2008-04-07 23:23 Message generated for change (Comment added) made by fpj You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937084&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. >Category: None >Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: fpj (fpj) Summary: Incompatible client and server list detection Initial Comment: Bad things may happen if the lists of hosts that make up ZooKeeper at the client doesn't match the actual servers that make up ZooKeeper particularly when the client's list includes servers from different ZooKeeper instances. The ZooKeeper server should validate the clients list when the clients connect to ZooKeeper. ---------------------------------------------------------------------- >Comment By: fpj (fpj) Date: 2008-05-31 12:36 Message: Logged In: YES user_id=1926444 Originator: NO I have moved this request to the patches queue as I'm proposing a patch. The patch attached creates a new jute packet type, called ChkServersPacket, that clients send to servers and servers recognize as containing the list of ZK servers the client is aware of. The main files affected are ClientCnxn.java and NIOServerCnxn.java. This patch thus not implement a modification to the C interface. A modification to the C client is necessary with the modifications to the server in the current patch as the server drops the connection to the client if the list doesn't match, and it doesn't accept the client coonection if the client doesn't send a list of servers. It is a possibility (not implemented in this patch, though) to enable and disable the check through a configuration flag. -Flavio File Added: patch-chkservers.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-29 20:45 Message: Logged In: YES user_id=1926444 Originator: NO File Added: patch-srv-chk.txt ---------------------------------------------------------------------- Comment By: fpj (fpj) Date: 2008-05-28 19:20 Message: Logged In: YES user_id=1926444 Originator: NO My plan is to add a check on the server side to NIOServerCNXN.readRequest() similar to the one for OpCode.auth. I would then add a new OpCode "chkservers" and a new packet "ChkServersPacket". Another way would be to piggyback this information on some other message, but I find such things confusing when trying to understand the code later. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-20 04:03 Message: Logged In: YES user_id=154690 Originator: YES I think it should be IP address/port pairs that are compared. It is possible that there could be a backend network used by the servers for the quorum protocols and a frontend network used to talk to clients. In that case we should be able to specify those address/port pairs, but for now lets just assume they are the same. (It's a valid assumption in all our current deployments.) You get extra credit for supporting both IPV4 and V6 addresses :) ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 09:17 Message: Logged In: YES user_id=12853 Originator: NO Are you saying that the client should pass the list of addresses to the Server? and the server should verify this list against the quorum members? I don't see this in the current code - what would have to change, the protocol and client initialization (server connect)? Seems like comparing ip addresses could be an issue since the ip seen by the client could differ from the ips used by the servers. Comparing host names seems like it could also have issues... What happens if a quorum member is temporarily offline (not active member, say network went down temporarily) and a client connects? Also -- seems like the host:port would need to be compared, not just host, correct? (multiple zk clusters running on a single host). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937084&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 21:14:08
|
Bugs item #1979772, was opened at 2008-05-30 14:14 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1979772&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Patrick Hunt (phunt) Assigned to: Benjamin Reed (breed) Summary: keeper state inconsistency Initial Comment: I user reported that in the java client KeeperStateChanged seems be be both unused and unusual(shares value with KeeperStateDisconnected) http://sourceforge.net/mailarchive/forum.php?thread_name=0A263CC7-6F54-4978-9F97-4B2E4FD7EE42%40elctech.com&forum_name=zookeeper-user I looked at the history of the watcher.java file - the only log entry is the initial commit by Ben. Assigning to Ben for followup - is this state some artifact and can be removed? Or does it server some purpose - if so we should document in the javadoc. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1979772&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 20:09:25
|
Bugs item #1969991, was opened at 2008-05-22 23:15 Message generated for change (Comment added) made by mahadevkonar You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1969991&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3.0.0 >Status: Closed >Resolution: Fixed Priority: 5 Private: No Submitted By: Patrick Hunt (phunt) Assigned to: Mahadev Konar (mahadevkonar) Summary: remove log4j.properties from zk.jar Initial Comment: remove log4j config from jar file and instead put conf on classpath in script some users have seen issues with tomcat - seems the default tomcat logging gets screwed up if the log4j.properties is in the jar. ---------------------------------------------------------------------- >Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-30 20:09 Message: Logged In: YES user_id=1926680 Originator: NO fixed in https://sourceforge.net/tracker/index.php?func=detail&aid=1978290&group_id=209147&atid=1008546 ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-29 20:02 Message: Logged In: YES user_id=12853 Originator: YES Mahadev volunteered to fix this one. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1969991&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 20:08:48
|
Patches item #1978290, was opened at 2008-05-29 20:07 Message generated for change (Comment added) made by mahadevkonar You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None >Status: Closed >Resolution: Fixed Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Mahadev Konar (mahadevkonar) Summary: remove log4j prop file from zookeeper.jar Initial Comment: deleting the log4j prop file from the jar. ---------------------------------------------------------------------- >Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-30 20:08 Message: Logged In: YES user_id=1926680 Originator: YES committed to revision 169. Will fix the scripts in a different patch. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 19:44 Message: Logged In: YES user_id=12853 Originator: NO Handoff to Mahadev for commit. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 19:42 Message: Logged In: YES user_id=12853 Originator: NO looks good +1 (assuming 1951806 will also be fixed - fixup the script to include conf on the classpath such that log4j.properties is available) ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 20:11 Message: Logged In: YES user_id=1926680 Originator: YES pat can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 19:43:56
|
Patches item #1978290, was opened at 2008-05-29 13:07 Message generated for change (Comment added) made by phunt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) >Assigned to: Mahadev Konar (mahadevkonar) Summary: remove log4j prop file from zookeeper.jar Initial Comment: deleting the log4j prop file from the jar. ---------------------------------------------------------------------- >Comment By: Patrick Hunt (phunt) Date: 2008-05-30 12:44 Message: Logged In: YES user_id=12853 Originator: NO Handoff to Mahadev for commit. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 12:42 Message: Logged In: YES user_id=12853 Originator: NO looks good +1 (assuming 1951806 will also be fixed - fixup the script to include conf on the classpath such that log4j.properties is available) ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 13:11 Message: Logged In: YES user_id=1926680 Originator: YES pat can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 19:42:50
|
Patches item #1978290, was opened at 2008-05-29 13:07 Message generated for change (Comment added) made by phunt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Mahadev Konar (mahadevkonar) Assigned to: Patrick Hunt (phunt) Summary: remove log4j prop file from zookeeper.jar Initial Comment: deleting the log4j prop file from the jar. ---------------------------------------------------------------------- >Comment By: Patrick Hunt (phunt) Date: 2008-05-30 12:42 Message: Logged In: YES user_id=12853 Originator: NO looks good +1 (assuming 1951806 will also be fixed - fixup the script to include conf on the classpath such that log4j.properties is available) ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 13:11 Message: Logged In: YES user_id=1926680 Originator: YES pat can you review it? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1978290&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 19:40:05
|
Patches item #1937078, was opened at 2008-04-07 14:21 Message generated for change (Comment added) made by phunt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: java client Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Benjamin Reed (breed) Summary: Passing a watch object to read requests Initial Comment: When using a libraries for higher order synchronization primitives it gets hard to coordinate interests in watch events. It would be much more convenient if a Watcher object could be passed to the getData, getChildren, or exists methods. ---------------------------------------------------------------------- >Comment By: Patrick Hunt (phunt) Date: 2008-05-30 12:40 Message: Logged In: YES user_id=12853 Originator: NO I implemented all of your (ben) suggestions except for item 3. This keeps the behavior as close to the original as possible - the default watcher will be called when not sync'd. I register the watcher in finish now - at the very beginning of the function. Please verify this in particular is correct. File Added: watcherobj3.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-30 10:55 Message: Logged In: YES user_id=12853 Originator: NO regarding: 4) as part of reworking the patch I looked closely at the code for triggering watches and based processwatchevent on that, unless I'm missing something what I have is correct behavior -- here is deletenode code on the server: dataWatches.triggerWatch(path, Event.EventNodeDeleted); childWatches.triggerWatch(parentName.equals("")?"/":parentName, Event.EventNodeChildrenChanged); childwatches get childrenchanged on the parent of path while datawatches gets nodedeleted on the path itself 2) so the assumption is that two sets of code won't register the same watch obj? This seems reasonable, but I will make explicit in the docs 3) is that what happens currently? all watches are triggered on disconnect? Otw I'll incorporate the feedback you provided. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-30 09:40 Message: Logged In: YES user_id=154690 Originator: YES This looks much better. There are still a couple of issues: 1) The XXXWatches hashmaps should be final not volatile 2) The elements of the watch hashmaps should be hashsets, not lists. If the same watch is added twice, we only need to invoke once. 3) In processWatchEvent, on disconnect I think all the watches need to be invoked not cleared. This may not really be an issue since I'll be changing this with the auto reregister of watches. 4) In processWatchEvent, the nodeDeleted event gets delivered to both data and child watchers. (If you are watching for children of a node, and that node goes away we trigger a watch.) 5) In processWatchEvent, the remove from the hashmap needs to be done in a synchronized block. Otherwise, you could have concurrent additions to the Hashset, currently ArrayList, going on. 6) You cannot add the watches in the completion function and the blocks of the synchronous calls. If you do you end up with a race condition: you are interested in a change, it comes in right after you get the response but before you actually process it. They all need to be done in one place: finishPacket(); the next packet will not be processed before finishPacket is complete. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-28 15:17 Message: Logged In: YES user_id=12853 Originator: NO I updated the patch based on feedback on original. File Added: watcherobj2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:31 Message: Logged In: YES user_id=154690 Originator: YES As far as the implementation goes, we can't do it in the clientcnxn EventThread since only async events go there. The easiest thing would be to do it in finishPacket(). You should probably add members in Packet to hold the Watcher object put it in watches in finishPacket(). ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:26 Message: Logged In: YES user_id=154690 Originator: YES 'Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively?' If a create(node) succeeds, that means the node did not exists, which means that getData got a NoNode error, which means that no watch was set. So, you will only see a nodecreated event and it would be delivered to the Watcher of the exists() call. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:48 Message: Logged In: YES user_id=12853 Originator: NO What would you suggest regarding async read requests and watcher registration - we wrap the callback with code that registers the watcher then calls the client callback? I think this would be cleaner than embedding into the clientcnxn EventThread code. What do you think? ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:40 Message: Logged In: YES user_id=12853 Originator: NO Hm. I re-read the docs and I'm still a bit confused by your statements regarding watch types. Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively? The watches wiki http://zookeeper.wiki.sourceforge.net/ZooKeeperWatches should really be beefed up with more detail on which event types are generated based on write ops. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:07 Message: Logged In: YES user_id=12853 Originator: NO I'll rework the patch. Regarding the c vs java - I only implemented java due to the category being set to "java client". Perhaps we should add a "client API" category? Or maybe create two requests instead of 1 -- would be better for me since I'm much more familiar with java than C, you could assign to someone better able to execute. :-) ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 11:22 Message: Logged In: YES user_id=154690 Originator: YES We do also need to implement this in the C client as well, but for simplicity we should make that a separate patch. We shouldn't add watches to the watches hashmap until we get a successful result back from the server. Otherwise, we might trigger watches for things that the client isn't expecting. There is also a problem with type watch types. In the server we have a watch list for data and child watches. It might be good to keep that same division in the client. (Either way would work though.) a watch on getChildren() can result in a childrenChanged or nodeDeleted watch event. getData() can result in dataChanged or nodeDeleted event. exist() can result in dataChanged, nodeCreated, or nodeDeleted event. getChildren() and getData() will not get nodeCreated events, because non existence is an error condition for those two calls and a watch will not be set. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-22 14:09 Message: Logged In: YES user_id=12853 Originator: NO Here's the patch, assigning to Ben for further review. Please pay particular attention to the handling of state changes - I clear the watch table when we are not sync'd to the server. Verify my workflow. File Added: watcherobj.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 23:18 Message: Logged In: YES user_id=12853 Originator: NO I've implemented this for Java. I overloaded the respective methods with signatures that take watcher object instead of watch boolean. I added new methods rather than replacing -- this maintains b/w compatibility. 1) should I mark the old methods as deprecated or should we just support both method types (newer signature w/ explicit watcher object && old watch boolean which directs to the "global" watcher.) 2) I see this is listed as category "java client" so I'm assuming there is no need for this feature in c code interface. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 17:52 Message: Logged In: YES user_id=12853 Originator: NO I was specifically referring to the client api, not the protocol -- should I assume that that last watcher on a particular node "wins". What I mean is if a client calls "getChildren(path, watcherobj1); getChildren(path, watcherobj2);" (notice same patch) what should the client lib do? I'm assuming use the last watcher, but wanted to verify. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-19 19:05 Message: Logged In: YES user_id=154690 Originator: YES The client will need to track the multiple watcher objects to invoke when it gets the callback. The client server protocol doesn't change, this is just a client side thing. We should add the ability to remove watches as a separate feature request. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 00:21 Message: Logged In: YES user_id=12853 Originator: NO What should we do if the user calls getData multiple times with different watcher objects? Remove previous watch and replace with new watcher, throw error, or allow multiple watchers on particular node? How about removing a watch by calling method on particular node with null watcher? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 17:55:31
|
Patches item #1937078, was opened at 2008-04-07 14:21 Message generated for change (Comment added) made by phunt You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: java client Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Benjamin Reed (breed) Summary: Passing a watch object to read requests Initial Comment: When using a libraries for higher order synchronization primitives it gets hard to coordinate interests in watch events. It would be much more convenient if a Watcher object could be passed to the getData, getChildren, or exists methods. ---------------------------------------------------------------------- >Comment By: Patrick Hunt (phunt) Date: 2008-05-30 10:55 Message: Logged In: YES user_id=12853 Originator: NO regarding: 4) as part of reworking the patch I looked closely at the code for triggering watches and based processwatchevent on that, unless I'm missing something what I have is correct behavior -- here is deletenode code on the server: dataWatches.triggerWatch(path, Event.EventNodeDeleted); childWatches.triggerWatch(parentName.equals("")?"/":parentName, Event.EventNodeChildrenChanged); childwatches get childrenchanged on the parent of path while datawatches gets nodedeleted on the path itself 2) so the assumption is that two sets of code won't register the same watch obj? This seems reasonable, but I will make explicit in the docs 3) is that what happens currently? all watches are triggered on disconnect? Otw I'll incorporate the feedback you provided. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-30 09:40 Message: Logged In: YES user_id=154690 Originator: YES This looks much better. There are still a couple of issues: 1) The XXXWatches hashmaps should be final not volatile 2) The elements of the watch hashmaps should be hashsets, not lists. If the same watch is added twice, we only need to invoke once. 3) In processWatchEvent, on disconnect I think all the watches need to be invoked not cleared. This may not really be an issue since I'll be changing this with the auto reregister of watches. 4) In processWatchEvent, the nodeDeleted event gets delivered to both data and child watchers. (If you are watching for children of a node, and that node goes away we trigger a watch.) 5) In processWatchEvent, the remove from the hashmap needs to be done in a synchronized block. Otherwise, you could have concurrent additions to the Hashset, currently ArrayList, going on. 6) You cannot add the watches in the completion function and the blocks of the synchronous calls. If you do you end up with a race condition: you are interested in a change, it comes in right after you get the response but before you actually process it. They all need to be done in one place: finishPacket(); the next packet will not be processed before finishPacket is complete. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-28 15:17 Message: Logged In: YES user_id=12853 Originator: NO I updated the patch based on feedback on original. File Added: watcherobj2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:31 Message: Logged In: YES user_id=154690 Originator: YES As far as the implementation goes, we can't do it in the clientcnxn EventThread since only async events go there. The easiest thing would be to do it in finishPacket(). You should probably add members in Packet to hold the Watcher object put it in watches in finishPacket(). ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:26 Message: Logged In: YES user_id=154690 Originator: YES 'Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively?' If a create(node) succeeds, that means the node did not exists, which means that getData got a NoNode error, which means that no watch was set. So, you will only see a nodecreated event and it would be delivered to the Watcher of the exists() call. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:48 Message: Logged In: YES user_id=12853 Originator: NO What would you suggest regarding async read requests and watcher registration - we wrap the callback with code that registers the watcher then calls the client callback? I think this would be cleaner than embedding into the clientcnxn EventThread code. What do you think? ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:40 Message: Logged In: YES user_id=12853 Originator: NO Hm. I re-read the docs and I'm still a bit confused by your statements regarding watch types. Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively? The watches wiki http://zookeeper.wiki.sourceforge.net/ZooKeeperWatches should really be beefed up with more detail on which event types are generated based on write ops. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:07 Message: Logged In: YES user_id=12853 Originator: NO I'll rework the patch. Regarding the c vs java - I only implemented java due to the category being set to "java client". Perhaps we should add a "client API" category? Or maybe create two requests instead of 1 -- would be better for me since I'm much more familiar with java than C, you could assign to someone better able to execute. :-) ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 11:22 Message: Logged In: YES user_id=154690 Originator: YES We do also need to implement this in the C client as well, but for simplicity we should make that a separate patch. We shouldn't add watches to the watches hashmap until we get a successful result back from the server. Otherwise, we might trigger watches for things that the client isn't expecting. There is also a problem with type watch types. In the server we have a watch list for data and child watches. It might be good to keep that same division in the client. (Either way would work though.) a watch on getChildren() can result in a childrenChanged or nodeDeleted watch event. getData() can result in dataChanged or nodeDeleted event. exist() can result in dataChanged, nodeCreated, or nodeDeleted event. getChildren() and getData() will not get nodeCreated events, because non existence is an error condition for those two calls and a watch will not be set. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-22 14:09 Message: Logged In: YES user_id=12853 Originator: NO Here's the patch, assigning to Ben for further review. Please pay particular attention to the handling of state changes - I clear the watch table when we are not sync'd to the server. Verify my workflow. File Added: watcherobj.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 23:18 Message: Logged In: YES user_id=12853 Originator: NO I've implemented this for Java. I overloaded the respective methods with signatures that take watcher object instead of watch boolean. I added new methods rather than replacing -- this maintains b/w compatibility. 1) should I mark the old methods as deprecated or should we just support both method types (newer signature w/ explicit watcher object && old watch boolean which directs to the "global" watcher.) 2) I see this is listed as category "java client" so I'm assuming there is no need for this feature in c code interface. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 17:52 Message: Logged In: YES user_id=12853 Originator: NO I was specifically referring to the client api, not the protocol -- should I assume that that last watcher on a particular node "wins". What I mean is if a client calls "getChildren(path, watcherobj1); getChildren(path, watcherobj2);" (notice same patch) what should the client lib do? I'm assuming use the last watcher, but wanted to verify. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-19 19:05 Message: Logged In: YES user_id=154690 Originator: YES The client will need to track the multiple watcher objects to invoke when it gets the callback. The client server protocol doesn't change, this is just a client side thing. We should add the ability to remove watches as a separate feature request. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 00:21 Message: Logged In: YES user_id=12853 Originator: NO What should we do if the user calls getData multiple times with different watcher objects? Remove previous watch and replace with new watcher, throw error, or allow multiple watchers on particular node? How about removing a watch by calling method on particular node with null watcher? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 16:39:56
|
Patches item #1937078, was opened at 2008-04-07 14:21 Message generated for change (Comment added) made by breed You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: java client Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Benjamin Reed (breed) Assigned to: Benjamin Reed (breed) Summary: Passing a watch object to read requests Initial Comment: When using a libraries for higher order synchronization primitives it gets hard to coordinate interests in watch events. It would be much more convenient if a Watcher object could be passed to the getData, getChildren, or exists methods. ---------------------------------------------------------------------- >Comment By: Benjamin Reed (breed) Date: 2008-05-30 09:40 Message: Logged In: YES user_id=154690 Originator: YES This looks much better. There are still a couple of issues: 1) The XXXWatches hashmaps should be final not volatile 2) The elements of the watch hashmaps should be hashsets, not lists. If the same watch is added twice, we only need to invoke once. 3) In processWatchEvent, on disconnect I think all the watches need to be invoked not cleared. This may not really be an issue since I'll be changing this with the auto reregister of watches. 4) In processWatchEvent, the nodeDeleted event gets delivered to both data and child watchers. (If you are watching for children of a node, and that node goes away we trigger a watch.) 5) In processWatchEvent, the remove from the hashmap needs to be done in a synchronized block. Otherwise, you could have concurrent additions to the Hashset, currently ArrayList, going on. 6) You cannot add the watches in the completion function and the blocks of the synchronous calls. If you do you end up with a race condition: you are interested in a change, it comes in right after you get the response but before you actually process it. They all need to be done in one place: finishPacket(); the next packet will not be processed before finishPacket is complete. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-28 15:17 Message: Logged In: YES user_id=12853 Originator: NO I updated the patch based on feedback on original. File Added: watcherobj2.patch ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:31 Message: Logged In: YES user_id=154690 Originator: YES As far as the implementation goes, we can't do it in the clientcnxn EventThread since only async events go there. The easiest thing would be to do it in finishPacket(). You should probably add members in Packet to hold the Watcher object put it in watches in finishPacket(). ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 13:26 Message: Logged In: YES user_id=154690 Originator: YES 'Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively?' If a create(node) succeeds, that means the node did not exists, which means that getData got a NoNode error, which means that no watch was set. So, you will only see a nodecreated event and it would be delivered to the Watcher of the exists() call. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:48 Message: Logged In: YES user_id=12853 Originator: NO What would you suggest regarding async read requests and watcher registration - we wrap the callback with code that registers the watcher then calls the client callback? I think this would be cleaner than embedding into the clientcnxn EventThread code. What do you think? ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:40 Message: Logged In: YES user_id=12853 Originator: NO Hm. I re-read the docs and I'm still a bit confused by your statements regarding watch types. Are you saying that if I call getData(node,true) and exists(node,true) and then a subsequent create(node) happens I will get a single watch event of type datachanged for node? I won't see a "datachanged(node)" as well as a "nodecreated(node)", one for each of the read calls respectively? The watches wiki http://zookeeper.wiki.sourceforge.net/ZooKeeperWatches should really be beefed up with more detail on which event types are generated based on write ops. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-27 12:07 Message: Logged In: YES user_id=12853 Originator: NO I'll rework the patch. Regarding the c vs java - I only implemented java due to the category being set to "java client". Perhaps we should add a "client API" category? Or maybe create two requests instead of 1 -- would be better for me since I'm much more familiar with java than C, you could assign to someone better able to execute. :-) ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-27 11:22 Message: Logged In: YES user_id=154690 Originator: YES We do also need to implement this in the C client as well, but for simplicity we should make that a separate patch. We shouldn't add watches to the watches hashmap until we get a successful result back from the server. Otherwise, we might trigger watches for things that the client isn't expecting. There is also a problem with type watch types. In the server we have a watch list for data and child watches. It might be good to keep that same division in the client. (Either way would work though.) a watch on getChildren() can result in a childrenChanged or nodeDeleted watch event. getData() can result in dataChanged or nodeDeleted event. exist() can result in dataChanged, nodeCreated, or nodeDeleted event. getChildren() and getData() will not get nodeCreated events, because non existence is an error condition for those two calls and a watch will not be set. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-22 14:09 Message: Logged In: YES user_id=12853 Originator: NO Here's the patch, assigning to Ben for further review. Please pay particular attention to the handling of state changes - I clear the watch table when we are not sync'd to the server. Verify my workflow. File Added: watcherobj.patch ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 23:18 Message: Logged In: YES user_id=12853 Originator: NO I've implemented this for Java. I overloaded the respective methods with signatures that take watcher object instead of watch boolean. I added new methods rather than replacing -- this maintains b/w compatibility. 1) should I mark the old methods as deprecated or should we just support both method types (newer signature w/ explicit watcher object && old watch boolean which directs to the "global" watcher.) 2) I see this is listed as category "java client" so I'm assuming there is no need for this feature in c code interface. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-20 17:52 Message: Logged In: YES user_id=12853 Originator: NO I was specifically referring to the client api, not the protocol -- should I assume that that last watcher on a particular node "wins". What I mean is if a client calls "getChildren(path, watcherobj1); getChildren(path, watcherobj2);" (notice same patch) what should the client lib do? I'm assuming use the last watcher, but wanted to verify. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-05-19 19:05 Message: Logged In: YES user_id=154690 Originator: YES The client will need to track the multiple watcher objects to invoke when it gets the callback. The client server protocol doesn't change, this is just a client side thing. We should add the ability to remove watches as a separate feature request. ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-14 00:21 Message: Logged In: YES user_id=12853 Originator: NO What should we do if the user calls getData multiple times with different watcher objects? Remove previous watch and replace with new watcher, throw error, or allow multiple watchers on particular node? How about removing a watch by calling method on particular node with null watcher? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008546&aid=1937078&group_id=209147 |
From: SourceForge.net <no...@so...> - 2008-05-30 00:53:09
|
Bugs item #1946934, was opened at 2008-04-19 17:36 Message generated for change (Comment added) made by tedunning You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1946934&group_id=209147 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3.0.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Ted Dunning (tedunning) Assigned to: Mahadev Konar (mahadevkonar) Summary: NPE during normal operations Initial Comment: Somehow, I got my server to do this evil thing. I will try to figure out how to do this more consistently. 4/19/08 5:30:31 PM PDT [FinalRequestProcessor.java@220][8]: ****************************** 11944c2037100d8 256 fffffffffffffffe txn type = unknown getChildren n/a 4/19/08 5:30:31 PM PDT [FinalRequestProcessor.java@227][8]: ffffffff0 4/19/08 5:30:31 PM PDT [FinalRequestProcessor.java@228][8]: : java.lang.NullPointerException at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:157) at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:730) at com.yahoo.zookeeper.server.DataTree.getNode(DataTree.java:84) at com.yahoo.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:205) at com.yahoo.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:226) at com.yahoo.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113) 4/19/08 5:30:49 PM PDT [FinalRequestProcessor.java@220][8]: ****************************** 11944c2037100d8 257 fffffffffffffffe txn type = unknown exists n/a 4/19/08 5:30:49 PM PDT [FinalRequestProcessor.java@227][8]: ffffffff0 4/19/08 5:30:49 PM PDT [FinalRequestProcessor.java@228][8]: : java.lang.NullPointerException at com.yahoo.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:169) at com.yahoo.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:226) at com.yahoo.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113) ---------------------------------------------------------------------- >Comment By: Ted Dunning (tedunning) Date: 2008-05-29 17:53 Message: Logged In: YES user_id=972206 Originator: YES I am pretty sure that I had a null even though it was impossible (in the programming sense). I haven't had this problem since then. ---------------------------------------------------------------------- Comment By: Mahadev Konar (mahadevkonar) Date: 2008-05-29 17:21 Message: Logged In: YES user_id=1926680 Originator: NO Ted, just want to check if null path is the only problem here. In your comments you mentioned that null path might not be possible. Can you check to see if this is true -- (in a QA sense -- not a programmer sense)? :) ---------------------------------------------------------------------- Comment By: Patrick Hunt (phunt) Date: 2008-05-05 16:03 Message: Logged In: YES user_id=12853 Originator: NO Ben looks like you had a handle on this. Please submit a patch. ---------------------------------------------------------------------- Comment By: Ted Dunning (tedunning) Date: 2008-04-22 10:58 Message: Logged In: YES user_id=972206 Originator: YES Good thing then to have found it. My only problem is that by inspection, my code "can't" be sending the null. The usual software engineering definition of "can't" is obviously the one that I mean here. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-04-22 08:34 Message: Logged In: YES user_id=154690 Originator: NO I think your suspicion is correct. (I thought we were handling this.) It's actually a double bug! Both the server and client should check for a null path. ---------------------------------------------------------------------- Comment By: Ted Dunning (tedunning) Date: 2008-04-21 10:35 Message: Logged In: YES user_id=972206 Originator: YES java version "1.5.0_13" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_13-b05-241) Java HotSpot(TM) Client VM (build 1.5.0_13-121, mixed mode, sharing) Running on a mac under OSX 10.4 I think actually that this is actually a case of invalid arguments (from me) causing an obscure message rather than being caught. My suspicion is null path passed to create. I still owe you a test case. ---------------------------------------------------------------------- Comment By: Benjamin Reed (breed) Date: 2008-04-21 09:55 Message: Logged In: YES user_id=154690 Originator: NO Which version of Java are you using? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1008544&aid=1946934&group_id=209147 |