From: Bartlomiej N. <ba...@ya...> - 2008-04-25 23:46:04
Thanks Ben. As always, your answer is pretty reasonable and I agree with most of the points. However, I strongly disagree with:

2) A permanent watch is a watch that spans multiple changes within the same connection. I'm not requesting anything else.

4) How do subscriptions make it different from permanent watches? Maybe there is a feature I'm not aware of. Besides, a permanent watch should be cleared only when the session expires or the client disconnects; that would be the semantic, and a client can detect it easily.

5) There are two types of users: those who don't want to deal with difficult problems (I guess that's the majority), and those who want to get as much from the system as possible at the minimum cost, accepting the risk it brings. I think a flexible API should allow both, rather than enforcing correctness on the user side just for the sake of correctness. I just like to have a choice.

Just to wrap up this discussion, please suggest the most efficient solution (without persistent watches; you can use subscriptions) that minimizes the network traffic between client and server for the following scenario:

1) A client is interested in some activity of another application and it's critical to get *ALL* changes as long as the connection to ZK doesn't break
2) The client operates at the boundary of the network bandwidth limit

Thanks,
Bart

-----Original Message-----
From: Benjamin Reed [mailto:br...@ya...]
Sent: Friday, April 25, 2008 4:07 PM
To: Bartlomiej Niechwiej
Cc: Benjamin Reed; Jacob Levy; Ted Dunning; zoo...@li...
Subject: Re: [Zookeeper-user] [Bug?] Notification not guaranteed

exists() is a special case where the watch event does indeed have the data. But it only buys you what you need in the absence of failures. If you need to reconnect to another server, you still miss events.

The reasons for not having permanent watches are practical:

1) Except for exists() you can get the same information more efficiently with current watches and version tracking.
Even in your example, if the alive node is really thrashing so fast that the tracking server cannot request fast enough to keep up with all the events, do you really need the missed events? The events you are able to keep up with are enough to indicate problems. You could actually get fine-grained counts by making alive a directory and using the SEQUENCE flag. That way you just compare the current sequence of the file with the previous sequence you saw.

2) Even for exists() it doesn't work across server connections, since watches can be missed.

3) Because of 2) an application cannot reliably count on permanent watches.

4) Applications would need to be responsible for proactively cleaning up permanent watches. (Which means they probably wouldn't.)

5) Most importantly, the ZooKeeper API is designed to encourage correct usage. We don't include the data that is changed in the watch event specifically because our initial users tried to take advantage of that data and always ended up with errors in their code. Permanent watches can also induce such errors.

So, really it's the practical issues that are behind our aversion to permanent watches. ZooKeeper needs to provide clean, well-understood semantics. Our current watches do this and subscribe would too, but something in the middle is likely to induce errors and misunderstandings.

ben

On Friday 25 April 2008 15:29:18 Bartlomiej Niechwiej wrote:
> Ben, I think I gave a clear example that I was interested in
> notifications about the change, not about the data being changed. In
> other words, the presence or absence is my data, a boolean. In the
> scenario I described, zookeeper cannot provide a reliable way of giving
> me what I need. That's it.
>
> Why is it so hard to provide permanent server-side watches? What is
> the problem with that? Is it that we don't want this functionality
> because it doesn't make sense or is it just a theoretical discussion?
>
> You suggest a subscription mechanism, which is far too expensive
> versus what I propose, and for simple cases like the one described, you
> would end up spending too many ZK resources.
>
> B.
>
> -----Original Message-----
> From: Benjamin Reed [mailto:br...@ya...]
> Sent: Friday, April 25, 2008 2:29 PM
> To: Jacob Levy; Ted Dunning; Bartlomiej Niechwiej; Benjamin Reed
> Cc: zoo...@li...
> Subject: Re: [Zookeeper-user] [Bug?] Notification not guaranteed
>
> Here is the issue: are you watching for changes or just the notification
> that something changed? Watches are really about notification of
> changes. Imagine the following execution:
>
> time 0: set /a to value0 (now version 1)
> time 1: set /a to value1 (now version 2)
> time 2: set /a to value2 (now version 3)
> time 3: set /a to value3 (now version 4)
> time 4: set /a to value4 (now version 5)
>
> If we had a permanent watch a client watching /a would get:
>
> time 0: getData(/a, permanent)
> time 1: getData returns value1 version 2
> time 2: /a changed
> time 3: /a changed
> time 4: /a changed
>
> With our current watches you could see something like:
>
> time 0: getData(/a, true)
> time 1: getData returns value1 version 2
> time 2: /a changed
> time 3: getData(/a, true)
> time 4: getData returns value4 version 5
>
> Now note, at the client you can change the above into a permanent watch
> by generating locally missed events by calculating the number of missed
> changes by subtracting the version numbers:
>
> time 0: getData(/a, true)
> time 1: getData returns value1 version 2
> time 2: /a changed
> time 3: getData(/a, true)
> time 4: getData returns value4 version 5
>         by looking at the version numbers we see that we missed
>         2 events, so generate now
> time 4: /a changed (locally generated)
> time 4: /a changed (locally generated)
>
> There is a slight additional latency for the version 4 change, but in
> some sense we have compressed the traffic (the collapsing that Ted
> mentioned).
>
> Now, in the end this is all silly. If you are really watching /a in this
> way, you are probably more interested in the actual data, something that
> the watch doesn't give you. In that case you usually want either the
> latest value (this is what ZooKeeper makes easy right now) or all the
> intermediate values. Watch events don't have values, so the permanent
> watches don't help with intermediate values. Subscribe events would push
> values. Subscribe actually gives you something you cannot get today.
>
> This is a repeat of what is said on
> http://zookeeper.wiki.sourceforge.net/SubscribeMethod Does this help
> clarify the wiki any better?
>
> ben
>
>
> ----- Original Message ----
> From: Jacob Levy <jy...@ya...>
> To: Ted Dunning <tdu...@ve...>; Bartlomiej Niechwiej
> <ba...@ya...>; Benjamin Reed <br...@ya...>
> Cc: zoo...@li...
> Sent: Friday, April 25, 2008 1:12:57 PM
> Subject: Re: [Zookeeper-user] [Bug?] Notification not guaranteed
>
> In case no one answered your question yet:
>
> A permanent watch (subscription) will guarantee that the client sees
> EVERY change in the thing being watched after the time the permanent
> watch is established. A one-time watch that is reasserted every time you
> read is different, since it can miss events between the time that the
> watch fired and is reasserted.
>
> --Jacob
>
>
> -----Original Message-----
> From: zoo...@li...
> [mailto:zoo...@li...] On Behalf Of Ted
> Dunning
> Sent: Friday, April 25, 2008 11:31 AM
> To: Bartlomiej Niechwiej; Benjamin Reed
> Cc: zoo...@li...
> Subject: Re: [Zookeeper-user] [Bug?] Notification not guaranteed
>
>
> Bartlomiej,
>
> How is a watch that is always reasserted on every read different from a
> permanent watch? The client side implementation has the virtue that it
> collapses multiple changes if the client goes away or gets busy.
>
> Is it just a client API issue?
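
The version-subtraction idea in Ben's 2:29 PM message (re-read after a one-shot watch fires, then generate the missed "changed" events locally) could be sketched roughly as follows. This is only an illustration, not ZooKeeper client code: `VersionTracker`, `on_read`, and the `on_change` callback are hypothetical names. The sketch assumes every re-read is triggered by exactly one delivered watch event, and that the node is never deleted and recreated in between (which would reset its version).

```python
from typing import Callable, Optional


class VersionTracker:
    """Hypothetical helper: turns one-shot watches plus version numbers
    into a locally generated stream of "changed" events."""

    def __init__(self, on_change: Callable[[], None]) -> None:
        self._on_change = on_change
        self._last_version: Optional[int] = None

    def on_read(self, version: int) -> int:
        """Record the version returned by a getData-style read.

        Returns the number of change events generated locally, i.e. the
        changes that happened between the watch firing and this re-read.
        """
        missed = 0
        if self._last_version is not None:
            if version < self._last_version:
                # Assumption: versions only grow while the node exists.
                raise ValueError("version went backwards; node recreated?")
            # One change was already signalled by the watch event that
            # prompted this re-read, so it is not counted as missed.
            missed = max(version - self._last_version - 1, 0)
            for _ in range(missed):
                self._on_change()
        self._last_version = version
        return missed


if __name__ == "__main__":
    events = []
    tracker = VersionTracker(lambda: events.append("/a changed (locally generated)"))
    tracker.on_read(2)         # first read: value1, version 2
    print(tracker.on_read(5))  # re-read after the watch fired: value4, version 5
    print(events)
```

With the versions from the thread (2 on the first read, 5 on the re-read), the tracker generates two local events, matching the two "locally generated" entries in Ben's timeline. It recovers only the count of changes, not the intermediate values, which is the limitation Ben points out for watches in general.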