From: Jonathan L. <jon...@ee...> - 2008-12-10 15:31:14
This is an interesting situation. Let's think about the goal. The goal is to keep pinging the same nodes as long as they are alive, and to replace a node soon after it dies, keeping the total number of up neighbors at approximately MAX_NEIGHBORS. How about the following:

- when you pick a neighbor to ping, you record the time of that event (in a new per-neighbor timestamp, LastPingTime)
- when you first insert a neighbor, set this timestamp to 0.
- when you loop in performMaintenance, remove nodes whose LastPingTime - LastUpdateTime > threshold (like 10 minutes)
- when picking a node, pick the guy with the oldest LastPingTime (not sure about this one -- I don't like situations where you pick the absolute *oldest*, because then we could get in a situation where an old guy sits there, waiting to be kicked out, and we keep picking him to be pinged; I'd prefer to pick randomly, or among random older guys).
- a simple way around this is to have a self-adjusting threshold: we walk through the nodes, and everyone whose LastPingTime is older than MIN_UPDATE_TIME_TO_PING ago (a relative time) gets added to a set; we pick a guy randomly from the set to ping. If the set is empty, we increase MIN_UPDATE_TIME_TO_PING and wait for the next time the function is called. If the set is too large (relative to a fraction of MAX_NEIGHBORS), we decrease it.

We still might get some churn from lost UDP packets. You might also give nodes a second chance to be pinged by including LastPingTime - LastUpdateTime > 1 minute in the set. Note that increases and decreases in MIN_UPDATE_TIME_TO_PING shouldn't be affected by the number of nodes that fall into this category.

We still probably need a way for new potential neighbors to be rejected if our current set of neighbors is fine. Because I still want addNeighbor to really add the neighbor, how about putting a check before calling it to see whether we need more neighbors?

What do you think?

-jonathan
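(In rough Java, the self-adjusting selection described above could look something like the sketch below. A sketch only: getLastPingTime/setLastPingTime, minUpdateTimeToPing, ADJUST_STEP, and SECOND_CHANCE are illustrative names rather than existing Pyxida members, and the adjustment directions simply mirror the description above.)

    // Assumed helper state for this sketch (illustrative only):
    private long minUpdateTimeToPing = 5 * 60 * 1000;       // self-adjusting threshold (ms)
    private static final long ADJUST_STEP = 30 * 1000;      // how much to adjust it by (ms)
    private static final long SECOND_CHANCE = 60 * 1000;    // LastPingTime - LastUpdateTime limit (ms)
    private final java.util.Random rand = new java.util.Random();

    protected RemoteState<T> getNeighborToPing() {
        long now = System.currentTimeMillis();
        java.util.List<RemoteState<T>> due = new java.util.ArrayList<RemoteState<T>>();
        for (RemoteState<T> n : neighbors) {
            // due if not pinged within the threshold, or if its last ping has gone unanswered a while
            boolean notPingedRecently = (now - n.getLastPingTime()) > minUpdateTimeToPing;
            boolean secondChance = (n.getLastPingTime() - n.getLastUpdateTime()) > SECOND_CHANCE;
            if (notPingedRecently || secondChance) {
                due.add(n);
            }
        }
        if (due.isEmpty()) {
            // nobody is due: adjust the threshold, as described above, and wait for the next call
            minUpdateTimeToPing += ADJUST_STEP;
            return null;
        }
        if (due.size() > MAX_NEIGHBORS / 4) {
            // set is too large relative to a fraction of MAX_NEIGHBORS: adjust the other way
            minUpdateTimeToPing = Math.max(ADJUST_STEP, minUpdateTimeToPing - ADJUST_STEP);
        }
        RemoteState<T> pick = due.get(rand.nextInt(due.size()));
        pick.setLastPingTime(now);   // record the ping time on the chosen neighbor
        return pick;
    }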
On Dec 9, 2008, at 11:37 PM, ext Eric Chan-Tin wrote:

> Peter,

> Thanks. That fixed it.

> However, here is another problem I found while deploying Pyxida nodes and running them for a while.

> In performMaintenance(), I loop through every neighbor and if (neighbor.getLastUpdateTime() < current_time - 10 minutes), then I remove that neighbor.

> In NCManager, I call getNeighborToPing, which randomly picks one of the neighbors (32 in this case). Pinging is performed every 10 seconds, so in 10 minutes you can ping at most 60 nodes. However, if a neighbor A is not among those 60 nodes, it will get kicked out in performMaintenance(), although A is still alive (it just hasn't been contacted yet).

> There are some possible options.

> 1) Increase the timeout period from 10 minutes to ?? minutes. This doesn't quite work because, say, in Azureus the number of neighbors is 512, so the timeout needs to be increased even more... It could be a function of MAX_NEIGHBORS...

> 2) When Pyxida replies to a ping request, it will update the neighbor's lastUpdateTime. This does not quite work either because it assumes a symmetric relationship, which is most likely not the case.

> 3) Introduce some more logic to Pyxida so it remembers what nodes it contacted and does not send, say, more than 2 requests in 10 minutes to the same node. This solution isn't that great either.

> Let me know what you guys think. I am currently leaning towards option 1), but making the timeout period a function of MAX_NEIGHBORS, although I haven't quite found the perfect function yet.

> Here are some preliminary results. I compared the number of neighbors which are the same at each reported time (10 minutes). Before the fix, more than 50% of the neighbors changed within 10 minutes. After the fix, only about 30% of the neighbors changed within 10 minutes. It is more stable, but I think 30% is still quite a big number.

> Sorry for the long email :)

> -Eric

> Peter Pietzuch wrote:

>> Hi,

>> Eric Chan-Tin wrote:

>>> for (RemoteState<T> neighbor : neighbors) { //this is line 1029
>>>     if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
>>>         neighbors.remove(neighbor);
>>>     }
>>> }

>> IIRC you can't modify a collection while iterating over it using the above for loop. Instead you have to use a proper Iterator<RemoteState<T>> and use its remove method.

>> Cheers,
>> Peter
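(For reference, the Iterator-based removal Peter describes would look roughly like this inside performMaintenance() -- a sketch only, reusing the names from Eric's snippet:)

    // java.util.Iterator lets us remove while iterating without a ConcurrentModificationException.
    Iterator<RemoteState<T>> it = neighbors.iterator();
    while (it.hasNext()) {
        RemoteState<T> neighbor = it.next();
        if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
            it.remove();   // remove through the iterator, so the loop stays valid
        }
    }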
>>> Jonathan Ledlie wrote:

>>>> On Nov 3, 2008, at 3:46 PM, Eric Chan-Tin wrote:

>>>>> Jonathan,

>>>>> I can insert some code to fix this and it doesn't look too hard, but I do have some concerns.

>>>>>> I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.

>>>>> The default ping time was set to 10 seconds + some Gaussian distribution, so that's what I used, but changing the ping time to 1 minute or more will result in the same effect, except it will take longer for the "cycle" to occur.

>>>>> What I have in mind was the following:

>>>>> 1) NCClient.addNeighbor() would be changed to (in pseudocode):

>>>>> protected boolean addNeighbor(RemoteState<T> guy) {
>>>>>     boolean added = false;
>>>>>     // if neighbors.size() >= MAX_NEIGHBORS
>>>>>     //     check the last update time of each neighbor in neighbors (neighbor.getLastUpdateTime())
>>>>>     //     if any of them is older than some number TIMEOUT, remove the oldest one, then add "guy" to neighbors
>>>>>     //     else don't add "guy" to neighbors
>>>>>     // else add "guy" to neighbors
>>>>> }

>>>> IMO this shouldn't be the behavior of this function, because it is called "addNeighbor." Instead, what could happen is that part of the performMaintenance() function could remove old neighbors.

>>>>> 2) In NCManager, getGossipNode() would just pick a random node from localNC.primaryNC.getNeighbors() to send a request to every 10 seconds.

>>>>> 3) Pyxida will still keep track of all GossipNodes in the network. It could also (say, every 10 minutes) check its NCClient.neighbors list and see if any neighbor is older than TIMEOUT, remove it, then pick a random GossipNode to send a ping request.

>>>>> The main problem with this scheme is that one Pyxida node would most likely stay with the same set of neighbors all the time. Its coordinates would be greatly affected by those 32 nodes (I am assuming MAX_NEIGHBORS = 32).

>>>> Instead of (2), why not have the getNeighborToPing function determine the behavior? If it returns a node, then ping it. If it returns null, then have the NCManager pick some other node to use. If things are hunky-dory, getNeighborToPing() would return a node (a current member of the neighbor set); if we're short on neighbors, then it would return null. The periodic maintenance would have already tossed old nodes, so this would be self-sustaining.

>>>> The "problem" you mention is fairly central to network coordinates in general. Sure, the coordinates will be a little better the more neighbors you have. But in a very large system, talking to everybody isn't feasible. Peter and I have a graph in one of the workshop papers which shows the effect of increasing the number of neighbors.

>>>>> I don't think the network would be partitioned, but it could happen. For example, the bootstrap nodes have each other in their neighbors list. Every new node bootstrapping to the network would contact those bootstrapping nodes first and add them to their neighbors. So every Pyxida node could have as its neighbors the "Bootstrap nodes", which would make Pyxida become a centralized system.

>>>> As long as the initial bootstrap list is large enough, I think you are right that partitions won't be a problem.

>>>>> One last thing: when Pyxida receives a ping request from a node, it does not update its coordinates. Is there any particular reason for that? Is it because Pyxida will not be able to calculate the RTT?

>>>> Yes.

>>>>> Also, there is no "gossiping" of other nodes' coordinates, right? So if A has B in its neighbors list, it will from time to time tell other nodes about B's coordinates. I realize this has a lot of complications, but the original Vivaldi paper was very fuzzy about this point and the previous one, so it seems it is up to the implementors to decide what to do.

>>>> I don't think we are doing this. It could be useful in certain circumstances (e.g., picking potential new neighbors), but we're not, IIRC.

>>>>> I will have to think some more about this, as there are a lot of nasty little side effects that could happen, but these are my thoughts.

>>>>> Thanks,

>>>>> -Eric

>>>> Sounds good.

>>>> -jonathan
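(A rough sketch of the split described above, seen from the NCManager side. pingNode and pickRandomGossipNode are placeholders for whatever NCManager already does, not actual method names; only getNeighborToPing comes from the discussion above:)

    // Runs every ~10 seconds in NCManager.
    RemoteState<T> target = primaryNC.getNeighborToPing();
    if (target != null) {
        // Neighbor set is healthy: keep measuring RTTs to a current neighbor.
        pingNode(target);
    } else {
        // Short on neighbors (or nobody is due yet): gossip with someone new,
        // so NCClient can refill the neighbor set as samples come back.
        pingNode(pickRandomGossipNode());
    }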
>>>>> On Nov 3 2008, Jonathan Ledlie wrote:

>>>>>> Eric,

>>>>>> Based on a look at the code just now, I see where you are coming from: with these params on PL, there isn't anything stopping the neighbor list from circulating pretty quickly through all nodes. In the Azureus implementation, the guys fed to the alg (fed to processSample) weren't changing much because they were coming from the routing finger tables. I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.

>>>>>> Do you want to insert some code that will fix this? Basically we want to only add a new neighbor if our oldest neighbor should be tossed (or if we haven't reached MAX_NEIGHBORS). It's probably best from a clarity standpoint if addNeighbor does continue to actually add the neighbor in all cases. Probably the best thing to do is to have NCClient drop old neighbors internally and have NCManager ask it which nodes should be pinged via getNeighborToPing. So NCClient would continue to have the same behavior on processSample (it would not reject new information), and NCManager and NCClient would cooperate to measure RTTs to current neighbors. We do want to make sure, however, that the gossip set stays large enough so that partitions don't occur.

>>>>>> Do you want to be added to the SourceForge Pyxida developers to make this change? (Please test it out carefully first.)

>>>>>> -jonathan

>>>>>> On Oct 30, 2008, at 11:32 AM, ext Eric Chan-Tin wrote:

>>>>>>> Jonathan,

>>>>>>> From what I saw, eventually you learn about all the nodes in the network. Every 10 seconds, Pyxida picks a random "Gossip" node from its list and sends a request. The Gossip node will reply back with an "upNeighbor", that is, a "Gossip" node that is "up" or, with 10% probability, a node that is "pending". Since, naturally, all the nodes bootstrap off the same set of nodes, every node will eventually know of every other node.

>>>>>>> Pyxida might not be rotating through everybody continuously (although eventually, since it knows everybody, it will have added everybody to its Neighbors list at some point), but it could be removing a node that is "alive". So, if MAX_NEIGHBORS = 32, let's call them A_1, A_2, ..., A_32. Pyxida picks a random Gossip node B, sends it a request, and B replies back. Pyxida will add B to its Neighbors list, see that the size of Neighbors > MAX_NEIGHBORS, and kick A_1 out. Then, from those 32 neighbors A_2, ..., A_32, B, it will calculate the force and apply it to the current system coordinates. Thus, if every 10 seconds Pyxida picks a different node C, D, E, etc., it would keep kicking nodes out of the neighbors list.

>>>>>>> It's the "kicking A_1 out" part that seemed a bit confusing. Churn is a good reason, but you don't check whether A_1 might still be alive, or whether A_1 was just added. I have been printing the neighbors list every 10 minutes, and it is completely different between those 10 minutes.

>>>>>>> One possible problem might be that I am choosing the value of MAX_NEIGHBORS to be too small. Originally it was set to 512 (which was meant for Azureus). Since I am using PlanetLab, where there are about 400-500 "stable" nodes, I changed the value to 32. But I don't think that should be a problem, since log2(512) = 9.

>>>>>>> Thanks,

>>>>>>> -Eric
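(For concreteness, the add-then-evict behavior Eric describes amounts to roughly the following. This is a sketch of the described behavior, not the actual NCClient code, and it assumes neighbors is a list, as "remove the first element" suggests:)

    protected boolean addNeighbor(RemoteState<T> guy) {
        if (!neighbors.contains(guy)) {
            neighbors.add(guy);        // B joins the neighbor list
        }
        if (neighbors.size() > MAX_NEIGHBORS) {
            neighbors.remove(0);       // A_1 is evicted without checking whether it is still alive
        }
        return true;
    }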
>>>>>>> Jonathan Ledlie wrote:

>>>>>>>> Eric,

>>>>>>>> Glad to hear it. Let Peter and me know if you publish anything using Pyxida and we can put it on the web page.

>>>>>>>> "Gossip" nodes are for learning about the network. "Neighbors" are for constructing the coordinate.

>>>>>>>> So, IIRC, the gossip process learns about nodes that may or may not be applied to the neighbor list. That is, we may eventually learn about a large portion of the network via gossiping, but we should only be using MAX_NEIGHBORS at any given time for coordinate computation. The reason we don't want to use a fixed set of neighbors for coordinate computation is that nodes may come and go from the system. However, it should not be the case that we just rotate through everybody, because then, as you note, we'd be computing against everyone's coordinate/latency, and this is counter to the point of network coordinates. With this gradual transience as a desirable quality in mind, I am pretty sure we maintain neighbors in a fairly fixed manner. Does this not seem to be the case to you?

>>>>>>>> -jonathan

>>>>>>>> On Oct 29, 2008, at 9:20 PM, Eric Chan-Tin wrote:

>>>>>>>>> Hi Jonathan and Peter,

>>>>>>>>> First of all, I just wanted to say that Pyxida has been working great on PlanetLab for the past 4 months or so. Getting good data and the experiments are going well.

>>>>>>>>> However, I do have one question/comment.

>>>>>>>>> Every Pyxida node contains a list of Neighbors (in NCClient) and a list of Gossip nodes (they are called neighbors in NCManager, but I will call them gossip for now so as not to confuse them with the 'real' neighbors in NCClient). Every 10 seconds, Pyxida will pick a random node (call it A) from Gossip and send a gossip request (basically asking for rtt, coordinate, error). But Gossip contains all the nodes in the network. The comment in the code said that this was to avoid Pyxida becoming lonely with no one to gossip with, which makes sense.

>>>>>>>>> Once A responds, Pyxida will add A to its Neighbors (if Neighbors does not contain A) and remove the first element from the Neighbors list. This seems a little bit odd. Was there any particular reasoning behind this? Because it would seem like you're getting the coordinate from every node in the network eventually.

>>>>>>>>> Thanks a lot,

>>>>>>>>> -Eric
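(On the responding side, the gossip reply selection described in the messages above -- an up neighbor, or with 10% probability a pending node -- would reduce to roughly this. A sketch with illustrative names; upNodes and pendingNodes stand in for whatever structures actually back the up and pending sets:)

    // Assumed structures for this sketch (illustrative only):
    private final java.util.List<RemoteState<T>> upNodes = new java.util.ArrayList<RemoteState<T>>();
    private final java.util.List<RemoteState<T>> pendingNodes = new java.util.ArrayList<RemoteState<T>>();
    private final java.util.Random rand = new java.util.Random();

    private RemoteState<T> pickNodeForGossipReply() {
        if (!pendingNodes.isEmpty() && rand.nextDouble() < 0.10) {
            return pendingNodes.get(rand.nextInt(pendingNodes.size()));   // 10%: hand out a pending node
        }
        if (upNodes.isEmpty()) {
            return null;                                                  // nothing useful to report
        }
        return upNodes.get(rand.nextInt(upNodes.size()));                 // otherwise a random up neighbor
    }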
>>>>>>>>> Jonathan Ledlie wrote:

>>>>>>>>>> Your inclination is correct. See forwarded message.

>>>>>>>>>> -jonathan

>>>>>>>>>> On Apr 9, 2008, at 1:53 PM, ext Eric Chan-Tin wrote:

>>>>>>>>>>> Thank you for the clarification.

>>>>>>>>>>> I've been reading through your published papers and code, and I must say I learned a lot about how Pyxida works :)

>>>>>>>>>>> I still have a quick and easy (hopefully) question. Each Pyxida node has a small number of neighbors and learns about other nodes through them. I am running on about 200 machines, and it seems like all the Pyxida nodes are adding each other to their upNeighbors set. NCClient.java has MAX_NEIGHBORS set to 512.

>>>>>>>>>>> I wanted to make sure that I am not misreading the code, but shouldn't each Pyxida node keep a small number of neighbors in its upNeighbors set? Otherwise, it will be a lot of updates to send a gossipMessage to all 200 neighbors, and it also means that a Pyxida node has the whole network in its upNeighbors set -- that seems wrong. I am inclined to change the value of MAX_NEIGHBORS, but wanted to make sure that my understanding is correct first.

>>>>>>>>>>> Thank you,

>>>>>>>>>>> Eric