From: Jonathan L. <jon...@ee...> - 2008-12-18 20:51:30
I could be wrong, but I think the way you've written getNeighborToPing is biased toward elements earlier in the neighbors set, because you traverse the neighbor set in the same order each time. I'll see if I can code up a fix.

-jonathan
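A minimal sketch of what such a fix could look like, assuming it sits inside NCClient<T> with java.util.List and java.util.Random in scope: sum the ages once, draw a single uniform threshold against the total, then walk the set. Per-element coin flips favor whichever elements are examined first; one threshold against the cumulative weight removes the order dependence.

    // Sketch: weighted pick without order bias. Older neighbors (larger
    // age since last update) are proportionally more likely to be chosen,
    // following Eric's weighting further down the thread.
    private final Random rand = new Random();

    public RemoteState<T> getNeighborToPing(List<RemoteState<T>> neighbors) {
        if (neighbors.isEmpty()) return null;
        long now = System.currentTimeMillis();
        double total = 0;
        for (RemoteState<T> n : neighbors) {
            total += Math.max(0L, now - n.getLastUpdateTime());
        }
        if (total == 0) // everyone just updated: fall back to a uniform pick
            return neighbors.get(rand.nextInt(neighbors.size()));
        double threshold = rand.nextDouble() * total; // one draw in [0, total)
        double cumulative = 0;
        for (RemoteState<T> n : neighbors) {
            cumulative += Math.max(0L, now - n.getLastUpdateTime());
            if (cumulative > threshold) return n;
        }
        return neighbors.get(neighbors.size() - 1); // floating-point slack
    }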
On Dec 18, 2008, at 11:35 AM, Eric Chan-Tin wrote:

> Sorry for the delay. Was busy with final exams and projects.
>
> I made the changes. They can be found at
>
> http://www-users.cs.umn.edu/~dchantin/NCClient.java
> http://www-users.cs.umn.edu/~dchantin/NCManager.java
> http://www-users.cs.umn.edu/~dchantin/RemoteState.java
>
> All the changes are enclosed within
> //ERIC START
> ...
> //ERIC END
>
> I compared every reported "neighbors list" of each Pyxida node (about 370 total), counted how many neighbors did not change within 10 minutes, and then averaged over all the nodes.
>
> Original: 43.37% of the neighbors stayed the same on average.
>
> Fix0 (before we found the interesting situation): 74.73%
>
> Fix1 (with the weighted probability of picking a neighbor to ping): 95.79%
>
> Fix2 (with the weighted probability and the MIN_UPDATE_PING): 98.81%
>
> Also, the reported error, rrl, etc. were much lower in the fixed Pyxida than in the original one (this may not be significant, since the original experiment was performed 1-2 months ago), but as expected, it was not higher.
>
> You can take a look at the code and test it if you want. If everything looks fine, I'll remove some of the debugging code and other "useless" comments (like //ERIC). Also, let me know which "fix" you prefer. I will then email you the updated code.
>
> Thanks,
>
> -Eric
>
> On Dec 10 2008, Jonathan Ledlie wrote:
>
>> Seems reasonable.
>>
>> Both ideas (lastPingTime and a weighted probability based on age) seem to be independent.
>>
>> Some of the code is more CPU intensive than it should be (e.g., looping through potentially long lists), and that is a slight downside to the function approach vs. a set, but this probably isn't a big deal.
>>
>> -jonathan
>>
>> On Dec 10, 2008, at 10:40 PM, Eric Chan-Tin wrote:
>>
>>> Okay, I think I might have misunderstood you. Your idea of lastPingTime looks interesting.
>>>
>>> My idea is to introduce a probability function. Right now, it just randomly picks one of the neighbors. Instead of picking a neighbor uniformly at random, it would pick a neighbor based on some function, so the neighbor whose lastUpdateTime is oldest would have a higher chance of being picked than a neighbor who was just updated.
>>>
>>> Does that make more sense?
>>>
>>> -Eric
>>>
>>> Jonathan Ledlie wrote:
>>>>
>>>> Hmm. The advantage of lastPingTime is that you learn whether or not you've tried to contact a neighbor and they haven't responded. Then you can use that knowledge to determine whether or not to toss them.
>>>>
>>>> Maybe I'm not understanding your suggestion then.
>>>>
>>>> -jonathan
>>>>
>>>> On Dec 10, 2008, at 12:51 PM, Eric Chan-Tin wrote:
>>>>
>>>>> That's a good idea. Here is a little twist on it.
>>>>>
>>>>> Put a "weight" on each neighbor, based on the lastUpdateTime: the larger currentTime - lastUpdateTime is, the higher the weight. Then in getNeighborToPing, neighbors with higher weight have a higher probability of getting picked than neighbors with lower weight.
>>>>>
>>>>> The weight could be ((currentTime - lastUpdateTime) / 15 minutes). It's 15 minutes rather than 10 so as to avoid a probability of 1 if the lastUpdateTime was the previous performMaintenance. This should also avoid the problem of always pinging the oldest neighbor, since it's about probability (weight).
>>>>>
>>>>> It seems a little bit less complicated than having lastPingTime and MIN_UPDATE_TIME.
>>>>>
>>>>> -Eric
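Read literally, the weight Eric describes is just the age of the last update normalized by 15 minutes. A tiny sketch; the millisecond constant is the only thing added here:

    // Eric's proposed weight: age of the last update, normalized by
    // 15 minutes. With a 10-minute maintenance period the ratio stays
    // below 1, which is the reason for 15 rather than 10.
    static double weight(long lastUpdateTimeMs, long nowMs) {
        final double FIFTEEN_MINUTES_MS = 15 * 60 * 1000.0;
        return (nowMs - lastUpdateTimeMs) / FIFTEEN_MINUTES_MS;
    }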
>>>>> On Dec 10 2008, Jonathan Ledlie wrote:
>>>>>
>>>>>> This is an interesting situation. Let's think about the goal. The goal is to keep pinging the same nodes as long as they are alive, and to replace a node soon after it dies, keeping the total number of up neighbors at approximately MAX_NEIGHBORS.
>>>>>>
>>>>>> How about the following:
>>>>>> - when you pick a neighbor to ping, you set a time on that event (in a new per-neighbor timestamp LastPingTime)
>>>>>> - when you first insert a neighbor, set this timestamp to 0.
>>>>>> - when you loop in performMaintenance, remove nodes whose LastPingTime - LastUpdateTime > threshold (like 10 minutes)
>>>>>> - when picking a node, pick the guy with the oldest LastPingTime (not sure about this one -- I don't like situations where you pick the absolute *oldest*, because then we could get in a situation where an old guy sits there, waiting to be kicked out, and we keep picking him to be pinged; I'd prefer to pick randomly, or among random older guys).
>>>>>> - a simple way around this is to have a self-adjusting threshold: we walk through the nodes, and everyone whose LastPingTime is older than MIN_UPDATE_TIME_TO_PING ago (a relative time) gets added to a set; we pick a guy randomly from the set to ping. If the set is empty, we decrease MIN_UPDATE_TIME_TO_PING and wait for the next time the function is called. If the set is too large (relative to a fraction of MAX_NEIGHBORS), we increase it.
>>>>>>
>>>>>> We still might get some churn from lost UDP packets. You might also give nodes a second chance to be pinged by including LastPingTime - LastUpdateTime > 1 minute in the set. Note that increases and decreases in MIN_UPDATE_TIME_TO_PING shouldn't be affected by the number of nodes that fall into this category.
>>>>>>
>>>>>> We still probably need a way for new potential neighbors to be rejected if our current set of neighbors is fine. Because I still want addNeighbor to really add the neighbor, how about putting a check before calling it to see if we need more neighbors.
>>>>>>
>>>>>> What thinks?
>>>>>>
>>>>>> -jonathan
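A sketch of that self-adjusting selection, assuming RemoteState gains getLastPingTime()/setLastPingTime() accessors and a Random field rand exists; the halving/doubling steps, the 30-second starting value, and the MAX_NEIGHBORS / 4 target are illustration values, not anything agreed in the thread:

    // Self-adjusting threshold sketch (inside NCClient<T>). Candidates
    // are nodes not pinged within minUpdateTimeToPing, plus the
    // "second chance" nodes whose last ping has gone unanswered for
    // over a minute. Only the first group drives the threshold
    // adjustment, per Jonathan's note above.
    private long minUpdateTimeToPing = 30 * 1000L; // assumed starting value

    public RemoteState<T> getNeighborToPing(List<RemoteState<T>> neighbors) {
        long now = System.currentTimeMillis();
        List<RemoteState<T>> candidates = new ArrayList<RemoteState<T>>();
        int stale = 0;
        for (RemoteState<T> n : neighbors) {
            if (now - n.getLastPingTime() > minUpdateTimeToPing) {
                candidates.add(n);
                stale++;
            } else if (n.getLastPingTime() - n.getLastUpdateTime() > 60 * 1000L) {
                candidates.add(n); // second chance: pinged but no answer yet
            }
        }
        if (stale == 0) {
            minUpdateTimeToPing /= 2;  // nobody qualifies: loosen for next call
        } else if (stale > MAX_NEIGHBORS / 4) {
            minUpdateTimeToPing *= 2;  // too many qualify: tighten
        }
        if (candidates.isEmpty()) return null;
        RemoteState<T> pick = candidates.get(rand.nextInt(candidates.size()));
        pick.setLastPingTime(now);
        return pick;
    }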
>>>>>> On Dec 9, 2008, at 11:37 PM, ext Eric Chan-Tin wrote:
>>>>>>
>>>>>>> Peter,
>>>>>>>
>>>>>>> Thanks. That fixed it.
>>>>>>>
>>>>>>> However, here is another problem I found while deploying Pyxida and running it for a while.
>>>>>>>
>>>>>>> In performMaintenance(), I loop through every neighbor, and if (neighbor.getLastUpdateTime() < current_time - 10 minutes), I remove that neighbor.
>>>>>>>
>>>>>>> In NCManager, I call getNeighborToPing, which randomly picks one of the neighbors (32 in this case). Pinging is performed every 10 seconds, so in 10 minutes you can ping 60 nodes. However, if a neighbor A is not among those 60 nodes, it will get kicked out in performMaintenance(), although A is still alive (it just hasn't been contacted yet).
>>>>>>>
>>>>>>> There are some possible options.
>>>>>>>
>>>>>>> 1) Increase the timeout period from 10 minutes to ?? minutes. This doesn't quite work because, say in Azureus, the number of neighbors is 512, so the timeout needs to be increased even more... It could be a function of MAX_NEIGHBORS...
>>>>>>>
>>>>>>> 2) When Pyxida replies to a ping request, it will update the neighbor's lastUpdateTime. This does not quite work either, because it assumes a symmetric relationship, which is most likely not the case.
>>>>>>>
>>>>>>> 3) Introduce some more logic into Pyxida so it remembers which nodes it contacted and does not send, say, more than 2 requests in 10 minutes to the same node. This solution isn't that great either.
>>>>>>>
>>>>>>> Let me know what you guys think. I am currently leaning towards option 1), but making the timeout period a function of MAX_NEIGHBORS, although I haven't quite found the perfect function yet.
>>>>>>>
>>>>>>> Here are some preliminary results. I compared the number of neighbors which are the same at each reported time (10 minutes). Before the fix, more than 50% of the neighbors changed within 10 minutes. After the fix, only about 30% changed within 10 minutes. It is more stable, but I think 30% is still quite a big number.
>>>>>>>
>>>>>>> Sorry for the long email :)
>>>>>>>
>>>>>>> -Eric
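For option 1, one hedged candidate for that function: with a uniform random pick every ping interval, a given neighbor waits about MAX_NEIGHBORS ping intervals between probes in expectation, so the timeout could be a small multiple of that. A sketch; the factor of 3 is invented, not a measured value:

    // Hypothetical timeout for option 1. The safety factor absorbs the
    // variance of random selection and lost UDP replies.
    static long neighborTimeoutMs(int maxNeighbors, long pingIntervalMs) {
        return 3L * maxNeighbors * pingIntervalMs;
    }
    // 32 neighbors x 10s pings  -> 16 minutes
    // 512 neighbors x 10s pings -> ~4.3 hours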
>>>>>>> Peter Pietzuch wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Eric Chan-Tin wrote:
>>>>>>>>> for (RemoteState<T> neighbor : neighbors) { // this is line 1029
>>>>>>>>>     if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
>>>>>>>>>         neighbors.remove(neighbor);
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>
>>>>>>>> IIRC you can't modify a collection while iterating over it using the above for loop. Instead you have to use a proper Iterator<RemoteState<T>> and use its remove method.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Peter
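Spelled out, the fix Peter describes looks like this, reusing the identifiers from the quoted loop:

    // The quoted loop rewritten with an explicit Iterator. Calling
    // remove() on the iterator, rather than on the collection, avoids
    // the ConcurrentModificationException the for-each form triggers.
    Iterator<RemoteState<T>> it = neighbors.iterator();
    while (it.hasNext()) {
        RemoteState<T> neighbor = it.next();
        if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
            it.remove();
        }
    }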
>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>> On Nov 3, 2008, at 3:46 PM, Eric Chan-Tin wrote:
>>>>>>>>>>
>>>>>>>>>>> Jonathan,
>>>>>>>>>>>
>>>>>>>>>>> I can insert some code to fix this, and it doesn't look too hard, but I do have some concerns.
>>>>>>>>>>>
>>>>>>>>>>>> I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>
>>>>>>>>>>> The default ping time was set to 10 seconds plus some Gaussian noise, so that's what I used, but changing the ping time to 1 minute or more will result in the same effect, except it will take longer for the "cycle" to occur.
>>>>>>>>>>>
>>>>>>>>>>> What I have in mind is the following:
>>>>>>>>>>>
>>>>>>>>>>> 1) NCClient.addNeighbor() would be changed to (in pseudocode)
>>>>>>>>>>>
>>>>>>>>>>> protected boolean addNeighbor(RemoteState<T> guy) {
>>>>>>>>>>>     boolean added = false;
>>>>>>>>>>>     // if neighbors.size() >= MAX_NEIGHBORS
>>>>>>>>>>>     //   check the last update time of each neighbor in neighbors
>>>>>>>>>>>     //   (neighbor.getLastUpdateTime())
>>>>>>>>>>>     //   if any of them is older than some number TIMEOUT,
>>>>>>>>>>>     //     remove the oldest one, then add "guy" to neighbors
>>>>>>>>>>>     //   else don't add "guy" to neighbors
>>>>>>>>>>>     // else add "guy" to neighbors
>>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> IMO this shouldn't be the behavior of this function, because it is called "addNeighbor." Instead, part of the performMaintenance() function could remove old neighbors.
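For reference, a direct runnable reading of the pseudocode in item 1), with TIMEOUT as an assumed constant (and with Jonathan's caveat standing that the eviction may belong in performMaintenance() instead):

    // Sketch: only add "guy" if there is room, or if the oldest
    // neighbor is stale enough to evict. TIMEOUT is assumed.
    protected boolean addNeighbor(RemoteState<T> guy) {
        if (neighbors.size() >= MAX_NEIGHBORS) {
            RemoteState<T> oldest = null;
            for (RemoteState<T> n : neighbors) {
                if (oldest == null || n.getLastUpdateTime() < oldest.getLastUpdateTime()) {
                    oldest = n;
                }
            }
            long now = System.currentTimeMillis();
            if (oldest == null || now - oldest.getLastUpdateTime() <= TIMEOUT) {
                return false;           // everyone is fresh: don't add guy
            }
            neighbors.remove(oldest);   // evict the stale oldest neighbor
        }
        neighbors.add(guy);
        return true;
    }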
>>>>>>>>>>> 2) In NCManager, getGossipNode() would just pick a random node from localNC.primaryNC.getNeighbors() to send a request to every 10 seconds.
>>>>>>>>>>>
>>>>>>>>>>> 3) Pyxida will still keep track of all GossipNodes in the network. It could also (say, every 10 minutes) check its NCClient.neighbors list, see if any neighbor is older than TIMEOUT, remove it, then pick a random GossipNode to send a ping request to.
>>>>>>>>>>>
>>>>>>>>>>> The main problem with this scheme is that one Pyxida node would most likely stay with the same set of neighbors all the time. Its coordinates would be greatly affected by those 32 nodes (I am assuming MAX_NEIGHBORS = 32).
>>>>>>>>>>
>>>>>>>>>> Instead of (2), why not have the getNeighborToPing function determine the behavior? If it returns a node, then ping it. If it returns null, then have the NCManager pick some other node to use. If things are hunky-dory, getNeighborToPing() would return a node (a current member of the neighbor set); if we're short on neighbors, then it would return null. The periodic maintenance would have already tossed old nodes, so this would be self-sustaining.
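The NCManager half of that contract might look like the following sketch; sendPingRequest() and onPingTimer() are stand-in names rather than actual Pyxida methods:

    // Hypothetical NCManager logic for Jonathan's contract: a non-null
    // return means "measure a current neighbor"; null means "we're short
    // on neighbors", so recruit by probing a random gossip node instead.
    void onPingTimer() {
        RemoteState<T> target = localNC.getNeighborToPing();
        if (target != null) {
            sendPingRequest(target);          // RTT sample for a current neighbor
        } else {
            sendPingRequest(getGossipNode()); // try to recruit a new neighbor
        }
    }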
>>>>>>>>>>
>>>>>>>>>> The "problem" you mention is fairly central to network coordinates in general. Sure, the coordinates will be a little better the more neighbors you have. But in a very large system, talking to everybody isn't feasible. Peter and I have a graph in one of the workshop papers which shows the effect of increasing the number of neighbors.
>>>>>>>>>>
>>>>>>>>>>> I don't think the network would be partitioned, but it could happen. For example, the bootstrap nodes have each other in their neighbors lists. Every new node bootstrapping to the network would contact those bootstrap nodes first and add them to its neighbors. So every Pyxida node could have as its neighbors the "bootstrap nodes", which would make Pyxida become a centralized system.
>>>>>>>>>>
>>>>>>>>>> As long as the initial bootstrap list is large enough, I think you are right that partitions won't be a problem.
>>>>>>>>>>
>>>>>>>>>>> One last thing: when Pyxida receives a ping request from a node, it does not update its coordinates. Is there any particular reason for that? Is it because Pyxida will not be able to calculate the RTT?
>>>>>>>>>>
>>>>>>>>>> Yes.
>>>>>>>>>>
>>>>>>>>>>> Also, there is no "gossiping" of other nodes' coordinates, right? That is, if A has B in its neighbors list, would A from time to time tell other nodes about B's coordinates? I realize this has a lot of complications, but the original Vivaldi paper was very fuzzy about this point and the previous one; it seems it is up to the implementors to decide what to do.
>>>>>>>>>>
>>>>>>>>>> I don't think we are doing this. It could be useful in certain circumstances (e.g., picking potential new neighbors), but we're not, IIRC.
>>>>>>>>>>
>>>>>>>>>>> I will have to think some more about this, as there are a lot of nasty little side effects that could happen, but these are my thoughts.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> -Eric
>>>>>>>>>>
>>>>>>>>>> Sounds good.
>>>>>>>>>>
>>>>>>>>>> -jonathan
>>>>>>>>>>
>>>>>>>>>>> On Nov 3 2008, Jonathan Ledlie wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Eric,
>>>>>>>>>>>>
>>>>>>>>>>>> Based on a look at the code just now, I see where you are coming from: with these params on PL, there isn't anything stopping the neighbor list from circulating pretty quickly through all nodes. In the Azureus implementation, the guys fed to the alg (fed to processSample) weren't changing much, because they were coming from the routing finger tables. I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to insert some code that will fix this? Basically, we want to only add a new neighbor if our oldest neighbor should be tossed (or if we haven't reached MAX_NEIGHBORS). It's probably best from a clarity standpoint if addNeighbor does continue to actually add the neighbor in all cases. Probably the best thing to do is to have NCClient drop old neighbors internally and have NCManager ask it which nodes should be pinged via getNeighborToPing. So NCClient would continue to have the same behavior on processSample (it would not reject new information), and NCManager and NCClient would cooperate to measure RTTs to current neighbors. We do want to make sure, however, that the gossip set stays large enough so that partitions don't occur.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to be added to the SourceForge Pyxida developers to make this change? (Please test it out carefully first.)
>>>>>>>>>>>>
>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 30, 2008, at 11:32 AM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> From what I saw, eventually you learn about all the nodes in the network. Every 10 seconds, Pyxida picks a random "Gossip" node from its list and sends a request. The Gossip node replies with an "upNeighbor", that is, a "Gossip" node that is "up", or, with 10% probability, a node that is "pending". Since all the nodes naturally bootstrap off the same set of nodes, every node will eventually know of every other node.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pyxida might not be rotating through everybody continuously (although eventually, since it knows everybody, it will have added everybody to its Neighbors list at some point), but it could be removing a node that is "alive". So, if MAX_NEIGHBORS = 32, let's call the neighbors A_1, A_2, ..., A_32. Pyxida picks a random Gossip node B, sends it a request, and B replies. Pyxida will add B to its Neighbors list, see that the size of Neighbors > MAX_NEIGHBORS, and kick A_1 out. Then from those 32 neighbors A_2, ..., A_32, B, it will calculate the force and apply it to the current system coordinates. Thus, if every 10 seconds Pyxida picks a different node C, D, E, etc., it would keep kicking nodes out of the neighbors list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's the "kicking A_1 out" part that seemed a bit confusing. Churn is a good reason, but you don't check whether A_1 might still be alive, or whether A_1 was just added. I have been printing the neighbors list every 10 minutes, and it is completely different from one printout to the next.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One possible problem might be that I am choosing the value of MAX_NEIGHBORS to be too small. Originally it was set to 512 (which was meant for Azureus). Since I am using PlanetLab, where there are about 400-500 "stable" nodes, I changed the value to 32. But I don't think that should be a problem, since log2(512) = 9.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Eric
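In code, the insertion behavior Eric describes amounts to something like this sketch (onGossipReply is a stand-in name; the point is that nothing checks the liveness or age of the evicted node):

    // Sketch of the insertion path Eric describes: adding a fresh gossip
    // reply evicts the head of the list once the cap is exceeded.
    void onGossipReply(RemoteState<T> b) {
        if (!neighbors.contains(b)) {
            neighbors.add(b);
            if (neighbors.size() > MAX_NEIGHBORS) {
                neighbors.remove(0); // kick A_1 out, alive or not
            }
        }
    }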
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>> Eric,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Glad to hear it. Let Peter and me know if you publish anything using Pyxida, and we can put it on the web page.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Gossip" nodes are for learning about the network. "Neighbors" are for constructing the coordinate.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, IIRC, the gossip process learns about nodes that may or may not be applied to the neighbor list. That is, we may eventually learn about a large portion of the network via gossiping, but we should only be using MAX_NEIGHBORS at any given time for coordinate computation. The reason we don't want to use a fixed set of neighbors for coordinate computation is that nodes may come and go from the system. However, it should not be the case that we just rotate through everybody, because then, as you note, we'd be computing against everyone's coordinate/latency, and this is counter to the point of network coordinates. With this gradual transience as a desirable quality in mind, I am pretty sure we maintain neighbors in a fairly fixed manner. Does this not seem to be the case to you?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Oct 29, 2008, at 9:20 PM, Eric Chan-Tin wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Jonathan and Peter,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First of all, just wanted to say that Pyxida has been working great on PlanetLab for the past 4 months or so. Getting good data, and the experiments are going well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, I do have one question/comment.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Every Pyxida node contains a list of Neighbors (in NCClient) and a list of Gossip nodes (they are called neighbors in NCManager, but I will call them gossip for now so as not to confuse them with the 'real' neighbors in NCClient). Every 10 seconds, Pyxida will pick a random node (call it A) from Gossip and send a gossip request (basically asking for rtt, coordinate, error). But Gossip contains all the nodes in the network. The comment in the code said that was to avoid Pyxida becoming lonely with no one to gossip with, which makes sense.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once A responds, Pyxida will add A to its Neighbors (if Neighbors does not contain A) and remove the first element from the Neighbors list. This seems a little bit odd. Was there any particular reasoning behind this? Because it would seem like you're getting the coordinate from every node in the network eventually.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks a lot,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>>>> Your inclination is correct. See forwarded message.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Apr 9, 2008, at 1:53 PM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for the clarification.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've been reading through your published papers and code, and I must say I learned a lot about how Pyxida works :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I still have a quick and (hopefully) easy question. Each Pyxida node keeps a small number of neighbors and learns about other nodes through them. I am running on about 200 machines, and it seems like all the Pyxida nodes are adding each other to their upNeighbors sets. NCClient.java has MAX_NEIGHBORS set to 512.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I wanted to make sure that I am not misreading the code, but shouldn't each Pyxida node keep a small number of neighbors in its upNeighbors set? Otherwise it is a lot of updates to send a gossipMessage to all 200 neighbors, and it also means that a Pyxida node has the whole network in its upNeighbors set - that seems wrong. I am inclined to change the value of MAX_NEIGHBORS, but wanted to make sure that my understanding is correct first.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eric