From: Jonathan L. <jon...@ee...> - 2008-12-18 20:05:27
Eric,

Do you have a PL Pyxida slice that I can connect to to try this out? That is, I'll run it on my machine, but connect to your slice via the right ports.

-jonathan

On Dec 18, 2008, at 11:35 AM, Eric Chan-Tin wrote:

> Sorry for the delay. Was busy with final exams and projects.
>
> I made the changes. They can be found at
>
> http://www-users.cs.umn.edu/~dchantin/NCClient.java
> http://www-users.cs.umn.edu/~dchantin/NCManager.java
> http://www-users.cs.umn.edu/~dchantin/RemoteState.java
>
> All the changes are enclosed within
> //ERIC START
> ...
> //ERIC END
>
> I compared the reported "neighbors list" of each Pyxida node (about 370 total), counted how many neighbors did not change within 10 minutes, and then averaged over all the nodes.
>
> Original: 43.37% of the neighbors stayed the same on average.
>
> Fix0 (before we found the interesting situation): 74.73%
>
> Fix1 (with the weighted probability of picking a neighbor to ping): 95.79%
>
> Fix2 (with the weighted probability and MIN_UPDATE_PING): 98.81%
>
> Also, the reported error, rrl, etc. were much lower in the fixed Pyxida than in the original one (this may not be significant, since the original experiment was performed 1-2 months ago), but as expected, they were not higher.
>
> You can take a look at the code and test it if you want. If everything looks fine, I'll remove some of the debugging code and other "useless" comments (like //ERIC). Also, let me know which "fix" you prefer. I will then email you the updated code.
>
> Thanks,
>
> -Eric
>
> On Dec 10 2008, Jonathan Ledlie wrote:
>
>> Seems reasonable.
>>
>> Both ideas (lastPingTime and a weighted probability based on age) seem to be independent.
>>
>> Some of the code is more CPU-intensive than it should be (e.g., looping through potentially long lists), and that is a slight downside to the function approach vs. a set, but this probably isn't a big deal.
>>
>> -jonathan
>>
>> On Dec 10, 2008, at 10:40 PM, Eric Chan-Tin wrote:
>>
>>> Okay, I think I might have misunderstood you. Your idea of lastPingTime looks interesting.
>>>
>>> My idea is to introduce a probability function. Right now, getNeighborToPing just randomly picks one of the neighbors. Instead of picking uniformly at random, it could pick a neighbor based on some function, so that the neighbor whose lastUpdateTime is oldest has a higher chance of being picked than a neighbor that was just updated.
>>>
>>> Does that make more sense?
>>>
>>> -Eric
>>>
>>> Jonathan Ledlie wrote:
>>>>
>>>> Hmm. The advantage of lastPingTime is that you learn whether or not you've tried to contact a neighbor and they haven't responded. Then you can use that knowledge to determine whether or not to toss them.
>>>>
>>>> Maybe I'm not understanding your suggestion then.
>>>>
>>>> -jonathan
>>>>
>>>> On Dec 10, 2008, at 12:51 PM, Eric Chan-Tin wrote:
>>>>
>>>>> That's a good idea. Here is a little twist on it.
>>>>>
>>>>> Put a "weight" on each neighbor, based on its lastUpdateTime: the larger currentTime - lastUpdateTime is, the higher the weight. Then, in getNeighborToPing, neighbors with higher weight have a higher probability of getting picked than neighbors with lower weight.
>>>>>
>>>>> The weight could be ((currentTime - lastUpdateTime) / 15 minutes). It's not 10 minutes so as to avoid a probability of 1 if the lastUpdateTime was the previous performMaintenance. This should also avoid the problem of always pinging the oldest neighbor, since it's about probability (weight).
>>>>>
>>>>> It seems a little bit less complicated than having lastPingTime and MIN_UPDATE_TIME.
>>>>>
>>>>> -Eric
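A minimal Java sketch of the age-weighted selection Eric describes above. Pyxida's actual RemoteState carries much more state; the Neighbor interface, the class name, and the window constant here are illustrative assumptions, not Pyxida's API.

import java.util.List;
import java.util.Random;

// Illustrative stand-in for Pyxida's RemoteState; only the timestamp matters here.
interface Neighbor {
    long getLastUpdateTime(); // milliseconds
}

class AgeWeightedPicker {
    // 15 minutes, per the formula above; chosen larger than the 10-minute
    // maintenance period so a single weight never reaches 1.
    private static final double WINDOW_MS = 15 * 60 * 1000;
    private final Random rand = new Random();

    // Pick a neighbor with probability proportional to how stale it is:
    // weight_i = (now - lastUpdateTime_i) / WINDOW_MS.
    Neighbor pickNeighborToPing(List<Neighbor> neighbors, long now) {
        if (neighbors.isEmpty()) return null;
        double[] weights = new double[neighbors.size()];
        double total = 0;
        for (int i = 0; i < neighbors.size(); i++) {
            weights[i] = (now - neighbors.get(i).getLastUpdateTime()) / WINDOW_MS;
            total += weights[i];
        }
        // Roulette-wheel selection over the accumulated weights.
        double r = rand.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r <= 0) return neighbors.get(i);
        }
        return neighbors.get(neighbors.size() - 1); // floating-point fallback
    }
}

A freshly updated neighbor gets weight near 0 and is almost never re-pinged immediately, while a stale one is picked with high probability but never deterministically, which is the property Eric wants.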
>>>>> On Dec 10 2008, Jonathan Ledlie wrote:
>>>>>
>>>>>> This is an interesting situation. Let's think about the goal. The goal is to keep pinging the same nodes as long as they are alive, and to replace a node soon after it dies, keeping the total number of up neighbors at approximately MAX_NEIGHBORS.
>>>>>>
>>>>>> How about the following:
>>>>>>
>>>>>> - When you pick a neighbor to ping, you set a time on that event (in a new per-neighbor timestamp, LastPingTime).
>>>>>> - When you first insert a neighbor, set this timestamp to 0.
>>>>>> - When you loop in performMaintenance, remove nodes whose LastPingTime - LastUpdateTime > threshold (like 10 minutes).
>>>>>> - When picking a node, pick the guy with the oldest LastPingTime. (Not sure about this one -- I don't like situations where you pick the absolute *oldest*, because then we could get into a situation where an old guy sits there, waiting to be kicked out, and we keep picking him to be pinged; I'd prefer to pick randomly, or among random older guys.)
>>>>>> - A simple way around this is to have a self-adjusting threshold: we walk through the nodes, and everyone whose LastPingTime is older than MIN_UPDATE_TIME_TO_PING ago (a relative time) gets added to a set; we pick a guy randomly from the set to ping. If the set is empty, we decrease MIN_UPDATE_TIME_TO_PING and wait for the next time the function is called. If the set is too large (relative to a fraction of MAX_NEIGHBORS), we increase it.
>>>>>>
>>>>>> We still might get some churn from lost UDP packets. You might also give nodes a second chance to be pinged by including LastPingTime - LastUpdateTime > 1 minute in the set. Note that increases and decreases in MIN_UPDATE_TIME_TO_PING shouldn't be affected by the number of nodes that fall into this category.
>>>>>>
>>>>>> We still probably need a way for new potential neighbors to be rejected if our current set of neighbors is fine. Because I still want addNeighbor to really add the neighbor, how about putting a check before calling it to see if we need more neighbors?
>>>>>>
>>>>>> What thinks?
>>>>>>
>>>>>> -jonathan
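A sketch of the self-adjusting threshold above. The adjustment direction assumed here: an empty eligible set means the threshold is too strict, so it is relaxed; a too-large set tightens it. The class name, the starting value, and the MAX_NEIGHBORS/4 fraction are illustrative assumptions, not Pyxida code.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative stand-in: only the ping timestamp matters here.
interface PingableNeighbor {
    long getLastPingTime(); // milliseconds; 0 for a freshly inserted neighbor
}

class SelfAdjustingPicker {
    private static final int MAX_NEIGHBORS = 32;
    private long minUpdateTimeToPing = 60 * 1000; // starting value is a guess
    private final Random rand = new Random();

    // Everyone whose last ping is older than the threshold is eligible;
    // pick one eligible neighbor at random. The threshold adapts so the
    // eligible set stays small but non-empty.
    PingableNeighbor pick(List<PingableNeighbor> neighbors, long now) {
        List<PingableNeighbor> eligible = new ArrayList<PingableNeighbor>();
        for (PingableNeighbor n : neighbors) {
            if (now - n.getLastPingTime() > minUpdateTimeToPing) {
                eligible.add(n);
            }
        }
        if (eligible.isEmpty()) {
            minUpdateTimeToPing /= 2; // relax; try again on the next call
            return null;
        }
        if (eligible.size() > MAX_NEIGHBORS / 4) {
            minUpdateTimeToPing *= 2; // tighten for future rounds
        }
        return eligible.get(rand.nextInt(eligible.size()));
    }
}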
>>>>>> On Dec 9, 2008, at 11:37 PM, ext Eric Chan-Tin wrote:
>>>>>>
>>>>>>> Peter,
>>>>>>>
>>>>>>> Thanks. That fixed it.
>>>>>>>
>>>>>>> However, here is another problem I found while deploying Pyxida and running it for a while.
>>>>>>>
>>>>>>> In performMaintenance(), I loop through every neighbor, and if (neighbor.getLastUpdateTime() < current_time - 10 minutes), I remove that neighbor.
>>>>>>>
>>>>>>> In NCManager, I call getNeighborToPing, which randomly picks one of the neighbors (32 in this case). Pinging is performed every 10 seconds, so in 10 minutes you can ping 60 nodes. However, if a neighbor A is not among those 60 nodes, it will get kicked out in performMaintenance(), although A is still alive (it just hasn't been contacted yet).
>>>>>>>
>>>>>>> There are some possible options.
>>>>>>>
>>>>>>> 1) Increase the timeout period from 10 minutes to ?? minutes. This doesn't quite work because in, say, Azureus, the number of neighbors is 512, so the timeout needs to be increased even more... It could be a function of MAX_NEIGHBORS...
>>>>>>>
>>>>>>> 2) When Pyxida replies to a ping request, have it update that neighbor's lastUpdateTime. This does not quite work either, because it assumes a symmetric relationship, which is most likely not the case.
>>>>>>>
>>>>>>> 3) Introduce some more logic into Pyxida so it remembers which nodes it contacted and does not send, say, more than 2 requests in 10 minutes to the same node. This solution isn't that great either.
>>>>>>>
>>>>>>> Let me know what you guys think. I am currently leaning towards option 1), but making the timeout period a function of MAX_NEIGHBORS, although I haven't quite found the perfect function yet.
>>>>>>>
>>>>>>> Here are some preliminary results. I compared the number of neighbors which are the same at each reported time (every 10 minutes). Before the fix, more than 50% of the neighbors changed within 10 minutes. After the fix, only about 30% changed within 10 minutes. It is more stable, but I think 30% is still quite a big number.
>>>>>>>
>>>>>>> Sorry for the long email :)
>>>>>>>
>>>>>>> -Eric
>>>>>>>
>>>>>>> Peter Pietzuch wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Eric Chan-Tin wrote:
>>>>>>>>> for (RemoteState<T> neighbor : neighbors) { // this is line 1029
>>>>>>>>>     if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
>>>>>>>>>         neighbors.remove(neighbor);
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>
>>>>>>>> IIRC you can't modify a collection while iterating over it using the above for loop. Instead you have to use a proper Iterator<RemoteState<T>> and use its remove method.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Peter
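Peter's suggestion, sketched as a compilable unit: removing through the Iterator is the safe way to delete while iterating, whereas the enhanced for loop above will generally throw ConcurrentModificationException on the standard List implementations. The RemoteState interface below is a stand-in for Pyxida's actual class; the 10-minute MAINTENANCE_PERIOD follows the thread.

import java.util.Iterator;
import java.util.List;

// Illustrative stand-in for Pyxida's RemoteState class.
interface RemoteState<T> {
    long getLastUpdateTime();
}

class NeighborMaintenance<T> {
    private static final long MAINTENANCE_PERIOD = 10 * 60 * 1000; // 10 minutes

    void removeStaleNeighbors(List<RemoteState<T>> neighbors, long currTime) {
        Iterator<RemoteState<T>> it = neighbors.iterator();
        while (it.hasNext()) {
            if (it.next().getLastUpdateTime() < currTime - MAINTENANCE_PERIOD) {
                it.remove(); // safe removal during iteration
            }
        }
    }
}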
>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>> On Nov 3, 2008, at 3:46 PM, Eric Chan-Tin wrote:
>>>>>>>>>>
>>>>>>>>>>> Jonathan,
>>>>>>>>>>>
>>>>>>>>>>> I can insert some code to fix this, and it doesn't look too hard, but I do have some concerns.
>>>>>>>>>>>
>>>>>>>>>>>> I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>
>>>>>>>>>>> The default ping time was set to 10 seconds plus some Gaussian noise, so that's what I used, but changing the ping time to 1 minute or more will result in the same effect; it will just take longer for the "cycle" to occur.
>>>>>>>>>>>
>>>>>>>>>>> What I have in mind is the following:
>>>>>>>>>>>
>>>>>>>>>>> 1) NCClient.addNeighbor() would be changed to (in pseudocode):
>>>>>>>>>>>
>>>>>>>>>>> protected boolean addNeighbor(RemoteState<T> guy) {
>>>>>>>>>>>     boolean added = false;
>>>>>>>>>>>     // if neighbors.size() >= MAX_NEIGHBORS
>>>>>>>>>>>     //     check the last update time of each neighbor
>>>>>>>>>>>     //     (neighbor.getLastUpdateTime());
>>>>>>>>>>>     //     if any of them is older than some number TIMEOUT,
>>>>>>>>>>>     //         remove the oldest one, then add "guy" to neighbors
>>>>>>>>>>>     //     else don't add "guy" to neighbors
>>>>>>>>>>>     // else add "guy" to neighbors
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>> IMO this shouldn't be the behavior of this function, because it is called "addNeighbor." Instead, part of the performMaintenance() function could remove old neighbors.
>>>>>>>>>>
>>>>>>>>>>> 2) In NCManager, getGossipNode() would just pick a random node from localNC.primaryNC.getNeighbors() to send a request to every 10 seconds.
>>>>>>>>>>>
>>>>>>>>>>> 3) Pyxida will still keep track of all GossipNodes in the network. It could also (say, every 10 minutes) check its NCClient.neighbors list, see if any neighbor is older than TIMEOUT, remove it, and then pick a random GossipNode to send a ping request to.
>>>>>>>>>>>
>>>>>>>>>>> The main problem with this scheme is that one Pyxida node would most likely stay with the same set of neighbors all the time. Its coordinates would be greatly affected by those 32 nodes (I am assuming MAX_NEIGHBORS = 32).
>>>>>>>>>>>
>>>>>>>>>> Instead of (2), why not have the getNeighborToPing function determine the behavior? If it returns a node, then ping it. If it returns null, then have the NCManager pick some other node to use. If things are hunky-dory, getNeighborToPing() would return a node (a current member of the neighbor set); if we're short on neighbors, it would return null. The periodic maintenance would have already tossed old nodes, so this would be self-sustaining.
>>>>>>>>>>
>>>>>>>>>> The "problem" you mention is fairly central to network coordinates in general. Sure, the coordinates will be a little better the more neighbors you have. But in a very large system, talking to everybody isn't feasible. Peter and I have a graph in one of the workshop papers which shows the effect of increasing the number of neighbors.
>>>>>>>>>>
>>>>>>>>>>> I don't think the network would be partitioned, but it could happen. For example, the bootstrap nodes have each other in their neighbors list. Every new node bootstrapping into the network would contact those bootstrap nodes first and add them to its neighbors. So every Pyxida node could have as its neighbors the "bootstrap nodes," which would make Pyxida become a centralized system.
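A minimal sketch of the getNeighborToPing contract Jonathan proposes above: return a current neighbor when the set is full, and return null when short so NCManager pings a fresh gossip node instead. Everything here except the getNeighborToPing and MAX_NEIGHBORS names is an illustrative assumption, not Pyxida's actual code.

import java.util.List;
import java.util.Random;

class NeighborSet<T> {
    private static final int MAX_NEIGHBORS = 32;
    private final Random rand = new Random();
    private final List<T> neighbors; // maintained elsewhere; maintenance has
                                     // already tossed stale entries

    NeighborSet(List<T> neighbors) {
        this.neighbors = neighbors;
    }

    // Null signals "short on neighbors": the caller should ping some other
    // node, whose reply will eventually be added via addNeighbor.
    T getNeighborToPing() {
        if (neighbors.size() < MAX_NEIGHBORS) return null;
        return neighbors.get(rand.nextInt(neighbors.size()));
    }
}

On the manager side this reads as: ping getNeighborToPing() if it is non-null, otherwise ping a random gossip node; since maintenance has already removed dead neighbors, the set refills itself without addNeighbor ever having to refuse an addition.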
>>>>>>>>>>
>>>>>>>>>> As long as the initial bootstrap list is large enough, I think you are right that partitions won't be a problem.
>>>>>>>>>>
>>>>>>>>>>> One last thing: when Pyxida receives a ping request from a node, it does not update its coordinates. Is there any particular reason for that? Is it because Pyxida will not be able to calculate the RTT?
>>>>>>>>>>
>>>>>>>>>> Yes.
>>>>>>>>>>
>>>>>>>>>>> Also, there is no "gossiping" of other nodes' coordinates, right? That is, if A has B in its neighbors list, will A from time to time tell other nodes about B's coordinates? I realize this has a lot of complications, but the original Vivaldi paper was very fuzzy about this point and the previous one; it seems it is up to the implementors to decide what to do.
>>>>>>>>>>
>>>>>>>>>> I don't think we are doing this. It could be useful in certain circumstances (e.g., picking potential new neighbors), but we're not, IIRC.
>>>>>>>>>>
>>>>>>>>>>> I will have to think some more about this, as there are a lot of nasty little side effects that could happen, but these are my thoughts.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> -Eric
>>>>>>>>>>
>>>>>>>>>> Sounds good.
>>>>>>>>>>
>>>>>>>>>> -jonathan
>>>>>>>>>>
>>>>>>>>>>> On Nov 3 2008, Jonathan Ledlie wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Eric,
>>>>>>>>>>>>
>>>>>>>>>>>> Based on a look at the code just now, I see where you are coming from: with these params on PL, there isn't anything stopping the neighbor list from circulating pretty quickly through all nodes. In the Azureus implementation, the guys fed to the alg (fed to processSample) weren't changing much because they were coming from the routing finger tables. I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to insert some code that will fix this? Basically, we want to only add a new neighbor if our oldest neighbor should be tossed (or if we haven't reached MAX_NEIGHBORS). It's probably best from a clarity standpoint if addNeighbor does continue to actually add the neighbor in all cases. Probably the best thing to do is to have NCClient drop old neighbors internally and have NCManager ask it which nodes should be pinged via getNeighborToPing. So NCClient would continue to have the same behavior on processSample (it would not reject new information), and NCManager and NCClient would cooperate to measure RTTs to current neighbors. We do want to make sure, however, that the gossip set stays large enough so that partitions don't occur.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to be added to the SourceForge Pyxida developers to make this change? (Please test it out carefully first.)
>>>>>>>>>>>>
>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 30, 2008, at 11:32 AM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> From what I saw, eventually you learn about all the nodes in the network. Every 10 seconds, Pyxida picks a random "Gossip" node from its list and sends a request. The Gossip node replies with an "upNeighbor", that is, a "Gossip" node that is "up" or, with 10% probability, a node that is "pending". Since all the nodes naturally bootstrap off the same set of nodes, every node will eventually know of every other node.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pyxida might not be rotating through everybody continuously (although eventually, since it knows everybody, it will have added everybody to its Neighbors list at some point), but it could be removing a node that is "alive". So, if MAX_NEIGHBORS = 32, let's call the neighbors A_1, A_2, ..., A_32. Pyxida picks a random Gossip node B, sends it a request, and B replies. Pyxida will add B to its Neighbors list, see that the size of Neighbors > MAX_NEIGHBORS, and kick A_1 out. Then, from those 32 neighbors A_2, ..., A_32, B, it will calculate the force and apply it to the current system coordinates. Thus, if every 10 seconds Pyxida picks a different node C, D, E, etc., it would keep kicking nodes out of the neighbors list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's the "kicking A_1 out" part that seemed a bit confusing. Churn is a good reason, but you don't check whether A_1 might still be alive, or whether A_1 was just added. I have been printing the neighbors list every 10 minutes, and the lists are completely different from one 10-minute interval to the next.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One possible problem might be that I am choosing the value of MAX_NEIGHBORS to be too small. Originally it was set to 512 (which was meant for Azureus). Since I am using PlanetLab, where there are about 400-500 "stable" nodes, I changed the value to 32. But I don't think that should be a problem, since log2(512) = 9.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>> Eric,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Glad to hear it. Let Peter and me know if you publish anything using Pyxida and we can put it on the web page.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Gossip" nodes are for learning about the network. "Neighbors" are for constructing the coordinate.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, IIRC, the gossip process learns about nodes that may or may not be applied to the neighbor list. That is, we may eventually learn about a large portion of the network via gossiping, but we should only be using MAX_NEIGHBORS at any given time for coordinate computation. The reason we don't want to use a fixed set of neighbors for coordinate computation is that nodes may come and go from the system. However, it should not be the case that we just rotate through everybody, because then, as you note, we'd be computing against everyone's coordinate/latency, and this is counter to the point of network coordinates. With this gradual transience as a desirable quality in mind, I am pretty sure we maintain neighbors in a fairly fixed manner. Does this not seem to be the case to you?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Oct 29, 2008, at 9:20 PM, Eric Chan-Tin wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Jonathan and Peter,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First of all, I just wanted to say that Pyxida has been working great on PlanetLab for the past 4 months or so. We are getting good data, and the experiments are going well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, I do have one question/comment.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Every Pyxida node contains a list of Neighbors (in NCClient) and a list of Gossip nodes (they are called neighbors in NCManager, but I will call them gossip nodes for now so as not to confuse them with the 'real' neighbors in NCClient). Every 10 seconds, Pyxida will pick a random node (call it A) from Gossip and send it a gossip request (basically asking for rtt, coordinate, and error). But Gossip contains all the nodes in the network. The comment in the code said that this was to avoid Pyxida becoming lonely with no one to gossip with, which makes sense.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once A responds, Pyxida will add A to its Neighbors (if Neighbors does not contain A) and remove the first element from the Neighbors list. This seems a little bit odd. Was there any particular reasoning behind this? Because it would seem like you're getting the coordinate from every node in the network eventually.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks a lot,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Eric
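The add-then-evict behavior Eric is asking about, reduced to a sketch; it shows why a steady stream of new gossip responders churns the whole neighbor set over time. The class and method names are illustrative, not the actual NCClient code.

import java.util.LinkedList;

class FifoNeighborList<T> {
    private static final int MAX_NEIGHBORS = 32;
    private final LinkedList<T> neighbors = new LinkedList<T>();

    // Each gossip reply appends the responder; once the list is full, the
    // oldest entry is evicted regardless of whether it is still alive.
    void onGossipResponse(T responder) {
        if (!neighbors.contains(responder)) {
            neighbors.add(responder);
        }
        if (neighbors.size() > MAX_NEIGHBORS) {
            neighbors.removeFirst(); // kicks A_1 out without checking liveness
        }
    }
}

With a reply arriving every 10 seconds, an entire 32-node set can turn over in roughly 5 minutes, which matches the churn Eric observed in his 10-minute snapshots.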
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>>>> Your inclination is correct. See forwarded message.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Apr 9, 2008, at 1:53 PM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for the clarification.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've been reading through your published papers and code, and I must say I learned a lot about how Pyxida works :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I still have a quick and (hopefully) easy question. Each Pyxida node has a small number of neighbors and learns about other nodes through them. I am running on about 200 machines, and it seems like all the Pyxida nodes are adding each other to their upNeighbors sets. NCClient.java has MAX_NEIGHBORS set to 512.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I wanted to make sure that I am not misreading the code, but shouldn't each Pyxida node keep a small number of neighbors in its upNeighbors set? Otherwise, it is a lot of updates to send a gossipMessage to all 200 neighbors, and it also means that a Pyxida node has the whole network in its upNeighbors set - that seems wrong. I am inclined to change the value of MAX_NEIGHBORS but wanted to make sure that my understanding is correct first.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eric