From: Jonathan L. <jon...@ee...> - 2008-12-18 20:05:27
Eric,

Do you have a PL Pyxida slice that I can connect to to try this out? That is, I'll run it on my machine, but connect to your slice via the right ports.

-jonathan

On Dec 18, 2008, at 11:35 AM, Eric Chan-Tin wrote:

> Sorry for the delay. Was busy with final exams and projects.
>
> I made the changes. They can be found at
>
> http://www-users.cs.umn.edu/~dchantin/NCClient.java
> http://www-users.cs.umn.edu/~dchantin/NCManager.java
> http://www-users.cs.umn.edu/~dchantin/RemoteState.java
>
> All the changes are enclosed within
> //ERIC START
> ...
> //ERIC END
>
> I compared the reported "neighbors list" of each Pyxida node (about 370 total), counted how many neighbors did not change within 10 minutes, and then averaged over all the nodes.
>
> Original: 43.37% of the neighbors stayed the same on average.
>
> Fix0 (before we found the interesting situation): 74.73%
>
> Fix1 (with the weighted probability of picking a neighbor to ping): 95.79%
>
> Fix2 (with the weighted probability and MIN_UPDATE_PING): 98.81%
>
> Also, the reported error, rrl, etc. were much lower in the fixed Pyxida than in the original one (this may not be significant, since the original experiment was performed 1-2 months ago), but as expected, they were not higher.
>
> You can take a look at the code and test it if you want. If everything looks fine, I'll remove some of the debugging code and other "useless" comments (like //ERIC). Also, let me know which "fix" you prefer. I will then email you the updated code.
>
> Thanks,
>
> -Eric
>
> On Dec 10 2008, Jonathan Ledlie wrote:
>
>> Seems reasonable.
>>
>> Both ideas (lastPingTime and a weighted probability based on age) seem to be independent.
>>
>> Some of the code is more CPU-intensive than it should be (e.g., looping through potentially long lists), and that is a slight downside to the function approach vs. a set, but this probably isn't a big deal.
>>
>> -jonathan
>>
>> On Dec 10, 2008, at 10:40 PM, Eric Chan-Tin wrote:
>>
>>> Okay, I think I might have misunderstood you. Your idea of lastPingTime looks interesting.
>>>
>>> My idea is to introduce a probability function. Right now, getNeighborToPing just randomly picks one of the neighbors. Instead of picking uniformly at random, it could pick a neighbor based on some function, so that the neighbor whose lastUpdateTime is oldest has a higher chance of being picked than a neighbor that was just updated.
>>>
>>> Does that make more sense?
>>>
>>> -Eric
>>>
>>> Jonathan Ledlie wrote:
>>>>
>>>> Hmm. The advantage of lastPingTime is that you learn whether or not you've tried to contact a neighbor and they haven't responded. Then you can use that knowledge to determine whether or not to toss them.
>>>>
>>>> Maybe I'm not understanding your suggestion then.
>>>>
>>>> -jonathan
>>>>
>>>> On Dec 10, 2008, at 12:51 PM, Eric Chan-Tin wrote:
>>>>
>>>>> That's a good idea. Here is a little twist on it.
>>>>>
>>>>> Put a "weight" on each neighbor, based on its lastUpdateTime: the larger currentTime - lastUpdateTime is, the higher the weight. Then, in getNeighborToPing, neighbors with higher weight have a higher probability of getting picked than neighbors with lower weight.
>>>>>
>>>>> The weight could be ((currentTime - lastUpdateTime) / 15 minutes). It's not 10 minutes so as to avoid a probability of 1 if the lastUpdateTime was the previous performMaintenance. This should also avoid the problem of always pinging the oldest neighbor, since it's about probability (weight).
>>>>>
>>>>> It seems a little bit less complicated than having lastPingTime and MIN_UPDATE_TIME.
>>>>>
>>>>> -Eric
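A minimal Java sketch of the age-weighted selection Eric describes above. Pyxida's actual RemoteState carries much more state; the Neighbor interface, the class name, and the window constant here are illustrative assumptions, not Pyxida's API.

import java.util.List;
import java.util.Random;

// Illustrative stand-in for Pyxida's RemoteState; only the timestamp matters here.
interface Neighbor {
    long getLastUpdateTime(); // milliseconds
}

class AgeWeightedPicker {
    // 15 minutes, per the formula above; chosen larger than the 10-minute
    // maintenance period so a single weight never reaches 1.
    private static final double WINDOW_MS = 15 * 60 * 1000;
    private final Random rand = new Random();

    // Pick a neighbor with probability proportional to how stale it is:
    // weight_i = (now - lastUpdateTime_i) / WINDOW_MS.
    Neighbor pickNeighborToPing(List<Neighbor> neighbors, long now) {
        if (neighbors.isEmpty()) return null;
        double[] weights = new double[neighbors.size()];
        double total = 0;
        for (int i = 0; i < neighbors.size(); i++) {
            weights[i] = (now - neighbors.get(i).getLastUpdateTime()) / WINDOW_MS;
            total += weights[i];
        }
        // Roulette-wheel selection over the accumulated weights.
        double r = rand.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r <= 0) return neighbors.get(i);
        }
        return neighbors.get(neighbors.size() - 1); // floating-point fallback
    }
}

A freshly updated neighbor gets weight near 0 and is almost never re-pinged immediately, while a stale one is picked with high probability but never deterministically, which is the property Eric wants.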
>>>>> On Dec 10 2008, Jonathan Ledlie wrote:
>>>>>
>>>>>> This is an interesting situation. Let's think about the goal. The goal is to keep pinging the same nodes as long as they are alive, and to replace a node soon after it dies, keeping the total number of up neighbors at approximately MAX_NEIGHBORS.
>>>>>>
>>>>>> How about the following:
>>>>>>
>>>>>> - When you pick a neighbor to ping, you set a time on that event (in a new per-neighbor timestamp, LastPingTime).
>>>>>> - When you first insert a neighbor, set this timestamp to 0.
>>>>>> - When you loop in performMaintenance, remove nodes whose LastPingTime - LastUpdateTime > threshold (like 10 minutes).
>>>>>> - When picking a node, pick the guy with the oldest LastPingTime. (Not sure about this one -- I don't like situations where you pick the absolute *oldest*, because then we could get into a situation where an old guy sits there, waiting to be kicked out, and we keep picking him to be pinged; I'd prefer to pick randomly, or among random older guys.)
>>>>>> - A simple way around this is to have a self-adjusting threshold: we walk through the nodes, and everyone whose LastPingTime is older than MIN_UPDATE_TIME_TO_PING ago (a relative time) gets added to a set; we pick a guy randomly from the set to ping. If the set is empty, we decrease MIN_UPDATE_TIME_TO_PING and wait for the next time the function is called. If the set is too large (relative to a fraction of MAX_NEIGHBORS), we increase it.
>>>>>>
>>>>>> We still might get some churn from lost UDP packets. You might also give nodes a second chance to be pinged by including LastPingTime - LastUpdateTime > 1 minute in the set. Note that increases and decreases in MIN_UPDATE_TIME_TO_PING shouldn't be affected by the number of nodes that fall into this category.
>>>>>>
>>>>>> We still probably need a way for new potential neighbors to be rejected if our current set of neighbors is fine. Because I still want addNeighbor to really add the neighbor, how about putting a check before calling it to see if we need more neighbors?
>>>>>>
>>>>>> What thinks?
>>>>>>
>>>>>> -jonathan
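A sketch of the self-adjusting threshold above. The adjustment direction assumed here: an empty eligible set means the threshold is too strict, so it is relaxed; a too-large set tightens it. The class name, the starting value, and the MAX_NEIGHBORS/4 fraction are illustrative assumptions, not Pyxida code.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative stand-in: only the ping timestamp matters here.
interface PingableNeighbor {
    long getLastPingTime(); // milliseconds; 0 for a freshly inserted neighbor
}

class SelfAdjustingPicker {
    private static final int MAX_NEIGHBORS = 32;
    private long minUpdateTimeToPing = 60 * 1000; // starting value is a guess
    private final Random rand = new Random();

    // Everyone whose last ping is older than the threshold is eligible;
    // pick one eligible neighbor at random. The threshold adapts so the
    // eligible set stays small but non-empty.
    PingableNeighbor pick(List<PingableNeighbor> neighbors, long now) {
        List<PingableNeighbor> eligible = new ArrayList<PingableNeighbor>();
        for (PingableNeighbor n : neighbors) {
            if (now - n.getLastPingTime() > minUpdateTimeToPing) {
                eligible.add(n);
            }
        }
        if (eligible.isEmpty()) {
            minUpdateTimeToPing /= 2; // relax; try again on the next call
            return null;
        }
        if (eligible.size() > MAX_NEIGHBORS / 4) {
            minUpdateTimeToPing *= 2; // tighten for future rounds
        }
        return eligible.get(rand.nextInt(eligible.size()));
    }
}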
>>>>>> On Dec 9, 2008, at 11:37 PM, ext Eric Chan-Tin wrote:
>>>>>>
>>>>>>> Peter,
>>>>>>>
>>>>>>> Thanks. That fixed it.
>>>>>>>
>>>>>>> However, here is another problem I found while deploying Pyxida and running it for a while.
>>>>>>>
>>>>>>> In performMaintenance(), I loop through every neighbor, and if (neighbor.getLastUpdateTime() < current_time - 10 minutes), I remove that neighbor.
>>>>>>>
>>>>>>> In NCManager, I call getNeighborToPing, which randomly picks one of the neighbors (32 in this case). Pinging is performed every 10 seconds, so in 10 minutes you can ping 60 nodes. However, if a neighbor A is not among those 60 nodes, it will get kicked out in performMaintenance(), although A is still alive (it just hasn't been contacted yet).
>>>>>>>
>>>>>>> There are some possible options.
>>>>>>>
>>>>>>> 1) Increase the timeout period from 10 minutes to ?? minutes. This doesn't quite work because in, say, Azureus, the number of neighbors is 512, so the timeout needs to be increased even more... It could be a function of MAX_NEIGHBORS...
>>>>>>>
>>>>>>> 2) When Pyxida replies to a ping request, have it update that neighbor's lastUpdateTime. This does not quite work either, because it assumes a symmetric relationship, which is most likely not the case.
>>>>>>>
>>>>>>> 3) Introduce some more logic into Pyxida so it remembers which nodes it contacted and does not send, say, more than 2 requests in 10 minutes to the same node. This solution isn't that great either.
>>>>>>>
>>>>>>> Let me know what you guys think. I am currently leaning towards option 1), but making the timeout period a function of MAX_NEIGHBORS, although I haven't quite found the perfect function yet.
>>>>>>>
>>>>>>> Here are some preliminary results. I compared the number of neighbors which are the same at each reported time (every 10 minutes). Before the fix, more than 50% of the neighbors changed within 10 minutes. After the fix, only about 30% changed within 10 minutes. It is more stable, but I think 30% is still quite a big number.
>>>>>>>
>>>>>>> Sorry for the long email :)
>>>>>>>
>>>>>>> -Eric
>>>>>>>
>>>>>>> Peter Pietzuch wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Eric Chan-Tin wrote:
>>>>>>>>> for (RemoteState<T> neighbor : neighbors) { // this is line 1029
>>>>>>>>>     if (neighbor.getLastUpdateTime() < (curr_time - MAINTENANCE_PERIOD)) {
>>>>>>>>>         neighbors.remove(neighbor);
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>
>>>>>>>> IIRC you can't modify a collection while iterating over it using the above for loop. Instead you have to use a proper Iterator<RemoteState<T>> and use its remove method.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Peter
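Peter's suggestion, sketched as a compilable unit: removing through the Iterator is the safe way to delete while iterating, whereas the enhanced for loop above will generally throw ConcurrentModificationException on the standard List implementations. The RemoteState interface below is a stand-in for Pyxida's actual class; the 10-minute MAINTENANCE_PERIOD follows the thread.

import java.util.Iterator;
import java.util.List;

// Illustrative stand-in for Pyxida's RemoteState class.
interface RemoteState<T> {
    long getLastUpdateTime();
}

class NeighborMaintenance<T> {
    private static final long MAINTENANCE_PERIOD = 10 * 60 * 1000; // 10 minutes

    void removeStaleNeighbors(List<RemoteState<T>> neighbors, long currTime) {
        Iterator<RemoteState<T>> it = neighbors.iterator();
        while (it.hasNext()) {
            if (it.next().getLastUpdateTime() < currTime - MAINTENANCE_PERIOD) {
                it.remove(); // safe removal during iteration
            }
        }
    }
}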
>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>> On Nov 3, 2008, at 3:46 PM, Eric Chan-Tin wrote:
>>>>>>>>>>
>>>>>>>>>>> Jonathan,
>>>>>>>>>>>
>>>>>>>>>>> I can insert some code to fix this, and it doesn't look too hard, but I do have some concerns.
>>>>>>>>>>>
>>>>>>>>>>>> I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>
>>>>>>>>>>> The default ping time was set to 10 seconds plus some Gaussian noise, so that's what I used, but changing the ping time to 1 minute or more will result in the same effect; it will just take longer for the "cycle" to occur.
>>>>>>>>>>>
>>>>>>>>>>> What I have in mind is the following:
>>>>>>>>>>>
>>>>>>>>>>> 1) NCClient.addNeighbor() would be changed to (in pseudocode):
>>>>>>>>>>>
>>>>>>>>>>> protected boolean addNeighbor(RemoteState<T> guy) {
>>>>>>>>>>>     boolean added = false;
>>>>>>>>>>>     // if neighbors.size() >= MAX_NEIGHBORS
>>>>>>>>>>>     //     check the last update time of each neighbor
>>>>>>>>>>>     //     (neighbor.getLastUpdateTime());
>>>>>>>>>>>     //     if any of them is older than some number TIMEOUT,
>>>>>>>>>>>     //         remove the oldest one, then add "guy" to neighbors
>>>>>>>>>>>     //     else don't add "guy" to neighbors
>>>>>>>>>>>     // else add "guy" to neighbors
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>> IMO this shouldn't be the behavior of this function, because it is called "addNeighbor." Instead, part of the performMaintenance() function could remove old neighbors.
>>>>>>>>>>
>>>>>>>>>>> 2) In NCManager, getGossipNode() would just pick a random node from localNC.primaryNC.getNeighbors() to send a request to every 10 seconds.
>>>>>>>>>>>
>>>>>>>>>>> 3) Pyxida will still keep track of all GossipNodes in the network. It could also (say, every 10 minutes) check its NCClient.neighbors list, see if any neighbor is older than TIMEOUT, remove it, and then pick a random GossipNode to send a ping request to.
>>>>>>>>>>>
>>>>>>>>>>> The main problem with this scheme is that one Pyxida node would most likely stay with the same set of neighbors all the time. Its coordinates would be greatly affected by those 32 nodes (I am assuming MAX_NEIGHBORS = 32).
>>>>>>>>>>>
>>>>>>>>>> Instead of (2), why not have the getNeighborToPing function determine the behavior? If it returns a node, then ping it. If it returns null, then have the NCManager pick some other node to use. If things are hunky-dory, getNeighborToPing() would return a node (a current member of the neighbor set); if we're short on neighbors, it would return null. The periodic maintenance would have already tossed old nodes, so this would be self-sustaining.
>>>>>>>>>>
>>>>>>>>>> The "problem" you mention is fairly central to network coordinates in general. Sure, the coordinates will be a little better the more neighbors you have. But in a very large system, talking to everybody isn't feasible. Peter and I have a graph in one of the workshop papers which shows the effect of increasing the number of neighbors.
>>>>>>>>>>
>>>>>>>>>>> I don't think the network would be partitioned, but it could happen. For example, the bootstrap nodes have each other in their neighbors list. Every new node bootstrapping into the network would contact those bootstrap nodes first and add them to its neighbors. So every Pyxida node could have as its neighbors the "bootstrap nodes," which would make Pyxida become a centralized system.
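A minimal sketch of the getNeighborToPing contract Jonathan proposes above: return a current neighbor when the set is full, and return null when short so NCManager pings a fresh gossip node instead. Everything here except the getNeighborToPing and MAX_NEIGHBORS names is an illustrative assumption, not Pyxida's actual code.

import java.util.List;
import java.util.Random;

class NeighborSet<T> {
    private static final int MAX_NEIGHBORS = 32;
    private final Random rand = new Random();
    private final List<T> neighbors; // maintained elsewhere; maintenance has
                                     // already tossed stale entries

    NeighborSet(List<T> neighbors) {
        this.neighbors = neighbors;
    }

    // Null signals "short on neighbors": the caller should ping some other
    // node, whose reply will eventually be added via addNeighbor.
    T getNeighborToPing() {
        if (neighbors.size() < MAX_NEIGHBORS) return null;
        return neighbors.get(rand.nextInt(neighbors.size()));
    }
}

On the manager side this reads as: ping getNeighborToPing() if it is non-null, otherwise ping a random gossip node; since maintenance has already removed dead neighbors, the set refills itself without addNeighbor ever having to refuse an addition.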
>>>>>>>>>>
>>>>>>>>>> As long as the initial bootstrap list is large enough, I think you are right that partitions won't be a problem.
>>>>>>>>>>
>>>>>>>>>>> One last thing: when Pyxida receives a ping request from a node, it does not update its coordinates. Is there any particular reason for that? Is it because Pyxida will not be able to calculate the RTT?
>>>>>>>>>>
>>>>>>>>>> Yes.
>>>>>>>>>>
>>>>>>>>>>> Also, there is no "gossiping" of other nodes' coordinates, right? That is, if A has B in its neighbors list, will A from time to time tell other nodes about B's coordinates? I realize this has a lot of complications, but the original Vivaldi paper was very fuzzy about this point and the previous one; it seems it is up to the implementors to decide what to do.
>>>>>>>>>>
>>>>>>>>>> I don't think we are doing this. It could be useful in certain circumstances (e.g., picking potential new neighbors), but we're not, IIRC.
>>>>>>>>>>
>>>>>>>>>>> I will have to think some more about this, as there are a lot of nasty little side effects that could happen, but these are my thoughts.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> -Eric
>>>>>>>>>>
>>>>>>>>>> Sounds good.
>>>>>>>>>>
>>>>>>>>>> -jonathan
>>>>>>>>>>
>>>>>>>>>>> On Nov 3 2008, Jonathan Ledlie wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Eric,
>>>>>>>>>>>>
>>>>>>>>>>>> Based on a look at the code just now, I see where you are coming from: with these params on PL, there isn't anything stopping the neighbor list from circulating pretty quickly through all nodes. In the Azureus implementation, the guys fed to the alg (fed to processSample) weren't changing much because they were coming from the routing finger tables. I don't remember observing this behavior in our PL deployment, but that could have been because we had the ping time set to 1-2 minutes, IIRC.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to insert some code that will fix this? Basically, we want to only add a new neighbor if our oldest neighbor should be tossed (or if we haven't reached MAX_NEIGHBORS). It's probably best from a clarity standpoint if addNeighbor does continue to actually add the neighbor in all cases. Probably the best thing to do is to have NCClient drop old neighbors internally and have NCManager ask it which nodes should be pinged via getNeighborToPing. So NCClient would continue to have the same behavior on processSample (it would not reject new information), and NCManager and NCClient would cooperate to measure RTTs to current neighbors. We do want to make sure, however, that the gossip set stays large enough so that partitions don't occur.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you want to be added to the SourceForge Pyxida developers to make this change? (Please test it out carefully first.)
>>>>>>>>>>>>
>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 30, 2008, at 11:32 AM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> From what I saw, eventually you learn about all the nodes in the network. Every 10 seconds, Pyxida picks a random "Gossip" node from its list and sends a request. The Gossip node replies with an "upNeighbor", that is, a "Gossip" node that is "up" or, with 10% probability, a node that is "pending". Since all the nodes naturally bootstrap off the same set of nodes, every node will eventually know of every other node.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pyxida might not be rotating through everybody continuously (although eventually, since it knows everybody, it will have added everybody to its Neighbors list at some point), but it could be removing a node that is "alive". So, if MAX_NEIGHBORS = 32, let's call the neighbors A_1, A_2, ..., A_32. Pyxida picks a random Gossip node B, sends it a request, and B replies. Pyxida will add B to its Neighbors list, see that the size of Neighbors > MAX_NEIGHBORS, and kick A_1 out. Then, from those 32 neighbors A_2, ..., A_32, B, it will calculate the force and apply it to the current system coordinates. Thus, if every 10 seconds Pyxida picks a different node C, D, E, etc., it would keep kicking nodes out of the neighbors list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's the "kicking A_1 out" part that seemed a bit confusing. Churn is a good reason, but you don't check whether A_1 might still be alive, or whether A_1 was just added. I have been printing the neighbors list every 10 minutes, and the lists are completely different from one 10-minute interval to the next.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One possible problem might be that I am choosing the value of MAX_NEIGHBORS to be too small. Originally it was set to 512 (which was meant for Azureus). Since I am using PlanetLab, where there are about 400-500 "stable" nodes, I changed the value to 32. But I don't think that should be a problem, since log2(512) = 9.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Eric
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>> Eric,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Glad to hear it. Let Peter and me know if you publish anything using Pyxida and we can put it on the web page.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Gossip" nodes are for learning about the network. "Neighbors" are for constructing the coordinate.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, IIRC, the gossip process learns about nodes that may or may not be applied to the neighbor list. That is, we may eventually learn about a large portion of the network via gossiping, but we should only be using MAX_NEIGHBORS at any given time for coordinate computation. The reason we don't want to use a fixed set of neighbors for coordinate computation is that nodes may come and go from the system. However, it should not be the case that we just rotate through everybody, because then, as you note, we'd be computing against everyone's coordinate/latency, and this is counter to the point of network coordinates. With this gradual transience as a desirable quality in mind, I am pretty sure we maintain neighbors in a fairly fixed manner. Does this not seem to be the case to you?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Oct 29, 2008, at 9:20 PM, Eric Chan-Tin wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Jonathan and Peter,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> First of all, I just wanted to say that Pyxida has been working great on PlanetLab for the past 4 months or so. We are getting good data, and the experiments are going well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, I do have one question/comment.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Every Pyxida node contains a list of Neighbors (in NCClient) and a list of Gossip nodes (they are called neighbors in NCManager, but I will call them gossip nodes for now so as not to confuse them with the 'real' neighbors in NCClient). Every 10 seconds, Pyxida will pick a random node (call it A) from Gossip and send it a gossip request (basically asking for rtt, coordinate, and error). But Gossip contains all the nodes in the network. The comment in the code said that this was to avoid Pyxida becoming lonely with no one to gossip with, which makes sense.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once A responds, Pyxida will add A to its Neighbors (if Neighbors does not contain A) and remove the first element from the Neighbors list. This seems a little bit odd. Was there any particular reasoning behind this? Because it would seem like you're getting the coordinate from every node in the network eventually.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks a lot,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Eric
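The add-then-evict behavior Eric is asking about, reduced to a sketch; it shows why a steady stream of new gossip responders churns the whole neighbor set over time. The class and method names are illustrative, not the actual NCClient code.

import java.util.LinkedList;

class FifoNeighborList<T> {
    private static final int MAX_NEIGHBORS = 32;
    private final LinkedList<T> neighbors = new LinkedList<T>();

    // Each gossip reply appends the responder; once the list is full, the
    // oldest entry is evicted regardless of whether it is still alive.
    void onGossipResponse(T responder) {
        if (!neighbors.contains(responder)) {
            neighbors.add(responder);
        }
        if (neighbors.size() > MAX_NEIGHBORS) {
            neighbors.removeFirst(); // kicks A_1 out without checking liveness
        }
    }
}

With a reply arriving every 10 seconds, an entire 32-node set can turn over in roughly 5 minutes, which matches the churn Eric observed in his 10-minute snapshots.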
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jonathan Ledlie wrote:
>>>>>>>>>>>>>>>> Your inclination is correct. See forwarded message.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -jonathan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Apr 9, 2008, at 1:53 PM, ext Eric Chan-Tin wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you for the clarification.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've been reading through your published papers and code, and I must say I learned a lot about how Pyxida works :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I still have a quick and (hopefully) easy question. Each Pyxida node has a small number of neighbors and learns about other nodes through them. I am running on about 200 machines, and it seems like all the Pyxida nodes are adding each other to their upNeighbors sets. NCClient.java has MAX_NEIGHBORS set to 512.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I wanted to make sure that I am not misreading the code, but shouldn't each Pyxida node keep a small number of neighbors in its upNeighbors set? Otherwise, it is a lot of updates to send a gossipMessage to all 200 neighbors, and it also means that a Pyxida node has the whole network in its upNeighbors set - that seems wrong. I am inclined to change the value of MAX_NEIGHBORS but wanted to make sure that my understanding is correct first.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Eric