From: <bst...@jb...> - 2006-02-28 20:37:17
|
Very nice :) In your section where you describe what happens if:

anonymous wrote :
| A buddy node dies
|  * The Data Owner detects this, and nominates more buddies to meet its configured requirement.
|    o Initiates state transfers to these buddies so backups are preserved.

These steps should also apply in the case that a DataOwner dies as well -- once the PrimaryBuddy becomes the new DataOwner.

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3926914#3926914 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3926914 |
From: <man...@jb...> - 2006-03-01 08:52:58
|
Well, not really - if a DataOwner dies, the first buddy in the list (nominated as the PrimaryBuddy) simply merges the DataOwner's backup data with its own dataset. It does not need to nominate a new set of buddies, because it already has a buddy group in place for its own dataset, for which it is a DataOwner. The existing replication mechanisms that replicate changes in its own dataset to its buddy group will kick in when it performs such a merge. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3927018#3927018 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3927018 |
From: <ben...@jb...> - 2006-03-02 06:02:43
|
Brian has another thread discussing the problem with initial state transfer. What is the implication of BR on initial state transfer, then? We have two scenarios: whole state transfer and partial state transfer. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3927350#3927350 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3927350 |
From: <man...@jb...> - 2006-03-03 17:00:09
|
I still need to discuss state transfer details with Brian. I've updated details of the design, BTW. More from me shortly ... View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3927829#3927829 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3927829 |
From: <man...@jb...> - 2006-03-07 12:19:01
|
One more thing to consider - designing integration tests with AS for buddy replication. Not entirely relevant to the design of buddy replication, but something to keep in mind all the same. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3928437#3928437 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3928437 |
From: <man...@jb...> - 2006-03-07 17:19:24
|
Updated http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossCacheBuddyReplicationDesign with a potential solution around state transfer. Please have a look and comment accordingly. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3928519#3928519 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3928519 |
From: <man...@jb...> - 2006-03-09 15:24:23
|
Re: specifying a list of nodes to be used as buddies when using the SpecificBuddyLocator, I'm somewhat stumped. The main problem is that if you run multiple instances on the same IP address, on a JGroups level the only way to differentiate one member from the next is port number. And while this is fine when we identify members based on a JGroups View (as in NextMemberBuddyLocator or even FamilyClusterInfo in the JBoss AS HA codebase) it doesn't help at all when you need to pre-configure a list of members since the ports used by JGroups may be picked dynamically. Does anyone have any ideas on this? Have I misunderstood the way FamilyClusterInfo works, perhaps, and is there a solution close at hand? Thanks, Manik View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929070#3929070 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929070 |
From: <be...@jb...> - 2006-03-10 08:25:53
|
You can always configure JGroups to run at fixed IP addresses and ports. To do this, use bind_addr and bind_port. The latter tells JGroups not to pick a random port. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929243#3929243 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929243 |
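For reference, the bind_addr and bind_port settings mentioned above belong on the transport protocol at the bottom of the JGroups stack. A minimal sketch of a TCP-based transport pinned to a fixed address and port might look like this (the IP address and port values are placeholders, not recommendations):

```xml
<!-- Sketch: JGroups TCP transport pinned to a fixed address and port.
     The address and port below are placeholder values. -->
<config>
    <TCP bind_addr="192.168.0.10"
         bind_port="7800" />
    <!-- ... discovery, failure detection, GMS, etc. omitted ... -->
</config>
```

With bind_port fixed like this, each member's JGroups address becomes predictable, which is what a pre-configured buddy list needs.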
From: <man...@jb...> - 2006-03-10 13:33:13
|
Makes sense, except that it is one more thing to demand of users. What does everyone feel? Is it reasonable to expect users to specify a JGroups port in their configurations if they wish to use the SpecificBuddyLocator? I don't see this as the default - the default would be to use the NextMemberBuddyLocator. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929325#3929325 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929325 |
From: <man...@jb...> - 2006-03-10 13:51:13
|
What are people's thoughts around the configuration of BR? Is it too complex? Are there too many fiddly bits to set up?

- Buddy Locator class
- Buddy Locator properties:
  - num buddies
  - specific buddy list
  - ignore colocated buddies
  - colocated server list
- Clustered cache loader (for data gravitation)
- timeouts
- data removal on gravitation

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929332#3929332 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929332 |
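To make the list above concrete, here is a hypothetical sketch of what such a configuration block might look like in a cache config file. The element and property names are illustrative only - the actual schema was still being designed at this point:

```xml
<!-- Hypothetical buddy replication config; names are illustrative, not final. -->
<attribute name="BuddyReplicationConfig">
    <config>
        <buddyReplicationEnabled>true</buddyReplicationEnabled>
        <buddyLocatorClass>org.jboss.cache.buddyreplication.NextMemberBuddyLocator</buddyLocatorClass>
        <!-- locator-specific properties: num buddies, colocation handling, etc. -->
        <buddyLocatorProperties>numBuddies = 1, ignoreColocatedBuddies = true</buddyLocatorProperties>
        <!-- remove data from the old backup owner once it gravitates to a new owner -->
        <dataGravitationRemoveOnFind>true</dataGravitationRemoveOnFind>
        <buddyCommunicationTimeout>2000</buddyCommunicationTimeout>
    </config>
</attribute>
```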
From: <ben...@jb...> - 2006-03-11 15:32:13
|
And is the default one to look for next member that is located on a different physical node? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929540#3929540 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929540 |
From: <man...@jb...> - 2006-03-13 10:24:18
|
Yes, the default is the NextMemberBuddyLocator with the ignoreColocatedBuddies property set to true. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929717#3929717 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929717 |
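As a sketch of the selection logic being described - pick the next member(s) after yourself in the view, skipping members on the same host when ignoreColocatedBuddies is true. This is an illustrative standalone version, not the real implementation: member addresses are modelled as simple "host:port" strings rather than JGroups Addresses.

```java
import java.util.ArrayList;
import java.util.List;

public class NextMemberSketch {
    // Picks up to numBuddies members following 'self' in the view (wrapping
    // around), optionally skipping candidates that share self's host part.
    public static List<String> pickBuddies(List<String> view, String self,
                                           int numBuddies, boolean ignoreColocated) {
        List<String> buddies = new ArrayList<String>();
        int start = view.indexOf(self);
        String selfHost = self.split(":")[0];
        for (int i = 1; i < view.size() && buddies.size() < numBuddies; i++) {
            String candidate = view.get((start + i) % view.size());
            if (ignoreColocated && candidate.split(":")[0].equals(selfHost)) {
                continue; // same host as self: not a useful backup
            }
            buddies.add(candidate);
        }
        return buddies;
    }
}
```

So with a view of [hostA:7800, hostA:7801, hostB:7800] and hostA:7800 as the data owner, the first buddy chosen is hostB:7800 rather than the colocated hostA:7801.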
From: <bst...@jb...> - 2006-03-13 16:29:49
|
Is colocatedServerList just for the case where they want to exclude other physical machines (e.g. those running on the same power source)? Presumably we can detect the simpler case of two servers on the same physical machine without requiring the user to spell it out. I don't think this is too many "fiddly bits", particularly if there are reasonable defaults, so that in a simple case most things don't have to be set. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929802#3929802 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929802 |
From: <bst...@jb...> - 2006-03-13 16:40:24
|
"man...@jb..." wrote :
| What does everyone feel? Is it reasonable to expect users to specify a JGroups port in their configurations if they wish to use the SpecificBuddyLocator?

Can we parse the configuration so that if they provide the port, we respect it, and if they don't, we don't? I think the port is only a problem if they are running two servers on the same machine without setting jboss.bind.address or bind.address or using the JGroups bind_addr property. Most production use cases won't be like that. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929809#3929809 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929809 |
From: <man...@jb...> - 2006-03-13 16:44:18
|
colocatedServerList is also for when you run 2 instances on 1 physical machine, but bind each instance to a different IP. Analysing the contents of a JGroups View, these will seem like 2 different hosts, and we'd have no idea that they in fact reside on the same machine. And yes, it can also be used in the scenario you mentioned where you want 2 physical servers connected to the same power source to be treated as colocated. And again, yes, you're right, the simplest case does check the IP addresses of the members in a group and 'guesses' if instances are colocated as well. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929812#3929812 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929812 |
From: <man...@jb...> - 2006-03-13 16:51:06
|
Re: JGroups ports, I suppose this can be ignored if not specified - but then we would have to deal with some default behaviour if in fact there are 2 instances on the same IP address (even if this is unlikely). Probably just pick the first one and document the behaviour as 'indeterminate' ... View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929817#3929817 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929817 |
From: <bst...@jb...> - 2006-03-13 16:54:19
|
"man...@jb..." wrote :
| colocatedServerList is also for when you run 2 instances on 1 physical machine, but bind each instance to a different IP. Analysing the contents of a JGroups View, these will seem like 2 different hosts, and we'd have no idea that they in fact reside on the same machine.

I'm too lazy to look if there is some kind of gotcha, but it seems like we should be able to be aware of all the IP addresses associated with our machine. I know that JGroups UDP walks through all the interfaces/addresses when it binds the multicast socket to all addresses. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929821#3929821 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929821 |
From: <bst...@jb...> - 2006-03-13 16:55:03
|
And log a WARN (or even an ERROR), which should be enough to alert an admin that they have a faulty configuration. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929822#3929822 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929822 |
From: <man...@jb...> - 2006-03-13 23:04:11
|
re: colocatedServerList, thanks for the tip, Brian. Using java.net.NetworkInterface.getNetworkInterfaces() I can easily walk through the collection of interfaces on a single host. Probably do this once when instantiating the BuddyLocator impl. So I don't see a need for a colocatedServerList - except for the scenario you mentioned where separate hosts ought to be considered as colocated because they are connected to the same power source, etc. Or also, the case of running virtualisation software to run multiple OS instances (each with its own virtual NIC), each with a cluster member. Are the above two use cases common enough to warrant a colocatedServerList? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929927#3929927 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929927 |
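A minimal sketch of the interface walk described above - collect every local address once at construction time (e.g. when the BuddyLocator impl is instantiated), then test a member's address against that set. The class and method names here are hypothetical:

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;

public class LocalAddressSet {
    private final Set<InetAddress> localAddresses = new HashSet<InetAddress>();

    // Walk all NICs once and cache every address bound to this host.
    public LocalAddressSet() {
        try {
            Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
            while (ifaces.hasMoreElements()) {
                Enumeration<InetAddress> addrs = ifaces.nextElement().getInetAddresses();
                while (addrs.hasMoreElements()) {
                    localAddresses.add(addrs.nextElement());
                }
            }
        } catch (SocketException e) {
            throw new RuntimeException("Unable to enumerate network interfaces", e);
        }
    }

    // True if the candidate address is bound to any interface on this host,
    // i.e. the member is colocated with us.
    public boolean isColocated(InetAddress candidate) {
        return localAddresses.contains(candidate);
    }
}
```

With the addresses cached up front, checking whether a cluster member is colocated becomes a constant-time set lookup rather than a per-check network operation.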
From: <bst...@jb...> - 2006-03-14 02:44:56
|
Maybe for those kinds of situations they should use SpecificBuddyLocator. This seems to be more the way WL does it: http://edocs.bea.com/wls/docs70/cluster/failover.html#1022145 View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929965#3929965 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929965 |
From: ghinkle <do-...@jb...> - 2006-03-14 05:33:33
|
Apologies if this stuff has already been discussed, but I've only just seen the blog entry referencing these designs.

First, I think there is a difference between SpecificBuddyLocator and ReplicationGroups. In ReplicationGroups, I just have to give each node a name, and where the names match they're considered grouped. I'd really rather not have to go update many configurations every time I add or remove a machine in my cluster. If I use name-based resolution, I have the flexibility to replace failures and add additional hardware when load demands it without a shutdown-reconfigure.

Second, I rather like the idea of configuring a machine name for each node as the way it determines colocation. This lets me choose whether I want it to work on one host OS or virtualized OSes on one box. In this way, for each node, I just need to configure a cluster name, a cluster group, a replication group and a machine name. Plus, those things wouldn't need to change as I reconfigure/add/remove nodes.

The other thing I was thinking is that I'm pretty sure I wouldn't care if cross-session object links were maintained. I wouldn't want them to be there in the first place... so I'm not convinced data slicing (at least per session) is a bad idea. The benefit of having a secondary for each session is that when the primary fails, I can have the rest of the machines in the cluster share in the work of getting back to steady state. In bigger clusters, the move back to steady state is more damaging to the cluster than the outage. I've seen clusters trip domino-style due to the replication failover causing extra load / memory usage on the secondary. If I've got 512 MB of session per node and I get a failure like your scenario in the wiki, nodes B and C now have session memory requirements of 1.5 GB, up from 1 GB pre-failure.

Of course, my experience is more from a time when GC was a bigger problem than it is today, but I'd still worry about quick jumps in memory and high traffic between specific nodes. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929979#3929979 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929979 |
From: <chr...@jb...> - 2006-03-14 06:59:51
|
With regard to ports... You could use AUTH in JGroups to validate users joining the group - the AUTH protocol can do whatever checking you like to see if the node should be allowed to join or not, e.g. against a predefined list of IPs (& ports if needed). If BuddyReplication was switched on, you could even stop them running two instances on the same IP if you wanted. Thoughts? View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3929988#3929988 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3929988 |
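For context, AUTH sits directly below GMS in the protocol stack and vets JOIN requests against a pluggable token implementation. A rough sketch of the relevant stack fragment (attribute names as in JGroups releases that ship AUTH; the token class and shared-secret value are placeholders):

```xml
<!-- Sketch: AUTH placed immediately before GMS, so membership requests
     are authenticated before a node can join. Values are placeholders. -->
<AUTH auth_class="org.jgroups.auth.SimpleToken"
      auth_value="changeme" />
<pbcast.GMS join_timeout="5000" print_local_addr="true" />
```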
From: <man...@jb...> - 2006-03-16 12:56:48
|
"chr...@jb..." wrote :
| With regard to ports...
|
| You could use AUTH in JGroups to validate users joining the group - the AUTH protocol can do whatever checking you like to see if the node should be allowed to join or not, e.g. against a predefined list of IPs (& ports if needed).
|
| If BuddyReplication was switched on you could even stop them running two instances on the same IP if you wanted.
|
| Thoughts?

I have no real problems with people running several instances on a single IP - I don't necessarily want to stop them from doing so. I just want to ensure we don't put 2 instances on the same physical server into the same buddy group. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3930652#3930652 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3930652 |
From: <man...@jb...> - 2006-03-16 13:04:29
|
"ghinkle" wrote :
| First, I think there is a difference between SpecificBuddyLocator and ReplicationGroups. In ReplicationGroups, I just have to give each node a name and where the names match they're considered grouped. I'd really rather not have to go update many configurations every time I add or remove a machine in my cluster. If I use name-based resolution, I have the flexibility to replace failures and add additional hardware when load demands it without a shutdown-reconfigure.

Would this not involve additional RPC, where a node would have to ask all nodes in the cluster which Replication Group they're in to find its replication group members?

"ghinkle" wrote :
| Second, I rather like the idea of configuring a machine name for each node as the way it determines colocation. This lets me choose if I want it to work on one host OS or virtualized OSes on one box.

You can do this anyway without naming nodes - each instance will get a unique JGroups address anyway, which acts as a name.

"ghinkle" wrote :
| The other thing I was thinking is that I'm pretty sure I wouldn't care if cross-session object links were maintained. I wouldn't want them to be there in the first place... so I'm not convinced data slicing (at least per session) is a bad idea. The benefit of having a secondary for each session is that when the primary fails, I can have it so the rest of the machines in the cluster share in the work of getting back to steady-state.
|
| In bigger clusters, the move back to steady state is more damaging to the cluster than the outage. I've seen clusters trip domino-style due to the replication failover causing extra load / memory usage on the secondary. If I've got 512 MB of session per node and I get a failure like your scenario in the wiki, nodes B and C now have session memory requirements of 1.5 GB, up from 1 GB pre-failure. Of course my experience is more from a time when GC was a bigger problem than it is today, but I'd still worry about quick jumps in memory and high traffic between specific nodes.

My problem with slicing data and letting the entire network help with distributing it upfront is that we have no knowledge of what is in the cache to be able to slice/partition it without losing relationships, etc. Especially with TreeCacheAop (FIELD-level session replication), we may have shared references to objects which then break down if you try to split up the sessions held.

I agree, though, that the load spike for the buddy (and buddy's buddy) when a failure occurs is quite high - hence the need for data gravitation and a load balancer that then starts to redirect requests evenly across the cluster. Gradually this extra load should spread out. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3930657#3930657 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3930657 |
From: ghinkle <do-...@jb...> - 2006-03-16 14:21:36
|
"man...@jb..." wrote :
| Would this not involve additional RPC, where a node would have to ask all nodes in the cluster which Replication Group they're in to find its replication group members?

It would absolutely require more info, but if I can do it by having a cluster view of replication group membership, it is absolutely worth it. It would be part of the metadata a node would push out when it joins the cluster. Everyone else is doing this by having heartbeats that share this information. Each node in the cluster does its best to keep its perspective of who's in the cluster and what their information is (replication group). Let's put it this way: it's a little more code for a lot more maintainability.

"man...@jb..." wrote :
| You can do this anyway without naming nodes - each instance will get a unique JGroups address anyway which acts as a name.

No, this doesn't tell me that two nodes are running on different virtualized operating systems on the same hardware. That type of information would have to be configured by the user. A certain other product used to try and automatically figure this out, but now, due to virtualization, just makes you configure it.

Once again, the critical difference is saying for a node "I'm machine A", rather than "I'm on the same physical hardware as nodes x.x.x.x and x.x.x.y". I don't want to go update the other three nodes on that virtualized hardware when I decide to add a fourth.

"man...@jb..." wrote :
| My problem with slicing data and letting the entire network help with distributing it upfront is that we have no knowledge of what is in the cache to be able to slice/partition it without losing relationships, etc. Especially with TreeCacheAop (FIELD level session replication) we may have shared references to objects which then break down if you try and split up the sessions held.
|
| I agree though that the load spike for the buddy (and buddy's buddy) when a failure occurs is quite high - and hence the need for data gravitation and a load balancer that now starts to redirect requests evenly across a cluster. Gradually this extra load should spread out.

I think I disagree with the idea of trying to make this both a generic cache and a proper HTTP session cache. To be honest, I couldn't care less about having a generic buddy cache if it doesn't work well for HTTP session replication. And I believe that there would be a lot more people interested in using it for HTTP session replication than as a generic cache. We need to support the things customers really want to do... and right now, they're clamoring for more scalable, more easily configured clustering. You can make it much easier to configure and maintain with the above changes, and you can likely make it more scalable by not strictly having one buddy take the full brunt of another node going down. (Though the best way to see these impacts is to have a big cluster working under load and start pulling out the network cables.) Just hearing about the mechanisms, though, I worry about triggering the types of chaining failures that I've seen before. View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3930680#3930680 Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3930680 |