[javagroups-users] Reasonable use case?
From: Joe R <vin...@ya...> - 2007-05-25 13:28:39
Hello,

I'd like to run a JGroups use case by the list to see if what we're proposing makes sense, given the current state of JGroups. I'd appreciate hearing any comments at all regarding this, including suggestions that might include small changes in my assumptions. Thanks.

We're building a large number of small groups, each of which is a single, independent unit in our application. We expect to run approximately 375 of these groups, with five members apiece, across 500 physical machines. The update rate will not be too high: each member will broadcast to the other members a 2k message every thirty seconds or so, or approximately 20k per machine per minute. We're running GigE, with good quality switches. The group must know within a reasonable timeframe (no more than twenty seconds) that one of the members has left the group (if, e.g., a machine or some network component fails).

I suspect that what I've described is a reasonable use case for JGroups, and I built the following stack as a starting point:

    TCP(start_port=0):
    TCPGOSSIP(initial_hosts=localhost[5500],localhost[5501]):
    MERGE2(max_interval=30000;min_interval=10000):
    FD_SOCK:
    VERIFY_SUSPECT:
    pbcast.NAKACK:
    pbcast.STABLE:
    VIEW_SYNC:
    pbcast.GMS(print_local_addr=true):
    pbcast.STATE_TRANSFER:
    VIEW_ENFORCER

Some assumptions:

* TCP is the easiest way to go, since I have a small number of members. Multicast is a possibility, but our ops guys frown on multicast across the WAN between our datacenters, and equal numbers of members for each group live on both sides.
* Because our ops guys would rather not use multicast, I can't use MPING for member discovery, which means I must use TCPGOSSIP.

The following are my concerns:

* Using FD_SOCK lets us know right away that a machine has failed, but it appears to use a new connection for every member in the group. The fewer connections, the happier I am. Is there any way to use the connection that TCP uses?
* As with FD_SOCK, TCPGOSSIP seems to like keeping a connection open to each of the gossip routers. Again, is there any way to reuse the connections so I don't wind up with thousands of connections? Or is this simply a bad use for TCPGOSSIP?
* We need to provide redundancy for the gossip routers, since we cannot ever have a single point of failure. In fact, there should be a minimum of two gossip routers in each of our datacenters, preferably three; local machines use the local router. Can the routers balance the load between them, in addition to providing redundancy? I've looked through the docs and code, but I'm not entirely sure that this is possible.

My apologies for the long email. If you need more info, please let me know.

Thanks for your help!

-Joe
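
As a rough cross-check of the figures in this message, the Java sketch below works through the arithmetic. It is only a back-of-the-envelope estimate under stated assumptions: members are spread evenly over the machines, "2k" means 2 KB, a broadcast over TCP is sent as a separate copy to each of the other four members, and every member holds one open connection to each gossip router (the behaviour the message describes observing). The class and method names are invented for illustration and are not part of JGroups.

```java
public class GroupSizing {
    // Figures quoted in the message.
    static final int GROUPS = 375;
    static final int MEMBERS_PER_GROUP = 5;
    static final int MACHINES = 500;
    static final int MESSAGE_KB = 2;            // assumption: "2k" means 2 KB
    static final int BROADCASTS_PER_MINUTE = 2; // one broadcast every ~30 seconds

    // Total members across all groups.
    static int totalMembers() {
        return GROUPS * MEMBERS_PER_GROUP;
    }

    // Average members hosted per machine, assuming an even spread.
    static double membersPerMachine() {
        return (double) totalMembers() / MACHINES;
    }

    // KB per machine per minute, counting each broadcast once.
    static double broadcastKbPerMachinePerMinute() {
        return membersPerMachine() * BROADCASTS_PER_MINUTE * MESSAGE_KB;
    }

    // KB sent per machine per minute if each broadcast is copied
    // to every other group member (assumed TCP fan-out).
    static double fanOutKbPerMachinePerMinute() {
        return broadcastKbPerMachinePerMinute() * (MEMBERS_PER_GROUP - 1);
    }

    // Open gossip-router connections in total, assuming one per member per router.
    static int gossipConnections(int routersPerMember) {
        return totalMembers() * routersPerMember;
    }

    public static void main(String[] args) {
        System.out.println("members per machine:       " + membersPerMachine());
        System.out.println("KB/min (broadcast counted once): " + broadcastKbPerMachinePerMinute());
        System.out.println("KB/min (with TCP fan-out):  " + fanOutKbPerMachinePerMinute());
        System.out.println("gossip connections (2 routers): " + gossipConnections(2));
    }
}
```

Counting each broadcast once gives a figure in the same ballpark as the roughly 20k per machine per minute quoted above; the fan-out figure is a few times higher but still negligible on GigE. With the two routers listed in initial_hosts, the per-member connection assumption is what produces the "thousands of connections" concern.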