[javagroups-users] Reasonable use case?

Hello,

I'd like to run a JGroups use case by the list to see whether what we're
proposing makes sense, given the current state of JGroups.  I'd appreciate any
comments at all, including suggestions that would mean small changes to my
assumptions.  Thanks.

We're building a large number of small groups, each of which is a single,
independent unit in our application.  We expect to run approximately 375 of
these groups, with five members apiece, across 500 physical machines.  The
update rate will not be too high:  each member will broadcast a 2k message to
the other members every thirty seconds or so, or approximately 20k per machine
per minute.  We're running GigE with good-quality switches.

The group must know within a reasonable timeframe (no more than twenty seconds)
that one of its members has left the group (e.g., if a machine or some network
component fails).

I suspect that what I've described is a reasonable use case for JGroups, and I
built the following stack as a starting point:

    TCP(start_port=0):
    TCPGOSSIP(initial_hosts=localhost[5500],localhost[5501]):
    MERGE2(max_interval=30000;min_interval=10000):
    FD_SOCK:
    VERIFY_SUSPECT:
    pbcast.NAKACK:
    pbcast.STABLE:
    VIEW_SYNC:
    pbcast.GMS(print_local_addr=true):
    pbcast.STATE_TRANSFER:
    VIEW_ENFORCER

Some assumptions:
* TCP is the easiest way to go, since each group has a small number of members.
Multicast is a possibility, but our ops guys frown on multicast across the WAN
between our datacenters, and the members of each group live in equal numbers on
both sides.
* Because our ops guys would rather not use multicast, I can't use MPING for
member discovery, which means I must use TCPGOSSIP.

The following are my concerns:
* Using FD_SOCK lets us know right away that a machine has failed, but it
appears to use a new connection for every member in the group.  The fewer
connections, the happier I am.  Is there any way to use the connection that TCP
uses?
* As with FD_SOCK, TCPGOSSIP seems to like keeping a connection open to each of
the gossip routers.  Again, is there any way to reuse the connections so I
don't wind up with thousands of connections?  Or is this simply a bad use for
TCPGOSSIP?
* We need to provide redundancy for the gossip routers, since we cannot ever
have a single point of failure.  In fact, there should be a minimum of two
gossip routers in each of our datacenters, preferably three; local machines use
the local router.  Can the routers balance the load between them, in addition
to providing redundancy?  I've looked through the docs and code, but I'm not
entirely sure that this is possible.
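
To put rough numbers on the connection concerns above, here is a
back-of-the-envelope sketch.  It assumes the figures from earlier in this
message (375 groups, five members each), a full TCP mesh within each group, and
one open connection from every member to each of the two gossip routers listed
in the stack.  The class and method names are made up for illustration, and the
real protocols may well behave differently from these assumptions.

```java
public class ConnectionEstimate {
    // Pairwise connections in a full mesh of n members: n * (n - 1) / 2.
    // Assumption: every member keeps one TCP connection to every other member.
    static int meshConnections(int members) {
        return members * (members - 1) / 2;
    }

    // Assumption: every member holds one open connection per gossip router.
    static int gossipConnections(int groups, int membersPerGroup, int routers) {
        return groups * membersPerGroup * routers;
    }

    public static void main(String[] args) {
        int groups = 375;
        int membersPerGroup = 5;
        int routers = 2; // the two initial_hosts entries in the stack

        int perGroup = meshConnections(membersPerGroup);
        System.out.println("mesh connections per group: " + perGroup);
        System.out.println("mesh connections overall:   " + groups * perGroup);
        System.out.println("gossip router connections:  "
                + gossipConnections(groups, membersPerGroup, routers));
    }
}
```

If FD_SOCK really does open a separate connection per member, the mesh figure
would roughly double, which is why I'm asking about connection reuse.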

My apologies for the long email.  If you need more info, please let me know. 
Thanks for your help!

-Joe
