From: Bela B. <be...@ya...> - 2002-11-27 05:26:14
Hi Bruce,
okay, what happened was that the initial state transfer took a long time
because the state was large (747795 bytes), so the method ran into a
timeout.
2 things you can do to remedy this:
1. Increase the timeout (replicationTimeout) in the Manager tag.
2. I have made the unicast state transfer asynchronous; i.e. we won't
wait for an ack. Therefore there won't be any timeouts.
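
To make the difference between the two options concrete, here is a rough
sketch in plain Java. It is not the JavaGroups or Tomcat code: the class and
method names (AckTimeoutSketch, send, sendAndWait, sendAsync) are made up for
illustration, and the sleep merely stands in for a large state transfer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in JavaGroups or Tomcat.
class AckTimeoutSketch {

    /** Simulates a transfer whose ack becomes available after transferMillis. */
    static CompletableFuture<Void> send(long transferMillis) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(transferMillis); // stands in for shipping a large state
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Synchronous style: block for the ack, give up after timeoutMillis. */
    static boolean sendAndWait(long transferMillis, long timeoutMillis) {
        try {
            send(transferMillis).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;                      // ack arrived in time
        } catch (TimeoutException e) {
            return false;                     // transfer was slower than the timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        }
    }

    /** Asynchronous style: hand the message off and return without waiting for an ack. */
    static void sendAsync(long transferMillis) {
        send(transferMillis);
    }

    public static void main(String[] args) {
        System.out.println("slow transfer, short timeout: " + sendAndWait(300, 50));
        System.out.println("slow transfer, long timeout:  " + sendAndWait(300, 2000));
        sendAsync(300); // returns immediately, so there is nothing to time out
    }
}
```

Option 1 corresponds to raising timeoutMillis in sendAndWait; option 2
corresponds to sendAsync, where the sender never blocks and so cannot time out.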
2 things on my todo list:
1. I will make the state transfer a synchronous call: request and
response will be correlated. Currently we use 2 asynchronous
messages, so if you receive a message *after* you send the state
transfer request, but *before* you receive the state transfer
response, your state will be incorrect. I want state transfer to
be handled on the level of JavaGroups (which does it correctly,
even in light of messages being sent during the transfer) rather
than at the application level.
2. I'm working on a new building block in JavaGroups
(TransactionalHashtable), which will allow a developer to mix and
match (a) asynchronous, (b) synchronous and (c) synchronous with
locking method calls. This is essentially the same as the current
MessageDispatcher, but with added transactional semantics. For
paranoid folks this will add total serializability to their
replication (if they want to).
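
The problem described in item 1 can be sketched as follows. This is only one
possible interleaving, written in plain Java with made-up names
(StateTransferRace, naiveJoin, orderedJoin); it does not show how JavaGroups
actually implements state transfer. It assumes the snapshot was taken before
the intervening update was sent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not JavaGroups or Tomcat code.
class StateTransferRace {

    /**
     * Uncorrelated request/response: updates that arrive between the request
     * and the response are applied first, then the older snapshot is
     * installed wholesale and wipes them out.
     */
    static Map<String, String> naiveJoin(Map<String, String> snapshot,
                                         List<String[]> updatesDuringTransfer) {
        Map<String, String> state = new HashMap<>();
        for (String[] u : updatesDuringTransfer) {
            state.put(u[0], u[1]);
        }
        state.clear();            // installing the snapshot discards the updates
        state.putAll(snapshot);
        return state;
    }

    /**
     * Ordered transfer: updates seen during the transfer are held back and
     * replayed on top of the snapshot once it has been installed.
     */
    static Map<String, String> orderedJoin(Map<String, String> snapshot,
                                           List<String[]> updatesDuringTransfer) {
        Map<String, String> state = new HashMap<>(snapshot);
        for (String[] u : updatesDuringTransfer) {
            state.put(u[0], u[1]);
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("cart", "1 item");

        List<String[]> updates = new ArrayList<>();
        updates.add(new String[] {"cart", "2 items"});

        System.out.println("naive:   " + naiveJoin(snapshot, updates));
        System.out.println("ordered: " + orderedJoin(snapshot, updates));
    }
}
```

In the naive case the joining node ends up with the stale value; in the
ordered case it ends up with the latest one, which is the behaviour one would
want from a transfer handled below the application.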
Check out the changes in the CVS.
P.S.: I suggest upgrading to JavaGroups 2.0.4 soon; the 2.0.1 version
we're using is still from July 2002. Also, we need to change the
documentation: min_wait_time in UNICAST is not supported anymore.
Duncan/Filip: can we add the documentation to the CVS as well so we have
a central place to maintain it? We can then update the website from the
CVS.
Cheers,
> I am using the same javagroups jar that is checked into
> CVS for tomcat-javagroups.
>
> Here's my Manager line:
>
> <Manager
> protocolStack="UDP(mcast_addr=228.1.2.3;mcast_port=45566;ip_ttl=32):PING(timeout=3000;num_initial_members=6):FD(timeout=5000):VERIFY_SUSPECT(timeout=1500):pbcast.STABLE(desired_avg_gossip=10000):pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):UNICAST(timeout=5000;min_wait_time=2000):MERGE2:FRAG:pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false)"
> className="org.apache.catalina.session.InMemoryReplicationManager"
> groupName="NetChartsServerGroup"
> synchronousReplication="true" debug="9"/>
>
> I used a loadbalancer i wrote to round robin hit 2
> servers for quite some time. The pages i hit add
> fairly large objects to the session. I then killed
> one of the servers. I then kept browsing around for a
> bit. Then i started the server back up and this is
> what i got:
>
>
> 2002-11-26 19:58:49 StandardManager[]:
> [InMemoryReplicationManager] Trying to send message
> [dst: jenkins:34192, src: chatham:2474, size = 747795
> bytes] with type=SESSION-CREATED
> 2002-11-26 19:58:54 StandardManager[]
> [InMemoryReplicationManager] Unable to send message
> through javagroups channel
> TimeoutException
>
--
Bela Ban
www.javagroups.com
(408) 316-4459