From: Bela B. <be...@ya...> - 2002-11-27 05:26:14
Hi Bruce,

okay, what happened was that the initial state transfer took a long time because the state was large (747795 bytes), so the method ran into a timeout.

Two things you can do to remedy this:

1. Increase the timeout (replicationTimeout) in the Manager tag.
2. I have made the unicast state transfer asynchronous, i.e. we won't wait for an ack. Therefore there won't be any timeouts.

Two things on my todo list:

1. I will make the state transfer a synchronous call: request and response will be correlated. Currently we use 2 asynchronous messages, so if you receive a message *after* you send the state transfer request, but *before* you receive the state transfer response, your state will be incorrect. I want state transfer to be handled at the level of JavaGroups (which does it correctly, even when messages are sent during the transfer) rather than at the application level.
2. I'm working on a new building block in JavaGroups (TransactionalHashtable), which will allow a developer to mix and match (a) asynchronous, (b) synchronous and (c) synchronous-with-locking method calls. This is essentially the same as the current MessageDispatcher, but with added transactional semantics. For paranoid folks this will add total serializability to their replication (if they want it).

Check out the changes in the CVS.

P.S.: I suggest upgrading to JavaGroups 2.0.4 soon; the 2.0.1 version we're using is still from July 2002. Also, we need to change the documentation: min_wait_time in UNICAST is not supported anymore.

Duncan/Filip: can we add the documentation to the CVS as well, so we have a central place to maintain it? We can then update the website from the CVS.

Cheers,

> I am using the same javagroups jar that is checked into
> CVS for tomcat-javagroups.
>
> Here's my Manager line:
>
> <Manager
> protocolStack="UDP(mcast_addr=228.1.2.3;mcast_port=45566;ip_ttl=32):PING(timeout=3000;num_initial_members=6):FD(timeout=5000):VERIFY_SUSPECT(timeout=1500):pbcast.STABLE(desired_avg_gossip=10000):pbcast.NAKACK(gc_lag=10;retransmit_timeout=3000):UNICAST(timeout=5000;min_wait_time=2000):MERGE2:FRAG:pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;print_local_addr=false)"
> className="org.apache.catalina.session.InMemoryReplicationManager"
> groupName="NetChartsServerGroup"
> synchronousReplication="true" debug="9"/>
>
> I used a load balancer I wrote to round-robin across 2
> servers for quite some time. The pages I hit add
> fairly large objects to the session. I then killed
> one of the servers. I then kept browsing around for a
> bit. Then I started the server back up and this is
> what I got:
>
>
> 2002-11-26 19:58:49 StandardManager[]:
> [InMemoryReplicationManager] Trying to send message
> [dst: jenkins:34192, src: chatham:2474, size = 747795
> bytes] with type=SESSION-CREATED
> 2002-11-26 19:58:54 StandardManager[]
> [InMemoryReplicationManager] Unable to send message
> through javagroups channel
> TimeoutException
>

--
Bela Ban
www.javagroups.com
(408) 316-4459
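
The race described in the first todo item (an update arriving after the state transfer request but before the state transfer response) can be illustrated with a small simulation. This is only a sketch under stated assumptions: every class and method name below is hypothetical, none of it is JavaGroups or Tomcat API, and buffering updates until the state arrives is just one possible way to get a consistent result, not necessarily how JavaGroups handles it internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, self-contained simulation of the race; no JavaGroups or Tomcat classes are used.
class StateTransferRaceSketch {

    /** A joining member that applies messages strictly in arrival order. */
    static class NaiveJoiner {
        final Map<String, String> sessions = new HashMap<>();

        void onUpdate(String key, String value) {
            sessions.put(key, value);
        }

        void onStateResponse(Map<String, String> snapshot) {
            // Installing the snapshot discards any update applied since the request.
            sessions.clear();
            sessions.putAll(snapshot);
        }
    }

    /** A joining member that holds back updates until the requested state is installed. */
    static class BufferingJoiner {
        final Map<String, String> sessions = new HashMap<>();
        private final List<String[]> pending = new ArrayList<>();
        private boolean awaitingState = false;

        void requestState() {
            awaitingState = true;
        }

        void onUpdate(String key, String value) {
            if (awaitingState) {
                pending.add(new String[] {key, value}); // defer until the state arrives
            } else {
                sessions.put(key, value);
            }
        }

        void onStateResponse(Map<String, String> snapshot) {
            sessions.clear();
            sessions.putAll(snapshot);
            // Replay deferred updates on top of the snapshot (assumes updates are idempotent puts).
            for (String[] update : pending) {
                sessions.put(update[0], update[1]);
            }
            pending.clear();
            awaitingState = false;
        }
    }

    public static void main(String[] args) {
        // State of the provider at the moment it served the request: s2 does not exist yet.
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("s1", "v1");

        NaiveJoiner naive = new NaiveJoiner();
        naive.onUpdate("s2", "v2");        // arrives after the request, before the response
        naive.onStateResponse(snapshot);   // the late response wipes out s2
        System.out.println("naive has s2: " + naive.sessions.containsKey("s2"));

        BufferingJoiner buffering = new BufferingJoiner();
        buffering.requestState();
        buffering.onUpdate("s2", "v2");
        buffering.onStateResponse(snapshot);
        System.out.println("buffering has s2: " + buffering.sessions.containsKey("s2"));
    }
}
```

In the naive case the update for s2 is lost once the older snapshot is installed, which is the incorrect state the message warns about; in the buffering case the update survives because it is applied only after the snapshot.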