Re: [javagroups-users] NAKACK - range is null, sender not found in xmit_table
From: Bela B. <be...@ya...> - 2008-09-27 15:20:12
I've never tested JGroups on NLB. How do you stop the boxes? If you simply kill the processes, JGroups should exclude them from the cluster gracefully; the time this takes is determined by the failure detection protocols (FD_SOCK/FD/FD_ALL). Did you set UDP.loopback to true, and do you have MERGE2 in your stack?

Kristof Jozsa wrote:
> We have our application using JGroups set up on 2 win2k3 test-server
> boxes running in VMware and pulled into a Windows Network Load
> Balancer setup. Our daily backup procedure stops both boxes, does the
> backup and puts them back online. Sometime in this process, JGroups
> starts spamming the logs with the following two lines of error:
>
> 2008-09-19 17:11:46,362 ERROR 60152285 [org.jgroups.protocols.pbcast.NAKACK] - range is null
> 2008-09-19 17:11:46,362 ERROR 60152285 [org.jgroups.protocols.pbcast.NAKACK] - sender OTHER_HOST_IP:4931 not found in xmit_table
>
> where OTHER_HOST_IP always refers to the IP address of the other
> cluster node (while the problem happens on both nodes equally). We get
> thousands of these errors per second and it doesn't appear to stop
> ever (we actually realized the problem first when the servers had run
> out of disk space). We are using JGroups 2.6.3 (the previous stable
> version), didn't yet have a chance to upgrade to the latest. Any idea
> what can cause this problem?
>
> Slightly related, should JGroups with the stock udp.xml setup work
> with Windows nodes being in an NLB setup?

--
Bela Ban
Lead JGroups / Clustering Team
JBoss - a division of Red Hat
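
For readers checking the two settings asked about above, here is a minimal sketch of where they would sit in a 2.6-style XML stack such as udp.xml. It is an illustrative fragment, not the poster's configuration and not a complete stack: the multicast address, port, and all timeout values are assumptions chosen for the example and are not taken from this thread.

```xml
<!-- Illustrative fragment only; all addresses, ports and timeouts are assumed values -->
<config>
    <!-- loopback="true" is the UDP setting asked about in the reply -->
    <UDP mcast_addr="228.10.10.10"
         mcast_port="45588"
         loopback="true"/>
    <PING timeout="2000" num_initial_members="3"/>
    <!-- MERGE2 periodically looks for separate subgroups so they can be merged back into one cluster -->
    <MERGE2 min_interval="10000" max_interval="30000"/>
    <!-- Failure detection: these protocols determine how quickly a killed member is suspected -->
    <FD_SOCK/>
    <FD timeout="10000" max_tries="5"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <!-- The remaining protocols (pbcast.NAKACK, UNICAST, pbcast.STABLE, pbcast.GMS, ...) stay as in the stock file -->
</config>
```

If the stack in use has no MERGE2 entry, or UDP has loopback="false", those are the first two things to compare against the stock file before digging further into the NAKACK errors.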