Re: [javagroups-users] Logs filling up
From: Bela B. <be...@ya...> - 2010-09-30 06:36:47
Eric Dalquist wrote:
>> Can you reproduce this ? Is the cluster under stress, lots of
>> messages being sent ? I don't think so, as the few logs I looked at
>> had sent ca 18'000 messages.
>
> I haven't been able to. We had been seeing this quite often a few
> weeks ago then realized we had some firewall rules that were blocking
> some of the FD sockets. We removed those and it seemed that everything
> was good. Our 4 machine QA cluster (same hardware/network configs)
> doesn't appear to be having this problem even under load tests but it
> is only 4 machines versus 9 in production.

OK

>>> I've been looking through it and can't really see anything that
>>> stands out. The logs include DEBUG level info for all of the jgroups
>>> package but that generally isn't much data compared to the number of
>>> warnings we're getting.
>>>
>>> The log files are available here:
>>> https://mywebspace.wisc.edu/dalquist/web/JGroups/portal.jgroups.log.tar.bz2
>>>
>>> I've attached our JGroups config, we're using 2.10
>>
>> I would suggest to either remove FD_ALL from the config, or increase
>> the props, e.g. timeout=35000 interval=10000. In a large cluster, a
>> lot of messages can be sent by FD_ALL, and I created [1] today to
>> look into it.
>>
>> [1] https://jira.jboss.org/browse/JGRP-1241
>>
> I'm assuming FD_SOCK will still behave correctly without FD_ALL in the
> configuration or do I need to add in some other FD layer to replace
> FD_ALL?

I would actually leave FD_ALL in your config, but increase the timeouts,
so we reduce the risk of a broadcast storm when it triggers. Once I've
fixed [1], you could try it out and that should really help. In most
cases, FD_SOCK will do the job and detect a crashed member quickly,
*before* FD_ALL kicks in.

[1] https://jira.jboss.org/browse/JGRP-1241

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
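
For reference, the change suggested above (keep both failure detectors, raise the FD_ALL values) might look roughly like this in the protocol stack XML. This is only a sketch: the timeout and interval values are the ones quoted in the thread, while the surrounding layout is an assumption, since the attached config is not reproduced here. Every other protocol in the stack would stay as it is.

```xml
<!-- Sketch only: the failure-detection part of a JGroups 2.x stack. -->
<!-- The position within the stack is assumed; keep your existing order. -->

<!-- FD_SOCK stays in place; it usually detects a crashed member first. -->
<FD_SOCK/>

<!-- FD_ALL stays as well, with the larger values suggested in the thread -->
<!-- (both in milliseconds). -->
<FD_ALL timeout="35000" interval="10000"/>
```

With these values a member would be suspected by FD_ALL only after 35 seconds without a heartbeat, and heartbeats would be multicast every 10 seconds.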