From: Laurent W. <lw...@hy...> - 2011-08-10 11:43:40
On Tue, 09 Aug 2011 20:46:45 -0400
Robert Sandilands <rsa...@ne...> wrote:

> Increasing the swap space fixed the fork() issue. It seems that you have
> to ensure that memory available is always double the memory needed by
> mfsmaster. None of the swap space was used over the last 24 hours.
>
> This did solve the extreme comb-like behavior of mfsmaster. It still
> does not resolve its sensitivity to load on the server. I am still
> seeing timeouts on the chunkservers and mounts on the hour due to the
> high CPU and I/O load when the meta data is dumped to disk. It did
> however decrease significantly.
>
> An example from the logs:
>
> Aug 9 04:03:38 http-lb-1 mfsmount[13288]: master: tcp recv error:
> ETIMEDOUT (Operation timed out) (1)
> Aug 9 04:03:39 http-lb-1 mfsmount[13288]: master: register error (read
> header: ETIMEDOUT (Operation timed out))
> Aug 9 04:03:41 http-lb-1 mfsmount[13288]: registered to master

Hi,

What if you apply these tweaks to the IP stack on the master, chunkservers, and metaloggers?

# to avoid problems with heavily loaded servers
echo 16000 > /proc/sys/fs/file-max
echo 100000 > /proc/sys/net/ipv4/ip_conntrack_max

# to avoid Neighbour table overflow
echo "512" > /proc/sys/net/ipv4/neigh/default/gc_thresh1
echo "2048" > /proc/sys/net/ipv4/neigh/default/gc_thresh2
echo "4048" > /proc/sys/net/ipv4/neigh/default/gc_thresh3

No need to restart anything; these can be applied on the fly without disturbing services.

HTH,
-- 
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C