From: Michał B. <mic...@ge...> - 2011-10-17 13:03:13
Hi!

Again, it is not that simple to state that you need double the memory needed by mfsmaster. Fork doesn't copy the whole memory occupied by the process. Memory used by both processes is in a "copy on write" state, so you only need space for the "differences". We estimate that a master which performs lots of operations needs 30-40% extra on top of the memory normally used by the process.

In the long run, increasing swap is not good. When the master starts to use it too much during saves, the whole system may hang. That is probably why you have these timeouts. To be honest, you should increase physical RAM and not the swap. (We had 16GB RAM and it stopped being enough when the master needed 13GB, so we had to add more RAM.)

Kind regards
Michał Borychowski
MooseFS Support Manager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01

From: Robert Sandilands [mailto:rsa...@ne...]
Sent: Wednesday, August 10, 2011 3:12 PM
To: moo...@li...
Subject: Re: [Moosefs-users] mfsmaster performance and hardware

Hi Laurent,

Due to the use of ktune a lot of values are already tweaked. For example file-max. I don't have iptables loaded as I measured at some stage that conntrack was -really- slow with large numbers of connections.

I am not seeing gc_threshold related log messages but I can't see any reason not to tweak that.

Robert

On 8/10/11 2:20 AM, Laurent Wandrebeck wrote:

On Tue, 09 Aug 2011 20:46:45 -0400
Robert Sandilands <rsa...@ne...> wrote:

Increasing the swap space fixed the fork() issue. It seems that you have to ensure that memory available is always double the memory needed by mfsmaster. None of the swap space was used over the last 24 hours.

This did solve the extreme comb-like behavior of mfsmaster. It still does not resolve its sensitivity to load on the server.
I am still seeing timeouts on the chunkservers and mounts on the hour due to the high CPU and I/O load when the meta data is dumped to disk. It did however decrease significantly.

An example from the logs:

Aug 9 04:03:38 http-lb-1 mfsmount[13288]: master: tcp recv error: ETIMEDOUT (Operation timed out) (1)
Aug 9 04:03:39 http-lb-1 mfsmount[13288]: master: register error (read header: ETIMEDOUT (Operation timed out))
Aug 9 04:03:41 http-lb-1 mfsmount[13288]: registered to master

Hi,

what if you apply these tweaks to ip stack on master/CS/metaloggers ?

# to avoid problems with heavily loaded servers
echo 16000 > /proc/sys/fs/file-max
echo 100000 > /proc/sys/net/ipv4/ip_conntrack_max

# to avoid Neighbour table overflow
echo "512" > /proc/sys/net/ipv4/neigh/default/gc_thresh1
echo "2048" > /proc/sys/net/ipv4/neigh/default/gc_thresh2
echo "4048" > /proc/sys/net/ipv4/neigh/default/gc_thresh3

No need to restart anything, these can be applied on the fly without disturbing services.

HTH,

_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users
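
As a rough check of the 30-40% copy-on-write estimate given at the top of this thread, the arithmetic can be sketched in shell. This is only an illustration: estimate_ram_mb is a hypothetical helper, not part of MooseFS, and the sketch assumes the extra percentage is applied on top of the memory mfsmaster normally uses, with 1 GB treated as 1024 MiB.

```shell
#!/bin/sh
# Sketch only: estimate_ram_mb is a hypothetical helper, not a MooseFS tool.
# Assumption: the 30-40% extra is added on top of mfsmaster's normal memory
# use, and 1 GB is treated as 1024 MiB.

estimate_ram_mb() {
    _base_mb=$1   # memory normally used by mfsmaster, in MiB
    _pct=$2       # extra headroom for copy-on-write pages after fork(), in percent
    echo $(( _base_mb + _base_mb * _pct / 100 ))
}

master_mb=$(( 13 * 1024 ))   # the 13GB master mentioned in the thread
ram_mb=$(( 16 * 1024 ))      # the 16GB machine mentioned in the thread

for pct in 30 40; do
    need_mb=$(estimate_ram_mb "$master_mb" "$pct")
    if [ "$need_mb" -gt "$ram_mb" ]; then
        echo "extra ${pct}%: need ${need_mb} MiB, have ${ram_mb} MiB: not enough"
    else
        echo "extra ${pct}%: need ${need_mb} MiB, have ${ram_mb} MiB: ok"
    fi
done
```

With these inputs both estimates come out above 16GB, which matches the report that 16GB of RAM stopped being enough once the master needed 13GB.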