Re: [Moosefs-users] mfsmaster performance and hardware


Hi!

Again, it is not that simple to state that you need double the memory
needed by mfsmaster. fork() doesn't copy the whole memory occupied by the
process. Memory used by both processes is in a "copy on write" state and you
need space only for the "differences". We estimate that a master which
performs lots of operations needs 30-40% extra on top of the memory
normally used by the process.
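
As a rough illustration of that 30-40% rule of thumb, the arithmetic can be sketched in shell. The function name `mfs_ram_estimate` is made up for this example, and the percentages are only the estimate quoted above, not a guarantee.

```shell
# Estimate the RAM range suggested by the 30-40% rule of thumb.
# Argument: memory normally used by mfsmaster, in MiB.
# Prints "<low> <high>" in MiB (integer arithmetic, rounded down).
mfs_ram_estimate() {
    rss_mib=$1
    low=$(( rss_mib * 130 / 100 ))
    high=$(( rss_mib * 140 / 100 ))
    echo "${low} ${high}"
}

# Example: a master using 13 GiB (13312 MiB)
mfs_ram_estimate 13312
```

For a 13 GiB master this gives roughly 16.9 to 18.2 GiB, which is more than a 16GB machine can provide.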

And in the long run increasing swap is not a good solution. When the master
starts to use it heavily during saves, the whole system may hang.
That is probably why you have these timeouts. To be honest, you should
increase physical RAM and not the swap. (We had 16GB RAM and it stopped
being enough when the master needed 13GB, so we had to add more RAM.)
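
To see whether the master host actually dips into swap during a metadata save, one option is to read the standard `SwapTotal` and `SwapFree` fields from `/proc/meminfo`. This is only a minimal sketch: the function name `swap_used_kb` is illustrative, and the optional file argument exists just so it can be pointed at a saved copy.

```shell
# Print the amount of swap currently in use, in kB.
# Argument (optional): a meminfo-style file; defaults to /proc/meminfo.
swap_used_kb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END           {print total - free}' "${1:-/proc/meminfo}"
}

# Example: sample once a second around the time of the hourly save
# while sleep 1; do date; swap_used_kb; done
```

A value that climbs while the forked child is writing metadata suggests that physical RAM is too small for the copy-on-write "differences".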

Kind regards

Michał Borychowski 

MooseFS Support Manager

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Gemius S.A.

ul. Wołoska 7, 02-672 Warszawa

Budynek MARS, klatka D

Tel.: +4822 874-41-00

Fax : +4822 874-41-01

From: Robert Sandilands [mailto:rsa...@ne...] 
Sent: Wednesday, August 10, 2011 3:12 PM
To: moo...@li...
Subject: Re: [Moosefs-users] mfsmaster performance and hardware

Hi Laurent,

Due to the use of ktune, a lot of values are already tweaked, for example
file-max. I don't have iptables loaded, as I measured at some stage that
conntrack was -really- slow with large numbers of connections.

I am not seeing gc_threshold related log messages but I can't see any reason
not to tweak that.

Robert

On 8/10/11 2:20 AM, Laurent Wandrebeck wrote: 

On Tue, 09 Aug 2011 20:46:45 -0400
Robert Sandilands <rsa...@ne...>
wrote:

Increasing the swap space fixed the fork() issue. It seems that you have 
to ensure that memory available is always double the memory needed by 
mfsmaster. None of the swap space was used over the last 24 hours.

This did solve the extreme comb-like behavior of mfsmaster. It still 
does not resolve its sensitivity to load on the server. I am still 
seeing timeouts on the chunkservers and mounts on the hour due to the 
high CPU and I/O load when the meta data is dumped to disk. It did 
however decrease significantly.

An example from the logs:

Aug  9 04:03:38 http-lb-1 mfsmount[13288]: master: tcp recv error: ETIMEDOUT (Operation timed out) (1)
Aug  9 04:03:39 http-lb-1 mfsmount[13288]: master: register error (read header: ETIMEDOUT (Operation timed out))
Aug  9 04:03:41 http-lb-1 mfsmount[13288]: registered to master
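
To check whether such timeouts really cluster around the hourly metadata dump, the log can be grouped by hour. The sketch below assumes the classic syslog timestamp layout shown in the example (month, day, HH:MM:SS as the first three fields); the function name and the log path in the usage line are assumptions, not something taken from this setup.

```shell
# Count mfsmount ETIMEDOUT messages per month/day/hour in a syslog file.
# Argument: path to the log file.
timeouts_per_hour() {
    grep 'mfsmount' "$1" | grep 'ETIMEDOUT' \
        | awk '{print $1, $2, substr($3, 1, 2)}' \
        | sort | uniq -c
}

# Example (the log location varies by distribution):
# timeouts_per_hour /var/log/messages
```

Each output line holds a count followed by the month, day and hour, so a spike in the first minutes of every hour is easy to spot.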

Hi,
what if you apply these tweaks to the IP stack on the master/CS/metaloggers?
# to avoid problems with heavily loaded servers
echo 16000 > /proc/sys/fs/file-max
echo 100000 > /proc/sys/net/ipv4/ip_conntrack_max

# to avoid Neighbour table overflow
echo "512" > /proc/sys/net/ipv4/neigh/default/gc_thresh1
echo "2048" > /proc/sys/net/ipv4/neigh/default/gc_thresh2
echo "4048" > /proc/sys/net/ipv4/neigh/default/gc_thresh3

No need to restart anything, these can be applied on the fly without
disturbing services.
HTH,
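
Before overwriting any of these values, it may help to record what is currently set, especially when a tool such as ktune has already changed them. A minimal sketch follows; the function name `show_tunables` and its optional root argument are made up for illustration, and the conntrack entry is left out because that file is only present when the conntrack module is loaded.

```shell
# Print the current value of each tunable mentioned above.
# Argument (optional): root of the proc tree; defaults to /proc.
show_tunables() {
    root=${1:-/proc}
    for key in fs/file-max \
               net/ipv4/neigh/default/gc_thresh1 \
               net/ipv4/neigh/default/gc_thresh2 \
               net/ipv4/neigh/default/gc_thresh3; do
        file="$root/sys/$key"
        if [ -r "$file" ]; then
            echo "$key = $(cat "$file")"
        else
            echo "$key = (not present)"
        fi
    done
}

# Example: save the current settings before applying the tweaks
# show_tunables > /tmp/tunables.before
```

Values written with `echo` into `/proc/sys` are lost on reboot, so a saved copy also documents what would need to be made persistent later.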


_______________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users
