From: Robert S. <rsa...@ne...> - 2011-10-17 13:34:30
|
Hi Michal,

The machine never used the swap. I verified that over several days using sar. It just needed the swap to be able to fork successfully. It seems that Linux will fail to fork if there is not enough space to keep a complete copy of the forking application, even though it won't use the memory. The Linux kernel refuses to over-subscribe the memory. I verified this by looking at the fork() code in the kernel. There is a check that verifies that the current amount of used memory plus the size of the forking application is less than the memory available. If that is not the case, the fork() call fails.

I agree that using swap should be a last desperate measure and that no production system should depend on swap to operate.

Even on a much faster dedicated master with significantly more RAM we still see timeouts. They just seem to be limited to around 1 minute every hour rather than 5 minutes every hour. The new master has 72 GB of RAM and it currently has 125 million files. This has improved stability and has allowed me to focus on other bottlenecks in mfsmount and mfschunkserver.

Robert

On 10/17/11 9:00 AM, Michał Borychowski wrote:
>
> Hi!
>
> Again, it is not that easy to state that you need to double the
> memory needed by mfsmaster. Fork doesn't copy the whole memory
> occupied by the process. Memory used by both processes is in "copy on
> write" state and you need only space for "differences". We estimate
> that for the master which makes lots of operations it would be
> necessary to have 30-40% extra of memory normally used by the process.
>
> And in the long run increasing swap is not good. When master starts to
> use it too much during saves, it may happen that the whole system will
> hang. Probably that's why you have these timeouts. To be honest you
> should increase physical RAM and not the swap. (We had 16GB RAM and it
> started to be not enough when master needed 13GB - we had to put more
> RAM then).
>
> Kind regards
>
> Michał Borychowski
> MooseFS Support Manager
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> Gemius S.A.
> ul. Wołoska 7, 02-672 Warszawa
> Budynek MARS, klatka D
> Tel.: +4822 874-41-00
> Fax : +4822 874-41-01
>
> *From:* Robert Sandilands [mailto:rsa...@ne...]
> *Sent:* Wednesday, August 10, 2011 3:12 PM
> *To:* moo...@li...
> *Subject:* Re: [Moosefs-users] mfsmaster performance and hardware
>
> Hi Laurent,
>
> Due to the use of ktune a lot of values are already tweaked. For
> example file-max. I don't have iptables loaded as I measured at some
> stage that conntrack was -really- slow with large numbers of connections.
>
> I am not seeing gc_threshold related log messages but I can't see any
> reason not to tweak that.
>
> Robert
>
> On 8/10/11 2:20 AM, Laurent Wandrebeck wrote:
>
> On Tue, 09 Aug 2011 20:46:45 -0400
> Robert Sandilands<rsa...@ne...> <mailto:rsa...@ne...> wrote:
>
> > Increasing the swap space fixed the fork() issue. It seems that you have
> > to ensure that memory available is always double the memory needed by
> > mfsmaster. None of the swap space was used over the last 24 hours.
> >
> > This did solve the extreme comb-like behavior of mfsmaster. It still
> > does not resolve its sensitivity to load on the server. I am still
> > seeing timeouts on the chunkservers and mounts on the hour due to the
> > high CPU and I/O load when the meta data is dumped to disk. It did
> > however decrease significantly.
> >
> > An example from the logs:
> >
> > Aug 9 04:03:38 http-lb-1 mfsmount[13288]: master: tcp recv error: ETIMEDOUT (Operation timed out) (1)
> > Aug 9 04:03:39 http-lb-1 mfsmount[13288]: master: register error (read header: ETIMEDOUT (Operation timed out))
> > Aug 9 04:03:41 http-lb-1 mfsmount[13288]: registered to master
> >
> Hi,
> what if you apply these tweaks to ip stack on master/CS/metaloggers ?
> # to avoid problems with heavily loaded servers
> echo 16000 > /proc/sys/fs/file-max
> echo 100000 > /proc/sys/net/ipv4/ip_conntrack_max
>
> # to avoid Neighbour table overflow
> echo "512" > /proc/sys/net/ipv4/neigh/default/gc_thresh1
> echo "2048" > /proc/sys/net/ipv4/neigh/default/gc_thresh2
> echo "4048" > /proc/sys/net/ipv4/neigh/default/gc_thresh3
>
> No need to restart anything, these can be applied on the fly without
> disturbing services.
> HTH,
|
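The two memory estimates discussed in this thread can be compared with a small shell sketch: the rule as Robert describes it (used memory plus the size of the forking process must be less than the memory available) and Michał's copy-on-write estimate (30-40% extra on top of the process size). The function names `fork_would_fit` and `cow_extra_kb` are hypothetical, and the first function only models the rule as stated in the message. It is not the kernel's actual accounting, which also depends on the overcommit settings.

```shell
#!/bin/sh
# Simplified model of the check described in the thread (an assumption,
# not kernel code): fork is treated as possible only when
#   used memory + size of the forking process < memory available (RAM + swap).
# Arguments: used_kb proc_kb avail_kb. Returns 0 if the fork would fit.
fork_would_fit() {
    [ $(( $1 + $2 )) -lt "$3" ]
}

# Extra memory needed during a metadata save under the copy-on-write
# estimate from the quoted message (30-40% of the process size).
# Arguments: proc_kb percent. Prints the extra amount in kB.
cow_extra_kb() {
    echo $(( $1 * $2 / 100 ))
}
```

With the numbers from Michał's message, `cow_extra_kb 13631488 40` (a 13 GB master at 40%) prints 5452595, roughly 5.2 GB of extra memory, which would put the total above the 16 GB of RAM he mentions.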