Re: [MooseFS-Users] mfs_master becomes unresponsive on the hour

This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so the two processes usually share most of their memory). When that check fails, fork returns ENOMEM and the master stores its data in the foreground, which blocks it for a while, as your log shows. To "fix" this you can change this behaviour to "classic" using this command (as root):

echo "1" > /proc/sys/vm/overcommit_memory
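Before changing the setting it can help to see which mode the kernel is currently in. The snippet below is only a minimal sketch: the helper name describe_overcommit is made up for illustration, and the descriptions follow the usual meaning of the three vm.overcommit_memory values (0 heuristic, 1 always overcommit, 2 strict accounting). Reading the value needs no root.

```shell
#!/bin/sh
# Hypothetical helper (not part of MooseFS): print the overcommit mode
# stored in a file, defaulting to the live kernel setting.
describe_overcommit() {
    file="${1:-/proc/sys/vm/overcommit_memory}"
    mode=$(cat "$file") || return 1
    case "$mode" in
        0) echo "0: heuristic overcommit (kernel default)" ;;
        1) echo "1: always overcommit" ;;
        2) echo "2: strict accounting" ;;
        *) echo "$mode: unknown"; return 1 ;;
    esac
}

# Show the current mode; ignore a failure on systems without this file.
describe_overcommit || true

# To apply the change and keep it across reboots (as root):
#   sysctl -w vm.overcommit_memory=1
#   echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf
```

The echo into /proc only lasts until the next reboot, which is why the commented lines also add the setting to /etc/sysctl.conf.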

On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb...> wrote:

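> MFS version: 2.0.72 Community Edition
> Operating system on master, chunkserver, dataserver and client: CentOS 5.9 x64
> 
> Problem description:
> The number of files is on the order of tens of millions, and the master of the mfs cluster becomes unresponsive for 1-2 minutes every hour, on the hour. Monitoring of the master's memory, CPU, disk and network shows nothing abnormal. We originally ran version 1.6.25 and suspected a bug in the software itself, so we upgraded to 2.0.72 Community Edition, but the problem persists. Below is the error log from the top of the hour: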
> 
> 
> Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory) 
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected 
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
> 
> 刘亚磊 | 买卖宝信息技术有限公司
> 3rd Floor, Block C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012)
> Direct line: (86) 10 56716100-8995
> Email: liu...@eb... | Mobile: (86) 18801039545
> ------------------------------------------------------------------------------
> _________________________________________
> moosefs-users mailing list
> moo...@li...
> https://lists.sourceforge.net/lists/listinfo/moosefs-users

-- 
Regards,
Jakub Kruszona-Zawadzki
- - - - - - - - - - - - - - - -
Segmentation fault (core dumped)
Phone: +48 602 212 039
