From: 刘亚磊 <liu...@eb...> - 2015-07-24 06:20:38
Hello,

Following your advice, I changed the kernel setting and that problem is solved. However, there is now a new problem: on the hour, the master logs this error:

Jul 23 20:01:16 mfsmaster1 mfsmaster[22443]: main master server module: (ip:192.168.1.46) write error: EPIPE (Broken pipe)

刘亚磊 | 买卖宝信息技术有限公司
北京市朝阳区红军营南路傲城融富中心C座三层(100012)
Direct line: (86) 10 56716100-8995
Email: liu...@eb... | Mobile: (86) 18801039545

From: Jakub Kruszona-Zawadzki
Sent: 2015-07-24 13:29
To: 刘亚磊
Cc: moosefs-users
Subject: Re: [MooseFS-Users] mfs_master正点失去响应 (mfs_master becomes unresponsive on the hour)

This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so usually both processes share most of their memory).

To "fix" this you can change this behaviour to "classic" using this command (as root):

    echo "1" > /proc/sys/vm/overcommit_memory

On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb...> wrote:

MFS version: 2.0.72 community edition
Operating system on master, chunkserver, dataserver and client: CentOS 5.9 x64

Problem description:
With tens of millions of files, the master of our MFS cluster becomes unresponsive for 1-2 minutes every hour, on the hour. Monitoring of the master's memory, CPU, disk and network shows nothing abnormal. We originally ran version 1.6.25 and suspected a bug in the software itself, so we upgraded to 2.0.72 community edition, but the problem persists. Below is the error log from the top of the hour:

Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)

刘亚磊

------------------------------------------------------------------------------
_________________________________________
moosefs-users mailing list
moo...@li...
https://lists.sourceforge.net/lists/listinfo/moosefs-users

--
Regards,
Jakub Kruszona-Zawadzki
- - - - - - - - - - - - - - - -
Segmentation fault (core dumped)
Phone: +48 602 212 039
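
For anyone applying the overcommit change described in the reply above, here is a minimal shell sketch for checking which mode a machine is currently in. The function name `describe_overcommit` is hypothetical, and the meanings of modes 0, 1 and 2 are taken from the Linux kernel documentation rather than from this thread. The commented lines at the end show one common way to make the setting survive a reboot; whether `/etc/sysctl.conf` is the right place on a given system is an assumption.

```shell
#!/bin/sh
# Print the overcommit mode stored in a file.
# With no argument, read the live kernel setting.
describe_overcommit() {
    file="${1:-/proc/sys/vm/overcommit_memory}"
    mode=$(cat "$file") || return 1
    case "$mode" in
        0) echo "0: heuristic overcommit (kernel default)" ;;
        1) echo "1: always overcommit" ;;
        2) echo "2: strict accounting" ;;
        *) echo "$mode: unknown mode"; return 1 ;;
    esac
}

# Usage (reads the running kernel's value):
#   describe_overcommit
#
# "echo 1 > /proc/sys/vm/overcommit_memory" only lasts until reboot.
# To apply and persist it (as root), one option is:
#   sysctl -w vm.overcommit_memory=1
#   echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf
```

If the function reports mode 0 on the master, a fork of a large mfsmaster process may still fail with ENOMEM as in the log above; mode 1 corresponds to the setting suggested in the reply.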