From: Aleksander W. <ale...@mo...> - 2015-07-24 07:57:58
Hi.

This is not a problem; the message is informational rather than an error. It means that while data was being sent through the socket, the connection was closed by the other side. This usually indicates a connection timeout. Messages of this kind can appear on a heavily loaded network, but they do not cause any data loss: when the system reconnects, the packet is sent again.

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com <moosefs.com>

On 24.07.2015 08:20, 刘亚磊 wrote:
> Hello,
> After changing the kernel setting as suggested, that problem is solved. But now there is a new problem:
> on the hour, the master reports this error:
> Jul 23 20:01:16 mfsmaster1 mfsmaster[22443]: main master server module: (ip:192.168.1.46) write error: EPIPE (Broken pipe)
>
> ------------------------------------------------------------------------
> 刘亚磊 | 买卖宝信息技术有限公司
> 3rd Floor, Block C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012)
> Direct line: (86) 10 56716100-8995
> Email: liu...@eb... | Mobile: (86) 18801039545
>
> *From:* Jakub Kruszona-Zawadzki <mailto:jak...@ge...>
> *Sent:* 2015-07-24 13:29
> *To:* 刘亚磊 <mailto:liu...@eb...>
> *Cc:* moosefs-users <mailto:moo...@li...>
> *Subject:* Re: [MooseFS-Users] mfs_master becomes unresponsive on the hour
>
> This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so usually both processes share most of their memory). To "fix" this you can change this behaviour to "classic" using this command (as root):
>
> echo "1" > /proc/sys/vm/overcommit_memory
>
> On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb... <mailto:liu...@eb...>> wrote:
>
>> MFS version: 2.0.72 Community Edition
>> OS on master, chunkserver, dataserver and client: CentOS 5.9 x64
>>
>> Problem description:
>> With file counts in the tens of millions, the master of the MFS cluster becomes unresponsive for 1-2 minutes every hour, on the hour.
>> Monitoring of the master's memory, CPU, disk and network shows nothing abnormal. We initially ran version 1.6.25 and suspected a bug in the software itself; we later upgraded to 2.0.72 Community Edition, but the problem persists. Below is the error log from the top of the hour:
>>
>> Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
>> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
>> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
>>
>> ------------------------------------------------------------------------
>> 刘亚磊 | 买卖宝信息技术有限公司
>> 3rd Floor, Block C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012)
>> Direct line: (86) 10 56716100-8995
>> Email: liu...@eb... <mailto:liu...@eb...> | Mobile: (86) 18801039545
>> ------------------------------------------------------------------------------
>> _________________________________________
>> moosefs-users mailing list
>> moo...@li... <mailto:moo...@li...>
>> https://lists.sourceforge.net/lists/listinfo/moosefs-users
>
> --
> Regards,
> Jakub Kruszona-Zawadzki
> - - - - - - - - - - - - - - - -
> Segmentation fault (core dumped)
> Phone: +48 602 212 039
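
For anyone applying the overcommit change quoted above, here is a minimal sketch (not from the thread itself) for checking the current mode before changing it. The helper name describe_overcommit and the OVERCOMMIT_FILE variable are hypothetical, introduced only for illustration. The meanings of modes 0, 1 and 2 are those given in the Linux kernel's overcommit documentation. The lines that change the setting are left commented out because they need root. Note that a plain echo into /proc does not survive a reboot, so a sysctl.conf entry is usually added as well.

```shell
#!/bin/sh
# Hypothetical helper: map a vm.overcommit_memory value to a short description.
describe_overcommit() {
  case "$1" in
    0) echo "heuristic overcommit (kernel default)" ;;
    1) echo "always overcommit" ;;
    2) echo "strict accounting, no overcommit" ;;
    *) echo "unknown" ;;
  esac
}

# Path is overridable so the check can be tried against a test file (assumption).
OVERCOMMIT_FILE="${OVERCOMMIT_FILE:-/proc/sys/vm/overcommit_memory}"

# Report the current mode if the file is readable on this system.
if [ -r "$OVERCOMMIT_FILE" ]; then
  mode=$(cat "$OVERCOMMIT_FILE")
  echo "vm.overcommit_memory = $mode ($(describe_overcommit "$mode"))"
fi

# To switch to mode 1 now and keep it across reboots (run as root):
# sysctl -w vm.overcommit_memory=1
# echo "vm.overcommit_memory = 1" >> /etc/sysctl.conf
```

If the reported mode is already 1 and the master still logs "fork error ... ENOMEM", the cause is probably something other than the overcommit check described in the thread.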