From: Davies L. <dav...@gm...> - 2015-07-24 06:24:51
|
This makes more sense: fork() failed, so the snapshot is done by the single master process.

On Thu, Jul 23, 2015 at 10:29 PM, Jakub Kruszona-Zawadzki <jak...@ge...> wrote:
> This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so usually both processes share most of their memory). To "fix" this you can change this behaviour to "classic" using this command (as root):
>
> echo "1" > /proc/sys/vm/overcommit_memory
>
> On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb...> wrote:
> > MFS version: 2.0.72 Community Edition
> > OS on master, chunkserver, dataserver and client: CentOS 5.9 x64
> >
> > Problem description:
> > With tens of millions of files, the MFS cluster master stops responding for 1-2 minutes at the top of every hour. Memory, CPU, disk and network monitoring on the master all look normal. We started on version 1.6.25 and suspected a bug in the software itself, then upgraded to the 2.0.72 Community Edition, but the problem persists. Error logs from the top of the hour:
> >
> > Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
> > Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
> > Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
> >
> > 刘亚磊 (Liu Yalei) | 买卖宝信息技术有限公司 (Maimaibao Information Technology Co., Ltd.), 3/F Tower C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012) | Direct: (86) 10 56716100-8995 | Email: liu...@eb... | Mobile: (86) 18801039545
>
> -- Regards, Jakub Kruszona-Zawadzki - - - - - - - - - - - - - - - - Segmentation fault (core dumped) Phone: +48 602 212 039
-- - Davies |
From: 刘亚磊 <liu...@eb...> - 2015-07-24 06:20:38
|
Hello,

Following your suggestion I changed the kernel setting and that problem is solved. But now there is a new issue: at the top of the hour the master reports

Jul 23 20:01:16 mfsmaster1 mfsmaster[22443]: main master server module: (ip:192.168.1.46) write error: EPIPE (Broken pipe)

刘亚磊 (Liu Yalei) | 买卖宝信息技术有限公司 (Maimaibao Information Technology Co., Ltd.), 3/F Tower C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012) | Direct: (86) 10 56716100-8995 | Email: liu...@eb... | Mobile: (86) 18801039545

From: Jakub Kruszona-Zawadzki
Sent: 2015-07-24 13:29
To: 刘亚磊
Cc: moosefs-users
Subject: Re: [MooseFS-Users] mfs_master loses responsiveness at the top of the hour

This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so usually both processes share most of their memory). To "fix" this you can change this behaviour to "classic" using this command (as root):

echo "1" > /proc/sys/vm/overcommit_memory

On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb...> wrote:

MFS version: 2.0.72 Community Edition
OS on master, chunkserver, dataserver and client: CentOS 5.9 x64

Problem description:
With tens of millions of files, the MFS cluster master stops responding for 1-2 minutes at the top of every hour. Memory, CPU, disk and network monitoring on the master all look normal. We started on version 1.6.25 and suspected a bug in the software itself, then upgraded to the 2.0.72 Community Edition, but the problem persists. Error logs from the top of the hour:

Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)

-- Regards, Jakub Kruszona-Zawadzki - - - - - - - - - - - - - - - - Segmentation fault (core dumped) Phone: +48 602 212 039 |
From: Jakub Kruszona-Z. <jak...@ge...> - 2015-07-24 06:09:15
|
This is caused by the check for available memory in Linux. Before a "fork", Linux checks whether there is enough memory for "two" copies of the forking process (which is rather stupid, because memory is duplicated in COW mode, so usually both processes share most of their memory). To "fix" this you can change this behaviour to "classic" using this command (as root):

echo "1" > /proc/sys/vm/overcommit_memory

On 22 Jul, 2015, at 3:27, 刘亚磊 <liu...@eb...> wrote:
> MFS version: 2.0.72 Community Edition
> OS on master, chunkserver, dataserver and client: CentOS 5.9 x64
>
> Problem description:
> With tens of millions of files, the MFS cluster master stops responding for 1-2 minutes at the top of every hour. Memory, CPU, disk and network monitoring on the master all look normal. We started on version 1.6.25 and suspected a bug in the software itself, then upgraded to the 2.0.72 Community Edition, but the problem persists. Error logs from the top of the hour:
>
> Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
>
> 刘亚磊 (Liu Yalei) | 买卖宝信息技术有限公司 (Maimaibao Information Technology Co., Ltd.), 3/F Tower C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012) | Direct: (86) 10 56716100-8995 | Email: liu...@eb... | Mobile: (86) 18801039545

-- Regards, Jakub Kruszona-Zawadzki - - - - - - - - - - - - - - - - Segmentation fault (core dumped) Phone: +48 602 212 039 |
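The echo above only lasts until the next reboot; to make the overcommit change permanent, the usual sysctl route is a reasonable sketch (the config file location can differ between distributions):
-------------------------------------------------------------------
sysctl -w vm.overcommit_memory=1                  # apply immediately (same effect as the echo above)
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
sysctl -p                                         # re-read the file now; it is also applied at boot
-------------------------------------------------------------------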
From: Davies L. <dav...@gm...> - 2015-07-23 20:30:27
|
At the top of every hour the master forks a child process to snapshot the in-memory metadata to disk. If the data set is small or the disk is fast, this does not affect the master's responsiveness at all. But once the data set is large or the disk is busy (and the master is still handling a lot of requests), the snapshot-writing process keeps the disk busy, and the other master process ends up blocked while writing the changelog. The way to improve this is a better disk (an SSD) or more memory (so the freshly written snapshot does not have to be flushed to disk immediately).

2015-07-21 18:27 GMT-07:00 刘亚磊 <liu...@eb...>:
> MFS version: 2.0.72 Community Edition
> OS on master, chunkserver, dataserver and client: CentOS 5.9 x64
>
> Problem description:
> With tens of millions of files, the MFS cluster master stops responding for 1-2 minutes at the top of every hour. Memory, CPU, disk and network monitoring on the master all look normal. We started on version 1.6.25 and suspected a bug in the software itself, then upgraded to the 2.0.72 Community Edition, but the problem persists. Error logs from the top of the hour:
>
> Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
> Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)
>
> 刘亚磊 (Liu Yalei) | 买卖宝信息技术有限公司 (Maimaibao Information Technology Co., Ltd.), 3/F Tower C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012) | Direct: (86) 10 56716100-8995 | Email: liu...@eb... | Mobile: (86) 18801039545
-- - Davies |
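To verify that the hourly metadata dump really is what saturates the disk, watching the metadata disk around the top of the hour is a quick check; a rough sketch using iostat from the sysstat package (the device name is only an example - use whatever holds /var/lib/mfs):
-------------------------------------------------------------------
# sample extended device stats every 60 s, 10 times, spanning the full hour;
# a %util column pinned near 100 while the dump runs confirms the bottleneck
iostat -dx 60 10 sda
-------------------------------------------------------------------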
From: Bernd B. <b....@ho...> - 2015-07-23 18:27:24
|
Hello together,

First of all, thanks to sales I was able to get a test licence for the Pro version 3.0.38. I set up 2 servers as combined master and chunkserver with 5 disks each, from scratch (no metalogger). I have 3 Proxmox servers which I set up with moosefs-pro clients. I changed my DNS server so that it returns:
-------------------------------------------------------------------
root@proxmox-4:~# host mfsmaster
mfsmaster.holde.lan has address 10.120.3.162
mfsmaster.holde.lan has address 10.120.3.161
-------------------------------------------------------------------
In this setup the system works as expected, with very good performance. If I stop the leader with "mfsmaster stop", the follower is promoted to leader. If I start the old leader again with "mfsmaster start", that machine is demoted to follower - so far so good and expected.

What I don't understand is the following behaviour. Status before the test:
------------------------------------------------------------------------------
root@mfsmaster1:~# mfssupervisor
1 (10.120.3.161:9419) : FOLLOWER (leader ip: 10.120.3.162 ; meta version: 126865 ; synchronized ; duration: 1511)
2 (10.120.3.162:9419) : LEADER (meta version: 126865 ; duration: 1504)
------------------------------------------------------------------------
mfsmaster1 is ip:xxx.161, mfsmaster2 is ip:xxx.162. If I power off the leader (machine 162), I get this result:
-------------------------------------------------------------------------------
root@mfsmaster1:~# mfssupervisor
10.120.3.162: connection timed out
master states:
1 (10.120.3.161:9419) : ELECT (meta version: 127149 ; synchronized ; duration: 96)
2 (10.120.3.162:9419) : DEAD
-------------------------------------------------------------------------------------
The CGI additionally shows the message:
------------------------------------------------------------------
Leader master server not found, but there is an elect, so make sure that all chnunkservers are running - elect should became a leader soon
------------------------------------------------------------------
It stays like this until the server is powered on again - there was no change for over 15 minutes. Then I powered up machine 162 again:
--------------------------------------------------------------------------------------------
1 (10.120.3.161:9419) : LEADER (meta version: 127253 ; duration: 63)
2 (10.120.3.162:9419) : FOLLOWER (leader ip: 10.120.3.161 ; meta version: 127253 ; synchronized ; duration: 69)
--------------------------------------------------------------------------------------------------------------
The same happens in both directions. I have the following questions about this:

1. Do I have to run the master software on hardware separate from the chunkservers, so that master and chunk servers are physically split systems?
2. If that is the case, how can I deal with a two-storage-room situation with two UPS systems, and master and chunkserver on the same UPS? Would I see the same behaviour if I lose one UPS system, and with it one server running the master service and one server running the chunk service?

I would be very thankful if somebody could give me a hint. Thanks in advance.

-- Kind regards, Bernd Burg ------------------------------------------------------------ HOLDE AG Zum Roppertsborn 14, 66646 Marpingen Telefon +49-6827-267988-0 Telefax +49-6827-267988-9 Email b....@ho... <mailto:kr...@ac...> Sitz der Gesellschaft: Marpingen AG Saarbrücken, HRB 101630 Ust-Id-Nr.: DE29460253 Vorstand: Dipl.-Ing. Bernd Burg Aufsichtsrat: Dipl.-Ing. Axel Gaus (Vorsitz) Dipl.-Ing. Andreas Krolzig Dipl.-Ing. Gabor Richter ------------------------------------------------------------ |
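For tests like the one Bernd describes, it can help to leave a simple polling loop running around the same mfssupervisor command, so that ELECT/LEADER transitions can later be lined up against chunkserver availability (the interval and log path are arbitrary):
-------------------------------------------------------------------
while true; do
    echo "---- $(date '+%F %T')"
    mfssupervisor 2>&1
    sleep 30
done >> /root/mfs-ha-states.log
-------------------------------------------------------------------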
From: F. O. O. <oz...@gm...> - 2015-07-22 18:17:46
|
On Wed, 22 Jul 2015 14:35:04 -0300 "Ricardo J. Barberis" <ric...@do...> wrote: > (This mail was supposed to go through last week during sourceForge's outage, > sorry if it arrives duplicate). > > El Jueves 16/07/2015, F. O. Ozbek escribió: > > On 07/15/2015 10:09 PM, Michael Tinsay wrote: > > > Is there a performance boost in increasing the chunkservers and goal? > > > Adding an additional chunkserver would be more expensive than doubling > > > the current storage space; and I still have enough drive bays in each > > > chunkserver for this. So at the moment, I can only justify spending for > > > an additional server if there is a significant performance improvement. > > > > > > It is very gratifying to see that upgrading to 2.x would greatly reduce > > > my concern. So I guess I'll have to stick it out with ext4 until btrfs > > > is tested more -- I had several bad experiences (fs/data corruption) > > > with xfs in the past that made me stay away from it for good. > > > > We have used xfs on several servers with couple hundred TBs of actual > > data with no problems. I don't know how long ago you had > > bad experiences with xfs but it is probably worth trying again. > > > > Even in the case of an underlying filesystem corruption, > > keep in mind that MooseFS has built-in error detection and > > self healing capabilities. If it detects an error with a chunk > > it will recreate it from a safe copy of that chunk. > > And in case a filesystem check is needed, xfs_repair is much faster (if a > memory hog) than e2fsck in my experience, which is a plus with filesystems > over 1TB. > > But then again, I've only been using XFS with CentOS 7 for about a year or so, > always used ext3/4 with CentOS 5/6, Ubuntu, Debian, Slackware, etc. We used xfs with ubuntu for the last 4 years exclusively. No problems at all, it is rock solid. -- F. Ozbek > > Regards, > -- > Ricardo J. Barberis > Senior SysAdmin / IT Architect > DonWeb > La Actitud Es Todo > www.DonWeb.com > > ------------------------------------------------------------------------------ > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users -- F. O. Ozbek <oz...@gm...> |
From: Ricardo J. B. <ric...@do...> - 2015-07-22 17:35:26
|
(This mail was supposed to go through last week during sourceForge's outage, sorry if it arrives duplicate). El Jueves 16/07/2015, F. O. Ozbek escribió: > On 07/15/2015 10:09 PM, Michael Tinsay wrote: > > Is there a performance boost in increasing the chunkservers and goal? > > Adding an additional chunkserver would be more expensive than doubling > > the current storage space; and I still have enough drive bays in each > > chunkserver for this. So at the moment, I can only justify spending for > > an additional server if there is a significant performance improvement. > > > > It is very gratifying to see that upgrading to 2.x would greatly reduce > > my concern. So I guess I'll have to stick it out with ext4 until btrfs > > is tested more -- I had several bad experiences (fs/data corruption) > > with xfs in the past that made me stay away from it for good. > > We have used xfs on several servers with couple hundred TBs of actual > data with no problems. I don't know how long ago you had > bad experiences with xfs but it is probably worth trying again. > > Even in the case of an underlying filesystem corruption, > keep in mind that MooseFS has built-in error detection and > self healing capabilities. If it detects an error with a chunk > it will recreate it from a safe copy of that chunk. And in case a filesystem check is needed, xfs_repair is much faster (if a memory hog) than e2fsck in my experience, which is a plus with filesystems over 1TB. But then again, I've only been using XFS with CentOS 7 for about a year or so, always used ext3/4 with CentOS 5/6, Ubuntu, Debian, Slackware, etc. Regards, -- Ricardo J. Barberis Senior SysAdmin / IT Architect DonWeb La Actitud Es Todo www.DonWeb.com |
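For anyone who has not run it before, xfs_repair works on an unmounted device and supports a dry run first; a short sketch with an example device name:
-------------------------------------------------------------------
umount /dev/sdb1              # the filesystem must not be mounted
xfs_repair -n /dev/sdb1       # no-modify mode: report problems without touching the disk
xfs_repair /dev/sdb1          # actual repair; this is the step that can use a lot of RAM
-------------------------------------------------------------------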
From: 刘亚磊 <liu...@eb...> - 2015-07-22 01:52:52
|
MFS version: 2.0.72 Community Edition
OS on master, chunkserver, dataserver and client: CentOS 5.9 x64

Problem description:
With tens of millions of files, the MFS cluster master stops responding for 1-2 minutes at the top of every hour. Memory, CPU, disk and network monitoring on the master all look normal. We started on version 1.6.25 and suspected a bug in the software itself, then upgraded to the 2.0.72 Community Edition, but the problem persists. Below are the error logs from the top of the hour:

Jul 22 07:00:00 mfsmaster1 mfsmaster[22443]: fork error (store data in foreground - it will block master for a while): ENOMEM (Cannot allocate memory)
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: csdb: found cs using ip:port and csid (192.168.1.82:9422,5), but server is still connected
Jul 22 07:01:47 mfsmaster1 mfsmaster[22443]: can't accept chunkserver (ip: 192.168.1.82 / port: 9422)

刘亚磊 (Liu Yalei) | 买卖宝信息技术有限公司 (Maimaibao Information Technology Co., Ltd.), 3/F Tower C, Aocheng Rongfu Center, Hongjunying South Road, Chaoyang District, Beijing (100012) | Direct: (86) 10 56716100-8995 | Email: liu...@eb... | Mobile: (86) 18801039545 |
From: F. O. O. <oz...@gm...> - 2015-07-16 15:00:41
|
On 07/15/2015 10:09 PM, Michael Tinsay wrote: > > Is there a performance boost in increasing the chunkservers and goal? > Adding an additional chunkserver would be more expensive than doubling > the current storage space; and I still have enough drive bays in each > chunkserver for this. So at the moment, I can only justify spending for > an additional server if there is a significant performance improvement. > > It is very gratifying to see that upgrading to 2.x would greatly reduce > my concern. So I guess I'll have to stick it out with ext4 until btrfs > is tested more -- I had several bad experiences (fs/data corruption) > with xfs in the past that made me stay away from it for good. We have used xfs on several servers with couple hundred TBs of actual data with no problems. I don't know how long ago you had bad experiences with xfs but it is probably worth trying again. Even in the case of an underlying filesystem corruption, keep in mind that MooseFS has built-in error detection and self healing capabilities. If it detects an error with a chunk it will recreate it from a safe copy of that chunk. > > Thanks for the insights Aleksander and Ricardo. > > Best Regards. > > Mike Tinsay > > > ------------------------------------------------------------------------ > Subject: Re: [MooseFS-Users] BTRFS > From: ale...@mo... > Date: Wed, 15 Jul 2015 23:39:33 +0200 > CC: moo...@li... > To: mic...@ho... > > Hi. > > In your situation the best idea is to add another chunk server and > change GOAL to 3 as it was mentioned by Ricardo J. Barberis. > Replication mechanism in MooseFS 2.0 and 3.0 is redesigned, and it works > much more efficient than in old 1.6 version. > So for example if you loose one disk(full 1TB) it will replicate in less > than 2 hours - 1TB RAID1 rebuild will take much much more time than this. > > Another problems: > - RAID1 is slower than JBOD in MooseFS > - Available space is lower when you are using RAID1 for chunkserver disk > than JBOD. > > About BTRFS - we have tested older version of BTFS an we had kernel > panic during tests. > We didn’t test the latest available release but we will in the nearest > future. > At this moment we recommend XFS as backend FS for chunkserver disks. > > Best regards > Aleksander Wieliczko > Technical Support Engineer > MooseFS.com <http://MooseFS.com> <moosefs.com <http://moosefs.com>> > > On 15 Jul 2015, at 19:17, Ricardo J. Barberis > <ric...@do... <mailto:ric...@do...>> > wrote: > > Maybe you could consider another chunkserver and a goal of 3? > > It's one more chunkserver but only 1.5x disk space compared to 2x with 2 > chunkservers and RAID1 (whether by software or hardware). > > El Miércoles 15/07/2015, Aleksander Wieliczko escribió: > > Hi Michael > > We had bad experience with BTFS so we are not advising to use > this fs as > backend for chunkserver HDD. > We recommend XFS because we achieved the best performance with > disk of > 90% usage. > > We have extra questions: > How big your MooseFS is - how many files you have? > What is your LAN speed? > What is the MooseFS version? > > Best regards > Aleksander Wieliczko > Technical Support Engineer > MooseFS.com <http://moosefs.com/><moosefs.com <http://moosefs.com/>> > > On 15.07.2015 09:24, Michael Tinsay wrote: > > Hi. > > Is anybody using btrfs for moosefs? > > I currently have a 2-chunkserver (using jbod, ext4 fs, and > goal=2 for > all files) setup running that has recently experienced an hd > failure > in one of the chunkservers. 
While replacing the failed hdd > was an > almost trivial task, there was a long span of time where a > big number > of chunks only had 1 copy. While I did not lose sleep over > it, I had > this concern of potentially losing chunks should there be > another hd > failure in the other chunkserver while moosefs is still > replicating > the 1-copy chunks. > > So now I'm entertaining the idea of moving from > just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the > underlying > storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS > kernel from kernel-ppa. > > I'm thinking that with such a setup I would still be up and > running > with 2 copies for all chunks even if both chunkservers have > 1 disk > failure each at the same time. And the setup would survive > a failed > chunkserver with 1 failed disk in the remaining chunkserver. > > I would like to hear your thoughts on this. > > Regards. > > > --- mike t. > > > ------------------------------------------------------------------------- > ----- Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and > support that > you need to offload your IT needs and focus on growing your > business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > > > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > > -- > Ricardo J. Barberis > Senior SysAdmin / IT Architect > DonWeb > La Actitud Es Todo > www.DonWeb.com <http://www.donweb.com/> > > ------------------------------------------------------------------------------ > Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and support that > you need to offload your IT needs and focus on growing your business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > _________________________________________ > moosefs-users mailing list > moo...@li... > <mailto:moo...@li...> > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > > ------------------------------------------------------------------------------ > Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and support that > you need to offload your IT needs and focus on growing your business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > > > > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
From: William K. <wk...@bn...> - 2015-07-16 03:15:36
|
ok, great. We will go to 2.x and wait until, you guys are comfortable recommending 3.x On 7/15/15 7:53 AM, Piotr Robert Konopelko wrote: > Hello WK, > > We are still testing MooseFS 3.0. We want to release a good piece of > software with no bugs. > > We officially don’t recommend to run it on production. > > I can say, that MooseFS 3.0 is being tested in our production > environment for some time - about 2 months (?). > > We are still testing Mounts in *controlled* production environment - > so it /is/ a production environment, but is something would go wrong, > it won’t hurt us :) > For now, I don’t recommend running - especially Mounts - on production. > > Anyway - we are very close to release MooseFS 3. > But - as you can see on moosefs.com <http://moosefs.com> - the new 3.0 > “current” version is released so frequently, and we still fix some > bugs (mainly in Mount). > > You can view list of changes in file “NEWS” after untarring the newest > sources tarball. It is available here: > https://moosefs.com/download/sources.html. > > > Best regards, > > -- > Piotr Robert Konopelko > *MooseFS Technical Support Engineer*| moosefs.com <https://moosefs.com> > >> On 15 Jul 2015, at 4:10 pm, WK <wk...@bn... >> <mailto:wk...@bn...>> wrote: >> >> A couple of months ago I concluded that the devs feel that 2.x is >> considered stable and 3.x is still newish. >> >> Is this still the case? >> >> We have a number of 1.6.27 clusters that we are going to be upgrading. >> >> We plan on building with all new kit and simply copying over the data, >> so inplace upgrade is not required for us. >> >> So given that this is a green field deployment should we go ahead and >> start with 3.x or stay with 2.x? >> >> -wk >> >> ------------------------------------------------------------------------------ >> Don't Limit Your Business. Reach for the Cloud. >> GigeNET's Cloud Solutions provide you with the tools and support that >> you need to offload your IT needs and focus on growing your business. >> Configured For All Businesses. Start Your Cloud Today. >> https://www.gigenetcloud.com/ >> _________________________________________ >> moosefs-users mailing list >> moo...@li... >> https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
From: Michael T. <mic...@ho...> - 2015-07-16 02:09:43
|
Is there a performance boost in increasing the chunkservers and goal? Adding an additional chunkserver would be more expensive than doubling the current storage space; and I still have enough drive bays in each chunkserver for this. So at the moment, I can only justify spending for an additional server if there is a significant performance improvement. It is very gratifying to see that upgrading to 2.x would greatly reduce my concern. So I guess I'll have to stick it out with ext4 until btrfs is tested more -- I had several bad experiences (fs/data corruption) with xfs in the past that made me stay away from it for good. Thanks for the insights Aleksander and Ricardo. Best Regards. Mike Tinsay Subject: Re: [MooseFS-Users] BTRFS From: ale...@mo... Date: Wed, 15 Jul 2015 23:39:33 +0200 CC: moo...@li... To: mic...@ho... Hi. In your situation the best idea is to add another chunk server and change GOAL to 3 as it was mentioned by Ricardo J. Barberis.Replication mechanism in MooseFS 2.0 and 3.0 is redesigned, and it works much more efficient than in old 1.6 version.So for example if you loose one disk(full 1TB) it will replicate in less than 2 hours - 1TB RAID1 rebuild will take much much more time than this. Another problems:- RAID1 is slower than JBOD in MooseFS- Available space is lower when you are using RAID1 for chunkserver disk than JBOD. About BTRFS - we have tested older version of BTFS an we had kernel panic during tests.We didn’t test the latest available release but we will in the nearest future.At this moment we recommend XFS as backend FS for chunkserver disks. Best regardsAleksander WieliczkoTechnical Support EngineerMooseFS.com <moosefs.com> On 15 Jul 2015, at 19:17, Ricardo J. Barberis <ric...@do...> wrote:Maybe you could consider another chunkserver and a goal of 3?It's one more chunkserver but only 1.5x disk space compared to 2x with 2 chunkservers and RAID1 (whether by software or hardware).El Miércoles 15/07/2015, Aleksander Wieliczko escribió:Hi Michael We had bad experience with BTFS so we are not advising to use this fs as backend for chunkserver HDD. We recommend XFS because we achieved the best performance with disk of 90% usage. We have extra questions: How big your MooseFS is - how many files you have? What is your LAN speed? What is the MooseFS version? Best regards Aleksander Wieliczko Technical Support Engineer MooseFS.com <moosefs.com> On 15.07.2015 09:24, Michael Tinsay wrote: Hi. Is anybody using btrfs for moosefs? I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for all files) setup running that has recently experienced an hd failure in one of the chunkservers. While replacing the failed hdd was an almost trivial task, there was a long span of time where a big number of chunks only had 1 copy. While I did not lose sleep over it, I had this concern of potentially losing chunks should there be another hd failure in the other chunkserver while moosefs is still replicating the 1-copy chunks. So now I'm entertaining the idea of moving from just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS kernel from kernel-ppa. I'm thinking that with such a setup I would still be up and running with 2 copies for all chunks even if both chunkservers have 1 disk failure each at the same time. And the setup would survive a failed chunkserver with 1 failed disk in the remaining chunkserver. I would like to hear your thoughts on this. Regards. --- mike t. 
------------------------------------------------------------------------- ----- Don't Limit Your Business. Reach for the Cloud. GigeNET's Cloud Solutions provide you with the tools and support that you need to offload your IT needs and focus on growing your business. Configured For All Businesses. Start Your Cloud Today. https://www.gigenetcloud.com/ _________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users -- Ricardo J. BarberisSenior SysAdmin / IT ArchitectDonWebLa Actitud Es Todowww.DonWeb.com------------------------------------------------------------------------------Don't Limit Your Business. Reach for the Cloud.GigeNET's Cloud Solutions provide you with the tools and support thatyou need to offload your IT needs and focus on growing your business.Configured For All Businesses. Start Your Cloud Today.https://www.gigenetcloud.com/_________________________________________moosefs-users mailing lis...@li...https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Aleksander W. <ale...@mo...> - 2015-07-15 21:40:11
|
Hi. In your situation the best idea is to add another chunk server and change GOAL to 3 as it was mentioned by Ricardo J. Barberis. Replication mechanism in MooseFS 2.0 and 3.0 is redesigned, and it works much more efficient than in old 1.6 version. So for example if you loose one disk(full 1TB) it will replicate in less than 2 hours - 1TB RAID1 rebuild will take much much more time than this. Another problems: - RAID1 is slower than JBOD in MooseFS - Available space is lower when you are using RAID1 for chunkserver disk than JBOD. About BTRFS - we have tested older version of BTFS an we had kernel panic during tests. We didn’t test the latest available release but we will in the nearest future. At this moment we recommend XFS as backend FS for chunkserver disks. Best regards Aleksander Wieliczko Technical Support Engineer MooseFS.com <moosefs.com> > On 15 Jul 2015, at 19:17, Ricardo J. Barberis <ric...@do...> wrote: > > Maybe you could consider another chunkserver and a goal of 3? > > It's one more chunkserver but only 1.5x disk space compared to 2x with 2 > chunkservers and RAID1 (whether by software or hardware). > > El Miércoles 15/07/2015, Aleksander Wieliczko escribió: >> Hi Michael >> >> We had bad experience with BTFS so we are not advising to use this fs as >> backend for chunkserver HDD. >> We recommend XFS because we achieved the best performance with disk of >> 90% usage. >> >> We have extra questions: >> How big your MooseFS is - how many files you have? >> What is your LAN speed? >> What is the MooseFS version? >> >> Best regards >> Aleksander Wieliczko >> Technical Support Engineer >> MooseFS.com <http://moosefs.com/> <moosefs.com <http://moosefs.com/>> >> >> On 15.07.2015 09:24, Michael Tinsay wrote: >>> Hi. >>> >>> Is anybody using btrfs for moosefs? >>> >>> I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for >>> all files) setup running that has recently experienced an hd failure >>> in one of the chunkservers. While replacing the failed hdd was an >>> almost trivial task, there was a long span of time where a big number >>> of chunks only had 1 copy. While I did not lose sleep over it, I had >>> this concern of potentially losing chunks should there be another hd >>> failure in the other chunkserver while moosefs is still replicating >>> the 1-copy chunks. >>> >>> So now I'm entertaining the idea of moving from >>> just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying >>> storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS >>> kernel from kernel-ppa. >>> >>> I'm thinking that with such a setup I would still be up and running >>> with 2 copies for all chunks even if both chunkservers have 1 disk >>> failure each at the same time. And the setup would survive a failed >>> chunkserver with 1 failed disk in the remaining chunkserver. >>> >>> I would like to hear your thoughts on this. >>> >>> Regards. >>> >>> >>> --- mike t. >>> >>> >>> ------------------------------------------------------------------------- >>> ----- Don't Limit Your Business. Reach for the Cloud. >>> GigeNET's Cloud Solutions provide you with the tools and support that >>> you need to offload your IT needs and focus on growing your business. >>> Configured For All Businesses. Start Your Cloud Today. >>> https://www.gigenetcloud.com/ >>> >>> >>> _________________________________________ >>> moosefs-users mailing list >>> moo...@li... >>> https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > -- > Ricardo J. 
Barberis > Senior SysAdmin / IT Architect > DonWeb > La Actitud Es Todo > www.DonWeb.com <http://www.donweb.com/> > > ------------------------------------------------------------------------------ > Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and support that > you need to offload your IT needs and focus on growing your business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
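For completeness, raising the goal on an existing tree is done from a client mount; a minimal sketch assuming MooseFS 2.0 tools and a mount at /mnt/mfs (the path is an example):
-------------------------------------------------------------------
mfsgetgoal -r /mnt/mfs        # summary of current goals under the tree
mfssetgoal -r 3 /mnt/mfs      # set goal=3 recursively; the extra copies are created
                              # in the background once a third chunkserver is connected
-------------------------------------------------------------------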
From: Ricardo J. B. <ric...@do...> - 2015-07-15 17:39:34
|
Maybe you could consider another chunkserver and a goal of 3? It's one more chunkserver but only 1.5x disk space compared to 2x with 2 chunkservers and RAID1 (whether by software or hardware). El Miércoles 15/07/2015, Aleksander Wieliczko escribió: > Hi Michael > > We had bad experience with BTFS so we are not advising to use this fs as > backend for chunkserver HDD. > We recommend XFS because we achieved the best performance with disk of > 90% usage. > > We have extra questions: > How big your MooseFS is - how many files you have? > What is your LAN speed? > What is the MooseFS version? > > Best regards > Aleksander Wieliczko > Technical Support Engineer > MooseFS.com <moosefs.com> > > On 15.07.2015 09:24, Michael Tinsay wrote: > > Hi. > > > > Is anybody using btrfs for moosefs? > > > > I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for > > all files) setup running that has recently experienced an hd failure > > in one of the chunkservers. While replacing the failed hdd was an > > almost trivial task, there was a long span of time where a big number > > of chunks only had 1 copy. While I did not lose sleep over it, I had > > this concern of potentially losing chunks should there be another hd > > failure in the other chunkserver while moosefs is still replicating > > the 1-copy chunks. > > > > So now I'm entertaining the idea of moving from > > just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying > > storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS > > kernel from kernel-ppa. > > > > I'm thinking that with such a setup I would still be up and running > > with 2 copies for all chunks even if both chunkservers have 1 disk > > failure each at the same time. And the setup would survive a failed > > chunkserver with 1 failed disk in the remaining chunkserver. > > > > I would like to hear your thoughts on this. > > > > Regards. > > > > > > --- mike t. > > > > > > ------------------------------------------------------------------------- > >----- Don't Limit Your Business. Reach for the Cloud. > > GigeNET's Cloud Solutions provide you with the tools and support that > > you need to offload your IT needs and focus on growing your business. > > Configured For All Businesses. Start Your Cloud Today. > > https://www.gigenetcloud.com/ > > > > > > _________________________________________ > > moosefs-users mailing list > > moo...@li... > > https://lists.sourceforge.net/lists/listinfo/moosefs-users -- Ricardo J. Barberis Senior SysAdmin / IT Architect DonWeb La Actitud Es Todo www.DonWeb.com |
From: Michael T. <mic...@ho...> - 2015-07-15 15:40:15
|
Hi Aleksander. Can you share what the bad experience with btrfs was? As to your extra questions: 1. 13TB (95% full) per chunkserver, 17 million files (most are due to the zimbra mail server and samba file servers)2. Each chunkserver uses a bonded pair of 1Gbps Ethernet3. 1.6.26 -- will move to 2.x soon as the new servers arrived just yesterday --- mike t. Re: [MooseFS-Users] BTRFSFrom: Aleksander Wieliczko <aleksander.wieliczko@mo...> - 2015-07-15 13:16:04Attachments: Message as HTML Hi Michael We had bad experience with BTFS so we are not advising to use this fs as backend for chunkserver HDD. We recommend XFS because we achieved the best performance with disk of 90% usage. We have extra questions: How big your MooseFS is - how many files you have? What is your LAN speed? What is the MooseFS version? Best regards Aleksander Wieliczko Technical Support Engineer MooseFS.com <moosefs.com> On 15.07.2015 09:24, Michael Tinsay wrote: > Hi. > > Is anybody using btrfs for moosefs? > > I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for > all files) setup running that has recently experienced an hd failure > in one of the chunkservers. While replacing the failed hdd was an > almost trivial task, there was a long span of time where a big number > of chunks only had 1 copy. While I did not lose sleep over it, I had > this concern of potentially losing chunks should there be another hd > failure in the other chunkserver while moosefs is still replicating > the 1-copy chunks. > > So now I'm entertaining the idea of moving from > just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying > storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS > kernel from kernel-ppa. > > I'm thinking that with such a setup I would still be up and running > with 2 copies for all chunks even if both chunkservers have 1 disk > failure each at the same time. And the setup would survive a failed > chunkserver with 1 failed disk in the remaining chunkserver. > > I would like to hear your thoughts on this. > > Regards. > > > --- mike t. > > > ------------------------------------------------------------------------------ > Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and support that > you need to offload your IT needs and focus on growing your business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > > > _________________________________________ > moosefs-users mailing list > moosefs-users@... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Piotr R. K. <pio...@mo...> - 2015-07-15 14:54:22
|
Hello WK, We are still testing MooseFS 3.0. We want to release a good piece of software with no bugs. We officially don’t recommend to run it on production. I can say, that MooseFS 3.0 is being tested in our production environment for some time - about 2 months (?). We are still testing Mounts in controlled production environment - so it is a production environment, but is something would go wrong, it won’t hurt us :) For now, I don’t recommend running - especially Mounts - on production. Anyway - we are very close to release MooseFS 3. But - as you can see on moosefs.com - the new 3.0 “current” version is released so frequently, and we still fix some bugs (mainly in Mount). You can view list of changes in file “NEWS” after untarring the newest sources tarball. It is available here: https://moosefs.com/download/sources.html <https://moosefs.com/download/sources.html>. Best regards, -- Piotr Robert Konopelko MooseFS Technical Support Engineer | moosefs.com <https://moosefs.com/> > On 15 Jul 2015, at 4:10 pm, WK <wk...@bn...> wrote: > > A couple of months ago I concluded that the devs feel that 2.x is > considered stable and 3.x is still newish. > > Is this still the case? > > We have a number of 1.6.27 clusters that we are going to be upgrading. > > We plan on building with all new kit and simply copying over the data, > so inplace upgrade is not required for us. > > So given that this is a green field deployment should we go ahead and > start with 3.x or stay with 2.x? > > -wk > > ------------------------------------------------------------------------------ > Don't Limit Your Business. Reach for the Cloud. > GigeNET's Cloud Solutions provide you with the tools and support that > you need to offload your IT needs and focus on growing your business. > Configured For All Businesses. Start Your Cloud Today. > https://www.gigenetcloud.com/ > _________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: WK <wk...@bn...> - 2015-07-15 14:38:37
|
A couple of months ago I concluded that the devs feel that 2.x is considered stable and 3.x is still newish. Is this still the case? We have a number of 1.6.27 clusters that we are going to be upgrading. We plan on building with all new kit and simply copying over the data, so inplace upgrade is not required for us. So given that this is a green field deployment should we go ahead and start with 3.x or stay with 2.x? -wk |
From: Aleksander W. <ale...@mo...> - 2015-07-15 13:16:04
|
Hi Michael,

We had a bad experience with BTRFS, so we do not advise using this filesystem as the backend for chunkserver HDDs. We recommend XFS, because we achieved the best performance with it even with disks at 90% usage.

We have a few extra questions:
How big is your MooseFS installation - how many files do you have?
What is your LAN speed?
What is the MooseFS version?

Best regards
Aleksander Wieliczko
Technical Support Engineer
MooseFS.com <moosefs.com>

On 15.07.2015 09:24, Michael Tinsay wrote: > Hi. > > Is anybody using btrfs for moosefs? > > I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for all files) setup running that has recently experienced an hd failure in one of the chunkservers. While replacing the failed hdd was an almost trivial task, there was a long span of time where a big number of chunks only had 1 copy. While I did not lose sleep over it, I had this concern of potentially losing chunks should there be another hd failure in the other chunkserver while moosefs is still replicating the 1-copy chunks. > > So now I'm entertaining the idea of moving from just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS kernel from kernel-ppa. > > I'm thinking that with such a setup I would still be up and running with 2 copies for all chunks even if both chunkservers have 1 disk failure each at the same time. And the setup would survive a failed chunkserver with 1 failed disk in the remaining chunkserver. > > I would like to hear your thoughts on this. > > Regards. > > > --- mike t. |
From: Michael T. <mic...@ho...> - 2015-07-15 07:24:06
|
Hi. Is anybody using btrfs for moosefs? I currently have a 2-chunkserver (using jbod, ext4 fs, and goal=2 for all files) setup running that has recently experienced an hd failure in one of the chunkservers. While replacing the failed hdd was an almost trivial task, there was a long span of time where a big number of chunks only had 1 copy. While I did not lose sleep over it, I had this concern of potentially losing chunks should there be another hd failure in the other chunkserver while moosefs is still replicating the 1-copy chunks. So now I'm entertaining the idea of moving from just-a-bunch-of-ext4-disks to one-big-btrfs-raid1-fs as the underlying storage provider -- using Ubuntu 14.04 distro plus the 4.1.x LTS kernel from kernel-ppa. I'm thinking that with such a setup I would still be up and running with 2 copies for all chunks even if both chunkservers have 1 disk failure each at the same time. And the setup would survive a failed chunkserver with 1 failed disk in the remaining chunkserver. I would like to hear your thoughts on this. Regards. --- mike t. |
From: Tru H. <tr...@pa...> - 2015-07-08 13:21:12
|
Hello Just a quick info on our upgrade from 1.6.x to 2.0.y on our 12 nodes setup last night. node: E31270/16GB of RAM/ 2x 3TB sata running CentOS-6/GBE mfs 1.6.27 1 master (no chunkserver on it), 2x metaloggers + chunkservers 9x chunkservers only version RAM used total space avail space trash space trash files reserved space reserved files all fs objects directories files chunks all chunk copies regular chunk copies 1.6.27 14 GiB 59 TiB 9.3 TiB 35 GiB 31449 0 B 0 40289245 2149664 36887910 36610408 93643754 93643754 1) I followed the manual and failed once on the test system because I missed the /var/mfs -> /var/lib/mfs part on the chunkservers causing the complete wipeout of the test cluster... Lesson learned for the production cluster. 2) keep a copy of the previous setup cd /var && sudo rsync -aPSHv mfs mfs-1.6.27.20150707 cd /etc && sudo rsync -aPSHv mfs mfs-1.6.27.20150707 cd /etc/mfs; grep -v '^#' mfsmetalogger.cfg |grep -v ^$ grep -v '^#' mfsmetalogger.cfg.dist |grep -v ^$ -> compare and fix -> loop for all the .cfg files -> loop on master/metalogger/chunkserver or use your favorite management tool (puppet/ansible/cfgengine/scripts/...) 3) mfs -> moosefs name change: don't forget to add/activate cd /etc/init.d && for i in *moosefs*; do sudo /sbin/chkconfig --add $i; sudo /sbin/chkconfig $i on ; done 4) /var/mfs and daemon:daemon -> /var/lib/mfs and mfs:mfs sudo mv /var/mfs/* /var/lib/mfs sudo chown mfs:mfs /var/lib/mfs/* sudo chown -R mfs:mfs /where/your/chunks/lives 5) sizing: typical chunkserver: $ df -THlP Filesystem Type Size Used Avail Use% Mounted on /dev/md1 ext4 17G 1.6G 15G 10% / tmpfs tmpfs 8.4G 0 8.4G 0% /dev/shm /dev/md0 ext4 1.1G 52M 935M 6% /boot /dev/sda4 xfs 3.0T 2.9T 139G 96% /sda4 /dev/sdb4 xfs 3.0T 2.9T 142G 96% /sdb4 -> I should have made a larger / to accomodate the metadata and backup metalogger/chunkserver: $ df -THlP Filesystem Type Size Used Avail Use% Mounted on /dev/md1 ext4 17G 11G 5.2G 68% / tmpfs tmpfs 8.4G 0 8.4G 0% /dev/shm /dev/md0 ext4 1.1G 52M 936M 6% /boot /dev/sda4 xfs 3.0T 2.9T 153G 95% /sda4 /dev/sdb4 xfs 3.0T 2.9T 156G 95% /sdb4 $ du -sh /var/lib/mfs 8.0G /var/lib/mfs $ ls -ld /var/lib/mfs/* -rw-r-----. 1 mfs mfs 0 Jul 8 04:52 /var/lib/mfs/changelog_ml_back.0.mfs -rw-r-----. 1 mfs mfs 0 Jul 8 04:52 /var/lib/mfs/changelog_ml_back.1.mfs -rw-r-----. 1 mfs mfs 10 Jul 7 23:39 /var/lib/mfs/chunkserverid.mfs -rw-r-----. 1 mfs mfs 4066348 Jul 8 14:00 /var/lib/mfs/csstats.mfs -rw-r-----. 1 mfs mfs 4464630776 Jul 8 04:52 /var/lib/mfs/metadata_ml.mfs.back -rw-r-----. 1 mfs mfs 4056189018 Jul 7 23:48 /var/lib/mfs/metadata_ml.mfs.back.1 -rw-r-----. 1 mfs mfs 715 Jul 7 18:41 /var/lib/mfs/sessions_ml.mfs 6) zfs VS xfs... on basic sata spinning hds: don't ! I had a chunkserver using zfs 2 zpool, one on each hard drive. The upgrade on that node did not complete, because the chown took too much time. After waiting more than one hour more than the other chunkservers, I wiped the chunkserver storage (/bin/rm /var/lib/mfs/* and reformated the zfs -> xfs). moosefs 2.0.68-1.rhsysv 1 master (no chunkserver on it), 2x metaloggers + chunkservers 10 chunkservers only 7) poweron: start moosefs-master etc and all was back with one chunkserver missing (2 disks wiped). I also added a new chunkserver and the rebalance is now in progress on the 13 chunkservers. 
Metadata Servers (masters) # ip version state metadata version RAM used CPU used last successful metadata save last metadata save duration last metadata save status 1 157.99.xx.xx 2.0.68 - 532 256 493 12 GiB all:10.81% sys:0.24% user:10.57% Wed Jul 8 15:00:30 2015 ~30.0s Saved in background total space avail space trash space trash files sustained space sustained files all fs objects directories files chunks all chunk copies regular chunk copies 62 TiB 14 TiB 0 B 0 0 B 0 42416141 2235378 38923409 38666666 95746530 90549037 The memory footprint is lower: 12 GiB now (14GiB before) 8) lost files? Although when I stopped the 1.6.x version, all files were ok (goals=2 or 3), wiping one full chunkserver with 2 drives resulted on 3 missing files. I would suspect that these files were not properly stored, ie rotten chunks and not because of the upgrade. Missing files (gathered by previous file-loop) # paths inode index chunk id 1 c6/shared/boost/1.55.0/python-2.6/include/boost/range/algorithm/partial_sort_copy.hpp 37463127 0 000000000212CA0F 2 c6/shared/molprobity/4.1-537/sources/cctbx_project/smtbx/refinement/constraints/tests/thpp.hkl 37642050 0 00000000021580A2 3 pub-legacy/linux/compilers/intel/11.1/056/l_cproc_p_11.1.056_ia32.tgz 33022764 7 0000000001CE5B59 9) cgi: no longer work with elinks :( The previous version cgi could be used with: elinks http://mfsmaster/cgi-bin/mfs/mfs.cgi now elinks http://mfsmaster:9425 is no longer usable :( Hoping that helps someone Cheers Tru -- Dr Tru Huynh | http://www.pasteur.fr/research/bis mailto:tr...@pa... | tel/fax +33 1 45 68 87 37/19 Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France |
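One possible way to script the per-file comparison Tru sketches in step 2, assuming every active .cfg still has its matching .cfg.dist next to it:
-------------------------------------------------------------------
cd /etc/mfs
for f in *.cfg; do
    echo "== $f"
    diff <(grep -vE '^#|^$' "$f") <(grep -vE '^#|^$' "$f.dist") && echo "   (only defaults in use)"
done
-------------------------------------------------------------------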
From: Piotr R. K. <pio...@mo...> - 2015-06-25 16:42:11
|
Hello, MooseFS 1.6 is an outdated version that is no longer supported. It has a lot of bugs that are fixed in the next releases, 2.0 and 3.0. Versions 2.0 and 3.0 also bring many algorithmic improvements, which make MooseFS more stable and efficient. We strongly recommend upgrading to MooseFS 2.0, which is a stable release. You can find the complete upgrade guide here: https://moosefs.com/Content/Downloads/moosefs-upgrade.pdf <https://moosefs.com/Content/Downloads/moosefs-upgrade.pdf> Please also check all documentation: https://moosefs.com/documentation/moosefs-2-0.html <https://moosefs.com/documentation/moosefs-2-0.html> Upgrading is a simple process - you mainly stop the service, install the new version of the package, check the old configs, make the appropriate changes in the new config files and start the service. (We have an official repo now: https://moosefs.com/download/centosfedorarhel.html <https://moosefs.com/download/centosfedorarhel.html>) Your data stays untouched. In any case, after stopping the MFS Master please copy /var/lib/metadata.mfs (on the master) to a safe place. If you have any questions, please don't hesitate to ask. Best regards, -- Piotr Robert Konopelko MooseFS Technical Support Engineer | moosefs.com <https://moosefs.com/> > On 25 Jun 2015, at 6:28 pm, pu...@go... <pu...@gm...> wrote: > > Hi, > I've got a pair of servers running mfs-1.6.26-1.el6.rf.x86_64 > for 10 months now, one is head/chunkserver/mfs-client, the other is chunkserver/head standby/metalogger. > Worked fine until the CentOS update: > just a few weeks ago the servers were upgraded > from CentOS 6.4 to 6.6, and since then there is a problem with the > memory consumption of the mfsmount/FUSE. > Strange facts: kernel was NOT updated (due to a problem with DRBD-module) > the FUSE-libs were also not updated, mfs-package is also the same as before. > But only within 24 hours, sometimes 5 days, the memory for the mfsmount goes > up to the max of the server's RAM, then is killed by the kernel (OOM). > > Watching the PID of mfsmount with pmap showed that it allocates more memory over > time, as the number of allocated memory-chunks is increasing: > watch "pmap $(pidof mfsmount)|grep anon|wc -l" > In just 4 hours, this number goes up from 100 to 400, while allocating 4G memory. > > Any ideas? > > Kernel: > 2.6.32-431.23.3.el6.x86_64 #1 SMP Thu Jul 31 17:20:51 UTC 2014 > > FUSE: > fuse-libs-2.8.3-4.el6.x86_64 > fuse-2.8.3-4.el6.x86_64 > > mount: > mfsmount /home fuse mfsmaster=mfsmaster,_netdev,noatime,nodev,nosuid,noexec 0 0 |
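Condensed, the master-side part of such an upgrade looks roughly like this (package names assumed from the official RHEL/CentOS repository, metadata path from a default install - adjust to your layout):
-------------------------------------------------------------------
mfsmaster stop
cp -a /var/lib/mfs/metadata.mfs /root/metadata.mfs.pre-upgrade   # the safety copy mentioned above
yum install moosefs-master moosefs-cgi moosefs-cgiserv           # from the moosefs.com repo
# compare /etc/mfs/*.cfg with the new *.cfg.dist files before starting
mfsmaster start
-------------------------------------------------------------------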
From: <pu...@go...> - 2015-06-25 16:28:27
|
Hi, I've got a pair of servers running mfs-1.6.26-1.el6.rf.x86_64 for 10 months now, one is head/chunkserver/mfs-client, the other is chunkserver/head standby/metalogger. Worked fine until the CentOS update: just a few weeks ago the servers were upgraded from CentOS 6.4 to 6.6, and since then there is a problem with the memory consumption of the mfsmount/FUSE. Strange facts: kernel was NOT updated (due to a problem with DRBD-module) the FUSE-libs were also not updated, mfs-package is also the same as before. But only within 24 hours, sometimes 5 days, the memory for the mfsmount goes up to the max of the server's RAM, then is killed by the kernel (OOM). Watching the PID of mfsmount with pmap showed that it allocates more memory over time, as the number of allocated memory-chunks is increasing: watch "pmap $(pidof mfsmount)|grep anon|wc -l" In just 4 hours, this number goes up from 100 to 400, while allocating 4G memory. Any ideas? Kernel: 2.6.32-431.23.3.el6.x86_64 #1 SMP Thu Jul 31 17:20:51 UTC 2014 FUSE: fuse-libs-2.8.3-4.el6.x86_64 fuse-2.8.3-4.el6.x86_64 mount: mfsmount /home fuse mfsmaster=mfsmaster,_netdev,noatime,nodev,nosuid,noexec 0 0 |
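To get a timeline rather than a live view, the same pmap check can be left logging in the background until the next OOM kill (the interval and log path are arbitrary; this assumes a single mfsmount process):
-------------------------------------------------------------------
while pid=$(pidof mfsmount); do
    echo "$(date '+%F %T') rss_kb=$(ps -o rss= -p $pid) anon_maps=$(pmap $pid | grep -c anon)"
    sleep 600
done >> /var/log/mfsmount-mem.log
-------------------------------------------------------------------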
From: Nathan M. <nat...@gm...> - 2015-06-23 01:46:11
|
Hello, Is adding erasure coding a possibility in the future of MooseFS? I'd like more granularity in the efficiency of redundancy than just 0% redundancy (goal=1), 100% redundancy (goal=2), 200% redundancy (goal=3), etc... Particularly something more space saving than goal=2 but more safe than goal=1, something like 5 out of 6 file chunks must exist on network for file to be reconstructed (120% redundancy). Is this likely or would it constrain performance too much? Even so I imagine it could be added and not used by the more performance sensitive users. And obviously if in use one would want to set goal=1 as erasure coding would make replication of chunks pointless. I realize there are other filesystems with erasure builtin, but they all seem to lack the right amount of polish, stability and ease of use/setup that MooseFS seems to have. Thanks |
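As a back-of-the-envelope comparison, the storage factor of a scheme is just total fragments divided by the fragments needed to reconstruct, so the 5-of-6 example stores about 1.2x the data versus 2x for goal=2; a quick illustration (the 8-of-10 line is only an extra hypothetical layout):
-------------------------------------------------------------------
awk 'BEGIN {
    printf "goal=2  (replication)   : %.2fx stored, %3.0f%% overhead\n", 2,     100*(2-1);
    printf "goal=3  (replication)   : %.2fx stored, %3.0f%% overhead\n", 3,     100*(3-1);
    printf "EC 5+1  (5 of 6 needed) : %.2fx stored, %3.0f%% overhead\n", 6/5,   100*(6/5-1);
    printf "EC 8+2  (8 of 10 needed): %.2fx stored, %3.0f%% overhead\n", 10/8,  100*(10/8-1);
}'
-------------------------------------------------------------------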
From: Nathan M. <nat...@gm...> - 2015-06-21 21:31:39
|
After some testing it seems that if I create a batch script and call it through bash C:\cygwin64\bin\mintty.exe /bin/bash -l -c /usr/sbin/mfschunkserver then it seems to work fine without the 100% CPU utilization. So this is ultimately probably not a MooseFS issue. Sorry for the false alarm, and I hope this helps someone else who might have the same issue, (if anyone else is trying to run MFS processes on Windows that is). On 6/20/2015 10:51 PM, Nathan Morrison wrote: > Hi, > > First off I really like MooseFS. Mostly becuase it actually works on > Windows (once compiled with and running through Cygwin). I'm not sure > if it was coded that way purposely or not but I appreciate it (even if > Client portion must be run on Linux/Mac/BSD). > > Anyway, my problem is that once I have compiled it and run it on > Windows 7 Pro it takes 100% of a processor's cycles. Since it seems > to take advantage of only one core it takes only 25% total CPU on say > a quad core CPU (thankfully). But this is still burdensome to the > client and wastes electricity if nothing else. Setting the nice level > in the config helps but I'm looking for a better solution if possible. > > I of course don't experience this on my linux or RasPi2 clients > (however those are also not self compiled versions either, I'm > apt-getting them from repo). > > Any ideas How to fix this? > |
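For reference, a minimal sketch of the wrapper approach described above (the script name and path are hypothetical): a tiny bash script that mintty launches through a login shell instead of spawning the daemon directly:
-------------------------------------------------------------------
#!/bin/bash
# /usr/local/bin/start-mfschunkserver.sh  (hypothetical path inside the Cygwin tree)
# launched from Windows with:
#   C:\cygwin64\bin\mintty.exe /bin/bash -l -c /usr/local/bin/start-mfschunkserver.sh
exec /usr/sbin/mfschunkserver
-------------------------------------------------------------------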
From: Nathan M. <nat...@gm...> - 2015-06-21 02:51:43
|
Hi, First off I really like MooseFS, mostly because it actually works on Windows (once compiled with and run through Cygwin). I'm not sure if it was coded that way purposely or not, but I appreciate it (even if the client portion must be run on Linux/Mac/BSD). Anyway, my problem is that once I have compiled it and run it on Windows 7 Pro, it takes 100% of a processor's cycles. Since it seems to take advantage of only one core, it takes only 25% total CPU on, say, a quad-core CPU (thankfully). But this is still burdensome to the client and wastes electricity if nothing else. Setting the nice level in the config helps, but I'm looking for a better solution if possible. I of course don't experience this on my Linux or RasPi2 clients (however, those are also not self-compiled versions; I'm apt-getting them from the repo). Any ideas how to fix this? |
From: Aleksander W. <ale...@mo...> - 2015-06-15 10:21:28
|
Hi. Is is possible to get some more infos from LOG file on this chunkserver? First of all please check: - if you have right entry in /etc/mfs/mfshdd.cfg for device mount point - if you can list folders in mounted ZFS device which was a chunk disk - if mouted device have right owner and group(default mfs:mfs) Best regards Aleksander Wieliczko Technical Support Engineer MooseFS.com <moosefs.com> On 15.06.2015 11:58, Ben Harker wrote: > > Aleksander - thanks for the reply, sorry to get back late I was away > for a couple of days - > > > the chunkserver in question is running mfschunkserver 3.0.22-1 - as > mentioned before, it's Ubuntu Server 14.04, ZFS, as a VM. there were a > pair, the other one's fine. > > > I'm quite happy to rebuild the box as a hardware unit, as that was the > plan all along - my main question is that the chunks still exist on > that drive, is there a semi-easy way for me to get them back into MFS, > maybe on another chunkserver, and therefore back into use? > > > when I start mfschunkserver I see > > > *scanning folder /Storage/mfschunks/: .chunkdb used - full scan don't > needed* > > > - is there maybe a way to force a scan of the chunks that it seems to > be ignoring? > > > many many thanks in advance :) > > > Benny. > > > Ben Harker > > Apple / Linux Specialist > > Barton Peveril College > Mob: 07525175882 > Tel: 02380367224 > Ext: 2731 > > > >>> Aleksander Wieliczko <ale...@mo...> 6/10/2015 > 9:30 PM >>> > > Hi. > What is the version of your MooseFS instalation? > > Best regards > Aleksander Wieliczko > Technical Support Engineer > > ------------------------------------------------------------------------ > *Od: *Ben Harker <mailto:bj...@ba...> > *Wysłano: *2015-06-10 22:15 > *Do: *moo...@li... > <mailto:moo...@li...> > *Temat: *[MooseFS-Users] Disappearing chunks on Ubuntu 14.04 chunk > server(ZFS) > > Hello all, afraid I'm out of office so unable to post any logs but I > did notice some oddness today; after a reboot, one of my chunkservers > (one of a pair) was not listing any of it's chunks on moosefs-cgi, > though the process was definitely running and the files were > definitely on the drive (...luckily, this is backups of non-critical > data!). > > Chunkservers are both running 1TB ZFS volumes on Ubuntu server VM's, > all mfs utils are current as of today. Most of the data had a goal of > 2 but some had a goal of 1, and that's the stuff I'm after. > > I've grep'd and tail'd through various logs but so far haven't come up > with anything indicating an issue, can anyone think of any simple > things I can check? > > ((If not, is there a way that I am able to get the chunks back into > MFS somehow? I'm still learning the ropes here...)) > > Many thanks in advance for any help. > > Ben. > ---------------------------------------------------------- > This message is sent in confidence for the addressee only. > It may contain confidential or sensitive information. > The contents are not to be disclosed to any one other than > the addressee. Unauthorised recipients are requested to > reserve this confidentiality and to advise us of any > errors in transmission. > > Barton Peveril College reserves the right to monitor > emails as defined in current legislation, > and regards this notice to you as notification of such a > possibility. > ---------------------------------------------------------- > > > > ------------------------------------------------------------------------------ > _________________________________________ > moosefs-users mailing list > moo...@li... 
> https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > ------------------------------------------------------------------------ > This message is sent in confidence for the addressee only. It may > contain confidential or sensitive information. The contents are not to > be disclosed to any one other than the addressee. Unauthorised > recipients are requested to preserve this confidentiality and to > advise us of any errors in transmission. Barton Peveril College > reserves the right to monitor emails as defined in current > legislation, and regards this notice to you as notification of such a > possibility. > ------------------------------------------------------------------------ |
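The checks above translate into a few quick commands on the affected chunkserver (paths taken from the scan message quoted in Ben's report; adjust to the real mfshdd.cfg entry):
-------------------------------------------------------------------
grep -v '^#' /etc/mfs/mfshdd.cfg         # the ZFS mount point should be listed here
ls -ld /Storage/mfschunks/               # should be owned by the chunkserver user (mfs:mfs by default)
ls /Storage/mfschunks/ | head            # the 00..FF chunk subdirectories should be visible
-------------------------------------------------------------------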