From: Elliot F. <efi...@gm...> - 2012-05-31 16:32:29
When you take a chunkserver offline, your active VMs keep modifying chunks within their VM image(s), but they can only modify the copies that are online. So when you bring the chunkserver back online, you have undergoal chunks, because the offline copy has not been modified. The copies that were offline are also now an older (i.e. wrong) version, which causes your version errors.

On Thu, May 24, 2012 at 1:56 PM, wkmail <wk...@bn...> wrote:

> We have a test MFS cluster (1.6.20) that we are hosting some utility/research VMs on, using Oracle VirtualBox, to see how well MFS handles VMs. All has gone well besides the expected disk I/O speed hit, which we can handle.
>
> However, we have TWICE now experienced this problem.
>
> We decided to replace one of the chunkservers with better gear. It had 2 x 1TB SATA drives and we usually remove them one at a time.
>
> We first marked drive #2 for removal using '*' in mfshdd.cfg and restarted the chunkserver process (i.e. ./chunkserver restart).
>
> The CGI showed the drive as marked for removal and it replicated off the chunks as normal. Two days later it was done. No problems or errors were observed.
>
> The following day, we then excluded drive #2, the now deprecated drive, by commenting it out in mfshdd.cfg and restarted the chunkserver again.
>
> During the scan, the CGI correctly reported the 650,000+ chunks or so as undergoal and the chunkserver was missing from the server list.
>
> When the chunkserver process finished scanning the single remaining drive, the CGI program reported that all chunkservers were back, but now showed 700+ chunks as undergoal (2/3) AND 700+ chunks as overgoal, which confused us.
>
> The MFS cluster then began to fix the undergoal chunks. During that period we began to see a lot of error 19 messages, which the source code identifies as Wrong Chunk Version.
>
> During the undergoal replication process, a number of the busier VMs reported hard drive errors and either went RO or panicked.
>
> When the undergoal chunk replication completed about an hour later, we ceased to have any problems and we are no longer seeing the error 19 messages.
>
> Does anyone have an explanation as to what is going on and what we can do to prevent the issue from re-occurring?
>
> Are disk images problematic on MFS?
>
> Below are the relevant log entries in the MFSmaster log showing us restarting the CS and the subsequent errors.
>
> Note: on the chunkservers themselves, each error in the master log has a corresponding error in the chunkserver log, e.g.:
>
> May 24 10:58:25 mfs7chunker7 mfschunkserver[1085]: replicator: got status: 19 from (C0A80016:24CE)
> May 24 10:58:26 mfs7chunker7 mfschunkserver[1085]: replicator: got status: 19 from (C0A80018:24CE)
>
> from /var/log/messages on the master.
> > May 24 10:27:45 mfs7master mfsmaster[32522]: connection with > CS(192.168.0.23) has been closed by peer > May 24 10:32:49 mfs7master mfsmaster[32522]: chunkserver register begin > (packet version: 5) - ip: 192.168.0.23, port: 9422 > May 24 10:33:37 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BA8A31 replication status: 19 > May 24 10:34:43 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B85FD0 replication status: 19 > May 24 10:35:05 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CF1D62 replication status: 19 > May 24 10:35:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C519DE replication status: 19 > May 24 10:35:34 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C61B66 replication status: 19 > May 24 10:36:05 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD75B2 replication status: 19 > May 24 10:36:19 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CA45CC replication status: 19 > May 24 10:36:57 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA189 replication status: 19 > May 24 10:37:03 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B12E19 replication status: 19 > May 24 10:37:19 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B4ED79 replication status: 19 > May 24 10:37:20 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA1A6 replication status: 19 > May 24 10:37:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 10:38:14 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000ADC1A1 replication status: 19 > May 24 10:38:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD7055 replication status: 19 > May 24 10:39:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9BE51 replication status: 19 > May 24 10:39:51 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A9BFA2 replication status: 19 > May 24 10:40:00 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CBA184 replication status: 19 > May 24 10:40:07 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CF45E5 replication status: 19 > May 24 10:40:31 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000AA4400 replication status: 19 > May 24 10:40:33 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995E4 replication status: 19 > May 24 10:40:40 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B25 replication status: 19 > May 24 10:40:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9CD0E replication status: 19 > May 24 10:40:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD788D replication status: 19 > May 24 10:41:13 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9C033 replication status: 19 > May 24 10:41:20 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:41:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD64E0 replication status: 19 > May 24 10:41:47 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD7605 replication status: 19 > May 24 10:42:10 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000BCB066 replication status: 19 > May 24 10:43:37 mfs7master mfsmaster[32522]: (192.168.0.26:9422) 
chunk: > 0000000000ADB285 replication status: 19 > May 24 10:43:38 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD78B6 replication status: 19 > May 24 10:43:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 10:43:43 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB3B0 replication status: 19 > May 24 10:44:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000BF1674 replication status: 19 > May 24 10:44:09 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CB5FC1 replication status: 19 > May 24 10:44:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000C790A4 replication status: 19 > May 24 10:44:22 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000ADC29C replication status: 19 > May 24 10:44:23 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD7E62 replication status: 19 > May 24 10:44:31 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD78F2 replication status: 19 > May 24 10:45:02 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CA4B55 replication status: 19 > May 24 10:45:13 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2F8BF replication status: 19 > May 24 10:45:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2FF5A replication status: 19 > May 24 10:45:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD6510 replication status: 19 > May 24 10:45:44 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9C007 replication status: 19 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CF1D2D replication status: 28 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA3FA7 replication status: 28 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000A9C41B replication status: 28 > May 24 10:46:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C78D16 replication status: 19 > May 24 10:46:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A9C156 replication status: 19 > May 24 10:46:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:46:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BA89D7 replication status: 19 > May 24 10:46:33 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD79CF replication status: 19 > May 24 10:46:40 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9CCB7 replication status: 19 > May 24 10:47:07 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9BEE6 replication status: 19 > May 24 10:47:36 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA458E replication status: 19 > May 24 10:47:44 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C7B9F1 replication status: 19 > May 24 10:47:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000ADB2AC replication status: 19 > May 24 10:48:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD645B replication status: 19 > May 24 10:48:01 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A995B9 replication status: 19 > May 24 10:48:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C2F955 replication status: 19 > May 24 10:48:19 
mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C8007A replication status: 19 > May 24 10:48:21 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C78CF4 replication status: 19 > May 24 10:48:38 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD78B6 replication status: 28 > May 24 10:48:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000BA8AA4 replication status: 28 > May 24 10:48:40 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CBA179 replication status: 28 > May 24 10:48:41 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B590BC replication status: 19 > May 24 10:48:47 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD7929 replication status: 19 > May 24 10:48:53 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BF1674 replication status: 19 > May 24 10:48:56 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2F8AA replication status: 19 > May 24 10:48:59 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9DAC9 replication status: 19 > May 24 10:49:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A996F3 replication status: 19 > May 24 10:49:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CB5FC1 replication status: 19 > May 24 10:49:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA1B2 replication status: 19 > May 24 10:49:36 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C790A4 replication status: 19 > May 24 10:49:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C7913D replication status: 19 > May 24 10:49:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B582B4 replication status: 19 > May 24 10:50:35 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CF1CE1 replication status: 19 > May 24 10:51:03 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C78D16 replication status: 19 > May 24 10:51:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C2F366 replication status: 19 > May 24 10:51:18 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:51:24 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9BED0 replication status: 19 > May 24 10:51:33 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A91554 replication status: 19 > May 24 10:51:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9154D replication status: 19 > May 24 10:52:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000B12E19 replication status: 19 > May 24 10:52:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000ADC2CD replication status: 19 > May 24 10:53:22 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 10:53:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C33575 replication status: 19 > May 24 10:54:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9C82A replication status: 19 > May 24 10:54:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B582B4 replication status: 19 > May 24 10:55:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C78EDD replication status: 19 > May 24 10:55:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B44 
replication status: 19 > May 24 10:55:58 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CB5FBE replication status: 19 > May 24 10:56:06 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BF6DF2 replication status: 19 > May 24 10:56:26 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 10:56:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD789C replication status: 19 > May 24 10:56:48 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD7837 replication status: 19 > May 24 10:56:58 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD7800 replication status: 19 > May 24 10:57:11 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9C09C replication status: 19 > May 24 10:57:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BE0F replication status: 19 > May 24 10:57:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 10:57:49 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9C00B replication status: 19 > May 24 10:57:56 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CBA179 replication status: 19 > May 24 10:58:08 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995B9 replication status: 19 > May 24 10:58:14 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C79138 replication status: 19 > May 24 10:58:23 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD707C replication status: 19 > May 24 10:58:25 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD710F replication status: 19 > May 24 10:58:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 10:58:55 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C68224 replication status: 19 > May 24 10:59:33 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CBA1B2 replication status: 19 > May 24 11:00:05 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000B3CB41 replication status: 19 > May 24 11:00:54 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000AA4400 replication status: 19 > May 24 11:01:17 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD7837 replication status: 19 > May 24 11:01:33 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000BA89D7 replication status: 19 > May 24 11:01:51 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C8375C replication status: 19 > May 24 11:02:22 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995AC replication status: 19 > May 24 11:02:28 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD78B4 replication status: 19 > May 24 11:02:32 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C615E8 replication status: 19 > May 24 11:02:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD64D7 replication status: 19 > May 24 11:02:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA196 replication status: 19 > May 24 11:02:58 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000BEC4A1 replication status: 19 > May 24 11:03:09 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C79138 replication status: 19 > May 24 11:03:18 mfs7master mfsmaster[32522]: 
(192.168.0.27:9422) chunk: > 0000000000A9C260 replication status: 19 > May 24 11:03:21 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB266 replication status: 19 > May 24 11:03:23 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 11:03:27 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C68224 replication status: 19 > May 24 11:03:30 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:03:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000A9CCC9 replication status: 19 > May 24 11:04:15 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD70D7 replication status: 19 > May 24 11:04:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CB5FB1 replication status: 19 > May 24 11:04:45 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD7DDF replication status: 19 > May 24 11:05:43 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C8375C replication status: 19 > May 24 11:05:50 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:06:01 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BF163B replication status: 19 > May 24 11:06:03 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A91545 replication status: 19 > May 24 11:06:08 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 11:06:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C7914A replication status: 19 > May 24 11:07:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 11:07:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:08:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000B4EDF3 replication status: 19 > May 24 11:08:21 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 11:08:28 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:08:35 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C7906F replication status: 19 > May 24 11:08:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD75C5 replication status: 19 > May 24 11:08:48 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 11:09:12 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD70D7 replication status: 19 > May 24 11:10:01 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CF1D62 replication status: 19 > May 24 11:10:07 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:10:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B578E9 replication status: 19 > May 24 11:10:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B44 replication status: 19 > May 24 11:11:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C7914A replication status: 19 > May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C7B551 replication status: 19 > May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD784C replication status: 19 > May 
24 11:12:35 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB266 replication status: 19 > May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:13:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 11:15:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:15:19 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000B578E9 replication status: 19 > May 24 11:15:55 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BF163B replication status: 19 > May 24 11:15:56 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 11:16:00 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:16:30 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C3A0E7 replication status: 19 > May 24 11:16:45 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CBA190 replication status: 19 > May 24 11:17:31 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 11:17:36 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:18:25 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:20:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:20:47 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:23:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:25:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:25:48 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:30:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:40:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA4588 replication status: 19 > > Note the undergoal replication process was completed about this time. > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
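To make the failure mode Elliot describes concrete: each chunk carries a version number that moves forward as it is rewritten, so a replica that sat on an offline disk while its chunk was modified comes back with a stale version and gets rejected during replication, which is the "replication status: 19" (identified by the original poster as Wrong Chunk Version) filling the logs above. Below is a minimal sketch of that check; the names, the struct layout and the STATUS_WRONG_VERSION constant are invented stand-ins for illustration, not the actual MooseFS source.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical status codes; the real error reported by MooseFS shows up as "19". */
enum { STATUS_OK = 0, STATUS_WRONG_VERSION = 19 };

struct replica {
    uint64_t chunkid;
    uint32_t version;   /* version stored on this chunkserver */
};

/* A replica that was offline while the chunk was rewritten keeps its old
 * version number, so it can no longer be used as a replication source. */
static int check_replica(const struct replica *r, uint32_t expected_version)
{
    if (r->version != expected_version) {
        fprintf(stderr, "chunk %016llX replication status: %d\n",
                (unsigned long long)r->chunkid, STATUS_WRONG_VERSION);
        return STATUS_WRONG_VERSION;
    }
    return STATUS_OK;
}

int main(void)
{
    struct replica stale = { 0xBA8A31ULL, 7 };  /* copy from the disk that was offline */
    check_replica(&stale, 8);                   /* master already expects version 8 */
    return 0;
}

Once replication finishes, only up-to-date copies remain and the messages stop, which matches what wkmail observed.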
From: yishi c. <hol...@gm...> - 2012-05-31 10:48:26
Hi,

I discovered a small bug in the master: when it cleans up a server entry that is tagged as KILL (eptr->mode==KILL), the matocuservhead list can become broken. However, as the mount and the chunkserver reconnect after a few seconds, the problem is not easily exposed. I don't know if this problem is already known? The patch code is listed below.

mfsmaster/matocuserv.c

@@ -3458,7 +3458,7 @@
 void matocuserv_serve(struct pollfd *pdesc) {
     uint32_t now=main_time();
-    matocuserventry *eptr,**kptr;
+    matocuserventry *eptr,**kptr,**wptr;
     packetstruct *pptr,*paptr;
     int ns;

@@ -3520,6 +3520,7 @@
         }
     }
     kptr = &matocuservhead;
+    wptr = &matocuservhead;
     while ((eptr=*kptr)) {
         if (eptr->mode == KILL) {
             matocu_beforedisconnect(eptr);

@@ -3536,11 +3537,19 @@
                 pptr = pptr->next;
                 free(paptr);
             }
+            if (eptr == matocuservhead) {
+                matocuservhead = eptr->next;
+                wptr = &matocuservhead;
+            }
+            else {
+                *wptr->next = eptr->next;
+            }
             *kptr = eptr->next;
             free(eptr);
         } else {
+            wptr = &eptr;
             kptr = &(eptr->next);
         }
     }
 }

mfsmaster/matocsserv.c

 void matocsserv_serve(struct poll_fd *pdesc) {
     uint32_t now=main_time();
     uint32_t peerip;
-    matocsserventry *eptr,**kptr;
+    matocsserventry *eptr,**kptr,**wptr;
     packetstruct *pptr,*paptr;
     int ns;

@@ -1753,6 +1753,7 @@
         }
     }
     kptr = &matocsservhead;
+    wptr = &matocsservhead;
     while ((eptr=*kptr)) {
         if (eptr->mode == KILL) {
             double us,ts;

@@ -1777,9 +1778,17 @@
             if (eptr->servstrip) {
                 free(eptr->servstrip);
             }
+            if (eptr == matocsservhead) {
+                matocsservhead = eptr->next;
+                wptr = &matocsservhead;
+            }
+            else {
+                *wptr->next = eptr->next;
+            }
             *kptr = eptr->next;
             free(eptr);
         } else {
+            wptr = &eptr;
             kptr = &(eptr->next);
         }
     }

mfsmaster/matomlserv.c

 void matomlserv_serve(struct pollfd *pdesc) {
     uint32_t now=main_time();
-    matomlserventry *eptr,**kptr;
+    matomlserventry *eptr,**kptr,**wptr;
     packetstruct *pptr,*paptr;
     int ns;

@@ -550,6 +550,7 @@
         }
     }
     kptr = &matomlservhead;
+    wptr = &matomlservhead;
     while ((eptr=*kptr)) {
         if (eptr->mode == KILL) {
             matomlserv_beforeclose(eptr);

@@ -569,9 +570,17 @@
             if (eptr->servstrip) {
                 free(eptr->servstrip);
             }
+            if (eptr == matomlservhead) {
+                matomlservhead = eptr->next;
+                wptr = &matomlservhead;
+            }
+            else {
+                *wptr->next = eptr->next;
+            }
             *kptr = eptr->next;
             free(eptr);
         } else {
+            wptr = &eptr;
             kptr = &(eptr->next);
         }
     }
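For context on the code the patch touches: the *_serve() loops walk a singly linked list of server entries and unlink the ones flagged KILL while iterating, and the patch adds a second cursor (wptr) alongside the existing pointer-to-pointer cursor (kptr). For readers unfamiliar with that idiom, here is a self-contained sketch of how a pointer-to-pointer cursor unlinks nodes during traversal; this is generic illustration code, not taken from MooseFS, and the entry layout is invented.

#include <stdio.h>
#include <stdlib.h>

struct entry {
    int mode;                 /* e.g. 0 = alive, 1 = KILL */
    struct entry *next;
};

/* Remove every entry whose mode == KILL from the list headed by *head.
 * kptr always points at the pointer that links the current entry into
 * the list, so unlinking is one assignment and the head pointer needs
 * no special case. */
static void purge_killed(struct entry **head)
{
    struct entry **kptr = head;
    struct entry *eptr;

    while ((eptr = *kptr) != NULL) {
        if (eptr->mode == 1 /* KILL */) {
            *kptr = eptr->next;   /* unlink the dead entry */
            free(eptr);
        } else {
            kptr = &eptr->next;   /* advance the cursor */
        }
    }
}

int main(void)
{
    /* build a tiny list: alive -> KILL -> alive */
    struct entry *c = malloc(sizeof *c); c->mode = 0; c->next = NULL;
    struct entry *b = malloc(sizeof *b); b->mode = 1; b->next = c;
    struct entry *a = malloc(sizeof *a); a->mode = 0; a->next = b;
    struct entry *head = a;

    purge_killed(&head);

    for (struct entry *p = head; p != NULL; p = p->next)
        printf("mode=%d\n", p->mode);   /* prints the two surviving entries */
    return 0;
}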
From: Boris E. <bor...@gm...> - 2012-05-30 12:55:49
On Wed, May 30, 2012 at 1:07 AM, Deon Cui <deo...@gm...> wrote:

> Hi mooseys,
>
> I have experienced something similar in the last few days over the weekend. It appears in my case that one of my chunk servers is experiencing a kernel panic every now and then and crashing.
>
> Upon reboot, ZFS was consistent but MooseFS reported a missing chunk. Luckily it has only been a tmp file, so the loss of it causes no harm. However, it is distressing to me that ZFS can "lose" the data, or that MooseFS is reporting to the application writing the data that it is committed to disk when it isn't.
>
> Boris, if MooseFS is reporting a missing chunk then it is likely your chunk servers have done something with them, like in my case.
>
> Regards,
> Deon

Deon, thanks! So what would that mean: that MooseFS loses files it cannot properly write to the underlying file system? If that is the case, is there a way to fight that problem? Can we, for instance, tell it to retry a certain number of times? Does anybody know if there are relevant settings, and if so, where they would be?

Boris.
From: Wang J. <jia...@re...> - 2012-05-30 06:30:28
On 2012/5/22 16:26, Michal Borychowski wrote:

> Hi Ken!
>
> Your solution is really interesting and promising for storing large amounts of small files.
>
> I uploaded several files to your demo and as I understand the demo operates on jpg files, but bundle could easily also store other formats such as .pdf, .png, etc.?
>
> Where do you keep meta information (size, offset) of the files? In some external database like MySQL or something?
>
> You say the files can be overwritten with bundle. Even if they have a different size? What happens with the old, unused space (there is a "hole" in the huge file)? Is it lost?
>
> What about write permissions? Is it still sth like u/g/o? Can the permissions be set separately per each small file or just by a huge file?
>
> One useful thing which you probably lose using such a solution is the "trash bin" per each of the small files.

There is metadata for each small file in its respective 'header'. This can be used to hold permissions. But I don't think this is necessary, because you need a gateway to access the small files, and you can do ACL there anyway. This is just file storage.

And, for a real application, deletion is rare, and it is expensive if you want to do it right. The simple solution is to mark the file as 'deleted' and keep it as is. The gateway application will know it and reject access. You can undelete it by unmarking it.

> Kind regards
> Michal
>
> -----Original Message-----
> From: Ken [mailto:ken...@gm...]
> Sent: Thursday, May 10, 2012 1:17 PM
> To: moosefs-users
> Subject: [Moosefs-users] bundle open source [was: Solution of small file store]
>
> hi, all
>
> As mentioned in a previous mail (http://sf.net/mailarchive/message.php?msg_id=29171206), we have now open sourced it - bundle:
>
> https://github.com/xiaonei/bundle
>
> The source is well tested and documented.
>
> Demo:
> http://60.29.242.206/demo.html
>
> Any ideas are appreciated.
>
> -Ken
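The answer above describes the layout: each small file inside the big bundle file is preceded by a per-file header that can carry metadata such as size and flags, deletion is handled by flipping a flag rather than reclaiming space, and the offset itself lives outside the bundle in whatever index the gateway keeps, which is what Michal's question about external metadata is getting at. A rough sketch of what such a record and a read path could look like; the field names and layout here are invented for illustration and are not the actual bundle format.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-file header stored inline in the big bundle file. */
struct bundle_header {
    uint32_t magic;       /* sanity check */
    uint32_t flags;       /* e.g. bit 0 = deleted */
    uint32_t size;        /* payload size in bytes */
    uint32_t reserved;
};

#define BUNDLE_DELETED 0x1u

/* Read one stored file given its offset inside the bundle.
 * Returns the number of payload bytes read, 0 if marked deleted, -1 on error. */
static long bundle_read(FILE *bundle, long offset, void *buf, size_t buflen)
{
    struct bundle_header h;

    if (fseek(bundle, offset, SEEK_SET) != 0)
        return -1;
    if (fread(&h, sizeof h, 1, bundle) != 1)
        return -1;
    if (h.flags & BUNDLE_DELETED)
        return 0;                      /* gateway rejects access to marked files */
    if (h.size > buflen)
        return -1;
    if (fread(buf, 1, h.size, bundle) != h.size)
        return -1;
    return (long)h.size;
}

Overwriting a file with a different size would, under this scheme, append a new record and update the external index, leaving a hole at the old offset, which is the trade-off Michal asks about.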
From: Deon C. <deo...@gm...> - 2012-05-30 05:08:06
Hi mooseys,

I have experienced something similar in the last few days over the weekend. It appears in my case that one of my chunk servers is experiencing a kernel panic every now and then and crashing.

Upon reboot, ZFS was consistent but MooseFS reported a missing chunk. Luckily it has only been a tmp file, so the loss of it causes no harm. However, it is distressing to me that ZFS can "lose" the data, or that MooseFS is reporting to the application writing the data that it is committed to disk when it isn't.

Boris, if MooseFS is reporting a missing chunk then it is likely your chunk servers have done something with them, like in my case.

Regards,
Deon

On Wed, May 30, 2012 at 10:00 AM, Boris Epstein <bor...@gm...> wrote:

> Hello listmates,
>
> I got 3 "unavailable" chunks listed on my MooseFS installation. Thus far I have not seen any indication that the chunk server (I only use one) has any disk-level failures.
>
> What other causes may there be for that problem?
>
> Thanks.
>
> Boris.
From: Boris E. <bor...@gm...> - 2012-05-29 22:00:49
Hello listmates,

I got 3 "unavailable" chunks listed on my MooseFS installation. Thus far I have not seen any indication that the chunk server (I only use one) has any disk-level failures.

What other causes may there be for that problem?

Thanks.

Boris.
From: Kristofer <kri...@cy...> - 2012-05-27 18:10:55
I had to compile MooseFS manually on Ubuntu, but after installing the FUSE libraries (and the development package) using apt-get, it was very easy to compile MooseFS.

On 05/25/2012 11:44 AM, Boris Epstein wrote:

> Hello listmates,
>
> Are there precompiled/standard modules for Ubuntu Linux?
>
> Thanks.
>
> Boris.
From: Travis H. <tra...@tr...> - 2012-05-25 21:44:14
No, it is not possible to have the master server "proxy" the communications to all of the chunk servers through a single public IP address connection on the master. This is by design of the cluster nature of the file system. I have found in practice that the "chatter" between the mfsmount client, master server, and chunk servers does not perform as well as

What works well to achieve this kind of "single node" / single IP address point of access is to use a different network file sharing protocol, such as CIFS (Samba) or AFP (netatalk) services, or SSHFS (a FUSE-based file system mount over SSH). I have had good success with the latter as it works well with my existing SSH access to the site.

The idea is that you create a gateway machine that runs these service(s), with both public and internal IP address segments, and this gateway machine mounts the MooseFS file system. I guess for simple or small deployments you could put these services on the same machine as the MooseFS master, but from a security standpoint it might make sense to have them separate. Additionally, for CIFS and AFP there are considerations needed to secure these protocols, such as firewall rules or tunneling them through SSH.

On 12-05-25 3:45 PM, Boris Epstein wrote:

> Hello all,
>
> I am trying to run a MooseFS installation in such a way as to control access to the servers involved. For instance, all my servers are on a private network and only the master server is intended to have an external interface. But it looks like the clients need to communicate to the chunk servers directly - and that was the sort of thing I was trying to avoid, if at all possible. Is that possible? Can the master server manage all the traffic to and from the clients?
>
> Thanks.
>
> Boris.
From: Boris E. <bor...@gm...> - 2012-05-25 19:45:58
Hello all,

I am trying to run a MooseFS installation in such a way as to control access to the servers involved. For instance, all my servers are on a private network and only the master server is intended to have an external interface. But it looks like the clients need to communicate to the chunk servers directly - and that was the sort of thing I was trying to avoid, if at all possible. Is that possible? Can the master server manage all the traffic to and from the clients?

Thanks.

Boris.
From: Markus K. <mar...@tu...> - 2012-05-25 17:41:42
We are testing MooseFS in combination with Grid Engine and are seeing about 97% fsync write operations. The only thing that gets written to MooseFS is stdout/stderr and, at the end of a job, a small file. The stdout/stderr output is at most 10 MB per hour and job. It seems that each output line gets a synced write operation.

Our chunkservers are mainly desktop hosts with an extra disk for the chunkserver (100-900 GB). With 200 jobs running we see an average iowait of >20% on the hosts with the larger disks, which makes those hosts unusable. Today I installed an extra chunkserver with 2 x 2TB disks on our metalogger, which reduces the iowait on the clients to about 5%. Nevertheless, the metalogger host is now suffering 25% iowait. Altogether we now have 32 disks in use for chunkservers.

What is the best strategy to avoid these synced writes? If I copy a huge file to MooseFS I do not get a noticeable load on the clients.

We are using Debian squeeze, kernel 3.2.0 and MooseFS 1.6.24. For the chunkservers we use ext4 with mount options rw,noatime,nodiratime.

Thanks,
Markus Köberl

--
Markus Köberl
Graz University of Technology
Signal Processing and Speech Communication Laboratory
E-mail: mar...@tu...
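The numbers above make more sense once you see how a per-line fsync changes the write pattern: if the scheduler forces a flush to stable storage after every stdout line, every few bytes become a separate synchronous round trip to the chunkservers, whereas buffering and syncing once per batch (or per job) collapses that into a handful of larger writes. A small sketch of the difference follows; this is generic POSIX code, nothing MooseFS- or Grid Engine-specific, and whether the scheduler can actually be configured to batch its output is a separate question.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Worst case: one synchronous flush per log line.  On a network file
 * system every fsync() must reach the remote servers before returning. */
static void log_line_synced(int fd, const char *line)
{
    write(fd, line, strlen(line));
    fsync(fd);                      /* one round trip per line */
}

/* Friendlier pattern: accumulate lines and flush once per batch. */
static void log_batch(int fd, const char **lines, int n)
{
    for (int i = 0; i < n; i++)
        write(fd, lines[i], strlen(lines[i]));
    fsync(fd);                      /* one round trip for the whole batch */
}

int main(void)
{
    int fd = open("job.out", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return 1;

    const char *lines[] = { "step 1 done\n", "step 2 done\n", "step 3 done\n" };
    log_batch(fd, lines, 3);
    log_line_synced(fd, "final result written\n");

    close(fd);
    return 0;
}

This also explains why copying one huge file causes no noticeable load: it is a long sequential write with few syncs, while 200 jobs each syncing every line produce a constant stream of tiny synchronous writes across all 32 disks.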
From: Boris E. <bor...@gm...> - 2012-05-25 16:44:39
Hello listmates,

Are there precompiled/standard modules for Ubuntu Linux?

Thanks.

Boris.
From: wkmail <wk...@bn...> - 2012-05-24 20:18:31
Additional info. in regards to this situation: "When the chunkserver process finished scanning the single remaining drive, the CGI program reported that all chunkers were back but now showed 700+ chunks as undergoal (2/3) AND 700+ chunks as overgoal which confused us." Immediately after the mark for removal process had completed, the cluster was actively rebalancing and was still doing that prior to the chunkserver restart where we discontinued using the drive. -wk On 5/24/2012 12:56 PM, wkmail wrote: > We have a test MFS cluster (1.6.20) that we are hosting some > utility/research VM's on it using Oracle Virtual Box to see how well MFS > handles VMs. > All has gone well besides the expected disk i/o speed hit, which we are > can handle. > > However we have TWICE now experienced this problem. > > We decide to replace one of the chunkservers with better gear. It had 2 > x 1TB SATA drives and we usually remove them one at a time. > > We first marked drive #2 for removal using '*' in mfshdd.cfg and > restarted the chunkserver process (.i.e. ./chunkerserver restart). > > The CGI showed the drive as marked for removal and replicated off the > chunks as normal. Two days later it was done. No problems or errors were > observed. > > The following day, we then excluded drive #2 the now deprecated drive > by commenting it out in mfshdd.cfg and restarted the chunkerserver again. > > During the scan, the CGI correctly reported the 650,000+ chunks or so as > undergoal and the chunkserver was missing from the server list. > > When the chunkserver process finished scanning the single remaining > drive, the CGI program reported that all chunkers were back but now > showed 700+ chunks as undergoal (2/3) AND 700+ chunks as overgoal which > confused us. > > The MFS cluster then began to fix the undergoal chunks. During that > period we began to see lot of the error 19 messages which the source > code identifies as Wrong Chunk Version. > > During the undergoal replication process, we had a number of the busier > VMs report hard drive errors and either go RO or Panic. > > When the undergoal chunk replication completed about an hour later, we > ceased to have any problems and we are no longer seeing the error 19 > messages. > > Does anyone have an explanation as to what is going on and what we can > do to prevent the issue from re-occurring? > > Are Disk images problematic on MFS? > > Below are the relevant log entries in the MFSmaster log showing us > restarting the CS and the subsequent errors. > > Note, on the chunkerservers themselves, each error in the master log has > a corresponding error in the Chunkserver log > i.e. > May 24 10:58:25 mfs7chunker7 mfschunkserver[1085]: replicator: got > status: 19 from (C0A80016:24CE) > May 24 10:58:26 mfs7chunker7 mfschunkserver[1085]: replicator: got > status: 19 from (C0A80018:24CE) > > > from /var/log/messages on the master. 
> > May 24 10:27:45 mfs7master mfsmaster[32522]: connection with > CS(192.168.0.23) has been closed by peer > May 24 10:32:49 mfs7master mfsmaster[32522]: chunkserver register begin > (packet version: 5) - ip: 192.168.0.23, port: 9422 > May 24 10:33:37 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BA8A31 replication status: 19 > May 24 10:34:43 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B85FD0 replication status: 19 > May 24 10:35:05 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CF1D62 replication status: 19 > May 24 10:35:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C519DE replication status: 19 > May 24 10:35:34 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C61B66 replication status: 19 > May 24 10:36:05 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD75B2 replication status: 19 > May 24 10:36:19 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CA45CC replication status: 19 > May 24 10:36:57 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA189 replication status: 19 > May 24 10:37:03 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B12E19 replication status: 19 > May 24 10:37:19 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B4ED79 replication status: 19 > May 24 10:37:20 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA1A6 replication status: 19 > May 24 10:37:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 10:38:14 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000ADC1A1 replication status: 19 > May 24 10:38:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD7055 replication status: 19 > May 24 10:39:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9BE51 replication status: 19 > May 24 10:39:51 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A9BFA2 replication status: 19 > May 24 10:40:00 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CBA184 replication status: 19 > May 24 10:40:07 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CF45E5 replication status: 19 > May 24 10:40:31 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000AA4400 replication status: 19 > May 24 10:40:33 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995E4 replication status: 19 > May 24 10:40:40 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B25 replication status: 19 > May 24 10:40:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9CD0E replication status: 19 > May 24 10:40:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD788D replication status: 19 > May 24 10:41:13 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9C033 replication status: 19 > May 24 10:41:20 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:41:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD64E0 replication status: 19 > May 24 10:41:47 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD7605 replication status: 19 > May 24 10:42:10 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000BCB066 replication status: 19 > May 24 10:43:37 mfs7master mfsmaster[32522]: (192.168.0.26:9422) 
chunk: > 0000000000ADB285 replication status: 19 > May 24 10:43:38 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD78B6 replication status: 19 > May 24 10:43:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 10:43:43 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB3B0 replication status: 19 > May 24 10:44:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000BF1674 replication status: 19 > May 24 10:44:09 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CB5FC1 replication status: 19 > May 24 10:44:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000C790A4 replication status: 19 > May 24 10:44:22 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000ADC29C replication status: 19 > May 24 10:44:23 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD7E62 replication status: 19 > May 24 10:44:31 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD78F2 replication status: 19 > May 24 10:45:02 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CA4B55 replication status: 19 > May 24 10:45:13 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2F8BF replication status: 19 > May 24 10:45:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2FF5A replication status: 19 > May 24 10:45:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD6510 replication status: 19 > May 24 10:45:44 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9C007 replication status: 19 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CF1D2D replication status: 28 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA3FA7 replication status: 28 > May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000A9C41B replication status: 28 > May 24 10:46:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C78D16 replication status: 19 > May 24 10:46:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A9C156 replication status: 19 > May 24 10:46:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:46:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BA89D7 replication status: 19 > May 24 10:46:33 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD79CF replication status: 19 > May 24 10:46:40 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9CCB7 replication status: 19 > May 24 10:47:07 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9BEE6 replication status: 19 > May 24 10:47:36 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA458E replication status: 19 > May 24 10:47:44 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C7B9F1 replication status: 19 > May 24 10:47:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000ADB2AC replication status: 19 > May 24 10:48:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD645B replication status: 19 > May 24 10:48:01 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A995B9 replication status: 19 > May 24 10:48:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C2F955 replication status: 19 > May 24 10:48:19 
mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C8007A replication status: 19 > May 24 10:48:21 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C78CF4 replication status: 19 > May 24 10:48:38 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD78B6 replication status: 28 > May 24 10:48:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000BA8AA4 replication status: 28 > May 24 10:48:40 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CBA179 replication status: 28 > May 24 10:48:41 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B590BC replication status: 19 > May 24 10:48:47 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CD7929 replication status: 19 > May 24 10:48:53 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BF1674 replication status: 19 > May 24 10:48:56 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C2F8AA replication status: 19 > May 24 10:48:59 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9DAC9 replication status: 19 > May 24 10:49:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A996F3 replication status: 19 > May 24 10:49:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CB5FC1 replication status: 19 > May 24 10:49:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA1B2 replication status: 19 > May 24 10:49:36 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C790A4 replication status: 19 > May 24 10:49:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C7913D replication status: 19 > May 24 10:49:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B582B4 replication status: 19 > May 24 10:50:35 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CF1CE1 replication status: 19 > May 24 10:51:03 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C78D16 replication status: 19 > May 24 10:51:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C2F366 replication status: 19 > May 24 10:51:18 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA45A5 replication status: 19 > May 24 10:51:24 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9BED0 replication status: 19 > May 24 10:51:33 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A91554 replication status: 19 > May 24 10:51:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9154D replication status: 19 > May 24 10:52:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000B12E19 replication status: 19 > May 24 10:52:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000ADC2CD replication status: 19 > May 24 10:53:22 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 10:53:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C33575 replication status: 19 > May 24 10:54:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000A9C82A replication status: 19 > May 24 10:54:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000B582B4 replication status: 19 > May 24 10:55:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C78EDD replication status: 19 > May 24 10:55:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B44 
replication status: 19 > May 24 10:55:58 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CB5FBE replication status: 19 > May 24 10:56:06 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000BF6DF2 replication status: 19 > May 24 10:56:26 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 10:56:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD789C replication status: 19 > May 24 10:56:48 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD7837 replication status: 19 > May 24 10:56:58 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD7800 replication status: 19 > May 24 10:57:11 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000A9C09C replication status: 19 > May 24 10:57:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BE0F replication status: 19 > May 24 10:57:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 10:57:49 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000A9C00B replication status: 19 > May 24 10:57:56 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CBA179 replication status: 19 > May 24 10:58:08 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995B9 replication status: 19 > May 24 10:58:14 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C79138 replication status: 19 > May 24 10:58:23 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD707C replication status: 19 > May 24 10:58:25 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CD710F replication status: 19 > May 24 10:58:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 10:58:55 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C68224 replication status: 19 > May 24 10:59:33 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000CBA1B2 replication status: 19 > May 24 11:00:05 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000B3CB41 replication status: 19 > May 24 11:00:54 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000AA4400 replication status: 19 > May 24 11:01:17 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD7837 replication status: 19 > May 24 11:01:33 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000BA89D7 replication status: 19 > May 24 11:01:51 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C8375C replication status: 19 > May 24 11:02:22 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A995AC replication status: 19 > May 24 11:02:28 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD78B4 replication status: 19 > May 24 11:02:32 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C615E8 replication status: 19 > May 24 11:02:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD64D7 replication status: 19 > May 24 11:02:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CBA196 replication status: 19 > May 24 11:02:58 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000BEC4A1 replication status: 19 > May 24 11:03:09 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C79138 replication status: 19 > May 24 11:03:18 mfs7master mfsmaster[32522]: 
(192.168.0.27:9422) chunk: > 0000000000A9C260 replication status: 19 > May 24 11:03:21 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB266 replication status: 19 > May 24 11:03:23 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 11:03:27 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C68224 replication status: 19 > May 24 11:03:30 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:03:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000A9CCC9 replication status: 19 > May 24 11:04:15 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD70D7 replication status: 19 > May 24 11:04:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CB5FB1 replication status: 19 > May 24 11:04:45 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD7DDF replication status: 19 > May 24 11:05:43 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C8375C replication status: 19 > May 24 11:05:50 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:06:01 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BF163B replication status: 19 > May 24 11:06:03 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000A91545 replication status: 19 > May 24 11:06:08 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 11:06:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C7914A replication status: 19 > May 24 11:07:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 11:07:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:08:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000B4EDF3 replication status: 19 > May 24 11:08:21 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CBA195 replication status: 19 > May 24 11:08:28 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:08:35 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000C7906F replication status: 19 > May 24 11:08:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD75C5 replication status: 19 > May 24 11:08:48 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 11:09:12 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD70D7 replication status: 19 > May 24 11:10:01 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CF1D62 replication status: 19 > May 24 11:10:07 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:10:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000B578E9 replication status: 19 > May 24 11:10:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CA4B44 replication status: 19 > May 24 11:11:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C7914A replication status: 19 > May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C7B551 replication status: 19 > May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD784C replication status: 19 > May 
24 11:12:35 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000ADB266 replication status: 19 > May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:13:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: > 0000000000A9BF85 replication status: 19 > May 24 11:15:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:15:19 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000B578E9 replication status: 19 > May 24 11:15:55 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000BF163B replication status: 19 > May 24 11:15:56 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000A9BE28 replication status: 19 > May 24 11:16:00 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:16:30 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000C3A0E7 replication status: 19 > May 24 11:16:45 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CBA190 replication status: 19 > May 24 11:17:31 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: > 0000000000CD784C replication status: 19 > May 24 11:17:36 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000C78EE1 replication status: 19 > May 24 11:18:25 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:20:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:20:47 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:23:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: > 0000000000C5769C replication status: 19 > May 24 11:25:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:25:48 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CD7625 replication status: 19 > May 24 11:30:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA4588 replication status: 19 > May 24 11:40:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: > 0000000000CA4588 replication status: 19 > > Note the undergoal replication process was completed about this time. > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: wkmail <wk...@bn...> - 2012-05-24 19:56:25
|
We have a test MFS cluster (1.6.20) on which we are hosting some utility/research VMs using Oracle VirtualBox, to see how well MFS handles VMs. All has gone well apart from the expected disk I/O speed hit, which we can handle.

However, we have TWICE now experienced this problem.

We decided to replace one of the chunkservers with better gear. It had 2 x 1TB SATA drives and we usually remove them one at a time.

We first marked drive #2 for removal using '*' in mfshdd.cfg and restarted the chunkserver process (i.e. ./chunkserver restart). The CGI showed the drive as marked for removal and replicated off the chunks as normal. Two days later it was done. No problems or errors were observed.

The following day, we excluded drive #2 (the now deprecated drive) by commenting it out in mfshdd.cfg and restarted the chunkserver again. During the scan, the CGI correctly reported the 650,000+ chunks or so as undergoal and the chunkserver was missing from the server list.

When the chunkserver process finished scanning the single remaining drive, the CGI reported that all chunkservers were back, but now showed 700+ chunks as undergoal (2/3) AND 700+ chunks as overgoal, which confused us.

The MFS cluster then began to fix the undergoal chunks. During that period we began to see a lot of error 19 messages, which the source code identifies as Wrong Chunk Version. During the undergoal replication, a number of the busier VMs reported hard drive errors and either went RO or panicked. When the undergoal chunk replication completed about an hour later, we ceased to have any problems and we are no longer seeing the error 19 messages.

Does anyone have an explanation as to what is going on, and what can we do to prevent the issue from re-occurring? Are disk images problematic on MFS?

Below are the relevant log entries in the mfsmaster log showing us restarting the CS and the subsequent errors. Note that on the chunkservers themselves, each error in the master log has a corresponding error in the chunkserver log, e.g.:

May 24 10:58:25 mfs7chunker7 mfschunkserver[1085]: replicator: got status: 19 from (C0A80016:24CE)
May 24 10:58:26 mfs7chunker7 mfschunkserver[1085]: replicator: got status: 19 from (C0A80018:24CE)

From /var/log/messages on the master:
May 24 10:27:45 mfs7master mfsmaster[32522]: connection with CS(192.168.0.23) has been closed by peer May 24 10:32:49 mfs7master mfsmaster[32522]: chunkserver register begin (packet version: 5) - ip: 192.168.0.23, port: 9422 May 24 10:33:37 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000BA8A31 replication status: 19 May 24 10:34:43 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000B85FD0 replication status: 19 May 24 10:35:05 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CF1D62 replication status: 19 May 24 10:35:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C519DE replication status: 19 May 24 10:35:34 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C61B66 replication status: 19 May 24 10:36:05 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD75B2 replication status: 19 May 24 10:36:19 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CA45CC replication status: 19 May 24 10:36:57 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CBA189 replication status: 19 May 24 10:37:03 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000B12E19 replication status: 19 May 24 10:37:19 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000B4ED79 replication status: 19 May 24 10:37:20 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CBA1A6 replication status: 19 May 24 10:37:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD784C replication status: 19 May 24 10:38:14 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000ADC1A1 replication status: 19 May 24 10:38:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD7055 replication status: 19 May 24 10:39:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A9BE51 replication status: 19 May 24 10:39:51 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A9BFA2 replication status: 19 May 24 10:40:00 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CBA184 replication status: 19 May 24 10:40:07 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CF45E5 replication status: 19 May 24 10:40:31 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000AA4400 replication status: 19 May 24 10:40:33 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A995E4 replication status: 19 May 24 10:40:40 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CA4B25 replication status: 19 May 24 10:40:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9CD0E replication status: 19 May 24 10:40:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CD788D replication status: 19 May 24 10:41:13 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A9C033 replication status: 19 May 24 10:41:20 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CA45A5 replication status: 19 May 24 10:41:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD64E0 replication status: 19 May 24 10:41:47 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CD7605 replication status: 19 May 24 10:42:10 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000BCB066 replication status: 19 May 24 10:43:37 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000ADB285 replication status: 19 May 24 10:43:38 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 
0000000000CD78B6 replication status: 19 May 24 10:43:42 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9BF85 replication status: 19 May 24 10:43:43 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000ADB3B0 replication status: 19 May 24 10:44:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000BF1674 replication status: 19 May 24 10:44:09 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CB5FC1 replication status: 19 May 24 10:44:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000C790A4 replication status: 19 May 24 10:44:22 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000ADC29C replication status: 19 May 24 10:44:23 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CD7E62 replication status: 19 May 24 10:44:31 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CD78F2 replication status: 19 May 24 10:45:02 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CA4B55 replication status: 19 May 24 10:45:13 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C2F8BF replication status: 19 May 24 10:45:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C2FF5A replication status: 19 May 24 10:45:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CD6510 replication status: 19 May 24 10:45:44 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9C007 replication status: 19 May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CF1D2D replication status: 28 May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CA3FA7 replication status: 28 May 24 10:46:00 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000A9C41B replication status: 28 May 24 10:46:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C78D16 replication status: 19 May 24 10:46:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A9C156 replication status: 19 May 24 10:46:18 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CA45A5 replication status: 19 May 24 10:46:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000BA89D7 replication status: 19 May 24 10:46:33 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD79CF replication status: 19 May 24 10:46:40 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9CCB7 replication status: 19 May 24 10:47:07 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A9BEE6 replication status: 19 May 24 10:47:36 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CA458E replication status: 19 May 24 10:47:44 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C7B9F1 replication status: 19 May 24 10:47:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000ADB2AC replication status: 19 May 24 10:48:00 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CD645B replication status: 19 May 24 10:48:01 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A995B9 replication status: 19 May 24 10:48:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C2F955 replication status: 19 May 24 10:48:19 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C8007A replication status: 19 May 24 10:48:21 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C78CF4 replication status: 19 May 24 10:48:38 mfs7master 
mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CD78B6 replication status: 28 May 24 10:48:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000BA8AA4 replication status: 28 May 24 10:48:40 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CBA179 replication status: 28 May 24 10:48:41 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000B590BC replication status: 19 May 24 10:48:47 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CD7929 replication status: 19 May 24 10:48:53 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000BF1674 replication status: 19 May 24 10:48:56 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C2F8AA replication status: 19 May 24 10:48:59 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000A9DAC9 replication status: 19 May 24 10:49:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A996F3 replication status: 19 May 24 10:49:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CB5FC1 replication status: 19 May 24 10:49:32 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CBA1B2 replication status: 19 May 24 10:49:36 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C790A4 replication status: 19 May 24 10:49:43 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C7913D replication status: 19 May 24 10:49:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000B582B4 replication status: 19 May 24 10:50:35 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CF1CE1 replication status: 19 May 24 10:51:03 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C78D16 replication status: 19 May 24 10:51:04 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C2F366 replication status: 19 May 24 10:51:18 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CA45A5 replication status: 19 May 24 10:51:24 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000A9BED0 replication status: 19 May 24 10:51:33 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A91554 replication status: 19 May 24 10:51:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000A9154D replication status: 19 May 24 10:52:01 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000B12E19 replication status: 19 May 24 10:52:37 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000ADC2CD replication status: 19 May 24 10:53:22 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CBA195 replication status: 19 May 24 10:53:42 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C33575 replication status: 19 May 24 10:54:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000A9C82A replication status: 19 May 24 10:54:48 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000B582B4 replication status: 19 May 24 10:55:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C78EDD replication status: 19 May 24 10:55:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CA4B44 replication status: 19 May 24 10:55:58 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CB5FBE replication status: 19 May 24 10:56:06 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000BF6DF2 replication status: 19 May 24 10:56:26 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000A9BE28 replication status: 
19 May 24 10:56:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD789C replication status: 19 May 24 10:56:48 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CD7837 replication status: 19 May 24 10:56:58 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CD7800 replication status: 19 May 24 10:57:11 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000A9C09C replication status: 19 May 24 10:57:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000A9BE0F replication status: 19 May 24 10:57:40 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD784C replication status: 19 May 24 10:57:49 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000A9C00B replication status: 19 May 24 10:57:56 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CBA179 replication status: 19 May 24 10:58:08 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A995B9 replication status: 19 May 24 10:58:14 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C79138 replication status: 19 May 24 10:58:23 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD707C replication status: 19 May 24 10:58:25 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CD710F replication status: 19 May 24 10:58:26 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C5769C replication status: 19 May 24 10:58:55 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000C68224 replication status: 19 May 24 10:59:33 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000CBA1B2 replication status: 19 May 24 11:00:05 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000B3CB41 replication status: 19 May 24 11:00:54 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000AA4400 replication status: 19 May 24 11:01:17 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CD7837 replication status: 19 May 24 11:01:33 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000BA89D7 replication status: 19 May 24 11:01:51 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C8375C replication status: 19 May 24 11:02:22 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A995AC replication status: 19 May 24 11:02:28 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD78B4 replication status: 19 May 24 11:02:32 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C615E8 replication status: 19 May 24 11:02:35 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CD64D7 replication status: 19 May 24 11:02:51 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CBA196 replication status: 19 May 24 11:02:58 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000BEC4A1 replication status: 19 May 24 11:03:09 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C79138 replication status: 19 May 24 11:03:18 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000A9C260 replication status: 19 May 24 11:03:21 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000ADB266 replication status: 19 May 24 11:03:23 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CBA195 replication status: 19 May 24 11:03:27 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000C68224 replication status: 19 May 24 11:03:30 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 
0000000000C5769C replication status: 19 May 24 11:03:32 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000A9CCC9 replication status: 19 May 24 11:04:15 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CD70D7 replication status: 19 May 24 11:04:25 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CB5FB1 replication status: 19 May 24 11:04:45 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CD7DDF replication status: 19 May 24 11:05:43 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C8375C replication status: 19 May 24 11:05:50 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CD7625 replication status: 19 May 24 11:06:01 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000BF163B replication status: 19 May 24 11:06:03 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000A91545 replication status: 19 May 24 11:06:08 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9BE28 replication status: 19 May 24 11:06:13 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C7914A replication status: 19 May 24 11:07:28 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CD784C replication status: 19 May 24 11:07:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000C78EE1 replication status: 19 May 24 11:08:08 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000B4EDF3 replication status: 19 May 24 11:08:21 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CBA195 replication status: 19 May 24 11:08:28 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C5769C replication status: 19 May 24 11:08:35 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000C7906F replication status: 19 May 24 11:08:43 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CD75C5 replication status: 19 May 24 11:08:48 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000A9BF85 replication status: 19 May 24 11:09:12 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CD70D7 replication status: 19 May 24 11:10:01 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CF1D62 replication status: 19 May 24 11:10:07 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000CA4588 replication status: 19 May 24 11:10:25 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000B578E9 replication status: 19 May 24 11:10:39 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CA4B44 replication status: 19 May 24 11:11:15 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C7914A replication status: 19 May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C7B551 replication status: 19 May 24 11:12:28 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CD784C replication status: 19 May 24 11:12:35 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000C78EE1 replication status: 19 May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000ADB266 replication status: 19 May 24 11:13:25 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000C5769C replication status: 19 May 24 11:13:41 mfs7master mfsmaster[32522]: (192.168.0.27:9422) chunk: 0000000000A9BF85 replication status: 19 May 24 11:15:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CA4588 replication status: 19 May 24 11:15:19 mfs7master 
mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000B578E9 replication status: 19 May 24 11:15:55 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000BF163B replication status: 19 May 24 11:15:56 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000A9BE28 replication status: 19 May 24 11:16:00 mfs7master mfsmaster[32522]: (192.168.0.21:9422) chunk: 0000000000CD7625 replication status: 19 May 24 11:16:30 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000C3A0E7 replication status: 19 May 24 11:16:45 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CBA190 replication status: 19 May 24 11:17:31 mfs7master mfsmaster[32522]: (192.168.0.25:9422) chunk: 0000000000CD784C replication status: 19 May 24 11:17:36 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000C78EE1 replication status: 19 May 24 11:18:25 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C5769C replication status: 19 May 24 11:20:08 mfs7master mfsmaster[32522]: (192.168.0.24:9422) chunk: 0000000000CA4588 replication status: 19 May 24 11:20:47 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD7625 replication status: 19 May 24 11:23:26 mfs7master mfsmaster[32522]: (192.168.0.22:9422) chunk: 0000000000C5769C replication status: 19 May 24 11:25:06 mfs7master mfsmaster[32522]: (192.168.0.23:9422) chunk: 0000000000CA4588 replication status: 19 May 24 11:25:48 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CD7625 replication status: 19 May 24 11:30:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CA4588 replication status: 19 May 24 11:40:04 mfs7master mfsmaster[32522]: (192.168.0.26:9422) chunk: 0000000000CA4588 replication status: 19 Note the undergoal replication process was completed about this time. |
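A minimal sketch of the drive-removal workflow described above, for anyone who wants to reproduce it. The config path, mount points and service commands are illustrative assumptions for a stock 1.6.x install; adapt them to your own layout.

# step 1: in /etc/mfshdd.cfg, prefix the disk to be retired with '*'
#     /mnt/disk1
#     */mnt/disk2
mfschunkserver restart            # or 'mfschunkserver reload' to pick the change up without a full restart
# wait until the CGI shows zero chunks left on the marked disk, then
# step 2: comment the entry out (or remove it) and restart once more
#     /mnt/disk1
#     #/mnt/disk2
mfschunkserver restart

The second restart is the point at which the undergoal replication (and, in this thread, the status 19 noise) kicks in.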
From: yishi c. <hol...@gm...> - 2012-05-24 07:57:06
|
If there is anyone who can provide me the mfs-1.6.3 or older verison. with best regard 2012/5/22 <moo...@li...> > Send moosefs-users mailing list submissions to > moo...@li... > > To subscribe or unsubscribe via the World Wide Web, visit > https://lists.sourceforge.net/lists/listinfo/moosefs-users > or, via email, send a message with subject or body 'help' to > moo...@li... > > You can reach the person managing the list at > moo...@li... > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of moosefs-users digest..." > > > Today's Topics: > > 1. Re: Best FileSystem for chunkers. (Deon Cui) > 2. Re: Best FileSystem for chunkers. (Alexander Akhobadze) > 3. Re: How to speedup replication from disk marked for removal > (Atom Powers) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 21 May 2012 20:32:48 +1200 > From: Deon Cui <deo...@gm...> > Subject: Re: [Moosefs-users] Best FileSystem for chunkers. > To: Alexander Akhobadze <akh...@ri...> > Cc: moo...@li... > Message-ID: > <CAMRow-e_E3SP2nz-X=-E05omSusuCwpostAbJ=Miw...@ma... > > > Content-Type: text/plain; charset="windows-1252" > > Hi Alex, > > Are you turning on dedup for a new ZFS filesystem or an existing one? > (doesn't matter if its an existing zpool or not). > > ZFS uses in-line dedup, which means that if you are trying to dedup an > existing ZFS filesystem it will only create new dedup blocks, existing > blocks will not be deduped. If you are indeed trying this on a new zfs > filesystem try making 10 copies of a large file. > > Deon > > > > On Mon, May 21, 2012 at 8:21 PM, Alexander Akhobadze > <akh...@ri...>wrote: > > > > > Hi Michal ! > > > > I have tested to turn on deduplication on a ZFS chunk storage > > but unfortunately did not get any profit :--( > > I thought that chunk file format prevents ZFS to find dups. > > May be I make mistake... Correct me if yes. > > > > wbr > > Alexander > > > > ====================================================== > > > > Hi! > > > > Users of MooseFS may now be interested in a new feature of ext4 called > > ?bigalloc? introduced in 3.2 kernel. > > > > According to http://lwn.net/Articles/469805/: ?The "bigalloc" patch set > > adds the concept of "block clusters" to the filesystem; rather than > > allocate single blocks, a filesystem using clusters will allocate them in > > larger groups. Mapping between these larger blocks and the 4KB blocks > seen > > by the core kernel is handled entirely within the filesystem.? > > > > Setting 64KB cluster size may make sense as MooseFS operates on 64KB > > blocks. We have not tested it but we can expect it may give some > > performance boost. It would also depend on the average size of the files > in > > your system. > > > > > > And as MooseFS doesn?t support deduplication by itself you can also > > consider using dedup functionality in ZFS. > > > > > > Kind regards > > Michal Borychowski > > > > > > From: Allen, Benjamin S [mailto:bs...@la...] > > Sent: Friday, May 18, 2012 5:30 PM > > To: moo...@li... > > Subject: Re: [Moosefs-users] Best FileSystem for chunkers. > > > > My chunkservers are on top of ZFS pools on Solaris. Using gzip-1 I get > > 2.32x, which is along the lines of the compress ratio I get with similar > > systems serving NFS. Note, my data is inherently well compressible. With > 2x > > Intel X5675, load is never an issue. As you up the level of gzip you'll > see > > a diminishing return, and pretty heavy hits on CPU load. 
> > > > I'd also suggest using ZFS for raid if you care about single stream > > performance. Serve up one or two big zpools per chunkserver to MFS. Keep > in > > mind the size of your pool however, as having MFS fail off that HD will > can > > take ages. Also of course you'll loose capacity in this approach to > parity > > of RAIDZ or RAIDZ2, and then again to MFS' goal > 1 if you want high > > availability. > > > > If you're thinking of using ZFS, I'd highly suggest using one of the > > Illumos based OSes instead of FreeBSD or Linux variants. The Linux port > is > > still pretty young in my opinion. I'd suggest Illumian, > > http://illumian.org/ which grew out of Nexenta Core. By the way MFS is > > the only distributed FS that I know of that compiles and runs well on > > Solaris. > > > > I've found small file performance isn't all that great in this setup. Sub > > what NFS can do on a similar ZFS pool, so I wouldn't get your hopes up > much > > for it to solve this issue. You could perhaps throw a good amount of > small > > SSD drives at ZFS' ZIL to improve synchronous write speeds, but when > using > > ZIL you're funneling all your synchronous writes through the ZIL devices. > > So while using two SSDs will likely give you a touch better latency, it > > will kill your throughput compared to a full chassis of drives. > > > > I've also tested use of L2Arc on MLC SSDs for read cache. If its > > affordable for you, I'd suggest throwing RAM in the box for L1ARC > instead. > > At least in my workload, I see very little L2Arc hits. Most hits (90%) > > comes from L1ARC in memory in my chunkservers that have 96GB. Next series > > of systems I buy will have 128G, and I'll cut my L2ARC SSDs to less than > > half them of my current systems( 2.4T -> 960G). I guessing I could > actually > > remove L2ARC all together and not see a performance hit, but I haven't > done > > enough benchmarking to prove that one way or another. > > > > Ben > > > > On May 17, 2012, at 2:22 PM, Steve Wilson wrote: > > > > > > On 05/17/2012 04:17 PM, Steve Wilson wrote: > > > > On 05/17/2012 04:05 PM, Atom Powers wrote: > > On 05/17/2012 12:44 PM, Steve Wilson wrote: > > On 05/17/2012 03:26 PM, Atom Powers wrote: > > * Compression, 1.16x in my environment > > I don't know if 1.16x would give me much improvement in performance. > > I typically see about 1.4x on my ZFS backup servers which made me > > think that this reduction in disk I/O could result in improved > > overall performance for MooseFS. > > Not for performance, for disk efficiency. Ostensibly those 64MiB chunks > > won't always use 64MiB with compression on, especially for smaller > > files. > > > > This is a good point and it might help where it's most needed: all > > those small configuration files, etc. that have a large impact on the > > user's perception of disk performance. > > > > Bad: * high RAM requirement > > Is the high RAM due to using raidz{2-3}? I was thinking of making > > each disk a separate ZFS volume and then letting MooseFS combine the > > disks into an MFS volume (i.e., no raidz). I realize that greater > > performance could be achieved by striping across disks in the chunk > > servers but I'm willing to trade off that performance gain for > > higher redundancy (in the case of using simple striping) and/or > > greater capacity (in the case of using raidz, raidz2, or raidz3). > > ZFS does a lot of caching in RAM. My chunk servers use hardware RAID, > > not raidz, and still use several hundred MiB of RAM. 
> > > > Personally, I would prefer to use raidz for muliple disks over MooseFS, > > because managing individual disks and disk failures should be much > > better. For example, to minimize the amount of re-balancing MooseFS > > needs to do; not to mention the possible performance benefit. But I can > > think of no reason why you couldn't do a combination of both. > > > > > > That is certainly worth considering. I hope to have enough time with > > the new chunk servers to try out different configurations before I > > have to put them into service. > > > > Steve > > > > > > > > > ------------------------------------------------------------------------------ > > Live Security Virtual Conference > > Exclusive live event will cover all the ways today's security and > > threat landscape has changed and how IT managers can respond. Discussions > > will include endpoint security, mobile security and the latest in malware > > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > > _______________________________________________ > > moosefs-users mailing list > > moo...@li... > > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > > > > > > > > ------------------------------------------------------------------------------ > > Live Security Virtual Conference > > Exclusive live event will cover all the ways today's security and > > threat landscape has changed and how IT managers can respond. Discussions > > will include endpoint security, mobile security and the latest in malware > > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > > _______________________________________________ > > moosefs-users mailing list > > moo...@li... > > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > -------------- next part -------------- > An HTML attachment was scrubbed... > > ------------------------------ > > Message: 2 > Date: Mon, 21 May 2012 13:13:29 +0400 > From: Alexander Akhobadze <akh...@ri...> > Subject: Re: [Moosefs-users] Best FileSystem for chunkers. > To: Deon Cui <deo...@gm...> > Cc: moo...@li... > Message-ID: <104...@ri...> > Content-Type: text/plain; charset=windows-1252 > > Hi Deon! > > It was a new pool for chunks without any data when I turned dedup on. > And You are right: on that pool whan i copy some normal file > more than once - free space shown by df does not decrease. > > ====================================================== > > Hi Alex, > > Are you turning on dedup for a new ZFS filesystem or an existing one? > (doesn't matter if its an existing zpool or not). > > ZFS uses in-line dedup, which means that if you are trying to dedup > an existing ZFS filesystem it will only create new dedup blocks, > existing blocks will not be deduped. > If you are indeed trying this on a new zfs filesystem try making 10 copies > of a large file. > > Deon > > > > On Mon, May 21, 2012 at 8:21 PM, Alexander Akhobadze <akh...@ri...> > wrote: > > Hi Michal ! > > I have tested to turn on deduplication on a ZFS chunk storage > but unfortunately did not get any profit :--( > I thought that chunk file format prevents ZFS to find dups. > May be I make mistake... Correct me if yes. > > wbr > Alexander > > ====================================================== > > Hi! > > Users of MooseFS may now be interested in a new feature of ext4 called > ?bigalloc? introduced in 3.2 kernel. 
> > According to http://lwn.net/Articles/469805/: ?The "bigalloc" patch set > adds the concept of "block clusters" to the filesystem; rather than > allocate single blocks, a filesystem using clusters will allocate them in > larger groups. Mapping between these larger blocks and the 4KB blocks seen > by the core kernel is handled entirely within the filesystem.? > > Setting 64KB cluster size may make sense as MooseFS operates on 64KB > blocks. We have not tested it but we can expect it may give some > performance boost. It would also depend on the average size of the files in > your system. > > > And as MooseFS doesn?t support deduplication by itself you can also > consider using dedup functionality in ZFS. > > > Kind regards > Michal Borychowski > > > From: Allen, Benjamin S [mailto:bs...@la...] > Sent: Friday, May 18, 2012 5:30 PM > To: moo...@li... > Subject: Re: [Moosefs-users] Best FileSystem for chunkers. > > My chunkservers are on top of ZFS pools on Solaris. Using gzip-1 I get > 2.32x, which is along the lines of the compress ratio I get with similar > systems serving NFS. Note, my data is inherently well compressible. With 2x > Intel X5675, load is never an issue. As you up the level of gzip you'll see > a diminishing return, and pretty heavy hits on CPU load. > > I'd also suggest using ZFS for raid if you care about single stream > performance. Serve up one or two big zpools per chunkserver to MFS. Keep in > mind the size of your pool however, as having MFS fail off that HD will can > take ages. Also of course you'll loose capacity in this approach to parity > of RAIDZ or RAIDZ2, and then again to MFS' goal > 1 if you want high > availability. > > If you're thinking of using ZFS, I'd highly suggest using one of the > Illumos based OSes instead of FreeBSD or Linux variants. The Linux port is > still pretty young in my opinion. I'd suggest Illumian, > http://illumian.org/ which grew out of Nexenta Core. By the way MFS is > the only distributed FS that I know of that compiles and runs well on > Solaris. > > I've found small file performance isn't all that great in this setup. Sub > what NFS can do on a similar ZFS pool, so I wouldn't get your hopes up much > for it to solve this issue. You could perhaps throw a good amount of small > SSD drives at ZFS' ZIL to improve synchronous write speeds, but when using > ZIL you're funneling all your synchronous writes through the ZIL devices. > So while using two SSDs will likely give you a touch better latency, it > will kill your throughput compared to a full chassis of drives. > > I've also tested use of L2Arc on MLC SSDs for read cache. If its > affordable for you, I'd suggest throwing RAM in the box for L1ARC instead. > At least in my workload, I see very little L2Arc hits. Most hits (90%) > comes from L1ARC in memory in my chunkservers that have 96GB. Next series > of systems I buy will have 128G, and I'll cut my L2ARC SSDs to less than > half them of my current systems( 2.4T -> 960G). I guessing I could actually > remove L2ARC all together and not see a performance hit, but I haven't done > enough benchmarking to prove that one way or another. > > Ben > > On May 17, 2012, at 2:22 PM, Steve Wilson wrote: > > > On 05/17/2012 04:17 PM, Steve Wilson wrote: > > On 05/17/2012 04:05 PM, Atom Powers wrote: > On 05/17/2012 12:44 PM, Steve Wilson wrote: > On 05/17/2012 03:26 PM, Atom Powers wrote: > * Compression, 1.16x in my environment > I don't know if 1.16x would give me much improvement in performance. 
> I typically see about 1.4x on my ZFS backup servers which made me > think that this reduction in disk I/O could result in improved > overall performance for MooseFS. > Not for performance, for disk efficiency. Ostensibly those 64MiB chunks > won't always use 64MiB with compression on, especially for smaller > files. > > This is a good point and it might help where it's most needed: all > those small configuration files, etc. that have a large impact on the > user's perception of disk performance. > > Bad: * high RAM requirement > Is the high RAM due to using raidz{2-3}? I was thinking of making > each disk a separate ZFS volume and then letting MooseFS combine the > disks into an MFS volume (i.e., no raidz). I realize that greater > performance could be achieved by striping across disks in the chunk > servers but I'm willing to trade off that performance gain for > higher redundancy (in the case of using simple striping) and/or > greater capacity (in the case of using raidz, raidz2, or raidz3). > ZFS does a lot of caching in RAM. My chunk servers use hardware RAID, > not raidz, and still use several hundred MiB of RAM. > > Personally, I would prefer to use raidz for muliple disks over MooseFS, > because managing individual disks and disk failures should be much > better. For example, to minimize the amount of re-balancing MooseFS > needs to do; not to mention the possible performance benefit. But I can > think of no reason why you couldn't do a combination of both. > > > That is certainly worth considering. I hope to have enough time with > the new chunk servers to try out different configurations before I > have to put them into service. > > Steve > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > > > ------------------------------ > > Message: 3 > Date: Mon, 21 May 2012 09:48:32 -0700 > From: Atom Powers <ap...@di...> > Subject: Re: [Moosefs-users] How to speedup replication from disk > marked for removal > To: moo...@li... > Message-ID: <4FB...@di...> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > On 05/18/2012 10:03 AM, Allen, Benjamin S wrote: > > I have no idea what MFS will do if it tries to write to a non-writtable > > chunk file. > > This story might be related. > > Last week we lost cooling in our server room. Among the other effects > one of our chunk servers entered a weird state where it was still > running but wasn't operating correctly. 
The MFS master didn't recognize > that the server was down so it wasn't taken out of the pool. Most of the > MFS client mounts became read-only, presumably when they were trying to > write chunks to that server. > > So it appears that if the MFS client can't write a chunk then the entire > mfsmount becomes read-only. However, this wasn't the focus of our > troubleshooting so I can't be confident in that conclusion. > > -- > -- > Perfection is just a word I use occasionally with mustard. > --Atom Powers-- > Director of IT > DigiPen Institute of Technology > +1 (425) 895-4443 > > > > ------------------------------ > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > > ------------------------------ > > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > > > End of moosefs-users Digest, Vol 29, Issue 26 > ********************************************* > |
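The ext4 "bigalloc" feature mentioned in the quoted digest is selected at mkfs time. Below is a hedged sketch, assuming a 3.2+ kernel and an e2fsprogs build that already knows the feature; the device name is purely illustrative.

mkfs.ext4 -O bigalloc -C 65536 /dev/sdb1    # 64 KiB clusters to match MooseFS's 64 KiB blocks

bigalloc was still considered experimental in that kernel generation, so it is worth benchmarking on a scratch chunkserver before trusting it with real chunks.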
From: mARK b. <mb...@gm...> - 2012-05-22 16:07:19
|
> Did your read 'Solution of small file store' ( > http://sourceforge.net/mailarchive/message.php?msg_id=29244311). I am interested in this too, but the relevant part of that thread is very terse. There is no discussion of the issue that is helpful, except perhaps to someone already familiar with the source code and run-time behaviour of MFS. -- mARK bLOORE <mb...@gm...> |
From: Allen, B. S <bs...@la...> - 2012-05-22 16:01:42
|
With a 2,408,276-object MFS (31T used out of 106T), it took me 30 seconds or thereabouts to run mfsmetarestore. If you have a backup of your /var/mfs (data directory) or metalogger copies, I'd suggest trying to run mfsmetarestore on that data on a separate system. Examples:

mfsmetarestore -m metadata_ml.mfs.back -o metadata_restore.mfs changelog_ml.0.mfs changelog_ml.1.mfs
mfsmetarestore -m metadata.mfs.back -o metadata_restore.mfs changelog.0.mfs

Ben

On May 22, 2012, at 9:34 AM, Boris Epstein wrote: > Hello listmates, > > I shutdown a VM that ran as an MFS master server and then had run mfsmetarestore to restore the configuration. The MFs installation had several million files (chunks) occupying about 4.8 TB of space. That mfsmetarestore has been running for about 18 hours now and is still not done. Is that normal? How long should I expect it to run? > > Thanks. > > Boris. > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/_______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users |
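Besides the explicit invocations above, mfsmetarestore in 1.6.x also has an automatic mode that picks the newest metadata and changelog files straight out of the data directory. A sketch, with the data path as an assumption (use whatever directory your build keeps metadata.mfs.back in, e.g. /var/lib/mfs or /usr/local/var/mfs):

mfsmetarestore -a                     # restore from the compiled-in data directory
mfsmetarestore -a -d /var/lib/mfs     # same, but with an explicit data path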
From: Boris E. <bor...@gm...> - 2012-05-22 15:34:24
|
Hello listmates,

I shut down a VM that ran as an MFS master server and then ran mfsmetarestore to restore the configuration. The MFS installation had several million files (chunks) occupying about 4.8 TB of space. That mfsmetarestore has been running for about 18 hours now and is still not done. Is that normal? How long should I expect it to run?

Thanks.

Boris. |
From: Ken <ken...@gm...> - 2012-05-22 09:10:47
|
Reply inline. Regards -Ken On Tue, May 22, 2012 at 4:26 PM, Michal Borychowski <mic...@co...> wrote: > Hi Ken! > > Your solution is really interesting and promising for storing large amounts > of small files. > > I uploaded several files to your demo and as I understand demo operates on > jpg files, but bundle could easily also store other formats as .pdf, .png, > etc.? The demo is very limited. We will store MIME type in the future, then the fastcgi program can support any type. Normally bundle read/write content, application deal how to use the content. > > Where do you keep meta information (size, offset) of the files? In some > external database like MySQL or something? We store them(photo id and url) in the MySQL now. Maybe applications have more responsibility. I thinks a NoSQL database is enough. > > You say the files can be overwritten with bundle. Even if they have > different size? What happens with the old, unused space (there is a "hole" > in the huge file)? Is it lost? Overwritten in bundle is danger. Old/unused space are wasted. There is a flag in meta info of small files, it indicate deleted/redirect/reference count... The content is remained. Maybe carefully reuse is a better choice. > > What about write permissions? Is it still sth like u/g/o? Can the > permissions be set separately per each small file or just by a huge file? Just the huge file. It's more easy. > > One useful thing which you probably lose using such a solution is lack of > "trash bin" per each of the small file. > > > Kind regards > Michal > > > -----Original Message----- > From: Ken [mailto:ken...@gm...] > Sent: Thursday, May 10, 2012 1:17 PM > To: moosefs-users > Subject: [Moosefs-users] bundle open source [was: Solution of small file > store] > > hi, all > > As mention in previous mail > (http://sf.net/mailarchive/message.php?msg_id=29171206), > now we open source it - bundle > > https://github.com/xiaonei/bundle > > The source is well tested and documented. > > Demo: > http://60.29.242.206/demo.html > > > Any ideas is appreciated. > > -Ken > > ---------------------------------------------------------------------------- > -- > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and threat > landscape has changed and how IT managers can respond. Discussions will > include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > moosefs-users mailing list > moo...@li... > https://lists.sourceforge.net/lists/listinfo/moosefs-users > |
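For readers trying to picture the bundle approach discussed in this thread: the core idea is simply to append each small file to one huge file and remember where it landed. Below is a rough shell sketch of the concept; the file names and the flat-file index are made up for illustration, since the real project keeps this metadata in MySQL and serves reads through a FastCGI program.

# append a small file to the bundle and record its (offset, size)
offset=$(stat -c %s bundle.dat 2>/dev/null || echo 0)   # current bundle size = offset of the new entry
size=$(stat -c %s photo.jpg)
cat photo.jpg >> bundle.dat
echo "photo42 $offset $size" >> bundle.index            # the application would keep this row in MySQL/NoSQL
# read it back later without touching any other entry in the bundle
dd if=bundle.dat of=copy.jpg bs=1 skip="$offset" count="$size" 2>/dev/null

The payoff is that MooseFS then tracks one large file instead of millions of tiny ones, which keeps the master's metadata footprint manageable; the trade-offs Michal raises (holes after overwrites, per-file permissions, no per-file trash bin) all follow from that single-file layout.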
From: Michal B. <mic...@co...> - 2012-05-22 08:26:45
|
Hi Ken!

Your solution is really interesting and promising for storing large amounts of small files.

I uploaded several files to your demo and, as I understand it, the demo operates on jpg files, but bundle could just as easily store other formats such as .pdf, .png, etc.?

Where do you keep the meta information (size, offset) of the files? In some external database like MySQL or something?

You say the files can be overwritten with bundle. Even if they have a different size? What happens to the old, unused space (there is a "hole" in the huge file)? Is it lost?

What about write permissions? Is it still something like u/g/o? Can the permissions be set separately for each small file, or only for the huge file?

One useful thing you probably lose with such a solution is a "trash bin" for each individual small file.

Kind regards
Michal

-----Original Message----- From: Ken [mailto:ken...@gm...] Sent: Thursday, May 10, 2012 1:17 PM To: moosefs-users Subject: [Moosefs-users] bundle open source [was: Solution of small file store] hi, all As mention in previous mail (http://sf.net/mailarchive/message.php?msg_id=29171206), now we open source it - bundle https://github.com/xiaonei/bundle The source is well tested and documented. Demo: http://60.29.242.206/demo.html Any ideas is appreciated. -Ken ---------------------------------------------------------------------------- -- Live Security Virtual Conference Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ _______________________________________________ moosefs-users mailing list moo...@li... https://lists.sourceforge.net/lists/listinfo/moosefs-users |
From: Chris P. <ch...@ec...> - 2012-05-22 07:57:43
|
Hi all

I am setting up a new MooseFS install with 6 chunkservers which have 5 disks each (1 TB enterprise SATA); each has a MegaRAID-based, battery-backed RAID card. To do some testing, I have run iozone with the following options:

iozone -b /root/report-ext4-raid5-stripe256-stride-align1-sync-8g.csv -e -o -r 64k -s 8g

I have run a few tests on different setups and found the following results:

* Aligning the partitions to the underlying RAID blocks makes a noticeable difference (up to 40% in some tests).
* On the random write test, RAID 0 gives me around 150 MB/s, while RAID 5 reaches only 30 MB/s.
* On other tests RAID 0 is faster, but not by nearly as much (and when using sync options in iozone, the performance is quite similar).
* Using a 64k stripe size on the RAID is slower than a 256k stripe size (which I did not expect).

So to my question: if I am generally going to use a goal of 3, is the increased performance of RAID 0 (especially on random writes) worth the potential for data loss if multiple drives fail at once (especially drives from the same batch, which tend to start failing together)?

Chris |
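One knob worth checking alongside partition alignment is telling ext4 about the RAID geometry at mkfs time. A sketch using the numbers implied by the report filename above (256 KiB stripe, 4 KiB blocks, and 4 data disks for a 5-disk RAID 5); these are assumptions, so adjust to the actual layout:

# stride = stripe_size / block_size = 256 KiB / 4 KiB = 64
# stripe-width = stride * data_disks = 64 * 4 = 256 (use 320 for a 5-disk RAID 0)
mkfs.ext4 -b 4096 -E stride=64,stripe-width=256 /dev/sda1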
From: Atom P. <ap...@di...> - 2012-05-21 16:48:46
|
On 05/18/2012 10:03 AM, Allen, Benjamin S wrote: > I have no idea what MFS will do if it tries to write to a non-writtable > chunk file. This story might be related. Last week we lost cooling in our server room. Among the other effects one of our chunk servers entered a weird state where it was still running but wasn't operating correctly. The MFS master didn't recognize that the server was down so it wasn't taken out of the pool. Most of the MFS client mounts became read-only, presumably when they were trying to write chunks to that server. So it appears that if the MFS client can't write a chunk then the entire mfsmount becomes read-only. However, this wasn't the focus of our troubleshooting so I can't be confident in that conclusion. -- -- Perfection is just a word I use occasionally with mustard. --Atom Powers-- Director of IT DigiPen Institute of Technology +1 (425) 895-4443 |
From: Alexander A. <akh...@ri...> - 2012-05-21 09:13:52
|
Hi Deon!

It was a new pool for chunks, without any data, when I turned dedup on. And you are right: on that pool, when I copy a normal file more than once, the free space shown by df does not decrease.

======================================================

Hi Alex,

Are you turning on dedup for a new ZFS filesystem or an existing one? (It doesn't matter whether the zpool itself is new or not.) ZFS uses in-line dedup, which means that if you try to dedup an existing ZFS filesystem, only newly written blocks will be deduplicated; existing blocks will not be deduped. If you are indeed trying this on a new ZFS filesystem, try making 10 copies of a large file.

Deon
|
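For anyone who wants to reproduce the test Deon describes, a minimal sketch follows. It is only illustrative: the pool name "tank", the dataset name "mfschunks", and the source file are placeholders, not anything taken from this thread.

    # create a fresh dataset with dedup enabled, so every block written to it
    # goes through the dedup table from the start
    zfs create -o dedup=on tank/mfschunks

    # write the same large file several times; with dedup working, the extra
    # copies should consume almost no additional space
    for i in 1 2 3 4 5 6 7 8 9 10; do
        cp /var/tmp/bigfile /tank/mfschunks/copy$i
    done

    # the DEDUP column shows the pool-wide dedup ratio; roughly 10x is expected here
    zpool list tank

    # detailed dedup-table statistics
    zdb -DD tank

If the dedup ratio stays near 1.00x for MooseFS chunk data while this test shows a high ratio for plain file copies, that points at the chunk layout rather than at dedup being misconfigured.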
From: Deon C. <deo...@gm...> - 2012-05-21 08:32:56
|
Hi Alex,

Are you turning on dedup for a new ZFS filesystem or an existing one? (It doesn't matter whether the zpool itself is new or not.) ZFS uses in-line dedup, which means that if you try to dedup an existing ZFS filesystem, only newly written blocks will be deduplicated; existing blocks will not be deduped. If you are indeed trying this on a new ZFS filesystem, try making 10 copies of a large file.

Deon

On Mon, May 21, 2012 at 8:21 PM, Alexander Akhobadze <akh...@ri...> wrote:
>
> Hi Michal!
>
> I have tested turning on deduplication on a ZFS chunk storage
> but unfortunately did not get any benefit :-(
> I thought that the chunk file format prevents ZFS from finding duplicates.
> Maybe I am mistaken... Correct me if so.
>
> wbr
> Alexander
|
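A side note on Deon's point about in-line dedup: existing blocks only become deduplicated once they are rewritten. One way to rewrite a whole dataset locally is a send/receive into a new dataset. The sketch below is untested here and the names are placeholders; it also assumes enough free space to hold a second copy until the old dataset is retired.

    # enable dedup at the pool's root dataset so new child datasets inherit it
    zfs set dedup=on tank

    # in-line dedup only sees blocks as they are written, so existing data has
    # to be rewritten; a local send/receive into a new dataset does exactly that
    zfs snapshot tank/mfschunks@rewrite
    zfs send tank/mfschunks@rewrite | zfs receive tank/mfschunks-dedup

    # check the pool-wide dedup ratio before destroying the old dataset
    zpool list tank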
From: Alexander A. <akh...@ri...> - 2012-05-21 08:21:26
|
Hi Michal!

I have tested turning on deduplication on a ZFS chunk storage
but unfortunately did not get any benefit :-(
I thought that the chunk file format prevents ZFS from finding duplicates.
Maybe I am mistaken... Correct me if so.

wbr
Alexander

======================================================

Hi!

Users of MooseFS may now be interested in a new feature of ext4 called "bigalloc", introduced in the 3.2 kernel.

According to http://lwn.net/Articles/469805/: "The "bigalloc" patch set adds the concept of "block clusters" to the filesystem; rather than allocate single blocks, a filesystem using clusters will allocate them in larger groups. Mapping between these larger blocks and the 4KB blocks seen by the core kernel is handled entirely within the filesystem."

Setting a 64KB cluster size may make sense, as MooseFS operates on 64KB blocks. We have not tested it, but we expect it may give some performance boost. It would also depend on the average size of the files in your system.

And as MooseFS doesn't support deduplication by itself, you can also consider using the dedup functionality in ZFS.

Kind regards
Michal Borychowski

From: Allen, Benjamin S [mailto:bs...@la...]
Sent: Friday, May 18, 2012 5:30 PM
To: moo...@li...
Subject: Re: [Moosefs-users] Best FileSystem for chunkers.

My chunkservers are on top of ZFS pools on Solaris. Using gzip-1 I get 2.32x, which is along the lines of the compression ratio I get with similar systems serving NFS. Note, my data is inherently well compressible. With 2x Intel X5675, load is never an issue. As you raise the gzip level you'll see diminishing returns and pretty heavy hits on CPU load.

I'd also suggest using ZFS for RAID if you care about single-stream performance. Serve up one or two big zpools per chunkserver to MFS. Keep in mind the size of your pool, however, as having MFS fail off that disk can take ages. Also, of course, you'll lose capacity in this approach to the parity of RAIDZ or RAIDZ2, and then again to MFS' goal > 1 if you want high availability.

If you're thinking of using ZFS, I'd highly suggest using one of the Illumos-based OSes instead of the FreeBSD or Linux variants. The Linux port is still pretty young in my opinion. I'd suggest Illumian, http://illumian.org/ which grew out of Nexenta Core. By the way, MFS is the only distributed FS that I know of that compiles and runs well on Solaris.

I've found small-file performance isn't all that great in this setup; it's below what NFS can do on a similar ZFS pool, so I wouldn't get your hopes up much for it to solve this issue. You could perhaps throw a good number of small SSD drives at ZFS' ZIL to improve synchronous write speeds, but when using a ZIL you're funneling all your synchronous writes through the ZIL devices. So while using two SSDs will likely give you slightly better latency, it will kill your throughput compared to a full chassis of drives.

I've also tested use of L2ARC on MLC SSDs for read cache. If it's affordable for you, I'd suggest putting RAM in the box for L1ARC instead. At least in my workload, I see very few L2ARC hits. Most hits (90%) come from the L1ARC in memory on my chunkservers that have 96GB. The next series of systems I buy will have 128G, and I'll cut my L2ARC SSDs to less than half of what my current systems have (2.4T -> 960G). I'm guessing I could actually remove the L2ARC altogether and not see a performance hit, but I haven't done enough benchmarking to prove that one way or another.

Ben

On May 17, 2012, at 2:22 PM, Steve Wilson wrote:

On 05/17/2012 04:17 PM, Steve Wilson wrote:
On 05/17/2012 04:05 PM, Atom Powers wrote:
On 05/17/2012 12:44 PM, Steve Wilson wrote:
On 05/17/2012 03:26 PM, Atom Powers wrote:

* Compression, 1.16x in my environment

I don't know if 1.16x would give me much improvement in performance. I typically see about 1.4x on my ZFS backup servers, which made me think that this reduction in disk I/O could result in improved overall performance for MooseFS.

Not for performance, for disk efficiency. Ostensibly those 64MiB chunks won't always use 64MiB with compression on, especially for smaller files.

This is a good point, and it might help where it's most needed: all those small configuration files, etc. that have a large impact on the user's perception of disk performance.

Bad: * high RAM requirement

Is the high RAM due to using raidz{2-3}? I was thinking of making each disk a separate ZFS volume and then letting MooseFS combine the disks into an MFS volume (i.e., no raidz). I realize that greater performance could be achieved by striping across disks in the chunk servers, but I'm willing to trade off that performance gain for higher redundancy (in the case of using simple striping) and/or greater capacity (in the case of using raidz, raidz2, or raidz3).

ZFS does a lot of caching in RAM. My chunk servers use hardware RAID, not raidz, and still use several hundred MiB of RAM.

Personally, I would prefer to use raidz for multiple disks under MooseFS, because managing individual disks and disk failures should be much better; for example, it minimizes the amount of re-balancing MooseFS needs to do, not to mention the possible performance benefit. But I can think of no reason why you couldn't do a combination of both.

That is certainly worth considering. I hope to have enough time with the new chunk servers to try out different configurations before I have to put them into service.

Steve
|
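For anyone who wants to try the two suggestions above, the commands below sketch the general idea. They are untested here, and the device, pool, and dataset names are placeholders; the ext4 step assumes an e2fsprogs recent enough to support bigalloc (1.42 or later) and a 3.2+ kernel to mount the result.

    # ext4 with bigalloc and 64KB clusters for a chunkserver data disk
    # (-C sets the cluster size and requires the bigalloc feature)
    mkfs.ext4 -O bigalloc -C 65536 -L mfschunks /dev/sdb1

    # ZFS alternative: lightweight gzip compression on the chunk dataset,
    # then watch the achieved ratio
    zfs set compression=gzip-1 tank/mfschunks
    zfs get compressratio tank/mfschunks

    # rough check of how often reads are served from ARC versus missed
    # (Solaris/Illumos kstat; on ZFS-on-Linux see /proc/spl/kstat/zfs/arcstats)
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses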
From: Davies L. <dav...@gm...> - 2012-05-21 07:33:17
|
This bug was fixed in 1.6.26.

On Fri, Apr 20, 2012 at 1:58 PM, Ken <ken...@gm...> wrote:
> Is nobody interested in this?
>
> There is roughly a one-in-a-thousand chance that this causes file damage.
>
> The log then looks like:
> mfsmaster[7192]: chunk 00000000000EC3A8 has only invalid copies (2) - please
> repair it manually
> mfsmaster[7192]: chunk 00000000000EC3A8_00000002 - invalid copy on (10.1.1.3
> - ver:00000001)
>
> -Ken
>
> On Thu, Apr 19, 2012 at 9:01 AM, Ken <ken...@gm...> wrote:
>>
>> hi, list
>>
>> We found some crashes in mfschunkserver (1.6.24) while stopping it. The
>> test script may look weird:
>>
>> while true:
>>     select a ChunkServer
>>     stop_it
>>     start_it
>>     sleep 1 second
>>
>> Almost 20MiB/s is being written to the system while the script runs. It's
>> a little crazy, I admit.
>>
>> The crash stack:
>> #0  0x00000000004139e7 in masterconn_replicationfinished (status=0 '\0',
>>     packet=0x269b170) at masterconn.c:351
>> 351     if (eptr->mode==DATA || eptr->mode==HEADER) {
>>
>> #0  0x00000000004139e7 in masterconn_replicationfinished (status=0 '\0',
>>     packet=0x269b170) at masterconn.c:351
>> #1  0x0000000000403b6e in job_pool_check_jobs (jpool=0x7f39b43ddea0) at bgjobs.c:338
>> #2  0x0000000000403f17 in job_pool_delete (jpool=0x7f39b43ddea0) at bgjobs.c:365
>> #3  0x0000000000414b31 in masterconn_term () at masterconn.c:864
>> #4  0x0000000000419173 in destruct () at ../mfscommon/main.c:312
>> #5  0x000000000041b60f in main (argc=1, argv=0x7fffc810dda0) at ../mfscommon/main.c:1162
>>
>> # mfschunkserver -v
>> version: 1.6.24
>>
>> I think masterconn_term causes the crash:
>>
>> void masterconn_term(void) {
>>     packetstruct *pptr,*paptr;
>> //  syslog(LOG_INFO,"closing %s:%s",MasterHost,MasterPort);
>>     masterconn *eptr = masterconnsingleton;
>>
>>     if (eptr->mode!=FREE && eptr->mode!=CONNECTING) {
>>         tcpclose(eptr->sock);
>>
>>         if (eptr->inputpacket.packet) {
>>             free(eptr->inputpacket.packet);
>>         }
>>         pptr = eptr->outputhead;
>>         while (pptr) {
>>             if (pptr->packet) {
>>                 free(pptr->packet);
>>             }
>>             paptr = pptr;
>>             pptr = pptr->next;
>>             free(paptr);
>>         }
>>     }
>>
>>     free(eptr);
>>     masterconnsingleton = NULL;
>>     job_pool_delete(jpool);    // this runs too late: the connection state has already been freed above
>>     free(MasterHost);
>>     free(MasterPort);
>>     free(BindHost);
>> }
>>
>> So we moved that call to the start of the function. The patch is below:
>>
>> --- a/mfschunkserver/masterconn.c
>> +++ b/mfschunkserver/masterconn.c
>> @@ -842,6 +842,8 @@ void masterconn_term(void) {
>>  // syslog(LOG_INFO,"closing %s:%s",MasterHost,MasterPort);
>>      masterconn *eptr = masterconnsingleton;
>>
>> +    job_pool_delete(jpool);
>> +
>>      if (eptr->mode!=FREE && eptr->mode!=CONNECTING) {
>>          tcpclose(eptr->sock);
>>
>> @@ -861,7 +863,7 @@ void masterconn_term(void) {
>>
>>      free(eptr);
>>      masterconnsingleton = NULL;
>> -    job_pool_delete(jpool);
>> +
>>      free(MasterHost);
>>      free(MasterPort);
>>      free(BindHost);
>>
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ patch end
>>
>> The crash did not happen again with the patch, and the test ran for almost
>> 12 hours.
>>
>> HTH
>>
>> -Ken

--
- Davies
|
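For anyone who wants to reproduce the stress test Ken describes, here is one possible shell rendering of his pseudocode. The chunkserver hostnames and the use of ssh are assumptions made for illustration; the only commands taken from the thread are the mfschunkserver stop/start invocations, and the test only makes sense while clients keep writing (about 20 MiB/s in Ken's case).

    #!/bin/sh
    # Repeatedly restart a randomly chosen chunkserver while client writes
    # are in flight, roughly once per second.
    CHUNKSERVERS="cs1 cs2 cs3"        # placeholder hostnames

    while true; do
        cs=$(echo "$CHUNKSERVERS" | tr ' ' '\n' | shuf -n 1)
        ssh "$cs" "mfschunkserver stop"
        ssh "$cs" "mfschunkserver start"
        sleep 1
    done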