Re: [Moosefs-users] mfsmaster hang


On 2012/3/1 20:35, Steve Thompson wrote:
> On Tue, 28 Feb 2012, Ricardo J. Barberis wrote:
>
>> This happened to me once and I also took down every server of the cluster, 9
>> chunkservers and one dedicated metalogger (previously, I unmounted all the
>> clients, about 250).
>>
>> Bad idea: when the master came on-line again and I started one chunkserver,
>> the master went "crazy" trying to recreate empty chunks for later deletion.
>>
>> My "solution" was to start all the chunkservers at the same time, so the
>> master saw all the chunks almost simultaneously and didn't try to create
>> empty chunks.
> It turns out that the master was very, very slow because a RAID-5
> reconstruction was in progress on the box. The I/O performance dropped to
> 5% of its normal value, in spite of the RAID controller throttle being
> set to 30% maximum reconstruction rate (it's a Dell PE2900 server with a
> Perc 5 controller). Once the reconstruction finished, I restarted
> everything (all chunk servers at the same time) and it came up fine within
> a few seconds.
>
> Steve

These days, use SSDs and RAID 10 for the "meta" servers whenever possible. The I/O load during a RAID rebuild should always be taken into consideration.
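
For anyone who wants to script the workaround quoted above (starting all the chunkservers at the same time so the master sees every chunk almost simultaneously), here is a minimal sketch. It assumes passwordless ssh to each chunkserver and that `mfschunkserver start` is on the remote PATH. The host names and the `ssh_start` / `start_all` helpers are hypothetical, made up for illustration, and nobody in this thread has tested it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical host names; replace with the real chunkserver list.
CHUNKSERVERS = ["chunk01", "chunk02", "chunk03"]


def ssh_start(host):
    """Run 'mfschunkserver start' on one host over ssh; return its exit code."""
    # Assumption: key-based ssh login works and mfschunkserver is on the PATH.
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, "mfschunkserver", "start"]
    )
    return result.returncode


def start_all(hosts, runner=ssh_start):
    """Launch every chunkserver concurrently and return {host: exit code}."""
    if not hosts:
        return {}
    # One worker thread per host, so no start waits behind another.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        codes = list(pool.map(runner, hosts))
    return dict(zip(hosts, codes))


# Usage (not run automatically):
#   for host, code in start_all(CHUNKSERVERS).items():
#       print(host, "ok" if code == 0 else "failed")
```

The `runner` argument is only there so the parallel logic can be tried with a dummy function before pointing it at real servers.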
