Re: [Moosefs-users] MFS Access very slow during chunk deletions

On 6/28/2011 4:14 AM, Ólafur Ósvaldsson wrote:
> When reading the man page for mfsmaster.cfg I see the comments for 
> CHUNKS_LOOP_TIME and CHUNKS_DEL_LIMIT, and my understanding is that 
> with the default values the maximum number of chunks to delete in one 
> loop (300 sec) is 100. It does not say whether that is per chunkserver 
> or for the whole system, but each server here was doing upwards of 5000 
> chunk deletions per minute, and with 10 servers that's over 50k chunk 
> deletions per minute for the whole system.
>
>        CHUNKS_LOOP_TIME
>               Chunks loop frequency in seconds (default is 300)
>
>        CHUNKS_DEL_LIMIT
>               Maximum number of chunks to delete in one loop (default is 100)
>

We just got hit by this.

We had a small cluster (7 million files) whose chunkservers had only 
1GB of RAM, and it suddenly started doing 10K+ deletions a minute. That 
bogged down the entire cluster, making it unusable, and even sent two 
chunkservers into swap.

We were able to get the chunkservers shut down and restarted the 
MFSMaster with CHUNKS_DEL_LIMIT set to 50. After a settling-down 
period, the deletions started again, but this time at half the rate (around 
5K deletions a minute), which was still excessive given what we have.

So I can verify that changing CHUNKS_DEL_LIMIT does have an effect (if you 
restart the master), but the default is way too high unless you are 
careful. In our case, we weren't paying attention and didn't realize 
the number of files was growing past a reasonable point for those 
resources.

We have since set it to 20 and have increased the RAM in the chunkservers.
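
For anyone picking a value, here is a rough sketch of the arithmetic, using only the two data points reported above (limit 100 gave roughly 10K deletions a minute, limit 50 gave roughly 5K). It assumes the deletion rate scales linearly with CHUNKS_DEL_LIMIT, which is an extrapolation from this one cluster and not documented behaviour; the function names are made up for illustration.

```python
# Data points reported in this thread: (CHUNKS_DEL_LIMIT, approx. deletions per minute).
# The 10K figure was reported as "10K+", so treat everything here as a rough lower bound.
OBSERVED = [(100, 10_000), (50, 5_000)]


def deletions_per_limit_unit(observations):
    """Average deletions/minute per unit of CHUNKS_DEL_LIMIT.

    ASSUMPTION: the rate is proportional to the limit. Two data points
    from one cluster cannot prove that.
    """
    return sum(rate / limit for limit, rate in observations) / len(observations)


def projected_rate(limit, observations=OBSERVED):
    """Projected deletions/minute for a given CHUNKS_DEL_LIMIT (linear guess)."""
    return limit * deletions_per_limit_unit(observations)


if __name__ == "__main__":
    for limit in (100, 50, 20):
        print(limit, projected_rate(limit))
```

If the linear guess holds, a limit of 20 would still mean on the order of a couple of thousand deletions a minute on a cluster like ours, so it is worth watching the real rate after the change rather than trusting the projection.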

BTW, the sizing data in the FAQ for chunkservers should be more 
explicit. It should say that you need about 150MB of RAM per 
chunkserver for every million chunks stored on that chunkserver (which 
is about what we are seeing), so the more chunkservers you have, the 
less RAM each one needs.
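
The rule of thumb above can be turned into a quick estimate. This is only a sketch: the 150MB-per-million-chunks figure is what we observed, not an official number, the cluster size in the example is hypothetical, and it assumes chunk copies are spread evenly across chunkservers.

```python
# Observed in this thread: roughly 150 MB of chunkserver RAM per million chunks
# stored on that chunkserver. Not an official MooseFS figure.
MB_PER_MILLION_CHUNKS = 150


def chunkserver_ram_mb(total_chunk_copies, num_chunkservers):
    """Estimate RAM (in MB) needed by each chunkserver.

    ASSUMPTION: chunk copies are distributed evenly across chunkservers.
    total_chunk_copies counts every stored copy (chunks times their goal).
    """
    if num_chunkservers <= 0:
        raise ValueError("num_chunkservers must be positive")
    chunks_per_server = total_chunk_copies / num_chunkservers
    return chunks_per_server / 1_000_000 * MB_PER_MILLION_CHUNKS


if __name__ == "__main__":
    # Hypothetical cluster holding 14 million chunk copies in total.
    for servers in (2, 4, 10):
        print(servers, "chunkservers:", round(chunkserver_ram_mb(14_000_000, servers)), "MB each")
```

Leave headroom on top of whatever this gives you for the OS and page cache; our chunkservers went into swap well before we noticed the chunk count climbing.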

Maybe the CGI could be used to query resources and warn about potential 
RAM resource issues.

-bill
