From: WK <wk...@bn...> - 2011-09-23 18:53:33
On 9/22/2011 2:42 AM, Ólafur Ósvaldsson wrote:
> We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s

So I checked the code, and here is the offending section in chunks.c of the mfsmaster code:

    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        TmpMaxDelFrac *= 1.3;
        TmpMaxDel = TmpMaxDelFrac;
        syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
    }

So CHUNKS_DEL_LIMIT will automatically increase by 30% every cycle until the deletion queue gets caught up. Of course, it just keeps rising if you are deleting hundreds of thousands of files, regardless of the performance hit, which can be severe (at least that's what we see).

That whole section could be commented out and mfsmaster recompiled, so that the overwhelming deletion run doesn't happen and CHUNKS_DEL_LIMIT really means LIMIT, even if the deletion queue stacks up.

Instead, we decided to drop our DEL_LIMIT to 12, which has no impact on our system for normal deletions. We are going to let the rate increase by 10% per cycle up to DEL_LIMIT*2 (i.e. 24 in our case), which is still comfortable, and then log a warning if we are still not keeping up.

    // Local version 09-23-2011
    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        if (TmpMaxDelFrac < (MaxDel*2)) {
            TmpMaxDelFrac *= 1.1;
            TmpMaxDel = TmpMaxDelFrac;
            syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
        } else {
            syslog(LOG_NOTICE,"DEL_LIMIT at MAXIMUM of: %u/s",TmpMaxDel);
        }
    }

We are testing this now on our test cluster. This was a quickie, so someone let me know if I'm missing something important occurring elsewhere.

-bill
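
For anyone who wants to see how the capped version behaves before touching a real master, here is a rough stand-alone sketch of the same adjustment step. It is not the mfsmaster code: the `del_limiter` struct and the `del_limiter_init` / `del_limiter_step` functions are made up for illustration, and only the field names echo the variables quoted from chunks.c. It also assumes nothing else in the master modifies the limit between cycles, which may not be true.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the limiter state (not the real mfsmaster globals). */
typedef struct {
    uint32_t max_del;          /* configured CHUNKS_DEL_LIMIT (MaxDel) */
    double   tmp_max_del_frac; /* fractional working limit (TmpMaxDelFrac) */
    uint32_t tmp_max_del;      /* integer limit actually applied (TmpMaxDel) */
} del_limiter;

/* Start with the working limit equal to the configured limit. */
void del_limiter_init(del_limiter *l, uint32_t max_del) {
    assert(max_del > 0);
    l->max_del = max_del;
    l->tmp_max_del_frac = (double)max_del;
    l->tmp_max_del = max_del;
}

/* One adjustment cycle, mirroring the "local version" above.
 * Returns 1 if the limit was raised, 0 otherwise. */
int del_limiter_step(del_limiter *l, uint32_t deldone,
                     uint32_t delnotdone, uint32_t prevdelnotdone) {
    /* Raise only while the backlog exceeds completed work and is growing. */
    if (delnotdone > deldone && delnotdone > prevdelnotdone) {
        /* The cap is tested BEFORE the multiply, as in the patch. */
        if (l->tmp_max_del_frac < l->max_del * 2.0) {
            l->tmp_max_del_frac *= 1.1;
            l->tmp_max_del = (uint32_t)l->tmp_max_del_frac; /* truncates */
            return 1;
        }
        /* At the cap: the patch only logs a warning here. */
    }
    return 0;
}
```

With a base limit of 12 and a backlog that keeps growing, this model raises the limit eight times and then stops. Because the cap is checked before the multiply, the last raise starts from about 23.4 and lands at about 25.7, so the applied limit settles at 25 rather than exactly 24. If a hard ceiling of DEL_LIMIT*2 matters, clamping the value after the multiply would be one way to get it.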