From: <wk...@bn...> - 2011-09-22 05:31:07
Ok, we deleted a couple hundred thousand files from a large Maildir folder set.

We have had problems with deletions overwhelming the cluster in the past, so we have DEL_LIMIT set to 20 (which we will probably lower). But when the expire time hit, the server became lethargic. Checking the logs, I see this:

Sep 21 21:10:31 mfs1master mfsmaster[2373]: DEL_LIMIT temporary increased to: 26/s
Sep 21 21:15:30 mfs1master mfsmaster[2373]: DEL_LIMIT temporary increased to: 33/s
Sep 21 21:55:24 mfs1master mfsmaster[2373]: DEL_LIMIT decreased back to: 26/s

OK, WHY IS IT DOING THIS? I told it no more than 20.

I do NOT want this; it kills my server, and we've learned the hard way that it can really screw up VM images (they go read-only) if the deletions overwhelm the cluster.

-bill
From: Ólafur Ó. <osv...@ne...> - 2011-09-22 10:00:20
We have the exact same problem: chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s.

This is also the case for chunk replications; it does not seem to honor the mfsmaster.cfg settings, although that does not get logged.

/Oli

--
Ólafur Osvaldsson
System Administrator
Nethonnun ehf.
e-mail: osv...@ne...
phone: +354 517 3400
From: WK <wk...@bn...> - 2011-09-23 01:07:10
Well, I suppose it would be easy enough to grep the source code for 'DEL_LIMIT temporary increase' and start commenting some things out for a quick fix.

However, I'd prefer that the maintainers address the issue with something more comprehensive, and/or a flag to choose between strict enforcement of the DEL_LIMIT setting and the current setup, which is obviously some sort of 'oh no, we have a LOT of files we need to work through and need more resources' logic.

There may also be some reason they do this, such as a resource issue. We will see what they have to say. Maybe it's already addressed in the next version.

-bill
From: WK <wk...@bn...> - 2011-09-23 18:53:33
On 9/22/2011 2:42 AM, Ólafur Ósvaldsson wrote:
> We have the exact same problem, chunk deletions have caused problems in the past and we have DEL_LIMIT set at 5, but mfsmaster increases it to 40-50 right away and sometimes goes to 70/s

So I checked the code, and here is the offending section in chunks.c of the mfsmaster code:

	if (delnotdone > deldone && delnotdone > prevdelnotdone) {
		TmpMaxDelFrac *= 1.3;
		TmpMaxDel = TmpMaxDelFrac;
		syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
	}

So CHUNKS_DEL_LIMIT will automatically increase by 30% every cycle until the deletion queue gets caught up. Of course, it just keeps rising if you are deleting hundreds of thousands of files, regardless of the performance hit, which can be severe (at least that's what we see).

That whole section could be commented out and mfsmaster recompiled, so that the overwhelming deletion run doesn't happen and CHUNKS_DEL_LIMIT really means LIMIT, even if the deletion queue stacks up.

Instead, we decided to drop our DEL_LIMIT to 12, which has no impact on our system for normal deletions, and we are going to let the rate increase by 10% per cycle up to DEL_LIMIT*2 (i.e. 24 in our case), which is still comfortable, and then give us a warning if we are still not keeping up:

	// Local version 09-23-2011
	if (delnotdone > deldone && delnotdone > prevdelnotdone) {
		if (TmpMaxDelFrac < (MaxDel*2)) {
			TmpMaxDelFrac *= 1.1;
			TmpMaxDel = TmpMaxDelFrac;
			syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
		} else {
			syslog(LOG_NOTICE,"DEL_LIMIT at MAXIMUM of: %u/s",TmpMaxDel);
		}
	}

We are testing now on our test cluster. This was a quickie, so someone let me know if I'm missing something important occurring elsewhere.

-bill
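As a quick standalone check (plain C, not part of the MooseFS source), the 1.3 factor in the unmodified block reproduces the progression from the log excerpt in the first message, assuming the ramp condition holds every cycle and the fractional counter starts at the configured 20/s:

	#include <stdio.h>

	int main(void) {
		double tmp_max_del_frac = 20.0;                         /* DEL_LIMIT = 20 */
		for (int cycle = 1; cycle <= 3; cycle++) {
			tmp_max_del_frac *= 1.3;                        /* same factor as chunks.c */
			unsigned tmp_max_del = (unsigned)tmp_max_del_frac;  /* truncation on assignment */
			printf("cycle %d: DEL_LIMIT temporary increased to: %u/s\n",
			       cycle, tmp_max_del);                     /* prints 26, 33, 43 */
		}
		return 0;
	}

The first two values (26/s and 33/s) match the syslog lines posted at the start of the thread.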
From: Kristofer P. <kri...@cy...> - 2011-09-23 19:01:38
It looks like it is trying to increase the limit when the backlog of things to be deleted is growing too fast. Perhaps there should be a hard-stop maximum (like CHUNKS_DEL_HARD_LIMIT) which TmpMaxDel would never exceed. In your case, it could be set to the same value as the current CHUNKS_DEL_LIMIT, so that it has the best of both worlds.
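A rough sketch of what that could look like against the snippet WK posted, reusing its variables; MaxDelHard and CHUNKS_DEL_HARD_LIMIT are hypothetical and do not exist in the current source:

	// Hypothetical sketch only: MaxDelHard would be read from a new
	// CHUNKS_DEL_HARD_LIMIT option in mfsmaster.cfg; neither the variable
	// nor the option exists in the current code.
	if (delnotdone > deldone && delnotdone > prevdelnotdone) {
		TmpMaxDelFrac *= 1.3;
		if (TmpMaxDelFrac > (double)MaxDelHard) {
			TmpMaxDelFrac = (double)MaxDelHard;   // hard stop: never ramp past the ceiling
		}
		TmpMaxDel = TmpMaxDelFrac;
		syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
	}

With the hard ceiling set equal to the configured CHUNKS_DEL_LIMIT, the automatic ramp would effectively be disabled; set higher, it would allow a bounded catch-up.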
From: Michał B. <mic...@ge...> - 2011-09-24 13:06:03
Hi!

We discussed a similar problem with Ólafur in July, as far as I remember. Yes, we know it is not very optimal - if there are many files to be deleted, the system tries to increase the limit so that they get deleted, but unfortunately with a really huge number of files to be deleted the system gets bogged down... We have some ideas for improvement, e.g. first truncating the files (setting them to 0 bytes) and doing the real deletion later. Setting this limit on the fly with the master tools would also be possible.

Kind regards
Michał Borychowski
MooseFS Support Manager

Gemius S.A.
ul. Wołoska 7, 02-672 Warszawa
Budynek MARS, klatka D
Tel.: +4822 874-41-00
Fax : +4822 874-41-01
From: <wk...@bn...> - 2011-09-24 20:22:23
On 9/24/11 6:04 AM, Michał Borychowski wrote:
> We discussed a similar problem with Ólafur in July, as far as I remember. Yes,
> we know it is not very optimal - if there are many files to be deleted, the
> system tries to increase the limit so that they get deleted, but unfortunately
> with a really huge number of files to be deleted the system gets bogged down...
> We have some ideas for improvement, e.g. first truncating the files (setting
> them to 0 bytes) and doing the real deletion later. Setting this limit on the
> fly with the master tools would also be possible.

What creates the deletion problem is this code in chunks.c:

	if (delnotdone > deldone && delnotdone > prevdelnotdone) {
		TmpMaxDelFrac *= 1.3;
		TmpMaxDel = TmpMaxDelFrac;
		syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
	}

This allows the deletion rate to be increased very quickly (30% every 5 minutes), and there is no hard LIMIT, so the deletion rate keeps going up until the server is overwhelmed and the deletions are consuming all the resources.

Unless you are running out of space on the server, deletion from the trash is a very, very low-priority process and should only be happening AFTER normal reads/writes and replications. So it is not necessary to ramp up the deletion rate in most cases.

As I previously indicated, we have tested (and are now running in production on 3 clusters) the following replacement code:

	// Local version 09-23-2011
	if (delnotdone > deldone && delnotdone > prevdelnotdone) {
		if (TmpMaxDelFrac < (MaxDel*2)) {
			TmpMaxDelFrac *= 1.1;
			TmpMaxDel = TmpMaxDelFrac;
			syslog(LOG_NOTICE,"DEL_LIMIT temporary increased to: %u/s",TmpMaxDel);
		}
	}

This minor change limits the deletion rate to a HARD LIMIT of 2x the CHUNKS_DEL_LIMIT and only increases it by 10% every 5 minutes while it is in the ramp-up phase.

This is working very well for us. We are no longer terrified about deleting large folders, and we don't care if it takes 6-8 hours to clear the post-trashtime deletion queue instead of the cluster being unusable for 1-2 hours.

If we were to find that the number of post-trashtime files was growing to an unreasonably large level, then we would raise the rate for a limited time (probably in the evening when nothing else is going on) and take the performance hit (or add chunkservers/better equipment).

So we would like to see a HARD_DEL_LIMIT in mfsmaster.cfg (instead of just assuming 2x DEL_LIMIT as in our example) and the ability to change those settings on the fly as you mentioned (ideally via a cron job, so we could automatically speed things up a bit in the evenings).

Further down our todo list would be some logic that makes the deletion rate subject to the other activity: if the cluster is otherwise not busy doing reads/writes and replications (as in our at-night scenario), then it could go ahead and speed things up; conversely, if the server is really busy, then postpone the post-trashtime deletions completely.

The truncate-to-0 idea sounds interesting if there is an actual performance gain and it doesn't introduce complications, but if mfsmaster were more intelligent about 'when' and 'how quickly' it deletes files from the trash queue, it's not really high on our wishlist.

Finally, I'd like to thank the maintainers for MFS.

Now that we have the deletion issue solved and we have learned not to let the mfsmaster process exceed 50% of RAM, MFS is a huge improvement over our NFS/DRBD setups in regards to administration, and even the ability to use somewhat older servers in the cluster allows us to save the state-of-the-art kit for databases and VMs.

We've even had a few incidents where equipment failed or we did something stupid, and we were able to recover cleanly. The process was well documented, easy to follow and 'just worked'.

-bill
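In case it is useful to others, here is a standalone sketch (plain C, not MooseFS code) of how fast the patched ramp tops out, assuming the backlog condition stays true every 5-minute cycle and that the fractional counter starts at the configured limit of 12/s:

	#include <stdio.h>

	int main(void) {
		const unsigned max_del = 12;                 /* CHUNKS_DEL_LIMIT */
		double tmp_max_del_frac = (double)max_del;
		for (int cycle = 1; cycle <= 10; cycle++) {
			if (tmp_max_del_frac < (double)(max_del * 2)) {
				tmp_max_del_frac *= 1.1;     /* same 10% step as the local patch */
				printf("cycle %2d (+%3d min): increased to %u/s\n",
				       cycle, cycle * 5, (unsigned)tmp_max_del_frac);
			} else {
				printf("cycle %2d (+%3d min): at maximum of %u/s\n",
				       cycle, cycle * 5, (unsigned)tmp_max_del_frac);
			}
		}
		/* Takes 8 cycles (about 40 minutes) to top out. Because the ceiling is
		 * checked before the multiply, the rate settles at 25/s, just over 2x. */
		return 0;
	}

The small overshoot past 2x is one more argument for an explicit, separately configurable hard limit rather than a derived cap.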
From: Davies L. <dav...@gm...> - 2011-09-25 01:43:50
Maybe we could try it another way: limit the total number of operations in every event loop, then distribute that operation budget by priority across every kind of operation, such as delete and replicate. Then mfsmaster can balance throughput and operation latency.

Davies
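To make the idea concrete, here is a standalone sketch (plain C, not MooseFS code) of splitting a single per-loop operation budget across operation classes by priority weight; the class names, weights and the budget of 100 operations per loop are made up for illustration:

	#include <stdio.h>

	enum { OP_REPLICATE = 0, OP_DELETE, OP_CLASSES };

	static const char *op_name[OP_CLASSES]      = { "replicate", "delete" };
	static const unsigned op_weight[OP_CLASSES] = { 4, 1 };   /* replication outranks deletion */

	static void split_budget(unsigned total, unsigned limits[OP_CLASSES]) {
		unsigned weight_sum = 0, used = 0;
		for (int i = 0; i < OP_CLASSES; i++) {
			weight_sum += op_weight[i];
		}
		for (int i = 0; i < OP_CLASSES; i++) {
			limits[i] = total * op_weight[i] / weight_sum;
			used += limits[i];
		}
		limits[0] += total - used;   /* rounding leftovers go to the highest-priority class */
	}

	int main(void) {
		unsigned limits[OP_CLASSES];
		split_budget(100, limits);   /* e.g. 100 chunk operations allowed in this loop */
		for (int i = 0; i < OP_CLASSES; i++) {
			printf("%s limit this loop: %u\n", op_name[i], limits[i]);
		}
		return 0;                    /* prints: replicate 80, delete 20 */
	}

A scheme like this would let deletions proceed only with whatever share of the loop budget is left after higher-priority work, which is roughly the behavior requested earlier in the thread.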
--
- Davies