From: Tony S. <sch...@bc...> - 2008-04-17 19:10:24
On one of my BackupPC setups, I back up a lot of data. On occasion things run for more than 24 hours and I start getting

    .... Botch on admin job for admin : already in use!!

messages in the log file. I'm guessing that this means that a BackupPC_nightly has been queued when there is already one running.

My question is: they all eventually run to completion, but is there any value in queuing one up when another is already running? Does running one right after the other make any sense?

Tony
From: Tino S. <bac...@ti...> - 2008-04-17 19:30:15
On Thu, Apr 17, 2008 at 03:10:13PM -0400, Tony Schreiner wrote:

> On one of my BackupPC setups, I back up a lot of data. On occasion
> things run for more than 24 hours and I start getting
>
> .... Botch on admin job for admin : already in use!!
>
> messages in the log file. I'm guessing that this means that a
> BackupPC_nightly has been queued when there is already one running.
>
> My question is: they all eventually run to completion, but is there
> any value in queuing one up when another is already running? Does
> running one right after the other make any sense?

Just set $Conf{BackupPCNightlyPeriod} = 2; or 4, or whatever. That way, BackupPC_nightly will do only a part of the pool each run.

Bye,

Tino, having that setting at 4.

--
„What we resist, persists.” (Zen saying)

www.craniosacralzentrum.de
www.forteego.de
From: Tony S. <sch...@bc...> - 2008-04-17 19:41:48
On Apr 17, 2008, at 3:30 PM, Tino Schwarze wrote:

> On Thu, Apr 17, 2008 at 03:10:13PM -0400, Tony Schreiner wrote:
>> On one of my BackupPC setups, I back up a lot of data. On occasion
>> things run for more than 24 hours and I start getting
>>
>> .... Botch on admin job for admin : already in use!!
>>
>> messages in the log file. I'm guessing that this means that a
>> BackupPC_nightly has been queued when there is already one running.
>>
>> My question is: they all eventually run to completion, but is there
>> any value in queuing one up when another is already running? Does
>> running one right after the other make any sense?
>
> Just set $Conf{BackupPCNightlyPeriod} = 2; or 4, or whatever. That way,
> BackupPC_nightly will do only a part of the pool each run.
>
> Bye,
>
> Tino, having that setting at 4.

Sorry, I replied first to Tino; here it is again to the list:

I can do that, yes, but the BackupPC_nightly runs actually do complete, even when 3 or 4 get run in succession.

I'm just wondering whether the 2nd and 3rd time BackupPC_nightly -m 0 127 runs in a row, it actually does anything.

Tony
From: Tino S. <bac...@ti...> - 2008-04-18 08:16:15
On Thu, Apr 17, 2008 at 03:41:53PM -0400, Tony Schreiner wrote:

> >> On one of my BackupPC setups, I back up a lot of data. On occasion
> >> things run for more than 24 hours and I start getting
> >>
> >> .... Botch on admin job for admin : already in use!!
> >>
> >> messages in the log file. I'm guessing that this means that a
> >> BackupPC_nightly has been queued when there is already one running.
> >>
> >> My question is: they all eventually run to completion, but is there
> >> any value in queuing one up when another is already running? Does
> >> running one right after the other make any sense?
> >
> > Just set $Conf{BackupPCNightlyPeriod} = 2; or 4, or whatever. That way,
> > BackupPC_nightly will do only a part of the pool each run.
> >
> > Bye,
> >
> > Tino, having that setting at 4.
>
> Sorry, I replied first to Tino; here it is again to the list:
>
> I can do that, yes, but the BackupPC_nightly runs actually do complete,
> even when 3 or 4 get run in succession.
>
> I'm just wondering whether the 2nd and 3rd time BackupPC_nightly -m 0 127
> runs in a row, it actually does anything.

There are two settings for BackupPC_nightly: BackupPCNightlyPeriod and MaxBackupPCNightlyJobs. My experience is that it makes no sense to set MaxBackupPCNightlyJobs to anything higher than 1, since the whole job is I/O-bound anyway and does almost no CPU-intensive work.

The BackupPC_nightly job should only run once a day, processing a part of the pool (depending on the period setting). The next day, the next part is processed, and so on. It skims through the pool and looks for files which are no longer in use by any backup.

The whole process works roughly like this:

- Backups are done by BackupPC_dump on a per-client basis, with no explicit pooling involved yet (apart from full backups, where existing/unaltered files are linked into the new backup from the last full, so the link count in the pool increases too).
- BackupPC_link traverses completed backups and links new files into the pool (so a new file ends up with link count 2).
- When a backup starts, old backups get expired and are moved to the trash directory without further processing.
- BackupPC_trashClean simply removes everything in the trash periodically, so the link count on some files in the pool decreases.
- BackupPC_nightly skims through the pool and removes files with link count 1, since they are not used by any backup any more.

HTH,

Tino.

--
„What we resist, persists.” (Zen saying)

www.craniosacralzentrum.de
www.forteego.de
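
To make that last step concrete, here is a rough conceptual sketch, in Perl, of the link-count idea. It is an illustration only, not BackupPC's actual code: the pool path is an assumption, the real BackupPC_nightly walks only a slice of the hashed pool directories per run (that is what the "-m 0 127" arguments select), and it also deals with hash-collision chains and statistics, which this sketch ignores.

  #!/usr/bin/perl
  # Conceptual sketch only; not BackupPC's real implementation.
  # A pool file that no backup references any more has exactly one hard
  # link left (the pool entry itself), so the nightly pass can drop it.
  use strict;
  use warnings;
  use File::Find;

  my $pool = '/var/lib/backuppc/cpool';   # assumed location; adjust to your TopDir

  find(sub {
      return unless -f $_;
      my $nlink = (lstat $_)[3];          # field 3 of lstat() is the hard-link count
      if ($nlink == 1) {
          print "unreferenced pool file: $File::Find::name\n";
          # unlink $_;                    # the real nightly job removes it here
      }
  }, $pool);
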
From: Craig B. <cba...@us...> - 2008-04-18 15:02:22
Tino,

Very good explanation. One minor comment...

> The whole process works roughly like this:
>
> - Backups are done by BackupPC_dump on a per-client basis, with no explicit
>   pooling involved yet (apart from full backups, where existing/unaltered
>   files are linked into the new backup from the last full, so the link count
>   in the pool increases too).

Actually, any new file is linked immediately if it already exists in the pool. Rsync additionally can avoid transferring the file (or transfer just the changes to the file) provided a file with the exact same path exists in the prior (reference) backup.

> - BackupPC_link traverses completed backups and links new files into the
>   pool (so a new file ends up with link count 2).
> - When a backup starts, old backups get expired and are moved to the
>   trash directory without further processing.
> - BackupPC_trashClean simply removes everything in the trash periodically,
>   so the link count on some files in the pool decreases.
> - BackupPC_nightly skims through the pool and removes files with link
>   count 1, since they are not used by any backup any more.

Craig
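
As a companion to the cleanup sketch above, here is the flip side: how a fresh file can be linked against the pool. Again, this is only an illustrative Perl sketch; BackupPC's real pool hash is not a plain whole-file MD5, the pool path is an assumption, and real code also has to compare file contents to handle hash collisions, which is skipped here.

  #!/usr/bin/perl
  # Illustrative sketch of pooling via hard links; not BackupPC's real code.
  use strict;
  use warnings;
  use Digest::MD5;

  my $pool = '/var/lib/backuppc/cpool';    # assumed location

  # Link the freshly written backup copy against the pool, creating the
  # pool entry if this content has never been seen before.
  sub pool_file {
      my ($backup_copy) = @_;
      open my $fh, '<', $backup_copy or die "open $backup_copy: $!";
      binmode $fh;
      my $digest = Digest::MD5->new->addfile($fh)->hexdigest;   # stand-in for the pool hash
      close $fh;

      my $pool_path = "$pool/$digest";
      if (-e $pool_path) {
          # Content already pooled: replace the copy with a hard link, so the
          # data is stored once and the pool file's link count goes up by one.
          unlink $backup_copy or die "unlink $backup_copy: $!";
          link $pool_path, $backup_copy or die "link: $!";
      } else {
          # First occurrence: the backup copy itself becomes the pool entry.
          link $backup_copy, $pool_path or die "link: $!";
      }
  }

  @ARGV == 1 or die "usage: $0 <file>\n";
  pool_file($ARGV[0]);    # e.g. a file just written under pc/<host>/<n>/
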
From: dan <dan...@gm...> - 2008-04-18 15:10:35
How many files are being backed up that it is taking so long? Is it a bandwidth issue or a file count issue? Are you backing up multiple hosts at the same time?

I think 24 hours is an awfully long time to do a backup. Think about how many things can happen in 24 hours, and think about what you are actually backing up. You could be backing up data that is just not going to be of much value, because the files could change during that 24-hour period.

On Fri, Apr 18, 2008 at 9:02 AM, Craig Barratt <cba...@us...> wrote:

> Tino,
>
> Very good explanation. One minor comment...
>
> [...]
>
> Actually, any new file is linked immediately if it already exists in the
> pool. Rsync additionally can avoid transferring the file (or transfer just
> the changes to the file) provided a file with the exact same path exists
> in the prior (reference) backup.
>
> [...]
>
> Craig
From: Tony S. <sch...@bc...> - 2008-04-18 15:32:38
On Apr 18, 2008, at 11:10 AM, dan wrote:

> How many files are being backed up that it is taking so long? Is it a
> bandwidth issue or a file count issue? Are you backing up multiple hosts
> at the same time? I think 24 hours is an awfully long time to do a backup.
> Think about how many things can happen in 24 hours, and think about what
> you are actually backing up. You could be backing up data that is just
> not going to be of much value, because the files could change during
> that 24-hour period.

As I originally said, it happens occasionally, not all the time. It usually only happens during a full backup of one of the larger clients, which I do on a 28-day interval.

It's a big backup setup: the pool volume is 4 TB, Gbit Ethernet, a dozen clients with filesystems of 1-6 TB. My pre-pooling-and-compression backup size now stands at 33 TB; I have very nearly reached capacity and will need to expand soon.

And yes, I struggle with what needs to be backed up. The users (bioinformatics research) can generate a couple of hundred GB of data every day, some of it very large files, some of it hundreds of thousands of small files; some of it needs to be saved, some of it does not. There is no easy way for me to predict yes or no. I have asked the users to use certain directories for temporary files, which I don't back up; but users are users, as you may know. It will come down to policy, not technical decisions, eventually.

Tony Schreiner
From: Jonathan D. <jon...@ne...> - 2008-04-18 16:59:03
On Apr 18, 2008, at 11:32 AM, Tony Schreiner wrote:

> And yes, I struggle with what needs to be backed up. The users
> (bioinformatics research) can generate a couple of hundred GB of data
> every day, some of it very large files, some of it hundreds of thousands
> of small files; some of it needs to be saved, some of it does not.
> There is no easy way for me to predict yes or no.

How much memory does the server have? Just a guess, but especially if you are using rsync, you could be running into problems with the size of the file list that rsync has to maintain in memory, and rsync could spend all of its time paging. "top", for example, should give you some idea. I would recommend using a 64-bit OS and maxing out the memory if possible; that should help unless something else is going on. Also, the BackupPC server(s) should be dedicated to that purpose and not running other jobs (my prior experience with bioinformatics is that they like to squeeze CPU cycles from any box they can get their hands on).

Second, take a look at aggregation and backplane speed on the network switches; maybe links are getting saturated somewhere, or you need to do some trunking. Make sure your link speeds are really Gb/s from end to end, and try some large file transfers to see how much throughput you really get from point to point. Take a look at the network interface stats, e.g. from ifconfig, to see if any of the numbers look awry; double-check duplex settings, MTU consistency, and any routers or firewalls in the path. I am sure you already know all that, but once in a while I know I myself forget to check the things that are quick and easy to check and would be easy to fix; very little time is wasted if that turns out not to be relevant.

If possible, you may want to have a parallel "storage network" connected to an extra NIC in each server to get maximum bandwidth for backups. If you make that secure and separate from the regular LAN, then you don't need to, e.g., tunnel through SSH; that will save you some overhead if you are encrypting right now.

I take it for granted that you are using some type of hardware RAID or a SAN to get better I/O throughput on the server; you would pretty much have to be, for your capacity.

Jonathan
From: Tino S. <bac...@ti...> - 2008-04-18 18:13:07
On Fri, Apr 18, 2008 at 11:32:47AM -0400, Tony Schreiner wrote:

> I have asked the users to use certain directories for temporary
> files, which I don't back up; but users are users, as you may know.

I've "educated" my users the hard way. We've got home directories with quotas set up. Then we've got a temporary "exchange" directory. Every file not changed within the last 14 days is removed by a nightly job (first moved to some hidden location, then really removed after another 14 days). That works reasonably well (if you tell find to use ctime, not mtime).

Tell users which directories get backed up and which don't. Use quotas for the backed-up directories and have temporary directories cleaned automatically (or just move files somewhere else from time to time, for educational purposes). :->

HTH,

Tino.

--
„What we resist, persists.” (Zen saying)

www.craniosacralzentrum.de
www.forteego.de
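
For anyone who wants to copy that idea, here is a minimal sketch of such a two-stage cleanup, written here in Perl (Tino's actual job presumably uses find(1); the paths, the hidden holding-directory name and the 14-day windows below are illustrative assumptions):

  #!/usr/bin/perl
  # Two-stage expiry for a shared "exchange" directory; illustrative sketch only.
  # Stage 1: files whose ctime is older than 14 days move into a hidden holding area.
  # Stage 2: files that then sit in the holding area for another 14 days are deleted.
  use strict;
  use warnings;
  use File::Find;
  use File::Path qw(make_path);
  use File::Basename;
  use File::Copy qw(move);

  my $exchange = '/srv/exchange';            # assumed path
  my $hold     = "$exchange/.expired";       # assumed hidden holding area
  my $max_days = 14;

  make_path($hold);

  # Stage 1: move stale files out of the visible tree. A same-filesystem move
  # is a rename, which updates the file's ctime, so the 14-day clock restarts
  # once the file sits in the holding area.
  find(sub {
      return unless -f $_;
      return if $File::Find::dir =~ /^\Q$hold\E/;        # skip the holding area itself
      my $age_days = (time - (lstat $_)[10]) / 86400;    # field 10 of lstat() is ctime
      if ($age_days > $max_days) {
          # Name collisions in the flat holding area are ignored for brevity.
          move($_, "$hold/" . basename($_))
              or warn "move $File::Find::name: $!";
      }
  }, $exchange);

  # Stage 2: really delete whatever has been held for another 14 days.
  find(sub {
      return unless -f $_;
      my $age_days = (time - (lstat $_)[10]) / 86400;
      unlink $_ if $age_days > $max_days;
  }, $hold);
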
From: Tony S. <sch...@bc...> - 2008-04-18 18:34:03
On Apr 18, 2008, at 2:13 PM, Tino Schwarze wrote:

> On Fri, Apr 18, 2008 at 11:32:47AM -0400, Tony Schreiner wrote:
>
>> I have asked the users to use certain directories for temporary
>> files, which I don't back up; but users are users, as you may know.
>
> I've "educated" my users the hard way. We've got home directories with
> quotas set up. Then we've got a temporary "exchange" directory. Every
> file not changed within the last 14 days is removed by a nightly job
> (first moved to some hidden location, then really removed after another
> 14 days). That works reasonably well (if you tell find to use ctime,
> not mtime).
>
> Tell users which directories get backed up and which don't. Use quotas
> for the backed-up directories and have temporary directories cleaned
> automatically (or just move files somewhere else from time to time, for
> educational purposes). :->
>
> HTH,
>
> Tino.

I don't want to impose quotas, as appealing an idea as that sounds. The machines are for grant-funded work by a relatively small number of total users. There are legitimate reasons for them to be generating the amount of data that they are. Bioinformatics being the field that it is, the size of data sets grows exponentially with time.

I do have a policy of certain directories not being backed up, but I need to remind the users, and perhaps take a harder line with them.

Tony
From: Jonathan D. <jon...@ne...> - 2008-04-18 20:00:52
On Apr 18, 2008, at 2:34 PM, Tony Schreiner wrote:

> I don't want to impose quotas, as appealing an idea as that sounds.
> The machines are for grant-funded work by a relatively small number
> of total users. There are legitimate reasons for them to be generating
> the amount of data that they are. Bioinformatics being the field that
> it is, the size of data sets grows exponentially with time.
>
> I do have a policy of certain directories not being backed up, but I
> need to remind the users, and perhaps take a harder line with them.

I used to work in an academic biotech lab for over 10 years, and I can vouch for that volume of data; this is not just Word docs and e-mail and such. The data is what really makes them their money to get new grants, so there is not much you can do to get them to limit it: data = $ in this case.

Every time grant renewals come around and people are putting requests for money into their proposals for new machines, you have to remind them that the data also needs to be backed up, and that they should also budget for the hardware needed to do that; the hardware does not just appear out of thin air, and there may not be enough in the general IT budget alone to cover it. End of the FY is another good time to put a bug in their ear; there may be some "use it or lose it" funds they could contribute to hardware purchases that are of general benefit to everybody.

Always advise people what they "should" be doing and what is really required to get the job done. If they are unwilling or unable to provide what is really necessary, you can do your best and make do, but you can't promise that there won't be problems down the road if things are held together with chewing gum and baling wire, so to speak.

Jonathan
From: Jonathan D. <jon...@ne...> - 2008-04-18 19:32:06
On Apr 18, 2008, at 2:00 PM, Tony Schreiner wrote:

> Dedicated backup server, 64-bit CentOS with 4 GB RAM. dstat doesn't
> show any paging. The clients tend to have much more RAM.

Very good, that sounds adequate. More cores and more RAM on the server could still help a bit, or an additional server or two (but again, budget), but it sounds like that is not the main cause of the problem.

> The BackupPC host summary web page shows speeds between 9 MB/s and
> 35 MB/s depending on the client; I think that's in line with what
> others are seeing on Gbit networks. The storage is a 3Ware 9550 with
> 10 disks (I admit that the controller BBU has failed and needs
> replacing, and that is slowing down my write speed).

Sounds about right.

> When I watch the progress of backups on the server, what really seems
> to be the slowest part is when large files (> several GB) are being
> compared against the pool. I don't know enough about the internals of
> the software to know quite what is happening there.

This is good info and could get you answers more suited to your particular setup.

First, a word of caution: it's possible that changing compression or checksum caching will cause BackupPC to immediately eat up 2x as much disk space until old files expire from the pool; someone here should be better qualified to explain whether that could be a problem. I do know for sure that it happens if you change certain parameters that affect the hashing function.

You should use "Rsync checksum caching" if you do not already do so; that may help quite a bit. There is a pretty good description in the FAQ, in the section with the same title. This will take advantage of rsync's built-in checksum functions. There is some risk that if a file gets corrupted in the pool for some reason, you won't learn about it until you try to restore, so keep that in mind, especially since the RAID controller has a dead BBU.

http://backuppc.sourceforge.net/faq/BackupPC.html

If these huge files are not more or less "static" and you can manage enough disk space, you may want to turn off compression, or perhaps exclude those files from BackupPC and use a different method to back them up; at least try a smaller value for $Conf{CompressLevel}. See also the earlier FAQ topic "Compressed file format". If the files themselves are already compressed on the source, then BackupPC should "detect" that compressing them further doesn't help and switch to flush(), in which case turning off compression probably won't help.

I haven't dug around in the guts of BackupPC, or in all of the configuration options that are possible, for some time; there may well be an option to not compress files over some size limit, or other tweaks you could make. Hopefully someone here can give you better answers on that.

Jonathan
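
For reference, here is a sketch of the config.pl lines being discussed, in BackupPC 3.x style; treat the exact option names and values as things to verify against the FAQ linked above and against your installed version rather than as gospel:

  # Illustrative config.pl excerpt; verify against your BackupPC version.

  # Rsync checksum caching: the documented magic seed tells BackupPC's rsync
  # code to append cached block and file checksums to compressed pool files,
  # so full backups do not have to recompute them every time. This assumes
  # the default argument lists are already defined earlier in the file.
  push @{$Conf{RsyncArgs}},        '--checksum-seed=32761';
  push @{$Conf{RsyncRestoreArgs}}, '--checksum-seed=32761';

  # Occasionally re-verify cached checksums against the real file contents,
  # which helps catch silent pool corruption.
  $Conf{RsyncCsumCacheVerifyProb} = 0.01;

  # Cheaper compression (or 0 to disable) if compressing huge files is the bottleneck.
  $Conf{CompressLevel} = 1;

  # And, from earlier in this thread: spread the nightly pool traversal over
  # several nights so BackupPC_nightly does not pile up behind long backups.
  $Conf{BackupPCNightlyPeriod}  = 4;
  $Conf{MaxBackupPCNightlyJobs} = 1;
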