From: David <wiz...@gm...> - 2009-08-17 11:53:08
Hi there. Firstly, this isn't a backuppc-specific question, but it is relevant to backuppc users (because of the backuppc architecture), so there might be people here with insight on the subject (or maybe someone can point me to a more relevant project or mailing list). My problem is as follows...

With backup systems based on complete hardlink-based snapshots, you often end up with a very large number of hardlinks: at least one per file, per backup generation. Most of the time this is fine, but there is a problem case. If the servers you're backing up themselves have a huge number of files (hundreds of thousands, or even millions), you end up creating a huge number of hardlinks on your backup server for each backup generation. Although inefficient in some ways (it uses up a large number of inode entries in the filesystem tables), this can work pretty nicely.

Where the real problem comes in is when admins want to run 'updatedb' or 'du' on the backup server. updatedb builds a *huge* database and uses up tonnes of CPU & RAM (so I usually disable it). And 'du' can take days to run and produce multi-GB output files.

So, here's a question for backuppc users (and people who use hardlink snapshot-based backups in general): when your backup server, with millions of hardlinks on it, is running low on space, how do you deal with it? The most obvious approach is to find which host's backups are taking up the most space, and then remove some of the older generations. Normally the simplest way to do that is to run a tool like 'du' and then view the output in something like xdiskusage. (One interesting thing about 'du' is that it's clever about hardlinks, so it doesn't count the same disk usage twice. I think it must keep an in-memory table of visited inodes that have a link count of 2 or greater -- see the sketch of my guess further down.)

However, with a gazillion hardlinks, du takes forever to run and produces massive output: in my case about 3-4 days, and a 4-5 GB output file. My current setup is a basic hardlink snapshot-based backup scheme, but backuppc (due to its pool structure, where hosts have generations of hardlink snapshot dirs) would have the same problems. How do people solve this? (I also imagine that running 'du' over backuppc data is further complicated by the backuppc pool, but at least you can exclude the pool from the scan, e.g. with GNU du's --exclude option, to get more usable results.)

My current fix is an ugly hack: I go through my snapshot backup generations, from oldest to newest, and remove all redundant hardlinks (i.e. ones that point to the same inode as the same path in the next-most-recent generation). That info goes into a compressed text file that the links could be restored from later. Then I compare the next two most-recent generations, and so on. (There's a rough sketch of this in the P.S. below.) But yeah, that's a very ugly hack... I want to do it better and not re-invent the wheel. I'm sure this kind of problem has been solved before.

FWIW, I was using rdiff-backup before. It's very du-friendly, since only the differences between backup generations are stored (rather than a large number of hardlinks). But I had to stop using it: on servers with a huge number of files it uses up a huge amount of memory and CPU, and takes a really long time. The mailing list wasn't much help in fixing that, so I had to switch to something new in order to keep running backups (with history).
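(Aside: here's a rough Python sketch of the hardlink handling I *guess* 'du' does. This is just my mental model, not du's actual code: remember the (device, inode) of every multiply-linked file you visit, and only count each inode's blocks the first time you see it.)

#!/usr/bin/env python3
# Guess at du-style hardlink handling: count each multiply-linked
# inode's blocks only once, keyed on (st_dev, st_ino).
import os
import sys

def dedup_usage(root):
    seen = set()   # (st_dev, st_ino) of multiply-linked files already counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue              # file vanished or unreadable; skip it
            if st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue          # this inode was already counted
                seen.add(key)
            total += st.st_blocks * 512   # st_blocks is in 512-byte units
    return total

if __name__ == "__main__":
    print(dedup_usage(sys.argv[1]))

If that guess is right, it would also explain the memory use: the 'seen' table grows with every multiply-linked inode it visits, which on a server like mine means millions of entries.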
Anyway: that's when I changed over to a hardlink snapshots approach, which has the other problems detailed above. And my current hack (removing all redundant hardlinks and empty dir structures) is in some ways similar to what rdiff-backup does, just coming at it from the other direction.

Thanks in advance for ideas and advice.

David.
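P.S. In case it makes the hack clearer, here's a rough Python sketch of what my redundant-hardlink removal does. The command-line interface (older generation dir, newer generation dir, log file) is just how I've framed it for illustration, and it's destructive, so treat it as a sketch rather than a tool:

#!/usr/bin/env python3
# Sketch of my "remove redundant hardlinks" hack: unlink files in the
# OLDER generation whose inode also appears at the same relative path
# in the NEWER generation, logging them to a gzip'd text file so the
# links could be recreated later.  Emptied dirs are removed afterwards.
import gzip
import os
import sys

def prune_redundant(older, newer, logfile):
    with gzip.open(logfile, "wt") as log:
        # topdown=False walks leaves first, so emptied dirs can be
        # removed on the way back up
        for dirpath, dirnames, filenames in os.walk(older, topdown=False):
            rel = os.path.relpath(dirpath, older)
            for name in filenames:
                old_path = os.path.join(dirpath, name)
                new_path = os.path.join(newer, rel, name)
                try:
                    old_st = os.lstat(old_path)
                    new_st = os.lstat(new_path)
                except OSError:
                    continue   # no counterpart in the newer generation
                if (old_st.st_dev, old_st.st_ino) == (new_st.st_dev, new_st.st_ino):
                    log.write(os.path.join(rel, name) + "\n")
                    os.unlink(old_path)   # same inode: drop the older link
            try:
                os.rmdir(dirpath)         # only succeeds if now empty
            except OSError:
                pass                      # still has contents; keep it

if __name__ == "__main__":
    prune_redundant(sys.argv[1], sys.argv[2], sys.argv[3])

Restoring a pruned generation would then mean walking the paths in the compressed log and re-creating each one as a hardlink to the same relative path in the next-most-recent generation.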