>On Thu, Feb 25, 2010 at 09:37:23AM -0500, Wilson Snyder wrote:
>> I'm attempting to replace a custom "hacky" script that has a
>> similar technique (rsync links) as rsnapshot with rsnapshot.
>> The one major feature I see missing is the ability to keep
>> the disk space on a snapshot mountpoint below a specific
>> Even better would be to do a trial run of the sync to
>> estimate the space for the next snapshot, and subtract that
>> from the threshold. The threshold could then be set much
>> tighter (90%ish), as there's less danger of overrunning the
>> Does anyone have a cmd_preexec or other script to do this?
>> If I were to implement this, could the patches be considered
>> for inclusion in rsnapshot?
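For illustration, the threshold check described above could be sketched roughly as follows. This is not an existing rsnapshot feature; the function names and the 90% default are assumptions, and the result may differ slightly from what df reports because reserved blocks are not accounted for.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some virtual filesystems report a size of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(path, threshold_pct=90.0):
    """True if the filesystem holding `path` is fuller than `threshold_pct`."""
    return percent_used(path) > threshold_pct
```

A script run from cmd_preexec could call over_threshold() on the snapshot mountpoint and decide what to do before the backup starts.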
>If rsnapshot is working properly and in a steady state, then it
>should use pretty much constant disk space relative to the size
>of the file systems being backed up (unless there is an unusual
>amount of space taken up by files that have changed).
>So my conclusion is that if you are running out of disk space
>on your rsnapshot destination, it means one of these things:
>(1) your destination disk is not big enough
>(2) you are trying to back up too much stuff
>(3) you are trying to keep too many different backups
>(4) your rsnapshot has not yet retained a complete set of backups,
> but ran out of space when it tried to
>(5) the source file system(s) have grown in size beyond the
> capacity of the destination disk
>(6) there is an unusual amount of file changing (counted by the
> aggregate size of the files being changed) on the source(s)
>(7) your rsnapshot is not working properly - for example destination
> files are not properly hard linked and therefore take up much
> more space on the destination disk than they need to
>Problems 1 to 5 can be solved by provisioning more disk space for
>the destination or changing what you are trying to back up so less
>space is needed. Basically capacity planning.
>Problem 6 can sometimes be managed by investigating what has changed
>and seeing whether a large directory (or a directory with large files,
>or a directory that has large directories for children) has been moved
>If so, it can save space to move/rename the relevant directory in the
>right destination snapshot to match the change that has happened at source,
>and modify snapshots between now and then. But please don't try this if
>you are not sure about what you are doing - if in doubt, don't change the
Thanks. I should have perhaps given you our use case.
We're using it to back up 500GB+ disks used for Subversion
checkouts and chip simulations. These fit well into your
case 6; often on weekends there are few changes and few
checkout changes, but then the next day a user may remove
their areas, make a new checkout and run several simulations
overnight, which can easily make the next snapshot take
several hundred GB. We have dozens of filesystems like this, all
of which may vary from 1-25% of the disk per snapshot. The
point is I don't want to *think* about provisioning.
>But the more interesting point relates to (7). Running low on disk space
>can be a warning that there is a problem with your rsnapshot installation.
>If you are not watching rsnapshot closely, you may not notice if unchanged
>files are not hard-linked together as they should be, until you start to
>run low on disk space.
>So this is a possible counter-argument about a risk that could arise if
>rsnapshot were configured to automatically delete old snapshots when disk
>space starts running low. This could mask a symptom of a problem that
>should be addressed.
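One way to watch for the hard-link problem in (7) directly, rather than waiting for the disk to fill, might be a check like the sketch below. It is only a heuristic under stated assumptions: a file is treated as "unchanged" when its size and modification time match in two snapshot directories, and the function name is made up for this example.

```python
import os
import stat


def unlinked_duplicates(older, newer):
    """Yield relative paths that look unchanged between two snapshot
    directories (same size and mtime) but are stored as separate inodes."""
    for root, _dirs, files in os.walk(newer):
        for name in files:
            new_path = os.path.join(root, name)
            rel = os.path.relpath(new_path, newer)
            old_path = os.path.join(older, rel)
            try:
                new_st = os.lstat(new_path)
                old_st = os.lstat(old_path)
            except FileNotFoundError:
                continue  # file exists in only one of the snapshots
            if not (stat.S_ISREG(new_st.st_mode) and stat.S_ISREG(old_st.st_mode)):
                continue  # only regular files are expected to be hard linked
            same_content_guess = (
                new_st.st_size == old_st.st_size
                and int(new_st.st_mtime) == int(old_st.st_mtime)
            )
            same_inode = (new_st.st_dev, new_st.st_ino) == (old_st.st_dev, old_st.st_ino)
            if same_content_guess and not same_inode:
                yield rel
```

Running it over two adjacent snapshots (for example daily.1 and daily.0) and getting a long list back would suggest that linking is not working as expected.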
Those are good points. Perhaps one idea is to have a
configurable minimum number of snapshots to keep?
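To make the idea concrete, here is a rough sketch of how such a limit could interact with a usage threshold. Nothing here is existing rsnapshot behavior: the function name, the min_keep parameter, and the per-snapshot "reclaimable bytes" figures are all assumptions, and estimating how much space deleting a hard-linked snapshot really frees is left to the caller.

```python
def snapshots_to_delete(snapshots, used_bytes, total_bytes,
                        threshold_pct=90.0, min_keep=3):
    """Plan which snapshots to delete, oldest first.

    `snapshots` is a list of (name, reclaimable_bytes) tuples ordered from
    oldest to newest. Deletion stops as soon as usage drops to the threshold
    or only `min_keep` snapshots would remain.
    """
    limit = total_bytes * threshold_pct / 100.0
    doomed = []
    remaining = len(snapshots)
    for name, reclaimable in snapshots:
        if used_bytes <= limit or remaining <= min_keep:
            break
        doomed.append(name)
        used_bytes -= reclaimable
        remaining -= 1
    return doomed
```

If the loop stops because of min_keep while usage is still above the threshold, the caller knows that deleting old snapshots is not enough and can raise an error instead.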
>However I am interested in the idea. If risks like the masked symptom
>risk can be managed appropriately, and of course if it is implemented
>in a sensible way that doesn't cause other problems, then I would be
>happy to consider it (and also happy to hear opinions for or against
>from people on this list).
>And the idea of a tool that can estimate disk space requirements for
>the next snapshot seems like a good one. For example, that could be
>used to build an early warning system that would check whether there
>seems to be enough space for the next rsnapshot, and automatically
>email a warning a few hours before the rsnapshot is attempted.
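A possible starting point for such an estimate is the statistics that rsync prints with --dry-run --stats. The sketch below only parses that text and compares it with the free space. It assumes the output contains a "Total transferred file size: N bytes" line with a plain or comma-grouped number (the exact format can vary between rsync versions and with human-readable options), and the function names and 10% safety margin are made up for this example.

```python
import re


def transferred_bytes(stats_text):
    """Extract the byte count from the 'Total transferred file size' line
    of rsync --stats output, or return None if the line is not found."""
    m = re.search(r"^Total transferred file size:\s*([\d,]+) bytes",
                  stats_text, re.MULTILINE)
    if m is None:
        return None
    return int(m.group(1).replace(",", ""))


def enough_space(free_bytes, needed_bytes, margin=1.1):
    """True if the free space covers the estimate plus a safety margin."""
    return free_bytes >= needed_bytes * margin
```

The dry run would need the same arguments rsnapshot itself uses (the commands shown by "rsnapshot -t" are a reasonable place to look), otherwise the estimate will not reflect what gets hard-linked rather than copied.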
If the warning is ignored, and there's nothing to delete
(perhaps due to the limits set), and the disk would
otherwise fill, what is the right default behavior? Skip it
entirely with an error? Having a half-snapshot seems
dangerous, and filling a filesystem often causes strange
>(I like to run "rsnapshot -t -q daily" from cron during business
>hours as an early warning system - if there is any output, then I
>get an email.)
Thanks, I'll look into this further then.