File | Date | Author | Commit |
---|---|---|---|
README.md | 2018-06-14 | | [70e49b] Update to gitlab's latest, 0.60 |
ezrsync | 2018-06-14 | | [70e49b] Update to gitlab's latest, 0.60 |
ezrsync.conf | 2012-09-26 | | [c457ee] Version 0.52-beta |
ezrsync.cron | 2018-06-14 | | [70e49b] Update to gitlab's latest, 0.60 |
Git repo: http://gitlab.com/pepa65/ezrsync
Copyright 2012-2016 by dagurasuforge, pepa65 (sourceforge.net ids)
License GNU GPLv3+ http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
An rsnapshot-like tool which uses fixed time-stamped directories instead of rotations.
The initial author dagurasuforge has seen many rsnapshot-like scripts/programs, mostly based on Mike Rubel's blog,
with no real updates to the idea other than added usability features. Thus most use sync/rotate/promote
with complex crontab structures that require careful timing and use of BOTH cron AND anacron
to deal with hourly backups as well as daily, weekly, etc. on a system with downtimes.
Using only cron, systems could miss daily, weekly, monthly or yearly backups entirely.
Without using cron, hourly backups can't be done well.
(Mike Rubel's original method was the most efficient way to explain how people could get something working
using available tools and minimal work, but if we're writing a program, we can do better.)
So ezrsync delivers precisely the same result (multiple intervals of retention) with only one standard call
from cron that does all within one cycle (typically hourly, but it could be longer or shorter),
without the need for anacron. There is no need for renaming, rotating, or promoting any backups after
they are made, but the required number of daily, weekly, monthly and yearly backups is guaranteed to be retained.
Requirements: bash, rsync, ln, mv, rm, date, find, grep and optionally cpio in $PATH
(ln provides only one service and could be removed; grep could be eliminated with some work;
find, date, mv, rm and rsync are fairly essential; bash is indispensable;
cpio is optional, but mandatory if the 'cpio' directive has 1 as argument)
Bash-version compatibility: bash-3.0 (which introduced array index expansion: ${!array[@]})
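The tools listed above can be checked before the first run. This is a minimal sketch and not part of ezrsync; the function name `check_deps` is made up for illustration.

```bash
# Report which of the given commands cannot be found in $PATH.
# check_deps is a hypothetical helper, not part of ezrsync.
check_deps() {
  local missing="" cmd
  for cmd in "$@"; do
    # 'command -v' succeeds when the shell can resolve the name
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  if [ -n "$missing" ]; then
    echo "Missing:$missing" >&2
    return 1
  fi
  return 0
}

# Required tools; add cpio to the list when the 'cpio' directive is set to 1
check_deps bash rsync ln mv rm date find grep || echo "Install the missing tools first"
```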
Prepare a configuration-file patterned after rsnapshot:
Use hourly, daily, weekly, monthly or yearly as interval names, or define other ones.
Use only source/destination backup definitions (extra per-backup options will be ignored).
All non-implemented or inapplicable directives will be flagged but ignored.
The 'snapshot_root'/'destdir' combinations of different configurations must not be the same directory,
unless all interval names are different.
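As an illustration, a small configuration could be written like this. It is only a sketch: 'snapshot_root' and 'retain' are named in this README, while the 'backup' line and the tab-separated fields are assumed to carry over from rsnapshot. All paths are placeholders.

```bash
# Write a minimal example configuration to a scratch location.
# Paths are placeholders; fields are tab-separated as in rsnapshot (assumed).
conf="${TMPDIR:-/tmp}/ezrsync-example.conf"
{
  printf 'snapshot_root\t/backup/snapshots/\n'
  printf 'retain\thourly\t10\n'
  printf 'retain\tdaily\t7\n'
  printf 'backup\t/home/\tlocalhost/\n'
} > "$conf"

# To try it on its own, without the default config or config directory:
#   ezrsync --soleconf "$conf"
```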
Either place the config in '/etc/ezrsync.conf' or in '/etc/ezrsync.conf.d/' or run this script as:
'ezrsync --mainconf <configfile>' or 'ezrsync --soleconf <configfile>'
If there are scripts present in the '/etc/ezrsync.conf.d/' directory, they are (also) all run. This directory
can only be changed in the main configuration file, either in the default one or in the one given
as a commandline option after '--mainconf'.
When '--soleconf' is used, only the specified config file is used, and none of the scripts in any directory.
The 'lockfile' and 'stop_on_stale_lockfile' directives can only be used in either the sole or the main
configuration-file, and not in any of the files in a configuration directory (because there will only be
one logfile and lockfile for the whole run).
The 'confdir' directive (to specify an alternate configuration directory) can only be set in
the main configuration file. If a sole configuration file is specified, it will be the only one used.
If a main configuration file is present at the default location or specified, it will be used first,
and then all the configuration files that are present in the default or specified configuration directory.
For each configuration file, all unspecified directives will be reset to default.
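The order in which configuration files are picked up can be sketched as follows. This is an illustration of the rules above, not ezrsync's own code; `list_configs` and its arguments are hypothetical, and the configuration directory is passed in directly instead of being read from a 'confdir' directive.

```bash
# Print configuration files in the order they would be used.
# Usage: list_configs sole|main <conffile> [confdir]
# list_configs is a hypothetical helper for illustration only.
list_configs() {
  local mode="$1" conf="$2" confdir="$3" f
  if [ "$mode" = sole ]; then
    # --soleconf: only the named file, no directory scan
    printf '%s\n' "$conf"
    return 0
  fi
  # --mainconf or default: the main config comes first, if present
  if [ -f "$conf" ]; then
    printf '%s\n' "$conf"
  fi
  # Then every regular file in the configuration directory
  for f in "$confdir"/*; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
  return 0
}

list_configs main /etc/ezrsync.conf /etc/ezrsync.conf.d
```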
Run ezrsync in a single cron entry, no arguments required, like this (for example every 15 minutes):
*/15 * * * * root /usr/local/bin/ezrsync
It should be run at the smallest interval that is used (or more often), but it won't run if another instance
is still running. If the cron-interval is bigger than the smallest interval, chances are the larger intervals
never get any backups. (A crontab.d script could be packaged with ezrsync without needing user intervention,
except if the smallest used interval is smaller than the crontab that is packaged.)
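Refusing to start while another instance is running is commonly done with a lock file. The sketch below shows one generic way to do that in shell. It is not taken from ezrsync, the lock path and function names are placeholders, and stale-lock handling (see the 'stop_on_stale_lockfile' directive) is not modeled.

```bash
# Try to take the lock by creating the file atomically.
# With noclobber (set -C), '>' fails if the file already exists,
# so checking and creating happen in one step.
acquire_lock() {
  local lock="$1"
  if ( set -C; echo "$$" > "$lock" ) 2>/dev/null; then
    return 0
  fi
  return 1
}

release_lock() {
  rm -f "$1"
}

lock="${TMPDIR:-/tmp}/ezrsync-example.$$.lock"
if acquire_lock "$lock"; then
  echo "lock acquired, the backup run would happen here"
  release_lock "$lock"
else
  echo "another instance holds the lock, skipping this run" >&2
fi
```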
If the source is empty, the backup will not proceed, so backups will not be erased.
So if you use sources that are mounted, make sure the mount-directories are empty when not mounted!
A good idea is to use 'chattr +i' when unmounted to make them unwritable even to root, because
files tend to get accidentally written to unmounted mount points. If that happens, backups will over time
be replaced by the files that are present in the unmounted source directories. Even if this happens only once,
the hard links will be broken, causing more storage space to be used on the destinations!
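A check for an empty source directory could look like the sketch below. It is an illustration, not ezrsync's own test; `source_is_empty` is a hypothetical name, and the `-mindepth`/`-maxdepth` options assume GNU or BSD find.

```bash
# Succeed if the directory exists and contains no entries at all
# (hidden files count as entries). Returns 2 if it is not a directory.
source_is_empty() {
  local dir="$1"
  [ -d "$dir" ] || return 2
  # -mindepth 1 skips the directory itself; any output means "not empty"
  [ -z "$(find "$dir" -mindepth 1 -maxdepth 1 2>/dev/null | head -n 1)" ]
}

# Protecting an unmounted mount point (placeholder path, needs root):
#   umount /mnt/data && chattr +i /mnt/data
# The attribute is set on the underlying directory, so do this while unmounted.
```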
The 'snapshot_root' directive is the only directive that is mandatory; all others have defined default values.
The script can be run more often than the smallest backup cycle that is used, or less often (in the latter case
the larger interval backups might never happen). No worrying about cron jobs interfering with one another.
No anacron is needed. If the computer is off for a period, required backups will be made at the next run
if/when appropriate, and as many backups for each interval will be kept as specified.
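One way to picture "made at the next run if/when appropriate" is an age check per interval: a backup is due when the newest one for that interval is at least one interval old. This is an assumed reading for illustration only. ezrsync's actual rule may differ (it could be calendar-based, for instance), and the function names and approximate month/year lengths below are hypothetical.

```bash
# Length of the standard intervals in seconds (month and year approximated).
interval_seconds() {
  case "$1" in
    hourly)  echo 3600 ;;
    daily)   echo 86400 ;;
    weekly)  echo 604800 ;;
    monthly) echo 2592000 ;;   # 30 days, an approximation
    yearly)  echo 31536000 ;;  # 365 days, an approximation
    *) return 1 ;;
  esac
}

# Succeed if a new backup for the interval is due.
# Arguments: interval name, epoch of newest backup (empty if none), current epoch.
is_due() {
  local interval="$1" last="$2" now="$3" len
  # No backup yet for this interval: always due
  if [ -z "$last" ]; then
    return 0
  fi
  len=$(interval_seconds "$interval") || return 2
  [ $((now - last)) -ge "$len" ]
}
```

With this check it does not matter how long the machine was off: at the next run every interval whose newest backup is too old simply becomes due.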
The directory-names for ezrsync carry both an explicit time stamp and the interval name.
In case (for example) yearly backups were made when the files were a mess that day, they can
just be deleted, and new ones will be made on next run.
The most recent backup is linked to by the symlink '$symlinkname' (set to 'MostRecentBackup' by default).
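The naming and the symlink can be sketched as below. The name format (timestamp, a dot, then the interval) is an assumption made for this example and ezrsync's real format may differ. The helper names are hypothetical, and `ln -n` assumes GNU or BSD ln.

```bash
# Build a snapshot directory name from an interval name and a timestamp.
# The format is an assumption; ezrsync's real naming may differ.
snapshot_name() {
  local interval="$1" stamp="$2"
  printf '%s.%s\n' "$stamp" "$interval"
}

# Point the "most recent" symlink at a snapshot inside the snapshot root.
update_symlink() {
  local root="$1" name="$2" linkname="${3:-MostRecentBackup}"
  # -n replaces an existing symlink instead of following it into its target
  ln -sfn "$name" "$root/$linkname"
}

root=$(mktemp -d)
name=$(snapshot_name hourly "$(date +%Y-%m-%d_%H%M)")
mkdir "$root/$name"
update_symlink "$root" "$name"
ls -l "$root"
```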
Forced backups are supported with the '--forced' option followed by an interval name.
They do not count towards the total number of backups to be kept for that particular interval.
They will eventually get pushed out by regular backups for that interval when there is a
sufficient number of newer regular backups. The regular backups are always properly spaced in time.
So in case (for example) 'retain hourly 10', and 10 hourly backups are forced in one hour, the old
hourly backups will still be kept and slowly be replaced by newer regular hourly backups, and
eventually the forced backups will be deleted when they get older than the oldest of the regular 10.
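The interplay between regular and forced backups can be sketched as a small pruning rule. This is an interpretation of the description above, not ezrsync's code: keep the newest N regular backups, delete older regular ones, and delete a forced backup only once it is older than the oldest regular backup that is kept. The function name and the "TIMESTAMP KIND" input format are hypothetical.

```bash
# Decide which backups of one interval to delete.
# stdin:  one "TIMESTAMP KIND" per line, KIND being 'regular' or 'forced'
# $1:     number of regular backups to retain
# stdout: the lines that would be deleted, newest first
prune_list() {
  local retain="$1" kept=0 oldest_kept="" stamp kind
  # Newest first, so the first $retain regular entries are the keepers
  sort -rn | while read -r stamp kind; do
    if [ "$kind" = regular ]; then
      if [ "$kept" -lt "$retain" ]; then
        kept=$((kept + 1))
        oldest_kept=$stamp
      else
        echo "$stamp $kind"
      fi
    elif [ "$kept" -ge "$retain" ] && [ "$stamp" -lt "$oldest_kept" ]; then
      # Forced backup older than every retained regular backup
      echo "$stamp $kind"
    fi
  done
}

# Example with 'retain 2': regular backups at 100, 200, 300; forced at 150, 250
printf '%s\n' '100 regular' '150 forced' '200 regular' '250 forced' '300 regular' \
  | prune_list 2
```

In the example, the forced backup at 250 survives because it is newer than the oldest retained regular backup (200), while the one at 150 is removed together with the regular backup at 100.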
Independent, persistent backups can be made with the '--independent' option. These never get automatically deleted
and do not interact with the deletion schedule of other interval backups at all.
In case of interruptions, there is a roll forward or roll back mechanism in place
to guard against deletions.
Empty source directories are never backed up. This is really a minimal check, but stricter criteria
like tagging backup-sources with hidden files won't work for read-only sources and may just prevent backups.
Different interval specifications can be applied to different sets of backup directives
by using separate configfiles.
A date-based system is vulnerable to missed backups or auto-deletion if there is a clock error.
Much consideration has been given to this issue in the details of the logic.
If any backup exists 'in the future', the program is aborted. A clock jump to the past
does not threaten existing backups anyway; it just prevents new ones from being made, so it has to be detected.
If the clock jumps forward (for example 2 months, perhaps the machine had just been switched off),
things are fine. Your backups, while suddenly being older, will not all suddenly get expired,
because expiry is based on count within an interval, and not based on time.
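The 'in the future' check can be pictured as follows. This is a minimal sketch under the assumption that backup times are available as epoch seconds; the function name is hypothetical, and `date +%s` assumes GNU or BSD date.

```bash
# Succeed if any timestamp read from stdin lies after 'now'.
# Timestamps are epoch seconds, one per line.
has_future_backup() {
  local now="$1" stamp
  while read -r stamp; do
    if [ "$stamp" -gt "$now" ]; then
      return 0
    fi
  done
  return 1
}

now=$(date +%s)
# One backup from an hour ago, one dated a day ahead of the current clock
if printf '%s\n' "$((now - 3600))" "$((now + 86400))" | has_future_backup "$now"; then
  echo "backup dated in the future, the clock may have jumped back: aborting" >&2
fi
```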
For all intervals, starting with the shortest interval: