App::DistSync is a ready-to-use solution for synchronizing two or more web resources containing static data. The project has proven itself in scenarios involving distribution mirrors, software repositories, and collections of photos, videos, audio files, and other multimedia content.
RPM, APT, CPAN, or manually with make installBefore installation, ensure the following packages are present on the system where App::DistSync will run:
You can install App::DistSync automatically or manually.
For RHEL-based systems:
sudo dnf install perl-libwww-perl
For Debian-based systems:
sudo apt install perl-libwww-perl
From CPAN:
cpan install App::DistSync
For ActivePerl (Windows):
ppm install App-DistSync
perl Makefile.PL
make
make test
sudo make install
Or using cpanminus:
cpanm App::DistSync
Missing modules will be installed automatically.
Before first use, initialize the directory that will act as the mirror root. Initialization creates a basic directory structure and several system files. These files are described in the next section.
To initialize, run:
distsync -D /var/www/foo.localhost init
The -D option specifies the resource directory where required system files will be created.
Next, configure your web server to serve this directory. Example Apache2 configuration:
<VirtualHost *:80>
ServerName foo.localhost
DocumentRoot /var/www/foo.localhost
<Directory /var/www/foo.localhost>
DirectoryIndex index.html
Options +Indexes +FollowSymLinks -Includes -MultiViews
AllowOverride All
Require all granted
</Directory>
</VirtualHost>
Note: the
-Includesoption must be disabled. SSI and similar post-processing filters interfere with App::DistSync, which relies on correctLast-Modifiedheaders for all files. Disable any filters that modify response bodies.
Once the server is running, the resource is ready. Add its own URL to the MIRRORS descriptor file, e.g.:
http://foo.localhost
If your mirror should reside in a subdirectory (e.g. /dist), initialize it there:
mkdir -p /var/www/foo.localhost/dist
distsync -D /var/www/foo.localhost/dist init
And provide this specific URL to other mirrors:
http://foo.localhost/dist
The resource directory is the root of the mirror and contains all files and descriptors. A single server may host multiple resources with different URLs.
Some descriptor files must not be edited manually — they are marked accordingly.
NOT EDITABLE. A YAML file containing metadata about the resource, including the last synchronization time. Also provides system information required by other mirrors.
NOT EDITABLE. Generated automatically on each run. Describes the current directory structure. After sync completion, it is regenerated and the mtime field in META is updated.
Format:
DIRNAME/FILENAME MTIME SIZE MTIME_AS_STRING
Paths are relative, with / used as the separator regardless of OS.
Editable descriptor defining which files must be excluded from MANIFEST. Supports Perl-compatible regular expressions in YAML format.
The following files are always ignored:
META
MANIFEST
MANIFEST.DEL
MANIFEST.SKIP
MANIFEST.LOCK
README
Format:
DIRNAME/FILENAME COMMENT
Editable descriptor listing files that must be deleted after a specified offset (DTIME). DTIME is relative to the modification time of MANIFEST.DEL itself (default: +3d).
The file is not synchronized, but remote mirrors download it and delete the listed files locally. After processing, the file is cleared and recreated.
Format:
DIRNAME/FILENAME DTIME
Editable descriptor listing the URLs of all mirrors. The current resource must be included in this list.
Format:
URL COMMENT
NOT EDITABLE. Contains the PID of the process performing synchronization. Prevents concurrent distsync runs across mirrors.
NOT EDITABLE. Temporary file holding downloaded data. May persist between runs.
Optional, local-only informational file. Not synchronized. If you need it to sync, use README.md instead.
After initialization, edit the MIRRORS file and add the new mirror’s URL. Then upload the updated file to any existing mirror so others can discover the new resource.
Run the first synchronization:
distsync -D /var/www/foo.localhost sync -d
-d enables progress output; -dv adds more verbose diagnostics.
-D DATADIR — Specifies the local resource directory where synchronized files are stored.
Help:
distsync --help
man distsync
Once mirrors can reach the newly created resource, it may be scheduled for automatic, periodic synchronization. The most common method is cron.
Example crontab entry:
37 * * * * /usr/bin/distsync -D /var/www/foo.localhost >/var/log/distsync.log 2>&1
All logs and errors are written to /var/log/distsync.log.
Copy files into the resource directory. They will be synchronized automatically.
Add filenames to MANIFEST.DEL. The system will handle deletion. You may remove files from disk manually afterwards.
Replace the file with a newer version. Synchronization will propagate the change.
⚠ Do NOT rename or move files or directories.
Missing items will be recreated during synchronization, leading to divergence or duplication across mirrors.