Robinhood - Policy Engine and monitoring tool for large file systems
====================================================================
== Changes in robinhood version 2.5.2 ==
- rbh-du: fixed major performance regression (since v2.5.0).
- rbh-find: fixed occasional crash.
- HSM and backup modes: fixed a risk of removing an existing entry from the backend (in some situations of hardlink/rename+unlink).
- backup mode: optimized sendfile()-based copy (Linux kernel >= 2.6.33).
- logs: avoid flood of log messages in case of DB connection error.
- alerts: added host name to alert mail title.
- rbh-config empty_db/repair_db: also manage/fix stored procedures.
- cosmetic: fix wrong display of purged blocks for count-based triggers.
- cosmetic: fix migration counter display.
- init script: check that 'ulimit -s' is reasonable.
- fixed build dependencies on Fedora19 and Fedora20.
- code sanity: fixed many 'coverity' warnings + a couple of minor memleaks.
- doc: details about RPM installation locations.
- doc: details of 'backend' parameters for backup mode.
RPMs available in the download section:
=======================================
Common RPM (noarch), for all robinhood modes:
(contains an admin helper that can be useful on the database host and the Lustre MDS host)
rpms/robinhood-adm-2.5.2-1.noarch.x86_64.rpm
[TMPFS] Monitoring and space cleaning for scratch filesystem
(RPM must be installed on a filesystem client)
Lustre 1.8 / RHEL5: rpms/lustre1.8/robinhood-tmpfs-2.5.2-1.lustre1.8.el5.x86_64.rpm
Lustre 2.1 / RHEL6: rpms/lustre2.1/robinhood-tmpfs-2.5.2-1.lustre2.1.el6.x86_64.rpm
Lustre 2.2 / RHEL6: rpms/lustre2.2/robinhood-tmpfs-2.5.2-1.lustre2.2.el6.x86_64.rpm
Lustre 2.3 / RHEL6: rpms/lustre2.3/robinhood-tmpfs-2.5.2-1.lustre2.3.el6.x86_64.rpm
Lustre 2.4 / RHEL6: rpms/lustre2.4/robinhood-tmpfs-2.5.2-1.lustre2.4.el6.x86_64.rpm
Lustre 2.5 / RHEL6: rpms/lustre2.5/robinhood-tmpfs-2.5.2-1.lustre2.5.el6.x86_64.rpm
Other POSIX FS / RHEL5: rpms/posix_fs/robinhood-tmpfs-2.5.2-1.el5.x86_64.rpm
Other POSIX FS / RHEL6: rpms/posix_fs/robinhood-tmpfs-2.5.2-1.el6.x86_64.rpm
Other POSIX FS / Fedora19: rpms/posix_fs/robinhood-tmpfs-2.5.2-1.fc19.x86_64.rpm
Other POSIX FS / Fedora20: rpms/posix_fs/robinhood-tmpfs-2.5.2-1.fc20.x86_64.rpm
For other configurations, build from source tarball: robinhood-2.5.2.tar.gz
[BACKUP] Monitoring and file archiving
(RPM must be installed on a Lustre client >= 2.1)
Lustre 2.1 / RHEL6: rpms/lustre2.1/robinhood-backup-2.5.2-1.lustre2.1.el6.x86_64.rpm
Lustre 2.2 / RHEL6: rpms/lustre2.2/robinhood-backup-2.5.2-1.lustre2.2.el6.x86_64.rpm
Lustre 2.3 / RHEL6: rpms/lustre2.3/robinhood-backup-2.5.2-1.lustre2.3.el6.x86_64.rpm
Lustre 2.4 / RHEL6: rpms/lustre2.4/robinhood-backup-2.5.2-1.lustre2.4.el6.x86_64.rpm
Lustre 2.5 / RHEL6: rpms/lustre2.5/robinhood-backup-2.5.2-1.lustre2.5.el6.x86_64.rpm
For other configurations, build from source tarball: robinhood-2.5.2.tar.gz
[LHSM] Policy Engine for Lustre/HSM
(RPM must be installed on a Lustre client >= 2.5)
Lustre 2.5 / RHEL6: rpms/lustre2.5/robinhood-lhsm-2.5.2-1.lustre2.5.el6.x86_64.rpm
For other configurations, build from source tarball: robinhood-2.5.2.tar.gz
[WEBGUI] Stats interface (unchanged since v2.5.0)
(to be installed on an HTTP server that can connect to the robinhood DB)
webui/robinhood-webgui-2.5.0.tar.gz
webui/robinhood-webgui-2.5.0-1.noarch.x86_64.rpm
[Documentation]
doc/robinhood-tmpfs-252_admin_guide.pdf
doc/robinhood-tmpfs-252_tutorial.pdf
doc/robinhood-backup-252_tutorial.pdf
doc/rbh25-disaster_recovery.pdf
== Previous versions ==
=== Changes in robinhood 2.5.1 ===
- entry processing (major fix): fixed deadlock when the pipeline is full
and an entry with an unknown parent is encountered.
- purge (enhancement): start purging data from the most used OSTs.
- rbh-find (features): new options: -pool, -exec, -print, -nouser, -nogroup, -lsost
- rbh-find (optimization): automatically switch to bulk DB request mode when
command argument is filesystem root (+new option -nobulk to disable it).
- logging (enhancement): new config parameters to control log header format
- backup (feature): allow compressing data in archive.
- backup (fix): wrong path in archive when robinhood root directory != mount point.
- backup (fix): fix segfault when importing a single file with a FID-ending name.
=== What's new in robinhood 2.5? ===
rbh-diff:
* new command to detect differences between the filesystem and the information
in robinhood database.
* option "--apply=fs" for disaster recovery purposes: restore the filesystem
metadata from robinhood DB.
* makes it possible to rebuild a Lustre MDT from scratch, or from an LVM snapshot
(see "Robinhood Lustre disaster recovery guide" for more details).
database:
* new namespace implementation in database with new NAMES table (Cray contribution)
- fixes/improves hardlink support
- fixes/improves Lustre ChangeLog hardlink/rename/unlink support
- saves DB storage space
* database request batching: significantly increase database ingest rate.
No longer needs innodb_flush_log_at_trx_commit != 1 to speed up DB operations.
* additional information in DB that can help for disaster recovery:
symlink info, access rights, stripe object indexes, stripe order, nlink...
* set default commit behavior to transaction (prevents DB inconsistencies)
* optimized multi-table requests
* optimization: minimized attribute set in DB update operations
(don't update attributes that didn't change)
* fix: deal with MySQL case insensitivity for string matching
* triggers and stored procedures versioning mechanism
* prevent overflows for large INSERT requests, wide stripes...
* prevent DB deadlocks
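As an illustration of the request batching listed above, the following sketch groups inserts into a few large transactions instead of committing each row. It is not robinhood's implementation: it uses Python's sqlite3 module rather than MySQL, and the 'entries' table, the insert_batched function and the batch_size parameter are hypothetical names chosen for the example.

```python
import sqlite3


def insert_batched(conn, rows, batch_size=1000):
    """Insert (id, size) rows using one transaction per batch.

    'entries' is a hypothetical table used only for this sketch.
    Returns the number of commits that were issued.
    """
    cur = conn.cursor()
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One multi-row request, then a single commit for the whole batch.
        cur.executemany("INSERT INTO entries (id, size) VALUES (?, ?)", batch)
        conn.commit()
        commits += 1
    return commits
```

With 2500 rows and the default batch size, this issues 3 commits instead of 2500. Fewer commits mean fewer log flushes, which is presumably why relaxing the InnoDB flush setting is no longer needed.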
scanning:
* --partial-scan option is deprecated and replaced by an optional argument to --scan (e.g. --scan=/fs/subdir).
* better management of partial scans:
- better detection of removed entries vs. entries moved from one directory to another.
- partial scans can be used for initial DB population (even if the DB is initially empty).
* dealing with dead/deactivated OSTs: don't remove entries from DB if stat() returns ESHUTDOWN.
* garbage collection of removed entries in DB is a long operation when terminating a scan (and even more so
when terminating a partial scan). Added --no-gc option to skip it (recommended for partial scans).
* --no-gc is automatically enabled if the DB is initially empty (e.g. for initial scan).
* optimization: use *at() functions (openat, fstatat) and readdir by chunk (using getdents) instead of POSIX lstat() and readdir_r().
* optimization: use NOATIME flag to access entries as much as possible
* optimizations of get_stripe and get_fid operations.
* new --diff option for robinhood --scan and --readlog: output detected changes in a diff-like format.
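The scanning optimizations above (directory-relative *at() calls and the NOATIME flag) can be sketched as follows. This is a Python illustration under stated assumptions, not robinhood's C code: open_dir_noatime and scan are hypothetical helper names, os.stat(..., dir_fd=...) stands in for fstatat(), and the fallback when O_NOATIME is refused is a choice made for this example.

```python
import os
import stat


def open_dir_noatime(path, dir_fd=None):
    """Open a directory, preferring O_NOATIME and falling back without it."""
    flags = os.O_RDONLY | os.O_DIRECTORY
    noatime = getattr(os, "O_NOATIME", 0)  # Linux-only flag, 0 elsewhere
    try:
        return os.open(path, flags | noatime, dir_fd=dir_fd)
    except PermissionError:
        # The kernel refuses O_NOATIME when the caller does not own the entry.
        return os.open(path, flags, dir_fd=dir_fd)


def scan(dfd, prefix=""):
    """Yield (relative_path, stat_result) for every entry below directory fd dfd."""
    for name in os.listdir(dfd):
        try:
            # Equivalent of fstatat(dfd, name, &st, AT_SYMLINK_NOFOLLOW).
            st = os.stat(name, dir_fd=dfd, follow_symlinks=False)
        except FileNotFoundError:
            continue  # entry removed while scanning
        rel = os.path.join(prefix, name)
        yield rel, st
        if stat.S_ISDIR(st.st_mode):
            child = open_dir_noatime(name, dir_fd=dfd)  # openat() relative to dfd
            try:
                yield from scan(child, rel)
            finally:
                os.close(child)
```

Resolving each name relative to an open directory descriptor avoids walking the full path again for every entry, which is the point of using the *at() family.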
Lustre changelogs:
* changelog batching (Cray contribution): to speed up changelog processing,
robinhood retains changelog records in memory for a short time,
to aggregate similar/redundant changelog records on the same entry before
updating its database.
* support multiple changelog readers (for DNE) as multiple threads (default)
or as multiple processes, possibly on different hosts, by giving a MDT index
to --readlog option.
* resilience to filesystem umount/mount.
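The idea behind changelog batching can be sketched as below: records are held for a short time, records on the same entry are merged, and only one update per entry is released. This is an illustrative model, not the Cray implementation; the ChangelogBatcher class, its hold_seconds parameter and the record type strings used in the usage lines are assumptions made for the example.

```python
from collections import OrderedDict


class ChangelogBatcher:
    """Hold records briefly and merge those that target the same entry."""

    def __init__(self, hold_seconds=5.0):
        self.hold_seconds = hold_seconds
        # fid -> pending state; insertion order is first-seen order.
        self.pending = OrderedDict()

    def add(self, rec_id, fid, rec_type, now):
        """Register one record; 'now' is a monotonic timestamp in seconds."""
        entry = self.pending.get(fid)
        if entry is None:
            self.pending[fid] = {"first_seen": now, "types": {rec_type}, "last_id": rec_id}
        else:
            # Redundant record on the same entry: merge instead of queueing.
            entry["types"].add(rec_type)
            entry["last_id"] = rec_id

    def flush(self, now):
        """Return (fid, merged types, last record id) for entries held long enough."""
        ready = []
        while self.pending:
            fid, entry = next(iter(self.pending.items()))
            if now - entry["first_seen"] < self.hold_seconds:
                break  # later entries are younger still
            del self.pending[fid]
            ready.append((fid, sorted(entry["types"]), entry["last_id"]))
        return ready


batcher = ChangelogBatcher(hold_seconds=2.0)
batcher.add(1, "fidA", "CREAT", now=0.0)
batcher.add(2, "fidA", "MTIME", now=0.5)
batcher.add(3, "fidA", "MTIME", now=1.0)
updates = batcher.flush(now=2.0)  # one update for fidA instead of three
```

The trade-off is a small delay before the database reflects a change, in exchange for fewer database updates per entry.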
rbh-report:
* new option --entry-info to get all the stored information about an entry
* option --dump-ost can now list multiple OSTs and supports range notation (e.g. 3,5-8,12-23).
* --dump-ost output indicates if a file has data on a given OST (could be striped on the OST but have no data on it).
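The range notation accepted by --dump-ost (e.g. 3,5-8,12-23) can be read as a comma-separated list of single indexes and inclusive ranges. The parser below is a minimal sketch of that reading; parse_ost_set is a hypothetical name, and how robinhood itself handles duplicates, whitespace or reversed ranges is not described here.

```python
def parse_ost_set(spec):
    """Expand a spec such as '3,5-8,12-23' into a sorted list of OST indexes."""
    indexes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty element in OST set: {spec!r}")
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"reversed range: {part!r}")
            indexes.update(range(first, last + 1))  # ranges are inclusive
        else:
            indexes.add(int(part))
    return sorted(indexes)


print(parse_ost_set("3,5-8"))  # [3, 5, 6, 7, 8]
```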
rbh-find:
* new option -crtime to filter entries on creation time.
* output ordering closer to find output
* added missing info in 'rbh-find -ls' output (nlink, mode, symlink info...)
robinhood-backup:
* by default, use a built-in copy function to avoid the cost of forking copy commands.
* rbh-backup-rebind: tool to rebind an entry in the backend if its FID changed in the filesystem
for any reason (file copied to a new one to change its stripe, etc.)
* rbh-backup-recov new features and options:
--list (list information about entries to be recovered)
--ost <ost_set> to only recover entries for a given set of OSTs (supports range notation):
the basic use case is OST disaster recovery.
--since <time> to only recover entries modified since a given date:
the basic use case is after restoring an OST snapshot.
* symlinks archiving to backend made optional (new parameter 'archive_symlinks')
as they can now be restored using robinhood database information.
* new parameter: sync_archive_data: force sync'ing data to disk to
finalize the copy.
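A built-in copy that avoids forking a copy command, combined with an optional final sync as sync_archive_data suggests, might look like the sketch below. It is written in Python for illustration and is not the robinhood code: copy_file, sync_data and chunk are hypothetical names. It relies on os.sendfile() with a regular file as destination, which Linux supports from kernel 2.6.33 (the version quoted in the 2.5.2 changes).

```python
import os


def copy_file(src, dst, sync_data=False, chunk=1 << 20):
    """Copy src to dst in-kernel with sendfile(); return the bytes copied."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        remaining = os.fstat(fin.fileno()).st_size
        offset = 0
        while remaining > 0:
            # Data moves between the two descriptors without a user-space buffer.
            sent = os.sendfile(fout.fileno(), fin.fileno(), offset, min(chunk, remaining))
            if sent == 0:
                break  # source was truncated during the copy
            offset += sent
            remaining -= sent
        if sync_data:
            # Force the data to stable storage before reporting success.
            os.fsync(fout.fileno())
    return offset
```

Syncing costs time on every copy, so it is left off by default in this sketch and enabled only when the caller needs the archive copy to be durable before continuing.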
configuration:
* can specify environment variables in config file (e.g. fs_path = $ROOT_DIR ;)
* prevent using a wrong config file (Cray contribution):
- only check files in /etc/robinhood.d/<purpose>, no longer in the current directory
- fails if too many config files are available.
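Environment variable references in a config line, as in the fs_path = $ROOT_DIR example above, amount to a textual substitution before the value is used. The sketch below models that in Python; expand_env is a hypothetical helper, and raising an error for an undefined variable is an assumption of this example, not documented robinhood behavior.

```python
import os
import re

# A variable reference is '$' followed by an identifier.
_VAR = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def expand_env(line, env=None):
    """Replace each $NAME in a config line with its value from env."""
    env = os.environ if env is None else env

    def repl(match):
        name = match.group(1)
        if name not in env:
            # Assumption: treat an undefined variable as a configuration error.
            raise KeyError(f"undefined environment variable: {name}")
        return env[name]

    return _VAR.sub(repl, line)


print(expand_env("fs_path = $ROOT_DIR ;", {"ROOT_DIR": "/mnt/lustre"}))
# fs_path = /mnt/lustre ;
```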
documentation:
* added man pages for robinhood daemon, rbh-report, rbh-find