File | Date | Author | Commit |
---|---|---|---|
CHANGELOG | 2019-02-17 | vincent.delft | [60a06c] for version 0.4: |
INSTALL | 2019-02-18 | vincent.delft | [49b405] add the INSTALL file |
LICENSE | 2018-10-19 | vincent.delft | [e1044d] Add annexe files |
Makefile | 2019-02-18 | vincent.delft | [50b933] improve Makefile |
README.md | 2018-10-19 | vincent.delft | [e3ac34] README improvement |
TODO | 2019-02-17 | vincent.delft | [60a06c] for version 0.4: |
yabitrot | 2019-06-08 | vincent.delft | [c08721] change some comments |
yabitrot.1 | 2019-02-17 | vincent.delft | [60a06c] for version 0.4: |
Yet another bitrot application: yabitrot
Let me first remind you what such an application does.
As explained very well on Solene's blog, a bitrot tool detects bit rot: the silent flipping of a bit on your storage device. This could be a hard disk, an SSD, a USB stick or a CD.
Indeed, over time some bits lose their value and become "rotted". This can damage your files if you do not take care of it.
So, in short, a bitrot application checks that the checksum of each unmodified file still equals the value recorded earlier. If it does not, the file is in bad shape and a restore from backup is more than welcome.
I'm using a NAS server (running OpenBSD) to store the important data of several users (including me). It has a 1TB disk which is backed up regularly to a second, external disk. Indeed, I do not want PROD and backup located in the same place: both could be damaged by fire, water, ... and I would lose all my data.
This NAS relies heavily on the "time machine" concept and creates lots of hard links between the "versions" of the local backups (details here)
I'm not using a smart application to back up my data from PROD to backup. I'm just doing a disk copy with the command "pax -rw". So any bad bits on my PROD machine will be copied to the backup, with very bad consequences. I therefore need a bitrot scanner that can confirm that all my files are in good shape before I copy them to the backup disk.
Indeed, several bitrot apps can be found on the internet. For example: bitrot in python, bitrot in javascript, bitrot in go, bitrot scanner in go.
In total I've found more than 10 bitrot applications. Unfortunately, none of them covers the elements that are important for me:
So I wrote yabitrot, which takes hard links into account. After evaluation, I decided to use Adler as the checksum: it is one of the fastest checksums I've tested.
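To give an idea of what an Adler checksum of a file looks like in practice, here is a minimal sketch using `zlib.adler32` from the Python standard library. The function name `adler32_of_file` and the chunk size are my own choices for illustration; this is not taken from the yabitrot source.

```python
import zlib


def adler32_of_file(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file, reading it chunk by chunk.

    Hypothetical helper for illustration only; reading in chunks keeps
    memory usage flat even for very large files.
    """
    checksum = 1  # Adler-32 starts at 1, which is also zlib's default
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in to continue the checksum.
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF
```

Because the running value is passed back into each call, the result is the same as checksumming the whole file in one go.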
In short, yabitrot is based on the inode of each file. For each inode, yabitrot computes the checksum and stores the timestamp at which this calculation was made.
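The benefit of working per inode is that a file with many hard-linked names is read only once. The sketch below shows the idea with `os.lstat` and a plain dictionary; the function `index_by_inode` and the layout of each entry are assumptions made for this example, not the actual yabitrot data model.

```python
import os
import stat
import time
import zlib


def index_by_inode(root):
    """Map each inode under root to its checksum, check time and names.

    Hypothetical sketch: the content is checksummed once per inode, and
    every further hard link to it is only recorded as an extra name.
    """
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, devices, ...
            entry = index.get(info.st_ino)
            if entry is not None:
                entry["names"].append(path)  # another name, same content
                continue
            with open(path, "rb") as handle:
                # Whole-file read keeps the sketch short; a real scanner
                # would read in chunks.
                checksum = zlib.adler32(handle.read())
            index[info.st_ino] = {
                "checksum": checksum,
                "checked_at": time.time(),
                "names": [path],
            }
    return index
```

With about 9 names per inode, as on my NAS, this kind of grouping avoids reading the same data again and again.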
yabitrot takes few parameters. One of them is the path of the folder you want to scan.
Since yabitrot remembers the inode of each file, and inode numbers are only unique within one filesystem, you cannot scan files spread across different filesystems in a single pass.
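One way a scanner can stay inside a single filesystem is to compare the device number (`st_dev`) of every entry with the one of the starting folder. This is only a sketch of that technique; `walk_one_filesystem` is a hypothetical name, and I am not claiming yabitrot does it this way.

```python
import os


def walk_one_filesystem(root):
    """Yield file paths under root that live on the same device as root.

    Hypothetical sketch: an inode number is only meaningful together
    with its device, so anything on another device is skipped.
    """
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune sub-directories that are mount points of other devices,
        # so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_dev == root_dev:
                yield path
```

To cover several filesystems, the simple answer remains one scan (and one DB) per filesystem.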
The first time you run yabitrot, it scans all files, computes their checksums and stores them in its database (a file called .cksum.db located at the root of your folder).
During subsequent runs, yabitrot compares the checksum of each unmodified file with the previous one (stored in the DB) to make sure they are equal. For new or modified files, it just computes the checksum and stores it. It also removes from the DB all entries for files that are no longer present in the folder.
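The decision logic of such a run can be sketched as follows. I assume here that "unmodified" means "same modification time as in the previous run" and I use plain dictionaries instead of a real database; the function `reconcile` and the report keys are invented for the example.

```python
def reconcile(db, current):
    """Compare the previous run (db) with the current scan.

    Both arguments map an inode number to a (mtime, checksum) tuple.
    db is updated in place; the returned report lists the inodes that
    were removed, added, updated, or found in error (possible bitrot).
    Hypothetical sketch, not the yabitrot implementation.
    """
    report = {"removed": [], "added": [], "updated": [], "errors": []}

    # Entries in the DB whose file no longer exists are cleaned up.
    for inode in list(db):
        if inode not in current:
            del db[inode]
            report["removed"].append(inode)

    for inode, (mtime, checksum) in current.items():
        previous = db.get(inode)
        if previous is None:
            db[inode] = (mtime, checksum)  # new file: just record it
            report["added"].append(inode)
        elif previous[0] != mtime:
            db[inode] = (mtime, checksum)  # modified: refresh the record
            report["updated"].append(inode)
        elif previous[1] != checksum:
            # Same mtime but different content: nobody wrote to the
            # file, yet its bytes changed. Keep the old record.
            report["errors"].append(inode)
    return report
```

A file reported in `errors` is the one to restore from backup; the old checksum is kept so that the problem shows up again on the next run until it is fixed.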
On OpenBSD, I suggest you trigger yabitrot via the weekly.local or the monthly.local file. You will thus receive an email listing the files that have bitrot issues.
You do not need to be root to run it, but you have to run it as a user with enough permissions to read all targeted files and to write the DB at the root of your targeted folder.
You can find this code on Sourceforge: here
You can download it here
I have been running this script for several weeks now without much trouble.
I can scan my NAS disk of 720GB in 2h40 on my small board with 2 CPUs at 3.3GHz and 4GB of RAM. This disk holds 6.8 million files but only 760,000 inodes (so each inode has about 9 different names on average).
Here are the log details after the first scan:
No cleanup required
760238 files added
0 files updates
0 files error
6092222 files analysed in 8612.82 sec, 706.492 GB
760238 entries in the DB
We see that such a DB takes 29MB!
obsd-nas:~#ls -alh /mnt/sd1
total 60004
drwxrwxrwx 7 root wheel 512B Oct 13 20:37 .
drwxr-xr-x 4 root wheel 512B Mar 24 2018 ..
-rw------- 1 root wheel 29.1M Oct 13 20:37 .cksum.db
Here are the log details after the next scan:
Wed Oct 17 07:54:04 2018: 1025 files removed from DB
Wed Oct 17 07:54:04 2018: 1302 files added
Wed Oct 17 07:54:04 2018: 205 files updates
Wed Oct 17 07:54:04 2018: 0 files error
Wed Oct 17 07:54:04 2018: 6094853 files analysed in 8643.72 sec, 706.798 GB
Wed Oct 17 07:54:04 2018: 760515 entries in the DB
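For those who like numbers, here is a small calculation based on the figures of the first-scan log and the `ls` output shown earlier. The units are assumptions on my side: I take 1 GB as 1000 MB and the "M" printed by `ls -h` as MiB.

```python
# Figures copied from the first-scan log and the ls output.
files_analysed = 6092222
inodes_in_db = 760238
elapsed_sec = 8612.82
scanned_gb = 706.492
db_size_mib = 29.1  # assumption: "29.1M" from ls -h means MiB

hours, rest = divmod(elapsed_sec, 3600)
minutes = rest // 60

# Average read speed over the whole scan (assumption: 1 GB = 1000 MB).
throughput_mb_s = scanned_gb * 1000 / elapsed_sec

# Average number of names (hard links) per inode in this scan.
names_per_inode = files_analysed / inodes_in_db

# Average space used in the DB file for one inode entry.
bytes_per_entry = db_size_mib * 1024 * 1024 / inodes_in_db

print(f"duration:        {int(hours)}h{int(minutes):02d}")
print(f"throughput:      {throughput_mb_s:.1f} MB/s")
print(f"names per inode: {names_per_inode:.2f}")
print(f"DB bytes/entry:  {bytes_per_entry:.1f}")
```

The same calculation works for the second log by swapping in its values.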