From: Raoul B. <ra...@bh...> - 2020-08-29 05:50:56
Hi,

another update from my side, for all who might be interested / future
generations ;-)

I set $Conf{PoolNightlyDigestCheckPercent} = 100 and ran
BackupPC_refCountUpdate -m .

I get 104 ERRORs with md5sum d41d8cd98f00b204e9800998ecf8427e (= empty
file) instead of the expected digest. All of these files have 0 length,
see https://paste.ubuntu.com/p/RRBSsXvMWV/

My observations until now:

A. The timestamps of these files fall on 1 of 3 dates, see
https://paste.ubuntu.com/p/FcY3gb5q8V/ (sorted via xargs ls -altr)

If I disregard the 2 files from 2017, these files first started to
appear when I upgraded rsync-bpc from 3.0.9.14 to 3.0.9.15 (I usually
initiate a test backup right after an upgrade to validate that
everything is functioning correctly)

> /var/log/dpkg.log: 2020-08-12 22:46:22 upgrade rsync-bpc:amd64 3.0.9.14
> 3.0.9.15

B. I verified that there are no other pool files with the same digest
(so no digest* / digest + extension).

C. At least some of these pool files are related to attrib files, i.e.

> BackupPC_attribPrint
> aaa.aaa.aaa/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b
> BackupPC_attribPrint: cannot read attrib file
> mail.bhatia.eu/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b
> (f1811600a7b6f6e3f2fcafea644e005b)

D. Looking at the source of rsync-bpc, I find:
https://github.com/backuppc/rsync-bpc/blob/3.0.9/backuppc/bpc_poolWrite.c#L348
and
https://github.com/backuppc/rsync-bpc/commit/3eb30d1b4fe844d0a1409f8a090c518e4d713f40#diff-11e5a98fabcaf3b4938189c693342209 .

Could this play a role here? (I have to admit that I am not (yet?) able
to wrap my head around the pooling, so please excuse any error in my
assumptions.)

--> 1. To me, this reads in a way that backuppc might (have?) decide(d)
to zero out a file / create an empty pool file?

2. Would it make sense to handle this specific case (empty pool file) in
BackupPC_refCountUpdate when checking the pool, i.e.
at https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L821

Any further help to debug this problem would be appreciated!

Raoul

PS. Cross-posting to backuppc-devel mailing list, just in case.

On 2020-08-27 21:58, Raoul Bhatia wrote:
> Hi Craig, Guillermo, all.
>
> A quick update from my side
>
> I have a (complete?) list of (potentially) broken files as reported by
> BTRFS read errors.
>
> I then validated the md5sum from the (c)pool with a one-liner:
>> for i in $(grep /cpool/ ~/broken_files.txt); do
>> M=$(/usr/share/backuppc/bin/BackupPC_zcat "$i" | md5sum | cut -d ' ' -f 1);
>> echo -n "$M: "; echo "$i" | grep --color "$M" || echo "$i ERR"; done
> --> No error, perhaps I am lucky? :-)
>
>
> This leaves only one file that might be damaged:
> backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 ?
>
> I am now experimenting with running
> /usr/share/backuppc/bin/BackupPC_refCountUpdate -m
> (Until now: No error; exit code 0)
>
>
>
> Reading the source I have the following questions:
>
> 1.
> https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L819
> does "rand(100) < $Conf{PoolNightlyDigestCheckPercent}"
>
> My understanding would be that I won't get a predictable 1% check on
> *each* run. Perhaps on average, with a sufficiently large pool, over
> a longer period of time, this might work out, but not for a few runs
> of BackupPC_refCountUpdate.
>
> --> Perhaps there would be another implementation that gets a more
> predictable result?
> (i.e. create an array of all the files to check, sort them, and then
> take the first $Conf{PoolNightlyDigestCheckPercent} of entries, but at
> least 1?)
>
>
>
> 2. I also seem to have accumulated checksum errors over the past years.
>
> --> How do I proceed with these files? If they still exist on the
> source, I'd like to re-sync them to the backup.
>
>
> 3. For backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 , I do not
> find a way to re-check only this one, old backup.
>
> --> Would it be a good idea to add a "-n num" flag to
> BackupPC_refCountUpdate to be able to operate on a particular backup?
>
>
>
> Thanks for your guidance,
> Raoul
>
> On 2020-08-24 20:39, Raoul Bhatia wrote:
>> Hi Craig and Guillermo,
>>
>> On 2020-08-23 19:35, Craig Barratt via BackupPC-users wrote:
>>
>>> $Conf{PoolNightlyDigestCheckPercent} is in percent, so you should set
>>> it to 100 to check all the pool files' MD5 digests against their file
>>> names.
>>>
>>> As Guillermo mentions, to check the pool MD5 digests, you can
>>> temporarily set $Conf{PoolNightlyDigestCheckPercent} to 100 and
>>> $Conf{PoolSizeNightlyUpdatePeriod} to 1.
>>
>> When reading the documentation, I also came across these options.
>> However, I didn't dare to run backuppc / BackupPC_nightly, because
>> from the documentation:
>>
>> Overnight, when BackupPC_nightly next runs,
>> all the unused pool files will be deleted and
>> this will recover the disk space used by the client's backups.
>>
>> I didn't want to end up with an empty pool...
>>
>>> If you stop BackupPC, to check all the pool digests, run:
>>>
>>>> BackupPC_refCountUpdate -m
>>> If you want to also regenerate all the host reference counts (which
>>> will take a long time), you could run:
>>>
>>>> BackupPC_refCountUpdate -m -F
>>
>> Meanwhile, with the kind help of the btrfs community, I figured out a
>> way to get the damaged files. This process is not finished yet;
>> however, I have a first list:
>>
>> /mnt/backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16
>> /mnt/backuppc/cpool/5c/c8/5cc9373a32e06baaa308a7b341db5ac9
>> /mnt/backuppc/cpool/b2/62/b3629c46481cb038682aea248c45b89f
>> /mnt/backuppc/cpool/20/c6/21c61013d40e644af734e28459df0a1a
>> /mnt/backuppc/cpool/8a/f8/8af935bc53f7199ed75a5695bfd57f26
>>
>> FYI: The most recent backup from host abc.def.ghi is 2291, so 1989 is
>> quite far in the past.
>>
>> How should I proceed when I have a list of broken files?
>> Move them out from cpool and hope they will be re-synced by the next
>> (full?) backup?
>>
>> Thanks,
>> Raoul
>>
>>> Craig
>>>
>>> On Sun, Aug 23, 2020 at 6:45 AM Guillermo Rozas
>>> <gui...@gm...> wrote:
>>>
>>> Hi Raoul,
>>>
>>> are you using BackupPC v4? If yes, you can use a modification of the
>>> script I posted here:
>>> https://sourceforge.net/p/backuppc/mailman/message/37032497/
>>>
>>> In the latest version (4.4.0) you also have the config option
>>> $Conf{PoolNightlyDigestCheckPercent}, which checks the md5 digest of
>>> this fraction of the pool files each night. You can probably set it
>>> to 1 and wait a night for it to run.
>>>
>>> Regards,
>>> Guillermo
>>>
>>> On Sun, Aug 23, 2020 at 5:38 AM Raoul Bhatia <ra...@bh...> wrote:
>>> Hi,
>>>
>>> related to my previous email, it seems that the cause of my issues
>>> was a
>>> file system corruption after a "power cut".
>>>
>>> I managed to recover (most of?) the data and would now like to do a
>>> thorough check of the data.
>>>
>>> Is there any way to "fully verify" the integrity of my backuppc
>>> installation, ideally in a nondestructive way ;-)
>>>
>>> Thanks,
>>> Raoul
>>>
>>> PS. My backuppc process is stopped.
>>> --
>>> DI (FH) Raoul Bhatia MSc
>>> E-Mail. ra...@bh...
>>> Tel. +43 699 10132530
>
>
>
> _______________________________________________
> BackupPC-users mailing list
> Bac...@li...
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: https://github.com/backuppc/backuppc/wiki
> Project: https://backuppc.github.io/backuppc/

--
DI (FH) Raoul Bhatia MSc
E-Mail. ra...@bh...
Tel. +43 699 10132530
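
Note on the zero-length pool files described at the top of this message: the check can be approximated without a full digest run. The following is only a rough sketch, under the assumption that a zero-length file is legitimate only when its name starts with the empty-file digest d41d8cd98f00b204e9800998ecf8427e. The function name find_suspect_empty_pool_files is made up for illustration and is not part of BackupPC. It uses nothing but standard find and sort.

```shell
#!/bin/sh
# MD5 digest of an empty file, as quoted in the message above.
EMPTY_MD5=d41d8cd98f00b204e9800998ecf8427e

# Hypothetical helper (not part of BackupPC): print every zero-length
# regular file below the directory given as $1 whose name does NOT start
# with the empty-file digest. The trailing "*" in the name pattern also
# skips "digest + extension" names.
find_suspect_empty_pool_files() {
    find "$1" -type f -size 0 ! -name "${EMPTY_MD5}*" | sort
}
```

It could be called as find_suspect_empty_pool_files /mnt/backuppc/cpool (the path used earlier in this thread). It only reads the directory tree and changes nothing. Any other zero-length file below that directory is listed as well, so the output still needs a manual look before anything is moved or deleted.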