From: Raoul B. <ra...@bh...> - 2020-08-29 05:50:56
Hi,
another update from my side, for all who might be interested / future
generations ;-)
I set $Conf{PoolNightlyDigestCheckPercent} = 100
and ran BackupPC_refCountUpdate -m .
I get 104 ERRORs with md5sum d41d8cd98f00b204e9800998ecf8427e (= empty
file) instead of the expected digest. All of these files have 0 length,
see https://paste.ubuntu.com/p/RRBSsXvMWV/
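For anyone who wants to reproduce this kind of check without reading the refCountUpdate log, here is a minimal sketch in Python (not BackupPC code). It confirms that d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input and lists zero-length files whose name starts with a different digest. The function name zero_length_pool_files and the idea of simply walking the pool directory are my assumptions for illustration; it also assumes the first 32 characters of a pool file name are its digest.

```python
import hashlib
import os

# MD5 of zero bytes of input; this is the digest reported in the errors above.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def zero_length_pool_files(pool_dir):
    """Return sorted paths of zero-length files not named after the empty digest.

    Hypothetical helper: assumes the first 32 characters of each file name
    are the expected MD5 digest of the file's uncompressed content.
    """
    suspects = []
    for root, _dirs, files in os.walk(pool_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) == 0 and name[:32] != EMPTY_MD5:
                suspects.append(path)
    return sorted(suspects)
```

Calling zero_length_pool_files("/mnt/backuppc/cpool") would then return the suspicious files; the path is only an example taken from this thread, and the scan is read-only.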
My observations until now:
A. The timestamp of these files falls into 1 of 3 dates, see
https://paste.ubuntu.com/p/FcY3gb5q8V/ (sorted via xargs ls -altr)
If I disregard the 2 files from 2017, these files first started to
appear when I upgraded rsync-bpc from 3.0.9.14 to 3.0.9.15
(I usually initiate a test backup right after an upgrade to validate
everything is functioning correctly)
> /var/log/dpkg.log: 2020-08-12 22:46:22 upgrade rsync-bpc:amd64 3.0.9.14
> 3.0.9.15
B. I verified that there are no other pool files with the same digest
(so no digest* / digest + extension).
C. At least some of these pool files are related to attrib files, i.e.
> BackupPC_attribPrint
> aaa.aaa.aaa/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b
> BackupPC_attribPrint: cannot read attrib file
> mail.bhatia.eu/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b
> (f1811600a7b6f6e3f2fcafea644e005b)
D. Looking at the source of rsync-bpc, I find:
https://github.com/backuppc/rsync-bpc/blob/3.0.9/backuppc/bpc_poolWrite.c#L348
and
https://github.com/backuppc/rsync-bpc/commit/3eb30d1b4fe844d0a1409f8a090c518e4d713f40#diff-11e5a98fabcaf3b4938189c693342209
. Could this play a role here?
(I have to admit that I am not (yet?) able to wrap my head around the
pooling, so please excuse any error in my assumptions.)
-->
1. To me, this reads in a way that backuppc might (have?) decide(d) to
zero out a file / create an empty pool file?
2. Would it make sense to handle this specific case (empty pool file) in
BackupPC_refCountUpdate when checking the pool, i.e. at
https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L821
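To make suggestion 2 concrete, here is a rough sketch in Python of how a digest check could report the empty-file case separately from a general mismatch. This is not the Perl code in BackupPC_refCountUpdate; the function classify_pool_file and its return values are hypothetical, and it assumes the caller has already decompressed the pool file's content.

```python
import hashlib


def classify_pool_file(expected_digest, content):
    """Classify one pool file as 'ok', 'empty', or 'mismatch'.

    expected_digest: hex MD5 taken from the pool file name (assumed input).
    content: the file's uncompressed bytes (assumed already decompressed).
    """
    actual = hashlib.md5(content).hexdigest()
    if actual == expected_digest:
        return "ok"
    if len(content) == 0:
        # Zero-length content under a non-empty digest: the case seen here.
        return "empty"
    return "mismatch"
```

Reporting "empty" on its own would make it easier to tell truncated or zeroed pool files apart from ordinary bit rot.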
Any further help to debug this problem would be appreciated!
Raoul
PS. Cross-posting to backuppc-devel mailing list, just in case.
On 2020-08-27 21:58, Raoul Bhatia wrote:
> Hi Craig, Guillermo, all.
>
> A quick update from my side
>
> I have a (complete?) list of (potentially) broken files as reported by
> BTRFS read errors.
>
> I then validated the md5sum from the (c)pool with a one-liner:
>> for i in $(grep /cpool/ ~/broken_files.txt); do
>> M=$(/usr/share/backuppc/bin/BackupPC_zcat "$i" | md5sum | cut -d ' ' -f 1);
>> echo -n "$M: "; echo "$i" | grep --color "$M" || echo "$i ERR"; done
> --> No error, perhaps I am lucky? :-)
>
>
> This leaves only one file that might be damaged:
> backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 ?
>
> I am now experimenting with running
> /usr/share/backuppc/bin/BackupPC_refCountUpdate -m
> (Until now: No error; exit code 0)
>
>
>
> Reading the source I have the following questions:
>
> 1.
> https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L819
> does "rand(100) < $Conf{PoolNightlyDigestCheckPercent}"
>
> My understanding would be that I won't get a predictable 1% check on
> *each* run. Perhaps on average, with a sufficiently large pool, over
> a longer period of time, this might work out, but not over just a few
> runs of BackupPC_refCountUpdate.
>
> --> Perhaps there would be another implementation that gets a more
> predictable result?
> (i.e. create an array of all the files to check, sort them, and then
> take the first $Conf{PoolNightlyDigestCheckPercent} of entries, but at
> least 1?)
>
>
>
> 2. I also seem to have accumulated checksum errors over the past years.
>
> --> How do I proceed with these files? If they still exist on the
> source, I'd like to re-sync them to the backup.
>
>
> 3. For backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 , I do not
> find a way to re-check only this one, old backup.
>
> --> Would it be a good idea to add a "-n num" flag to
> BackupPC_refCountUpdate to be able to operate on a particular backup?
>
>
>
> Thanks for your guidance,
> Raoul
>
> On 2020-08-24 20:39, Raoul Bhatia wrote:
>> Hi Craig and Guillermo,
>>
>> On 2020-08-23 19:35, Craig Barratt via BackupPC-users wrote:
>>
>>> $Conf{PoolNightlyDigestCheckPercent} is in percent, so you should set
>>> it to 100 to check all the pool files' MD5 digests against their file
>>> names.
>>>
>>> As Guillermo mentions, to check the pool MD5 digests, you can
>>> temporarily set $Conf{PoolNightlyDigestCheckPercent} to 100 and
>>> $Conf{PoolSizeNightlyUpdatePeriod} to 1.
>>
>> When reading the documentation, I also came across these options.
>> However, I didn't dare to run backuppc / BackupPC_nightly, because
>> from the documentation:
>>
>> Overnight, when BackupPC_nightly next runs,
>> all the unused pool files will be deleted and
>> this will recover the disk space used by the client's backups.
>>
>> I didn't want to end up with an empty pool...
>>
>>> If you stop BackupPC, to check all the pool digests, run:
>>>
>>>> BackupPC_refCountUpdate -m
>>> If you want to also regenerate all the host reference counts (which
>>> will take a long time), you could run:
>>>
>>>> BackupPC_refCountUpdate -m -F
>>
>> Meanwhile, with the kind help of the btrfs community, I figured out a
>> way to get the damaged files. This process is not finished yet;
>> however, I have a first list:
>>
>> /mnt/backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16
>> /mnt/backuppc/cpool/5c/c8/5cc9373a32e06baaa308a7b341db5ac9
>> /mnt/backuppc/cpool/b2/62/b3629c46481cb038682aea248c45b89f
>> /mnt/backuppc/cpool/20/c6/21c61013d40e644af734e28459df0a1a
>> /mnt/backuppc/cpool/8a/f8/8af935bc53f7199ed75a5695bfd57f26
>>
>> FYI: The most recent backup from host abc.def.ghi is 2291, so 1989 is
>> quite far in the past.
>>
>> How should I proceed when I have a list of broken files?
>> Move them out from cpool and hope they will be re-synced by the next
>> (full?) backup?
>>
>> Thanks,
>> Raoul
>>
>>> Craig
>>>
>>> On Sun, Aug 23, 2020 at 6:45 AM Guillermo Rozas
>>> <gui...@gm...> wrote:
>>>
>>> Hi Raoul,
>>>
>>> are you using BackupPC v4? If yes, you can use a modification of the
>>> script I posted here:
>>> https://sourceforge.net/p/backuppc/mailman/message/37032497/
>>>
>>> In the latest version (4.4.0) you also have the config option
>>> $Conf{PoolNightlyDigestCheckPercent}, which checks the md5 digest of
>>> this fraction of the pool files each night. You can probably set it
>>> to 1 and wait a night for it to run.
>>>
>>> Regards,
>>> Guillermo
>>>
>>> On Sun, Aug 23, 2020 at 5:38 AM Raoul Bhatia <ra...@bh...> wrote:
>>> Hi,
>>>
>>> related to my previous email, it seems that the cause of my issues
>>> was a file system corruption after a "power cut".
>>>
>>> I managed to recover (most of?) the data and would now like to do a
>>> thorough check of the data.
>>>
>>> Is there any way to "fully verify" the integrity of my backuppc
>>> installation, ideally in a nondestructive way ;-)
>>>
>>> Thanks,
>>> Raoul
>>>
>>> PS. My backuppc process is stopped.
>>> --
>>> DI (FH) Raoul Bhatia MSc
>>> E-Mail. ra...@bh...
>>> Tel. +43 699 10132530
>
>
>
> _______________________________________________
> BackupPC-users mailing list
> Bac...@li...
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: https://github.com/backuppc/backuppc/wiki
> Project: https://backuppc.github.io/backuppc/
--
DI (FH) Raoul Bhatia MSc
E-Mail. ra...@bh...
Tel. +43 699 10132530