Re: [Burp-users] Large file problems?
From: Graham K. <gr...@gr...> - 2026-01-26 00:48:00
On Sun, Jan 25, 2026 at 01:04:41PM -0500, Brendon Higgins wrote:
> Hi all,
>
> On Saturday, December 13, 2025 12:46:36 p.m. Eastern Standard Time Brendon
> Higgins wrote:
> > What I *am* noticing is that the backup medium is acting very slowly (curse
> > shingled magnetic recording!), and dmesg shows some statements about several
> > tasks - one of them burp - being blocked for more than 120 seconds, and
> > that "task burp:8409 is blocked on a mutex likely owned by task burp:4996".
>
> I have more observations from this weekend's (failed) backup attempt which
> indicate that, in my case, the backup medium is NOT too slow. I'm convinced
> this was a red herring because I can see (using `sudo lsof -c burp`) burp
> getting stuck at a particular file, and can manually copy that file to the
> medium, with a `sync`, and it only takes a few seconds. The disk is clearly
> fine at the moment, and burp is running the entire time - it's just not doing
> anything (near-zero CPU activity, IO activity, socket activity), and
> eventually (~15 minutes) times out, just like Mark had described observing.
> And I'm not seeing those dmesg "blocked on a mutex" lines anymore, either.
>
> I've now let burp continue operating overnight, attempting to resume, timing
> out after about 15 minutes, and restarting at the next (also 15 minute) cron
> cycle. It's gone about 40 cycles, now. When I probe with lsof, it's always
> stuck on the same file, /usr/lib/chromium/chromium . This isn't even the
> largest file burp has copied into the working backup, although it is the
> largest one which is being compressed (the others have file-type exclusions).
>
> This file also always seems to get stuck at a little over 100 MB saved to the
> backup (under data.tmp), although it can be slightly different each time,
> varying by a few tens of kB (although one time might have only gotten to 89
> MB). On resume it seems burp retries the file from its beginning, as I've
> noticed the size get smaller a few times. The file is certainly incomplete;
> gunzip complains about unexpected end of file. If I gzip the original myself,
> the result comes to about 114 MB - so roughly 14 MB is missing.
>
> I'm only seeing this happen with the client and server being on the same
> machine. Other machines on my network have no evidence of such failures - no
> signs of "resumed" logs in their backup folders. I have one Windows machine
> with burp 2.4.0, which has a similarly sized backup, and a few other Linux
> machines with smaller backups, some with 2.4.0 and others with 3.1.4. (I even
> increased max_children to accommodate all of them at once, and beside this
> local client, there was no problem.)
>
> Other things I tried which seemed to make no difference: switching librsync on,
> and changing compression level from zlib6 to zlib9 (the ~100 MB failure point
> even seemed to remain the same).
>
> I'm not quite sure what to do at this point. The backup is not progressing,
> which is a concern, but given that it's reproducible in this state perhaps
> there's some other aspect I could look more closely at which could yield more
> useful info?
>
> Best,
> Brendon

Hello,

It could well be possible that some particular form of disk latency could
cause a burp backup process to get jammed.
In order to fully rule out the backup medium causing burp to behave poorly,
have you tried using a different backup medium?

Meanwhile, if I were able to reproduce it myself, I would be putting debug into
various places in the code, recompiling, and trying to track down exactly what
is getting stuck.
I am currently not able to reproduce the problem, but I have an idea to do
with setting up a loopback medium such as described here:
https://serverfault.com/questions/523509/linux-how-to-simulate-hard-disk-latency-i-want-to-increase-iowait-value-withou
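
Regarding the incomplete file under data.tmp described in the quoted message: one way to get a more precise number than "a little over 100 MB" might be to measure how many uncompressed bytes the partial file actually yields, which would give the approximate offset in the original where things stall. The following is only a minimal sketch, not part of burp. It assumes the partial file is a plain gzip stream (as the gunzip "unexpected end of file" message suggests), and the function name `recoverable_bytes` is hypothetical.

```python
import sys
import zlib


def recoverable_bytes(gz_path, chunk_size=1 << 20):
    """Return (uncompressed_bytes, stream_complete) for a possibly truncated gzip file."""
    # wbits = 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    total = 0
    with open(gz_path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            # Decompress incrementally so a truncated file still reports progress.
            total += len(decomp.decompress(chunk))
            if decomp.eof:
                break
    # Count anything still buffered; eof is True only if the end of the
    # gzip stream (including its trailer) was actually reached.
    total += len(decomp.flush())
    return total, decomp.eof


if __name__ == "__main__":
    # Usage: python3 this_script.py <path to the partial file>
    size, complete = recoverable_bytes(sys.argv[1])
    print(f"{size} uncompressed bytes recovered, complete={complete}")
```

If the reported offset turns out to be the same on every cycle, that would point at something about the data at that position rather than at timing. If it moves around, latency or buffering looks more likely.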