From: Mantis B. T. <no...@bu...> - 2009-08-14 20:50:34
A NOTE has been added to this issue.
======================================================================
http://bugs.bacula.org/view.php?id=1346
======================================================================
Reported By:                CyprusBlue
Assigned To:
======================================================================
Project:                    bacula
Issue ID:                   1346
Category:                   scheduling
Reproducibility:            sometimes
Severity:                   minor
Priority:                   normal
Status:                     feedback
======================================================================
Date Submitted:             2009-08-12 17:18 BST
Last Modified:              2009-08-14 21:50 BST
======================================================================
Summary:                    Volume reservation glare issue
Description:
I have hit a mount lock issue a few times over the past few weeks (since
upgrading to the current version, I believe). I finally caught it happening
today, handled it myself instead of leaving it to one of my lower-level tape
operators, and found out what is going on.

I have a large number of jobs that start at the same time, at 2300 or so, and
it looks like some jobs are starting on different drives while wanting the
same tape. (Log attached)

My backups are set up as follows:

All jobs start at 2300 nightly for incrementals, with differential and full
upgrade periods set. Incrementals go to one pool, differentials go to a
different pool, and fulls go to a third pool.

From what I can tell, there is an asynchronous locking issue with both drives
and volumes if the start times are simultaneous. (Yeah, I realize how that
sounds; it's the bane of thread programmers everywhere. I'm just reporting
it.)

The workaround is easy and needs little intervention (unmount the volume from
one drive once all the jobs on it are done, then run a mount on the other
drive), but it is not ideal to require user action.
I'm not familiar with how the drive/volume reservation system works in
Bacula, so I won't comment on that part, but perhaps conflicts could be
handled by releasing holds and restarting from scratch after a short,
randomized delay?
======================================================================

----------------------------------------------------------------------
 (0004507) kern (administrator) - 2009-08-14 17:25
 http://bugs.bacula.org/view.php?id=1346#c4507
----------------------------------------------------------------------
What version of Bacula are you using?

----------------------------------------------------------------------
 (0004508) CyprusBlue (reporter) - 2009-08-14 17:37
 http://bugs.bacula.org/view.php?id=1346#c4508
----------------------------------------------------------------------
Oops, sorry.

nrepbak01-dir Version: 3.0.2 (18 July 2009) i686-pc-linux-gnu redhat Nahant
nrepbak01-sd Version: 3.0.2 (18 July 2009) i686-pc-linux-gnu redhat Nahant

The Dir has a minor patch applied to purge jobs as well as files when a
migration runs, but other than that, everything is a release version compiled
from the source release.

----------------------------------------------------------------------
 (0004509) kern (administrator) - 2009-08-14 20:19
 http://bugs.bacula.org/view.php?id=1346#c4509
----------------------------------------------------------------------
I have created an experimental patch and uploaded it to this bug report.
However, I have no way to test it because I am unable to duplicate the
problem.

The patch is relative to 3.0.3, but it should apply to your version. If you
are in the main source directory, you can apply it with:

    patch -p2 <bug-1346.patch

If it works, you should first see a warning message saying that the Volume
Bacula wants is on another drive, and then it should attempt to get a
different Volume. Since I cannot test it, though, it may not work, and it
could possibly even crash the SD, though I consider that unlikely.
If it doesn't apply automatically, there are two one-line insertions that you
can probably easily make by hand, and a Jmsg() enhancement that you can
ignore.

----------------------------------------------------------------------
 (0004510) CyprusBlue (reporter) - 2009-08-14 20:20
 http://bugs.bacula.org/view.php?id=1346#c4510
----------------------------------------------------------------------
It will try to grab a different volume than the one in the other drive?

----------------------------------------------------------------------
 (0004511) kern (administrator) - 2009-08-14 20:37
 http://bugs.bacula.org/view.php?id=1346#c4511
----------------------------------------------------------------------
Yes.

----------------------------------------------------------------------
 (0004512) CyprusBlue (reporter) - 2009-08-14 21:50
 http://bugs.bacula.org/view.php?id=1346#c4512
----------------------------------------------------------------------
Ok, testing this now. Ideally I'd prefer that it not continue using the
additional drive and just get in line for the first one, but after looking at
the code I realize why this is the solution for the moment.

Issue History
Date Modified    Username       Field                    Change
======================================================================
2009-08-12 17:18 CyprusBlue     New Issue
2009-08-12 17:18 CyprusBlue     File Added: baculalog.txt
2009-08-12 22:36 slords         Issue Monitored: slords
2009-08-14 17:25 kern           Note Added: 0004507
2009-08-14 17:25 kern           Status                   new => feedback
2009-08-14 17:37 CyprusBlue     Note Added: 0004508
2009-08-14 20:14 kern           File Added: bug-1346.patch
2009-08-14 20:19 kern           Note Added: 0004509
2009-08-14 20:20 CyprusBlue     Note Added: 0004510
2009-08-14 20:37 kern           Note Added: 0004511
2009-08-14 21:50 CyprusBlue     Note Added: 0004512
======================================================================
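
For readers following the thread, the reporter's suggestion in the
description (release all holds on a conflict, then restart from scratch after
a short randomized delay) could look roughly like the sketch below. It is an
illustration only and does not reflect Bacula's reservation code or the
attached patch. Resource, try_reserve, and reserve_all are hypothetical names
invented for this example, and the delay values are arbitrary assumptions.

```python
import random
import threading
import time


class Resource:
    """Hypothetical stand-in for a drive or a volume that has one holder at a time."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

    def try_reserve(self):
        # Non-blocking: returns False immediately if someone else holds it.
        return self._lock.acquire(blocking=False)

    def release(self):
        self._lock.release()


def reserve_all(resources, max_attempts=10, base_delay=0.01):
    """Reserve every resource or none of them.

    On a conflict, release whatever was already taken and retry from
    scratch after a delay with random jitter, so two jobs that collided
    once are unlikely to collide again on the next attempt.
    """
    for attempt in range(max_attempts):
        held = []
        for res in resources:
            if res.try_reserve():
                held.append(res)
            else:
                break  # conflict: stop grabbing more
        if len(held) == len(resources):
            return True  # got everything; caller releases when the job ends
        for res in held:
            res.release()  # release holds so the other job can make progress
        # Delay grows with the attempt number and carries random jitter.
        time.sleep(base_delay * (attempt + 1) * (1 + random.random()))
    return False
```

A job would call reserve_all([drive, volume]) before mounting and release
both when it finishes. Because a job that loses the race holds nothing while
it waits, it cannot pin a drive that is waiting for a tape mounted elsewhere.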