From: Robin O'L. <ro...@eq...> - 2008-11-20 16:13:13
On Mon, Nov 17, 2008 at 05:24:13PM +0100, Kern Sibbald wrote:
> On Monday 17 November 2008 16:51:31 Graham Keeling wrote:
> > On Mon, Nov 17, 2008 at 02:05:53PM +0100, Kern Sibbald wrote:
> > > Solution:
> > > - I don't have one, because we have no way to "lock" a volume from being
> > > purged. Any thing we might do would be prone to errors if the SD should
> > > fail while the volume was "locked".
> > > Bottom line: it is easy to work around this problem, and unless we are
> > > lucky and come up with a good idea, I don't see that there is any easy
> > > way to resolve the problem.

Luckily, Kern did come up with a good idea!

http://bugs.bacula.org/view.php?id=1188

Thanks. This fixes the problem in the test case and in our real-world
situation. As Kern points out above, this fix does open up the
possibility that certain sorts of abnormal failure can leave a volume in
a state where it won't ever be re-used without manual intervention, but
that is certainly preferable to having volumes silently overwritten (and
we have a suggestion to improve that in another post and bug report).

On Mon, Nov 17, 2008 at 05:24:13PM +0100, Kern Sibbald wrote:
> On Monday 17 November 2008 16:51:31 Graham Keeling wrote:
> > I do not think that it is easy to work around this problem. I also think
> > that the problem is very serious and that it is quite likely that other
> > people have triggered it without noticing - it is hard to realise that it
> > has happened if you are not watching very closely indeed.
>
> You have certainly found a bug, but it is a rather artificial problem that
> virtually no one is likely to have, so I do not consider it at this point to
> be too serious.

Looking back through the mailing list discussions for similar topics, I
think this bug is possibly related to Ulrich Leodolter's query "Maximum
Volume Jobs ignored":

http://sourceforge.net/mailarchive/message.php?msg_id=1211877887.7417.17.camel%40leodolter.bibvb.ac.at

and surely also the cause of Kevin Keane's "Bacula ignores Max Volume
Jobs":

http://sourceforge.net/mailarchive/forum.php?thread_name=491E952B.5070608%40kkeane.com&forum_name=bacula-users

The documentation referred to in the latter thread (Configuring the
Director, The Pool Resource, Maximum Volume Jobs) says:

    If you are running multiple simultaneous jobs, this directive
    [Maximum Volume Jobs] may not work correctly because when a drive
    is reserved for a job, this directive is not taken into account,
    so multiple jobs may try to start writing to the Volume. At some
    point, when the Media record is updated, multiple simultaneous
    jobs may fail since the Volume can no longer be written.

It sounds as if this might be alluding to the same issue, so should this
caveat be removed from the documentation now that we have a fix?

If it needs to stay (to document the behaviour of earlier versions
before the fix, or because there are other problematic conditions not
covered by the fix), perhaps the wording of the last bit should be
amended to point out that even jobs reported as successful could be
damaged (if that's still the case), and perhaps it should be moved (or
copied) to the section of the manual specifically related to concurrent
jobs: "Basic Volume Management", "Concurrent Disk Jobs".

Robin O'Leary.

-- 
email: ro...@eq...     Equiinet Ltd., Edison Road, Dorcan,
Tel.: +44 1793 603708  Swindon, SN3 5JX, U.K. 51.5558N,1.7286W