Re: [Bacula-devel] [Bacula-users] Improving job scheduling flexibility


On Monday 25 February 2008 22.07:06 Arno Lehmann wrote:
> Hi,
>
> 25.02.2008 18:47, mar...@up... wrote:
> > In the message dated: Sat, 23 Feb 2008 12:40:43 +0100,
> > Kern Sibbald used the subject line
> > 	<[Bacula-users] Improving job scheduling flexibility>
> > and wrote:
>
> ...
>
> > =>
> > => Finally Job Proximity is to allow a bit of overlap.  For example, if a
> > => job has been running 20 minutes or ran 20 minutes ago, you might want
> > => to not apply the rules.
> >
> > Could you elaborate on what this means to you a bit more?
> >
> > I see the distinction here being mainly in terms of jobs that take a
> > "long" time vs. a "short" time. If the entire job normally takes 30
> > minutes, I don't really care whether there's a duplicate, and it doesn't
> > matter to me if the duplicate starts 1 minute after the original or 29
> > minutes after.
> >
> > However, if the job normally takes 18 hours, then the conditions are very
> > different. In this case, I really, really, really don't want a duplicate
> > running if there's a lot of overlap--this would have a major effect on
> > disk loads on the client, on network traffic, and on disk/cpu/media
> > resource on the bacula server. However, if the original job is
> > near completion when the duplicate is launched, then I don't want to
> > cancel the duplicate. In this case, the reasoning is that canceling the
> > duplicate would result in a long window with no backups, in an effort to
> > close a small window of duplicate (simultaneous) backups running.
>
> That's a very important distinction, and Kern's proposal will work only
> if the one setting up the jobs correctly estimates the expected job
> run time. As such, it's a "good enough" solution IMO.
>
> > Here's a very complicated proposal, which will almost certainly be
> > rejected,
>
> Let's wait... I would vote for it :-)
>
> > that really leverages Bacula's database backend and gives a really
> > powerful feature:
> >
> > 	if the job historically takes over $DURATION [minutes|hours|days]
> > 	and the current job is at least $PERCENTAGE complete, then allow the
> > 	duplicate to run, otherwise kill the duplicate
> >
> > 		in this case, $DURATION would be determined from database stats,
> > 		as an average of previous runs of the same job at the same level.
> >
> > 		I could also see an algorithm that
> > 		gives more weight to the duration of the most recent backups if the
> > 		standard deviation of the average vs. the most recent backups is
> > 		greater than a specified value. This is because a given backup is
> > 		more likely to take "almost as much" time as the most recent backup
> > 		of the same level than as much time as a much earlier backup.
> >
> > 		similarly, the $PERCENTAGE value could be expressed as a range,
> > 		incorporating the standard deviation in the backup duration
>
> Yes. Actually, I don't think this is so hard to get out of the
> catalog. Of course, I would need quite a number of attempts in SQL to
> get reasonable results, but still it should be possible with a
> reasonable amount of work invested.
>
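To make the rule quoted above concrete, here is a minimal Python sketch of the decision. It is not Bacula code and nothing like it exists in the source tree as far as this thread says. The function names, the 0.75 weighting, the "last 3 runs" window, and the idea of estimating completion as elapsed time divided by expected time are all assumptions for illustration. In practice the list of previous durations would come from the catalog, for the same job at the same level.

```python
from statistics import mean, pstdev

def expected_duration(history, spread_limit=0.0, recent=3, recent_weight=0.75):
    """Estimate the run time in seconds from previous runs (oldest first).

    If the durations vary by more than spread_limit (population standard
    deviation), lean towards the most recent runs. The weighting scheme
    is an assumption, not something specified in the proposal.
    """
    if not history:
        raise ValueError("no previous runs to estimate from")
    overall = mean(history)
    if len(history) < 2 or pstdev(history) <= spread_limit:
        return overall
    latest = mean(history[-recent:])
    return recent_weight * latest + (1 - recent_weight) * overall

def allow_duplicate(history, elapsed, long_job_threshold, min_fraction,
                    spread_limit=0.0):
    """Return True if the duplicate may run, False if it should be killed.

    Literal reading of the proposal: allow only when the job historically
    takes longer than long_job_threshold AND the running job is at least
    min_fraction complete. Everything else falls into "otherwise".
    """
    expected = expected_duration(history, spread_limit)
    fraction_done = elapsed / expected
    return expected > long_job_threshold and fraction_done >= min_fraction
```

For example, with three previous 18 hour runs (64800 seconds each), a one hour threshold and a 90% requirement, a duplicate launched 61560 seconds into the run would be allowed, while one launched after 6480 seconds would be killed. Note that under this literal reading short jobs always land in the "otherwise" branch, so a real implementation would probably want a separate setting for them.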
> > [As an aside, I'd like to see this kind of predictive/AI capability put
> > into more of bacula, particularly in the scheduling. It would be
> > wonderful to use the historic records to allow bacula to schedule jobs
> > most efficiently, in a way similar to Amanda, rather than hard-coding
> > specific times in each job resource.]
>
> I also agree, though I find that in many cases, especially due to
> auditing requirements or scheduled removal to off-site storage, a
> fixed schedule for backups is required despite the easier setup with
> such an automatic scheduling. I would like to be able to express
> something like "job retention time = 8 months" "keep jobs = 6" and
> "job max interval = 5 weeks" (for full backups) and Bacula decides
> when to run full backups and which old ones to purge...

Well, currently (in SVN) you can do everything except the "keep jobs = 6", but
that has been on my radar screen for some time now ...
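
For illustration only, the three settings Arno lists could combine roughly as in the Python sketch below. This is not how Bacula implements anything, and the names are hypothetical. It assumes "8 months" means about 240 days, and that a full backup is purged only when it is both past the retention time and not among the newest "keep jobs" fulls. Other readings of "keep jobs" are possible.

```python
from datetime import datetime, timedelta

def full_backup_due(last_full, now, max_interval=timedelta(weeks=5)):
    """A new full is due if none exists or the last one is too old."""
    return last_full is None or now - last_full >= max_interval

def fulls_to_purge(full_times, now,
                   retention=timedelta(days=240), keep_jobs=6):
    """Return the start times of full backups that may be purged.

    Assumed policy: always keep the newest keep_jobs fulls; of the
    rest, purge those older than the retention time.
    """
    newest_first = sorted(full_times, reverse=True)
    candidates = newest_first[keep_jobs:]
    return [t for t in candidates if now - t > retention]
```

With nine fulls taken five weeks apart, the six newest are kept regardless of age and the three oldest (245, 280 and 315 days old) are returned for purging.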

>
> Arno