RE: [Bacula-users] Re: [Bacula-devel] Migration jobs

> > I don't think there is any choice, since it is probably not advisable
> > to move just a piece of a job to another device.  In addition, it
> > would be really hairy trying to update the JobMedia block pointers if
> > you started moving stuff in the middle of the job.  It really needs to
> > move a whole job at a time.
>
> Hmm. Two thoughts:
> First, migrating parts of jobs will probably be necessary
> when you use migration to achieve what you do now with
> spooling. Otherwise, you'd need to have more spool space than
> the largest job needs. When the spool space fills and the
> job is not finished, it is very important to migrate a job
> that is not even finished. The same applies when migration
> (despooling) should happen before the primary job is finished.

One observation is that if a volume is in use at the time of migration,
I think it is, by definition, not eligible for migration (i.e., the
migration process needs to check whether the volume is mounted or
being requested anywhere and, if so, skip to the next volume in the
pool). If this observation holds, then Kern's approach will work, in
that some OTHER volume will get selected and he won't deadlock on trying
to read and write to the same volume. Once a volume is mounted for
migration, it becomes unavailable for normal selection until released by
the migration job, and thus is protected from confusion by the migration
job reservation.
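
A rough sketch of that eligibility check, in Python rather than anything
resembling Bacula's actual code. The Volume fields (mounted, requested,
reserved_by) and both function names are made up for illustration; the
real catalog and storage daemon track this state differently.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Volume:
    """Hypothetical in-memory view of a volume's state."""
    name: str
    mounted: bool = False               # currently mounted in some drive
    requested: bool = False             # some job is waiting for it
    reserved_by: Optional[str] = None   # set while a migration job holds it


def is_eligible(vol: Volume) -> bool:
    # A volume that is in use in any way is, by definition, not eligible.
    return not (vol.mounted or vol.requested or vol.reserved_by)


def reserve_next_volume(pool: Iterable[Volume], job_id: str) -> Optional[Volume]:
    """Skip busy volumes and reserve the first idle one for job_id."""
    for vol in pool:
        if is_eligible(vol):
            vol.reserved_by = job_id    # hidden from normal selection until released
            return vol
    return None                         # nothing idle right now; try again later
```

Returning None instead of waiting on a busy volume is what avoids the
read/write deadlock on a single volume in this sketch.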

The problem I see with job-only migration is the case where a job is
known to span multiple volumes (I have lots of these). The job migration
would then need to somehow reserve all the volumes used in a job during
migration, which sounds to me more likely to deadlock than the
volume-based migration process (or at least more likely to trigger
resource starvation in a pool). This is the main reason why I was
thinking of a volume orientation for migration -- if, by definition, a
volume is ineligible for migration when it is in use, then again, by
definition, a volume that is available is not being modified by an active
job, so the update of the file records on an inactive source volume
*shouldn't* be a chokepoint.

Using this theory, migration should mount the source volume (implicitly
reserving it), select and mount an output volume from the next-pool
(implicitly reserving it), extract a list of physical files on the
source volume from the database, copy a physical file to new volume,
verify the copy (for safety's sake), update the database pointer to new
file location (after verify of copy, so if db update fails, the original
data is still present), repeat until source volume is empty.

At EOV on the source volume, close the new tape, unmount source/destination
volumes, mark the migrated source volume as available/appendable, and
repeat if necessary. The released source volumes become available pool
volumes, and are treated as such by any running jobs using that pool.
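
To make the ordering concrete, here is a toy version of the per-volume
loop in Python. Everything in it is an assumption for illustration: the
volumes are plain dicts of file id to bytes, the "database" is a dict of
file id to volume name, and migrate_volume is a made-up name. The only
point is the sequence: copy, verify, then update the pointer, and only
then drop the original.

```python
import hashlib


def migrate_volume(source, dest, catalog, src_name, dst_name):
    """Move every file the catalog places on src_name over to dst_name."""
    # Extract the list of files on the source volume from the "database".
    file_ids = [fid for fid, vol in catalog.items() if vol == src_name]
    for fid in file_ids:
        data = source[fid]
        dest[fid] = data                         # copy to the new volume
        copied = dest[fid]                       # read back for verification
        if hashlib.sha256(copied).digest() != hashlib.sha256(data).digest():
            del dest[fid]                        # bad copy; original is untouched
            raise IOError("verify failed for %s" % fid)
        catalog[fid] = dst_name                  # pointer moves only after verify
        del source[fid]                          # now the original can go
    return len(file_ids)
```

If the catalog update fails for some file, that file is still present on
the source volume, which is the safety property described above.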

> Second, this is probably too simplistic once you consider
> multiple concurrent jobs. When these jobs are interleaved on
> the source volume, you'd have to read the whole volume for
> each job. No fun...

This can be alleviated by not allowing multiple jobs to access the same
volume at the same time. IMHO, this is good practice in any case. In the
spooling case, you would define an autochanger with the number of
simultaneous 'disk' devices equal to the number of streams you want to
support simultaneously, and let the autochanger code handle assigning
"drives" to jobs.

Experience with this kind of migration-based spooling with TSM indicates
that controlling the number of sessions is an important performance knob --
if you only have a few paths to a disk, you need to closely control the
number of clients intent on beating it to death.
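
As an illustration of that knob (not of how TSM or Bacula implement it),
a session cap can be sketched in Python with a counting semaphore. The
function name run_with_session_limit and the idea of modelling each
client session as a callable are assumptions made for this example.

```python
import threading


def run_with_session_limit(jobs, max_sessions):
    """Run all jobs, never more than max_sessions at once; return the peak."""
    gate = threading.BoundedSemaphore(max_sessions)
    lock = threading.Lock()
    active = 0
    peak = 0

    def worker(job):
        nonlocal active, peak
        with gate:                       # blocks while max_sessions are running
            with lock:
                active += 1
                peak = max(peak, active)
            try:
                job()                    # the client's write to the spool disk
            finally:
                with lock:
                    active -= 1

    threads = [threading.Thread(target=worker, args=(j,)) for j in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak                          # highest concurrency actually observed
```

With max_sessions set to the number of paths to the disk, extra clients
simply queue instead of competing for the same spindles.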

> > I'll mull this over. In any case, I'm more interested in Migration at
> > the moment rather than Copy or Archive, so we have plenty of time to
> > work out the terminology.

One other thing to mull over:

If the Extract idea gets implemented, it might be worth having the
ability for an extract job to write either standard tar format, cpio
format, or ANSI SL format as the output. That would make creating tapes
or CDs that are readable anywhere very simple.
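
For the tar case, a minimal sketch using Python's standard tarfile
module shows how little is needed once the file data has been extracted.
The write_tar function and the path-to-bytes mapping are assumptions for
this example, not a proposed Bacula interface; the Python standard
library has no cpio writer, and ANSI SL output would likewise need its
own code.

```python
import io
import tarfile


def write_tar(files, out):
    """Write a mapping of path -> bytes to `out` as a POSIX ustar archive."""
    with tarfile.open(fileobj=out, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)   # header for one archive member
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
```

Here `out` can be any binary file object, so the same routine could
target a disk file or a device node opened for writing.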