RE: [Bacula-users] Re: [Bacula-devel] Migration jobs

> > One observation is that if a volume is in use at the time of
> > migration, I think it is by definition, not eligible for migration
> > (ie, the migration process needs to check to see if the volume is
> > mounted or being requested anywhere and skip to the next volume in
> > the pool). If this observation obtains, then Kern's approach will
> > work in that some OTHER volume will get selected and he won't
> > deadlock on trying to read and write to the same volume. Once a
> > volume is mounted for migration, it becomes unavailable for normal
> > selection until released by the migration job, and thus is protected
> > from confusion by the migration job reservation.
>
> I plan to implement this as in any other job. The migration job will
> simply wait for the tape to become available.  The problem is not
> waiting for a tape, but the fact that the archive job needs two tapes
> (or Volumes).  The deadlock only occurs because of the fact that two
> are needed.

OK, I'm totally confused now. Data is coming into the initial pool, and
backup jobs are writing to volumes within that pool. A migration job
starts because the initial pool is filling up or the user said start a
migration from the initial pool to the next pool via the console. What
do you envision happening at that point? I'm probably missing some
crucial detail here because I'm not following the progression of events
at all.

> I think I can resolve these problems much more simply. The only
> condition I think that has to be put on Migration is that the
> Volumes it wants to read should all be marked non-append, so
> we are sure they are not being written.  The Migration job
> can do that itself.

OK. I guess I still don't understand what would happen if a job spans
10-20 volumes. I've got some servers that dump 1-2 TB at a shot onto
DLT7000s. If I follow what you're saying, then the migration job would
essentially freeze all 20 volumes when a job is selected for migration,
do the migration one volume at a time using 1 input, 1 output volume
from the next pool, then make the frozen volumes available again after
the migration job finishes. Is that essentially correct?

If so, then I think your approach will work for small sites (where
multivolume dumps are relatively rare), but not for larger sites --
holding that reservation on a large number of volumes would impact the
initial pool fairly severely, I'd think.
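
To make my reading concrete, here is a rough Python sketch of that
sequence. It is only a model of my understanding, not Bacula code:
Volume, Pool, migrate_job and copy_volume are hypothetical names, and
the "non-append" marking is reduced to a simple boolean.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Volume:
    name: str
    appendable: bool = True  # hypothetical stand-in for "not marked non-append"


@dataclass
class Pool:
    volumes: List[Volume] = field(default_factory=list)

    def appendable_count(self) -> int:
        """Number of volumes backup jobs could still write to."""
        return sum(1 for v in self.volumes if v.appendable)


def migrate_job(job_volumes: List[Volume], pool: Pool,
                copy_volume: Callable[[Volume], None]) -> int:
    """Freeze every volume the job spans, copy them one at a time,
    then release them. Returns how many volumes in the pool stayed
    appendable while the migration was running."""
    # Step 1: mark all source volumes non-append up front.
    for v in job_volumes:
        v.appendable = False
    still_appendable = pool.appendable_count()
    try:
        # Step 2: one input volume at a time (the output side is left
        # to the caller-supplied copy_volume, an assumption here).
        for v in job_volumes:
            copy_volume(v)
    finally:
        # Step 3: make the frozen volumes available again.
        for v in job_volumes:
            v.appendable = True
    return still_appendable
```

With a 25-volume initial pool and a job spanning 20 of them, this model
leaves only 5 volumes writable for the whole duration of the migration,
which is the impact I'm worried about.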

We can see how it goes, though. Working code usually wins...8-)

> > > Second, this is probably too simplistic once you consider multiple
> > > concurrent jobs. When these jobs are interleaved on the source
> > > volume, you'd have to read the whole volume for each job. No fun...
> >
> > This can be alleviated by not allowing multiple jobs to access the
> > same volume at the same time.
>
> There is no chance of changing this behavior. The Bacula
> users would definitely not agree.

Again, maybe I've just not had enough coffee, but are you saying it's
possible to have Job A writing to a volume and Job B reading from the
*same* volume simultaneously? I'm not talking about having data from
multiple jobs residing on the same volume, I'm talking about
simultaneous use of a specific volume in two jobs at the same instant.

> > If the Extract idea gets implemented, it might be worth having the
> > ability for an extract job to write either standard tar format, cpio
> > format, or ANSI SL format as the output. That would make creating tapes
> > or CDs that are readable anywhere very simple.
>
> I'm not much in favor of an Extract as you had described it since I
> think that Archive essentially does that.  However, I think it is an
> excellent idea to have an Extract or Export function that would
> convert the data into some other format (tar, ...)

Can you give me an example of where/how you would use the Archive
function as you define it? I'm having trouble seeing why, if I archive
something (which, to me, implies that I may want it back some day), I
would not need records of where that archive data went.

I'm really trying to understand the different terminology here -- I
*think* we may be saying the same thing in different words, but I can't
seem to make the bigger leap.

-- db