From: Kern S. <ke...@si...> - 2005-12-01 08:37:34
On Wednesday 30 November 2005 23:04, Arno Lehmann wrote:
> Hello,
>
> On 30.11.2005 09:09, Kern Sibbald wrote:
> > On Tuesday 29 November 2005 22:29, Arno Lehmann wrote:
> >> Hello,
> >>
> >> nice to see that Migration is making progress...
> >>
> >> On 29.11.2005 17:19, Kern Sibbald wrote:
> >>
> >> ...
> >>
> >>> I don't think there is any choice, since it is probably not advisable
> >>> to move just a piece of a job to another device. In addition, it
> >>> would be really hairy trying to update the JobMedia block pointers if
> >>> you started moving stuff in the middle of the job. It really needs to
> >>> move a whole job at a time.
> >>
> >> Hmm. Two thoughts:
> >> First, migrating parts of jobs will probably be necessary when you use
> >> migration to achieve what you do now with spooling. Otherwise, you'd
> >> need to have more spool space than the largest job needs. When the
> >> spool space fills and the job is not finished, it is very important to
> >> be able to migrate a job that is not yet finished. The same applies
> >> when migration (despooling) should happen before the primary job is
> >> finished.
> >
> > I know there has been some discussion of spooling and migration, but
> > for me they are totally separate.
>
> OK, that clarifies this point.
>
> > It is possible to write a job to one medium, then later move it to
> > another. I don't consider that spooling. Migration is done after the
> > job is complete. Spooling is a process like a spool: it can be filled
> > and emptied many times. I currently have no intention of mixing the
> > two. That said, one can use Migration as a sort of simplistic spooling
> > where *everything* gets written first and is then moved.
>
> That's the scenario I thought about.
>
> >>> A Volume migration (for me in the current Bacula context) is simply:
> >>> 1. Find what jobs are stored on the Volume.
> >>> 2. Loop over those jobs, doing a job migration for each.
> >>
> >> Second, this is probably too simplistic once you consider multiple
> >> concurrent jobs.
> >> When these jobs are interleaved on the source volume, you'd have to
> >> read the whole volume for each job. No fun...
> >
> > You may consider it simplistic, but I am not going to try to implement
> > job migration by any other means. Yes, you essentially have to pass
> > through the volume (not the whole volume) N times, where N is the
> > number of jobs on the volume. Please re-read my first sentence ...
>
> I don't see where I misunderstood you. The problem I see is rather more
> basic: assume a volume where 10 jobs are stored, and assume they are all
> interleaved because they were written concurrently, either without
> spooling or with not enough spool space. Now, to free that volume, you'd
> have to trigger a migration job for each of the ten jobs. These
> migration jobs would need to run sequentially, and each of them could
> require the whole volume to be read (not each byte, but rather in terms
> of tape use: the tape would be moved ten times over the read head
> [perhaps more often - I know what the S in SLR stands for ;-) ]).
>
> There might be better solutions for migrating the complete contents of
> a volume - although I wouldn't want to implement any of them :-)

Yes, there might be more efficient solutions, but at the expense of
writing a lot of tricky new code. If someone wishes to supply a patch,
OK, but I plan to implement it the simple, inefficient way -- one job at
a time.

> > ...
>
> >>> I'll mull this over. In any case, I'm more interested in Migration
> >>> at the moment rather than Copy or Archive, so we have plenty of time
> >>> to work out the terminology.
> >>
> >> True, but I think David's comments regarding Archive (think of
> >> long-term secure storage) and Extract (copy for external use) sound
> >> very reasonable.
> >
> > I am not very hot about David's comments on Archiving. I for one would
> > not want any archive records left in my current production database.
> > This does not seem to be a good idea.
> > When I archive something, I want it and all its data gone.
>
> Hmm, my understanding of an Archive is definitely different, then.
>
> > Well, the archive tapes in themselves are sufficient to recover all
> > the data.
>
> As is the case for working Bacula catalog data - you keep the metadata
> in the catalog for some reason, and I think the same reasons apply to
> an Archive copy.
>
> > If one wants to keep the database as well, which I consider completely
> > redundant, we would have to think of other mechanisms, because I
> > believe that all the database information should be in a fresh
> > database that is perhaps an Archive database. In any case, backing up
> > a database is a good way to lose your data, because you are dependent
> > on two things: 1. the particular database (MySQL, PostgreSQL, ...),
> > and 2. most likely the version of that database. I don't call that
> > archiving -- unless you archive a copy of your machine including the
> > running database software.
>
> I think that argument goes too far, but I now understand better what
> you consider an Archive - I'd in fact rather call this an Extract or
> something.
>
> It seems that we need some sort of a Bacula glossary, or rather to
> extend the explanation of terms in the manual a bit...
>
> Arno

--
Best regards,

Kern

  (">
  /\
  V_V
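[Editor's note: the "simple, inefficient" volume migration discussed above -- find the jobs on the volume, then migrate them one at a time -- can be sketched as below. This is only an illustrative sketch, not Bacula code; all names (`find_jobs_on_volume`, `migrate_job`, `migrate_volume`) are hypothetical stand-ins for catalog and storage-daemon operations. It makes Arno's cost argument concrete: with interleaved jobs, each job migration is one pass over the source volume, so migrating the whole volume costs N passes for N jobs.]

```c
/* Sketch of one-job-at-a-time volume migration.  All names are
 * illustrative assumptions, not Bacula's actual API. */
#include <stdio.h>

#define MAX_JOBS 100

/* Hypothetical stand-in for a catalog query: fill job_ids[] with the
 * IDs of the jobs stored on the given volume, return the count. */
static int find_jobs_on_volume(const char *volume, int job_ids[])
{
    (void)volume;
    /* Pretend the catalog reports three interleaved jobs. */
    job_ids[0] = 101;
    job_ids[1] = 102;
    job_ids[2] = 103;
    return 3;
}

/* Hypothetical stand-in for a single job migration.  When jobs are
 * interleaved on tape, reading one job means one pass over the
 * source volume, so this "costs" one pass. */
static int migrate_job(int job_id)
{
    printf("migrating job %d (one pass over source volume)\n", job_id);
    return 1;
}

/* Volume migration = loop over the jobs stored on the volume.
 * Returns the total number of passes over the source volume: N for
 * N jobs, which is exactly the inefficiency discussed above. */
int migrate_volume(const char *volume)
{
    int job_ids[MAX_JOBS];
    int n = find_jobs_on_volume(volume, job_ids);
    int passes = 0;

    for (int i = 0; i < n; i++)
        passes += migrate_job(job_ids[i]);
    return passes;
}
```

For Arno's ten-job example, `migrate_volume` would report ten passes; a smarter single-pass scheme would have to demultiplex all jobs' blocks in one read, which is the "tricky new code" Kern declines to write.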