From: Kern S. <ke...@si...> - 2005-11-28 15:14:23
Hello,

I'm probably more than half way (possibly 3/4) to getting the first migration/copy job running, so I thought I would write up a few of the details for those who are interested in checking my "design" or who simply want to make comments.

What I have implemented already (it passes regression testing, so all existing features work despite the new code):

- Separation of the read/write descriptors in the Storage daemon.
- Separation of the read/write Storage names in the Director that are sent to the Storage daemon (both a read and a write storage device can now be sent).
- Implementation of a skeleton of the Migration/Copy job code in the Director.
- Implementation of the following new Pool directives:
    Migration Time = <duration>
    Migration High Bytes = <size>
    Migration Low Bytes = <size>
    Next Pool = <Pool-res-name>
  (nothing is done with them yet, but all the Catalog variables already exist in 1.38.x).
- Implementation of a new Job directive:
    Migration Job = <Job-res-name>
  This is identical to the current Verify Job directive. It allows specification of the Job to be migrated/copied.
- Implementation of a Migrate and a Copy Job Type (similar to Backup, Restore, and Verify).

What remains to be done:

- Finish the skeleton of the Migration job code in the Director, including having it check the new Pool directives ...
- Implement sending the bootstrap file directly from the Director to the Storage daemon (cutting out the FD). This requires a DIR<->SD protocol change.
- Implement the Migration/Copy code in the SD (rather trivial, I think).

How does it work? Much like a Verify job. You define a Migrate or Copy job much the same as you do a Verify job, with the exception that you specify a Migration Job (the target) rather than a Verify Job name (i.e. you tell it what job you want migrated). The "from" Storage daemon is defined in the specified target Migration Job.
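To make the above concrete, here is a rough sketch of how the new directives might fit together in bacula-dir.conf. This is only an illustration of the design described above, not tested configuration: the resource names (Default, LongTerm, NightlySave, MigrateNightly) are invented, other directives a Job normally requires are omitted, and the exact syntax may well change before this is released.

```conf
# Source pool: backups land here first.  When Migration High Bytes is
# exceeded (or Migration Time has passed), the data becomes eligible
# to be migrated to the pool named by Next Pool.
Pool {
  Name = Default
  Pool Type = Backup
  Migration Time = 30 days       # <duration>
  Migration High Bytes = 200G    # <size> -- triggers migration
  Migration Low Bytes = 100G     # <size>
  Next Pool = LongTerm           # <Pool-res-name> -- destination pool
}

# Destination pool for migrated data.
Pool {
  Name = LongTerm
  Pool Type = Backup
}

# A Migrate job names the job to be migrated, much as a Verify job
# names the job to be verified.
Job {
  Name = "MigrateNightly"
  Type = Migrate
  Migration Job = "NightlySave"  # <Job-res-name> -- the target job
  Pool = Default
  Schedule = "WeeklyCycle"
}
```

As described above, the job itself does nothing until its schedule fires and one of the Pool conditions (time passed or high-water bytes exceeded) is met.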
The "from" Pool is specified in the target Job's Pool's Next Pool, and if that is not specified, it is taken from the Pool specified in the current job. You then schedule this Migration job using a schedule. When it runs, it checks that either the Migration Time has passed (it has if it is zero) or that the Migration High Bytes are exceeded in the target's Pool. If one of those is true, the job starts and migrates the last run of the target job (this needs to be improved) by reading that job, much like a restore, and writing it to the destination pool. Then, for a Migration, the old job is deleted from the catalog (perhaps the Volume will be removed -- another Feature Request), while in the case of a Copy, the old Job information is left unchanged.

Consequences/problems:

- This is a bit simplistic (OK with me) and relies on the user to schedule Migration jobs rather than Bacula internally checking and automatically firing off a Migration job. I'm not too comfortable with automatically generated jobs.
- There exist deadlock situations in acquiring resources. For example, once the read device is acquired, that device is blocked, and if another job has the output device that is needed, the Migrate job will block until the other job completes. If that other job needs the read device of the Migrate job, a deadlock will occur.
- Each Migration job migrates a single previous job. Scheduling multiple migration jobs will, assuming the migration conditions are met, migrate one job each time. This is fine with me.
- You need a different Migration job for each job to be migrated. This is a bit annoying, but it is mitigated by JobDefs.
- I haven't worked out exactly what to keep in the catalog for Migration jobs (a job that runs but does nothing should probably be recorded; a job that runs and migrates data should probably be labeled as a Backup job to simplify the restore code ...).
- The last 20% of the programming effort requires 80% of the work :-)
- I'm thinking about adding an interactive migration console command, similar to the restore command, except that the files selected will be written to the output. This is a way to "migrate" multiple jobs (i.e. the current state of the system), or in other words to do a "virtual Full backup" or a "consolidation". To be consistent, this command would not allow selection of individual files; i.e. it will take all the files from the specified jobs.
- An Archive feature can also be implemented from this -- it is simply a Copy with no record being put in the catalog and with the output Volume type being Archive rather than Backup. (Note, my concept of Archive is that no record of them remains in the catalog -- this may be a subject of discussion ...)

Comments?

--
Best regards,

Kern

(">
/\
V_V