From: James H. <jam...@be...> - 2007-09-02 10:22:50
> My plan was to have plugins as fine as the Options level (as currently
> partially implemented), which is finer grained than the Job level, and it
> does not at all solve the problem, but creates new problems; the main one
> being that unless you figure out how to put these plugin names on the
> Volume, the Volume is no longer complete, but requires a *special* conf
> file to be properly read. I consider this a partial implementation that
> will get the user into trouble, so it is something I have ruled out
> implementing -- at least for the moment.

Hmmm... maybe there could be some value in writing out a 'dummy' header
file at the start of the backup that contains the 'options' section that
was used to create this backup.

I do know that Veritas Backup Exec makes it look like everything is backed
up in one big job, but really all the different bits and pieces are in
separate jobs on tape (eg file1=C:, file2=D:, file3='Microsoft Information
Store' (exchange), file4='SQL Server', file5='System State'). Have you
ever thought about implementing 'Job Groups' in Bacula?

> > It would be nice to be able to override this though, as for at least
> > MSSQL backups, the format on tape is exactly the same as the format
> > stored on disk when you use the "BACKUP DATABASE xxx TO
> > DISK='filename.bak'" command, so as a last resort you could just
> > restore your SQL backup to a plain file and use MSSQL to do the
> > restore from there, eg in disaster recovery mode where you don't have
> > a working plugin yet.
>
> It seems to me that if a particular plugin is not available, the user
> will not be able to restore his data. I believe that you are viewing the
> problem from too narrow a perspective, because in general the kinds of
> plugins that will be written will not be easy to simulate in some sort
> of disaster recovery mode without the plugin. Were it that simple, I
> don't think we would need the plugins.

Maybe.
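To make the fallback concrete, here is a minimal sketch of what I mean (just Python generating the T-SQL a user would run by hand; the database name and dump paths are made-up examples, not anything Bacula produces):

```python
# Sketch of the 'no working plugin' fallback: Bacula restores the MSSQL
# backup stream to a plain file, then the user loads it with MSSQL itself.
# The database name and dump path below are hypothetical examples.

def mssql_fallback_commands(database: str, dump_path: str):
    """Return the T-SQL pair: how the dump was made, and how a user would
    load it by hand after restoring the stream to a plain file."""
    backup = f"BACKUP DATABASE {database} TO DISK='{dump_path}'"
    restore = f"RESTORE DATABASE {database} FROM DISK='{dump_path}'"
    return backup, restore

backup_cmd, restore_cmd = mssql_fallback_commands("Orders", r"C:\dumps\orders.bak")
print(backup_cmd)
print(restore_cmd)
```

The point is that the on-tape format and the on-disk format are identical, so the restore side needs nothing more than a plain file restore.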
I'm not convinced of this yet though... (see my MSSQL example below)

> > The problem with trying to integrate your whole backup (eg files +
> > exchange + mssql) into one job is that each deals with different
> > logical things. A file backup obviously just deals with plain files,
> > but an exchange backup logically deals with storage groups and
> > databases (databases consist of multiple files - normally two - and a
> > storage group has multiple databases, and then multiple log files
> > which hold transactions for all databases in that storage group).
>
> At the very lowest level of the code, you are correct, but in reality a
> "file backup" does not just deal with plain files. It deals with a tree
> of objects. Normally you call those objects files, which is OK, but in
> fact to Bacula they are a whole bunch of different types of objects
> (directories, which form the tree; sockets; character files; block
> files; normal files; FIFOs, ...). There is no limit to the number of
> objects that Bacula can deal with. Each one has a different backup and
> restore method (though the code is not really that well organized). The
> only *current* requirement is that they be in a tree relationship.
>
> I'm not sure why the above is a problem, other than finding a proper
> namespace and browsing the backup objects.

The problem I was attempting to describe is that the files Bacula
currently backs up consist of a small known number of streams (as you say,
depending on what type of 'file' it is). But as long as the directory it
lives in exists, any of these files can be restored on its own without any
problems; a single 'file' (device node, fifo, etc) exists as a single
useful entity.

The files backed up in my latest backup of an Exchange Server are (MIS =
'Microsoft Information Store', FSG = 'First Storage Group' - reduced to
fit in a readable way - I hope):

"
 1. /MIS/FSG/Mailbox Store (CMSERVER1)/
 2. /MIS/FSG/Mailbox Store (CMSERVER1)/DatabaseBackupInfo
 3. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.edb
 4. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.stm
 5. /MIS/FSG/Public Folder Store (CMSERVER1)/
 6. /MIS/FSG/Public Folder Store (CMSERVER1)/DatabaseBackupInfo
 7. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.edb
 8. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.stm
 9. /MIS/FSG/D:\...\E000010B.log
10. /MIS/FSG/
11. /MIS/
"

Ignoring folders, possible restore combinations are:

"
 2. /MIS/FSG/Mailbox Store (CMSERVER1)/DatabaseBackupInfo
 3. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.edb
 4. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.stm
 6. /MIS/FSG/Public Folder Store (CMSERVER1)/DatabaseBackupInfo
 7. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.edb
 8. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.stm
 9. /MIS/FSG/D:\...\E000010B.log
"
(Full Restore)

Or:

"
 2. /MIS/FSG/Mailbox Store (CMSERVER1)/DatabaseBackupInfo
 3. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.edb
 4. /MIS/FSG/Mailbox Store (CMSERVER1)/D:\...\priv1.stm
 9. /MIS/FSG/D:\...\E000010B.log
"
(Just user mailboxes)

Or:

"
 6. /MIS/FSG/Public Folder Store (CMSERVER1)/DatabaseBackupInfo
 7. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.edb
 8. /MIS/FSG/Public Folder Store (CMSERVER1)/D:\...\pub1.stm
 9. /MIS/FSG/D:\...\E000010B.log
"
(Just the public folders)

Any other combination is not valid - you must restore the logfile(s) with
any restore that you do, and you must restore all files in a given
database (.edb and .stm - DatabaseBackupInfo is a metafile that exists to
allow the agent to know the GUID and filenames of the files in the
database before it gets to them when restoring).
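To spell the rule out, here is a little Python sketch of the validity check (paths are shortened versions of the listing above - the elided D:\... parts are left out, so the names are illustrative only):

```python
# Sketch of the selection rule above: a restore set is valid only if it
# includes the log file(s) and treats each database as all-or-nothing
# (DatabaseBackupInfo + .edb + .stm together). Paths are shortened,
# illustrative versions of the example listing.

MAILBOX = {
    "/MIS/FSG/Mailbox Store/DatabaseBackupInfo",
    "/MIS/FSG/Mailbox Store/priv1.edb",
    "/MIS/FSG/Mailbox Store/priv1.stm",
}
PUBLIC = {
    "/MIS/FSG/Public Folder Store/DatabaseBackupInfo",
    "/MIS/FSG/Public Folder Store/pub1.edb",
    "/MIS/FSG/Public Folder Store/pub1.stm",
}
LOGS = {"/MIS/FSG/E000010B.log"}

def valid_restore(selection):
    """True if 'selection' is one of the valid combinations above."""
    selection = set(selection)
    if not LOGS <= selection:
        return False                       # log files are always required
    chosen = selection - LOGS
    if not chosen:
        return False                       # logs alone restore nothing
    for database in (MAILBOX, PUBLIC):
        partial = chosen & database
        if partial and partial != database:
            return False                   # partial database selection
    return chosen <= MAILBOX | PUBLIC      # no unknown files

print(valid_restore(MAILBOX | PUBLIC | LOGS))  # full restore -> True
print(valid_restore(MAILBOX | LOGS))           # just user mailboxes -> True
print(valid_restore(MAILBOX))                  # log file missing -> False
```

This is exactly the kind of constraint that a 'plugin unaware' restore interface cannot enforce on its own.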
Now it may make sense for the exchange agent to internally roll some of
the files together into a single stream, eg create a stream that has the
DatabaseBackupInfo + .edb + .stm files all stored sequentially (and I may
yet do that...), but you still need the database files separate from the
log files so that each database can be selected for restore independently.
As long as the instructions are clear from a user's point of view, a
restore is still pretty straightforward.

This leads into another item to (maybe) add to the list - does a plugin
need to have any influence over how files are selected for restore in the
user interface? This would be really nice for the above case, but a
nightmare to implement, as suddenly you need the plugin running in the
director too. Unless maybe we developed some sort of restore recipe that
would be stored at the director for each backup... maybe just ignore this
whole paragraph for now :)

> > True, but a _lot_ more work for bacula. Although I haven't been
> > following the discussions on 'true' incremental/differential backups
> > so you may already have worked out a nice solution to this.
>
> Yes, I know how to do project #1 "Accurate restoration of
> renamed/deleted files". The only unknowns are some minor details, mainly
> concerning performance.

That last sentence was the bit I was referring to.

> > Your above statement is also true for Exchange right now, as per my
> > previous paragraph. Exchange keeps track internally of when the last
> > full backup was done, so bacula needs to somehow know that it doesn't
> > have all the control it would like over incremental and differential
> > backups.
> >
> > I guess one of the other things a plugin needs to do is tell bacula
> > about its capabilities, eg 'Can do Incremental Backup', 'Can do
> > Differential Backup', etc.
>
> I don't think that is quite the right question. All plugins will have to
> know how to do all implemented backup types.
> It doesn't make sense to do a partial implementation. That said, I
> imagine the plugin may have a certain flexibility in the type of backup
> it does. If an Incremental is requested, there is no real harm if it
> does a Differential other than efficiency ... however, I don't imagine
> that would be a normal case.

Well... one of the things MSSQL (and probably other databases) can do is a
restore to a point in time. Eg suppose I had forgotten the WHERE clause on
a delete and done something stupid like 'DELETE FROM Order_Detail', the
last backup was run 3 hours earlier, and in that time 100 people had been
madly entering orders. That's 300 person-hours of labour down the drain!
What I could do though is:

1. backup the current transaction logs
2. restore the most recent full copy of the database
3. restore all transactions since then, up until right before I issued my
   monumentally stupid DELETE command (this is done via the RESTORE syntax
   in MSSQL).

That repairs the database exactly as it was, and everyone can keep moving
ahead.

Now I imagine that driving all of that from within Bacula would be quite
hard using a 'plugin unaware' user interface, but it might be nice to have
some of the back-end framework in place for when we design it.

Incidentally, how I would get around this with a simpler MSSQL agent is to
simply restore everything to plain files in the filesystem and get MSSQL
to restore from those. But maybe MSSQL is the only agent where it would
make sense to be able to do this...

James
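P.S. For the archives, a sketch of what steps 1-3 might look like if a plugin (or a user by hand) drove them - just Python generating the T-SQL, and the database name, dump paths, and STOPAT timestamp are all made-up examples:

```python
# Sketch of the point-in-time recovery sequence (steps 1-3 above).
# Database name, dump paths, and the STOPAT time are hypothetical.

def point_in_time_restore(db, full_bak, log_bak, stop_at):
    """Return the T-SQL statements, in order, for a point-in-time restore."""
    return [
        # 1. back up the tail of the current transaction log
        f"BACKUP LOG {db} TO DISK='{log_bak}'",
        # 2. restore the most recent full backup; NORECOVERY leaves the
        #    database ready to accept further log restores
        f"RESTORE DATABASE {db} FROM DISK='{full_bak}' WITH NORECOVERY",
        # 3. replay transactions up to just before the bad DELETE
        f"RESTORE LOG {db} FROM DISK='{log_bak}' WITH STOPAT='{stop_at}', RECOVERY",
    ]

for stmt in point_in_time_restore("Orders", r"C:\dumps\orders_full.bak",
                                  r"C:\dumps\orders_log.bak",
                                  "2007-09-02 10:00:00"):
    print(stmt)
```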