From: Tuomas J. <tj...@so...> - 2008-02-29 11:48:08
Hello,

On Tue, Feb 19, 2008 at 05:13:04PM -0500, John Stoffel wrote:
>
> Kern> Mark today in your calendar. Bacula just did its first backup
> Kern> and restore of a MySQL database using a plugin. I did it with
> Kern> using a simplistic "pipe" plugin.
>
> Congrats!
>
> Kern> The operation consisted of adding the following line to the
> Kern> Include section of the FileSet:
>
> Kern>           1     2                   3                                      4
> Kern> Plugin = "bpipe:/@MYSQL/regress.sql:mysqldump -f --opt --databases regress:mysql"
>
> Kern> This plugin line goes in the FileSet section where you have your
> Kern> File = xxx lines, and for this plugin is composed of 4 fields
> Kern> separated by colons (I've indicated the field numbers above the
> Kern> Plugin line for reference only.
>
> Ugh... sorry to be negative, but could we spend some time coming up
> with a nicer syntax please?
>
> Kern> Field 1 specifies a specific plugin name (in this case bpipe).
>
> Ok. But would it make more sense to have a plugin { ... } resource block
> instead in the FileSet?
>
> Kern> Field 2 specifies the namespace (in this case the pseudo
> Kern> path/filename under which the backup will be saved -- this
> Kern> will show up in the restore tree selection).
>
> Hmm... what are the limits in Bacula's design in terms of namespace
> support? Does it strictly follow the unix rooted tree, or does it
> also support the notion of drive letters as in PCs? If so, why not
> just extend the drive letter to be:
>
> DB:/MYSQL[345N]/<database>/<table>
>
> instead? Maybe the version of mysql doesn't matter, but might be
> useful if you try to restore a version 5 mysql dump onto a version 3
> server to get a warning.

What about developing the notion of plugins, and of what is being
backed up, even further? I think that with a plugin architecture, it
would be better to get rid of file-oriented backups. Instead, a notion
of entities would better describe what is being backed up. These
entities would not require a globally matching naming scheme, i.e.
one would not be forced to try to represent all backed-up entities as
Unix file names with funny prefixes. Databases are not logically files.
Instead, the naming scheme, the configuration that defines what to back
up along with the needed metadata, and the methods for backing up and
restoring data would be implemented by various plugins.

A plugin architecture would be needed for both the Director and the
File Daemon. The Director plugins would handle the backup configuration
of the plugin (think of the FileSet block in the current Director
configuration) and interfacing with the catalog, or alternatively
provide plugin-specific helper hooks for the Director core code
accessing the catalog. File Daemon plugins would handle the actual
backing up and restoring of the data in a plugin-specific way.

In practice, this would mean that the current feature set of Bacula
would be implemented by a Filesystem plugin. After the plugin API is
defined, it would be quite trivial to move the current Bacula backup
and restore code to the Filesystem plugin. The naming scheme of the
Filesystem plugin would be the current way of naming files (/path on
Unix, DriveLetter:/Path on Windows). So the backup entity in the
Filesystem plugin case is a single file residing on a filesystem. The
configuration of the Filesystem plugin would be done using the familiar
FileSet block in the Director configuration. On the File Daemon side,
too, the current code for providing the data to be backed up to the
Storage Daemon (reading file contents from the file system and so on)
and for writing restored data in usable form (writing files on the file
system with the given content) could be re-used pretty much as-is in
the File Daemon Filesystem plugin.

As for handling databases natively, the plugins would work like this
(I'm using PostgreSQL as an example, but this is not tied to PostgreSQL
in any way; plugins for other RDBMSs would be pretty much the same at
the high level I'm talking about).
In the Director, there would be a configuration block that specifies
the database resources to be backed up (but this block would not be
called a FileSet; remember that handling of the block is implemented by
each plugin, and the PostgreSQL plugin has nothing to do with the
Filesystem plugin). These resources include at least the name of the
database and/or tables to be backed up and the usual info needed to
connect to the PostgreSQL server (hostname, port, user, password and so
on). This info is delivered to the configured File Daemon, which uses
it to actually read the backup data from the database. So the
PostgreSQL instance to be backed up does not need to reside on the same
machine that is running the File Daemon. For instance, you can use just
one File Daemon in your institution to back up several PostgreSQL
servers running on distinct hosts just fine, since the File Daemon
PostgreSQL plugin uses the native PostgreSQL API to connect to the
actual server as specified in the Director configuration. The Director
does not know anything about what or where the actual PostgreSQL server
is, and it does not need to know; it's just instructed to back up some
database using this or that File Daemon, which interfaces with the
actual service.

As for implementing the PostgreSQL File Daemon plugin, a lot of code
for backing up and restoring could be borrowed from the pg_dump and
psql tools of the PostgreSQL distribution.

The backup entity of the PostgreSQL plugin would be the name of the
database and/or tables. The pair of the database/table name and the
configured File Daemon is unique within the domain of the PostgreSQL
plugin (just as with files: you can have many /etc/passwd files backed
up, and you can have many databases with the same name backed up, but
the /etc/passwd on a certain File Daemon is unique among all the
/etc/passwd's).
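
To make the entity idea a bit more concrete, here is a rough sketch in
Python. Everything in it is hypothetical: the Entity record, the Plugin
base class, the plugin names, the File Daemon names and the config keys
are made up for illustration and are not part of any Bacula API. The
"database.table" string is an arbitrary placeholder too, not a proposed
format. The sketch only shows that a catalog key of (plugin, File
Daemon, entity name) stays unique without the name having to look like
a file path.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    """One backed-up thing; the whole triple acts as the catalog key."""
    plugin: str       # plugin that owns the naming scheme
    file_daemon: str  # File Daemon that performs the backup
    name: str         # opaque to the core; only the plugin interprets it


class Plugin(ABC):
    """Hypothetical plugin interface (an assumption, not Bacula's API)."""
    plugin_name: str

    @abstractmethod
    def entities(self, file_daemon: str, config: dict) -> List[Entity]:
        """Turn a plugin-specific config block into backup entities."""


class FilesystemPlugin(Plugin):
    plugin_name = "filesystem"

    def entities(self, file_daemon, config):
        # Entity names are ordinary paths, as in today's FileSet.
        return [Entity(self.plugin_name, file_daemon, path)
                for path in config["files"]]


class PostgreSQLPlugin(Plugin):
    plugin_name = "postgresql"

    def entities(self, file_daemon, config):
        # Placeholder name format; it need not resemble a file name.
        return [Entity(self.plugin_name, file_daemon,
                       f"{config['database']}.{table}")
                for table in config["tables"]]


# The same names backed up via two File Daemons remain distinct entries.
catalog = set()
fs, pg = FilesystemPlugin(), PostgreSQLPlugin()
for fd in ("fd-alpha", "fd-beta"):
    catalog.update(fs.entities(fd, {"files": ["/etc/passwd"]}))
    catalog.update(pg.entities(fd, {"database": "sales",
                                    "tables": ["orders"]}))
```

The core never parses Entity.name here; it only stores and compares the
triple, which is the property the proposal relies on.
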
But the main point is that the entry in the catalog for a PostgreSQL
database does not need to resemble a valid file name in any way (what
the string would actually look like is a trivial technical detail, so
I'm not proposing any format here).

Of course, this is not a simple thing to design and implement. There
would be no universally known grammar for the Director and File Daemon
configuration files, for instance. The set of available keywords
depends on which plugins are loaded, and since anybody is able to
implement a new plugin using the public plugin API, the number of
possible keywords is unbounded. Also, the catalog database schema would
need a complete re-design. For instance, the filename, file and path
tables in their current form would make sense only for the Filesystem
Director plugin.

Ideas?

Best Regards,
-- 
Tuomas Jormola <tj at solitudo.net>