From: Eric L. G. <er...@ba...> - 2001-07-03 02:31:24
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've put the preliminary architecture documentation up at http://tapioca.sourceforge.net . That should give you a good idea of how this is structured. Aside from the 'pudding', the rest of the architecture should be somewhat familiar, except for the parts inflicted by Java (like the tape server being multi-threaded rather than multi-process).

Now it's time to decide on the archive format. We need to get this right because changing it after we've started making backups will be a major pain in the %$#@@#.

Some criteria:

1. Must be able to multiplex (and demultiplex) multiple streams into the same tape file. This indicates that the input needs to be blocked in a structured way, rather than just being a plain old stream of data, and that each block needs to be tagged with a stream ID.

2. Archives should have a header that includes the stream IDs (and descriptive labels) of every stream multiplexed into them. Otherwise, doing a manual restore of a multiplexed archive (e.g. to restore the central authority if it crashes) will be horribly difficult. When restoring by hand, a stream index can be used in place of the full stream ID (e.g. we can tell it "--stream=1" and it knows to only dump stuff labeled with stream id=a00523134.342), and the restore will strip out everything except that stream.

3. The tape format should not require doing a MT_TELL for every bloody block written to tape, only for blocks that actually need it (i.e., blocks that contain the beginning of a piece of data logged into the database). This tends to indicate that blocks need tagging with a "type" field.

4. The format should be able to handle two things other than raw data blocks: a) producing location information suitable for logging into the central authority's location database for use in future restores, and b) holding any OS-specific data needed to fully restore the file.

5.
The stream format will have to hold data about what kind of writer produced the data in the file, so that the file logger can properly account for the differences in display format and pass that data upstream to the user interface. We don't want to force Unix filename format onto Windows, Mac, etc.! Similarly, if we're backing up a database file dump stream (one possible data source), we don't want to have to pretend that it contains Unix-structured data, and we need to know it came from a database stream dumper rather than from a filesystem dumper, so that when we go to restore it we know what restorer to use! So each type of data stream creator will need a unique creator ID of some sort to tell us what kind of widget created the data stream, and this gets put into the header so that we can grab it and know what to restore this data stream with.

6. For volume changes, the full header information should be replicated on the new volume, along with the number of the volume we're working on etc., so that if we have a tape that is a volume 2, we have more of a chance of associating it with the correct volume 1 if we have to do this by hand.

7. Fixed-size blocks, or variable-sized blocks? Fixed-size blocks, like 'tar' uses, are easy to deal with, and can be easily packed into larger buffers (as long as said larger buffers are a multiple of the blocksize in length). However, each block adds overhead. If the block size is too small, the overhead becomes too large a percentage of the block. If the block size is too large, then we have too much wasted space at the end of the block. Variable-sized blocks could be used, but we could require that these be packed into a fixed-size buffer of some large size (perhaps 64K or 128K) such that each buffer begins with a block and no block spans buffers. This is a pain, but results in less wasted space and thus better performance in the end.
Note that if we limit the variable-sized blocks to 32k in size, we can represent the size of the block with only 2 bytes in the block's header.

8. Checksumming streams: We should probably only worry about checksumming buffer-sized chunks of data, not individual blocks of structured data. Setup time for the CRC calculations can thus be reduced, as can the overhead of the CRC checksum itself.

9. I think Mr. Fish mentioned that we probably want an "end of file" block in file streams so that we know we have reached the end of a file. This simplifies some programming, I guess. Did I misread the message?

Okay, I think this is enough to think about. I am especially curious to know what you think about the notion of putting variable-sized blocks into bigger buffer-sized blocks. I think this solves many problems (we never really know how much OS-specific data is going to be in file headers, for example), but it is somewhat more complex than fixed-size blocks like 'tar', and yes, there is still some overhead in some cases (if we don't have enough space at the end of a buffer for a block, that space is wasted).

Comments? Once we have tossed around the criteria, we can come up with some possible tape layouts, and then I'll be happy to write a spec for it and put it into the CVS archive. I'm currently working on a spec/RFC type template that we can use for that (it's not in the CVS archive yet though).

- --
Eric Lee Green mailto:er...@ba...
BadTux: http://www.badtux.org
GnuPG public key at http://badtux.org/eric/eric.gpg

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7QSwj3DrrK1kMA04RAvIvAJ0bz1uc+MWd8fHLd4BGJngMl8lA7QCeNb8L
NT6WVcEtcAOYFbasYOEFCcU=
=2S0P
-----END PGP SIGNATURE-----
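
To make criteria 1, 3, 7 and 8 a bit more concrete, here is a rough sketch of what "variable-sized blocks packed into fixed-size buffers" might look like. It is written in Python purely for brevity and is not a proposed layout: the 5-byte header (type, stream index, 2-byte length), the block type numbers, the zero-fill padding convention and the CRC-32 trailer on each buffer are all assumptions made up for this illustration.

```python
import struct
import zlib

# Hypothetical layout, for illustration only: 1-byte type, 2-byte stream
# index, 2-byte payload length, all big-endian (5 bytes total).
HEADER = struct.Struct(">BHH")
CRC = struct.Struct(">I")          # assumed per-buffer CRC-32 trailer
BUFFER_SIZE = 64 * 1024            # the message suggests 64K or 128K
MAX_PAYLOAD = 32 * 1024 - 1        # keeps the length within a 2-byte field

# Hypothetical block types; 0 doubles as "rest of buffer is padding".
TYPE_PAD, TYPE_DATA, TYPE_FILE_START, TYPE_EOF = 0, 1, 2, 3


def _seal(body, buffer_size):
    """Zero-fill the body to full size and append a CRC-32 over it."""
    padded = bytes(body) + bytes(buffer_size - CRC.size - len(body))
    return padded + CRC.pack(zlib.crc32(padded))


def pack_blocks(blocks, buffer_size=BUFFER_SIZE):
    """Pack (type, stream, payload) blocks so that no block spans buffers."""
    capacity = buffer_size - CRC.size
    buffers, current = [], bytearray()
    for block_type, stream, payload in blocks:
        needed = HEADER.size + len(payload)
        if len(payload) > MAX_PAYLOAD or needed > capacity:
            raise ValueError("payload too large for one block")
        if len(current) + needed > capacity:
            # Not enough room left: the tail of this buffer is wasted.
            buffers.append(_seal(current, buffer_size))
            current = bytearray()
        current += HEADER.pack(block_type, stream, len(payload)) + payload
    if current:
        buffers.append(_seal(current, buffer_size))
    return buffers


def unpack_stream(buffers, wanted_stream):
    """Demultiplex: yield (type, payload) for one stream, checking each CRC."""
    for buf in buffers:
        body, (crc,) = buf[:-CRC.size], CRC.unpack(buf[-CRC.size:])
        if zlib.crc32(body) != crc:
            raise ValueError("buffer checksum mismatch")
        offset = 0
        while offset + HEADER.size <= len(body):
            block_type, stream, length = HEADER.unpack_from(body, offset)
            if block_type == TYPE_PAD:
                break              # zero fill: nothing more in this buffer
            offset += HEADER.size
            if stream == wanted_stream:
                yield block_type, body[offset:offset + length]
            offset += length
```

With a deliberately tiny 64-byte buffer, two interleaved streams end up in two buffers, and `list(unpack_stream(buffers, 1))` returns only the blocks tagged with stream index 1. Only the blocks of a hypothetical "file start" type would need a tape position lookup, which is the point of the type field in criterion 3.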