From: Kern S. <ke...@si...> - 2007-04-28 20:53:27
On Friday 27 April 2007 21:19, Vladimir Doisan wrote:
> What came out of this project? I was testing GZIP compression of the
> backups and was hoping that bacula would have other algorithms available,
> such as 7zip (as a side note - this would complement nicely the
> server-side compression module).

From what I remember, at some point someone (perhaps it was Lee Lists)
submitted a patch or a partial patch that I looked at and suggested a
number of changes. I never got a response.

As Landon pointed out recently, the code has been patched a number of
times, and at some point, possibly before adding yet another stream, we
will need to refactor the code, so exactly how the inclusion of any new
algorithm (such as 7-zip) is handled needs to be carefully considered.

> Thanks for the info
>
> Kern Sibbald wrote:
> > On Monday 20 February 2006 20:11, Lee Lists wrote:
> >> Hi Kern,
> >>
> >> I looked through the code and tried to figure out how to extend
> >> bacula in such a way as to include compression plugins.
> >
> > Yes, the current Bacula is not really set up for plugins.
> >
> >> So I think I need 3 stream defines, say:
> >>
> >> STREAM_COMPRESSED_DATA
> >> STREAM_SPARSE_COMPRESSED_DATA
> >> STREAM_WIN32_COMPRESSED_DATA
> >>
> >> and some room to qualify the stream, i.e. the compression algorithm
> >> (and perhaps plugin-specific data).
> >
> > Contrary to the problem of how to deal with "plugins" within Bacula,
> > having a single stream contain any number of different data streams,
> > all handled by the same code, is easy. Each stream is just a stream
> > of bytes, so for example in a stream STREAM_COMPRESSED_DATA, you
> > could devote the first 100 bytes of the stream to information on the
> > buffer sizes, the compression algorithm, ... After that you could
> > follow the "100 byte header" with any data you want. Obviously, the
> > first 100 bytes could be any size you want.
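The fixed-size header idea above can be sketched as follows -- a hypothetical layout, not Bacula's actual on-disk format. The field names and algorithm ids are invented for illustration; the point is that every integer is packed with an explicit size in network byte order:

```python
import struct

# Hypothetical 100-byte header for a STREAM_COMPRESSED_DATA stream.
# Big-endian ("network order") with explicit field sizes, so any
# machine can read what any other machine wrote.
HEADER_FMT = ">I I I 88s"  # algorithm id, header version, uncompressed length, reserved
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 100 bytes

COMPRESS_GZIP = 1  # illustrative ids, not Bacula's real constants
COMPRESS_LZMA = 2

def pack_header(algo, version, uncompressed_len):
    """Build the 100-byte header that precedes the compressed payload."""
    return struct.pack(HEADER_FMT, algo, version, uncompressed_len, b"\0" * 88)

def unpack_header(buf):
    """Parse the header back out of the front of a stream buffer."""
    algo, version, uncompressed_len, _reserved = struct.unpack(
        HEADER_FMT, buf[:HEADER_LEN])
    return algo, version, uncompressed_len

hdr = pack_header(COMPRESS_LZMA, 1, 65536)
assert len(hdr) == HEADER_LEN
assert unpack_header(hdr) == (COMPRESS_LZMA, 1, 65536)
```

Because the reserved tail is part of the fixed format, new fields can be added later without changing the header size or breaking old readers.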
> > The only restriction (more a condition than a restriction) is that
> > all data you write, including headers, must be byte-order independent
> > and integer-byte-size independent -- i.e. able to be read/written by
> > big/little-endian machines as well as 32/64-bit machines. Bacula
> > already has several ways of handling that -- see the serial routines,
> > for example.
> >
> >> My first concern is not to break the existing on-disk format, so my
> >> question is: where can I store the compression algorithm if not in
> >> the "stream" field?
> >
> > If you don't want to store the algorithm in the data stream as I
> > indicated above, you can always define another stream such as
> > STREAM_COMPRESSION_ALGORITHM. Then you don't have to worry about
> > changing the size or separating a "header" out from
> > STREAM_COMPRESSED_DATA.
> >
> > You are free to define the data within a stream that you create any
> > way you want. No code but your own will attempt to look at the
> > contents of the stream. The SD just receives the bytes and writes
> > them to tape in the order they come.
> >
> >> In the extended attributes? If not, where else?
> >
> > It may be possible that the 7-zip code already handles these kinds of
> > problems, without using the Bacula mechanisms. It would be something
> > to look at.
> >
> > Best regards,
> >
> > Kern
> >
> >> Regards,
> >> Lee
> >>
> >> Kern Sibbald wrote:
> >>> On Monday 20 February 2006 09:44, Michel Meyers wrote:
> >>>> Lee Lists wrote:
> >>>>> Of course I will take a look, but to my knowledge lzo is, up to
> >>>>> now, the fastest compression algorithm in the wild.
> >>>>>
> >>>>> Regards,
> >>>>> Lee
> >>>>>
> >>>>> Kern Sibbald wrote:
> >>>>>> [...]
> >>>>>> easier than now. Second, it seems to me that two things are
> >>>>>> needed:
> >>>>>> 1. a compression algorithm that is faster than the current zlib.
> >>>>>> 2. a compression algorithm that compresses better than the
> >>>>>> current zlib.
> >>>>>>
> >>>>>> Without having yet done any detailed study of the question, it
> >>>>>> seems to me that the code distributed by the 7-zip project most
> >>>>>> likely satisfies these desires.
> >>>>>>
> >>>>>> Would you be interested in looking at the 7-zip code and giving
> >>>>>> us your evaluation of it relative to the lzo code that you have
> >>>>>> used?
> >>>>>>
> >>>>>> Best regards,
> >>>>>>
> >>>>>> Kern
> >>>>
> >>>> Just FYI: the 7-zip algorithm is called LZMA as far as I know.
> >>>
> >>> 7-zip is a project (www.7-zip.org), which is a suite of compression
> >>> algorithms from what I understand. Their "main" compression
> >>> algorithm is an enhanced form of LZMA -- at least that is what I
> >>> understand from reading their web site. What interests me the most
> >>> about their site is that they claim to have a programmer's
> >>> interface to many different algorithms.
> >>>
> >>> lzo may well be the fastest algorithm, but I would be surprised if
> >>> it does the most compression. I think we need a range of
> >>> possibilities for users depending on their application (i.e. fast,
> >>> or high compression, or some trade-off).
> >>>
> >>>> A quick Google search brought up these links to benchmarks:
> >>>> http://www.linuxjournal.com/node/8051/
> >>>> http://tukaani.org/lzma/benchmarks
> >>>>
> >>>> Not sure how valid those are, but if I interpret them correctly,
> >>>> LZMA is quite slow and very memory hungry but does achieve very
> >>>> good compression ratios.
> >>>>
> >>>> Greetings,
> >>>> Michel
> _______________________________________________
> Bacula-devel mailing list
> Bac...@li...
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
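The fast-versus-small trade-off debated in this thread (lzo/zlib for speed, LZMA for ratio) is easy to get a feel for with Python's standard-library bindings: zlib for the algorithm Bacula uses today, and lzma for the family the 7-zip project popularized. The test data, preset levels, and timings below are purely illustrative; real ratios and speeds depend entirely on the input:

```python
import lzma
import time
import zlib

# Highly compressible test data -- repetitive text, loosely resembling
# the log-like content that backup streams often carry.
data = b"the quick brown fox jumps over the lazy dog\n" * 20000

for name, compress in [("zlib", lambda d: zlib.compress(d, 6)),
                       ("lzma", lambda d: lzma.compress(d, preset=6))]:
    t0 = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - t0
    print(f"{name}: {len(data)} -> {len(out)} bytes "
          f"(ratio {len(data) / len(out):.1f}x) in {elapsed * 1000:.1f} ms")
```

On typical text-heavy input lzma compresses noticeably smaller but takes longer and uses far more memory than zlib, which matches the benchmark links Michel posted.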