From: Kern S. <ke...@si...> - 2008-07-31 05:17:52
On Thursday 31 July 2008 00:27:45 T. Horsnell wrote:
> > Hello,
> >
> > On Wednesday 30 July 2008 21:05:12, Kern Sibbald wrote:
> >> On Wednesday 30 July 2008 20:19:20 T. Horsnell wrote:
> >>> Regarding performance:
> >>> We will be backing up a 16TB raid system to LTO4, both hosted by the
> >>> same machine.
> >>> dd'ing a 100GB file from our raid to /dev/null I can achieve
> >>> 400Mbytes/sec.
> >>>
> >>> dd'ing the same file to /dev/nst0 with bs=256*512 I can get
> >>> 100Mbytes/sec. (This drops to 75Mbytes/sec with a blocksize of
> >>> 128*512 ...)
> >>> I get a similar speed if I tar the file to tape with the same
> >>> blocksize.
> >>>
> >>> The documented (streaming?) speed of the drive is 120Mbytes/sec.
> >>
> >> 120MB/sec seems terribly slow for an LTO4 drive.
> >
> > It's certainly without compression (it's something like 80MB/s for
> > LTO3). With compression and good files (database, log, text, etc...)
> > the speed can go to 180 and maybe 200MB/s.
>
> This file was deliberately created full of random numbers to avoid
> compression (I don't know how to set my drives to 'no-compression'
> yet...)
>
> >>> Backing up this file with Bacula (again with bs=256*512) I get
> >>> 70Mbytes/sec
> >
> > Try to make your tests with a /dev/zero file; you will get the biggest
> > throughput.
>
> You're telling me! A file of zeros compresses wonderfully, but is
> hardly realistic.
>
> > When the drive writes at up to 180MB/s, try changing the
> > "Minimum block size" and "Maximum block size" parameters (to something
> > like 256KB and 512KB as Kern said) and set "Maximum File Size = 10G".
>
> I did this with my dd and tar tests, but it made no further improvement.
> I assumed there would be no improvement with Bacula either, so didn't
> try.

Every program is different, and comparing Bacula to tar and dd doesn't
give very meaningful results. If you want to know the raw speed the SD
can achieve without the FD and the DIR, use btape -- see the Tape
Testing chapter of the manual.
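For reference, the block-size and file-mark tuning discussed above lives in the Storage Daemon's Device resource. A minimal sketch using the values floated in this thread -- the device name and paths are hypothetical, and the values are untested here (note also that the Bacula manual warns that volumes written with one block size may not be readable after the sizes are changed, so this is best tried on fresh volumes):

```
Device {
  Name = LTO4-Drive-1              # hypothetical device name
  Media Type = LTO-4
  Archive Device = /dev/nst0
  # Larger blocks cut per-block overhead at the drive:
  Minimum Block Size = 262144      # 256KB
  Maximum Block Size = 524288      # 512KB
  # Write a file mark every 10GB instead of the 1GB default,
  # so the data stream is interrupted less often:
  Maximum File Size = 10G
}
```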
> I didn't know about 'Maximum File Size', but it wouldn't have
> affected my 100GByte test.

I don't understand the above statement. IMO, as Eric said, it could well
have changed your 100GB test for Bacula.

> > The Maximum File Size writes a file mark every 1G by default; this
> > breaks the data flow and reduces the throughput.
> >
> > If the data stream isn't constant (with small files for example) your
> > LTO4 can become slower than an LTO1. To avoid this you can use
> > spooling.
>
> I can't predict the eventual file-mix, so I may use spooling. My
> experiments so far have only been with the single 100GByte file.
>
> >> Well, two comments here:
> >>
> >> 1. You probably should at least double your Bacula buffer size to
> >> 256KB and possibly up it to 512KB.
>
> OK, I'll play.
>
> >> 2. With an LTO3 tape drive it is possible with multiple simultaneous
> >> jobs and very careful hardware tuning to get 150MB/sec through Bacula
> >> across a network. With a good hardware setup and an LTO4 even higher
> >> throughput should be possible. You may be limited by a hardware
> >> bottleneck somewhere or possibly by the FD being unable to feed the
> >> SD fast enough ...
>
> There is *no* network involved. The filesystem being backed up and the
> tapechanger are both on the same box, so DIR, FD and SD all run on the
> same box.

As I said before, there *is* a network involved. Bacula always transmits
all the data between the daemons using TCP/IP. If you turn off
networking on your machine and try to run Bacula, it won't work. That is
why comparing Bacula to dd and tar doesn't give very useful numbers ...

> >>> (from Bacula's email:
> >>>   FD Files Written:  1
> >>>   SD Files Written:  1
> >>>   FD Bytes Written:  100,000,000,000 (100.0 GB)
> >>>   SD Bytes Written:  100,000,000,106 (100.0 GB)
> >>>   Rate:              70028.0 KB/s)
> >>>
> >>> 16TBytes at 70Mbytes/sec is 63 hours
> >
> > If you define 2 jobs in parallel on two drives (with different pools),
> > you can go up to 240MB/s (without compression).
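As a quick sanity check on the figures quoted above, 16 TB at the job's reported rate of 70,028 KB/s does come out near 63 hours (taking 1 KB = 1000 bytes; integer shell arithmetic rounds down):

```shell
# Estimate total backup time: 16 TB at Bacula's reported 70028.0 KB/s.
bytes=16000000000000                 # 16 TB
rate=$((70028 * 1000))               # bytes per second
seconds=$((bytes / rate))
hours=$((seconds / 3600))
echo "${hours} hours"                # prints "63 hours"
```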
>
> Yes, but as I said before, this means that each job would have to be
> given a part of the whole filesystem (true?), and the size of those
> parts may vary enormously, so one job may have very little to do whilst
> the other has a lot to do. So one drive would be mostly idle while the
> other is always busy.
>
> As always, thanks everyone for your input.
>
> Terry.
>
> >>> Using two copies of this file, and backing each one up with two
> >>> separate jobs onto two separate tapedrives achieves about the same
> >>> rate *per drive*, so I think that for me it would be worthwhile.
> >>> And for users who are trying to make use of a number of slower,
> >>> cheaper drives instead of fast drives in an expensive tapechanger,
> >>> this would be even more beneficial.
> >>
> >> Perhaps, but even assuming the project were approved, you would need
> >> to find someone to implement it ...
> >>
> >>> I agree that such rates might not be achievable over a network.
> >>> There is no network involved for our specific case,
> >>
> >> With Bacula there is *always* a network involved ...
> >>
> >> Regards,
> >>
> >> Kern
> >>
> >>> but our emerging 10GbE net would easily sustain multiple
> >>> 70Mbyte/sec streams.
> >>>
> >>> Cheers,
> >>> Terry
> >>>
> >>>> Hello,
> >>>>
> >>>> After having discussed this a bit on the list, and re-reading your
> >>>> note here, I realize that yes, splitting to separate tapes would be
> >>>> possible on a file-by-file basis. However, that feature would work
> >>>> much better if the multiple threads for a given job were
> >>>> implemented, which is already a project listed in the Projects file.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Kern
> >>>>
> >>>> On Monday 21 July 2008 19:55:20 T. Horsnell wrote:
> >>>>> OK, thanks. And sorry to have pestered the development *and* the
> >>>>> users list about this. I just wanted to be sure you understood
> >>>>> what I meant.
> >>>>>
> >>>>> Actually, I would have said that striping was the process of
> >>>>> spreading a single *file* across multiple drives simultaneously
> >>>>> (just like disk-striping). To my mind, spreading the files of a
> >>>>> single *job* across multiple drives doesn't mean that part of each
> >>>>> *file* is written to multiple drives, but instead that when the
> >>>>> storage daemon is writing to a device which has been declared as
> >>>>> an autochanger with multiple drives, it would take the next file
> >>>>> from its input stream and write it to an idle drive in that
> >>>>> autochanger.
> >>>>>
> >>>>> I guess if the storage-daemon scheme has one thread per tapedrive,
> >>>>> it doesn't lend itself to this mode of operation.
> >>>>>
> >>>>> Thanks again,
> >>>>> Terry
> >>>>>
> >>>>> Kern Sibbald wrote:
> >>>>>> Yes, sorry, I did not understand. Bacula does not have the
> >>>>>> ability to write a single job to multiple drives -- normally that
> >>>>>> is called striping. It is unlikely that it will be implemented
> >>>>>> any time in the near future.
> >>>>>>
> >>>>>> On Monday 21 July 2008 18:27:43 T. Horsnell wrote:
> >>>>>>> Thank you for that quick reply, and once again, apologies for
> >>>>>>> the interruption, but I don't want to split the backup across
> >>>>>>> multiple jobs (with part of the filesystem being handled by one
> >>>>>>> job and part by the other), because however I make the split,
> >>>>>>> the content (and hence the size) of each part of the filesystem
> >>>>>>> will be continually changing (this is a 16TB multiuser raid
> >>>>>>> system), and so one tapedrive may well be mostly idle whilst the
> >>>>>>> other one is continually busy. So I want, if possible, to do
> >>>>>>> this with a single job.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Terry
> >>>>>>>
> >>>>>>> Kern Sibbald wrote:
> >>>>>>>> Hello,
> >>>>>>>>
> >>>>>>>> We generally do not supply support help on this list, but I
> >>>>>>>> will give a few tips ...
> >>>>>>>>
> >>>>>>>> Bacula has been able to write to multiple drives simultaneously
> >>>>>>>> for a very long time now -- many years and many versions.
> >>>>>>>>
> >>>>>>>> The simplest way to do it is to use different pools for each
> >>>>>>>> job.
> >>>>>>>>
> >>>>>>>> A not so satisfactory way of doing it is to use "Prefer Mounted
> >>>>>>>> Volumes = no". I don't recommend this as it leads to many
> >>>>>>>> operational problems.
> >>>>>>>>
> >>>>>>>> In general, if you are backing up raid disks, you should be
> >>>>>>>> able to tune your hardware so that it will write approximately
> >>>>>>>> 150 MB/sec with Bacula to an LTO3 drive, and so splitting jobs
> >>>>>>>> is not generally necessary. Getting your hardware tuned to run
> >>>>>>>> at those speeds is not easy and requires professional help.
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>>
> >>>>>>>> Kern
> >>>>>>>>
> >>>>>>>> On Monday 21 July 2008 17:20:11 T. Horsnell wrote:
> >>>>>>>>> Apologies for pestering the developers list, but I can't
> >>>>>>>>> determine from the user docs whether what I want to do is
> >>>>>>>>> failing because I'm doing it wrongly, or simply that it's not
> >>>>>>>>> supported.
> >>>>>>>>>
> >>>>>>>>> I want to define a single job which will back up a single
> >>>>>>>>> (big) raid filesystem to an autochanger which contains
> >>>>>>>>> multiple tapedrives, and I want this job to use all the
> >>>>>>>>> tapedrives simultaneously. This would seem to me to be a
> >>>>>>>>> pretty standard requirement, but I can't get it to work with
> >>>>>>>>> Bacula version 2.4.1.
> >>>>>>>>>
> >>>>>>>>> Looking at the Storage Daemon section (6.4) in the new
> >>>>>>>>> developers docs for 2.5.2 (dated 20th July 2008!)
> >>>>>>>>> I see that this may not yet be possible:
> >>>>>>>>>
> >>>>>>>>> ---cut---
> >>>>>>>>> Today three jobs (threads), two physical devices; each job
> >>>>>>>>> writes to only one device:
> >>>>>>>>> Job1 -> DCR1 -> DEVICE1
> >>>>>>>>> Job2 -> DCR2 -> DEVICE1
> >>>>>>>>> Job3 -> DCR3 -> DEVICE2
> >>>>>>>>>
> >>>>>>>>> To be implemented: three jobs, three physical devices, but
> >>>>>>>>> Job1 is writing simultaneously to three devices:
> >>>>>>>>>
> >>>>>>>>> Job1 -> DCR1 -> DEVICE1
> >>>>>>>>>      -> DCR4 -> DEVICE2
> >>>>>>>>>      -> DCR5 -> DEVICE3
> >>>>>>>>> Job2 -> DCR2 -> DEVICE1
> >>>>>>>>> Job3 -> DCR3 -> DEVICE2
> >>>>>>>>> ---cut---
> >>>>>>>>>
> >>>>>>>>> Is what I want possible in 2.5.2, or should I wait?
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Terry.
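To close the loop on the workaround available today: the two-jobs/two-pools arrangement Kern and Eric describe would be configured on the Director side roughly as below. This is only a sketch: all resource, client and FileSet names are made up, the FileSet split inherits exactly the imbalance problem Terry raises, and `Spool Data` is the spooling option Eric mentions for keeping the drives streaming:

```
# bacula-dir.conf sketch (untested; names are hypothetical)
Pool {
  Name = DriveA-Pool
  Pool Type = Backup
}
Pool {
  Name = DriveB-Pool
  Pool Type = Backup
}

# Two jobs, each covering part of the 16TB filesystem and bound to a
# different pool, so the SD can run them on two drives simultaneously:
Job {
  Name = BigRaid-PartA
  Type = Backup
  Client = raidhost-fd
  FileSet = RaidPartA
  Storage = Autochanger
  Pool = DriveA-Pool
  Spool Data = yes          # smooths out small-file streams to the drive
  Messages = Standard
}
Job {
  Name = BigRaid-PartB
  Type = Backup
  Client = raidhost-fd
  FileSet = RaidPartB
  Storage = Autochanger
  Pool = DriveB-Pool
  Spool Data = yes
  Messages = Standard
}
```

Whether this beats a single job depends entirely on how evenly the two FileSets split the data, which is precisely the objection raised in the thread.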