From: Henrik J. <he...@sc...> - 2010-04-09 06:35:37
On 04/ 8/10 03:10 PM, Craig Ringer wrote:
> Hi
>
> After discussion on -users about concurrent writes to disk volumes by
> the sd, which raised concerns about fragmentation and impact of
> concurrency on overall write performance, I thought I'd do some testing
> to look into it.

I have spent some time profiling the bacula-sd under production load on
Solaris. Besides checking for signs of lock contention and memory / CPU
usage issues, I also tracked the I/O statistics for each file descriptor
opened within the bacula-sd process while running.

According to my notes there was a noticeable change in throughput when
testing 100 concurrent jobs against 100 separate devices (100 separate
volumes) compared with writing the same jobs against 20 concurrent
volumes. I/O throughput increased with more concurrency.

There was also a noticeable increase in CPU time between those tests
(more concurrency == more CPU time spent by the bacula-sd process), but
nothing that worried me (though it may need further investigation to
fully understand).

The above is specific to our setup, so YMMV on other platforms and
hardware.

> I've just put together a simple tool that uses a specified number of
> threads to write dummy data to files (one per thread) on a volume. It
> optionally uses posix_fallocate(...) to pre-allocate space in
> user-specified chunks up to the total target file size.
>
> I don't do a lot of work in C, and it probably shows; I mostly work with
> C++, Python and Java. It's also a utility and benchmark not a production
> program.
>
> The tool should be useful for testing out the effects of using
> posix_fallocate(...) to pre-allocate space in volumes on various file
> systems, with various levels of write concurrency, where performance is
> limited by storage write speeds. Fragmentation, write throughput, and
> overall write time are of interest.
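For readers who want to try the idea without the original tool, here is a minimal sketch of the write pattern described in the quoted text: one thread per file, each writing dummy data, optionally reserving space chunk by chunk with posix_fallocate. It is not palloc's source. It is written in Python rather than C, and the function names, the 64 KiB write size and the file naming are all assumptions made for illustration. `os.posix_fallocate` is only available on Unix.

```python
import os
import tempfile
import threading

BLOCK = 64 * 1024  # size of each dummy write; an arbitrary choice


def write_dummy_file(path, total_size, chunk_size, use_fallocate):
    """Write total_size bytes of zeros to path.

    If use_fallocate is true, reserve space chunk_size bytes at a time,
    just before the writer reaches each not-yet-reserved region.
    chunk_size must be positive.
    """
    buf = b"\0" * BLOCK
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        allocated = 0
        while written < total_size:
            if use_fallocate and written >= allocated:
                length = min(chunk_size, total_size - allocated)
                os.posix_fallocate(fd, allocated, length)
                allocated += length
            n = min(BLOCK, total_size - written)
            # os.write may write fewer bytes than asked; count what it reports.
            written += os.write(fd, buf[:n])
        os.fsync(fd)
    finally:
        os.close(fd)


def run(directory, threads, total_size, chunk_size, use_fallocate):
    """Start one writer thread per file and wait for all of them."""
    paths = [os.path.join(directory, "vol%d.dat" % i) for i in range(threads)]
    workers = [
        threading.Thread(
            target=write_dummy_file,
            args=(p, total_size, chunk_size, use_fallocate),
        )
        for p in paths
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return paths


if __name__ == "__main__":
    # Small sizes so the demo finishes quickly; real tests need far more data.
    with tempfile.TemporaryDirectory() as d:
        for p in run(d, 3, 1024 * 1024, 256 * 1024, True):
            print(p, os.path.getsize(p))
```

Timing `run(...)` with and without preallocation, and with different chunk sizes, would give a rough equivalent of the comparison described above, although Python's own overhead makes it a poor substitute for a C benchmark when storage is fast.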
>
> The usage summary from the executable should provide very basic
> guidance on its use. It builds from a simple Makefile on any Linux (and
> probably BSD) system. Typical invocation might be:
>
> ./palloc 5 1G 900M y
>
> ( five threads each write 1GB dummy volumes, posix_fallocate()ing 900MB
> chunks ) or
>
> ./palloc 3 100M 1M n
>
> ( three threads each write 100MB files, not using posix_fallocate. The
> chunksize parameter is required but has no effect if posix_fallocate is
> n. I should make that prettier, but can't be bothered. )
>
> e2fsprogs is required for file fragmentation measurement on ext* file
> systems. It expects the e2fsprogs filefrag utility to be in
> /usr/sbin/filefrag. If not found, no fragmentation measurement will be
> done. Fragmentation measurement isn't supported for non ext- systems.
>
> ( Actually, it seems to work for xfs as well, as it provides the same
> ioctl as ext3 for extent examination. Handy. )
>
> I've already found some interesting things, albeit with only brief and
> preliminary testing. The first is that periodically calling
> posix_fallocate(...) to pre-allocate anything short of very large
> chunks of the volume in advance seems to be counter-productive. It's
> slower on ext4 and ext3, for one thing. On xfs it also dramatically
> *increases* fragmentation. I assume that's because posix_fallocate
> forces immediate allocation of the data, overriding xfs's delayed
> allocation logic, which otherwise works astonishingly well.
>
> However, if the expected size of the volume is known in advance, calling
> posix_fallocate() to preallocate that space seems to be a significant
> performance win at least on ext4 and xfs, and drastically reduces
> fragmentation.
>
> Unsurprisingly, it looks like it'd be necessary for the sd to have a
> decent idea of how big the backup volume will need to be when it starts
> creating/appending it in order to be able to help the file system make
> better decisions.
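The case that reportedly pays off, reserving the whole expected volume size once before writing, can be sketched as follows. This is only an illustration in Python, not code from palloc or from the sd. The helper names are hypothetical, and the 512-byte unit for `st_blocks` is an assumption that holds on Linux but is not guaranteed elsewhere.

```python
import os
import tempfile


def preallocate(path, expected_size):
    """Create path if needed and reserve expected_size bytes in one call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # Reserves [0, expected_size) and extends the file to that size.
        os.posix_fallocate(fd, 0, expected_size)
    finally:
        os.close(fd)


def allocated_bytes(path):
    """Disk space actually allocated to path (assumes 512-byte st_blocks)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        vol = os.path.join(d, "volume.dat")
        preallocate(vol, 8 * 1024 * 1024)
        print(os.path.getsize(vol), allocated_bytes(vol))
```

One consequence worth keeping in mind: posix_fallocate grows the file to the reserved size straight away, so anything that appends to such a volume has to track where the real data ends by some other means than the file size.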
>
> I'll play with it some more to do some proper tests tomorrow, time
> permitting. I thought it'd be of interest to have the tool in the mean
> time, though. Please send complaints/abuse/improvements/bugs/cries of
> horror my way.
>
> --
> Craig Ringer

--
Med venlig hilsen / Best Regards

Henrik Johansen
he...@sc...
Tlf. 75 53 35 00

ScanNet Group A/S ScanNet