From: Andrew M. <ak...@zi...> - 2002-07-11 02:59:43
|
Hanna Linder wrote: > > ... > Andrew and Martin, > > I ran this updated patch on 2.5.25 with dbench on > the 8-way with 4 Gb of memory compared to clean 2.5.25. > I saw a significant improvement in throughput about 15% > (averaged over 5 runs each). Thanks, Hanna. The kernel compile test isn't a particularly heavy user of copy_to_user(), whereas with RAM-only dbench, copy_*_user() is almost the only thing it does. So that makes sense. Tried dbench on the 2.5G 4xPIII Xeon: no improvement at all. This thing seems to have quite poor memory bandwidth - maybe 250 megabyte/sec downhill with the wind at its tail. > Included is the pretty picture (akpm-2525.png) the > data that picture came from (akpm-2525.data) and the raw > results of the runs with min/max and timing results > (2525akpmkmaphi and 2525clnhi). > I believe the drop at around 64 clients is caused by > memory swapping leading to increased disk accesses since the > time increased by 200% in direct correlation with the decreased > throughput. Yes. The test went to disk. There are two reasons why it will do this: 1: Some dirty data was in memory for more than 30-35 seconds or 2: More than 40% of memory is dirty. In your case, the 64-client run was taking 32 seconds. After that the disks lit up. Once that happens, dbench isn't a very good benchmark. It's an excellent benchmark when it's RAM-only though. Very repeatable and hits lots of code paths which matter. You can run more clients before the disk I/O cuts in by increasing /proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_*_ratio. The patch you tested only uses the atomic kmap across generic_file_read. It is reasonable to hope that another 15% or morecan be gained by holding an atomic kmap across writes as well. On your machine ;) Here's what oprofile says about `dbench 40' with that patch: c0140f1c 402 0.609543 __block_commit_write c013dfd4 413 0.626222 vfs_write c01402cc 431 0.653515 __find_get_block c013a895 472 0.715683 .text.lock.highmem c017fe30 494 0.749041 ext2_get_block c012cef0 564 0.85518 unlock_page c013ee80 564 0.85518 fget c01079f4 571 0.865794 apic_timer_interrupt c01e8ecc 594 0.900669 radix_tree_lookup c013da90 597 0.905218 generic_file_llseek c01514b4 607 0.92038 __d_lookup c0106ff8 687 1.04168 system_call c013a02c 874 1.32523 kunmap_high c0148388 922 1.39801 link_path_walk c0140b00 1097 1.66336 __block_prepare_write c01346d0 1138 1.72552 rmqueue c01127ac 1243 1.88473 smp_apic_timer_interrupt c0139eb8 1514 2.29564 kmap_high c0105368 6188 9.38272 poll_idle c012d8a8 9564 14.5017 file_read_actor c012ea70 21326 32.3361 generic_file_write Not taking a kmap in generic_file_write is a biggish patch - it means changing the prepare_write/commit_write API and visiting all filesystems. The API change would be: core kernel no longer holds a kmap across prepare/commit. If the filesystem wants one for its own purposes then it gets to do it for itself, possibly in its prepare_write(). I think I'd prefer to get some additional testing and understanding before undertaking that work. It arguably makes sense as a small cleanup/speedup anyway, but that's not a burning issue. hmm. I'll do just ext2, and we can take another look then. - |