From: Bill H. <ha...@au...> - 2002-09-23 16:46:32
A microbenchmark was used to compare :
======================================

* 2.5.36 raw io mapped to scsi devices for both read() and readv()
* 2.5.36 direct io on scsi devices for both read() and readv()
* 2.5.25 raw io mapped to scsi device for read() (baseline - performs well)

Both throughput and CPU utilization were measured. The microbenchmark
used a sequential read test. More on the microbenchmark below.

The kernel config options used were :
=====================================

* SMP
* 1GB Memory support
* IBM ServeRAID support
* Adaptec AIC7xxx support
* Raw driver (/dev/raw/rawN)

The test was run on :
=====================

* 8-way 700 MHz PIII Xeon, 4GB mem, 1 MB L2
* (4) ServeRAID-4Mx controllers
* (64) 9.1 GB 10K rpm 40 MB/s drives configured as (32) RAID0 logical
  drives with a stripe size of 64KB

Summary of Results :
====================

Throughput and CPU utilization for read() and readv() are about equal
for raw io mapped to scsi devices and direct io to scsi devices.
All perform better than 2.4.17+bounce+io_request_lock+raw vary+readv.
Throughput (KB/s)
-----------------

read size    2.5.25    2.5.36     2.5.36       2.5.36        2.5.36
    bytes  raw read  raw read  raw readv  direct read  direct readv
---------  --------  --------  ---------  -----------  ------------
     4096     90061     90539         --        90925            --
     8192    129142    129423         --       130076            --
    16384    193644    193753         --       195001            --
    32768    254528    255197         --       253962            --
    65536    275309    273590     275521       277257        274998
   131072    298339    298602     298227       298403        298647
   262144    296258    296591     296563       296162        296429
   524288    296682    296067     296565       296328        296107
  1048576    301537    301075     302025       302133        301638

CPU Utilization
---------------

read size    2.5.25    2.5.36     2.5.36       2.5.36        2.5.36
    bytes  raw read  raw read  raw readv  direct read  direct readv
---------  --------  --------  ---------  -----------  ------------
     4096     21.10     20.60         --        20.84            --
     8192     15.43     14.54         --        14.61            --
    16384     11.33     10.59         --        10.67            --
    32768      7.23      6.82         --         6.76            --
    65536      4.03      4.02       4.42         3.99          4.44
   131072      4.19      4.32       4.58         4.25          4.59
   262144      3.82      4.12       4.23         4.13          4.22
   524288      3.71      4.16       4.19         4.09          4.25
  1048576      3.78      4.40       4.43         4.35          4.45

Microbenchmark :
================

For the raw io test, a raw device was mapped to each SCSI device.
For the direct io test, each SCSI device was opened with the O_DIRECT
flag. Each test was run 3 times and the average throughput and CPU
utilization are reported. The CPU utilization was measured by reading
/proc/stat.

(32) processes were started and each process read from one of the (32)
devices. The read() size was varied from 4KB to 1024KB and the number
of reads was adjusted so that each process read a total of 256MB for
each read size, giving 256MB * 32 = 8GB for the entire test.

For the readv() test the number of iovecs was 16. So when comparing
the read() results to the readv() results, the comparison would be,
for example, a read() size of 64KB to a readv() of 16 iovecs of 4KB.
These should perform nearly the same, except that readv() may have
slightly higher CPU utilization.
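
The arithmetic behind the test sizes, and one plausible way to turn two
/proc/stat samples into a utilization figure, can be sketched as below.
This is only an illustration and is not taken from the rawread source:
the function names are hypothetical, and the CPU calculation assumes the
"cpu" line holds jiffy counters with idle in the fourth field, which may
not match how the benchmark actually computes its numbers.

```python
# Illustrative sketch only; not taken from the rawread benchmark source.
# All names below are hypothetical.

TOTAL_PER_PROCESS = 256 * 1024 * 1024   # 256MB read by each process per read size
NUM_PROCESSES = 32                      # one process per logical drive
NUM_IOVECS = 16                         # iovec count used for the readv() tests


def reads_per_process(read_size):
    """Number of read() calls needed to total 256MB at this read size."""
    return TOTAL_PER_PROCESS // read_size


def iovec_length(read_size, num_iovecs=NUM_IOVECS):
    """Bytes per iovec when one readv() moves read_size bytes in total."""
    return read_size // num_iovecs


def cpu_utilization(before, after):
    """Percent of non-idle time between two 'cpu' lines from /proc/stat.

    Assumption: the fields after the 'cpu' label are cumulative jiffy
    counters ordered user, nice, system, idle, ... (idle is the fourth).
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    idle = deltas[3]
    return 100.0 * (total - idle) / total


if __name__ == "__main__":
    size = 4096
    while size <= 1024 * 1024:
        # read size, reads per process, bytes per iovec for a 16-iovec readv()
        print(size, reads_per_process(size), iovec_length(size))
        size *= 2
    print("total bytes per read size:", TOTAL_PER_PROCESS * NUM_PROCESSES)
```

For a 64KB read size this gives 4096 reads per process and 4KB per
iovec, which matches the read() to readv() comparison described above.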

The microbenchmark can be found here :

http://www-124.ibm.com/developerworks/opensource/linuxperf/rawread/rawread.html

Future work :
=============

We will also compare O_DIRECT for a device with O_DIRECT for a file.
The performance of O_DIRECT for a file should approach the performance
of O_DIRECT for a device.

The microbenchmark will be modified to add async IO tests for all of
the above.

Comments or suggestions for additional performance tests are welcomed.

Bill Hartner
IBM Linux Technology Center