From: Bill H. <ha...@au...> - 2001-09-21 16:52:49
I created a simple program (rawread.c) that reads from a raw device, in order to understand the code path [in support of a database benchmarking effort that is using raw I/O]. A link to the program :

http://lse.sourceforge.net/benchmarks/ltc/rawread/20_sep_2001/rawread.c

I collected a kernprof ACG (annotated call graph) for one instance of the rawread program that did 128 reads of 128KB from the raw device. The system under test was a 4-way 200 MHz system running a 2.4.9 kernel w/ the SGI profile patch. Here is a link to the ACG :

http://lse.sourceforge.net/benchmarks/ltc/rawread/20_sep_2001/rawread249.acg.txt

Here is the raw read code path down to the block device layer :

-----------------------------------------------
                0.00    0.31     150/150        system_call [3]
[4]      1.4    0.00    0.31     150        sys_read [4]
                0.00    0.29     128/128        raw_read [5]
                0.00    0.02      18/18         proc_file_read [22]
                0.00    0.00       4/14         generic_file_read [78]
                0.00    0.00     151/253        fput [191]
                0.00    0.00     150/177        fget [194]
-----------------------------------------------
                0.00    0.29     128/128        sys_read [4]
[5]      1.3    0.00    0.29     128        raw_read [5]
                0.00    0.29     128/128        rw_raw_dev [6]
-----------------------------------------------
                0.00    0.29     128/128        raw_read [5]
[6]      1.3    0.00    0.29     128        rw_raw_dev [6]
                0.03    0.25     128/128        brw_kiovec [7]
                0.00    0.01     128/128        map_user_kiobuf [35]
                0.00    0.00     128/128        mark_dirty_kiobuf [205]
                0.00    0.00     128/128        unmap_kiobuf [206]
-----------------------------------------------
                0.03    0.25     128/128        rw_raw_dev [6]
[7]      1.2    0.03    0.25     128        brw_kiovec [7]
                0.03    0.17   32768/32795      submit_bh [8]
                0.03    0.00   32768/32771      set_bh_page [13]
                0.01    0.00     128/128        wait_kio [36]
                0.00    0.01     128/128        kiobuf_wait_for_io [44]
                0.00    0.00   32768/32770      init_buffer [129]
-----------------------------------------------
                0.00    0.00       1/32795      block_read_full_page [118]
                0.00    0.00       2/32795      ll_rw_block [111]
                0.00    0.00      24/32795      write_locked_buffers [61]
                0.03    0.17   32768/32795      brw_kiovec [7]
[8]      0.9    0.03    0.17   32795        submit_bh [8]
                0.01    0.16   32795/32795      generic_make_request [9]
-----------------------------------------------
                0.01    0.16   32795/32795      submit_bh [8]
[9]      0.8    0.01    0.16   32795        generic_make_request [9]
                0.15    0.00   32795/32795      _make_request [10]
                0.01    0.00   32795/32795      blk_get_queue [40]
-----------------------------------------------
                0.15    0.00   32795/32795      generic_make_request [9]
[10]     0.7    0.15    0.00   32795        _make_request [10]
                0.00    0.00   32661/32661      elevator_linus_merge [130]
                0.00    0.00   32640/32640      scsi_back_merge_fn_c [131]
                0.00    0.00   32514/32514      elevator_linus_merge_cleanup [132]
                0.00    0.00     134/134        generic_plug_device [203]
                0.00    0.00       3/3          attempt_merge [345]
                0.00    0.00       2/2          scsi_front_merge_fn_c [426]
-----------------------------------------------

A couple of observations from the ACG :

(1) The 128 raw reads of 128KB result in 32768 calls to submit_bh. Raw I/O is split into 512-byte buffer heads, so each 128KB read generates 128KB / 512 = 256 calls, and 128 * 256 = 32768. submit_bh acquires the io_request_lock at least 2 times for each call.

(2) Each 128KB raw read appears to be broken down into 2 I/Os of 64KB each. I think the 1st one completes and then the 2nd one is initiated.

It seems there may be some room for improvement if we could reduce the number of calls to submit_bh by sending down 1 or 2 buffer heads instead of 256 for each 128KB read.

Also, I did a quick SMP test running 8 instances of the rawread program. The observation was that some of the I/O may be driven down to the SCSI layer before all of the buffer heads for one of the 128KB reads have made it down to the block device layer. This may result in more (smaller) I/Os, which could impact performance.

We will be studying this code path more. Comments ?

Bill Hartner
bha...@us...
IBM Linux Technology Center