From: Hanna L. <ha...@us...> - 2002-02-22 17:17:04
|
Here are the lockstat results I will be referring to in the call today including the results with my patch installed running dbench. Hanna ---- ___________________________________________________________________________________________ System: Linux elm3b96.eng 2.4.17 #2 SMP Mon Feb 18 14:15:56 PST 2002 i686 Total counts All (8) CPUs Start time: Mon Feb 18 14:25:40 2002 End time: Mon Feb 18 14:34:21 2002 Delta Time: 521.04 sec. Hash table slots in use: 308. Global read lock slots in use: 999. *************************** Warnings! ****************************** Read Lock table overflowed. The data in this report may be in error due to this. ************************ End of Warnings! ************************** - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - SPINLOCKS HOLD WAIT UTIL CON MEAN( MAX ) MEAN( MAX )(% CPU) TOTAL NOWAIT SPIN RJECT NAME 12.4% 1.0us( 21ms) 10us( 20ms)( 7.0%) 232460100 87.6% 12.4% 0.00% *TOTAL* 0.00% 0% 0.1us( 0.1us) 0us 82 100% 0% 0% [0xf6fa9c38] 0.00% 0% 0.1us( 0.1us) 0us 82 100% 0% 0% n_tty_receive_buf+0x854 0.00% 0% 0.4us( 2.0us) 0us 454 100% 0% 0% [0xf6faac38] 0.00% 0% 0.4us( 2.0us) 0us 454 100% 0% 0% n_tty_receive_buf+0x44 0.00% 0% 3.2us( 9.4us) 0us 68 100% 0% 0% [0xf739c170] 0.00% 0% 1.9us( 4.7us) 0us 36 100% 0% 0% unmap_fixup+0x8c 0.00% 0% 4.7us( 9.4us) 0us 32 100% 0% 0% unmap_fixup+0x134 0.00% 0% 4.8us( 92us) 0us 912 100% 0% 0% [0xf753a4cc] 0.00% 0% 1.3us( 2.8us) 0us 82 100% 0% 0% tcp_delack_timer+0x18 0.00% 0% 0.1us( 0.4us) 0us 82 100% 0% 0% tcp_recvmsg+0x40 0.00% 0% 0.1us( 0.4us) 0us 82 100% 0% 0% tcp_recvmsg+0x7d4 0.00% 0% 0.2us( 0.9us) 0us 172 100% 0% 0% tcp_sendmsg+0x3c 0.00% 0% 0.2us( 0.7us) 0us 172 100% 0% 0% tcp_sendmsg+0x108c 0.00% 0% 17us( 92us) 0us 243 100% 0% 0% tcp_v4_rcv+0x2dc 0.00% 0% 1.5us( 3.1us) 0us 79 100% 0% 0% tcp_write_timer+0x18 0.00% 0% 1.3us( 7.1us) 0us 2084 100% 0% 0% [0xf7ac8a98] 0.00% 0% 1.3us( 7.1us) 0us 2084 100% 0% 0% rh_int_timer_do+0x4c 0.00% 0% 
4.1us( 11us) 0us 291 100% 0% 0% [0xf7b7b0d0] 0.00% 0% 4.7us( 11us) 0us 130 100% 0% 0% dev_watchdog+0x14 0.00% 0% 3.6us( 6.6us) 0us 161 100% 0% 0% qdisc_restart+0x2c 0.00% 0% 1.4us( 6.9us) 0us 322 100% 0% 0% [0xf7b7b0dc] 0.00% 0% 2.1us( 6.9us) 0us 161 100% 0% 0% dev_queue_xmit+0x12c 0.00% 0% 0.6us( 1.9us) 0us 161 100% 0% 0% qdisc_restart+0x90 0.00% 0% 2.9us( 10us) 0us 322 100% 0% 0% [0xf7b7b4e4] 0.00% 0% 3.9us( 10us) 0us 161 100% 0% 0% tulip_interrupt+0xe8 0.00% 0% 2.0us( 3.4us) 0us 161 100% 0% 0% tulip_start_xmit+0x18 0.00% 0% 0.5us( 0.5us) 0us 1 100% 0% 0% [0xf7b8c658] 0.00% 0% 0.5us( 0.5us) 0us 1 100% 0% 0% kupdate+0xdc 0.00% 0% 0.1us( 0.7us) 0us 192 100% 0% 0% [0xf7bb64ec] 0.00% 0% 0.1us( 0.4us) 0us 48 100% 0% 0% shmem_delete_inode+0x68 0.00% 0% 0.1us( 0.1us) 0us 48 100% 0% 0% shmem_get_inode+0x1c 0.00% 0% 0.1us( 0.1us) 0us 48 100% 0% 0% shmem_getpage_locked+0x274 0.00% 0% 0.1us( 0.7us) 0us 48 100% 0% 0% shmem_recalc_inode+0x48 0.00% 0.38% 2.5us( 43us) 7.0us( 9.9us)(0.00%) 1836 99.6% 0.38% 0% [0xf7bca810] 0.00% 0.38% 2.5us( 43us) 7.0us( 9.9us)(0.00%) 1836 99.6% 0.38% 0% ahc_linux_queue+0x38 0.00% 0% 0.8us( 2.7us) 0us 730 100% 0% 0% allocator_request_lock 0.00% 0% 0.7us( 2.2us) 0us 365 100% 0% 0% scsi_free+0x1c 0.00% 0% 1.0us( 2.7us) 0us 365 100% 0% 0% scsi_malloc+0x48 0.09% 0.03% 0.3us( 48us) 1.5us( 4.8us)(0.00%) 1532602 100% 0.03% 0% arbitration_lock 0.00% 1.1% 0.4us( 1.7us) 0.8us( 1.6us)(0.00%) 901 98.9% 1.1% 0% deny_write_access+0xc 0.09% 0.03% 0.3us( 48us) 1.5us( 4.8us)(0.00%) 1531701 100% 0.03% 0% get_write_access+0xc 0.01% 0% 1.7us( 46us) 0us 45763 100% 0% 0% bdev_lock 0.01% 0% 1.7us( 46us) 0us 45763 100% 0% 0% bdget+0x34 0.03% 0% 13us( 78us) 0us 10378 100% 0% 0% call_lock 0.03% 0% 13us( 78us) 0us 10378 100% 0% 0% smp_call_function+0x58 3.1% 3.5% 0.6us(1047us) 3.2us( 979us)(0.07%) 25499542 96.5% 3.5% 0% dcache_lock 0.00% 0% 1.3us( 3.4us) 0us 35 100% 0% 0% autofs4_expire+0x38 0.00% 6.6% 1.2us( 11us) 10us( 154us)(0.00%) 14064 93.4% 6.6% 0% cached_lookup+0x78 
0.01% 4.8% 0.2us( 10us) 4.0us( 122us)(0.00%) 173324 95.2% 4.8% 0% d_alloc+0x138 0.02% 9.3% 0.3us( 39us) 3.5us( 979us)(0.00%) 396624 90.7% 9.3% 0% d_delete+0x10 0.00% 2.6% 0.2us( 4.0us) 2.4us( 22us)(0.00%) 8616 97.4% 2.6% 0% d_delete+0x94 0.02% 7.9% 0.2us( 10us) 3.4us( 478us)(0.00%) 569964 92.1% 7.9% 0% d_instantiate+0x30 2.7% 3.0% 0.7us( 146us) 3.0us( 781us)(0.05%) 21621529 97.0% 3.0% 0% d_lookup+0x14 0.01% 8.0% 0.9us( 7.2us) 3.1us( 40us)(0.00%) 79560 92.0% 8.0% 0% d_move+0x48 0.01% 4.8% 0.2us( 18us) 3.0us( 95us)(0.00%) 173276 95.2% 4.8% 0% d_rehash+0x50 0.00% 2.0% 0.2us( 3.0us) 3.1us( 13us)(0.00%) 2856 98.0% 2.0% 0% d_unhash+0x44 0.29% 6.0% 0.6us( 55us) 3.5us( 960us)(0.01%) 2347088 94.0% 6.0% 0% dput+0x30 0.00% 6.2% 0.3us( 1.1us) 3.4us( 6.0us)(0.00%) 48 93.8% 6.2% 0% link_path_walk+0x1ac 0.00% 0% 0.8us( 4.7us) 0us 270 100% 0% 0% link_path_walk+0x398 0.00% 0% 1.2us( 1.7us) 0us 6 100% 0% 0% link_path_walk+0x7f8 0.02% 20.1% 19us( 302us) 17us( 164us)(0.00%) 6437 79.9% 20.1% 0% path_init_walk+0x18 0.00% 3.5% 1.4us( 398us) 4.0us( 70us)(0.00%) 6337 96.5% 3.5% 0% prune_dcache+0x14 0.01% 7.8% 0.8us( 764us) 4.3us( 59us)(0.00%) 88278 92.2% 7.8% 0% prune_dcache+0x148 0.01% 2.3% 5.1us(1047us) 3.1us( 20us)(0.00%) 11230 97.7% 2.3% 0% select_parent+0x20 0.00% 0% 2.3us( 43us) 0us 1638 100% 0% 0% device_request_lock 0.00% 0% 0.3us( 5.6us) 0us 819 100% 0% 0% __scsi_release_command+0x14 0.00% 0% 4.2us( 43us) 0us 819 100% 0% 0% scsi_allocate_device+0x30 0.00% 0% 2.1us( 44us) 0us 2122 100% 0% 0% emergency_lock 0.00% 0% 1.8us( 44us) 0us 1233 100% 0% 0% bounce_end_io_read+0xb8 0.00% 0% 2.6us( 43us) 0us 889 100% 0% 0% bounce_end_io_write+0x28 0.73% 3.4% 0.9us( 61us) 2.4us( 77us)(0.01%) 4447903 96.6% 3.4% 0% files_lock 0.00% 0% 0.3us( 0.8us) 0us 96 100% 0% 0% check_tty_count+0x10 0.13% 4.5% 0.4us( 61us) 2.3us( 56us)(0.00%) 1482284 95.5% 4.5% 0% file_move+0x18 0.10% 4.2% 0.4us( 21us) 2.5us( 46us)(0.00%) 1482576 95.8% 4.2% 0% fput+0x80 0.50% 1.3% 1.8us( 47us) 2.2us( 77us)(0.00%) 1482576 
98.7% 1.3% 0% get_empty_filp+0xc 0.00% 2.2% 1.1us( 3.7us) 2.0us( 4.2us)(0.00%) 371 97.8% 2.2% 0% get_empty_filp+0xdc 0.08% 3.1% 7.2us(2010us) 0us 55226 96.9% 0% 3.1% global_bh_lock 0.08% 3.1% 7.2us(2010us) 0us 55226 96.9% 0% 3.1% bh_action+0x18 0.09% 0% 8.6us( 24us) 0us 52104 100% 0% 0% i8253_lock 0.09% 0% 8.6us( 24us) 0us 52104 100% 0% 0% timer_interrupt+0x2c 0.03% 0% 2.6us( 8.4us) 0us 52104 100% 0% 0% i8259A_lock 0.03% 0% 2.6us( 8.4us) 0us 52104 100% 0% 0% timer_interrupt+0x90 0.00% 0% 1.0us( 1.6us) 0us 5 100% 0% 0% inet_peer_unused_lock 0.00% 0% 1.0us( 1.6us) 0us 4 100% 0% 0% cleanup_once+0x24 0.00% 0% 0.7us( 0.7us) 0us 1 100% 0% 0% ipv4_dst_destroy+0x38 0.22% 0.20% 0.7us( 50us) 1.6us(10.0us)(0.00%) 1648444 99.8% 0.20% 0% inode_lock 0.04% 0% 0.5us( 7.7us) 0us 396793 100% 0% 0% __mark_inode_dirty+0x48 0.08% 0.16% 1.1us( 48us) 1.7us(10.0us)(0.00%) 397096 99.8% 0.16% 0% get_empty_inode+0x24 0.00% 0% 0.6us( 2.6us) 0us 129 100% 0% 0% get_new_inode+0x34 0.01% 0% 2.0us( 8.1us) 0us 28775 100% 0% 0% iget4+0x3c 0.03% 0% 0.4us( 35us) 0us 396640 100% 0% 0% insert_inode_hash+0x44 0.06% 0.61% 0.7us( 50us) 1.6us( 7.8us)(0.00%) 428011 99.4% 0.61% 0% iput+0x78 0.00% 0% 1.6us( 5.6us) 0us 17 100% 0% 0% sync_unlocked_inodes+0x10 0.00% 0% 0.8us( 2.9us) 0us 983 100% 0% 0% sync_unlocked_inodes+0x12c 0.00% 0.24% 2.9us( 393us) 2.4us( 6.1us)(0.00%) 8058 99.8% 0.24% 0% io_request_lock 0.00% 12.5% 0.4us( 0.9us) 2.2us( 2.2us)(0.00%) 8 87.5% 12.5% 0% __get_request_wait+0x90 0.00% 0.11% 1.8us( 19us) 2.9us( 6.1us)(0.00%) 2611 99.9% 0.11% 0% __make_request+0xec 0.00% 0.84% 6.4us( 393us) 2.2us( 3.3us)(0.00%) 833 99.2% 0.84% 0% ahc_linux_isr+0x2ec 0.00% 0% 9.4us( 46us) 0us 313 100% 0% 0% generic_unplug_device+0x10 0.00% 0.44% 3.2us( 49us) 2.3us( 4.6us)(0.00%) 1836 99.6% 0.44% 0% scsi_dispatch_cmd+0x11c 0.00% 0% 0.5us( 4.2us) 0us 819 100% 0% 0% scsi_finish_command+0x18 0.00% 0% 3.4us( 45us) 0us 819 100% 0% 0% scsi_queue_next_request+0x18 0.00% 0% 1.7us( 6.7us) 0us 819 100% 0% 0% 
scsi_request_fn+0x31c 15.2% 21.7% 4.0us( 21ms) 20us( 20ms)( 2.0%) 19757159 78.3% 21.7% 0% kernel_flag 0.00% 0% 13us( 16us) 0us 48 100% 0% 0% chrdev_open+0x4c 0.00% 2.8% 0.6us( 2.2us) 6.7us( 10us)(0.00%) 72 97.2% 2.8% 0% de_put+0x28 0.39% 15.8% 3663us( 21ms) 2676us( 20ms)(0.01%) 558 84.2% 15.8% 0% do_exit+0xe8 2.0% 19.9% 26us( 639us) 60us( 17ms)(0.11%) 392496 80.1% 19.9% 0% ext2_delete_inode+0x20 0.32% 22.6% 0.9us( 70us) 15us( 11ms)(0.15%) 1900785 77.4% 22.6% 0% ext2_discard_prealloc+0x20 7.1% 31.1% 7.5us( 772us) 21us(6750us)(0.78%) 4945389 68.9% 31.1% 0% ext2_get_block+0x54 0.00% 0% 3.6us( 5.5us) 0us 48 100% 0% 0% get_chrfops+0x8c 0.44% 27.4% 14us( 468us) 36us( 11ms)(0.04%) 164856 72.6% 27.4% 0% lookup_hash+0x80 0.07% 17.2% 6.5us( 142us) 16us( 343us)(0.00%) 52226 82.8% 17.2% 0% notify_change+0x64 0.00% 0% 0.9us( 1.0us) 0us 2 100% 0% 0% permission+0x38 0.02% 20.3% 15us( 609us) 71us( 11ms)(0.00%) 8274 79.7% 20.3% 0% real_lookup+0x64 0.01% 38.8% 110us(8793us) 48us(1968us)(0.00%) 595 61.2% 38.8% 0% schedule+0x558 0.00% 0% 36us( 305us) 0us 17 100% 0% 0% sync_old_buffers+0x1c 0.00% 8.4% 7.6us( 29us) 15us( 35us)(0.00%) 95 91.6% 8.4% 0% sys_ioctl+0x4c 0.00% 0% 0.3us( 0.8us) 0us 54 100% 0% 0% sys_llseek+0x88 1.3% 17.4% 0.6us( 82us) 18us(9245us)(0.85%) 11417880 82.6% 17.4% 0% sys_lseek+0x6c 0.42% 18.5% 27us( 381us) 16us(2249us)(0.01%) 79560 81.5% 18.5% 0% sys_rename+0x1a0 0.00% 5.9% 5.9us( 19us) 6.1us( 40us)(0.00%) 254 94.1% 5.9% 0% tty_read+0xbc 0.00% 0% 10us( 14us) 0us 48 100% 0% 0% tty_release+0x1c 0.00% 0% 12us( 39us) 0us 254 100% 0% 0% tty_write+0x1bc 2.0% 27.5% 27us( 500us) 14us(5198us)(0.04%) 392104 72.5% 27.5% 0% vfs_create+0x8c 0.10% 75.0% 117us( 552us) 32us( 510us)(0.00%) 4536 25.0% 75.0% 0% vfs_mkdir+0x88 0.00% 0% 16us( 57us) 0us 16 100% 0% 0% vfs_readdir+0x70 0.01% 24.0% 14us( 361us) 198us( 12ms)(0.01%) 4895 76.0% 24.0% 0% vfs_rmdir+0x158 0.00% 11.1% 7.7us( 10us) 2.9us( 2.9us)(0.00%) 9 88.9% 11.1% 0% vfs_statfs+0x4c 1.00% 19.9% 13us( 386us) 18us(8616us)(0.03%) 
392088 80.1% 19.9% 0% vfs_unlink+0x118 7.4% 16.8% 0.5us( 433us) 7.0us( 10ms)( 2.2%) 76683966 83.2% 16.8% 0% kmap_lock 4.6% 17.2% 0.6us( 433us) 7.2us( 476us)( 1.1%) 38341983 82.8% 17.2% 0% kmap_high+0x10 2.9% 16.4% 0.4us( 107us) 6.8us( 10ms)( 1.0%) 38341983 83.6% 16.4% 0% kunmap_high+0xc 0.00% 0% 0.3us( 2.0us) 0us 558 100% 0% 0% lastpid_lock 0.00% 0% 0.3us( 2.0us) 0us 558 100% 0% 0% get_pid+0x20 7.4% 22.6% 1.3us(1015us) 16us( 987us)( 2.5%) 29247857 77.4% 22.6% 0% lru_list_lock 0.63% 8.2% 0.7us( 77us) 6.5us( 579us)(0.06%) 4938097 91.8% 8.2% 0% buffer_insert_inode_data_queue+0x10 0.17% 14.5% 0.2us( 56us) 7.7us( 763us)(0.10%) 3594480 85.5% 14.5% 0% buffer_insert_inode_queue+0x14 0.02% 14.4% 0.3us( 48us) 12us( 517us)(0.02%) 396290 85.6% 14.4% 0% inode_has_buffers+0x10 0.03% 15.3% 0.3us( 40us) 11us( 530us)(0.02%) 395954 84.7% 15.3% 0% invalidate_inode_buffers+0x10 0.00% 0% 63us(1015us) 0us 17 100% 0% 0% kupdate+0x98 2.6% 27.9% 1.3us( 83us) 17us( 987us)( 1.1%) 10037943 72.1% 27.9% 0% refile_buffer+0xc 1.5% 26.2% 1.6us( 78us) 18us( 810us)(0.55%) 4937208 73.8% 26.2% 0% remove_from_queues+0xc 0.00% 0% 15us( 29us) 0us 52 100% 0% 0% sync_old_buffers+0x74 2.5% 29.7% 2.6us( 161us) 18us( 844us)(0.64%) 4947816 70.3% 29.7% 0% try_to_free_buffers+0x1c 0.00% 0% 0.4us( 6.6us) 0us 1116 100% 0% 0% mmlist_lock 0.00% 0% 0.2us( 1.3us) 0us 558 100% 0% 0% copy_mm+0x120 0.00% 0% 0.6us( 6.6us) 0us 558 100% 0% 0% mmput+0x28 0.00% 0% 0.6us( 1.1us) 0us 1227 100% 0% 0% page_uptodate_lock.0 0.00% 0% 0.6us( 1.1us) 0us 1227 100% 0% 0% end_buffer_io_async+0x38 4.7% 8.1% 0.7us( 79us) 2.6us( 142us)(0.19%) 37219395 91.9% 8.1% 0% pagecache_lock 0.09% 9.4% 0.4us( 47us) 2.5us( 142us)(0.01%) 1165089 90.6% 9.4% 0% __find_get_page+0x18 1.4% 8.2% 0.5us( 79us) 2.7us( 122us)(0.08%) 15381288 91.8% 8.2% 0% __find_lock_page+0xc 0.00% 6.2% 1.2us( 2.4us) 0.9us( 1.2us)(0.00%) 48 93.8% 6.2% 0% add_to_page_cache+0x18 1.4% 8.3% 1.5us( 74us) 2.8us( 102us)(0.03%) 4934774 91.7% 8.3% 0% add_to_page_cache_unique+0x18 0.65% 
7.7% 0.6us( 60us) 2.3us( 107us)(0.02%) 5476637 92.3% 7.7% 0% do_generic_file_read+0x1a4 0.00% 6.7% 1.6us( 6.5us) 2.4us( 24us)(0.00%) 10619 93.3% 6.7% 0% do_generic_file_read+0x370 0.00% 0% 0.1us( 0.5us) 0us 983 100% 0% 0% filemap_fdatasync+0x1c 0.00% 0% 0.1us( 0.4us) 0us 983 100% 0% 0% filemap_fdatawait+0x10 0.01% 11.3% 0.7us( 67us) 3.2us( 36us)(0.00%) 50299 88.7% 11.3% 0% find_or_create_page+0x38 0.02% 11.6% 2.1us( 9.5us) 3.2us( 68us)(0.00%) 50299 88.4% 11.6% 0% find_or_create_page+0x78 0.00% 0% 0.1us( 0.6us) 0us 1222 100% 0% 0% page_cache_read+0x48 0.63% 7.5% 0.7us( 72us) 2.3us( 89us)(0.02%) 4947864 92.5% 7.5% 0% remove_inode_page+0x28 0.00% 25.0% 0.5us( 1.1us) 2.4us( 6.1us)(0.00%) 48 75.0% 25.0% 0% set_page_dirty+0x24 0.04% 10.6% 0.9us( 40us) 2.4us( 67us)(0.00%) 251378 89.4% 10.6% 0% truncate_inode_pages+0x38 0.40% 8.5% 0.4us( 56us) 2.4us( 59us)(0.02%) 4947864 91.5% 8.5% 0% truncate_list_pages+0x1f8 1.3% 2.5% 0.6us( 88us) 1.9us( 87us)(0.01%) 10535738 97.5% 2.5% 0% pagemap_lru_lock 0.14% 6.4% 1.3us( 88us) 2.1us( 87us)(0.00%) 554467 93.6% 6.4% 0% activate_page+0xc 0.73% 3.6% 0.8us( 72us) 1.8us( 85us)(0.01%) 5014574 96.4% 3.6% 0% lru_cache_add+0x1c 0.43% 0.95% 0.5us( 55us) 2.0us( 67us)(0.00%) 4966697 99.1% 0.95% 0% lru_cache_del+0xc 0.11% 1.8% 5.0us( 33us) 4.8us( 150us)(0.00%) 118061 98.2% 1.8% 0% runqueue_lock 0.02% 2.6% 6.9us( 28us) 4.9us( 43us)(0.00%) 18765 97.4% 2.6% 0% __wake_up+0x5c 0.00% 32.0% 0.4us( 1.5us) 4.2us( 18us)(0.00%) 510 68.0% 32.0% 0% deliver_signal+0x48 0.00% 0.21% 6.7us( 33us) 2.7us( 3.5us)(0.00%) 968 99.8% 0.21% 0% process_timeout+0x14 0.00% 41.6% 1.7us( 3.4us) 2.5us( 7.6us)(0.00%) 113 58.4% 41.6% 0% schedule_tail+0x58 0.05% 1.9% 4.2us( 23us) 4.7us( 150us)(0.00%) 62553 98.1% 1.9% 0% schedule+0xc0 0.00% 0.29% 1.2us( 9.0us) 5.5us( 16us)(0.00%) 5930 99.7% 0.29% 0% schedule+0x284 0.03% 0.74% 6.5us( 29us) 5.9us( 61us)(0.00%) 28161 99.3% 0.74% 0% schedule+0x518 0.00% 1.1% 3.3us( 23us) 3.1us( 4.3us)(0.00%) 1061 98.9% 1.1% 0% wake_up_process+0x14 0.00% 
0% 0.6us( 4.0us) 0us 42 100% 0% 0% sb_lock 0.00% 0% 0.1us( 0.1us) 0us 1 100% 0% 0% drop_super+0x24 0.00% 0% 0.3us( 1.4us) 0us 18 100% 0% 0% sync_supers+0x6c 0.00% 0% 1.0us( 4.0us) 0us 17 100% 0% 0% sync_unlocked_inodes+0x18 0.00% 0% 0.3us( 0.9us) 0us 6 100% 0% 0% sync_unlocked_inodes+0x1ac 0.00% 0% 0.2us( 3.9us) 0us 3502 100% 0% 0% scsi_bhqueue_lock 0.00% 0% 0.1us( 3.9us) 0us 1666 100% 0% 0% scsi_bottom_half_handler+0x1c 0.00% 0% 0.2us( 3.3us) 0us 1836 100% 0% 0% scsi_done+0x3c 0.00% 1.7% 0.6us( 5.1us) 2.1us( 7.1us)(0.00%) 3513 98.3% 1.7% 0% semaphore_lock 0.00% 1.8% 0.3us( 4.1us) 2.6us( 7.1us)(0.00%) 1465 98.2% 1.8% 0% __down+0x44 0.00% 1.7% 0.9us( 5.1us) 1.8us( 4.8us)(0.00%) 1995 98.3% 1.7% 0% __down+0x78 0.00% 0% 1.4us( 3.5us) 0us 41 100% 0% 0% __down_interruptible+0x48 0.00% 0% 1.0us( 2.0us) 0us 12 100% 0% 0% __down_interruptible+0x88 0.00% 0% 1.0us( 6.3us) 0us 1104 100% 0% 0% shm_ids+0x28 0.00% 0% 1.0us( 1.5us) 0us 48 100% 0% 0% ipc_addid+0x94 0.00% 0% 1.7us( 6.3us) 0us 456 100% 0% 0% shm_close+0x54 0.00% 0% 0.3us( 0.7us) 0us 48 100% 0% 0% shm_mmap+0x54 0.00% 0% 0.4us( 1.9us) 0us 408 100% 0% 0% shm_open+0x40 0.00% 0% 1.8us( 2.9us) 0us 48 100% 0% 0% sys_shmat+0xb8 0.00% 0% 0.4us( 0.7us) 0us 48 100% 0% 0% sys_shmat+0x25c 0.00% 0% 1.0us( 1.6us) 0us 48 100% 0% 0% sys_shmctl+0x684 0.00% 0% 0.3us( 1.4us) 0us 96 100% 0% 0% shmem_ilock 0.00% 0% 0.5us( 1.4us) 0us 48 100% 0% 0% shmem_delete_inode+0x38 0.00% 0% 0.2us( 1.3us) 0us 48 100% 0% 0% shmem_get_inode+0x154 0.00% 0% 1.1us( 1.6us) 0us 50 100% 0% 0% swaplock 0.00% 0% 1.1us( 1.6us) 0us 50 100% 0% 0% si_swapinfo+0x18 0.02% 0.02% 1.2us( 12us) 1.5us( 2.7us)(0.00%) 64918 100% 0.02% 0% timerlist_lock 0.00% 0.08% 0.8us( 6.5us) 1.6us( 1.8us)(0.00%) 4945 100% 0.08% 0% add_timer+0x10 0.00% 0.11% 0.2us( 3.8us) 0.8us( 0.8us)(0.00%) 1836 99.9% 0.11% 0% del_timer+0x14 0.00% 0.06% 0.6us( 6.5us) 1.2us( 1.2us)(0.00%) 1580 100% 0.06% 0% del_timer_sync+0x1c 0.00% 0.14% 1.0us( 5.4us) 1.5us( 1.5us)(0.00%) 713 99.9% 0.14% 0% 
mod_timer+0x18 0.01% 0.00% 1.4us( 12us) 2.3us( 2.7us)(0.00%) 52104 100% 0.00% 0% timer_bh+0xcc 0.00% 0% 0.3us( 5.2us) 0us 3740 100% 0% 0% timer_bh+0x254 0.00% 0% 5.0us( 9.8us) 0us 106 100% 0% 0% tlbstate_lock 0.00% 0% 5.0us( 9.8us) 0us 106 100% 0% 0% flush_tlb_others+0x90 0.00% 0% 0.2us( 5.2us) 0us 1756 100% 0% 0% tqueue_lock 0.00% 0% 0.1us( 2.7us) 0us 878 100% 0% 0% __run_task_queue+0x14 0.00% 0% 0.3us( 5.2us) 0us 573 100% 0% 0% batch_entropy_store+0x7c 0.00% 0% 0.2us( 2.4us) 0us 305 100% 0% 0% generic_plug_device+0x34 1.1% 1.2% 0.6us( 158us) 2.5us( 97us)(0.01%) 9943508 98.8% 1.2% 0% unused_list_lock 0.54% 1.7% 0.6us( 81us) 2.7us( 97us)(0.01%) 4995692 98.3% 1.7% 0% get_unused_buffer_head+0x8 0.52% 0.61% 0.5us( 158us) 1.9us( 62us)(0.00%) 4947816 99.4% 0.61% 0% try_to_free_buffers+0x54 0.00% 0% 2.6us( 2.6us) 0us 1 100% 0% 0% vga_lock 0.00% 0% 2.6us( 2.6us) 0us 1 100% 0% 0% vgacon_cursor+0x5c 0.92% 2.5% 1.0us( 20us) 1.9us( 73us)(0.01%) 4973314 97.5% 2.5% 0% __free_pages_ok+0x13c 0.18% 0.00% 0.2us( 51us) 2.6us( 27us)(0.00%) 4985772 100% 0.00% 0% __wake_up+0x24 0.00% 0.08% 0.2us( 3.4us) 0.6us( 1.1us)(0.00%) 18607 100% 0.08% 0% add_wait_queue+0x10 0.00% 7.1% 0.2us( 3.6us) 6.3us( 36us)(0.00%) 1569 92.9% 7.1% 0% add_wait_queue_exclusive+0x10 0.00% 0.06% 6.9us( 428us) 1.5us( 1.5us)(0.00%) 1666 100% 0.06% 0% ahc_linux_isr+0x24 0.00% 0% 0.9us( 2.6us) 0us 252 100% 0% 0% change_protection+0x44 0.00% 0% 31us( 60us) 0us 708 100% 0% 0% clear_page_tables+0x1c 0.00% 0% 0.1us( 1.2us) 0us 6344 100% 0% 0% copy_mm+0x1e8 0.00% 0% 2.4us( 100us) 0us 8769 100% 0% 0% copy_mm+0x230 0.00% 0% 2.8us( 99us) 0us 8769 100% 0% 0% copy_page_range+0x100 0.01% 0% 0.7us( 8.9us) 0us 53503 100% 0% 0% do_IRQ+0x40 0.00% 0% 0.2us( 6.7us) 0us 53503 100% 0% 0% do_IRQ+0xc0 0.00% 0% 2.1us( 18us) 0us 5328 100% 0% 0% do_anonymous_page+0xb8 0.00% 0% 0.4us( 0.9us) 0us 148 100% 0% 0% do_brk+0x1d4 0.00% 0% 0.1us( 2.1us) 0us 558 100% 0% 0% do_exit+0x104 0.00% 0% 0.1us( 0.1us) 0us 558 100% 0% 0% do_exit+0x138 0.00% 0% 
0.5us( 2.4us) 0us 558 100% 0% 0% do_exit+0x94 0.00% 0.72% 0.9us( 36us) 1.7us( 4.2us)(0.00%) 2490 99.3% 0.72% 0% do_mmap_pgoff+0x40c 0.00% 0% 0.6us( 36us) 0us 3656 100% 0% 0% do_mmap_pgoff+0x418 0.00% 2.5% 0.2us( 1.2us) 0.9us( 1.5us)(0.00%) 554 97.5% 2.5% 0% do_munmap+0x1b8 0.00% 0% 1.0us( 6.3us) 0us 1486 100% 0% 0% do_munmap+0xe0 0.00% 0% 0.1us( 5.2us) 0us 34534 100% 0% 0% do_no_page+0x194 0.00% 0% 0.7us( 4.2us) 0us 659 100% 0% 0% do_page_fault+0xe0 0.00% 0% 0.8us( 6.4us) 0us 2051 100% 0% 0% do_sigaction+0x58 0.00% 0% 0.5us( 2.0us) 0us 450 100% 0% 0% do_sigaction+0xd8 0.00% 0% 5.3us( 14us) 0us 561 100% 0% 0% do_signal+0x54 0.01% 0% 2.7us( 21us) 0us 12352 100% 0% 0% do_wp_page+0x20c 0.00% 0% 0.1us( 2.5us) 0us 708 100% 0% 0% exit_mmap+0x18 0.00% 1.2% 0.2us( 3.3us) 1.1us( 1.9us)(0.00%) 8684 98.8% 1.2% 0% exit_mmap+0x88 0.00% 0% 1.4us( 18us) 0us 558 100% 0% 0% exit_sighand+0x18 0.16% 0.01% 28us( 153us) 9.5us( 9.6us)(0.00%) 30576 100% 0.01% 0% free_block+0x1c 0.01% 0% 0.5us( 14us) 0us 60712 100% 0% 0% handle_mm_fault+0x34 0.00% 0% 0.4us( 2.8us) 0us 561 100% 0% 0% handle_signal+0xb0 0.00% 0% 1.7us( 3.7us) 0us 150 100% 0% 0% insert_vm_struct+0x70 0.11% 0.38% 15us( 92us) 11us( 33us)(0.00%) 40348 99.6% 0.38% 0% kmem_cache_alloc_batch+0x18 0.00% 0.17% 0.3us( 5.2us) 2.9us( 4.9us)(0.00%) 5268 99.8% 0.17% 0% kmem_cache_grow+0x1f4 0.00% 0.08% 0.7us( 4.8us) 4.3us( 6.8us)(0.00%) 5268 100% 0.08% 0% kmem_cache_grow+0xa4 0.00% 0.40% 1.4us( 3.1us) 1.5us( 1.5us)(0.00%) 252 99.6% 0.40% 0% mprotect_fixup+0x2a8 0.00% 0% 0.6us( 1.1us) 0us 252 100% 0% 0% mprotect_fixup+0x2b4 0.00% 0.64% 0.2us( 1.3us) 1.0us( 1.5us)(0.00%) 779 99.4% 0.64% 0% n_tty_chars_in_buffer+0x18 0.00% 0% 0.6us( 1.4us) 0us 139 100% 0% 0% proc_pid_stat+0x64 0.00% 0% 3.9us( 96us) 0us 2124 100% 0% 0% pte_alloc+0x88 0.00% 0% 0.2us( 0.6us) 0us 150 100% 0% 0% put_dirty_page+0x3c 0.00% 0% 0.1us( 0.1us) 0us 254 100% 0% 0% read_chan+0x504 0.00% 0% 0.1us( 0.4us) 0us 254 100% 0% 0% read_chan+0x568 0.00% 1.2% 0.1us( 0.6us) 1.1us( 
1.3us)(0.00%) 254 98.8% 1.2% 0% read_chan+0x5a4 0.00% 4.8% 0.2us( 1.7us) 1.0us( 1.7us)(0.00%) 607 95.2% 4.8% 0% release_task+0x3c 0.00% 0.80% 0.2us( 5.3us) 6.3us( 33us)(0.00%) 20176 99.2% 0.80% 0% remove_wait_queue+0x10 2.2% 9.1% 2.3us( 25us) 4.0us( 42us)(0.04%) 5026462 90.9% 9.1% 0% rmqueue+0x28 0.05% 0.00% 4.2us( 69us) 1.9us( 2.8us)(0.00%) 55463 100% 0.00% 0% schedule+0x4c8 0.00% 0% 1.4us( 12us) 0us 558 100% 0% 0% schedule_tail+0x20 0.00% 0% 5.2us( 40us) 0us 1213 100% 0% 0% send_sig_info+0x4c 0.00% 0% 1.4us( 3.1us) 0us 48 100% 0% 0% shmem_getpage_locked+0xd4 0.00% 0% 3.3us( 6.9us) 0us 96 100% 0% 0% shmem_truncate+0x38 0.00% 0% 0.1us( 0.3us) 0us 384 100% 0% 0% sys_rt_sigprocmask+0x18c 0.00% 0% 0.2us( 2.2us) 0us 4987 100% 0% 0% sys_rt_sigprocmask+0x98 0.00% 0% 0.5us( 3.4us) 0us 561 100% 0% 0% sys_sigreturn+0x84 0.00% 0% 0.5us( 1.0us) 0us 139 100% 0% 0% task_dumpable+0x18 0.00% 2.0% 1.5us( 2.3us) 0.9us( 1.2us)(0.00%) 152 98.0% 2.0% 0% unmap_fixup+0xa8 0.00% 0% 0.8us( 1.2us) 0us 153 100% 0% 0% unmap_fixup+0xb8 0.00% 0% 0.2us( 1.8us) 0us 813 100% 0% 0% vma_merge+0x54 0.00% 0% 0.3us( 7.4us) 0us 52226 100% 0% 0% vmtruncate+0x4c 0.03% 0% 12us( 236us) 0us 13196 100% 0% 0% zap_page_range+0x58 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - RWLOCK READS HOLD MAX RDR BUSY PERIOD WAIT UTIL CON MEAN RDRS MEAN( MAX ) MEAN( MAX )( %CPU) TOTAL NOWAIT SPIN NAME 1.4% 3.0us( 119us)(0.01%) 10060706 98.6% 1.4% *TOTAL* 0.00% 0% 0.4us 1 0.4us( 1.3us) 0us 161 100% 0% [0xf751dfb4] 0% 0us 161 100% 0% ip_output+0x68 0.00% 0% 4.0us 1 4.0us( 6.7us) 0us 82 100% 0% [0xf753a578] 0% 0us 82 100% 0% sock_def_readable+0x14 0.00% 0% 0.5us 1 0.5us( 13us) 0us 4 100% 0% [0xf7658204] 0% 0us 4 100% 0% copy_files+0x160 0.00% 0% 0.8us 1 0.8us( 2.8us) 0us 243 100% 0% [0xf78d0d20] 0% 0us 243 100% 0% tcp_v4_rcv+0x1b0 0.00% 0% 0.9us 1 0.9us( 4.8us) 0us 124 100% 0% arp_tbl+0xc8 0% 0us 124 100% 0% neigh_lookup+0x40 0.00% 0% 1.0us 1 1.0us( 2.0us) 
0us 152 100% 0% binfmt_lock 0% 0us 151 100% 0% search_binary_handler+0x38 0% 0us 1 100% 0% search_binary_handler+0xdc 0.00% 0% 1.0us 1 1.0us( 1.3us) 0us 48 100% 0% chrdevs_lock 0% 0us 48 100% 0% get_chrfops+0x28 0.00% 0% 0.3us 1 0.3us( 1.5us) 0us 482 100% 0% fasync_lock 0% 0us 482 100% 0% kill_fasync+0x18 0.00% 0% 2.9us 1 2.9us( 8.6us) 0us 320 100% 0% fib_hash_lock 0% 0us 320 100% 0% fn_hash_lookup+0x10 1.4% 1.6% 0.8us 4 0.9us( 189us) 3.0us( 119us)(0.01%) 9105291 98.4% 1.6% hash_table_lock 1.6% 3.0us( 119us)(0.01%) 9105291 98.4% 1.6% get_hash_table+0x60 0.00% 0% 0.7us 1 0.7us( 3.3us) 0us 284 100% 0% inetdev_lock 0% 0us 124 100% 0% arp_rcv+0x28 0% 0us 160 100% 0% ip_route_input_slow+0x18 0.05% 0% 28.0us 4 28us( 219us) 0us 8590 100% 0% tasklist_lock 0% 0us 104 100% 0% count_active_tasks+0xc 0% 0us 558 100% 0% exit_notify+0x1c 0% 0us 4 100% 0% exit_notify+0xd0 0% 0us 10 100% 0% get_pid_list+0x14 0% 0us 48 100% 0% kill_pg_info+0x20 0% 0us 2 100% 0% kill_something_info+0xc8 0% 0us 123 100% 0% proc_pid_lookup+0x11c 0% 0us 139 100% 0% proc_pid_stat+0x134 0% 0us 5930 100% 0% schedule+0x238 0% 0us 18 100% 0% session_of_pgrp+0x14 0% 0us 8 100% 0% sys_setpgid+0x3c 0% 0us 1642 100% 0% sys_wait4+0x8c 0% 0us 4 100% 0% will_become_orphaned_pgrp+0x14 0.00% 0% 5.6us 1 5.6us( 8.3us) 0us 2 100% 0% udp_hash_lock 0% 0us 2 100% 0% udp_v4_mcast_deliver+0x10 0.00% 0.07% 0.9us 1 0.9us( 6.5us) 14us( 14us)(0.00%) 1359 100% 0.07% xtime_lock 0.07% 14us( 14us)(0.00%) 1359 100% 0.07% do_gettimeofday+0x14 0% 0us 70 100% 0% copy_files+0x108 0% 0us 70 100% 0% do_fork+0x344 0% 0us 906 100% 0% do_select+0x24 0% 0us 792804 100% 0% fget+0x1c 0% 0us 367 100% 0% ip_route_input+0x88 0% 0us 3 100% 0% link_path_walk+0x17c 0% 0us 405 100% 0% net_rx_action+0x44 0% 0us 148086 100% 0% path_init+0x134 0% 0us 622 100% 0% path_init+0x30 0% 0us 32 100% 0% path_init_walk+0x38 0% 0us 199 100% 0% path_init_walk+0xf4 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - RWLOCK WRITES HOLD WAIT (ALL) WAIT (WW) UTIL CON MEAN( MAX ) MEAN( MAX )( %CPU) MEAN( MAX ) TOTAL NOWAIT SPIN( WW ) NAME 1.0% 0.7us( 159us) 2.3us( 55us)(0.01%) 0.8us( 35us) 14491457 99.0% 0.95%(0.10%) *TOTAL* 0.00% 0% 0.7us( 0.9us) 0us 0us 2 100% 0%( 0%) [0xc323ed64] 0.00% 0% 0.7us( 0.9us) 0us 0us 2 100% 0%( 0%) sys_chdir+0x88 0.00% 0% 1.2us( 3.3us) 0us 0us 18 100% 0%( 0%) [0xf6fe3f64] 0.00% 0% 1.2us( 3.3us) 0us 0us 18 100% 0%( 0%) neigh_periodic_timer__thr+0xa8 0.00% 0% 8.2us( 15us) 0us 0us 18 100% 0%( 0%) arp_tbl+0xc8 0.00% 0% 8.2us( 15us) 0us 0us 18 100% 0%( 0%) neigh_periodic_timer__thr+0x20 0.00% 0% 0.8us( 1.8us) 0us 0us 103 100% 0%( 0%) dn_lock 0.00% 0% 0.8us( 1.8us) 0us 0us 103 100% 0%( 0%) fcntl_dirnotify+0x94 0.00% 0% 0.4us( 0.7us) 0us 0us 48 100% 0%( 0%) fasync_lock 0.00% 0% 0.4us( 0.7us) 0us 0us 48 100% 0%( 0%) fasync_helper+0x68 1.9% 1.5% 1.0us( 159us) 2.3us( 55us)(0.01%) 0.8us( 35us) 9930787 98.5% 1.4%(0.14%) hash_table_lock 0.01% 1.3% 1.3us( 17us) 6.1us( 35us)(0.00%) 3.0us( 35us) 45763 98.7% 0%( 1.3%) hash_page_buffers+0x48 0.33% 1.5% 0.3us( 70us) 2.2us( 50us)(0.00%) 0.6us( 11us) 4937208 98.5% 1.3%(0.16%) remove_from_queues+0x18 1.6% 1.6% 1.6us( 159us) 2.3us( 55us)(0.00%) 0.7us( 11us) 4947816 98.4% 1.4%(0.12%) try_to_free_buffers+0x28 0.00% 0.12% 4.0us( 17us) 22us( 31us)(0.00%) 0us 1674 99.9% 0.12%( 0%) tasklist_lock 0.00% 0% 1.1us( 3.7us) 0us 0us 558 100% 0%( 0%) do_fork+0x560 0.00% 0% 9.7us( 17us) 0us 0us 558 100% 0%( 0%) exit_notify+0x190 0.00% 0.36% 1.3us( 5.9us) 22us( 31us)(0.00%) 0us 558 99.6% 0.36%( 0%) release_task+0x8c 0.18% 0.00% 8.8us( 43us) 2.8us( 2.8us)(0.00%) 0us 104208 100% 0.00%( 0%) xtime_lock 0.02% 0.00% 1.7us( 17us) 2.8us( 2.8us)(0.00%) 0us 52104 100% 0.00%( 0%) timer_bh+0xc 0.16% 0% 16us( 43us) 0us 0us 52104 100% 0%( 0%) timer_interrupt+0x10 0.00% 0% 0.2us( 0.4us) 0us 0us 4 100% 0%( 0%) copy_files+0x134 0.00% 0% 0.1us( 0.6us) 0us 0us 121 100% 0%( 0%) do_fcntl+0x140 0.00% 0% 0.1us( 0.4us) 0us 0us 146 100% 0%( 0%) 
do_pipe+0x184 0.00% 0% 0.1us( 0.5us) 0us 0us 146 100% 0%( 0%) do_pipe+0x1c4 0.00% 0% 0.0us( 0.9us) 0us 0us 408 100% 0%( 0%) expand_fd_array+0x13c 0.00% 0% 0.2us( 11us) 0us 0us 412 100% 0%( 0%) expand_fd_array+0x88 0.00% 0% 0.0us( 0.9us) 0us 0us 150 100% 0%( 0%) flush_old_exec+0x230 0.00% 0% 0.0us( 0.2us) 0us 0us 54 100% 0%( 0%) flush_old_exec+0x29c 0.00% 0% 0.0us( 7.9us) 0us 0us 1482627 100% 0%( 0%) get_unused_fd+0x24 0.00% 0% 0.0us( 0.2us) 0us 0us 150 100% 0%( 0%) load_elf_binary+0x18c 0.00% 0% 0.0us( 0.8us) 0us 0us 7994 100% 0%( 0%) rt_check_expire__thr+0x64 0.00% 0% 0.0us( 6.4us) 0us 0us 1480058 100% 0%( 0%) sys_close+0x1c 0.00% 0% 0.1us( 2.6us) 0us 0us 144 100% 0%( 0%) sys_dup2+0x2c 0.00% 0% 0.0us( 5.0us) 0us 0us 1481935 100% 0%( 0%) sys_open+0x60 0.00% 0% 0.0us( 0.1us) 0us 0us 250 100% 0%( 0%) sys_open+0xbc _________________________________________________________________________________________________________________________ Number of read locks found=14 |
From: Dipankar S. <dip...@in...> - 2002-02-22 18:19:05
|
On Fri, Feb 22, 2002 at 09:18:41AM -0800, Hanna Linder wrote:
> 3.1% 3.5% 0.6us(1047us) 3.2us( 979us)(0.07%) 25499542 96.5% 3.5% 0% dcache_lock

IIUC, holding the dcache_lock while walking the path for as long as we can find things in cache should decrease the number of times we grab dcache_lock, but it seems both 2.4.17 and the fast-walk patch take the lock 25 million times. Is something missing here? Or are there too many cache misses? Perhaps we need to investigate that. Decreasing the dcache_lock acquisition count should be a goal here.

Dipankar
--
Dipankar Sarma <dip...@in...> http://lse.sourceforge.net Linux Technology Center, IBM Software Lab, Bangalore, India. |
From: Hanna L. <ha...@us...> - 2002-02-22 18:34:20
|
--On Friday, February 22, 2002 23:52:04 +0530 Dipankar Sarma <dip...@in...> wrote:

> On Fri, Feb 22, 2002 at 09:18:41AM -0800, Hanna Linder wrote:
>> 3.1% 3.5% 0.6us(1047us) 3.2us( 979us)(0.07%) 25499542 96.5% 3.5% 0% dcache_lock
>
> IIUC, holding the dcache_lock while walking the path as long as
> we can find things in cache should decrease the number of times
> we grab dcache_lock, but it seems both 2.4.17 and fast-walk patch

Well, I am actually adding code that holds the dcache_lock where it was not held before (instead of incrementing d_count). So my understanding is that dcache_lock contention should go up a little bit, although perhaps there is more tuning that needs to happen.

Hanna |
From: Rick L. <ric...@us...> - 2002-02-22 19:00:43
|
> Well, I am actually adding holding the dcache_lock where it was not
> held before (instead of incrementing d_count). So my understanding is
> the dcache_lock contention should go up a little bit. Although perhaps
> there is more tuning that needs to happen.

"contention"? or "use"? I think we need to be precise about these terms. I'm defining "contention" as "wanted the lock but it was taken". Contention often depends on interactions with other CPUs and interrupt code, whereas use can often be determined by inspection.

Consider the case where a lock is grabbed and released in some function. If you split that into two grab/release pairs, you increase the "use" of the lock, but if both the original and new hold times are fairly brief, you may not noticeably affect the "contention". Similarly, reducing "use" *may* reduce contention, but only if that's where the bottleneck was.

When I look at lockstat output and want to reduce contention for a particular lock, I start by looking at the call with the longest hold time for that lock. The longer a lock is held, the more likely it is that other requests will pile up behind it. The call with the highest "use" will frequently be that one. Can the hold time be reduced? If it can't, THEN can that long operation's "use" be reduced? Picking away at locks held for 20us usually doesn't yield interesting results when there are places where the lock is held for (say) 100ms.

Of course, many very brief calls for a lock can induce "cache thrash", where more of the time is spent just getting the data structure to the proper cpu than acquiring the actual lock itself. I'm not sure if lockstat measures (or can measure) cache thrash. I'd presume that time spent acquiring the lock (including waiting for it to be brought into your cache) counts toward the "wait" rather than the "hold" time. Does anybody know for sure?

Rick |
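The use-versus-contention distinction above can be made concrete with a small user-space sketch (this is illustrative C11 code, not lockstat's implementation; the counter names `uses` and `contended` are invented for the example): every acquisition bumps `uses`, while only acquisitions that wanted the lock but found it taken bump `contended`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative counters: "use" = total acquisitions,
 * "contention" = acquisitions that found the lock already taken. */
static atomic_flag lk = ATOMIC_FLAG_INIT;
static atomic_long uses;
static atomic_long contended;

static void counted_lock(void)
{
    /* First attempt: if the flag was clear, we got the lock uncontended. */
    if (atomic_flag_test_and_set(&lk)) {
        /* Someone else held it: this acquisition counts as contention. */
        atomic_fetch_add(&contended, 1);
        while (atomic_flag_test_and_set(&lk))
            ; /* spin until the holder releases */
    }
    atomic_fetch_add(&uses, 1);
}

static void counted_unlock(void)
{
    atomic_flag_clear(&lk);
}
```

Splitting one critical section into two doubles `uses`, but `contended` only rises if another CPU actually collides with the holder, which is exactly the distinction being drawn.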
From: Dipankar S. <dip...@in...> - 2002-02-22 20:44:25
|
On Fri, Feb 22, 2002 at 11:00:08AM -0800, Rick Lindsley wrote:
> Of course, many many very brief calls for a lock can induce "cache
> thrash" where more of the time is spent just getting the data structure
> to the proper cpu, not acquiring the actual lock itself. I'm not sure

Hmm... Here is how I understood things - there can be lock cacheline thrashing too if you have a large number of brief acquisitions. If you have a spinlock implementation where you spin-wait reading the location and then do an atomic dec/inc or xchg to acquire the lock, then with a large number of consumers the probability of doing the dec/xchg and finding out that somebody else already got the lock increases significantly. This means a lot of CPUs will modify the lock cacheline and then realize somebody else already got the lock before going back to spinning. More modifications of the lock cacheline -> more bouncing.

> if lockstat measures (or can measure) cache thrash. I'd presume that
> time spent acquiring the lock (including waiting for it to be brought
> into your cache) counts toward the "wait" rather than the "hold" time.
> Does anybody know for sure?

As far as I could understand the lockmeter code, yes. I don't know how we can measure lock cacheline thrash - maybe using PMC instrumentation (L2_LINES_IN)? IMO, a global lock with lots of acquisitions is probably bad even if it is not very contentious.

Thanks
--
Dipankar Sarma <dip...@in...> http://lse.sourceforge.net Linux Technology Center, IBM Software Lab, Bangalore, India. |
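The acquisition pattern described here (spin-wait on a plain read, then attempt the atomic exchange) is the classic test-and-test-and-set lock. A rough user-space sketch with C11 atomics follows; it is an illustration of the pattern, not the kernel's actual spinlock code:

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct { atomic_int locked; } ttas_lock_t;

static void ttas_lock(ttas_lock_t *l)
{
    for (;;) {
        /* Spin on a plain read first: this keeps the lock cacheline in
         * shared state instead of hammering it with writes. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
        /* The lock looked free; now try the atomic exchange.  With many
         * waiters, several CPUs reach this point together, each dirtying
         * the cacheline, and all but one discover the lock is already
         * gone -- the extra modifications that cause the bouncing. */
        if (!atomic_exchange(&l->locked, 1))
            return;
    }
}

static void ttas_unlock(ttas_lock_t *l)
{
    atomic_store(&l->locked, 0);
}
```

With few consumers, most exchanges succeed on the first try; with many, the ratio of failed exchanges (and hence cacheline modifications) climbs, which is the effect being described.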
From: Hanna L. <ha...@us...> - 2002-02-22 22:48:28
|
> to the proper cpu, not acquiring the actual lock itself. I'm not sure
> if lockstat measures (or can measure) cache thrash. I'd presume that

I know lockstat does not measure cache thrashing. This was brought up
to John Hawkes at an lse call a while ago. As far as I recall, no one
had even heard of a tool that could measure cache hits/misses or
thrashing. Sounds like something we could use...

> time spent acquiring the lock (including waiting for it to be brought
> into your cache) counts toward the "wait" rather than the "hold" time.
> Does anybody know for sure?

John Hawkes or Ray Bryant might know.

Hanna
|
From: John H. <ha...@sg...> - 2002-02-27 23:11:13
|
> I'd presume that
> time spent acquiring the lock (including waiting for it to be brought
> into your cache) counts toward the "wait" rather than the "hold" time.
> Does anybody know for sure?

When metering a spin_lock(), lockmeter first does a spin_trylock(). If
that's successful, then "hold" time begins. If the spin_trylock() is
not successful, then the "wait" start-time is captured and an unmetered
spin_lock() is done. When that acquisition completes, the "wait"
end-time is captured.

So the answer is 'no' - the "wait" time doesn't count that initial
spin_trylock(), and the "hold" time doesn't count it either. If people
think this is important, I can easily change it.

John Hawkes
|
From: Dipankar S. <dip...@in...> - 2002-02-22 19:27:25
|
On Fri, Feb 22, 2002 at 10:35:53AM -0800, Hanna Linder wrote:
>
> --On Friday, February 22, 2002 23:52:04 +0530 Dipankar Sarma <dip...@in...> wrote:
>
> > On Fri, Feb 22, 2002 at 09:18:41AM -0800, Hanna Linder wrote:
> >> 3.1% 3.5% 0.6us(1047us) 3.2us( 979us)(0.07%) 25499542 96.5% 3.5% 0% dcache_lock
> >
> > IIUC, holding the dcache_lock while walking the path as long as
> > we can find things in cache should decrease the number of times
> > we grab dcache_lock, but it seems both 2.4.17 and fast-walk patch
>
> Well, I am actually adding holding the dcache_lock where
> it was not held before (instead of incrementing d_count). So my
> understanding is the dcache_lock contention should go up a little
> bit. Although perhaps there is more tuning that needs to happen.

That is not what I understood. My point is this - if you are walking
up /home/hanna/foo/mysql/blah, in 2.4.17 you would have to do 6
d_lookup()s and thus acquire the lock 6 times. If you hold the lock
during cached lookups for as long as you can find them during the
walk, and assuming all of them are in cache, you will have only 1
acquisition of dcache_lock. But, yes, the average lock hold time
should increase.

Does this logic make sense or have I missed something?

Thanks
--
Dipankar Sarma  <dip...@in...> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
|
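[Editorial note] The two walking strategies being compared can be modeled in a few lines of userspace C. Everything here is a stand-in - a pthread mutex for dcache_lock, `d_lookup_stub()` for the real cached lookup, a counter for lockstat's acquisition count - but it makes the per-component vs. held-across-the-walk difference concrete:

```c
#include <pthread.h>

static pthread_mutex_t dcache_lock = PTHREAD_MUTEX_INITIALIZER;
static int acquisitions;   /* stand-in for lockstat's TOTAL column */

static void grab(void)    { pthread_mutex_lock(&dcache_lock); acquisitions++; }
static void release(void) { pthread_mutex_unlock(&dcache_lock); }

/* Hypothetical cached-dentry lookup; always hits in this toy model. */
static int d_lookup_stub(const char *name) { (void)name; return 1; }

/* 2.4.17 style: one lock round-trip per path component. */
int walk_per_component(const char **comp, int n)
{
    for (int i = 0; i < n; i++) {
        grab();
        int hit = d_lookup_stub(comp[i]);
        release();
        if (!hit)
            return -1;
    }
    return 0;
}

/* Fast-walk idea: one acquisition held across all cached components.
 * Fewer acquisitions, but a longer average hold time. */
int walk_held(const char **comp, int n)
{
    grab();
    for (int i = 0; i < n; i++) {
        if (!d_lookup_stub(comp[i])) {
            release();
            return -1;
        }
    }
    release();
    return 0;
}
```

For a fully cached path, `walk_per_component()` acquires the lock once per component while `walk_held()` acquires it exactly once, which is the tradeoff Dipankar and Hanna are debating: fewer acquisitions against longer holds.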
From: Hanna L. <ha...@us...> - 2002-02-22 19:31:39
|
>> > IIUC, holding the dcache_lock while walking the path as long as
>> > we can find things in cache should decrease the number of times
>> > we grab dcache_lock, but it seems both 2.4.17 and fast-walk patch
>>
>> Well, I am actually adding holding the dcache_lock where
>> it was not held before (instead of incrementing d_count). So my
>> understanding is the dcache_lock contention should go up a little
>> bit. Although perhaps there is more tuning that needs to happen.
>
> That is not what I understood.
>
> My point is this - if you are walking up /home/hanna/foo/mysql/blah,
> in 2.4.17 you would have to do 6 d_lookup()s and thus acquire the
> lock 6 times. If you hold the lock during cached lookups as long

Yes, that is the way I understand it too. But with many CPUs all trying
to access / and home for their path lookups too, won't that increase
contention because the dcache_lock is being held longer by one cpu?

Hanna
|
From: Dipankar S. <dip...@in...> - 2002-02-22 19:43:43
|
On Fri, Feb 22, 2002 at 11:33:12AM -0800, Hanna Linder wrote:
>
> > My point is this - if you are walking up /home/hanna/foo/mysql/blah,
> > in 2.4.17 you would have to do 6 d_lookup()s and thus acquire the
> > lock 6 times. If you hold the lock during cached lookups as long
>
> Yes that is the way I understand it too. But with many CPU's
> all trying to access / and home for their path lookups too then wont
> that increase contention because the dcache_lock is being held longer
> by one cpu?

Yes, that is a possibility. Both a long average lock hold time and a
large number of acquisitions with shorter hold times can lead to a high
level of contention. It would depend on the statistical distribution of
the lock requests for that workload. What we need to do is determine
which is good for most "real" workloads. It may happen that a slightly
longer lock hold time with a drastic drop in acquisitions gives us good
performance because of less contention *and* reduced cacheline bouncing.

Besides, there are always lock-free lookup techniques if necessary ;-)

Thanks
--
Dipankar Sarma  <dip...@in...> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
|
From: Dipankar S. <dip...@in...> - 2002-02-22 20:12:59
|
On Fri, Feb 22, 2002 at 11:33:12AM -0800, Hanna Linder wrote:
>
> >> > IIUC, holding the dcache_lock while walking the path as long as
> >> > we can find things in cache should decrease the number of times
> >> > we grab dcache_lock, but it seems both 2.4.17 and fast-walk patch
> >>
> >> Well, I am actually adding holding the dcache_lock where
> >> it was not held before (instead of incrementing d_count). So my
> >> understanding is the dcache_lock contention should go up a little
> >> bit. Although perhaps there is more tuning that needs to happen.
> >
> > That is not what I understood.
> >
> > My point is this - if you are walking up /home/hanna/foo/mysql/blah,
> > in 2.4.17 you would have to do 6 d_lookup()s and thus acquire the
> > lock 6 times. If you hold the lock during cached lookups as long
>
> Yes that is the way I understand it too. But with many CPU's

Looking at the lockstat output, of the 25 million acquisitions, 21
million come from d_lookup() - roughly the same distribution as
2.4.17. For some reason, cached_lookup() is getting called without the
dcache_lock held most of the time with fast-walk. I would have
expected dcache_lock to be held for a larger number of
cached_lookup()s, which would avoid having to acquire the dcache_lock
again by using __d_lookup().

Thanks
--
Dipankar Sarma  <dip...@in...> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.
|
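[Editorial note] The d_lookup()/__d_lookup() split Dipankar refers to follows a common kernel pattern: a locked wrapper around an inner helper that assumes the caller already holds the lock. A toy sketch (stub types, a pthread mutex, and a one-entry "cache" - not the real kernel signatures):

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

static pthread_mutex_t dcache_lock = PTHREAD_MUTEX_INITIALIZER;

struct dentry { const char *name; };

static struct dentry cached = { "hanna" };   /* toy one-entry cache */

/* Inner helper: caller must already hold dcache_lock, so a walk that
 * keeps the lock across components can call this repeatedly without
 * re-acquiring it. */
struct dentry *__d_lookup_stub(const char *name)
{
    return strcmp(name, cached.name) == 0 ? &cached : NULL;
}

/* Wrapper for callers that do not hold the lock: one acquisition per
 * lookup, as in the pre-fast-walk code path. */
struct dentry *d_lookup_stub(const char *name)
{
    pthread_mutex_lock(&dcache_lock);
    struct dentry *d = __d_lookup_stub(name);
    pthread_mutex_unlock(&dcache_lock);
    return d;
}
```

Dipankar's observation is that with fast-walk he expected most lookups to arrive through the already-locked path (the `__d_lookup` analogue), but the lockstat data shows them still going through the locking wrapper.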
From: Hanna L. <ha...@us...> - 2002-02-22 23:05:13
|
> 2.4.17. For some reason, cached_lookup() is getting called
> without the dcache_lock held most of the time with fast_walk.

That is probably because I need to hook up more syscalls to use
path_lookup (the new name for path_init_walk). Everything except vfs is
still using the old path-walk path. I am going to look into extending
it to other filesystems.

Hanna
|