From: Paul S. <pa...@up...> - 2008-08-09 15:12:43
Hi Miklos and other filesystem gurus,

I upgraded the kernel on my PC from 2.6.20.11 to 2.6.26, and FUSE to 2.8.0 from CVS (checked out yesterday). I see very slow performance compared to 2.6.20.11. I like to use bonnie++ to test certain aspects of my filesystem. Running "time bonnie++ -uroot -s0 -n100" under 2.6.20.11 gives:

Version 1.03       ------Sequential Create------ --------Random Create--------
debian             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                100 24592  29 +++++ +++ 37742  23 22396  26 +++++ +++ 36432  23
debian,,,,,,,,,,,,,,100,24592,29,+++++,+++,37742,23,22396,26,+++++,+++,36432,23

User: 0.312  Kernel: 3.876  Total: 0:14.93  CPU: 27.9%

The same command under 2.6.26 gives:

Version 1.03       ------Sequential Create------ --------Random Create--------
debian             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                100  8054  37 146627 41 13037  31  7790  34 +++++ +++ 13432  32
debian,,,,,,,,,,,,,,100,8054,37,146627,41,13037,31,7790,34,+++++,+++,13432,32

User: 0.368  Kernel: 14.608  Total: 0:42.47  CPU: 35.2%

The newer kernel is slower and uses a lot more CPU: 14.93 s vs. 42.47 s total time, and 27.9% vs. 35.2% CPU usage. I ran "opreport -l" on both but could not isolate the problem.
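(Side note, in case anyone wants to diff such runs mechanically rather than by eye: a minimal shell sketch, assuming the bonnie++ 1.03 CSV layout shown above, where the sequential-create rate is the 16th comma-separated field.)

```shell
#!/bin/sh
# Sketch: pull the sequential-create rate (files/sec) out of the
# machine-readable CSV line that bonnie++ 1.03 prints.  In that layout
# field 15 is the file count and field 16 is the seq-create rate.
seq_create_rate() {
    echo "$1" | awk -F, '{ print $16 }'
}
```

Applied to the two CSV lines above, this prints 24592 and 8054, i.e. roughly a 3x drop in create rate between the two kernels.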
2.6.20.11:

CPU: Core 2, speed 2397.7 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000

samples  %        app name                symbol name
 19619    4.7623  vmlinux-2.6.20.11       mwait_idle_with_hints
 11306    2.7444  vmlinux-2.6.20.11       fuse_dev_read
 10661    2.5878  vmlinux-2.6.20.11       system_call
  8945    2.1713  libc-2.3.6.so           strlen
  8081    1.9616  vmlinux-2.6.20.11       task_rq_lock
  8049    1.9538  libc-2.3.6.so           strncpy
  7901    1.9179  vmlinux-2.6.20.11       kunmap_atomic
  7019    1.7038  libJudy.so.1.0.3        j__udyLGet
  6931    1.6824  vmlinux-2.6.20.11       try_to_wake_up
  6909    1.6771  libpthread-2.3.6.so     pthread_mutex_lock
  6582    1.5977  libc-2.3.6.so           _int_malloc
  6108    1.4826  libc-2.3.6.so           memset
  5642    1.3695  libJudy.so.1.0.3        j__udyInsWalk
  5345    1.2974  libpthread-2.3.6.so     pthread_mutex_unlock
  4944    1.2001  libJudy.so.1.0.3        JudyLNext
  4834    1.1734  vmlinux-2.6.20.11       find_busiest_group
  4665    1.1324  libJudy.so.1.0.3        j__udyDelWalk
  4661    1.1314  vmlinux-2.6.20.11       prepare_to_wait
  4613    1.1197  libJudy.so.1.0.3        JudySLGet
-----snip-----

2.6.26:

samples  %        app name                symbol name
209608   20.9357  vmlinux-2.6.26          mach_get_cmos_time
 56031    5.5964  vmlinux-2.6.26          lock_acquired
 42243    4.2192  vmlinux-2.6.26          __lock_acquire
 30939    3.0902  vmlinux-2.6.26          clear_lock_stats
 27708    2.7675  vmlinux-2.6.26          lock_stats
 22123    2.2096  vmlinux-2.6.26          __rwlock_init
 20888    2.0863  vmlinux-2.6.26          lockdep_reset
 20378    2.0354  vmlinux-2.6.26          in_gate_area
 18412    1.8390  vmlinux-2.6.26          lockdep_free_key_range
 17508    1.7487  vmlinux-2.6.26          lockdep_init_map
 17351    1.7330  vmlinux-2.6.26          irq_entries_start
 12552    1.2537  vmlinux-2.6.26          count_matching_names
 10445    1.0432  vmlinux-2.6.26          fuse_copy_args
  9208    0.9197  libpthread-2.3.6.so     pthread_mutex_lock
  9022    0.9011  libc-2.3.6.so           _int_malloc
  9020    0.9009  libJudy.so.1.0.3        j__udyLGet
  8721    0.8711  libc-2.3.6.so           strlen
  8564    0.8554  vmlinux-2.6.26          unmap_vmas
  8558    0.8548  libc-2.3.6.so           strncpy
  7609    0.7600  vmlinux-2.6.26          __spin_lock_init
  7275    0.7266  libpthread-2.3.6.so     pthread_mutex_unlock
  6647    0.6639  libc-2.3.6.so           memset

One can see from this that there is a lot more kernel overhead in 2.6.26. "mach_get_cmos_time" is by far the most-sampled function; it would probably help performance a lot if I could eliminate it. I therefore tried including/excluding various RTC modules during the kernel build, but none made a difference. Did I mess up the kernel config? If so, please tell me what to include and exclude. Or maybe this is the price for "mmap and big_writes"?

If anyone has any insight into this, please let me know.

Regards,
Paul
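P.S. Several of the hot 2.6.26 symbols (lock_acquired, lock_stats, lockdep_*) only exist when kernel lock debugging is compiled in, so it may be worth comparing that part of the two configs. A hedged sketch for listing the relevant options; the config file locations in the comment are the usual ones but not guaranteed on every distro:

```shell
#!/bin/sh
# Sketch: list lock-debugging options in a kernel config file.
# Typical locations for the running kernel's config are
# /boot/config-$(uname -r), or /proc/config.gz when
# CONFIG_IKCONFIG_PROC is enabled.
check_lock_debug() {
    grep -E '^CONFIG_(LOCK_STAT|PROVE_LOCKING|DEBUG_LOCK_ALLOC|LOCKDEP|DEBUG_SPINLOCK)=' "$1"
}
```

For example, `check_lock_debug /boot/config-2.6.26` would show whether any of these debug options ended up enabled in the new build.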