From: Karl R. <km...@us...> - 2008-05-01 23:01:29
Hi

I have been trying to do some testing of a large number of guests (72) on a
big multi-node IBM box (8 sockets, 32 cores, 128GB), and I am having various
issues with the guests. I can get the guests to boot, but then I start to
have problems. Some guests appear to stall doing I/O, and some become
unresponsive and spin their single vcpu at 100%.

Each guest is configured with 1 vcpu and 1000MB of memory. The single
virtual disk is backed by an LVM volume. Both the guest and host are running
custom kernels. I have tried kvm-67, kvm-64, and kvm-62 (not functional at
all). I have cloned both the kvm and kvm-userspace repositories and am
building the tagged changesets from each.

Here are a few of the various things I have tried: virtio and emulated
devices for the nic and disk; mixed virtio and emulated devices; kvm-clock
and clock=jiffies.

Any help in pinpointing the problem would be appreciated.

Thanks.

--
Karl Rister
IBM Linux Performance Team
km...@us...
(512) 838-1553 (t/l 678)
From: Marcelo T. <mto...@re...> - 2008-05-02 00:13:49
On Thu, May 01, 2008 at 06:00:44PM -0500, Karl Rister wrote:
> Hi
>
> I have been trying to do some testing of a large number of guests (72) on
> a big multi-node IBM box (8 sockets, 32 cores, 128GB) and I am having
> various issues with the guests. I can get the guests to boot, but then I
> start to have problems. Some guests appear to stall doing I/O and some
> become unresponsive and spin their single vcpu at 100%.

Does -no-kvm-irqchip or -no-kvm-pit make a difference? If not, please grab
kvm_stat --once output when that happens.

Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with the
host booted with "profile=kvm". Make sure all guests are running the same
kernel image.

The profiling should be easier to understand if you have 1 guest spinning
and the remaining ones idle.
From: Karl R. <km...@us...> - 2008-05-02 19:21:20
On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> On Thu, May 01, 2008 at 06:00:44PM -0500, Karl Rister wrote:
> > Hi
> >
> > I have been trying to do some testing of a large number of guests (72)
> > on a big multi-node IBM box (8 sockets, 32 cores, 128GB) and I am having
> > various issues with the guests. I can get the guests to boot, but then I
> > start to have problems. Some guests appear to stall doing I/O and some
> > become unresponsive and spin their single vcpu at 100%.
>
> Does -no-kvm-irqchip or -no-kvm-pit make a difference? If not, please
> grab kvm_stat --once output when that happens.

I have tried -no-kvm-irqchip and it didn't help any. I will try -no-kvm-pit
and get the kvm_stat info for both.

> Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with
> the host booted with "profile=kvm". Make sure all guests are running the
> same kernel image.

Will do.

> The profiling should be easier to understand if you have 1 guest spinning
> and the remaining ones idle.

--
Karl Rister
IBM Linux Performance Team
km...@us...
(512) 838-1553 (t/l 678)
From: Karl R. <km...@us...> - 2008-05-06 15:10:42
On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> Does -no-kvm-irqchip or -no-kvm-pit make a difference? If not, please
> grab kvm_stat --once output when that happens.

Per some suggestions I have moved up to kvm-68, which is better, but I am
still having problems. Replicating the problem with only one guest spinning
has proven quite difficult, but attempting to boot a large smp guest can
reliably recreate the problem. Using -no-kvm-pit did not help the large
guest, and -no-kvm-irqchip made it seize up even earlier, with only 1 cpu
spinning instead of all of them.

> Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with
> the host booted with "profile=kvm". Make sure all guests are running the
> same kernel image.

I got this from a spinning 16-way guest with only 8 of the host CPUs online
and without either -no-kvm-irqchip or -no-kvm-pit:

[root@newcastle ~]# readprofile -r ; readprofile -m karl/System.map-2.6.25-03591-g873c05f
   101 native_read_tsc          3.4828
     1 read_persistent_clock    0.0192
    25 kvm_clock_read           0.2660
    95 getnstimeofday           0.7252
    13 update_wall_time         0.0138
     1 second_overflow          0.0020
readprofile: profile address out of range. Wrong map file?

The kvm_stat output during this is:

[root@newcastle ~]# kvm_stat --once
efer_reload            23354      0
exits                3587109   2250
fpu_reload           1934298      0
halt_exits              4583      0
halt_wakeup               42      0
host_state_reload    2165502    167
hypercalls              1482      0
insn_emulation        900199      0
insn_emulation_fail        0      0
invlpg                     0      0
io_exits             1983116      0
irq_exits             427728   2250
irq_window                 0      0
largepages                 0      0
mmio_exits            163522      0
mmu_cache_miss           176      0
mmu_flooded               99      0
mmu_pde_zapped           191      0
mmu_pte_updated           10      0
mmu_pte_write          59030      0
mmu_recycled               0      0
mmu_shadow_zapped         99      0
pf_fixed               14890      0
pf_guest                   0      0
remote_tlb_flush          29      0
request_irq                0      0
signal_exits               1      0
tlb_flush             481952      0

The output with -no-kvm-pit looked almost identical, and with
-no-kvm-irqchip there were no samples registered for either tool.

--
Karl Rister
IBM Linux Performance Team
km...@us...
(512) 838-1553 (t/l 678)
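[Editor's note: a small sketch of how to rank readprofile's three-column
"samples symbol load" output by sample count, to spot the hottest guest
symbols quickly. The sample data is copied from the run above; readprofile
itself prints in map order, so sorting is done externally.]

```shell
# readprofile output from the spinning guest, copied from the report above.
profile='101 native_read_tsc 3.4828
1 read_persistent_clock 0.0192
25 kvm_clock_read 0.2660
95 getnstimeofday 0.7252
13 update_wall_time 0.0138
1 second_overflow 0.0020'

# Sort numerically and descending on the first column (sample count),
# then keep the top three entries.
echo "$profile" | sort -k1,1nr | head -3
```

With the data above this surfaces native_read_tsc, getnstimeofday, and
kvm_clock_read as the top symbols, i.e. the guest is spending its time in
timekeeping paths.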
From: Marcelo T. <mto...@re...> - 2008-05-06 16:31:58
Hi Karl,

On Mon, May 05, 2008 at 08:40:22PM -0500, Karl Rister wrote:
> On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> > Does -no-kvm-irqchip or -no-kvm-pit make a difference? If not, please
> > grab kvm_stat --once output when that happens.
>
> Per some suggestions I have moved up to kvm-68 which is better, but still
> having problems. Replicating the problem with only one guest spinning has
> proven quite difficult, but attempting to boot a large smp guest can
> reliably recreate the problem. Using -no-kvm-pit did not help the large
> guest and -no-kvm-irqchip made it seize up even earlier with only 1 cpu
> spinning instead of all of them.
>
> > Also run "readprofile -r ; readprofile -m System-map-of-guest.map" with
> > the host booted with "profile=kvm". Make sure all guests are running
> > the same kernel image.
>
> I got this from a spinning 16-way guest with only 8 of the host CPUs
> online and without either -no-kvm-irqchip or -no-kvm-pit:
>
> [root@newcastle ~]# readprofile -r ; readprofile -m karl/System.map-2.6.25-03591-g873c05f
>    101 native_read_tsc          3.4828
>      1 read_persistent_clock    0.0192
>     25 kvm_clock_read           0.2660
>     95 getnstimeofday           0.7252
>     13 update_wall_time         0.0138
>      1 second_overflow          0.0020
> readprofile: profile address out of range. Wrong map file?

KVM clock has known problems with SMP guests; please disable it for now.
Also disable LOCKDEP on the guest if it has more VCPUs than CPUs available
in the host.
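[Editor's note: a sketch of what "disable it" might look like in practice.
The parameter and config names below are assumptions to check against the
guest kernel's Documentation/kernel-parameters.txt; `no-kvmclock` in
particular appeared around 2.6.26, so an older guest may instead need a
different clocksource selected explicitly.]

```
# Guest kernel command line (root device shown is illustrative):
kernel /vmlinuz ro root=/dev/vda1 no-kvmclock clocksource=acpi_pm

# Guest .config: turn off lock dependency checking, which is what
# selects LOCKDEP:
# CONFIG_PROVE_LOCKING is not set
```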
From: Avi K. <av...@qu...> - 2008-05-07 07:56:34
Karl Rister wrote:
> On Thursday 01 May 2008 7:16:53 pm Marcelo Tosatti wrote:
> > Does -no-kvm-irqchip or -no-kvm-pit make a difference? If not, please
> > grab kvm_stat --once output when that happens.
>
> Per some suggestions I have moved up to kvm-68 which is better, but still
> having problems. Replicating the problem with only one guest spinning has
> proven quite difficult, but attempting to boot a large smp guest can
> reliably recreate the problem. Using -no-kvm-pit did not help the large
> guest and -no-kvm-irqchip made it seize up even earlier with only 1 cpu
> spinning instead of all of them.

Can you try the many-uniprocessor-guests scenario, with each guest pinned
to a cpu?

  taskset $(( 1 << (RANDOM % 32) )) qemu ...

--
error compiling committee.c: too many arguments to function
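[Editor's note: a sketch of a launcher loop along the lines Avi suggests.
Guest count, disk paths, and the qemu binary name are illustrative, not
from the thread. Two small tweaks to the one-liner: taskset interprets its
mask argument as hexadecimal, so the mask is formatted with 0x%x, and a
round-robin assignment is used instead of RANDOM so the 72 guests spread
evenly over the cores.]

```shell
NCPUS=32
NGUESTS=72

for i in $(seq 0 $(( NGUESTS - 1 ))); do
    cpu=$(( i % NCPUS ))      # round-robin CPU assignment
    mask=$(( 1 << cpu ))      # single-CPU affinity mask for taskset
    # Print the command rather than running it, so the mapping can be
    # inspected first; drop the leading "echo" to actually launch.
    echo taskset "$(printf '0x%x' "$mask")" \
        qemu-system-x86_64 -m 1000 -smp 1 -hda "/dev/vg/guest$i"
done
```

Pinning this way keeps each spinning vcpu on one socket, which also makes
the unsynchronized-TSC theory easier to test.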
From: Avi K. <av...@qu...> - 2008-05-04 08:41:18
Karl Rister wrote:
> Hi
>
> I have been trying to do some testing of a large number of guests (72) on
> a big multi-node IBM box (8 sockets, 32 cores, 128GB) and I am having
> various issues with the guests. I can get the guests to boot, but then I
> start to have problems. Some guests appear to stall doing I/O and some
> become unresponsive and spin their single vcpu at 100%.

One of the problems with these large boxes is that their TSCs are not
synced across sockets; you may be hitting related issues. Can you try
configuring the guests not to use the tsc?

Also, if you are running on an old host kernel, you won't have
smp_call_function_single() and there will be many broadcast IPIs. Please
use a recent host kernel (kvm.git is best, though a bit bleeding edge).

--
error compiling committee.c: too many arguments to function
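[Editor's note: a quick check, not from the thread, for whether a guest is
actually using the tsc. The sysfs path below is the one used by 2.6.2x-era
kernels; the guard keeps the snippet harmless on systems where it is
absent.]

```shell
# Report the clocksource the kernel is currently using (e.g. tsc,
# kvm-clock, acpi_pm, jiffies), if the sysfs node exists.
f=/sys/devices/system/clocksource/clocksource0/current_clocksource
if [ -r "$f" ]; then
    cat "$f"
else
    echo "clocksource sysfs not available"
fi
```

Running this inside each guest before and after changing boot parameters
confirms whether the tsc has really been switched away from.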