From: Yunfeng Z. <yun...@in...> - 2008-04-24 10:30:27
Hi All,

This is today's KVM test result against kvm.git
873c05fa7e6fea27090b1bf0f67a073eadb04782 and kvm-userspace.git
d102d750f397b543fe620a3c77a7e5e42c483865.

In today's nightly testing, we hit a host hang several times while booting
multiple guests. This issue can be easily reproduced.

Two Old Issues:
================================================
1. Booting four guests likely fails
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599
2. Cannot boot guests with hugetlbfs
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599

Test environment
================================================
Platform     Woodcrest
CPU          4
Memory size  8G

Details
================================================
IA32-pae:
 1. boot guest with 256M memory                        PASS
 2. boot two windows xp guest                          PASS
 3. boot 4 same guest in parallel                      PASS
 4. boot linux and windows guest in parallel           PASS
 5. boot guest with 1500M memory                       PASS
 6. boot windows 2003 with ACPI enabled                PASS
 7. boot Windows xp with ACPI enabled                  PASS
 8. boot Windows 2000 without ACPI                     PASS
 9. kernel build on SMP linux guest                    PASS
10. LTP on linux guest                                 PASS
11. boot base kernel linux                             PASS
12. save/restore 32-bit HVM guests                     PASS
13. live migration 32-bit HVM guests                   PASS
14. boot SMP Windows xp with ACPI enabled              PASS
15. boot SMP Windows 2003 with ACPI enabled            PASS
16. boot SMP Windows 2000 with ACPI enabled            PASS

================================================
IA32e:
 1. boot four 32-bit guest in parallel                 PASS
 2. boot four 64-bit guest in parallel                 PASS
 3. boot 4G 64-bit guest                               PASS
 4. boot 4G pae guest                                  PASS
 5. boot 32-bit linux and 32 bit windows guest in parallel  PASS
 6. boot 32-bit guest with 1500M memory                PASS
 7. boot 64-bit guest with 1500M memory                PASS
 8. boot 32-bit guest with 256M memory                 PASS
 9. boot 64-bit guest with 256M memory                 PASS
10. boot two 32-bit windows xp in parallel             PASS
11. boot four 32-bit different guest in para           PASS
12. save/restore 64-bit linux guests                   PASS
13. save/restore 32-bit linux guests                   PASS
14. boot 32-bit SMP windows 2003 with ACPI enabled     PASS
15. boot 32-bit SMP Windows 2000 with ACPI enabled     PASS
16. boot 32-bit SMP Windows xp with ACPI enabled       PASS
17. boot 32-bit Windows 2000 without ACPI              PASS
18. boot 64-bit Windows xp with ACPI enabled           PASS
19. boot 32-bit Windows xp without ACPI                PASS
20. boot 64-bit UP vista                               PASS
21. boot 64-bit SMP vista                              PASS
22. kernel build in 32-bit linux guest OS              PASS
23. kernel build in 64-bit linux guest OS              PASS
24. LTP on 32-bit linux guest OS                       PASS
25. LTP on 64-bit linux guest OS                       PASS
26. boot 64-bit guests with ACPI enabled               PASS
27. boot 32-bit x-server                               PASS
28. boot 64-bit SMP windows XP with ACPI enabled       PASS
29. boot 64-bit SMP windows 2003 with ACPI enabled     PASS
30. live migration 64bit linux guests                  PASS
31. live migration 32bit linux guests                  PASS
32. reboot 32bit windows xp guest                      PASS
33. reboot 32bit windows xp guest                      PASS

Report Summary on IA32-pae

Summary Test Report of Last Session
=====================================================================
                            Total   Pass   Fail   NoResult   Crash
=====================================================================
control_panel                   7      7      0          0       0
Restart                         2      2      0          0       0
gtest                          15     15      0          0       0
=====================================================================
control_panel                   7      7      0          0       0
 :KVM_LM_PAE_gPAE               1      1      0          0       0
 :KVM_four_sguest_PAE_gPA       1      1      0          0       0
 :KVM_256M_guest_PAE_gPAE       1      1      0          0       0
 :KVM_linux_win_PAE_gPAE        1      1      0          0       0
 :KVM_1500M_guest_PAE_gPA       1      1      0          0       0
 :KVM_SR_PAE_gPAE               1      1      0          0       0
 :KVM_two_winxp_PAE_gPAE        1      1      0          0       0
Restart                         2      2      0          0       0
 :GuestPAE_PAE_gPAE             1      1      0          0       0
 :BootTo32pae_PAE_gPAE          1      1      0          0       0
gtest                          15     15      0          0       0
 :ltp_nightly_PAE_gPAE          1      1      0          0       0
 :boot_up_acpi_PAE_gPAE         1      1      0          0       0
 :reboot_xp_PAE_gPAE            1      1      0          0       0
 :boot_up_vista_PAE_gPAE        1      1      0          0       0
 :boot_up_acpi_xp_PAE_gPA       1      1      0          0       0
 :boot_up_acpi_win2k3_PAE       1      1      0          0       0
 :boot_base_kernel_PAE_gP       1      1      0          0       0
 :boot_smp_acpi_win2k3_PA       1      1      0          0       0
 :boot_smp_acpi_win2k_PAE       1      1      0          0       0
 :boot_up_acpi_win2k_PAE_       1      1      0          0       0
 :boot_smp_acpi_xp_PAE_gP       1      1      0          0       0
 :boot_up_noacpi_win2k_PA       1      1      0          0       0
 :boot_smp_vista_PAE_gPAE       1      1      0          0       0
 :bootx_PAE_gPAE                1      1      0          0       0
 :kb_nightly_PAE_gPAE           1      1      0          0       0
=====================================================================
Total                          24     24      0          0       0

Report Summary on IA32e

Summary Test Report of Last Session
=====================================================================
                            Total   Pass   Fail   NoResult   Crash
=====================================================================
control_panel                  15     14      1          0       0
Restart                         3      3      0          0       0
gtest                          25     25      0          0       0
=====================================================================
control_panel                  15     14      1          0       0
 :KVM_LM_64_g64                 1      1      0          0       0
 :KVM_four_sguest_64_gPAE       1      1      0          0       0
 :KVM_4G_guest_64_g64           1      1      0          0       0
 :KVM_four_sguest_64_g64        1      1      0          0       0
 :KVM_linux_win_64_gPAE         1      1      0          0       0
 :KVM_1500M_guest_64_gPAE       1      1      0          0       0
 :KVM_SR_64_g64                 1      0      1          0       0
 :KVM_LM_64_gPAE                1      1      0          0       0
 :KVM_256M_guest_64_g64         1      1      0          0       0
 :KVM_1500M_guest_64_g64        1      1      0          0       0
 :KVM_4G_guest_64_gPAE          1      1      0          0       0
 :KVM_SR_64_gPAE                1      1      0          0       0
 :KVM_256M_guest_64_gPAE        1      1      0          0       0
 :KVM_two_winxp_64_gPAE         1      1      0          0       0
 :KVM_four_dguest_64_gPAE       1      1      0          0       0
Restart                         3      3      0          0       0
 :GuestPAE_64_gPAE              1      1      0          0       0
 :BootTo64_64_gPAE              1      1      0          0       0
 :Guest64_64_gPAE               1      1      0          0       0
gtest                          25     25      0          0       0
 :boot_up_acpi_64_gPAE          1      1      0          0       0
 :boot_up_noacpi_xp_64_gP       1      1      0          0       0
 :boot_smp_acpi_xp_64_g64       1      1      0          0       0
 :boot_base_kernel_64_gPA       1      1      0          0       0
 :boot_smp_acpi_win2k3_64       1      1      0          0       0
 :boot_smp_acpi_win2k_64_       1      1      0          0       0
 :boot_base_kernel_64_g64       1      1      0          0       0
 :bootx_64_gPAE                 1      1      0          0       0
 :kb_nightly_64_gPAE            1      1      0          0       0
 :ltp_nightly_64_g64            1      1      0          0       0
 :boot_up_acpi_64_g64           1      1      0          0       0
 :boot_up_noacpi_win2k_64       1      1      0          0       0
 :boot_smp_acpi_xp_64_gPA       1      1      0          0       0
 :boot_smp_vista_64_gPAE        1      1      0          0       0
 :boot_up_acpi_win2k3_64_       1      1      0          0       0
 :reboot_xp_64_gPAE             1      1      0          0       0
 :bootx_64_g64                  1      1      0          0       0
 :boot_up_vista_64_g64          1      1      0          0       0
 :boot_smp_vista_64_g64         1      1      0          0       0
 :boot_up_acpi_xp_64_g64        1      1      0          0       0
 :boot_up_vista_64_gPAE         1      1      0          0       0
 :ltp_nightly_64_gPAE           1      1      0          0       0
 :boot_smp_acpi_win2k3_64       1      1      0          0       0
 :boot_up_noacpi_win2k3_6       1      1      0          0       0
 :kb_nightly_64_g64             1      1      0          0       0
=====================================================================
Total                          43     42      1          0       0

Best Regards,
Yunfeng
From: Avi K. <av...@qu...> - 2008-04-24 11:37:06
Yunfeng Zhao wrote:
> Hi All,
>
> This is today's KVM test result against kvm.git
> 873c05fa7e6fea27090b1bf0f67a073eadb04782 and kvm-userspace.git
> d102d750f397b543fe620a3c77a7e5e42c483865.
>

I suspect 873c05fa7e6fea27090b1bf0f67a073eadb04782 itself, it's the only
thing that has any chance of badness.

Marcelo, any idea?  Perhaps due to load, interrupts accumulate and can't
be injected fast enough?

These tests are run on a 2.6.22 host, which has a hacked
smp_call_function_single() in external-module-compat.h, which may
exacerbate the problem.

--
error compiling committee.c: too many arguments to function
From: Yang, S. <she...@in...> - 2008-04-24 12:55:05
On Thursday 24 April 2008 19:37:03 Avi Kivity wrote:
> Yunfeng Zhao wrote:
> > Hi All,
> >
> > This is today's KVM test result against kvm.git
> > 873c05fa7e6fea27090b1bf0f67a073eadb04782 and kvm-userspace.git
> > d102d750f397b543fe620a3c77a7e5e42c483865.
>
> I suspect 873c05fa7e6fea27090b1bf0f67a073eadb04782 itself, it's the only
> thing that has any chance of badness.
>
> Marcelo, any idea?  Perhaps due to load, interrupts accumulate and can't
> be injected fast enough?
>
> These tests are run on a 2.6.22 host, which has a hacked
> smp_call_function_single() in external-module-compat.h, which may
> exacerbate the problem.

Yeah, I suspect the commit too (I tried tip without it, and found it mostly
all right). In fact, I didn't use kvm_vcpu_kick() precisely because I found
this function may cause a hang on my host... But I didn't investigate
further, so I can't tell what's wrong; I just chose a way to keep it
working... I am sorry for not clarifying...

--
Thanks
Yang, Sheng
From: Avi K. <av...@qu...> - 2008-04-24 13:00:23
Yang, Sheng wrote:
> On Thursday 24 April 2008 19:37:03 Avi Kivity wrote:
>
>> Yunfeng Zhao wrote:
>>
>>> Hi All,
>>>
>>> This is today's KVM test result against kvm.git
>>> 873c05fa7e6fea27090b1bf0f67a073eadb04782 and kvm-userspace.git
>>> d102d750f397b543fe620a3c77a7e5e42c483865.
>>>
>> I suspect 873c05fa7e6fea27090b1bf0f67a073eadb04782 itself, it's the only
>> thing that has any chance of badness.
>>
>> Marcelo, any idea?  Perhaps due to load, interrupts accumulate and can't
>> be injected fast enough?
>>
>> These tests are run on a 2.6.22 host, which has a hacked
>> smp_call_function_single() in external-module-compat.h, which may
>> exacerbate the problem.
>>
>
> Yeah, I suspect the commit too (I tried tip without it, and found it mostly
> all right). In fact, I didn't use kvm_vcpu_kick() precisely because I found
> this function may cause a hang on my host... But I didn't investigate
> further, so I can't tell what's wrong; I just chose a way to keep it
> working... I am sorry for not clarifying...
>

I think smp_call_function_single() is miscompiled when using the
compatibility code.  I took it out-of-line to be sure (it is now in
kernel/external-module-compat.c).

No evidence, but...

--
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-04-24 13:44:37
Chris Lalancette wrote:
> Avi Kivity wrote:
>
>> Ok.  __pit_timer_fn() is called from an interrupt, which then calls
>> smp_call_function_single(), which calls spin_lock().  If we've already
>> taken the lock, we hang.
>>
>>
>
> Ah.  Just adding a "me too"; I didn't get a chance to debug it yesterday, but I
> was seeing similar problems.  If I disabled in-kernel pit with -no-kvm-pit, all
> was well.
>

How to fix it, though?  The only idea that comes to mind is to affine
the hrtimer with vcpu0 (like the local apic timers), which would mean we
only need to unwait the waitqueue, and never need to send the IPI.
Would slightly improve performance as well.

--
error compiling committee.c: too many arguments to function
From: Marcelo T. <mto...@re...> - 2008-04-24 16:28:55
On Thu, Apr 24, 2008 at 04:44:27PM +0300, Avi Kivity wrote:
> Chris Lalancette wrote:
> >Avi Kivity wrote:
> >
> >>Ok.  __pit_timer_fn() is called from an interrupt, which then calls
> >>smp_call_function_single(), which calls spin_lock().  If we've already
> >>taken the lock, we hang.
> >>
> >>
> >
> >Ah.  Just adding a "me too"; I didn't get a chance to debug it yesterday,
> >but I
> >was seeing similar problems.  If I disabled in-kernel pit with
> >-no-kvm-pit, all
> >was well.
> >
>
> How to fix it, though?  The only idea that comes to mind is to affine
> the hrtimer with vcpu0 (like the local apic timers), which would mean we
> only need to unwait the waitqueue, and never need to send the IPI.
> Would slightly improve performance as well.

Yes, agree.

For now I think just revert

--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -200,10 +200,8 @@ int __pit_timer_fn(struct kvm_kpit_state *ps)

 	atomic_inc(&pt->pending);
 	smp_mb__after_atomic_inc();
-	if (vcpu0 && waitqueue_active(&vcpu0->wq)) {
-		vcpu0->arch.mp_state = KVM_MP_STATE_RUNNABLE;
-		wake_up_interruptible(&vcpu0->wq);
-	}
+	if (vcpu0)
+		kvm_vcpu_kick(vcpu0);

And add a big fat FIXME.
From: Avi K. <av...@qu...> - 2008-04-24 16:52:38
Marcelo Tosatti wrote:
> On Thu, Apr 24, 2008 at 04:44:27PM +0300, Avi Kivity wrote:
>
>> Chris Lalancette wrote:
>>
>>> Avi Kivity wrote:
>>>
>>>
>>>> Ok.  __pit_timer_fn() is called from an interrupt, which then calls
>>>> smp_call_function_single(), which calls spin_lock().  If we've already
>>>> taken the lock, we hang.
>>>>
>>>>
>>>>
>>> Ah.  Just adding a "me too"; I didn't get a chance to debug it yesterday,
>>> but I
>>> was seeing similar problems.  If I disabled in-kernel pit with
>>> -no-kvm-pit, all
>>> was well.
>>>
>>>
>> How to fix it, though?  The only idea that comes to mind is to affine
>> the hrtimer with vcpu0 (like the local apic timers), which would mean we
>> only need to unwait the waitqueue, and never need to send the IPI.
>> Would slightly improve performance as well.
>>
>
> Yes, agree.
>
> For now I think just revert
>

I committed this, so this should be fixed for now.

I'm not sure hrtimer migration would work 100% reliably (suppose it
fired just after a vcpu migration), so I think a queue_work is better.

--
error compiling committee.c: too many arguments to function
From: Avi K. <av...@qu...> - 2008-04-24 13:21:25
Avi Kivity wrote:
> Yang, Sheng wrote:
>> On Thursday 24 April 2008 19:37:03 Avi Kivity wrote:
>>
>>> Yunfeng Zhao wrote:
>>>
>>>> Hi All,
>>>>
>>>> This is today's KVM test result against kvm.git
>>>> 873c05fa7e6fea27090b1bf0f67a073eadb04782 and kvm-userspace.git
>>>> d102d750f397b543fe620a3c77a7e5e42c483865.
>>>>
>>> I suspect 873c05fa7e6fea27090b1bf0f67a073eadb04782 itself, it's the
>>> only
>>> thing that has any chance of badness.
>>>
>>> Marcelo, any idea?  Perhaps due to load, interrupts accumulate and
>>> can't
>>> be injected fast enough?
>>>
>>> These tests are run on a 2.6.22 host, which has a hacked
>>> smp_call_function_single() in external-module-compat.h, which may
>>> exacerbate the problem.
>>>
>>
>> Yeah, I suspect the commit too (I tried tip without it, and found it
>> mostly all right). In fact, I didn't use kvm_vcpu_kick() precisely
>> because I found this function may cause a hang on my host... But I
>> didn't investigate further, so I can't tell what's wrong; I just
>> chose a way to keep it working... I am sorry for not clarifying...
>>
>
> I think smp_call_function_single() is miscompiled when using the
> compatibility code.  I took it out-of-line to be sure (it is now in
> kernel/external-module-compat.c).
>
> No evidence, but...
>

Ok.  __pit_timer_fn() is called from an interrupt, which then calls
smp_call_function_single(), which calls spin_lock().  If we've already
taken the lock, we hang.

--
error compiling committee.c: too many arguments to function
From: Chris L. <cla...@re...> - 2008-04-24 13:33:24
Avi Kivity wrote:
>
> Ok.  __pit_timer_fn() is called from an interrupt, which then calls
> smp_call_function_single(), which calls spin_lock().  If we've already
> taken the lock, we hang.
>

Ah.  Just adding a "me too"; I didn't get a chance to debug it yesterday, but I
was seeing similar problems.  If I disabled in-kernel pit with -no-kvm-pit, all
was well.

Chris Lalancette