From: Avi K. <av...@qu...> - 2008-04-21 10:30:09
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/s390/kvm/Kconfig |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig
index 2489b34..1761b74 100644
--- a/arch/s390/kvm/Kconfig
+++ b/arch/s390/kvm/Kconfig
@@ -36,6 +36,9 @@ config KVM
 
 	  If unsure, say N.
 
+config KVM_TRACE
+	bool
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/virtio/Kconfig
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:09
From: Marcelo Tosatti <mto...@re...>

Timers that fire between guest hlt and vcpu_block's add_wait_queue() are
ignored, possibly resulting in hangs.

Also make sure that atomic_inc and waitqueue_active tests happen in the
specified order, otherwise the following race is open:

CPU0                                    CPU1
                                        if (waitqueue_active(wq))
add_wait_queue()
if (!atomic_read(pit_timer->pending))
        schedule()
                                        atomic_inc(pit_timer->pending)

Signed-off-by: Marcelo Tosatti <mto...@re...>
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/ia64/kvm/kvm-ia64.c  |    5 +++++
 arch/s390/kvm/interrupt.c |    5 +++++
 arch/x86/kvm/i8254.c      |   11 +++++++++++
 arch/x86/kvm/irq.c        |   15 +++++++++++++++
 arch/x86/kvm/irq.h        |    3 +++
 arch/x86/kvm/lapic.c      |   10 ++++++++++
 include/linux/kvm_host.h  |    1 +
 virt/kvm/kvm_main.c       |    1 +
 8 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 9c56b64..ca1cfb1 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1778,6 +1778,11 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn;
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index f62588c..fcd1ed8 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -325,6 +325,11 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu)
 	return rc;
 }
 
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	return 0;
+}
+
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
 {
 	u64 now, sltime;
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 06a241a..abb4b16 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -199,6 +199,7 @@ int __pit_timer_fn(struct kvm_kpit_state *ps)
 	struct kvm_kpit_timer *pt = &ps->pit_timer;
 
 	atomic_inc(&pt->pending);
+	smp_mb__after_atomic_inc();
 	if (vcpu0 && waitqueue_active(&vcpu0->wq)) {
 		vcpu0->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
 		wake_up_interruptible(&vcpu0->wq);
@@ -210,6 +211,16 @@ int __pit_timer_fn(struct kvm_kpit_state *ps)
 	return (pt->period == 0 ? 0 : 1);
 }
 
+int pit_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pit *pit = vcpu->kvm->arch.vpit;
+
+	if (pit && vcpu->vcpu_id == 0)
+		return atomic_read(&pit->pit_state.pit_timer.pending);
+
+	return 0;
+}
+
 static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
 {
 	struct kvm_kpit_state *ps;
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index dbfe21c..ce1f583 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -26,6 +26,21 @@
 #include "i8254.h"
 
 /*
+ * check if there are pending timer events
+ * to be processed.
+ */
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	int ret;
+
+	ret = pit_has_pending_timer(vcpu);
+	ret |= apic_has_pending_timer(vcpu);
+
+	return ret;
+}
+EXPORT_SYMBOL(kvm_cpu_has_pending_timer);
+
+/*
  * check if there is pending interrupt without
  * intack.
  */
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index fa5ed5d..1802134 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -85,4 +85,7 @@ void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 
+int pit_has_pending_timer(struct kvm_vcpu *vcpu);
+int apic_has_pending_timer(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 31280df..debf582 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -952,6 +952,16 @@ static int __apic_timer_fn(struct kvm_lapic *apic)
 	return result;
 }
 
+int apic_has_pending_timer(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *lapic = vcpu->arch.apic;
+
+	if (lapic)
+		return atomic_read(&lapic->timer.pending);
+
+	return 0;
+}
+
 static int __inject_apic_timer_irq(struct kvm_lapic *apic)
 {
 	int vector;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bd0c2d2..0bc4003 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -269,6 +269,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm);
 
 int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 int kvm_cpu_has_interrupt(struct kvm_vcpu *v);
+int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 
 static inline void kvm_guest_enter(void)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d5911d9..47cbc6e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -765,6 +765,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	 * We will block until either an interrupt or a signal wakes us up
 	 */
 	while (!kvm_cpu_has_interrupt(vcpu)
+	       && !kvm_cpu_has_pending_timer(vcpu)
 	       && !signal_pending(current)
 	       && !kvm_arch_vcpu_runnable(vcpu)) {
 		set_current_state(TASK_INTERRUPTIBLE);
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:09
From: Marcelo Tosatti <mto...@re...>

So userspace can save/restore the mpstate during migration.

[avi: export the #define constants describing the value]
[christian: add s390 stubs]
[avi: ditto for ia64]

Signed-off-by: Marcelo Tosatti <mto...@re...>
Signed-off-by: Christian Borntraeger <bor...@de...>
Signed-off-by: Carsten Otte <co...@de...>
Signed-off-by: Avi Kivity <av...@qu...>

KVM: ia64: provide get/set_mp_state stubs to fix compile error

Since commit ded6fb24fb694bcc5f308a02ec504d45fbc8aaa6
Author: Marcelo Tosatti <mto...@re...>
Date:   Fri Apr 11 13:24:45 2008 -0300

    KVM: add ioctls to save/store mpstate

kvm does not compile on ia64. This patch provides ioctl stubs for ia64
to make kvm.git compile again.

Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/ia64/kvm/kvm-ia64.c   |   12 ++++++++++++
 arch/s390/kvm/kvm-s390.c   |   12 ++++++++++++
 arch/x86/kvm/x86.c         |   19 +++++++++++++++++++
 include/asm-x86/kvm_host.h |    5 -----
 include/linux/kvm.h        |   15 +++++++++++++++
 include/linux/kvm_host.h   |    4 ++++
 virt/kvm/kvm_main.c        |   24 ++++++++++++++++++++++++
 7 files changed, 86 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index f7589db..6df0732 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1792,3 +1792,15 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE;
 }
+
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL;
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL;
+}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d966137..98d1e73 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -414,6 +414,18 @@ int kvm_arch_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
 	return -EINVAL; /* not implemented yet */
 }
 
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	return -EINVAL; /* not implemented yet */
+}
+
 static void __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b364d19..5c3c9d3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -817,6 +817,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_CLOCKSOURCE:
 	case KVM_CAP_PIT:
 	case KVM_CAP_NOP_IO_DELAY:
+	case KVM_CAP_MP_STATE:
 		r = 1;
 		break;
 	case KVM_CAP_VAPIC:
@@ -3083,6 +3084,24 @@ int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	vcpu_load(vcpu);
+	mp_state->mp_state = vcpu->arch.mp_state;
+	vcpu_put(vcpu);
+	return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state)
+{
+	vcpu_load(vcpu);
+	vcpu->arch.mp_state = mp_state->mp_state;
+	vcpu_put(vcpu);
+	return 0;
+}
+
 static void set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var,
 			int seg)
 {
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index f35a6ad..9d963cd 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -227,11 +227,6 @@ struct kvm_vcpu_arch {
 	u64 shadow_efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
-#define KVM_MP_STATE_RUNNABLE 0
-#define KVM_MP_STATE_UNINITIALIZED 1
-#define KVM_MP_STATE_INIT_RECEIVED 2
-#define KVM_MP_STATE_SIPI_RECEIVED 3
-#define KVM_MP_STATE_HALTED 4
 	int mp_state;
 	int sipi_vector;
 	u64 ia32_misc_enable_msr;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index d302d63..f8e211d 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -228,6 +228,18 @@ struct kvm_vapic_addr {
 	__u64 vapic_addr;
 };
 
+/* for KVM_SET_MPSTATE */
+
+#define KVM_MP_STATE_RUNNABLE          0
+#define KVM_MP_STATE_UNINITIALIZED    1
+#define KVM_MP_STATE_INIT_RECEIVED    2
+#define KVM_MP_STATE_HALTED           3
+#define KVM_MP_STATE_SIPI_RECEIVED    4
+
+struct kvm_mp_state {
+	__u32 mp_state;
+};
+
 struct kvm_s390_psw {
 	__u64 mask;
 	__u64 addr;
@@ -326,6 +338,7 @@ struct kvm_trace_rec {
 #define KVM_CAP_PIT 11
 #define KVM_CAP_NOP_IO_DELAY 12
 #define KVM_CAP_PV_MMU 13
+#define KVM_CAP_MP_STATE 14
 
 /*
  * ioctls for VM fds
@@ -387,5 +400,7 @@ struct kvm_trace_rec {
 #define KVM_S390_SET_INITIAL_PSW  _IOW(KVMIO, 0x96, struct kvm_s390_psw)
 /* initial reset for s390 */
 #define KVM_S390_INITIAL_RESET    _IO(KVMIO,  0x97)
+#define KVM_GET_MP_STATE          _IOR(KVMIO,  0x98, struct kvm_mp_state)
+#define KVM_SET_MP_STATE          _IOW(KVMIO,  0x99, struct kvm_mp_state)
 
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 0bc4003..81d4c33 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -237,6 +237,10 @@ int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs);
 int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs);
+int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state);
+int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
+				    struct kvm_mp_state *mp_state);
 int kvm_arch_vcpu_ioctl_debug_guest(struct kvm_vcpu *vcpu,
 				    struct kvm_debug_guest *dbg);
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 47cbc6e..0998455 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -979,6 +979,30 @@ out_free2:
 		r = 0;
 		break;
 	}
+	case KVM_GET_MP_STATE: {
+		struct kvm_mp_state mp_state;
+
+		r = kvm_arch_vcpu_ioctl_get_mpstate(vcpu, &mp_state);
+		if (r)
+			goto out;
+		r = -EFAULT;
+		if (copy_to_user(argp, &mp_state, sizeof mp_state))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_MP_STATE: {
+		struct kvm_mp_state mp_state;
+
+		r = -EFAULT;
+		if (copy_from_user(&mp_state, argp, sizeof mp_state))
+			goto out;
+		r = kvm_arch_vcpu_ioctl_set_mpstate(vcpu, &mp_state);
+		if (r)
+			goto out;
+		r = 0;
+		break;
+	}
 	case KVM_TRANSLATE: {
 		struct kvm_translation tr;
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:09
From: Marcelo Tosatti <mto...@re...>

kvm_pv_mmu_op should not take mmap_sem. All gfn_to_page() callers down
in the MMU processing will take it if necessary, so as it is it can
deadlock. Apparently a leftover from the days before slots_lock.

Signed-off-by: Marcelo Tosatti <mto...@re...>
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/x86/kvm/mmu.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 078a7f1..2ad6f54 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2173,8 +2173,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 	int r;
 	struct kvm_pv_mmu_op_buffer buffer;
 
-	down_read(&current->mm->mmap_sem);
-
 	buffer.ptr = buffer.buf;
 	buffer.len = min_t(unsigned long, bytes, sizeof buffer.buf);
 	buffer.processed = 0;
@@ -2194,7 +2192,6 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 	r = 1;
 out:
 	*ret = buffer.processed;
-	up_read(&current->mm->mmap_sem);
 	return r;
 }
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:06
From: Marcelo Tosatti <mto...@re...>

There is a window open between testing of pending IRQ's and assignment
of guest_mode in __vcpu_run.

Injection of IRQ's can race with __vcpu_run as follows:

CPU0                                    CPU1
kvm_x86_ops->run()
vcpu->guest_mode = 0                    SET_IRQ_LINE ioctl
                                        ..
kvm_x86_ops->inject_pending_irq
kvm_cpu_has_interrupt()
                                        apic_test_and_set_irr()
                                        kvm_vcpu_kick
                                        if (vcpu->guest_mode)
                                                send_ipi()
vcpu->guest_mode = 1

So move guest_mode=1 assignment before ->inject_pending_irq, and make
sure that it won't reorder after it.

Signed-off-by: Marcelo Tosatti <mto...@re...>
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/x86/kvm/x86.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5c3c9d3..0ce5563 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2802,6 +2802,13 @@ again:
 		goto out;
 	}
 
+	vcpu->guest_mode = 1;
+	/*
+	 * Make sure that guest_mode assignment won't happen after
+	 * testing the pending IRQ vector bitmap.
+	 */
+	smp_wmb();
+
 	if (vcpu->arch.exception.pending)
 		__queue_exception(vcpu);
 	else if (irqchip_in_kernel(vcpu->kvm))
@@ -2813,7 +2820,6 @@ again:
 
 	up_read(&vcpu->kvm->slots_lock);
 
-	vcpu->guest_mode = 1;
 	kvm_guest_enter();
 
 	if (vcpu->requests)
@@ -3970,11 +3976,17 @@ static void vcpu_kick_intr(void *info)
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
 	int ipi_pcpu = vcpu->cpu;
+	int cpu = get_cpu();
 
 	if (waitqueue_active(&vcpu->wq)) {
 		wake_up_interruptible(&vcpu->wq);
 		++vcpu->stat.halt_wakeup;
 	}
-	if (vcpu->guest_mode)
+	/*
+	 * We may be called synchronously with irqs disabled in guest mode,
+	 * So need not to call smp_call_function_single() in that case.
+	 */
+	if (vcpu->guest_mode && vcpu->cpu != cpu)
 		smp_call_function_single(ipi_pcpu, vcpu_kick_intr, vcpu, 0, 0);
+	put_cpu();
 }
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:05
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/ia64/kvm/Kconfig |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/kvm/Kconfig b/arch/ia64/kvm/Kconfig
index d2e54b9..7914e48 100644
--- a/arch/ia64/kvm/Kconfig
+++ b/arch/ia64/kvm/Kconfig
@@ -43,4 +43,7 @@ config KVM_INTEL
 	  Provides support for KVM on Itanium 2 processors equipped with the VT
 	  extensions.
 
+config KVM_TRACE
+	bool
+
 endif # VIRTUALIZATION
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:30:05
From: Joerg Roedel <joe...@am...>

To properly forward a MCE that occurred while the guest is running to the
host, we have to intercept this exception and call the host handler by
hand. This is implemented by this patch.

Signed-off-by: Joerg Roedel <joe...@am...>
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/x86/kvm/svm.c         |   17 ++++++++++++++++-
 include/asm-x86/kvm_host.h |    1 +
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8af463b..da3ddef 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -507,7 +507,8 @@ static void init_vmcb(struct vcpu_svm *svm)
 					INTERCEPT_DR7_MASK;
 
 	control->intercept_exceptions = (1 << PF_VECTOR) |
-					(1 << UD_VECTOR);
+					(1 << UD_VECTOR) |
+					(1 << MC_VECTOR);
 
 	control->intercept = 	(1ULL << INTERCEPT_INTR) |
@@ -1044,6 +1045,19 @@ static int nm_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 	return 1;
 }
 
+static int mc_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
+{
+	/*
+	 * On an #MC intercept the MCE handler is not called automatically in
+	 * the host. So do it by hand here.
+	 */
+	asm volatile (
+		"int $0x12\n");
+	/* not sure if we ever come back to this point */
+
+	return 1;
+}
+
 static int shutdown_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
 {
 	/*
@@ -1367,6 +1381,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
 	[SVM_EXIT_EXCP_BASE + UD_VECTOR]	= ud_interception,
 	[SVM_EXIT_EXCP_BASE + PF_VECTOR]	= pf_interception,
 	[SVM_EXIT_EXCP_BASE + NM_VECTOR]	= nm_interception,
+	[SVM_EXIT_EXCP_BASE + MC_VECTOR]	= mc_interception,
 	[SVM_EXIT_INTR]				= nop_on_interception,
 	[SVM_EXIT_NMI]				= nop_on_interception,
 	[SVM_EXIT_SMI]				= nop_on_interception,
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index de3eccf..2861178 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -62,6 +62,7 @@
 #define SS_VECTOR 12
 #define GP_VECTOR 13
 #define PF_VECTOR 14
+#define MC_VECTOR 18
 
 #define SELECTOR_TI_MASK (1 << 2)
 #define SELECTOR_RPL_MASK 0x03
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:29:59
From: Anthony Liguori <ali...@us...>

This patch introduces a gfn_to_pfn() function and corresponding functions
like kvm_release_pfn_dirty(). Using these new functions, we can modify the
x86 MMU to no longer assume that it can always get a struct page for any
given gfn.

We don't want to eliminate gfn_to_page() entirely because a number of places
assume they can do gfn_to_page() and then kmap() the results. When we
support IO memory, gfn_to_page() will fail for IO pages although
gfn_to_pfn() will succeed.

This does not implement support for avoiding reference counting for reserved
RAM or for IO memory. However, it should make those things pretty straight
forward.

Since we're only introducing new common symbols, I don't think it will break
the non-x86 architectures but I haven't tested those. I've tested Intel,
AMD, NPT, and hugetlbfs with Windows and Linux guests.

[avi: fix overflow when shifting left pfns by adding casts]

Signed-off-by: Anthony Liguori <ali...@us...>
Signed-off-by: Avi Kivity <av...@qu...>
---
 arch/x86/kvm/mmu.c         |   89 +++++++++++++++++++++----------------------
 arch/x86/kvm/paging_tmpl.h |   26 ++++++------
 include/asm-x86/kvm_host.h |    4 +-
 include/linux/kvm_host.h   |   12 ++++++
 include/linux/kvm_types.h  |    2 +
 virt/kvm/kvm_main.c        |   68 ++++++++++++++++++++++++++++++---
 6 files changed, 133 insertions(+), 68 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c89bf23..078a7f1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -240,11 +240,9 @@ static int is_rmap_pte(u64 pte)
 	return is_shadow_present_pte(pte);
 }
 
-static struct page *spte_to_page(u64 pte)
+static pfn_t spte_to_pfn(u64 pte)
 {
-	hfn_t hfn = (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
-
-	return pfn_to_page(hfn);
+	return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
 }
 
 static gfn_t pse36_gfn_delta(u32 gpte)
@@ -541,20 +539,20 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	struct kvm_rmap_desc *desc;
 	struct kvm_rmap_desc *prev_desc;
 	struct kvm_mmu_page *sp;
-	struct page *page;
+	pfn_t pfn;
 	unsigned long *rmapp;
 	int i;
 
 	if (!is_rmap_pte(*spte))
 		return;
 	sp = page_header(__pa(spte));
-	page = spte_to_page(*spte);
+	pfn = spte_to_pfn(*spte);
 	if (*spte & PT_ACCESSED_MASK)
-		mark_page_accessed(page);
+		kvm_set_pfn_accessed(pfn);
 	if (is_writeble_pte(*spte))
-		kvm_release_page_dirty(page);
+		kvm_release_pfn_dirty(pfn);
 	else
-		kvm_release_page_clean(page);
+		kvm_release_pfn_clean(pfn);
 	rmapp = gfn_to_rmap(kvm, sp->gfns[spte - sp->spt], is_large_pte(*spte));
 	if (!*rmapp) {
 		printk(KERN_ERR "rmap_remove: %p %llx 0->BUG\n", spte, *spte);
@@ -635,11 +633,11 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
 		spte = rmap_next(kvm, rmapp, spte);
 	}
 	if (write_protected) {
-		struct page *page;
+		pfn_t pfn;
 
 		spte = rmap_next(kvm, rmapp, NULL);
-		page = spte_to_page(*spte);
-		SetPageDirty(page);
+		pfn = spte_to_pfn(*spte);
+		kvm_set_pfn_dirty(pfn);
 	}
 
 	/* check for huge page mappings */
@@ -1036,7 +1034,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 			 unsigned pt_access, unsigned pte_access,
 			 int user_fault, int write_fault, int dirty,
 			 int *ptwrite, int largepage, gfn_t gfn,
-			 struct page *page, bool speculative)
+			 pfn_t pfn, bool speculative)
 {
 	u64 spte;
 	int was_rmapped = 0;
@@ -1058,10 +1056,9 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 
 			child = page_header(pte & PT64_BASE_ADDR_MASK);
 			mmu_page_remove_parent_pte(child, shadow_pte);
-		} else if (page != spte_to_page(*shadow_pte)) {
+		} else if (pfn != spte_to_pfn(*shadow_pte)) {
 			pgprintk("hfn old %lx new %lx\n",
-				 page_to_pfn(spte_to_page(*shadow_pte)),
-				 page_to_pfn(page));
+				 spte_to_pfn(*shadow_pte), pfn);
 			rmap_remove(vcpu->kvm, shadow_pte);
 		} else {
 			if (largepage)
@@ -1090,7 +1087,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 	if (largepage)
 		spte |= PT_PAGE_SIZE_MASK;
 
-	spte |= page_to_phys(page);
+	spte |= (u64)pfn << PAGE_SHIFT;
 
 	if ((pte_access & ACC_WRITE_MASK)
 	    || (write_fault && !is_write_protection(vcpu) && !user_fault)) {
@@ -1135,12 +1132,12 @@ unshadowed:
 	if (!was_rmapped) {
 		rmap_add(vcpu, shadow_pte, gfn, largepage);
 		if (!is_rmap_pte(*shadow_pte))
-			kvm_release_page_clean(page);
+			kvm_release_pfn_clean(pfn);
 	} else {
 		if (was_writeble)
-			kvm_release_page_dirty(page);
+			kvm_release_pfn_dirty(pfn);
 		else
-			kvm_release_page_clean(page);
+			kvm_release_pfn_clean(pfn);
 	}
 	if (!ptwrite || !*ptwrite)
 		vcpu->arch.last_pte_updated = shadow_pte;
@@ -1151,7 +1148,7 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 }
 
 static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int largepage, gfn_t gfn, struct page *page,
+			int largepage, gfn_t gfn, pfn_t pfn,
 			int level)
 {
 	hpa_t table_addr = vcpu->arch.mmu.root_hpa;
@@ -1166,13 +1163,13 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 
 		if (level == 1) {
 			mmu_set_spte(vcpu, &table[index], ACC_ALL, ACC_ALL,
-				     0, write, 1, &pt_write, 0, gfn, page, false);
+				     0, write, 1, &pt_write, 0, gfn, pfn, false);
 			return pt_write;
 		}
 
 		if (largepage && level == 2) {
 			mmu_set_spte(vcpu, &table[index], ACC_ALL, ACC_ALL,
-				     0, write, 1, &pt_write, 1, gfn, page, false);
+				     0, write, 1, &pt_write, 1, gfn, pfn, false);
 			return pt_write;
 		}
 
@@ -1187,7 +1184,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 						     1, ACC_ALL, &table[index]);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
-				kvm_release_page_clean(page);
+				kvm_release_pfn_clean(pfn);
 				return -ENOMEM;
 			}
 
@@ -1202,8 +1199,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 {
 	int r;
 	int largepage = 0;
-
-	struct page *page;
+	pfn_t pfn;
 
 	down_read(&current->mm->mmap_sem);
 	if (is_largepage_backed(vcpu, gfn & ~(KVM_PAGES_PER_HPAGE-1))) {
@@ -1211,18 +1207,18 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 		largepage = 1;
 	}
 
-	page = gfn_to_page(vcpu->kvm, gfn);
+	pfn = gfn_to_pfn(vcpu->kvm, gfn);
 	up_read(&current->mm->mmap_sem);
 
 	/* mmio */
-	if (is_error_page(page)) {
-		kvm_release_page_clean(page);
+	if (is_error_pfn(pfn)) {
+		kvm_release_pfn_clean(pfn);
 		return 1;
 	}
 
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
-	r = __direct_map(vcpu, v, write, largepage, gfn, page,
+	r = __direct_map(vcpu, v, write, largepage, gfn, pfn,
 			 PT32E_ROOT_LEVEL);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
@@ -1355,7 +1351,7 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
 			  u32 error_code)
 {
-	struct page *page;
+	pfn_t pfn;
 	int r;
 	int largepage = 0;
 	gfn_t gfn = gpa >> PAGE_SHIFT;
@@ -1372,16 +1368,16 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
 		gfn &= ~(KVM_PAGES_PER_HPAGE-1);
 		largepage = 1;
 	}
-	page = gfn_to_page(vcpu->kvm, gfn);
+	pfn = gfn_to_pfn(vcpu->kvm, gfn);
 	up_read(&current->mm->mmap_sem);
-	if (is_error_page(page)) {
-		kvm_release_page_clean(page);
+	if (is_error_pfn(pfn)) {
+		kvm_release_pfn_clean(pfn);
 		return 1;
 	}
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
 	r = __direct_map(vcpu, gpa, error_code & PFERR_WRITE_MASK,
-			 largepage, gfn, page, TDP_ROOT_LEVEL);
+			 largepage, gfn, pfn, TDP_ROOT_LEVEL);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 
 	return r;
@@ -1525,6 +1521,8 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 
 static int init_kvm_mmu(struct kvm_vcpu *vcpu)
 {
+	vcpu->arch.update_pte.pfn = bad_pfn;
+
 	if (tdp_enabled)
 		return init_kvm_tdp_mmu(vcpu);
 	else
@@ -1644,7 +1642,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	gfn_t gfn;
 	int r;
 	u64 gpte = 0;
-	struct page *page;
+	pfn_t pfn;
 
 	vcpu->arch.update_pte.largepage = 0;
 
@@ -1680,15 +1678,15 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 		gfn &= ~(KVM_PAGES_PER_HPAGE-1);
 		vcpu->arch.update_pte.largepage = 1;
 	}
-	page = gfn_to_page(vcpu->kvm, gfn);
+	pfn = gfn_to_pfn(vcpu->kvm, gfn);
 	up_read(&current->mm->mmap_sem);
-	if (is_error_page(page)) {
-		kvm_release_page_clean(page);
+	if (is_error_pfn(pfn)) {
+		kvm_release_pfn_clean(pfn);
 		return;
 	}
 	vcpu->arch.update_pte.gfn = gfn;
-	vcpu->arch.update_pte.page = page;
+	vcpu->arch.update_pte.pfn = pfn;
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -1793,9 +1791,9 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	}
 	kvm_mmu_audit(vcpu, "post pte write");
 	spin_unlock(&vcpu->kvm->mmu_lock);
-	if (vcpu->arch.update_pte.page) {
-		kvm_release_page_clean(vcpu->arch.update_pte.page);
-		vcpu->arch.update_pte.page = NULL;
+	if (!is_error_pfn(vcpu->arch.update_pte.pfn)) {
+		kvm_release_pfn_clean(vcpu->arch.update_pte.pfn);
+		vcpu->arch.update_pte.pfn = bad_pfn;
 	}
 }
 
@@ -2236,8 +2234,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 			audit_mappings_page(vcpu, ent, va, level - 1);
 		} else {
 			gpa_t gpa = vcpu->arch.mmu.gva_to_gpa(vcpu, va);
-			struct page *page = gpa_to_page(vcpu, gpa);
-			hpa_t hpa = page_to_phys(page);
+			hpa_t hpa = (hpa_t)gpa_to_pfn(vcpu, gpa) << PAGE_SHIFT;
 
 			if (is_shadow_present_pte(ent)
 			    && (ent & PT64_BASE_ADDR_MASK) != hpa)
@@ -2250,7 +2247,7 @@ static void audit_mappings_page(struct kvm_vcpu *vcpu, u64 page_pte,
 				 && !is_error_hpa(hpa))
 				printk(KERN_ERR "audit: (%s) notrap shadow,"
 				       " valid guest gva %lx\n", audit_msg, va);
-			kvm_release_page_clean(page);
+			kvm_release_pfn_clean(pfn);
 
 		}
 	}
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 57d872a..156fe10 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -247,7 +247,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 {
 	pt_element_t gpte;
 	unsigned pte_access;
-	struct page *npage;
+	pfn_t pfn;
 	int largepage = vcpu->arch.update_pte.largepage;
 
 	gpte = *(const pt_element_t *)pte;
@@ -260,13 +260,13 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 	pte_access = page->role.access & FNAME(gpte_access)(vcpu, gpte);
 	if (gpte_to_gfn(gpte) != vcpu->arch.update_pte.gfn)
 		return;
-	npage = vcpu->arch.update_pte.page;
-	if (!npage)
+	pfn = vcpu->arch.update_pte.pfn;
+	if (is_error_pfn(pfn))
 		return;
-	get_page(npage);
+	kvm_get_pfn(pfn);
 	mmu_set_spte(vcpu, spte, page->role.access, pte_access, 0, 0,
 		     gpte & PT_DIRTY_MASK, NULL, largepage, gpte_to_gfn(gpte),
-		     npage, true);
+		     pfn, true);
 }
 
 /*
@@ -275,7 +275,7 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page,
 static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			 struct guest_walker *walker,
 			 int user_fault, int write_fault, int largepage,
-			 int *ptwrite, struct page *page)
+			 int *ptwrite, pfn_t pfn)
 {
 	hpa_t shadow_addr;
 	int level;
@@ -336,7 +336,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 						  walker->pte_gpa[level - 2],
 						  &curr_pte, sizeof(curr_pte));
 			if (r || curr_pte != walker->ptes[level - 2]) {
-				kvm_release_page_clean(page);
+				kvm_release_pfn_clean(pfn);
 				return NULL;
 			}
 		}
@@ -349,7 +349,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 	mmu_set_spte(vcpu, shadow_ent, access, walker->pte_access & access,
 		     user_fault, write_fault,
 		     walker->ptes[walker->level-1] & PT_DIRTY_MASK,
-		     ptwrite, largepage, walker->gfn, page, false);
+		     ptwrite, largepage, walker->gfn, pfn, false);
 
 	return shadow_ent;
 }
@@ -378,7 +378,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 	u64 *shadow_pte;
 	int write_pt = 0;
 	int r;
-	struct page *page;
+	pfn_t pfn;
 	int largepage = 0;
 
 	pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
@@ -413,20 +413,20 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr,
 			largepage = 1;
 		}
 	}
-	page = gfn_to_page(vcpu->kvm, walker.gfn);
+	pfn = gfn_to_pfn(vcpu->kvm, walker.gfn);
 	up_read(&current->mm->mmap_sem);
 
 	/* mmio */
-	if (is_error_page(page)) {
+	if (is_error_pfn(pfn)) {
 		pgprintk("gfn %x is mmio\n", walker.gfn);
-		kvm_release_page_clean(page);
+		kvm_release_pfn_clean(pfn);
 		return 1;
 	}
 
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
 	shadow_pte = FNAME(fetch)(vcpu, addr, &walker, user_fault, write_fault,
-				  largepage, &write_pt, page);
+				  largepage, &write_pt, pfn);
 
 	pgprintk("%s: shadow pte %p %llx ptwrite %d\n", __func__,
 		 shadow_pte, *shadow_pte, write_pt);
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index b923049..de3eccf 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -248,8 +248,8 @@ struct kvm_vcpu_arch {
 	u64 *last_pte_updated;
 
 	struct {
-		gfn_t gfn;          /* presumed gfn during guest pte update */
-		struct page *page;  /* page corresponding to that gfn */
+		gfn_t gfn;	/* presumed gfn during guest pte update */
+		pfn_t pfn;	/* pfn corresponding to that gfn */
 		int largepage;
 	} update_pte;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a2ceb51..578c363 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -150,8 +150,10 @@ static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; }
 struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
 extern struct page *bad_page;
+extern pfn_t bad_pfn;
 
 int is_error_page(struct page *page);
+int is_error_pfn(pfn_t pfn);
 int kvm_is_error_hva(unsigned long addr);
 int kvm_set_memory_region(struct kvm *kvm,
 			  struct kvm_userspace_memory_region *mem,
@@ -168,6 +170,16 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
 void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
+void kvm_set_page_dirty(struct page *page);
+void kvm_set_page_accessed(struct page *page);
+
+pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
+void kvm_release_pfn_dirty(pfn_t);
+void kvm_release_pfn_clean(pfn_t pfn);
+void kvm_set_pfn_dirty(pfn_t pfn);
+void kvm_set_pfn_accessed(pfn_t pfn);
+void kvm_get_pfn(pfn_t pfn);
+
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 			int len);
 int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, void *data,
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 1c4e46d..9b6f395 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -38,6 +38,8 @@ typedef unsigned long hva_t;
 typedef u64 hpa_t;
 typedef unsigned long hfn_t;
 
+typedef hfn_t pfn_t;
+
 struct kvm_pio_request {
 	unsigned long count;
 	int cur_count;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 93ed78b..6a52c08 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -40,6 +40,7 @@
 #include <linux/kvm_para.h>
 #include <linux/pagemap.h>
 #include <linux/mman.h>
+#include <linux/swap.h>
 
 #include <asm/processor.h>
 #include <asm/io.h>
@@ -458,6 +459,12 @@ int is_error_page(struct page *page)
 }
 EXPORT_SYMBOL_GPL(is_error_page);
 
+int is_error_pfn(pfn_t pfn)
+{
+	return pfn == bad_pfn;
+}
+EXPORT_SYMBOL_GPL(is_error_pfn);
+
 static inline unsigned long bad_hva(void)
 {
 	return PAGE_OFFSET;
@@ -519,7 +526,7 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 /*
  * Requires current->mm->mmap_sem to be held
  */
-struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
+pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
 {
 	struct page *page[1];
 	unsigned long addr;
@@ -530,7 +537,7 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 	addr = gfn_to_hva(kvm, gfn);
 	if (kvm_is_error_hva(addr)) {
 		get_page(bad_page);
-		return bad_page;
+		return page_to_pfn(bad_page);
 	}
 
 	npages = get_user_pages(current, current->mm, addr, 1, 1, 1, page,
@@ -538,27 +545,71 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 	if (npages != 1) {
 		get_page(bad_page);
-		return bad_page;
+		return page_to_pfn(bad_page);
 	}
 
-	return page[0];
+	return page_to_pfn(page[0]);
+}
+
+EXPORT_SYMBOL_GPL(gfn_to_pfn);
+
+struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
+{
+	return pfn_to_page(gfn_to_pfn(kvm, gfn));
 }
 
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
 void kvm_release_page_clean(struct page *page)
 {
-	put_page(page);
+	kvm_release_pfn_clean(page_to_pfn(page));
 }
 EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 
+void kvm_release_pfn_clean(pfn_t pfn)
+{
+	put_page(pfn_to_page(pfn));
+}
+EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
+
 void kvm_release_page_dirty(struct page *page)
 {
+	kvm_release_pfn_dirty(page_to_pfn(page));
+}
+EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
+
+void kvm_release_pfn_dirty(pfn_t pfn)
+{
+	kvm_set_pfn_dirty(pfn);
+	kvm_release_pfn_clean(pfn);
+}
+EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
+
+void kvm_set_page_dirty(struct page *page)
+{
+	kvm_set_pfn_dirty(page_to_pfn(page));
+}
+EXPORT_SYMBOL_GPL(kvm_set_page_dirty);
+
+void kvm_set_pfn_dirty(pfn_t pfn)
+{
+	struct page *page = pfn_to_page(pfn);
 	if (!PageReserved(page))
 		SetPageDirty(page);
-	put_page(page);
 }
-EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
+EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
+
+void kvm_set_pfn_accessed(pfn_t pfn)
+{
+	mark_page_accessed(pfn_to_page(pfn));
+}
+EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
+
+void kvm_get_pfn(pfn_t pfn)
+{
+	get_page(pfn_to_page(pfn));
+}
+EXPORT_SYMBOL_GPL(kvm_get_pfn);
 
 static int next_segment(unsigned long len, int offset)
 {
@@ -1351,6 +1402,7 @@ static struct sys_device kvm_sysdev = {
 };
 
 struct page *bad_page;
+pfn_t bad_pfn;
 
 static inline
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
@@ -1392,6 +1444,8 @@ int kvm_init(void *opaque, unsigned int vcpu_size,
 		goto out;
 	}
 
+	bad_pfn = page_to_pfn(bad_page);
+
 	r = kvm_arch_hardware_setup();
 	if (r < 0)
 		goto out_free_0;
-- 
1.5.5
From: Avi K. <av...@qu...> - 2008-04-21 10:29:59
|
Signed-off-by: Avi Kivity <av...@qu...> --- Documentation/ioctl-number.txt | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/Documentation/ioctl-number.txt b/Documentation/ioctl-number.txt index c18363b..240ce7a 100644 --- a/Documentation/ioctl-number.txt +++ b/Documentation/ioctl-number.txt @@ -183,6 +183,8 @@ Code Seq# Include File Comments 0xAC 00-1F linux/raw.h 0xAD 00 Netfilter device in development: <mailto:ru...@ru...> +0xAE all linux/kvm.h Kernel-based Virtual Machine + <mailto:kvm...@li...> 0xB0 all RATIO devices in development: <mailto:vg...@ra...> 0xB1 00-1F PPPoX <mailto:mos...@st...> -- 1.5.5 |
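The patch above reserves ioctl magic 0xAE for KVM. As a rough illustration of what that magic means in practice, here is a simplified re-derivation of the kernel's ioctl number encoding from asm-generic/ioctl.h (dir<<30 | size<<16 | type<<8 | nr); the `kvm_ioctl_nr()` helper is hypothetical, added only to show how KVM ioctls such as `_IO(KVMIO, 0x00)` encode:

```c
#include <stdint.h>

/* Simplified version of the kernel's ioctl number encoding
 * (asm-generic/ioctl.h): dir<<30 | size<<16 | type<<8 | nr. */
#define IOC_NONE 0u
#define IOC(dir, type, nr, size) \
    ((uint32_t)(((dir) << 30) | ((size) << 16) | ((type) << 8) | (nr)))
#define IO(type, nr) IOC(IOC_NONE, (type), (nr), 0u)

/* The magic the patch registers in ioctl-number.txt. */
#define KVMIO 0xAEu

/* Hypothetical helper: an ioctl defined as _IO(KVMIO, seq) -- e.g.
 * KVM_GET_API_VERSION is _IO(KVMIO, 0x00) in linux/kvm.h -- encodes
 * the magic into bits 8..15 and the sequence number into bits 0..7. */
static inline uint32_t kvm_ioctl_nr(uint32_t seq)
{
    return IO(KVMIO, seq);
}
```

Registering the magic in ioctl-number.txt is purely documentation: it keeps other subsystems from picking the same type byte and colliding with KVM's command space.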
From: Avi K. <av...@qu...> - 2008-04-21 10:29:59
|
From: Joerg Roedel <joe...@am...> This patch aligns the host version of the CR4.MCE bit with the CR4 active in the guest. This is necessary to get MCE exceptions when the guest is running. Signed-off-by: Joerg Roedel <joe...@am...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/svm.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index d7439ce..8af463b 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -878,9 +878,12 @@ set: static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { + unsigned long host_cr4_mce = read_cr4() & X86_CR4_MCE; + vcpu->arch.cr4 = cr4; if (!npt_enabled) cr4 |= X86_CR4_PAE; + cr4 |= host_cr4_mce; to_svm(vcpu)->vmcb->save.cr4 = cr4; } -- 1.5.5 |
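Restated as a pure function (a sketch for illustration, not the kernel code — `svm_effective_cr4()` is a hypothetical helper), the fix forces the host's MCE bit into whatever CR4 value is loaded into the VMCB, so machine check exceptions keep reaching the host while the guest runs:

```c
#include <stdint.h>

#define X86_CR4_PAE (1ul << 5)
#define X86_CR4_MCE (1ul << 6)

/* Hypothetical restatement of the patched svm_set_cr4() logic:
 * take the CR4 the guest asked for, force PAE when shadow paging
 * (no NPT) is in use, and always carry over the host's MCE bit. */
static unsigned long svm_effective_cr4(unsigned long guest_cr4,
                                       unsigned long host_cr4,
                                       int npt_enabled)
{
    unsigned long cr4 = guest_cr4;

    if (!npt_enabled)
        cr4 |= X86_CR4_PAE;             /* shadow paging needs PAE */
    cr4 |= host_cr4 & X86_CR4_MCE;      /* the actual fix */
    return cr4;
}
```

Without the masked-in MCE bit, a guest that clears CR4.MCE would suppress machine check delivery for the whole CPU while it runs.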
From: Avi K. <av...@qu...> - 2008-04-21 10:29:59
|
Fourth and final batch of the pending kvm updates. This one contains the ppc port in addition to x86 updates. Documentation/ioctl-number.txt | 2 + Documentation/powerpc/kvm_440.txt | 41 ++ MAINTAINERS | 7 + arch/ia64/kvm/Kconfig | 3 + arch/ia64/kvm/kvm-ia64.c | 43 ++- arch/powerpc/Kconfig | 1 + arch/powerpc/Kconfig.debug | 3 + arch/powerpc/Makefile | 1 + arch/powerpc/kernel/asm-offsets.c | 26 ++ arch/powerpc/kvm/44x_tlb.c | 224 ++++++++++ arch/powerpc/kvm/44x_tlb.h | 91 +++++ arch/powerpc/kvm/Kconfig | 43 ++ arch/powerpc/kvm/Makefile | 15 + arch/powerpc/kvm/booke_guest.c | 615 ++++++++++++++++++++++++++++ arch/powerpc/kvm/booke_host.c | 83 ++++ arch/powerpc/kvm/booke_interrupts.S | 436 ++++++++++++++++++++ arch/powerpc/kvm/emulate.c | 760 +++++++++++++++++++++++++++++++++++ arch/powerpc/kvm/powerpc.c | 436 ++++++++++++++++++++ arch/s390/kvm/Kconfig | 3 + arch/s390/kvm/interrupt.c | 5 + arch/s390/kvm/kvm-s390.c | 12 + arch/x86/kvm/Kconfig | 11 + arch/x86/kvm/Makefile | 3 + arch/x86/kvm/i8254.c | 13 +- arch/x86/kvm/irq.c | 15 + arch/x86/kvm/irq.h | 3 + arch/x86/kvm/lapic.c | 27 +- arch/x86/kvm/mmu.c | 92 ++--- arch/x86/kvm/paging_tmpl.h | 26 +- arch/x86/kvm/svm.c | 110 ++++-- arch/x86/kvm/vmx.c | 35 ++- arch/x86/kvm/x86.c | 79 ++++- arch/x86/kvm/x86_emulate.c | 33 +- include/asm-ia64/kvm_host.h | 8 +- include/asm-powerpc/kvm.h | 53 +++- include/asm-powerpc/kvm_asm.h | 55 +++ include/asm-powerpc/kvm_host.h | 152 +++++++ include/asm-powerpc/kvm_para.h | 38 ++ include/asm-powerpc/kvm_ppc.h | 88 ++++ include/asm-powerpc/mmu-44x.h | 2 + include/asm-x86/kvm.h | 20 + include/asm-x86/kvm_host.h | 29 +- include/linux/kvm.h | 71 ++++- include/linux/kvm_host.h | 31 ++ include/linux/kvm_types.h | 2 + virt/kvm/kvm_main.c | 107 +++++- virt/kvm/kvm_trace.c | 276 +++++++++++++ 47 files changed, 4065 insertions(+), 164 deletions(-) create mode 100644 Documentation/powerpc/kvm_440.txt create mode 100644 arch/powerpc/kvm/44x_tlb.c create mode 100644 arch/powerpc/kvm/44x_tlb.h create 
mode 100644 arch/powerpc/kvm/Kconfig create mode 100644 arch/powerpc/kvm/Makefile create mode 100644 arch/powerpc/kvm/booke_guest.c create mode 100644 arch/powerpc/kvm/booke_host.c create mode 100644 arch/powerpc/kvm/booke_interrupts.S create mode 100644 arch/powerpc/kvm/emulate.c create mode 100644 arch/powerpc/kvm/powerpc.c create mode 100644 include/asm-powerpc/kvm_asm.h create mode 100644 include/asm-powerpc/kvm_host.h create mode 100644 include/asm-powerpc/kvm_para.h create mode 100644 include/asm-powerpc/kvm_ppc.h create mode 100644 virt/kvm/kvm_trace.c |
From: Avi K. <av...@qu...> - 2008-04-21 10:29:58
|
From: Joerg Roedel <joe...@am...> The svm_set_cr4 function is indented with spaces. This patch replaces them with tabs. Signed-off-by: Joerg Roedel <joe...@am...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/svm.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index ad27346..d7439ce 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -878,10 +878,10 @@ set: static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { - vcpu->arch.cr4 = cr4; - if (!npt_enabled) - cr4 |= X86_CR4_PAE; - to_svm(vcpu)->vmcb->save.cr4 = cr4; + vcpu->arch.cr4 = cr4; + if (!npt_enabled) + cr4 |= X86_CR4_PAE; + to_svm(vcpu)->vmcb->save.cr4 = cr4; } static void svm_set_segment(struct kvm_vcpu *vcpu, -- 1.5.5 |
From: Jeremy F. <je...@go...> - 2008-04-21 09:57:21
|
Gerd Hoffmann wrote: > * Host: make kvm pv clock really compatible with xen pv clock. > * Guest/xen: factor out some xen clock code into a separate > source file (pvclock.[ch]), so kvm can reuse it. > * Guest/kvm: make kvm clock compatible with xen clock by using > the common code bits. > I guess saving on code duplication is good... > +cycle_t pvclock_clocksource_read(struct kvm_vcpu_time_info *src) > +{ > + struct pvclock_shadow_time *shadow = &get_cpu_var(shadow_time); > + cycle_t ret; > + > + pvclock_get_time_values(shadow, src); > + ret = shadow->system_timestamp + pvclock_get_nsec_offset(shadow); > You need to put this in a loop in case the system clock parameters change between the pvclock_get_time_values() and pvclock_get_nsec_offset(). How does kvm deal with suspend/resume with respect to time? Is the "system" timestamp guaranteed to remain monotonic? For Xen, I think we'll need to maintain an offset between the initial system timestamp and whatever it is after resuming. J |
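The retry pattern Jeremy is describing — snapshot the fields under a version number, and redo the whole read if the version was odd (update in progress) or changed mid-read — can be sketched like this. This is a simplified illustration, not the kernel implementation: field names loosely follow the pvclock layout, TSC deltas are treated as nanoseconds (the real code scales them), and the memory barriers the real code needs are only marked in comments:

```c
#include <stdint.h>

/* Simplified shapes of the structures under discussion. */
struct vcpu_time_info {
    uint32_t version;        /* odd while the hypervisor updates it */
    uint64_t tsc_timestamp;
    uint64_t system_time;    /* ns at tsc_timestamp */
};

struct shadow_time {
    uint32_t version;
    uint64_t tsc_timestamp;
    uint64_t system_timestamp;
};

/* Copy a consistent snapshot: retry while the version is odd or
 * changed during the copy.  (rmb() barriers elided.) */
static void get_time_values(struct shadow_time *dst,
                            const struct vcpu_time_info *src)
{
    do {
        dst->version = src->version;
        /* rmb() */
        dst->tsc_timestamp = src->tsc_timestamp;
        dst->system_timestamp = src->system_time;
        /* rmb() */
    } while ((src->version & 1) || dst->version != src->version);
}

/* Jeremy's point: the read path needs its own loop so the offset
 * computation stays atomic w.r.t. the snapshot parameters. */
static uint64_t clocksource_read(const struct vcpu_time_info *src,
                                 uint64_t now_tsc)
{
    struct shadow_time shadow;
    uint64_t ret;

    do {
        get_time_values(&shadow, src);
        ret = shadow.system_timestamp
            + (now_tsc - shadow.tsc_timestamp);   /* nsec offset */
    } while (shadow.version != src->version);

    return ret;
}
```

The outer loop is what closes the race: if the hypervisor republishes the parameters after the snapshot but before the offset is applied, the version comparison fails and the read is redone.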
From: Avi K. <av...@qu...> - 2008-04-21 09:20:21
|
David S. Ahern wrote: > I added the traces and captured data over another apparent lockup of the guest. > This seems to be representative of the sequence (pid/vcpu removed). > > (+4776) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016127c ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ] > (+3632) VMENTRY > (+4552) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016104a ] > (+ 0) PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61c8 ] > (+ 54928) VMENTRY > Can you oprofile the host to see where the 54K cycles are spent? > (+4568) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ] > (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db4 gpte = 0x00000000 41c5d363 ] > (+8432) VMENTRY > (+3936) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ] > (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ] > (+ 13832) VMENTRY > > > (+5768) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016127c ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ] > (+3712) VMENTRY > (+4576) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c016104a ] > (+ 0) PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61d0 ] > (+ 0) PTE_WRITE [ gpa = 0x00000000 3d5981d0 gpte = 0x00000000 3d55d047 ] > This indeed has the accessed bit clear. > (+ 65216) VMENTRY > (+4232) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610e7 ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ] > (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db4 gpte = 0x00000000 3d598363 ] > This has the accessed bit set and the user bit clear, and the pte pointing at the previous pte_write gpa. Looks like a kmap_atomic(). 
> (+8640) VMENTRY > (+3936) VMEXIT [ exitcode = 0x00000000, rip = 0x00000000 c01610ee ] > (+ 0) PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db0 ] > (+ 0) PTE_WRITE [ gpa = 0x00000000 00009db0 gpte = 0x00000000 00000000 ] > (+ 14160) VMENTRY > > I can forward a more complete time snippet if you'd like. vcpu0 + corresponding > vcpu1 files have 85000 total lines and compressed the files total ~500k. > > I did not see the FLOODED trace come out during this sample though I did bump > the count from 3 to 4 as you suggested. > > > Bumping the count was supposed to remove the flooding... > Correlating rip addresses to the 2.4 kernel: > > c0160d00-c0161290 = page_referenced > > It looks like the event is kscand running through the pages. I suspected this > some time ago, and tried tweaking the kscand_work_percent sysctl variable. It > appeared to lower the peak of the spikes, but maybe I imagined it. I believe > lowering that value makes kscand wake up more often but do less work (page > scanning) each time it is awakened. > > What does 'top' in the guest show (perhaps sorted by total cpu time rather than instantaneous usage)? What host kernel are you running? How many host cpus? -- error compiling committee.c: too many arguments to function |
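Avi's reading of the two gptes above can be checked mechanically by masking the standard low x86 page-table entry flag bits — a quick decoding helper for illustration, not kernel code:

```c
#include <stdint.h>

/* Standard low flag bits of an x86 page-table entry. */
#define PTE_PRESENT  (1u << 0)
#define PTE_RW       (1u << 1)
#define PTE_USER     (1u << 2)
#define PTE_ACCESSED (1u << 5)
#define PTE_DIRTY    (1u << 6)

static inline int pte_accessed(uint64_t pte)
{
    return !!(pte & PTE_ACCESSED);
}

static inline int pte_user(uint64_t pte)
{
    return !!(pte & PTE_USER);
}

/* gpte 0x3d55d047 (low bits 0x047): present, writable, user --
 * accessed bit clear, as noted above.
 * gpte 0x3d598363 (low bits 0x363): present, writable, accessed,
 * dirty -- user bit clear, which is what made it look like a
 * kernel-only kmap_atomic() mapping. */
```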
From: Tomas R. <li...@ko...> - 2008-04-21 09:10:03
|
Hello everybody After I update to KVM-66 (from 65), I have problem to boot guests with lilo installed. Boot sequence always stop with "LIL" output. With kvm-65 everythink works great. I have also windows XP guest, which boot without problem. With -no-kvm guests boot ok. Processor: AMD Opteron 2210 KVM: kvm-64 Host: gentoo-sources-2.6.25-r1 Arch: x86_64 Guests: gentoo, 2.6.25, x86_64 dmesg: Apr 21 09:53:37 kvm BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 Apr 21 09:53:37 kvm IP: [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f Apr 21 09:53:37 kvm PGD 11e989067 PUD 11ec0a067 PMD 0 Apr 21 09:53:37 kvm Oops: 0002 [2] SMP Apr 21 09:53:37 kvm CPU 2 Apr 21 09:53:37 kvm Modules linked in: w83627hf hwmon_vid xt_tcpudp xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables tun kvm_amd kvm shpchp pci_hotplug k8temp i2c_nforce2 i2c_core Apr 21 09:53:37 kvm Pid: 7130, comm: kvm Tainted: G D 2.6.25-gentoo-r1 #1 Apr 21 09:53:37 kvm RIP: 0010:[<ffffffff88029c85>] [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f Apr 21 09:53:37 kvm RSP: 0018:ffff81011e3e5738 EFLAGS: 00010246 Apr 21 09:53:37 kvm RAX: 0000000000000010 RBX: 0000000000000000 RCX: ffff81011e3e7378 Apr 21 09:53:37 kvm RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff81011e3e6000 Apr 21 09:53:37 kvm RBP: ffff81011e3e7378 R08: 0000000000000000 R09: 0000000000000000 Apr 21 09:53:37 kvm R10: ffffffff88041988 R11: ffff81011e3e7378 R12: ffff81011e3e7330 Apr 21 09:53:37 kvm R13: 0000000000000000 R14: ffffffff88033a20 R15: 0000000000000ce3 Apr 21 09:53:37 kvm FS: 0000000041e68950(0063) GS:ffff81011ff1b000(0000) knlGS:0000000000000000 Apr 21 09:53:37 kvm CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Apr 21 09:53:37 kvm CR2: 0000000000000000 CR3: 000000011e1af000 CR4: 00000000000006e0 Apr 21 09:53:37 kvm DR0: ffffffff805b34f8 DR1: 0000000000000000 DR2: 0000000000000000 Apr 21 09:53:37 kvm DR3: 0000000000000000 DR6: 
00000000ffff0ff1 DR7: 0000000000000701 Apr 21 09:53:37 kvm Process kvm (pid: 7130, threadinfo ffff81011e3e4000, task ffff81011e560000) Apr 21 09:53:37 kvm Stack: ffffffff8021a311 000000000000000f 00000000fffffff7 ffffffff8021a49b Apr 21 09:53:37 kvm 00000000ffffffff ffff81011ed41d00 ffffc20001926000 0000000000000000 Apr 21 09:53:37 kvm ffffffff8021a311 ffffffff802347a0 ffff81011ed41d00 ffffffff880419e0 Apr 21 09:53:37 kvm Call Trace: Apr 21 09:53:37 kvm [<ffffffff8021a311>] ? do_flush_tlb_all+0x0/0x2f Apr 21 09:53:37 kvm [<ffffffff8021a49b>] ? smp_call_function_mask+0x47/0x55 Apr 21 09:53:37 kvm [<ffffffff8021a311>] ? do_flush_tlb_all+0x0/0x2f Apr 21 09:53:37 kvm [<ffffffff802347a0>] ? on_each_cpu+0x19/0x25 Apr 21 09:53:37 kvm [<ffffffff880419e0>] Apr 21 09:53:37 kvm [<ffffffff88020501>] ? :kvm:kvm_get_cs_db_l_bits+0x9/0x2f Apr 21 09:53:37 kvm [<ffffffff8801f101>] ? :kvm:emulate_instruction+0x1ef/0x3a5 Apr 21 09:53:37 kvm [<ffffffff8801f101>] ? :kvm:emulate_instruction+0x1ef/0x3a5 Apr 21 09:53:37 kvm [<ffffffff88041fbc>] Apr 21 09:53:37 kvm [<ffffffff88020148>] ? :kvm:kvm_arch_vcpu_ioctl_run+0x44a/0x5b8 Apr 21 09:53:37 kvm [<ffffffff8801bf23>] ? :kvm:kvm_resched+0x1b4/0x9b7 Apr 21 09:53:37 kvm [<ffffffff8802ad63>] ? :kvm:kvm_pic_set_irq+0x21/0x6b Apr 21 09:53:37 kvm [<ffffffff8801e81b>] ? :kvm:kvm_arch_vm_ioctl+0x38e/0x5e6 Apr 21 09:53:37 kvm [<ffffffff8026217b>] ? zone_statistics+0x41/0x94 Apr 21 09:53:37 kvm [<ffffffff8025bc16>] ? get_page_from_freelist+0x457/0x5af Apr 21 09:53:37 kvm [<ffffffff8025bdc0>] ? __alloc_pages+0x52/0x2ee Apr 21 09:53:37 kvm [<ffffffff80225e50>] ? source_load+0x25/0x41 Apr 21 09:53:37 kvm [<ffffffff802286f1>] ? find_busiest_group+0x268/0x742 Apr 21 09:53:37 kvm [<ffffffff80225552>] ? hrtick_set+0x99/0x107 Apr 21 09:53:37 kvm [<ffffffff805b3aae>] ? thread_return+0x64/0xa5 Apr 21 09:53:37 kvm [<ffffffff80249099>] ? get_futex_key+0x76/0x14d Apr 21 09:53:37 kvm [<ffffffff80249816>] ? 
unqueue_me+0x6b/0x73 Apr 21 09:53:37 kvm [<ffffffff80249bc1>] ? futex_wait+0x290/0x327 Apr 21 09:53:37 kvm [<ffffffff80227c36>] ? try_to_wake_up+0xfa/0x10c Apr 21 09:53:37 kvm [<ffffffff80229752>] ? __wake_up_common+0x49/0x74 Apr 21 09:53:37 kvm [<ffffffff80268c29>] ? find_extend_vma+0x16/0x61 Apr 21 09:53:37 kvm [<ffffffff80249099>] ? get_futex_key+0x76/0x14d Apr 21 09:53:37 kvm [<ffffffff803c1439>] ? __up_read+0x10/0x8a Apr 21 09:53:37 kvm [<ffffffff8024955e>] ? futex_wake+0xfa/0x10c Apr 21 09:53:37 kvm [<ffffffff80242e5a>] ? ktime_get_ts+0x56/0x5d Apr 21 09:53:37 kvm [<ffffffff8801c3cb>] ? :kvm:kvm_resched+0x65c/0x9b7 Apr 21 09:53:37 kvm [<ffffffff80225552>] ? hrtick_set+0x99/0x107 Apr 21 09:53:37 kvm [<ffffffff8028a311>] ? vfs_ioctl+0x29/0x6f Apr 21 09:53:37 kvm [<ffffffff8028a5a4>] ? do_vfs_ioctl+0x24d/0x25c Apr 21 09:53:37 kvm [<ffffffff8028a5ef>] ? sys_ioctl+0x3c/0x61 Apr 21 09:53:37 kvm [<ffffffff8020b09b>] ? system_call_after_swapgs+0x7b/0x80 Apr 21 09:53:37 kvm Apr 21 09:53:37 kvm Apr 21 09:53:37 kvm Code: 02 74 20 77 06 ff c8 74 0e eb 78 83 f8 04 74 20 83 f8 08 74 27 eb 6c 48 8b 51 40 48 8b 41 30 88 02 eb 60 48 8b 51 40 48 8b 41 30 <66> 89 02 eb 53 48 8b 41 40 8b 49 30 48 89 08 eb 47 48 8b 51 40 Apr 21 09:53:37 kvm RIP [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f Apr 21 09:53:37 kvm RSP <ffff81011e3e5738> Apr 21 09:53:37 kvm CR2: 0000000000000000 Apr 21 09:53:37 kvm ---[ end trace 8b01d2fbd0fdd57f ]--- -- Tomas Rusnak |
From: Gerd H. <kr...@re...> - 2008-04-21 08:34:07
|
Gerd Hoffmann wrote: > Marcelo Tosatti wrote: >> Haven't seen Gerd's guest patches ? > > I'm still busy cooking them up. I've mentioned them in a mail, but they > didn't run over the list (yet). Stay tuned ;) It compiles, ship it! This time as an all-in-one patch (both guest and host side). Almost untested and not (yet) split into pieces. Changes: * Host: make kvm pv clock really compatible with xen pv clock. * Guest/xen: factor out some xen clock code into a separate source file (pvclock.[ch]), so kvm can reuse it. * Guest/kvm: make kvm clock compatible with xen clock by using the common code bits. Tests, reviews and comments are welcome. cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/
From: Yunfeng Z. <yun...@in...> - 2008-04-21 08:29:57
|
Hi All, This is today's KVM test result against kvm.git 6cf59734fc9bc89954d0157524eea156c2f9a5ab and kvm-userspace.git 43201923a67647913b67da255ca60f0269a3e34a. One Issue Fixed ================================================ 1.Can't boot smp guests on ia32e host https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1944629&group_id=180599 Three Old Issues: ================================================ 1. Booting four guests likely fails https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599 2. booting smp windows guests has 30% chance of hang https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1910923&group_id=180599 3. Cannot boot guests with hugetlbfs https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1941302&group_id=180599 Test environment ================================================ Platform Woodcrest CPU 4 Memory size 8G' Details ================================================ IA32-pae: 1. boot guest with 256M memory PASS 2. boot two windows xp guest PASS 3. boot 4 same guest in parallel PASS 4. boot linux and windows guest in parallel PASS 5. boot guest with 1500M memory PASS 6. boot windows 2003 with ACPI enabled PASS 7. boot Windows xp with ACPI enabled PASS 8. boot Windows 2000 without ACPI PASS 9. kernel build on SMP linux guest PASS 10. LTP on linux guest PASS 11. boot base kernel linux PASS 12. save/restore 32-bit HVM guests PASS 13. live migration 32-bit HVM guests PASS 14. boot SMP Windows xp with ACPI enabled PASS 15. boot SMP Windows 2003 with ACPI enabled PASS 16. boot SMP Windows 2000 with ACPI enabled PASS ================================================ IA32e: 1. boot four 32-bit guest in parallel PASS 2. boot four 64-bit guest in parallel PASS 3. boot 4G 64-bit guest PASS 4. boot 4G pae guest PASS 5. boot 32-bit linux and 32 bit windows guest in parallel PASS 6. boot 32-bit guest with 1500M memory PASS 7. boot 64-bit guest with 1500M memory PASS 8. 
boot 32-bit guest with 256M memory PASS 9. boot 64-bit guest with 256M memory PASS 10. boot two 32-bit windows xp in parallel PASS 11. boot four 32-bit different guest in para PASS 12. save/restore 64-bit linux guests PASS 13. save/restore 32-bit linux guests PASS 14. boot 32-bit SMP windows 2003 with ACPI enabled PASS 15. boot 32-bit SMP Windows 2000 with ACPI enabled PASS 16. boot 32-bit SMP Windows xp with ACPI enabled PASS 17. boot 32-bit Windows 2000 without ACPI PASS 18. boot 64-bit Windows xp with ACPI enabled PASS 19. boot 32-bit Windows xp without ACPI PASS 20. boot 64-bit UP vista PASS 21. boot 64-bit SMP vista PASS 22. kernel build in 32-bit linux guest OS PASS 23. kernel build in 64-bit linux guest OS PASS 24. LTP on 32-bit linux guest OS PASS 25. LTP on 64-bit linux guest OS PASS 26. boot 64-bit guests with ACPI enabled PASS 27. boot 32-bit x-server PASS 28. boot 64-bit SMP windows XP with ACPI enabled PASS 29. boot 64-bit SMP windows 2003 with ACPI enabled PASS 30. live migration 64bit linux guests PASS 31. live migration 32bit linux guests PASS 32. reboot 32bit windows xp guest PASS 33. 
reboot 32bit windows xp guest PASS Report Summary on IA32-pae Summary Test Report of Last Session ===================================================================== Total Pass Fail NoResult Crash ===================================================================== control_panel 8 5 3 0 0 Restart 2 2 0 0 0 gtest 15 14 1 0 0 ===================================================================== control_panel 8 5 3 0 0 :KVM_LM_PAE_gPAE 1 0 1 0 0 :KVM_four_sguest_PAE_gPA 1 1 0 0 0 :KVM_256M_guest_PAE_gPAE 1 1 0 0 0 :KVM_linux_win_PAE_gPAE 1 1 0 0 0 :KVM_1500M_guest_PAE_gPA 1 1 0 0 0 :KVM_SR_PAE_gPAE 1 0 1 0 0 :KVM_two_winxp_PAE_gPAE 1 1 0 0 0 :KVM_4G_guest_PAE_gPAE 1 0 1 0 0 Restart 2 2 0 0 0 :GuestPAE_PAE_gPAE 1 1 0 0 0 :BootTo32pae_PAE_gPAE 1 1 0 0 0 gtest 15 14 1 0 0 :ltp_nightly_PAE_gPAE 1 1 0 0 0 :boot_up_acpi_PAE_gPAE 1 1 0 0 0 :reboot_xp_PAE_gPAE 1 1 0 0 0 :boot_up_vista_PAE_gPAE 1 0 1 0 0 :boot_up_acpi_xp_PAE_gPA 1 1 0 0 0 :boot_up_acpi_win2k3_PAE 1 1 0 0 0 :boot_base_kernel_PAE_gP 1 1 0 0 0 :boot_smp_acpi_win2k3_PA 1 1 0 0 0 :boot_smp_acpi_win2k_PAE 1 1 0 0 0 :boot_up_acpi_win2k_PAE_ 1 1 0 0 0 :boot_smp_acpi_xp_PAE_gP 1 1 0 0 0 :boot_up_noacpi_win2k_PA 1 1 0 0 0 :boot_smp_vista_PAE_gPAE 1 1 0 0 0 :bootx_PAE_gPAE 1 1 0 0 0 :kb_nightly_PAE_gPAE 1 1 0 0 0 ===================================================================== Total 25 21 4 0 0 Report Summary on IA32e Summary Test Report of Last Session ===================================================================== Total Pass Fail NoResult Crash ===================================================================== control_panel 15 15 0 0 0 Restart 3 3 0 0 0 gtest 25 25 0 0 0 ===================================================================== control_panel 15 15 0 0 0 :KVM_LM_64_g64 1 1 0 0 0 :KVM_four_sguest_64_gPAE 1 1 0 0 0 :KVM_4G_guest_64_g64 1 1 0 0 0 :KVM_four_sguest_64_g64 1 1 0 0 0 :KVM_linux_win_64_gPAE 1 1 0 0 0 :KVM_1500M_guest_64_gPAE 1 1 0 0 0 :KVM_SR_64_g64 1 1 0 0 0 :KVM_LM_64_gPAE 1 1 0 0 
0 :KVM_256M_guest_64_g64 1 1 0 0 0 :KVM_1500M_guest_64_g64 1 1 0 0 0 :KVM_4G_guest_64_gPAE 1 1 0 0 0 :KVM_SR_64_gPAE 1 1 0 0 0 :KVM_256M_guest_64_gPAE 1 1 0 0 0 :KVM_two_winxp_64_gPAE 1 1 0 0 0 :KVM_four_dguest_64_gPAE 1 1 0 0 0 Restart 3 3 0 0 0 :GuestPAE_64_gPAE 1 1 0 0 0 :BootTo64_64_gPAE 1 1 0 0 0 :Guest64_64_gPAE 1 1 0 0 0 gtest 25 25 0 0 0 :boot_up_acpi_64_gPAE 1 1 0 0 0 :boot_up_noacpi_xp_64_gP 1 1 0 0 0 :boot_smp_acpi_xp_64_g64 1 1 0 0 0 :boot_base_kernel_64_gPA 1 1 0 0 0 :boot_smp_acpi_win2k3_64 1 1 0 0 0 :boot_smp_acpi_win2k_64_ 1 1 0 0 0 :boot_base_kernel_64_g64 1 1 0 0 0 :bootx_64_gPAE 1 1 0 0 0 :kb_nightly_64_gPAE 1 1 0 0 0 :ltp_nightly_64_g64 1 1 0 0 0 :boot_up_acpi_64_g64 1 1 0 0 0 :boot_up_noacpi_win2k_64 1 1 0 0 0 :boot_smp_acpi_xp_64_gPA 1 1 0 0 0 :boot_smp_vista_64_gPAE 1 1 0 0 0 :boot_up_acpi_win2k3_64_ 1 1 0 0 0 :reboot_xp_64_gPAE 1 1 0 0 0 :bootx_64_g64 1 1 0 0 0 :boot_up_vista_64_g64 1 1 0 0 0 :boot_smp_vista_64_g64 1 1 0 0 0 :boot_up_acpi_xp_64_g64 1 1 0 0 0 :boot_up_vista_64_gPAE 1 1 0 0 0 :ltp_nightly_64_gPAE 1 1 0 0 0 :boot_smp_acpi_win2k3_64 1 1 0 0 0 :boot_up_noacpi_win2k3_6 1 1 0 0 0 :kb_nightly_64_g64 1 1 0 0 0 ===================================================================== Total 43 43 0 0 0 Best Regards, Yunfeng |
From: Gerd H. <kr...@re...> - 2008-04-21 07:31:51
|
Jeremy Fitzhardinge wrote: > Gerd Hoffmann wrote: >> I'm looking at the guest side of the issue right now, trying to identify >> common code, and while doing so noticed that xen does the >> version-check-loop in both get_time_values_from_xen(void) and >> xen_clocksource_read(void), and I can't see any obvious reason for that. >> The loop in xen_clocksource_read(void) is not needed IMHO. Can I >> drop it? > > No. The get_nsec_offset() needs to be atomic with respect to the > get_time_values() parameters. Hmm, I somehow fail to see a case where it could be non-atomic ... get_time_values() copies a consistent snapshot, thus xen_clocksource_read() doesn't race against xen updating the fields. The snapshot is in a per-cpu variable, thus it doesn't race against other guest vcpus running get_time_values() at the same time. > There could be a loopless > __get_time_values() for use in this case, but given that it almost never > loops, I don't think its worthwhile. "in this case" ??? I'm confused. There is only a single user of get_nsec_offset(), which is xen_clocksource_read() ... cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/ |
From: Gerd H. <kr...@re...> - 2008-04-21 07:15:31
|
Marcelo Tosatti wrote: >> From what me and marcelo discussed, I think there's a possibility that >> it has marginally something to do with precision of clock calculation. Gerd's patches address that issue. Can somebody test this with those >> patches (both guest and host), while I'm off ? > Haven't seen Gerd's guest patches ? I'm still busy cooking them up. I've mentioned them in a mail, but they didn't run over the list (yet). Stay tuned ;) cheers, Gerd -- http://kraxel.fedorapeople.org/xenner/
From: Soren H. <so...@ub...> - 2008-04-21 07:08:15
|
Esteemed kvm developers! I've been trying to debug this bug https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/219165 It originally revealed itself by failing to run grub (which is a 32 bit binary) when installing Ubuntu from our live cd. It turned out to be a more general problem of 32 bit binaries failing to run. The server install worked like a charm. I eventually discovered that loading the vmmouse driver triggered it and narrowed it down to the call to kvm_load_registers in vmport_ioport_read. We're releasing on Thursday, and I needed a quick fix, so I reverted the calls to kvm_{save,load}_registers in vmport_ioport_read to the old code that simply saved the eax, ebx, ecx, edx, esi, and edi registers, but I'm supposing kvm_{load,save}_registers really should work here. I dug a bit further into the code and tried disabling various pieces of the kvm_load_registers until it finally worked again. The problem seems to only arise when the lstar msr is loaded. I've looked at the code, but seeing as three days ago I didn't know there was such a thing as an lstar msr, I'm finding myself getting stuck. :) Any pointers in the right direction would be lovely. -- Soren Hansen | Virtualisation specialist | Ubuntu Server Team Canonical Ltd. | http://www.ubuntu.com/ |
From: Avi K. <av...@qu...> - 2008-04-21 06:44:36
|
Javier Guerra Giraldez wrote: > On Sunday 20 April 2008, Avi Kivity wrote: > >> Also, I'd presume that those that need 10K IOPS and above will not place >> their high throughput images on a filesystem; rather on a separate SAN LUN. >> > > i think that too; but still that LUN would be accessed by the VM's via one of > these IO emulation layers, right? > > Yes. Hopefully Linux aio. > or maybe you're advocating using the SAN initiator in the VM instead of the > host? > That works too, especially for iSCSI, but that's not what I'm advocating. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. |
From: Avi K. <av...@qu...> - 2008-04-21 06:43:08
|
Jamie Lokier wrote:
> Avi Kivity wrote:
>>> Does that mean "for the majority of deployments, the slow version is
>>> sufficient. The few that care about performance can use Linux AIO?"
>>
>> In essence, yes. s/slow/slower/ and s/performance/ultimate block device
>> performance/.
>>
>> Many deployments don't care at all about block device performance; they
>> care mostly about networking performance.
>
> That's interesting. I'd have expected block device performance to be
> important for most things, for the same reason that disk performance
> is (well, reasonably) important for non-virtual machines.

Seek time is important. Bandwidth is somewhat important. But for one- and two-spindle workloads (the majority), the cpu utilization induced by getting requests to the disk is not important, and that's what we're optimizing here.

Disks work at around 300 Hz; processors at around 3 GHz. That's seven orders of magnitude of difference. Even if you spent 100 usec calculating the next best seek, and it saved you only 10% of seeks, it would be a win. And of course modern processors spend at most a few microseconds getting a request out. You really need 50+ disks, or a large write-back cache, before micro-optimizations in the submission path are felt.

> But as you say next:
>
>>> I'm under the impression that the entire and only point of Linux AIO
>>> is that it's faster than POSIX AIO on Linux.
>>
>> It is. I estimate posix aio adds a few microseconds above linux aio per
>> I/O request when using O_DIRECT. Assuming 10 microseconds, you would
>> need 10,000 I/O requests per second per vcpu to see a 10% performance
>> difference. That's definitely rare.
>
> Oh, I didn't realise the difference was so small.
>
> At such a tiny difference, I'm wondering why Linux AIO exists at all,
> as it complicates the kernel rather a lot. I can see the theoretical
> appeal, but if performance is so marginal, I'm surprised it's in
> there.
Linux aio exists, but that's all that can be said for it. It works mostly for raw disks, doesn't integrate with networking, and doesn't advance at the same pace as the rest of the kernel. I believe only databases use it (and a userspace filesystem I wrote some time ago).

> I'm also surprised the Glibc implementation of AIO using ordinary
> threads is so close to it.

Why are you surprised? Actually, the glibc implementation could be improved from what I've heard. My estimates are for a thread-pool implementation, and there is no reason why glibc couldn't achieve exactly the same performance.

> And then, I'm wondering why use AIO at all: it suggests QEMU would run
> about as fast doing synchronous I/O in a few dedicated I/O threads.

Posix aio is the unix API for this; why not use it?

>> Also, I'd presume that those that need 10K IOPS and above will not place
>> their high-throughput images on a filesystem; rather on a separate SAN LUN.
>
> Does the separate LUN make any difference? I thought O_DIRECT on a
> filesystem was meant to be pretty close to block device performance.

On a good extent-based filesystem like XFS you will get good performance (though with more cpu overhead, due to needing to go through additional mapping layers). Old clunkers like ext3 will require additional seeks or a ton of cache (1 GB per 1 TB).

> I base this on messages here and there which say swapping to a file is
> about as fast as swapping to a block device, nowadays.

Swapping to a file preloads the block mapping into memory, so the filesystem is not involved at all in the I/O path.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
|
From: Avi K. <av...@qu...> - 2008-04-21 06:13:55
|
Marcelo Tosatti wrote:
> On Sun, Apr 20, 2008 at 02:16:52PM +0300, Avi Kivity wrote:
>>> The iperf numbers are pretty good. Performance of UP guests increases
>>> slightly, but SMP is quite significant.
>>
>> I expect you're seeing contention induced by memcpy()s and inefficient
>> emulation. With the dma api, I expect the benefit will drop.
>
> You still have to memcpy() with the dma api. Even with vringfd the
> kernel->user copy has to be performed under the global mutex protection,
> the difference being that several packets can be copied per syscall
> instead of only one.

Block does the copy outside the mutex protection, so net can be adapted to do the same. It does mean we will need to block all I/O temporarily during memory hotplug.

>> For pure cpu emulation, there is a ton of work to be done: protecting
>> the translator as well as making the translated code smp safe.
>
> I now believe there is a lot of work (which was not clear before). I'm
> not particularly interested in getting real emulation to be
> multithreaded.
>
> Anyway, the lack of multithreading in qemu emulation should not be a
> blocker for these patches to get in, since these are infrastructural
> changes.

Getting this into qemu upstream is essential, as this is far more intrusive than anything else we've done. But again, I believe there are many other fruit hanging from lower branches.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
|