From: Avi K. <av...@qu...> - 2008-04-30 13:39:55
Hollis Blanchard wrote:
> Acked-by: Hollis Blanchard <ho...@us...>
>
> Avi, please apply for 2.6.26.
>

Sure thing.  Thanks.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Jan K. <jan...@si...> - 2008-04-30 13:39:52
This looks bogus, but it is so far without practical impact (phys_start
is always 0 when we do the calculation).

Signed-off-by: Jan Kiszka <jan...@si...>

---
 libkvm/libkvm.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: b/libkvm/libkvm.c
===================================================================
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -550,7 +550,7 @@ int kvm_register_userspace_phys_mem(kvm_
 	int r;
 
 	if (!kvm->physical_memory)
-		kvm->physical_memory = userspace_addr - phys_start;
+		kvm->physical_memory = userspace_addr + phys_start;
 
 	memory.slot = get_free_slot(kvm);
 	r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &memory);
From: Avi K. <av...@qu...> - 2008-04-30 13:29:51
|
Jan Kiszka wrote: >>> Clear the pending original exception when raising a triple fault. This >>> allows to re-use the vcpu instance, e.g. after a reset which is >>> typically issued as reaction on the triple fault. >>> >>> Signed-off-by: Jan Kiszka <jan...@si...> >>> >>> --- >>> arch/x86/kvm/x86.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> Index: b/arch/x86/kvm/x86.c >>> =================================================================== >>> --- a/arch/x86/kvm/x86.c >>> +++ b/arch/x86/kvm/x86.c >>> @@ -149,8 +149,10 @@ static void handle_multiple_faults(struc >>> if (vcpu->arch.exception.nr != DF_VECTOR) { >>> vcpu->arch.exception.nr = DF_VECTOR; >>> vcpu->arch.exception.error_code = 0; >>> - } else >>> + } else { >>> set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >>> + vcpu->arch.exception.pending = false; >>> + } >>> } >>> >>> void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) >>> >>> >>> >> There's a bigger problem here. The exception queue is hidden state that >> qemu and load and save. >> > > Could you elaborate a bit on what the problematic scenario precisely is > (that pending triple faults would not be saved/restored while pending > exceptions are?), and if I/we can do anything to resolve it? > Two scenarios: savevm (no pending exception) guest runs... loadvm (with a pending exception in the current state) spurious exception injected savevm (pending exception, lost) new qemu instance (or live migration) loadvm (exception not delivered) The second scenario is not too bad, I guess: for fault-like exceptions, the first instruction would fault again and the exception would be regenerated. The first scenario is bad, but I guess very unlikely. One fix would be to expose the exception queue to userspace. I don't like it since this is not x86 architectural state but a kvm artifact. Maybe we should clear the exception queue on kvm_set_sregs() (that should fix the reset case as well). -- Any sufficiently difficult bug is indistinguishable from a feature. |
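
(To make the kvm_set_sregs() idea Avi floats above concrete: a minimal
sketch of what such a change could look like, assuming the x86
kvm_arch_vcpu_ioctl_set_sregs() entry point of that era.  This is an
illustration only, not a patch from this thread.)

int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
				  struct kvm_sregs *sregs)
{
	vcpu_load(vcpu);

	/* ... existing code copying segment and control registers
	 *     from *sregs into vcpu->arch ... */

	/*
	 * Userspace rewriting the special registers usually means a
	 * reset: drop whatever is still sitting in the hidden
	 * exception queue so no stale fault gets injected afterwards.
	 */
	vcpu->arch.exception.pending = false;

	vcpu_put(vcpu);
	return 0;
}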
From: Jan K. <jan...@si...> - 2008-04-30 13:12:45
|
Joerg Roedel wrote: > The current KVM x86 exception code handles double and triple faults only for > page fault exceptions. This patch extends this detection for every exception > that gets queued for the guest. > > Signed-off-by: Joerg Roedel <joe...@am...> > Cc: Jan Kiszka <jan...@si...> > --- > arch/x86/kvm/x86.c | 31 +++++++++++++++++-------------- > 1 files changed, 17 insertions(+), 14 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 578a0c1..c05aa32 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -144,9 +144,21 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data) > } > EXPORT_SYMBOL_GPL(kvm_set_apic_base); > > +static void handle_multiple_faults(struct kvm_vcpu *vcpu) > +{ > + if (vcpu->arch.exception.nr != DF_VECTOR) { > + vcpu->arch.exception.nr = DF_VECTOR; > + vcpu->arch.exception.error_code = 0; > + } else > + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > +} > + > void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) > { > - WARN_ON(vcpu->arch.exception.pending); > + if (vcpu->arch.exception.pending) { > + handle_multiple_faults(vcpu); > + return; > + } > vcpu->arch.exception.pending = true; > vcpu->arch.exception.has_error_code = false; > vcpu->arch.exception.nr = nr; > @@ -157,25 +169,16 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr, > u32 error_code) > { > ++vcpu->stat.pf_guest; > - if (vcpu->arch.exception.pending) { > - if (vcpu->arch.exception.nr == PF_VECTOR) { > - printk(KERN_DEBUG "kvm: inject_page_fault:" > - " double fault 0x%lx\n", addr); > - vcpu->arch.exception.nr = DF_VECTOR; > - vcpu->arch.exception.error_code = 0; > - } else if (vcpu->arch.exception.nr == DF_VECTOR) { > - /* triple fault -> shutdown */ > - set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > - } > - return; > - } > vcpu->arch.cr2 = addr; > kvm_queue_exception_e(vcpu, PF_VECTOR, error_code); > } > > void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code) > { > - WARN_ON(vcpu->arch.exception.pending); > + if (vcpu->arch.exception.pending) { > + handle_multiple_faults(vcpu); > + return; > + } > vcpu->arch.exception.pending = true; > vcpu->arch.exception.has_error_code = true; > vcpu->arch.exception.nr = nr; And here is an add-on patch to fix reset-on-triple-fault: Clear the pending original exception when raising a triple fault. This allows to re-use the vcpu instance, e.g. after a reset which is typically issued as reaction on the triple fault. Signed-off-by: Jan Kiszka <jan...@si...> --- arch/x86/kvm/x86.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) Index: b/arch/x86/kvm/x86.c =================================================================== --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -149,8 +149,10 @@ static void handle_multiple_faults(struc if (vcpu->arch.exception.nr != DF_VECTOR) { vcpu->arch.exception.nr = DF_VECTOR; vcpu->arch.exception.error_code = 0; - } else + } else { set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); + vcpu->arch.exception.pending = false; + } } void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) |
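
(Context for readers following the KVM_REQ_TRIPLE_FAULT request used
above: roughly how the vcpu run loop of that era consumed it, simplified
and reproduced from memory rather than taken from this thread.  The
request becomes a KVM_EXIT_SHUTDOWN exit, so userspace (QEMU) can reset
or stop the guest.)

	/* in arch/x86/kvm/x86.c:__vcpu_run(), before entering the guest */
	if (vcpu->requests) {
		if (test_and_clear_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests)) {
			kvm_run->exit_reason = KVM_EXIT_SHUTDOWN;
			r = 0;
			goto out;	/* return to userspace */
		}
	}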
From: Jan K. <jan...@si...> - 2008-04-30 13:08:51
Minor cleanup I came across while reverting printf instrumentations.

Signed-off-by: Jan Kiszka <jan...@si...>

---
 libkvm/libkvm-x86.c |    4 ++--
 libkvm/libkvm.c     |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

Index: b/libkvm/libkvm-x86.c
===================================================================
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -117,7 +117,7 @@ static int kvm_init_tss(kvm_context_t kv
 	 */
 	r = kvm_set_tss_addr(kvm, 0xfffbd000);
 	if (r < 0) {
-		printf("kvm_init_tss: unable to set tss addr\n");
+		fprintf(stderr, "kvm_init_tss: unable to set tss addr\n");
 		return r;
 	}
 
@@ -157,7 +157,7 @@ int kvm_create_pit(kvm_context_t kvm)
 		if (r >= 0)
 			kvm->pit_in_kernel = 1;
 		else {
-			printf("Create kernel PIC irqchip failed\n");
+			fprintf(stderr, "Create kernel PIC irqchip failed\n");
 			return r;
 		}
 	}
Index: b/libkvm/libkvm.c
===================================================================
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -368,7 +368,7 @@ void kvm_create_irqchip(kvm_context_t kv
 		if (r >= 0)
 			kvm->irqchip_in_kernel = 1;
 		else
-			printf("Create kernel PIC irqchip failed\n");
+			fprintf(stderr, "Create kernel PIC irqchip failed\n");
 	}
 }
 #endif
@@ -877,7 +877,7 @@ again:
 	if (r == -1 && errno != EINTR && errno != EAGAIN) {
 		r = -errno;
 		post_kvm_run(kvm, vcpu);
-		printf("kvm_run: %s\n", strerror(-r));
+		fprintf(stderr, "kvm_run: %s\n", strerror(-r));
 		return r;
 	}
From: Fabian D. <fab...@gm...> - 2008-04-30 13:06:59
Avi Kivity wrote:
> Fabian Deutsch wrote:
> > Hey.
> >
> > I've been trying Microsoft Windows 2003 a couple of times. The wiki
> > tells me that "everything" should work okay. It does, when using -smp 1,
> > but gets ugly when using -smp 2 or so.
> >
> > SO might it be useful, to add the column "smp" to the "Guest Support
> > Status" Page in the wiki?
> >
>
> SMP Windows work best if you have FlexPriority on your hardware. What
> host cpu are you using?

In general I am not able to install Microsoft Windows guests when using
-smp > 1 on the following hardware (and kvm modules+userspace head):

Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
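
(Aside: one rough way to check for FlexPriority — the "virtualize APIC
accesses" secondary VMX control — from userspace.  This assumes a Linux
host with the msr module loaded and root access; it is only an
illustration, not a tool referenced anywhere in this thread.)

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define IA32_VMX_PROCBASED_CTLS2 0x48b

int main(void)
{
	uint64_t val;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0 ||
	    pread(fd, &val, sizeof(val), IA32_VMX_PROCBASED_CTLS2) != sizeof(val)) {
		perror("rdmsr");  /* CPU may not expose secondary VMX controls */
		return 1;
	}
	/* Bits 63:32 are the allowed-1 settings; bit 0 of the secondary
	   controls is "virtualize APIC accesses" (FlexPriority). */
	printf("FlexPriority %savailable\n", (val >> 32) & 1 ? "" : "not ");
	return 0;
}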
From: Jan K. <jan...@si...> - 2008-04-30 13:03:40
|
Avi Kivity wrote: > Jan Kiszka wrote: >> Joerg Roedel wrote: >> >>> The current KVM x86 exception code handles double and triple faults >>> only for >>> page fault exceptions. This patch extends this detection for every >>> exception >>> that gets queued for the guest. >>> >>> Signed-off-by: Joerg Roedel <joe...@am...> >>> Cc: Jan Kiszka <jan...@si...> >>> --- >>> arch/x86/kvm/x86.c | 31 +++++++++++++++++-------------- >>> 1 files changed, 17 insertions(+), 14 deletions(-) >>> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>> index 578a0c1..c05aa32 100644 >>> --- a/arch/x86/kvm/x86.c >>> +++ b/arch/x86/kvm/x86.c >>> @@ -144,9 +144,21 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, >>> u64 data) >>> } >>> EXPORT_SYMBOL_GPL(kvm_set_apic_base); >>> >>> +static void handle_multiple_faults(struct kvm_vcpu *vcpu) >>> +{ >>> + if (vcpu->arch.exception.nr != DF_VECTOR) { >>> + vcpu->arch.exception.nr = DF_VECTOR; >>> + vcpu->arch.exception.error_code = 0; >>> + } else >>> + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >>> +} >>> + >>> void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) >>> { >>> - WARN_ON(vcpu->arch.exception.pending); >>> + if (vcpu->arch.exception.pending) { >>> + handle_multiple_faults(vcpu); >>> + return; >>> + } >>> vcpu->arch.exception.pending = true; >>> vcpu->arch.exception.has_error_code = false; >>> vcpu->arch.exception.nr = nr; >>> @@ -157,25 +169,16 @@ void kvm_inject_page_fault(struct kvm_vcpu >>> *vcpu, unsigned long addr, >>> u32 error_code) >>> { >>> ++vcpu->stat.pf_guest; >>> - if (vcpu->arch.exception.pending) { >>> - if (vcpu->arch.exception.nr == PF_VECTOR) { >>> - printk(KERN_DEBUG "kvm: inject_page_fault:" >>> - " double fault 0x%lx\n", addr); >>> - vcpu->arch.exception.nr = DF_VECTOR; >>> - vcpu->arch.exception.error_code = 0; >>> - } else if (vcpu->arch.exception.nr == DF_VECTOR) { >>> - /* triple fault -> shutdown */ >>> - set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >>> - } >>> - return; >>> - } >>> vcpu->arch.cr2 = addr; >>> kvm_queue_exception_e(vcpu, PF_VECTOR, error_code); >>> } >>> >>> void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 >>> error_code) >>> { >>> - WARN_ON(vcpu->arch.exception.pending); >>> + if (vcpu->arch.exception.pending) { >>> + handle_multiple_faults(vcpu); >>> + return; >>> + } >>> vcpu->arch.exception.pending = true; >>> vcpu->arch.exception.has_error_code = true; >>> vcpu->arch.exception.nr = nr; >>> >> >> And here is an add-on patch to fix reset-on-triple-fault: >> >> >> Clear the pending original exception when raising a triple fault. This >> allows to re-use the vcpu instance, e.g. after a reset which is >> typically issued as reaction on the triple fault. >> >> Signed-off-by: Jan Kiszka <jan...@si...> >> >> --- >> arch/x86/kvm/x86.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> Index: b/arch/x86/kvm/x86.c >> =================================================================== >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -149,8 +149,10 @@ static void handle_multiple_faults(struc >> if (vcpu->arch.exception.nr != DF_VECTOR) { >> vcpu->arch.exception.nr = DF_VECTOR; >> vcpu->arch.exception.error_code = 0; >> - } else >> + } else { >> set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >> + vcpu->arch.exception.pending = false; >> + } >> } >> >> void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) >> >> > > There's a bigger problem here. The exception queue is hidden state that > qemu and load and save. 
Could you elaborate a bit on what the problematic scenario precisely is
(that pending triple faults would not be saved/restored while pending
exceptions are?), and if I/we can do anything to resolve it?

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux |
From: Avi K. <av...@qu...> - 2008-04-30 13:02:11
In about a week, the various kvm lists will move to vger.kernel.org.
This will improve responsiveness, and reduce spam and advertising.

Please subscribe to the lists you are interested in as soon as possible.
You can subscribe by sending an email to maj...@vg..., with the
following lines in the body:

  subscribe kvm
  subscribe kvm-commits
  subscribe kvm-ia64
  subscribe kvm-ppc

Of course, omit the lines for the lists you are not interested in.
Majordomo will then send further instructions.

Thanks to the vger admins for hosting the kvm lists.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Avi K. <av...@qu...> - 2008-04-30 13:02:11
David S. Ahern wrote:
> Another tidbit for you guys as I make my way through various permutations:
> I installed the RHEL3 hugemem kernel and the guest behavior is *much* better.
> System time still has some regular hiccups that are higher than xen and esx
> (e.g., 1 minute samples out of 5 show system time between 10 and 15%), but
> overall guest behavior is good with the hugemem kernel.
>

Wait, the amount of info here is overwhelming.  Let's stick with the
current kernel (32-bit, HIGHMEM4G, right?)

Did you get any traces with bypass_guest_pf=0?  That may show more info.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Jiang, Y. <yun...@in...> - 2008-04-30 12:58:05
I noticed there is a Windows PV driver based on virtio at
http://sourceforge.net/project/showfiles.php?group_id=180599
but when I enable the driver in the guest, the guest hangs.

I'm using a changeset from around April 18. Since the driver was created
in March, I assume the April changeset should be ok.

Are there any special actions needed to enable the PV driver in Windows?
Has anyone tried it recently?

--
Yunhong Jiang
From: Joerg R. <joe...@am...> - 2008-04-30 12:57:53
|
On Wed, Apr 30, 2008 at 10:45:12AM +0200, Jan Kiszka wrote: > Joerg Roedel wrote: > > The current KVM x86 exception code handles double and triple faults only for > > page fault exceptions. This patch extends this detection for every exception > > that gets queued for the guest. > > > > Signed-off-by: Joerg Roedel <joe...@am...> > > Cc: Jan Kiszka <jan...@si...> > > --- > > arch/x86/kvm/x86.c | 31 +++++++++++++++++-------------- > > 1 files changed, 17 insertions(+), 14 deletions(-) > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index 578a0c1..c05aa32 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -144,9 +144,21 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data) > > } > > EXPORT_SYMBOL_GPL(kvm_set_apic_base); > > > > +static void handle_multiple_faults(struct kvm_vcpu *vcpu) > > +{ > > + if (vcpu->arch.exception.nr != DF_VECTOR) { > > + vcpu->arch.exception.nr = DF_VECTOR; > > + vcpu->arch.exception.error_code = 0; > > + } else > > + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > > +} > > + > > void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) > > { > > - WARN_ON(vcpu->arch.exception.pending); > > + if (vcpu->arch.exception.pending) { > > + handle_multiple_faults(vcpu); > > + return; > > + } > > vcpu->arch.exception.pending = true; > > vcpu->arch.exception.has_error_code = false; > > vcpu->arch.exception.nr = nr; > > @@ -157,25 +169,16 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr, > > u32 error_code) > > { > > ++vcpu->stat.pf_guest; > > - if (vcpu->arch.exception.pending) { > > - if (vcpu->arch.exception.nr == PF_VECTOR) { > > - printk(KERN_DEBUG "kvm: inject_page_fault:" > > - " double fault 0x%lx\n", addr); > > - vcpu->arch.exception.nr = DF_VECTOR; > > - vcpu->arch.exception.error_code = 0; > > - } else if (vcpu->arch.exception.nr == DF_VECTOR) { > > - /* triple fault -> shutdown */ > > - set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > > - } > > - return; > > - } > > vcpu->arch.cr2 = addr; > > kvm_queue_exception_e(vcpu, PF_VECTOR, error_code); > > } > > > > void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code) > > { > > - WARN_ON(vcpu->arch.exception.pending); > > + if (vcpu->arch.exception.pending) { > > + handle_multiple_faults(vcpu); > > + return; > > + } > > vcpu->arch.exception.pending = true; > > vcpu->arch.exception.has_error_code = true; > > vcpu->arch.exception.nr = nr; > > And here is an add-on patch to fix reset-on-triple-fault: > > > Clear the pending original exception when raising a triple fault. This > allows to re-use the vcpu instance, e.g. after a reset which is > typically issued as reaction on the triple fault. > > Signed-off-by: Jan Kiszka <jan...@si...> > > --- > arch/x86/kvm/x86.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > Index: b/arch/x86/kvm/x86.c > =================================================================== > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -149,8 +149,10 @@ static void handle_multiple_faults(struc > if (vcpu->arch.exception.nr != DF_VECTOR) { > vcpu->arch.exception.nr = DF_VECTOR; > vcpu->arch.exception.error_code = 0; > - } else > + } else { > set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > + vcpu->arch.exception.pending = false; > + } > } > > void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) Ah, indeed. Thanks. -- | AMD Saxony Limited Liability Company & Co. KG Operating | Wilschdorfer Landstr. 
101, 01109 Dresden, Germany System | Register Court Dresden: HRA 4896 Research | General Partner authorized to represent: Center | AMD Saxony LLC (Wilmington, Delaware, US) | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy |
From: Avi K. <av...@qu...> - 2008-04-30 12:57:10
|
Jan Kiszka wrote: > Joerg Roedel wrote: > >> The current KVM x86 exception code handles double and triple faults only for >> page fault exceptions. This patch extends this detection for every exception >> that gets queued for the guest. >> >> Signed-off-by: Joerg Roedel <joe...@am...> >> Cc: Jan Kiszka <jan...@si...> >> --- >> arch/x86/kvm/x86.c | 31 +++++++++++++++++-------------- >> 1 files changed, 17 insertions(+), 14 deletions(-) >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 578a0c1..c05aa32 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -144,9 +144,21 @@ void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data) >> } >> EXPORT_SYMBOL_GPL(kvm_set_apic_base); >> >> +static void handle_multiple_faults(struct kvm_vcpu *vcpu) >> +{ >> + if (vcpu->arch.exception.nr != DF_VECTOR) { >> + vcpu->arch.exception.nr = DF_VECTOR; >> + vcpu->arch.exception.error_code = 0; >> + } else >> + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >> +} >> + >> void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) >> { >> - WARN_ON(vcpu->arch.exception.pending); >> + if (vcpu->arch.exception.pending) { >> + handle_multiple_faults(vcpu); >> + return; >> + } >> vcpu->arch.exception.pending = true; >> vcpu->arch.exception.has_error_code = false; >> vcpu->arch.exception.nr = nr; >> @@ -157,25 +169,16 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long addr, >> u32 error_code) >> { >> ++vcpu->stat.pf_guest; >> - if (vcpu->arch.exception.pending) { >> - if (vcpu->arch.exception.nr == PF_VECTOR) { >> - printk(KERN_DEBUG "kvm: inject_page_fault:" >> - " double fault 0x%lx\n", addr); >> - vcpu->arch.exception.nr = DF_VECTOR; >> - vcpu->arch.exception.error_code = 0; >> - } else if (vcpu->arch.exception.nr == DF_VECTOR) { >> - /* triple fault -> shutdown */ >> - set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); >> - } >> - return; >> - } >> vcpu->arch.cr2 = addr; >> kvm_queue_exception_e(vcpu, PF_VECTOR, error_code); >> } >> >> void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code) >> { >> - WARN_ON(vcpu->arch.exception.pending); >> + if (vcpu->arch.exception.pending) { >> + handle_multiple_faults(vcpu); >> + return; >> + } >> vcpu->arch.exception.pending = true; >> vcpu->arch.exception.has_error_code = true; >> vcpu->arch.exception.nr = nr; >> > > And here is an add-on patch to fix reset-on-triple-fault: > > > Clear the pending original exception when raising a triple fault. This > allows to re-use the vcpu instance, e.g. after a reset which is > typically issued as reaction on the triple fault. > > Signed-off-by: Jan Kiszka <jan...@si...> > > --- > arch/x86/kvm/x86.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > Index: b/arch/x86/kvm/x86.c > =================================================================== > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -149,8 +149,10 @@ static void handle_multiple_faults(struc > if (vcpu->arch.exception.nr != DF_VECTOR) { > vcpu->arch.exception.nr = DF_VECTOR; > vcpu->arch.exception.error_code = 0; > - } else > + } else { > set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); > + vcpu->arch.exception.pending = false; > + } > } > > void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr) > > There's a bigger problem here. The exception queue is hidden state that qemu and load and save. -- Any sufficiently difficult bug is indistinguishable from a feature. |
From: Avi K. <av...@qu...> - 2008-04-30 12:57:07
David Miller wrote:
> I've created (and tested) all of these lists.
>

Thanks.  In about a week I'll make the sourceforge lists read-only.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Avi K. <av...@qu...> - 2008-04-30 12:57:06
Muli Ben-Yehuda wrote:
>> @@ -544,19 +545,35 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
>>  	npages = get_user_pages(current, current->mm, addr, 1, 1, 1, page,
>>  				NULL);
>>
>> -	if (npages != 1) {
>> -		get_page(bad_page);
>> -		return page_to_pfn(bad_page);
>> -	}
>> +	if (unlikely(npages != 1)) {
>> +		struct vm_area_struct *vma;
>>
>> -	return page_to_pfn(page[0]);
>> +		vma = find_vma(current->mm, addr);
>> +		if (vma == NULL || addr >= vma->vm_start ||
>> +		    !(vma->vm_flags & VM_PFNMAP)) {
>>
>
> Isn't the check for addr backwards here? For the VMA we would like to
> to find, vma->vm_start <= addr < vma->vm_end.
>

The code is not trying to find a vma for the address, but a vma for the
address which also has VM_PFNMAP set.  The cases for vma not found, or
vma found, but not VM_PFNMAP, are folded together.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Ingo M. <mi...@el...> - 2008-04-30 12:56:58
* Christian Borntraeger <bor...@de...> wrote:

> While it is not a typical case, is there a better way of specifying
> multiple authors to avoid future confusion?

i think the established rule is that there's one Author field per
commit. Multiple authors should either submit a tree with multiple
commits (which shows the exact lineage of work) - or, for nontrivial
joint work where the development tree would be way too messy, expose
proper credits in copyrights/credit info in the source code.

It's seldom that work is split exactly in half - better spell out who
did what both in the source code and in the commit log - without trying
to formalize the From/Author line. [which line will always be imprecise
for multiple authors.]

	Ingo
From: SourceForge.net <no...@so...> - 2008-04-30 12:53:28
Bugs item #1953353, was opened at 2008-04-28 13:50
Message generated for change (Comment added) made by ravpl
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1953353&group_id=180599

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request, not
just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Rafal Wijata (ravpl)
Assigned to: Nobody/Anonymous (nobody)
Summary: could not load PC BIOS '/path/to/bios.bin' on "-m 4096"

Initial Comment:
The maximum amount of memory I can give to kvm is ~3560M.
I run custom compiled kvm-66 on an F8 box with
Linux mailhub 2.6.24.4-64.fc8 #1 SMP Sat Mar 29 09:15:49 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
The modules are loaded from the F8 kernel, rather than those shipped with kvm-66.

----------------------------------------------------------------------

>Comment By: Rafal Wijata (ravpl)
Date: 2008-04-30 14:53

Message:
Logged In: YES
user_id=996150
Originator: YES

Indeed, I grabbed kvm-67, recompiled, and loaded the modules that come
with kvm (kvm-intel). After that I could give the guest even 6G of RAM.
And BTW, after I loaded those modules I was able to assign more than 4
CPUs to the guest as well (I remember there's such a bug here).
Thanks for the prompt reply.

----------------------------------------------------------------------

Comment By: Marcelo Tosatti (mtosatti)
Date: 2008-04-30 03:34

Message:
Logged In: YES
user_id=2022487
Originator: NO

Can you reproduce the problem with the modules shipped with kvm-66?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1953353&group_id=180599
From: Avi K. <av...@qu...> - 2008-04-30 12:52:24
Christian Borntraeger wrote:
> Am Sonntag, 27. April 2008 schrieb Avi Kivity:
>> Carsten Otte (4):
>>       s390: KVM preparation: provide hook to enable pgstes in user pagetable
>>       KVM: s390: interrupt subsystem, cpu timer, waitpsw
>>       KVM: s390: API documentation
>>       s390: KVM guest: detect when running on kvm
>>
>> Christian Borntraeger (10):
>>       KVM: kvm.h: __user requires compiler.h
>>       s390: KVM preparation: host memory management changes for s390 kvm
>>       s390: KVM preparation: address of the 64bit extint parm in lowcore
>>       KVM: s390: sie intercept handling
>>       KVM: s390: intercepts for privileged instructions
>>       KVM: s390: interprocessor communication via sigp
>>       KVM: s390: intercepts for diagnose instructions
>>       KVM: s390: add kvm to kconfig on s390
>>       KVM: s390: update maintainers
>>       s390: KVM guest: virtio device support, and kvm hypercalls
>>
>
> Thats interesting, some of these patches should actually be credited to
> Carsten - and in fact on kvm.git master they are credited to Carsten.
>
> I think the problem is, that these patches contained multiple From lines. On
> kvm.git the first line (Carsten) was used. When you transferred these patches
> to the kvm.git-2.6.26-branch, git used the next From-line as the original one
> was already removed.
>
> While it is not a typical case, is there a better way of specifying multiple
> authors to avoid future confusion?

It's probably due to my heavy use of git cherry-pick, rebase, and rebase
-i.  I couldn't reproduce this with a test that mimics that workflow, so
either it has been fixed already, or it's a little more subtle.

I don't think you should change anything to avoid this.  I'll keep an
eye open for this, and if it happens again I'll fix it locally and send
a proper bug report to the git mailing list.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Avi K. <av...@qu...> - 2008-04-30 12:52:13
Andrea Arcangeli wrote:
> On Wed, Apr 30, 2008 at 11:59:47AM +0300, Avi Kivity wrote:
>
>> The code is not trying to find a vma for the address, but a vma for the
>> address which also has VM_PFNMAP set. The cases for vma not found, or vma
>> found, but not VM_PFNMAP, are folded together.
>>
>
> Muli's saying the comparison is reversed, change >= to <.
>

Err, yes.

--
Any sufficiently difficult bug is indistinguishable from a feature.
From: Andrea A. <an...@qu...> - 2008-04-30 12:52:02
On Wed, Apr 30, 2008 at 11:59:47AM +0300, Avi Kivity wrote:
> The code is not trying to find a vma for the address, but a vma for the
> address which also has VM_PFNMAP set. The cases for vma not found, or vma
> found, but not VM_PFNMAP, are folded together.

Muli's saying the comparison is reversed, change >= to <.
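
(To make the fix being discussed concrete: a sketch of the corrected
test, per Muli's and Andrea's comments — the bad_page path should only
be taken when no vma covers the address or the vma is not a VM_PFNMAP
mapping.  Whether this matches the exact code eventually committed is
not shown in this thread.)

		vma = find_vma(current->mm, addr);
		/* find_vma() returns the first vma with vm_end > addr,
		   so addr is inside it only when vm_start <= addr */
		if (vma == NULL || addr < vma->vm_start ||
		    !(vma->vm_flags & VM_PFNMAP)) {
			get_page(bad_page);
			return page_to_pfn(bad_page);
		}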
From: Anthony L. <an...@co...> - 2008-04-30 12:50:38
|
Muli Ben-Yehuda wrote: > On Tue, Apr 29, 2008 at 02:09:20PM -0500, Anthony Liguori wrote: > >> This patch allows VMA's that contain no backing page to be used for guest >> memory. This is a drop-in replacement for Ben-Ami's first page in his direct >> mmio series. Here, we continue to allow mmio pages to be represented in the >> rmap. >> >> Since v1, I've taken into account Andrea's suggestions at using VM_PFNMAP >> instead of VM_IO and changed the BUG_ON to a return of bad_page. >> >> Signed-off-by: Anthony Liguori <ali...@us...> >> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >> index 1d7991a..64e5efe 100644 >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -532,6 +532,7 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn) >> struct page *page[1]; >> unsigned long addr; >> int npages; >> + pfn_t pfn; >> >> might_sleep(); >> >> @@ -544,19 +545,35 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn) >> npages = get_user_pages(current, current->mm, addr, 1, 1, 1, page, >> NULL); >> >> - if (npages != 1) { >> - get_page(bad_page); >> - return page_to_pfn(bad_page); >> - } >> + if (unlikely(npages != 1)) { >> + struct vm_area_struct *vma; >> >> - return page_to_pfn(page[0]); >> + vma = find_vma(current->mm, addr); >> + if (vma == NULL || addr >= vma->vm_start || >> + !(vma->vm_flags & VM_PFNMAP)) { >> > > Isn't the check for addr backwards here? For the VMA we would like to > to find, vma->vm_start <= addr < vma->vm_end. > Yes it is. Thanks for spotting that. Regards, Anthony Liguori > Cheers, > Muli > > ------------------------------------------------------------------------- > This SF.net email is sponsored by the 2008 JavaOne(SM) Conference > Don't miss this year's exciting event. There's still time to save $100. > Use priority code J8TL2D2. > http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone > _______________________________________________ > kvm-devel mailing list > kvm...@li... > https://lists.sourceforge.net/lists/listinfo/kvm-devel > |
From: Andrea A. <an...@qu...> - 2008-04-30 07:00:40
On Tue, Apr 29, 2008 at 06:12:51PM -0500, Anthony Liguori wrote:
> IIUC PPC correctly, all IO pages have corresponding struct pages. This
> means that get_user_pages() would succeed and you can reference count them?
> In this case, we would never take the VM_PFNMAP path.

get_user_pages only works on vmas where only pfns with a struct page can
be mapped, but even if a struct page exists that doesn't mean
get_user_pages will succeed.  All mmio regions should be marked VM_IO,
since reading them affects hardware somehow, and that prevents
get_user_pages from working on them regardless of whether a struct page
exists.

> That's independent of this patchset. For non-aware guests, we'll have to
> pin all of physical memory up front and then create an IOMMU table from the
> pinned physical memory. For aware guests with a PV DMA window API, we'll
> be able to build that mapping on the fly (enforcing mlock allocation
> limits).

BTW, as far as a Linux guest is concerned, if the PV DMA API's mlock
ulimit triggers, the guest will crash.  Nothing checks when
pci_map_single returns null (the fix would be to throttle the I/O until
some other DMA completes, to split the DMA into multiple operations if
it's an SG entry, and, if it repeatedly fails, to fall back to PIO or
return an I/O error if PIO isn't available).  It can fail if there's
lots of weird PCI hardware doing RDMA at the same time (for example see
the iommu_arena_alloc retval in arch/alpha/kernel/pci_iommu.c).  In
short we'll either need ulimit -l unlimited, or we'll have to define
practical limits depending on the guest driver code and the number of
devices using passthrough.

I'll make the reserved-ram patch incremental with those patches; then it
should pick the right pfn coming from /dev/mem without my page_count ==
0 check, and then I only have to fix up the page pinning (so it will
likely also be incremental with the kvm mmu notifier patch, and I can
hope to get something final and remove page pinning for good, not only
on mmio regions that don't have a struct page).

I currently have trouble with the blk-settings.c change done in 2.6.25
when booting the host; I thought I had fixed that already... (I did when
loading the host kernel in kvm, but on real hardware it still fails for
another reason).  And Andrew sent me a large email about mmu notifiers,
so before I return to the reserved-ram work I have to answer him and
upload an updated mmu-notifier patch with certain cleanups he requested.
So go ahead ignoring the reserved-ram and mmu notifier patches; I'll
pick up whatever is available in or outside kvm.git when I'm ready.

Thanks!
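
(To illustrate the missing check Andrea describes: a guest network
driver's transmit path that actually tests the mapping it gets back and
quiesces instead of handing the hardware a bad handle.  This is a
generic sketch, not code from any driver discussed here; my_dev and
my_start_xmit are made-up names, and the single-argument
pci_dma_mapping_error() is the form used by kernels of that era.)

static int my_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct my_dev *priv = netdev_priv(dev);
	dma_addr_t dma;

	dma = pci_map_single(priv->pdev, skb->data, skb->len,
			     PCI_DMA_TODEVICE);
	if (pci_dma_mapping_error(dma)) {
		/*
		 * Mapping space exhausted (e.g. the PV DMA window hit
		 * its mlock limit): stop the queue and retry once other
		 * DMA completes, rather than crashing or using a bogus
		 * address.
		 */
		netif_stop_queue(dev);
		return NETDEV_TX_BUSY;
	}

	/* ... hand dma/skb->len to the hardware ring ... */
	return NETDEV_TX_OK;
}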
From: Muli Ben-Y. <mu...@il...> - 2008-04-30 06:27:15
On Tue, Apr 29, 2008 at 01:37:29PM +0300, Amit Shah wrote:
> dma_alloc_coherent() doesn't call dma_ops->alloc_coherent in case no
> IOMMU translations are necessary.

I always thought this was a huge wart in the x86-64 DMA ops. Would there
be strong resistance to fixing it so that alloc_coherent matches the way
the other ops are used? This will eliminate the need for this patch and
will make other DMA ops implementations saner.

Cheers,
Muli
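
(A grossly simplified sketch of the behaviour being objected to — not
the real arch/x86 code: dma_alloc_coherent() satisfies the request
directly when it can and only falls back to the dma_ops hook, so an
IOMMU backend never sees those allocations.  try_direct_alloc() is a
made-up placeholder for the direct path.)

void *dma_alloc_coherent(struct device *dev, size_t size,
			 dma_addr_t *handle, gfp_t gfp)
{
	/* direct path: no IOMMU translation set up, dma_ops not consulted */
	void *cpu_addr = try_direct_alloc(dev, size, handle, gfp);

	if (cpu_addr)
		return cpu_addr;

	/* only now does an IOMMU implementation get a say */
	return dma_ops->alloc_coherent(dev, size, handle, gfp);
}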
From: Muli Ben-Y. <mu...@il...> - 2008-04-30 06:09:16
|
On Tue, Apr 29, 2008 at 02:09:20PM -0500, Anthony Liguori wrote: > This patch allows VMA's that contain no backing page to be used for guest > memory. This is a drop-in replacement for Ben-Ami's first page in his direct > mmio series. Here, we continue to allow mmio pages to be represented in the > rmap. > > Since v1, I've taken into account Andrea's suggestions at using VM_PFNMAP > instead of VM_IO and changed the BUG_ON to a return of bad_page. > > Signed-off-by: Anthony Liguori <ali...@us...> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 1d7991a..64e5efe 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -532,6 +532,7 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn) > struct page *page[1]; > unsigned long addr; > int npages; > + pfn_t pfn; > > might_sleep(); > > @@ -544,19 +545,35 @@ pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn) > npages = get_user_pages(current, current->mm, addr, 1, 1, 1, page, > NULL); > > - if (npages != 1) { > - get_page(bad_page); > - return page_to_pfn(bad_page); > - } > + if (unlikely(npages != 1)) { > + struct vm_area_struct *vma; > > - return page_to_pfn(page[0]); > + vma = find_vma(current->mm, addr); > + if (vma == NULL || addr >= vma->vm_start || > + !(vma->vm_flags & VM_PFNMAP)) { Isn't the check for addr backwards here? For the VMA we would like to to find, vma->vm_start <= addr < vma->vm_end. Cheers, Muli |
From: Muli Ben-Y. <mu...@il...> - 2008-04-30 06:02:50
On Wed, Apr 30, 2008 at 01:48:38AM +0300, Avi Kivity wrote:
> Amit Shah wrote:
>>
>>>> +	if (is_error_page(host_page)) {
>>>> +		printk(KERN_INFO "%s: gfn %p not valid\n",
>>>> +		       __func__, (void *)page_gfn);
>>>> +		r = -1;
>>>>
>>> r = -1 is not really informative. Better use some meaningful error.
>>>
>>
>> The error's going to the guest. The guest, as we know, has already
>> done a successful DMA allocation. Something went wrong in the
>> hypercall, and we don't know why (bad page). Any kind of error here
>> isn't going to be intelligible to the guest anyway. It's mostly a
>> host thing if we ever hit this.
>>
>
> If the guest is not able to handle it, why bother returning an
> error? Better to kill it.
>
> But in any case, -1 is not a good error number.

The guest should be able to deal with transient DMA mapping errors,
either by retrying, or quiescing the device. This is in line with how HW
IOMMUs work - they may run out of mappings for example and the driver
should be able to cope with it. Killing the guest is a last resort.

Cheers,
Muli
From: David S. A. <da...@ci...> - 2008-04-30 04:18:21
|
Another tidbit for you guys as I make my way through various permutations: I installed the RHEL3 hugemem kernel and the guest behavior is *much* better. System time still has some regular hiccups that are higher than xen and esx (e.g., 1 minute samples out of 5 show system time between 10 and 15%), but overall guest behavior is good with the hugemem kernel. One side effect I've noticed is that I cannot restart the RHEL3 guest running the hugemem kernel in successive attempts. The guest has 2 vcpus and qemu shows one thread at 100% cpu. If I recall correctly kvm_stat shows a large amount of tlb_flushes (like millions in a 5-second sample). The scenario is: 1. start guest running hugemem kernel, 2. shutdown, 3. restart guest. During 3. it hangs, but at random points. Removing kvm/kvm-intel has no effect - guest still hangs on the restart. Rebooting the host clears the problem. Alternatively, during the hang on a restart I can kill the guest, and then on restart choose the normal, 32-bit smp kernel and the guest boots just fine. At this point I can shutdown the guest and restart with the hugemem kernel and it boots just fine. david David S. Ahern wrote: > Hi Marcelo: > > mmu_recycled is always 0 for this guest -- even after almost 4 hours of uptime. > > Here is a kvm_stat sample where guest time was very high and qemu had 2 > processors at 100% on the host. I removed counters where both columns have 0 > value for brevity. > > exits 45937979 758051 > fpu_reload 1416831 87 > halt_exits 112911 0 > halt_wakeup 31771 0 > host_state_reload 2068602 263 > insn_emulation 21601480 365493 > io_exits 1827374 2705 > irq_exits 8934818 285196 > mmio_exits 421674 147 > mmu_cache_miss 4817689 93680 > mmu_flooded 4815273 93680 > mmu_pde_zapped 51344 0 > mmu_prefetch 4817625 93680 > mmu_pte_updated 14803298 270104 > mmu_pte_write 19859863 363785 > mmu_shadow_zapped 4832106 93679 > pf_fixed 32184355 468398 > pf_guest 264138 0 > remote_tlb_flush 10697762 280522 > tlb_flush 10301338 176424 > > (NOTE: This is for a *5* second sample interval instead of 1 to allow me to > capture the data). > > Here's a sample when the guest is "well-behaved" (system time <10%, though ): > exits 51502194 97453 > fpu_reload 1421736 227 > halt_exits 138361 1927 > halt_wakeup 33047 117 > host_state_reload 2110190 3740 > insn_emulation 24367441 47260 > io_exits 1874075 2576 > irq_exits 10224702 13333 > mmio_exits 435154 1726 > mmu_cache_miss 5414097 11258 > mmu_flooded 5411548 11243 > mmu_pde_zapped 52851 44 > mmu_prefetch 5414031 11258 > mmu_pte_updated 16854686 29901 > mmu_pte_write 22526765 42285 > mmu_shadow_zapped 5430025 11313 > pf_fixed 36144578 67666 > pf_guest 282794 430 > remote_tlb_flush 12126268 14619 > tlb_flush 11753162 21460 > > > There is definitely a strong correlation between the mmu counters and high > system times in the guest. I am still trying to find out what in the guest is > stimulating it when running on RHEL3; I do not see this same behavior for an > equivalent setup running on RHEL4. > > By the way I added an mmu_prefetch stat in prefetch_page() to count the number > of times the for() loop is hit with PTTYPE == 64; ie., number of times > paging64_prefetch_page() is invoked. (I wanted an explicit counter for this > loop, though the info seems to duplicate other entries.) That counter is listed > above. As I mentioned in a prior post when kscand kicks in the change in > mmu_prefetch counter is at 20,000+/sec, with each trip through that function > taking 45k+ cycles. 
> > kscand is an instigator shortly after boot, however, kscand is *not* the culprit > once the system has been up for 30-45 minutes. I have started instrumenting the > RHEL3U8 kernel and for the load I am running kscand does not walk the active > lists very often once the system is up. > > So, to dig deeper on what in the guest is stimulating the mmu I collected > kvmtrace data for about a 2 minute time interval which caught about a 30-second > period where guest system time was steady in the 25-30% range. Summarizing the > number of times a RIP appears in an VMEXIT shows the following high runners: > > count RIP RHEL3-symbol > 82549 0xc0140e42 follow_page [kernel] c0140d90 offset b2 > 42532 0xc0144760 handle_mm_fault [kernel] c01446d0 offset 90 > 36826 0xc013da4a futex_wait [kernel] c013d870 offset 1da > 29987 0xc0145cd0 zap_pte_range [kernel] c0145c10 offset c0 > 27451 0xc0144018 do_no_page [kernel] c0143e20 offset 1f8 > > (halt entry removed the list since that is the ideal scenario for an exit). > > So the RIP correlates to follow_page() for a large percentage of the VMEXITs. > > I wrote an awk script to summarize (histogram style) the TSC cycles between > VMEXIT and VMENTRY for an address. For the first rip, 0xc0140e42, 82,271 times > (ie., almost 100% of the time) the trace shows a delta between 50k and 100k > cycles between the VMEXIT and the subsequent VMENTRY. Similarly for the second > one, 0xc0144760, 42403 times (again almost 100% of the occurrences) the trace > shows a delta between 50k and 100k cycles between VMEXIT and VMENTRY. These > seems to correlate with the prefetch_page function in kvm, though I am not 100% > positive on that. > > I am now investigating the kernel paths leading to those functions. Any insights > would definitely be appreciated. > > david > > > Marcelo Tosatti wrote: >> On Fri, Apr 25, 2008 at 11:33:18AM -0600, David S. Ahern wrote: >>> Most of the cycles (~80% of that 54k+) are spent in paging64_prefetch_page(): >>> >>> for (i = 0; i < PT64_ENT_PER_PAGE; ++i) { >>> gpa_t pte_gpa = gfn_to_gpa(sp->gfn); >>> pte_gpa += (i+offset) * sizeof(pt_element_t); >>> >>> r = kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &pt, >>> sizeof(pt_element_t)); >>> if (r || is_present_pte(pt)) >>> sp->spt[i] = shadow_trap_nonpresent_pte; >>> else >>> sp->spt[i] = shadow_notrap_nonpresent_pte; >>> } >>> >>> This loop is run 512 times and takes a total of ~45k cycles, or ~88 cycles per >>> loop. >>> >>> This function gets run >20,000/sec during some of the kscand loops. >> Hi David, >> >> Do you see the mmu_recycled counter increase? >> > |