From: Jerone Y. <jy...@us...> - 2008-04-25 20:01:07
 2 files changed, 23 insertions(+), 3 deletions(-)
 arch/powerpc/kvm/emulate.c |   14 ++++++++++++++
 arch/powerpc/kvm/powerpc.c |   12 +++++++++---

This patch handles a guest that is in a wait state and wakes up a guest
that ends up being rescheduled after going to sleep. This ensures that
the guest is not always eating up 100% CPU when it is idle.

Signed-off-by: Jerone Young <jy...@us...>

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -235,6 +235,13 @@ int kvmppc_emulate_instruction(struct kv
 	case 50:	/* rfi */
 		kvmppc_emul_rfi(vcpu);
 		advance = 0;
+
+		/* Handle guest vcpu that is in wait state.
+		 * This will implicitly wake up when it is ready.
+		 */
+		if (vcpu->arch.msr & MSR_WE) {
+			kvm_vcpu_block(vcpu);
+		}
 		break;

 	default:
@@ -265,6 +272,13 @@ int kvmppc_emulate_instruction(struct kv
 	case 146:	/* mtmsr */
 		rs = get_rs(inst);
 		kvmppc_set_msr(vcpu, vcpu->arch.gpr[rs]);
+
+		/* Handle guest vcpu that is in wait state.
+		 * This will implicitly wake up when it is ready.
+		 */
+		if (vcpu->arch.msr & MSR_WE) {
+			kvm_vcpu_block(vcpu);
+		}
 		break;

 	case 163:	/* wrteei */

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -36,13 +36,12 @@ gfn_t unalias_gfn(struct kvm *kvm, gfn_t
 int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
 {
-	/* XXX implement me */
-	return 0;
+	return !!(v->arch.pending_exceptions);
 }

 int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 {
-	return 1;
+	return !(v->arch.msr & MSR_WE);
 }

@@ -213,6 +212,9 @@ static void kvmppc_decrementer_func(unsi
 {
 	struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;

+	if (waitqueue_active(&vcpu->wq))
+		wake_up_interruptible(&vcpu->wq);
+
 	kvmppc_queue_exception(vcpu, BOOKE_INTERRUPT_DECREMENTER);
 }

@@ -339,6 +341,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 	int r;
 	sigset_t sigsaved;

+	vcpu_load(vcpu);
+
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);

@@ -362,6 +366,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
+
+	vcpu_put(vcpu);

 	return r;
 }
From: Robin H. <ho...@sg...> - 2008-04-25 19:25:43
On Fri, Apr 25, 2008 at 06:56:40PM +0200, Andrea Arcangeli wrote:
> Fortunately I figured out we don't really need mm_lock in unregister
> because it's ok to unregister in the middle of the range_begin/end
> critical section (that's definitely not ok for register, that's why
> register needs mm_lock). And it's perfectly ok to fail in register().

I think you still need mm_lock (unless I miss something). What happens
when one callout is scanning mmu_notifier_invalidate_range_start() and you
unlink? That sets the list next pointer to LIST_POISON1, which is a really
bad address for the processor to track.

Maybe I misunderstood your description.

Thanks,
Robin
From: Jerone Y. <jy...@us...> - 2008-04-25 19:22:54
This update consolidates patches, adds more explicit comments, and adds a
wait check when the rfi instruction is emulated.

This set of patches fixes 100% CPU usage when a guest is idle on PowerPC.
Idle CPU usage is now at ~15-16% CPU time, an improvement.

Signed-off-by: Jerone Young <jy...@us...>

 4 files changed, 91 insertions(+), 4 deletions(-)
 arch/powerpc/kvm/emulate.c          |   14 +++++++
 arch/powerpc/kvm/powerpc.c          |   12 ++++--
 arch/powerpc/platforms/44x/Makefile |    2 -
 arch/powerpc/platforms/44x/idle.c   |   67 +++++++++++++++++++++++++++++++++++
From: Chris L. <cla...@re...> - 2008-04-25 18:43:36
Avi Kivity wrote:
> Hmm, looking back at the dump:
>
>> 1811: 8d 86 00 00 ff 3f    lea 0x3fff0000(%rsi),%eax
>> 1817: 83 f8 03             cmp $0x3,%eax
>> 181a: 0f 87 e2 01 00 00    ja  1a02 <svm_set_msr+0x27f>
>
> So while gcc is using %rsi, it loads the result back into %eax, which
> has the effect of dropping back into 32-bits. So looks like gcc was
> right here. Sorry for spreading confusion and apologies to gcc.

Avi,

Arg. I was completely, utterly wrong about the problem here (although there
is definitely still a problem). I'm sorry for making a confusing mess out of
this. Here is what is actually happening:

During startup, the RHEL-4 x86_64 kernel (2.6.9-67.EL, if you care) sets up
the NMI watchdog. It does the following:

	for (i = 0; i < 4; ++i) {
		/* Simulator may not support it */
		if (checking_wrmsrl(MSR_K7_EVNTSEL0 + i, 0UL))
			return;
		wrmsrl(MSR_K7_PERFCTR0 + i, 0UL);
	}

checking_wrmsrl() just does a "test write" to the MSR; because of the code
that is currently in there, this succeeds. However, when it tries to do the
MSR_K7_PERFCTR0 wrmsr, *that* is where it fails, since we don't currently
handle that MSR, and KVM injects a GPF into the guest (which kills it). My
previous patch just happened to fix this because it was making
checking_wrmsrl() fail on the EVNTSEL0, so we just returned out of this loop
rather than trying to write to the PERFCTR0.

Unfortunately, we can't just "fake emulate" MSR_K7_PERFCTR[0-3] like we are
doing for MSR_K7_EVNTSEL[0-3]; if they are there, Linux expects to be able
to put values into them. I think the correct solution here is to emulate
MSR_K7_PERFCTR[0-3] and MSR_K7_EVNTSEL[0-3] for real. I'm working on a patch
to do this now.

Chris Lalancette
From: David S. A. <da...@ci...> - 2008-04-25 17:34:34
David S. Ahern wrote:
> Avi Kivity wrote:
>> David S. Ahern wrote:
>>> I added the traces and captured data over another apparent lockup of
>>> the guest. This seems to be representative of the sequence (pid/vcpu
>>> removed).
>>>
>>> (+4776)  VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016127c ]
>>> (+   0)  PAGE_FAULT [ errorcode = 0x00000003, virt = 0x00000000 c0009db4 ]
>>> (+3632)  VMENTRY
>>> (+4552)  VMEXIT     [ exitcode = 0x00000000, rip = 0x00000000 c016104a ]
>>> (+   0)  PAGE_FAULT [ errorcode = 0x0000000b, virt = 0x00000000 fffb61c8 ]
>>> (+54928) VMENTRY
>>>
>> Can you oprofile the host to see where the 54K cycles are spent?

Most of the cycles (~80% of that 54k+) are spent in paging64_prefetch_page():

	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
		gpa_t pte_gpa = gfn_to_gpa(sp->gfn);
		pte_gpa += (i + offset) * sizeof(pt_element_t);

		r = kvm_read_guest_atomic(vcpu->kvm, pte_gpa, &pt,
					  sizeof(pt_element_t));
		if (r || is_present_pte(pt))
			sp->spt[i] = shadow_trap_nonpresent_pte;
		else
			sp->spt[i] = shadow_notrap_nonpresent_pte;
	}

This loop is run 512 times and takes a total of ~45k cycles, or ~88 cycles
per loop. This function gets run >20,000 times/sec during some of the kscand
loops.

david
From: Andrea A. <an...@qu...> - 2008-04-25 17:04:33
On Fri, Apr 25, 2008 at 06:56:39PM +0200, Andrea Arcangeli wrote:
>>> +	data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
>>> +				     sizeof(spinlock_t));
>>
>> This is why non-typesafe allocators suck. You want 'sizeof(spinlock_t *)'
>> here.
>>
>>> +	data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
>>> +				     sizeof(spinlock_t));
>>
>> and here.
>
> Great catch! (it was temporarily wasting some ram which isn't nice at all)

As I went into the editor I just found the above already fixed in #v14-pre3.
And I can't move the structure into the file anymore without kmallocing it.
Exposing that structure avoids the ERR_PTR/PTR_ERR on the retvals and one
kmalloc, so I think it makes the code simpler in the end to keep it as it is
now. I'd rather avoid further changes to the 1/N patch, as long as they don't
make any difference at runtime and as long as they involve more than
cut-and-pasting a structure from a .h to a .c file.
From: Andrea A. <an...@qu...> - 2008-04-25 16:56:47
I somehow missed this email in my inbox; found it now because it was
strangely still unread... Sorry for the late reply!

On Tue, Apr 22, 2008 at 03:06:24PM +1000, Rusty Russell wrote:
> On Wednesday 09 April 2008 01:44:04 Andrea Arcangeli wrote:
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1050,6 +1050,15 @@
>>  			unsigned long addr, unsigned long len,
>>  			unsigned long flags, struct page **pages);
>>
>> +struct mm_lock_data {
>> +	spinlock_t **i_mmap_locks;
>> +	spinlock_t **anon_vma_locks;
>> +	unsigned long nr_i_mmap_locks;
>> +	unsigned long nr_anon_vma_locks;
>> +};
>> +extern struct mm_lock_data *mm_lock(struct mm_struct *mm);
>> +extern void mm_unlock(struct mm_struct *mm, struct mm_lock_data *data);
>
> As far as I can tell you don't actually need to expose this struct at all?

Yes, it should be possible to only expose 'struct mm_lock_data;'.

>> +	data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
>> +				     sizeof(spinlock_t));
>
> This is why non-typesafe allocators suck. You want 'sizeof(spinlock_t *)'
> here.
>
>> +	data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
>> +				     sizeof(spinlock_t));
>
> and here.

Great catch! (it was temporarily wasting some ram which isn't nice at all)

>> +	err = -EINTR;
>> +	i_mmap_lock_last = NULL;
>> +	nr_i_mmap_locks = 0;
>> +	for (;;) {
>> +		spinlock_t *i_mmap_lock = (spinlock_t *) -1UL;
>> +		for (vma = mm->mmap; vma; vma = vma->vm_next) {
...
>> +		data->i_mmap_locks[nr_i_mmap_locks++] = i_mmap_lock;
>> +	}
>> +	data->nr_i_mmap_locks = nr_i_mmap_locks;
>
> How about you track your running counter in data->nr_i_mmap_locks, leave
> nr_i_mmap_locks alone, and BUG_ON(data->nr_i_mmap_locks != nr_i_mmap_locks)?
>
> Even nicer would be to wrap this in a "get_sorted_mmap_locks()" function.

I'll try to clean this up further and I'll make a further update for review.

> Unfortunately, I just don't think we can fail locking like this. In your
> next patch, unregistering a notifier can fail because of it: that's not
> usable.

Fortunately I figured out we don't really need mm_lock in unregister because
it's ok to unregister in the middle of the range_begin/end critical section
(that's definitely not ok for register, which is why register needs mm_lock).
And it's perfectly ok to fail in register().

Also, it wasn't ok to unpin the module count in ->release, as ->release needs
to 'ret' to get back to the mmu notifier code. And without any unregister at
all, the module can't be unloaded at all, which is quite unacceptable...

The logic is to prevent mmu_notifier_register racing with
mmu_notifier_release, because register takes the mm_users pin (implicit or
explicit; mmput runs just after mmu_notifier_register returns). Then
_register serializes against all the mmu notifier methods (except ->release)
with srcu (->release can't run thanks to the mm_users pin). The
mmu_notifier_mm->lock then serializes the modification of the list (register
vs unregister) and it ensures that exactly one of _unregister and _release
calls ->release before _unregister returns. All other methods run freely
with srcu.

Having the guarantee that ->release is called just before all pages are
freed, or inside _unregister, allows the module to zap and freeze its
secondary mmu inside ->release, with the race condition of exit() against
mmu_notifier_unregister handled internally by the mmu notifier code and
without a dependency on exit_files/exit_mm ordering (which depends on whether
the fd of the driver is open in the file tables or in the vma only). The
mmu_notifier_mm can be reset to 0 only after the last mmdrop.

About the mm_count refcounting for _release and _unregister: no mmu notifier
method, not even mmu_notifier_unregister and _release, can cope with the
mmu_notifier_mm list and srcu structures going away out of order. exit_mmap
is safe as it holds an mm_count implicitly, because mmdrop is run after
exit_mmap returns. mmu_notifier_unregister is safe too, as _register takes
the mm_count pin. We can't prevent mmu_notifier_mm from going away with
mm_users, as that would screw up the vma file-descriptor closure that only
happens inside exit_mmap (a pinned mm_users prevents exit_mmap from running,
and it can only be taken temporarily until _register returns).
From: Jerone Y. <jy...@us...> - 2008-04-25 16:37:48
On Fri, 2008-04-25 at 09:00 -0500, Hollis Blanchard wrote:
> On Friday 25 April 2008 00:56:01 Jerone Young wrote:
>> This set of patches fixes 100% CPU usage when a guest is idle on PowerPC.
>> This time it uses common kvm functions to sleep the guest.
>
> Looking much better now, with just a few minor issues to correct. With
> these patches applied, about how much CPU *does* an idling guest consume?

With the current patch *as is*, idle guests are eating about 16% CPU. Better
than 100%, but more than the other patch. I'll see if removing the
vcpu_loads & vcpu_puts makes that go down.

> By the way, you don't explicitly *unset* MSR[WE]. I think this works
> implicitly because of the way we deliver interrupts; could you add a
> comment explaining that?

Yes, it is unset implicitly. I can add a comment on this.
From: Hollis B. <ho...@us...> - 2008-04-25 14:01:09
On Friday 25 April 2008 00:56:01 Jerone Young wrote:
> This set of patches fixes 100% CPU usage when a guest is idle on PowerPC.
> This time it uses common kvm functions to sleep the guest.

Looking much better now, with just a few minor issues to correct. With these
patches applied, about how much CPU *does* an idling guest consume?

By the way, you don't explicitly *unset* MSR[WE]. I think this works
implicitly because of the way we deliver interrupts; could you add a comment
explaining that?

-- 
Hollis Blanchard
IBM Linux Technology Center
From: Hollis B. <ho...@us...> - 2008-04-25 13:57:22
On Friday 25 April 2008 00:56:04 Jerone Young wrote:
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -212,6 +212,9 @@ static void kvmppc_decrementer_func(unsi
>  {
>  	struct kvm_vcpu *vcpu = (struct kvm_vcpu *)data;
>
> +	if (waitqueue_active(&vcpu->wq))
> +		wake_up_interruptible(&vcpu->wq);
> +
>  	kvmppc_queue_exception(vcpu, BOOKE_INTERRUPT_DECREMENTER);
>  }

Hooray!

>  int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq)
>  {
> +	vcpu_load(vcpu);
>  	kvmppc_queue_exception(vcpu, BOOKE_INTERRUPT_EXTERNAL);
> +	vcpu_put(vcpu);
> +
>  	return 0;
>  }

load/put here is definitely unnecessary. That makes me question how necessary
it is in other parts of this patch too. I think the (hardware) TLB is the only
state we really need to worry about, because there is no other state that our
guest can load into the hardware that is not handled by a regular context
switch. If that's true, we should only need vcpu_load/put() on paths where we
muck with the TLB behind the host's back, and that is only in the run path.

-- 
Hollis Blanchard
IBM Linux Technology Center
From: Hollis B. <ho...@us...> - 2008-04-25 13:57:17
On Friday 25 April 2008 00:56:03 Jerone Young wrote:
> This patch handles a guest that is in a wait state. This ensures that the
> guest is not always eating up 100% cpu when it is idle.
>
> Signed-off-by: Jerone Young <jy...@us...>
>
> diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
> --- a/arch/powerpc/kvm/emulate.c
> +++ b/arch/powerpc/kvm/emulate.c
> @@ -265,6 +265,11 @@ int kvmppc_emulate_instruction(struct kv
>  	case 146:	/* mtmsr */
>  		rs = get_rs(inst);
>  		kvmppc_set_msr(vcpu, vcpu->arch.gpr[rs]);
> +
> +		/* handle guest vcpu that is in wait state */
> +		if (vcpu->arch.msr & MSR_WE) {
> +			kvm_vcpu_block(vcpu);
> +		}
>  		break;
>
>  	case 163:	/* wrteei */

So if I apply this patch and not #3, the guest will put itself to sleep and
never wake up? You need to combine patches 2 and 3.

Also, for completeness, you should add the same test to the rfi emulation,
which could (theoretically) also set MSR[WE].

-- 
Hollis Blanchard
IBM Linux Technology Center
From: Chris L. <cla...@re...> - 2008-04-25 13:08:08
Avi Kivity wrote:
> Hmm, looking back at the dump:
>
>> 1811: 8d 86 00 00 ff 3f    lea 0x3fff0000(%rsi),%eax
>> 1817: 83 f8 03             cmp $0x3,%eax
>> 181a: 0f 87 e2 01 00 00    ja  1a02 <svm_set_msr+0x27f>
>
> So while gcc is using %rsi, it loads the result back into %eax, which
> has the effect of dropping back into 32-bits. So looks like gcc was
> right here. Sorry for spreading confusion and apologies to gcc.

OK. Well, then I can't explain why we are unconditionally calling
kvm_set_msr_common(), regardless of whether data == 0 or not. Avi, you said
it works for you; what version of gcc are you using, and can you send me your
objdump -Sr? I'd like to compare the assembly output with what 4.3.0 is
spitting out.

Chris Lalancette
From: Tomas R. <li...@ko...> - 2008-04-25 12:14:51
Hello everybody,

I have a problem booting guests with lilo installed after I upgraded to
KVM-66 (from 65). The boot sequence always stops with "LIL" output. With
kvm-65 everything works great. I also have a Windows XP guest, which boots
without problems. With -no-kvm, all my guests with lilo start correctly.

Processor: AMD Opteron 2210
KVM: kvm-64
Host: gentoo-sources-2.6.25-r1
Arch: x86_64
Guests: gentoo, 2.6.25, x86_64

dmesg:

Apr 21 09:53:37 kvm BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Apr 21 09:53:37 kvm IP: [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f
Apr 21 09:53:37 kvm PGD 11e989067 PUD 11ec0a067 PMD 0
Apr 21 09:53:37 kvm Oops: 0002 [2] SMP
Apr 21 09:53:37 kvm CPU 2
Apr 21 09:53:37 kvm Modules linked in: w83627hf hwmon_vid xt_tcpudp xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables tun kvm_amd kvm shpchp pci_hotplug k8temp i2c_nforce2 i2c_core
Apr 21 09:53:37 kvm Pid: 7130, comm: kvm Tainted: G D 2.6.25-gentoo-r1 #1
Apr 21 09:53:37 kvm RIP: 0010:[<ffffffff88029c85>] [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f
Apr 21 09:53:37 kvm RSP: 0018:ffff81011e3e5738 EFLAGS: 00010246
Apr 21 09:53:37 kvm RAX: 0000000000000010 RBX: 0000000000000000 RCX: ffff81011e3e7378
Apr 21 09:53:37 kvm RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff81011e3e6000
Apr 21 09:53:37 kvm RBP: ffff81011e3e7378 R08: 0000000000000000 R09: 0000000000000000
Apr 21 09:53:37 kvm R10: ffffffff88041988 R11: ffff81011e3e7378 R12: ffff81011e3e7330
Apr 21 09:53:37 kvm R13: 0000000000000000 R14: ffffffff88033a20 R15: 0000000000000ce3
Apr 21 09:53:37 kvm FS: 0000000041e68950(0063) GS:ffff81011ff1b000(0000) knlGS:0000000000000000
Apr 21 09:53:37 kvm CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Apr 21 09:53:37 kvm CR2: 0000000000000000 CR3: 000000011e1af000 CR4: 00000000000006e0
Apr 21 09:53:37 kvm DR0: ffffffff805b34f8 DR1: 0000000000000000 DR2: 0000000000000000
Apr 21 09:53:37 kvm DR3: 0000000000000000 DR6: 00000000ffff0ff1 DR7: 0000000000000701
Apr 21 09:53:37 kvm Process kvm (pid: 7130, threadinfo ffff81011e3e4000, task ffff81011e560000)
Apr 21 09:53:37 kvm Stack: ffffffff8021a311 000000000000000f 00000000fffffff7 ffffffff8021a49b
Apr 21 09:53:37 kvm 00000000ffffffff ffff81011ed41d00 ffffc20001926000 0000000000000000
Apr 21 09:53:37 kvm ffffffff8021a311 ffffffff802347a0 ffff81011ed41d00 ffffffff880419e0
Apr 21 09:53:37 kvm Call Trace:
Apr 21 09:53:37 kvm [<ffffffff8021a311>] ? do_flush_tlb_all+0x0/0x2f
Apr 21 09:53:37 kvm [<ffffffff8021a49b>] ? smp_call_function_mask+0x47/0x55
Apr 21 09:53:37 kvm [<ffffffff8021a311>] ? do_flush_tlb_all+0x0/0x2f
Apr 21 09:53:37 kvm [<ffffffff802347a0>] ? on_each_cpu+0x19/0x25
Apr 21 09:53:37 kvm [<ffffffff880419e0>]
Apr 21 09:53:37 kvm [<ffffffff88020501>] ? :kvm:kvm_get_cs_db_l_bits+0x9/0x2f
Apr 21 09:53:37 kvm [<ffffffff8801f101>] ? :kvm:emulate_instruction+0x1ef/0x3a5
Apr 21 09:53:37 kvm [<ffffffff8801f101>] ? :kvm:emulate_instruction+0x1ef/0x3a5
Apr 21 09:53:37 kvm [<ffffffff88041fbc>]
Apr 21 09:53:37 kvm [<ffffffff88020148>] ? :kvm:kvm_arch_vcpu_ioctl_run+0x44a/0x5b8
Apr 21 09:53:37 kvm [<ffffffff8801bf23>] ? :kvm:kvm_resched+0x1b4/0x9b7
Apr 21 09:53:37 kvm [<ffffffff8802ad63>] ? :kvm:kvm_pic_set_irq+0x21/0x6b
Apr 21 09:53:37 kvm [<ffffffff8801e81b>] ? :kvm:kvm_arch_vm_ioctl+0x38e/0x5e6
Apr 21 09:53:37 kvm [<ffffffff8026217b>] ? zone_statistics+0x41/0x94
Apr 21 09:53:37 kvm [<ffffffff8025bc16>] ? get_page_from_freelist+0x457/0x5af
Apr 21 09:53:37 kvm [<ffffffff8025bdc0>] ? __alloc_pages+0x52/0x2ee
Apr 21 09:53:37 kvm [<ffffffff80225e50>] ? source_load+0x25/0x41
Apr 21 09:53:37 kvm [<ffffffff802286f1>] ? find_busiest_group+0x268/0x742
Apr 21 09:53:37 kvm [<ffffffff80225552>] ? hrtick_set+0x99/0x107
Apr 21 09:53:37 kvm [<ffffffff805b3aae>] ? thread_return+0x64/0xa5
Apr 21 09:53:37 kvm [<ffffffff80249099>] ? get_futex_key+0x76/0x14d
Apr 21 09:53:37 kvm [<ffffffff80249816>] ? unqueue_me+0x6b/0x73
Apr 21 09:53:37 kvm [<ffffffff80249bc1>] ? futex_wait+0x290/0x327
Apr 21 09:53:37 kvm [<ffffffff80227c36>] ? try_to_wake_up+0xfa/0x10c
Apr 21 09:53:37 kvm [<ffffffff80229752>] ? __wake_up_common+0x49/0x74
Apr 21 09:53:37 kvm [<ffffffff80268c29>] ? find_extend_vma+0x16/0x61
Apr 21 09:53:37 kvm [<ffffffff80249099>] ? get_futex_key+0x76/0x14d
Apr 21 09:53:37 kvm [<ffffffff803c1439>] ? __up_read+0x10/0x8a
Apr 21 09:53:37 kvm [<ffffffff8024955e>] ? futex_wake+0xfa/0x10c
Apr 21 09:53:37 kvm [<ffffffff80242e5a>] ? ktime_get_ts+0x56/0x5d
Apr 21 09:53:37 kvm [<ffffffff8801c3cb>] ? :kvm:kvm_resched+0x65c/0x9b7
Apr 21 09:53:37 kvm [<ffffffff80225552>] ? hrtick_set+0x99/0x107
Apr 21 09:53:37 kvm [<ffffffff8028a311>] ? vfs_ioctl+0x29/0x6f
Apr 21 09:53:37 kvm [<ffffffff8028a5a4>] ? do_vfs_ioctl+0x24d/0x25c
Apr 21 09:53:37 kvm [<ffffffff8028a5ef>] ? sys_ioctl+0x3c/0x61
Apr 21 09:53:37 kvm [<ffffffff8020b09b>] ? system_call_after_swapgs+0x7b/0x80
Apr 21 09:53:37 kvm
Apr 21 09:53:37 kvm
Apr 21 09:53:37 kvm Code: 02 74 20 77 06 ff c8 74 0e eb 78 83 f8 04 74 20 83 f8 08 74 27 eb 6c 48 8b 51 40 48 8b 41 30 88 02 eb 60 48 8b 51 40 48 8b 41 30 <66> 89 02 eb 53 48 8b 41 40 8b 49 30 48 89 08 eb 47 48 8b 51 40
Apr 21 09:53:37 kvm RIP [<ffffffff88029c85>] :kvm:x86_emulate_insn+0x3a47/0x468f
Apr 21 09:53:37 kvm RSP <ffff81011e3e5738>
Apr 21 09:53:37 kvm CR2: 0000000000000000
Apr 21 09:53:37 kvm ---[ end trace 8b01d2fbd0fdd57f ]---

Thank you for your help

-- 
Tomas Rusnak
From: Alexander G. <ag...@su...> - 2008-04-25 09:38:28
On Apr 25, 2008, at 3:01 AM, Marcelo Tosatti wrote:
> Add three PCI bridges to support 128 slots. Vendor and device_id have
> been stolen from my test box.
>
> I/O port addresses behind each bridge are statically allocated starting
> from 0x2000 with 0x1000 length. Once the bridge runs out of I/O space
> the guest (Linux at least) happily allocates outside of the region.
> That needs verification.
>
> I/O memory addresses are divided between 0xf0000000 -> APIC base.
>
> The PCI irq mapping function is also changed; there was the assumption
> that devices behind the bridge use the IRQ allocated to the bridge
> device itself, which is weird. Apparently this is how the SPARC ABP PCI
> host works (the only user of the bridge code at the moment).

Is there any reason we're not using the _PIC function and giving the OS a
clue on which APIC pin the device is? Right now everything boils down to
LNKA - LNKD, which it does not have to. It might even be a good idea to
connect each PCI device to a specific APIC pin, so we don't need to share
too much, which might become a problem with a lot of PCI devices. As far as
I know there is no limitation on how many pins an APIC may have.

> There was a copy&paste buglet in acpi-dsdt.dsl: slots 8 and 9 were
> sharing the same address, and that error was later copy&pasted to slots
> 24 and 25.
>
> Please review and give it a try (attached is the patch to increase the
> QEMU static tables).
> > Index: kvm-userspace.pci2/hw/pci.c > =================================================================== > --- a/qemu/hw/pci.c > +++ b/qemu/hw/pci.c > @@ -532,6 +532,7 @@ uint32_t pci_data_read(void *opaque, uint32_t > addr, int len) > static void pci_set_irq(void *opaque, int irq_num, int level) > { > PCIDevice *pci_dev = (PCIDevice *)opaque; > + PCIDevice *host_dev; > PCIBus *bus; > int change; > > @@ -539,13 +540,16 @@ static void pci_set_irq(void *opaque, int > irq_num, int level) > if (!change) > return; > > + > pci_dev->irq_state[irq_num] = level; > + host_dev = pci_dev; > for (;;) { > - bus = pci_dev->bus; > - irq_num = bus->map_irq(pci_dev, irq_num); > - if (bus->set_irq) > + bus = host_dev->bus; > + if (bus->set_irq) { > + irq_num = bus->map_irq(pci_dev, irq_num); > break; > - pci_dev = bus->parent_dev; > + } > + host_dev = bus->parent_dev; > } > bus->irq_count[irq_num] += change; > bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] ! > = 0); > > Index: kvm-userspace.pci2/bios/acpi-dsdt.dsl > =================================================================== > --- kvm-userspace.pci2.orig/bios/acpi-dsdt.dsl > +++ kvm-userspace.pci2/bios/acpi-dsdt.dsl > @@ -208,218 +208,29 @@ DefinitionBlock ( > Name (_HID, EisaId ("PNP0A03")) > Name (_ADR, 0x00) > Name (_UID, 1) > - Name(_PRT, Package() { > - /* PCI IRQ routing table, example from ACPI 2.0a > specification, > - section 6.2.8.1 */ > - /* Note: we provide the same info as the PCI routing > - table of the Bochs BIOS */ > - > - // PCI Slot 0 > - Package() {0x0000ffff, 0, LNKD, 0}, > - Package() {0x0000ffff, 1, LNKA, 0}, > - Package() {0x0000ffff, 2, LNKB, 0}, > - Package() {0x0000ffff, 3, LNKC, 0}, > - > - // PCI Slot 1 > - Package() {0x0001ffff, 0, LNKA, 0}, > - Package() {0x0001ffff, 1, LNKB, 0}, > - Package() {0x0001ffff, 2, LNKC, 0}, > - Package() {0x0001ffff, 3, LNKD, 0}, > - > - // PCI Slot 2 > - Package() {0x0002ffff, 0, LNKB, 0}, > - Package() {0x0002ffff, 1, LNKC, 0}, > - Package() 
{0x0002ffff, 2, LNKD, 0}, > - Package() {0x0002ffff, 3, LNKA, 0}, > - > - // PCI Slot 3 > - Package() {0x0003ffff, 0, LNKC, 0}, > - Package() {0x0003ffff, 1, LNKD, 0}, > - Package() {0x0003ffff, 2, LNKA, 0}, > - Package() {0x0003ffff, 3, LNKB, 0}, > - > - // PCI Slot 4 > - Package() {0x0004ffff, 0, LNKD, 0}, > - Package() {0x0004ffff, 1, LNKA, 0}, > - Package() {0x0004ffff, 2, LNKB, 0}, > - Package() {0x0004ffff, 3, LNKC, 0}, > - > - // PCI Slot 5 > - Package() {0x0005ffff, 0, LNKA, 0}, > - Package() {0x0005ffff, 1, LNKB, 0}, > - Package() {0x0005ffff, 2, LNKC, 0}, > - Package() {0x0005ffff, 3, LNKD, 0}, > - > - // PCI Slot 6 > - Package() {0x0006ffff, 0, LNKB, 0}, > - Package() {0x0006ffff, 1, LNKC, 0}, > - Package() {0x0006ffff, 2, LNKD, 0}, > - Package() {0x0006ffff, 3, LNKA, 0}, > - > - // PCI Slot 7 > - Package() {0x0007ffff, 0, LNKC, 0}, > - Package() {0x0007ffff, 1, LNKD, 0}, > - Package() {0x0007ffff, 2, LNKA, 0}, > - Package() {0x0007ffff, 3, LNKB, 0}, > - > - // PCI Slot 8 > - Package() {0x0008ffff, 0, LNKD, 0}, > - Package() {0x0008ffff, 1, LNKA, 0}, > - Package() {0x0008ffff, 2, LNKB, 0}, > - Package() {0x0008ffff, 3, LNKC, 0}, > - > - // PCI Slot 9 > - Package() {0x0008ffff, 0, LNKA, 0}, > - Package() {0x0008ffff, 1, LNKB, 0}, > - Package() {0x0008ffff, 2, LNKC, 0}, > - Package() {0x0008ffff, 3, LNKD, 0}, > - > - // PCI Slot 10 > - Package() {0x000affff, 0, LNKB, 0}, > - Package() {0x000affff, 1, LNKC, 0}, > - Package() {0x000affff, 2, LNKD, 0}, > - Package() {0x000affff, 3, LNKA, 0}, > - > - // PCI Slot 11 > - Package() {0x000bffff, 0, LNKC, 0}, > - Package() {0x000bffff, 1, LNKD, 0}, > - Package() {0x000bffff, 2, LNKA, 0}, > - Package() {0x000bffff, 3, LNKB, 0}, > - > - // PCI Slot 12 > - Package() {0x000cffff, 0, LNKD, 0}, > - Package() {0x000cffff, 1, LNKA, 0}, > - Package() {0x000cffff, 2, LNKB, 0}, > - Package() {0x000cffff, 3, LNKC, 0}, > - > - // PCI Slot 13 > - Package() {0x000dffff, 0, LNKA, 0}, > - Package() {0x000dffff, 1, LNKB, 0}, > - 
Package() {0x000dffff, 2, LNKC, 0}, > - Package() {0x000dffff, 3, LNKD, 0}, > - > - // PCI Slot 14 > - Package() {0x000effff, 0, LNKB, 0}, > - Package() {0x000effff, 1, LNKC, 0}, > - Package() {0x000effff, 2, LNKD, 0}, > - Package() {0x000effff, 3, LNKA, 0}, > - > - // PCI Slot 15 > - Package() {0x000fffff, 0, LNKC, 0}, > - Package() {0x000fffff, 1, LNKD, 0}, > - Package() {0x000fffff, 2, LNKA, 0}, > - Package() {0x000fffff, 3, LNKB, 0}, > - > - // PCI Slot 16 > - Package() {0x0010ffff, 0, LNKD, 0}, > - Package() {0x0010ffff, 1, LNKA, 0}, > - Package() {0x0010ffff, 2, LNKB, 0}, > - Package() {0x0010ffff, 3, LNKC, 0}, > - > - // PCI Slot 17 > - Package() {0x0011ffff, 0, LNKA, 0}, > - Package() {0x0011ffff, 1, LNKB, 0}, > - Package() {0x0011ffff, 2, LNKC, 0}, > - Package() {0x0011ffff, 3, LNKD, 0}, > - > - // PCI Slot 18 > - Package() {0x0012ffff, 0, LNKB, 0}, > - Package() {0x0012ffff, 1, LNKC, 0}, > - Package() {0x0012ffff, 2, LNKD, 0}, > - Package() {0x0012ffff, 3, LNKA, 0}, > - > - // PCI Slot 19 > - Package() {0x0013ffff, 0, LNKC, 0}, > - Package() {0x0013ffff, 1, LNKD, 0}, > - Package() {0x0013ffff, 2, LNKA, 0}, > - Package() {0x0013ffff, 3, LNKB, 0}, > - > - // PCI Slot 20 > - Package() {0x0014ffff, 0, LNKD, 0}, > - Package() {0x0014ffff, 1, LNKA, 0}, > - Package() {0x0014ffff, 2, LNKB, 0}, > - Package() {0x0014ffff, 3, LNKC, 0}, > - > - // PCI Slot 21 > - Package() {0x0015ffff, 0, LNKA, 0}, > - Package() {0x0015ffff, 1, LNKB, 0}, > - Package() {0x0015ffff, 2, LNKC, 0}, > - Package() {0x0015ffff, 3, LNKD, 0}, > - > - // PCI Slot 22 > - Package() {0x0016ffff, 0, LNKB, 0}, > - Package() {0x0016ffff, 1, LNKC, 0}, > - Package() {0x0016ffff, 2, LNKD, 0}, > - Package() {0x0016ffff, 3, LNKA, 0}, > - > - // PCI Slot 23 > - Package() {0x0017ffff, 0, LNKC, 0}, > - Package() {0x0017ffff, 1, LNKD, 0}, > - Package() {0x0017ffff, 2, LNKA, 0}, > - Package() {0x0017ffff, 3, LNKB, 0}, > - > - // PCI Slot 24 > - Package() {0x0018ffff, 0, LNKD, 0}, > - Package() {0x0018ffff, 1, 
LNKA, 0}, > - Package() {0x0018ffff, 2, LNKB, 0}, > - Package() {0x0018ffff, 3, LNKC, 0}, > - > - // PCI Slot 25 > - Package() {0x0018ffff, 0, LNKA, 0}, > - Package() {0x0018ffff, 1, LNKB, 0}, > - Package() {0x0018ffff, 2, LNKC, 0}, > - Package() {0x0018ffff, 3, LNKD, 0}, > - > - // PCI Slot 26 > - Package() {0x001affff, 0, LNKB, 0}, > - Package() {0x001affff, 1, LNKC, 0}, > - Package() {0x001affff, 2, LNKD, 0}, > - Package() {0x001affff, 3, LNKA, 0}, > - > - // PCI Slot 27 > - Package() {0x001bffff, 0, LNKC, 0}, > - Package() {0x001bffff, 1, LNKD, 0}, > - Package() {0x001bffff, 2, LNKA, 0}, > - Package() {0x001bffff, 3, LNKB, 0}, > - > - // PCI Slot 28 > - Package() {0x001cffff, 0, LNKD, 0}, > - Package() {0x001cffff, 1, LNKA, 0}, > - Package() {0x001cffff, 2, LNKB, 0}, > - Package() {0x001cffff, 3, LNKC, 0}, > - > - // PCI Slot 29 > - Package() {0x001dffff, 0, LNKA, 0}, > - Package() {0x001dffff, 1, LNKB, 0}, > - Package() {0x001dffff, 2, LNKC, 0}, > - Package() {0x001dffff, 3, LNKD, 0}, > - > - // PCI Slot 30 > - Package() {0x001effff, 0, LNKB, 0}, > - Package() {0x001effff, 1, LNKC, 0}, > - Package() {0x001effff, 2, LNKD, 0}, > - Package() {0x001effff, 3, LNKA, 0}, > - > - // PCI Slot 31 > - Package() {0x001fffff, 0, LNKC, 0}, > - Package() {0x001fffff, 1, LNKD, 0}, > - Package() {0x001fffff, 2, LNKA, 0}, > - Package() {0x001fffff, 3, LNKB, 0}, > - }) > + > + Include ("acpi-irq-routing.dsl") > > OperationRegion(PCST, SystemIO, 0xae00, 0x08) > Field (PCST, DWordAcc, NoLock, WriteAsZeros) > - { > + { > PCIU, 32, > PCID, 32, > - } > - > + } > OperationRegion(SEJ, SystemIO, 0xae08, 0x04) > Field (SEJ, DWordAcc, NoLock, WriteAsZeros) > { > B0EJ, 32, > } > > + Device (S0) { // Slot 0 > + Name (_ADR, 0x00000000) > + Method (_EJ0,1) { > + Store(0x1, B0EJ) > + Return (0x0) > + } > + } > + > Device (S1) { // Slot 1 > Name (_ADR, 0x00010000) > Method (_EJ0,1) { > @@ -436,28 +247,70 @@ DefinitionBlock ( > } > } > > - Device (S3) { // Slot 3 > + Device (S3) { // Slot 3, 
PCI-to-PCI bridge > Name (_ADR, 0x00030000) > - Method (_EJ0,1) { > - Store (0x8, B0EJ) > - Return (0x0) > + Include ("acpi-irq-routing.dsl") > + > + OperationRegion(PCST, SystemIO, 0xae0c, 0x08) > + Field (PCST, DWordAcc, NoLock, WriteAsZeros) > + { > + PCIU, 32, > + PCID, 32, > } > + > + OperationRegion(SEJ, SystemIO, 0xae14, 0x04) > + Field (SEJ, DWordAcc, NoLock, WriteAsZeros) > + { > + B1EJ, 32, > + } > + > + Name (SUN1, 30) > + Alias (\_SB.PCI0.S3.B1EJ, BEJ) > + Include ("acpi-pci-slots.dsl") > } > > - Device (S4) { // Slot 4 > + Device (S4) { // Slot 4, PCI-to-PCI bridge > Name (_ADR, 0x00040000) > - Method (_EJ0,1) { > - Store(0x10, B0EJ) > - Return (0x0) > + Include ("acpi-irq-routing.dsl") > + > + OperationRegion(PCST, SystemIO, 0xae18, 0x08) > + Field (PCST, DWordAcc, NoLock, WriteAsZeros) > + { > + PCIU, 32, > + PCID, 32, > + } > + > + OperationRegion(SEJ, SystemIO, 0xae20, 0x04) > + Field (SEJ, DWordAcc, NoLock, WriteAsZeros) > + { > + B2EJ, 32, > } > + > + Name (SUN1, 62) > + Alias (\_SB.PCI0.S4.B2EJ, BEJ) > + Include ("acpi-pci-slots.dsl") > } > > - Device (S5) { // Slot 5 > + Device (S5) { // Slot 5, PCI-to-PCI bridge > Name (_ADR, 0x00050000) > - Method (_EJ0,1) { > - Store(0x20, B0EJ) > - Return (0x0) > + Include ("acpi-irq-routing.dsl") > + > + OperationRegion(PCST, SystemIO, 0xae24, 0x08) > + Field (PCST, DWordAcc, NoLock, WriteAsZeros) > + { > + PCIU, 32, > + PCID, 32, > } > + > + OperationRegion(SEJ, SystemIO, 0xae2c, 0x04) > + Field (SEJ, DWordAcc, NoLock, WriteAsZeros) > + { > + B3EJ, 32, > + } > + > + Name (SUN1, 94) > + Alias (\_SB.PCI0.S5.B3EJ, BEJ) > + Include ("acpi-pci-slots.dsl") > } > > Device (S6) { // Slot 6 > @@ -1248,266 +1101,156 @@ DefinitionBlock ( > Return(0x01) > } > Method(_L01) { > - /* Up status */ > - If (And(\_SB.PCI0.PCIU, 0x2)) { > - Notify(\_SB.PCI0.S1, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x4)) { > - Notify(\_SB.PCI0.S2, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x8)) { > - Notify(\_SB.PCI0.S3, 0x1) > - } > - 
> - If (And(\_SB.PCI0.PCIU, 0x10)) { > - Notify(\_SB.PCI0.S4, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x20)) { > - Notify(\_SB.PCI0.S5, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x40)) { > - Notify(\_SB.PCI0.S6, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x80)) { > - Notify(\_SB.PCI0.S7, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x0100)) { > - Notify(\_SB.PCI0.S8, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x0200)) { > - Notify(\_SB.PCI0.S9, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x0400)) { > - Notify(\_SB.PCI0.S10, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x0800)) { > - Notify(\_SB.PCI0.S11, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x1000)) { > - Notify(\_SB.PCI0.S12, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x2000)) { > - Notify(\_SB.PCI0.S13, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x4000)) { > - Notify(\_SB.PCI0.S14, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x8000)) { > - Notify(\_SB.PCI0.S15, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x10000)) { > - Notify(\_SB.PCI0.S16, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x20000)) { > - Notify(\_SB.PCI0.S17, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x40000)) { > - Notify(\_SB.PCI0.S18, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x80000)) { > - Notify(\_SB.PCI0.S19, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x100000)) { > - Notify(\_SB.PCI0.S20, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x200000)) { > - Notify(\_SB.PCI0.S21, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x400000)) { > - Notify(\_SB.PCI0.S22, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x800000)) { > - Notify(\_SB.PCI0.S23, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x1000000)) { > - Notify(\_SB.PCI0.S24, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x2000000)) { > - Notify(\_SB.PCI0.S25, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x4000000)) { > - Notify(\_SB.PCI0.S26, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x8000000)) { > - Notify(\_SB.PCI0.S27, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x10000000)) { > - Notify(\_SB.PCI0.S28, 
0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x20000000)) { > - Notify(\_SB.PCI0.S29, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x40000000)) { > - Notify(\_SB.PCI0.S30, 0x1) > - } > - > - If (And(\_SB.PCI0.PCIU, 0x80000000)) { > - Notify(\_SB.PCI0.S31, 0x1) > - } > - > - /* Down status */ > - If (And(\_SB.PCI0.PCID, 0x2)) { > - Notify(\_SB.PCI0.S1, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x4)) { > - Notify(\_SB.PCI0.S2, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x8)) { > - Notify(\_SB.PCI0.S3, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x10)) { > - Notify(\_SB.PCI0.S4, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x20)) { > - Notify(\_SB.PCI0.S5, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x40)) { > - Notify(\_SB.PCI0.S6, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x80)) { > - Notify(\_SB.PCI0.S7, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x0100)) { > - Notify(\_SB.PCI0.S8, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x0200)) { > - Notify(\_SB.PCI0.S9, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x0400)) { > - Notify(\_SB.PCI0.S10, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x0800)) { > - Notify(\_SB.PCI0.S11, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x1000)) { > - Notify(\_SB.PCI0.S12, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x2000)) { > - Notify(\_SB.PCI0.S13, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x4000)) { > - Notify(\_SB.PCI0.S14, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x8000)) { > - Notify(\_SB.PCI0.S15, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x10000)) { > - Notify(\_SB.PCI0.S16, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x20000)) { > - Notify(\_SB.PCI0.S17, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x40000)) { > - Notify(\_SB.PCI0.S18, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x80000)) { > - Notify(\_SB.PCI0.S19, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x100000)) { > - Notify(\_SB.PCI0.S20, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x200000)) { > - Notify(\_SB.PCI0.S21, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x400000)) { > - 
Notify(\_SB.PCI0.S22, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x800000)) { > - Notify(\_SB.PCI0.S23, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x1000000)) { > - Notify(\_SB.PCI0.S24, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x2000000)) { > - Notify(\_SB.PCI0.S25, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x4000000)) { > - Notify(\_SB.PCI0.S26, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x8000000)) { > - Notify(\_SB.PCI0.S27, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x10000000)) { > - Notify(\_SB.PCI0.S28, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x20000000)) { > - Notify(\_SB.PCI0.S29, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x40000000)) { > - Notify(\_SB.PCI0.S30, 0x3) > - } > - > - If (And(\_SB.PCI0.PCID, 0x80000000)) { > - Notify(\_SB.PCI0.S31, 0x3) > - } > - > - Return(0x01) > + Alias (\_SB.PCI0.PCIU, UP) > + Alias (\_SB.PCI0.PCID, DOWN) > + Alias (\_SB.PCI0.S0, S0) > + Alias (\_SB.PCI0.S1, S1) > + Alias (\_SB.PCI0.S2, S2) > + Alias (\_SB.PCI0.S3, S3) > + Alias (\_SB.PCI0.S4, S4) > + Alias (\_SB.PCI0.S5, S5) > + Alias (\_SB.PCI0.S6, S6) > + Alias (\_SB.PCI0.S7, S7) > + Alias (\_SB.PCI0.S8, S8) > + Alias (\_SB.PCI0.S9, S9) > + Alias (\_SB.PCI0.S10, S10) > + Alias (\_SB.PCI0.S11, S11) > + Alias (\_SB.PCI0.S12, S12) > + Alias (\_SB.PCI0.S13, S13) > + Alias (\_SB.PCI0.S14, S14) > + Alias (\_SB.PCI0.S15, S15) > + Alias (\_SB.PCI0.S16, S16) > + Alias (\_SB.PCI0.S17, S17) > + Alias (\_SB.PCI0.S18, S18) > + Alias (\_SB.PCI0.S19, S19) > + Alias (\_SB.PCI0.S20, S20) > + Alias (\_SB.PCI0.S21, S21) > + Alias (\_SB.PCI0.S22, S22) > + Alias (\_SB.PCI0.S23, S23) > + Alias (\_SB.PCI0.S24, S24) > + Alias (\_SB.PCI0.S25, S25) > + Alias (\_SB.PCI0.S26, S26) > + Alias (\_SB.PCI0.S27, S27) > + Alias (\_SB.PCI0.S28, S28) > + Alias (\_SB.PCI0.S29, S29) > + Alias (\_SB.PCI0.S30, S30) > + Alias (\_SB.PCI0.S31, S31) > + Include ("acpi-hotplug-gpe.dsl") > + Return (0x01) > } > Method(_L02) { > - Return(0x01) > + Alias (\_SB.PCI0.S3.PCIU, UP) > + Alias (\_SB.PCI0.S3.PCID, 
DOWN) > + Alias (\_SB.PCI0.S3.S0, S0) > + Alias (\_SB.PCI0.S3.S1, S1) > + Alias (\_SB.PCI0.S3.S2, S2) > + Alias (\_SB.PCI0.S3.S3, S3) > + Alias (\_SB.PCI0.S3.S4, S4) > + Alias (\_SB.PCI0.S3.S5, S5) > + Alias (\_SB.PCI0.S3.S6, S6) > + Alias (\_SB.PCI0.S3.S7, S7) > + Alias (\_SB.PCI0.S3.S8, S8) > + Alias (\_SB.PCI0.S3.S9, S9) > + Alias (\_SB.PCI0.S3.S10, S10) > + Alias (\_SB.PCI0.S3.S11, S11) > + Alias (\_SB.PCI0.S3.S12, S12) > + Alias (\_SB.PCI0.S3.S13, S13) > + Alias (\_SB.PCI0.S3.S14, S14) > + Alias (\_SB.PCI0.S3.S15, S15) > + Alias (\_SB.PCI0.S3.S16, S16) > + Alias (\_SB.PCI0.S3.S17, S17) > + Alias (\_SB.PCI0.S3.S18, S18) > + Alias (\_SB.PCI0.S3.S19, S19) > + Alias (\_SB.PCI0.S3.S20, S20) > + Alias (\_SB.PCI0.S3.S21, S21) > + Alias (\_SB.PCI0.S3.S22, S22) > + Alias (\_SB.PCI0.S3.S23, S23) > + Alias (\_SB.PCI0.S3.S24, S24) > + Alias (\_SB.PCI0.S3.S25, S25) > + Alias (\_SB.PCI0.S3.S26, S26) > + Alias (\_SB.PCI0.S3.S27, S27) > + Alias (\_SB.PCI0.S3.S28, S28) > + Alias (\_SB.PCI0.S3.S29, S29) > + Alias (\_SB.PCI0.S3.S30, S30) > + Alias (\_SB.PCI0.S3.S31, S31) > + Include ("acpi-hotplug-gpe.dsl") > + Return (0x01) > } > Method(_L03) { > - Return(0x01) > + Alias (\_SB.PCI0.S4.PCIU, UP) > + Alias (\_SB.PCI0.S4.PCID, DOWN) > + Alias (\_SB.PCI0.S4.S0, S0) > + Alias (\_SB.PCI0.S4.S1, S1) > + Alias (\_SB.PCI0.S4.S2, S2) > + Alias (\_SB.PCI0.S4.S3, S3) > + Alias (\_SB.PCI0.S4.S4, S4) > + Alias (\_SB.PCI0.S4.S5, S5) > + Alias (\_SB.PCI0.S4.S6, S6) > + Alias (\_SB.PCI0.S4.S7, S7) > + Alias (\_SB.PCI0.S4.S8, S8) > + Alias (\_SB.PCI0.S4.S9, S9) > + Alias (\_SB.PCI0.S4.S10, S10) > + Alias (\_SB.PCI0.S4.S11, S11) > + Alias (\_SB.PCI0.S4.S12, S12) > + Alias (\_SB.PCI0.S4.S13, S13) > + Alias (\_SB.PCI0.S4.S14, S14) > + Alias (\_SB.PCI0.S4.S15, S15) > + Alias (\_SB.PCI0.S4.S16, S16) > + Alias (\_SB.PCI0.S4.S17, S17) > + Alias (\_SB.PCI0.S4.S18, S18) > + Alias (\_SB.PCI0.S4.S19, S19) > + Alias (\_SB.PCI0.S4.S20, S20) > + Alias (\_SB.PCI0.S4.S21, S21) > + Alias (\_SB.PCI0.S4.S22, S22) 
> + Alias (\_SB.PCI0.S4.S23, S23) > + Alias (\_SB.PCI0.S4.S24, S24) > + Alias (\_SB.PCI0.S4.S25, S25) > + Alias (\_SB.PCI0.S4.S26, S26) > + Alias (\_SB.PCI0.S4.S27, S27) > + Alias (\_SB.PCI0.S4.S28, S28) > + Alias (\_SB.PCI0.S4.S29, S29) > + Alias (\_SB.PCI0.S4.S30, S30) > + Alias (\_SB.PCI0.S4.S31, S31) > + Include ("acpi-hotplug-gpe.dsl") > + Return (0x01) > } > Method(_L04) { > - Return(0x01) > + Alias (\_SB.PCI0.S5.PCIU, UP) > + Alias (\_SB.PCI0.S5.PCID, DOWN) > + Alias (\_SB.PCI0.S5.S0, S0) > + Alias (\_SB.PCI0.S5.S1, S1) > + Alias (\_SB.PCI0.S5.S2, S2) > + Alias (\_SB.PCI0.S5.S3, S3) > + Alias (\_SB.PCI0.S5.S4, S4) > + Alias (\_SB.PCI0.S5.S5, S5) > + Alias (\_SB.PCI0.S5.S6, S6) > + Alias (\_SB.PCI0.S5.S7, S7) > + Alias (\_SB.PCI0.S5.S8, S8) > + Alias (\_SB.PCI0.S5.S9, S9) > + Alias (\_SB.PCI0.S5.S10, S10) > + Alias (\_SB.PCI0.S5.S11, S11) > + Alias (\_SB.PCI0.S5.S12, S12) > + Alias (\_SB.PCI0.S5.S13, S13) > + Alias (\_SB.PCI0.S5.S14, S14) > + Alias (\_SB.PCI0.S5.S15, S15) > + Alias (\_SB.PCI0.S5.S16, S16) > + Alias (\_SB.PCI0.S5.S17, S17) > + Alias (\_SB.PCI0.S5.S18, S18) > + Alias (\_SB.PCI0.S5.S19, S19) > + Alias (\_SB.PCI0.S5.S20, S20) > + Alias (\_SB.PCI0.S5.S21, S21) > + Alias (\_SB.PCI0.S5.S22, S22) > + Alias (\_SB.PCI0.S5.S23, S23) > + Alias (\_SB.PCI0.S5.S24, S24) > + Alias (\_SB.PCI0.S5.S25, S25) > + Alias (\_SB.PCI0.S5.S26, S26) > + Alias (\_SB.PCI0.S5.S27, S27) > + Alias (\_SB.PCI0.S5.S28, S28) > + Alias (\_SB.PCI0.S5.S29, S29) > + Alias (\_SB.PCI0.S5.S30, S30) > + Alias (\_SB.PCI0.S5.S31, S31) > + Include ("acpi-hotplug-gpe.dsl") > + Return (0x01) > } > Method(_L05) { > Return(0x01) > Index: kvm-userspace.pci2/bios/acpi-hotplug-gpe.dsl > =================================================================== > --- /dev/null > +++ kvm-userspace.pci2/bios/acpi-hotplug-gpe.dsl > @@ -0,0 +1,257 @@ > + /* Up status */ > + If (And(UP, 0x1)) { > + Notify(S0, 0x1) > + } > + > + If (And(UP, 0x2)) { > + Notify(S1, 0x1) > + } > + > + If (And(UP, 0x4)) { > + 
Notify(S2, 0x1) > + } > + > + If (And(UP, 0x8)) { > + Notify(S3, 0x1) > + } > + > + If (And(UP, 0x10)) { > + Notify(S4, 0x1) > + } > + > + If (And(UP, 0x20)) { > + Notify(S5, 0x1) > + } > + > + If (And(UP, 0x40)) { > + Notify(S6, 0x1) > + } > + > + If (And(UP, 0x80)) { > + Notify(S7, 0x1) > + } > + > + If (And(UP, 0x0100)) { > + Notify(S8, 0x1) > + } > + > + If (And(UP, 0x0200)) { > + Notify(S9, 0x1) > + } > + > + If (And(UP, 0x0400)) { > + Notify(S10, 0x1) > + } > + > + If (And(UP, 0x0800)) { > + Notify(S11, 0x1) > + } > + > + If (And(UP, 0x1000)) { > + Notify(S12, 0x1) > + } > + > + If (And(UP, 0x2000)) { > + Notify(S13, 0x1) > + } > + > + If (And(UP, 0x4000)) { > + Notify(S14, 0x1) > + } > + > + If (And(UP, 0x8000)) { > + Notify(S15, 0x1) > + } > + > + If (And(UP, 0x10000)) { > + Notify(S16, 0x1) > + } > + > + If (And(UP, 0x20000)) { > + Notify(S17, 0x1) > + } > + > + If (And(UP, 0x40000)) { > + Notify(S18, 0x1) > + } > + > + If (And(UP, 0x80000)) { > + Notify(S19, 0x1) > + } > + > + If (And(UP, 0x100000)) { > + Notify(S20, 0x1) > + } > + > + If (And(UP, 0x200000)) { > + Notify(S21, 0x1) > + } > + > + If (And(UP, 0x400000)) { > + Notify(S22, 0x1) > + } > + > + If (And(UP, 0x800000)) { > + Notify(S23, 0x1) > + } > + > + If (And(UP, 0x1000000)) { > + Notify(S24, 0x1) > + } > + > + If (And(UP, 0x2000000)) { > + Notify(S25, 0x1) > + } > + > + If (And(UP, 0x4000000)) { > + Notify(S26, 0x1) > + } > + > + If (And(UP, 0x8000000)) { > + Notify(S27, 0x1) > + } > + > + If (And(UP, 0x10000000)) { > + Notify(S28, 0x1) > + } > + > + If (And(UP, 0x20000000)) { > + Notify(S29, 0x1) > + } > + > + If (And(UP, 0x40000000)) { > + Notify(S30, 0x1) > + } > + > + If (And(UP, 0x80000000)) { > + Notify(S31, 0x1) > + } > + > + /* Down status */ > + If (And(DOWN, 0x1)) { > + Notify(S0, 0x3) > + } > + > + If (And(DOWN, 0x2)) { > + Notify(S1, 0x3) > + } > + > + If (And(DOWN, 0x4)) { > + Notify(S2, 0x3) > + } > + > + If (And(DOWN, 0x8)) { > + Notify(S3, 0x3) > + } > + > + If (And(DOWN, 
0x10)) { > + Notify(S4, 0x3) > + } > + > + If (And(DOWN, 0x20)) { > + Notify(S5, 0x3) > + } > + > + If (And(DOWN, 0x40)) { > + Notify(S6, 0x3) > + } > + > + If (And(DOWN, 0x80)) { > + Notify(S7, 0x3) > + } > + > + If (And(DOWN, 0x0100)) { > + Notify(S8, 0x3) > + } > + > + If (And(DOWN, 0x0200)) { > + Notify(S9, 0x3) > + } > + > + If (And(DOWN, 0x0400)) { > + Notify(S10, 0x3) > + } > + > + If (And(DOWN, 0x0800)) { > + Notify(S11, 0x3) > + } > + > + If (And(DOWN, 0x1000)) { > + Notify(S12, 0x3) > + } > + > + If (And(DOWN, 0x2000)) { > + Notify(S13, 0x3) > + } > + > + If (And(DOWN, 0x4000)) { > + Notify(S14, 0x3) > + } > + > + If (And(DOWN, 0x8000)) { > + Notify(S15, 0x3) > + } > + > + If (And(DOWN, 0x10000)) { > + Notify(S16, 0x3) > + } > + > + If (And(DOWN, 0x20000)) { > + Notify(S17, 0x3) > + } > + > + If (And(DOWN, 0x40000)) { > + Notify(S18, 0x3) > + } > + > + If (And(DOWN, 0x80000)) { > + Notify(S19, 0x3) > + } > + > + If (And(DOWN, 0x100000)) { > + Notify(S20, 0x3) > + } > + > + If (And(DOWN, 0x200000)) { > + Notify(S21, 0x3) > + } > + > + If (And(DOWN, 0x400000)) { > + Notify(S22, 0x3) > + } > + > + If (And(DOWN, 0x800000)) { > + Notify(S23, 0x3) > + } > + > + If (And(DOWN, 0x1000000)) { > + Notify(S24, 0x3) > + } > + > + If (And(DOWN, 0x2000000)) { > + Notify(S25, 0x3) > + } > + > + If (And(DOWN, 0x4000000)) { > + Notify(S26, 0x3) > + } > + > + If (And(DOWN, 0x8000000)) { > + Notify(S27, 0x3) > + } > + > + If (And(DOWN, 0x10000000)) { > + Notify(S28, 0x3) > + } > + > + If (And(DOWN, 0x20000000)) { > + Notify(S29, 0x3) > + } > + > + If (And(DOWN, 0x40000000)) { > + Notify(S30, 0x3) > + } > + > + If (And(DOWN, 0x80000000)) { > + Notify(S31, 0x3) > + } > Index: kvm-userspace.pci2/bios/acpi-irq-routing.dsl > =================================================================== > --- /dev/null > +++ kvm-userspace.pci2/bios/acpi-irq-routing.dsl > @@ -0,0 +1,203 @@ > + External(LNKA, DeviceObj) > + External(LNKB, DeviceObj) > + External(LNKC, DeviceObj) > + 
External(LNKD, DeviceObj) > + > + Name(_PRT, Package() { > + /* PCI IRQ routing table, example from ACPI 2.0a > specification, > + section 6.2.8.1 */ > + /* Note: we provide the same info as the PCI routing > + table of the Bochs BIOS */ > + > + // PCI Slot 0 > + Package() {0x0000ffff, 0, LNKD, 0}, > + Package() {0x0000ffff, 1, LNKA, 0}, > + Package() {0x0000ffff, 2, LNKB, 0}, > + Package() {0x0000ffff, 3, LNKC, 0}, > + > + // PCI Slot 1 > + Package() {0x0001ffff, 0, LNKA, 0}, > + Package() {0x0001ffff, 1, LNKB, 0}, > + Package() {0x0001ffff, 2, LNKC, 0}, > + Package() {0x0001ffff, 3, LNKD, 0}, > + > + // PCI Slot 2 > + Package() {0x0002ffff, 0, LNKB, 0}, > + Package() {0x0002ffff, 1, LNKC, 0}, > + Package() {0x0002ffff, 2, LNKD, 0}, > + Package() {0x0002ffff, 3, LNKA, 0}, > + > + // PCI Slot 3 > + Package() {0x0003ffff, 0, LNKC, 0}, > + Package() {0x0003ffff, 1, LNKD, 0}, > + Package() {0x0003ffff, 2, LNKA, 0}, > + Package() {0x0003ffff, 3, LNKB, 0}, > + > + // PCI Slot 4 > + Package() {0x0004ffff, 0, LNKD, 0}, > + Package() {0x0004ffff, 1, LNKA, 0}, > + Package() {0x0004ffff, 2, LNKB, 0}, > + Package() {0x0004ffff, 3, LNKC, 0}, > + > + // PCI Slot 5 > + Package() {0x0005ffff, 0, LNKA, 0}, > + Package() {0x0005ffff, 1, LNKB, 0}, > + Package() {0x0005ffff, 2, LNKC, 0}, > + Package() {0x0005ffff, 3, LNKD, 0}, > + > + // PCI Slot 6 > + Package() {0x0006ffff, 0, LNKB, 0}, > + Package() {0x0006ffff, 1, LNKC, 0}, > + Package() {0x0006ffff, 2, LNKD, 0}, > + Package() {0x0006ffff, 3, LNKA, 0}, > + > + // PCI Slot 7 > + Package() {0x0007ffff, 0, LNKC, 0}, > + Package() {0x0007ffff, 1, LNKD, 0}, > + Package() {0x0007ffff, 2, LNKA, 0}, > + Package() {0x0007ffff, 3, LNKB, 0}, > + > + // PCI Slot 8 > + Package() {0x0008ffff, 0, LNKD, 0}, > + Package() {0x0008ffff, 1, LNKA, 0}, > + Package() {0x0008ffff, 2, LNKB, 0}, > + Package() {0x0008ffff, 3, LNKC, 0}, > + > + // PCI Slot 9 > + Package() {0x0009ffff, 0, LNKA, 0}, > + Package() {0x0009ffff, 1, LNKB, 0}, > + Package() 
{0x0009ffff, 2, LNKC, 0}, > + Package() {0x0009ffff, 3, LNKD, 0}, > + > + // PCI Slot 10 > + Package() {0x000affff, 0, LNKB, 0}, > + Package() {0x000affff, 1, LNKC, 0}, > + Package() {0x000affff, 2, LNKD, 0}, > + Package() {0x000affff, 3, LNKA, 0}, > + > + // PCI Slot 11 > + Package() {0x000bffff, 0, LNKC, 0}, > + Package() {0x000bffff, 1, LNKD, 0}, > + Package() {0x000bffff, 2, LNKA, 0}, > + Package() {0x000bffff, 3, LNKB, 0}, > + > + // PCI Slot 12 > + Package() {0x000cffff, 0, LNKD, 0}, > + Package() {0x000cffff, 1, LNKA, 0}, > + Package() {0x000cffff, 2, LNKB, 0}, > + Package() {0x000cffff, 3, LNKC, 0}, > + > + // PCI Slot 13 > + Package() {0x000dffff, 0, LNKA, 0}, > + Package() {0x000dffff, 1, LNKB, 0}, > + Package() {0x000dffff, 2, LNKC, 0}, > + Package() {0x000dffff, 3, LNKD, 0}, > + > + // PCI Slot 14 > + Package() {0x000effff, 0, LNKB, 0}, > + Package() {0x000effff, 1, LNKC, 0}, > + Package() {0x000effff, 2, LNKD, 0}, > + Package() {0x000effff, 3, LNKA, 0}, > + > + // PCI Slot 15 > + Package() {0x000fffff, 0, LNKC, 0}, > + Package() {0x000fffff, 1, LNKD, 0}, > + Package() {0x000fffff, 2, LNKA, 0}, > + Package() {0x000fffff, 3, LNKB, 0}, > + > + // PCI Slot 16 > + Package() {0x0010ffff, 0, LNKD, 0}, > + Package() {0x0010ffff, 1, LNKA, 0}, > + Package() {0x0010ffff, 2, LNKB, 0}, > + Package() {0x0010ffff, 3, LNKC, 0}, > + > + // PCI Slot 17 > + Package() {0x0011ffff, 0, LNKA, 0}, > + Package() {0x0011ffff, 1, LNKB, 0}, > + Package() {0x0011ffff, 2, LNKC, 0}, > + Package() {0x0011ffff, 3, LNKD, 0}, > + > + // PCI Slot 18 > + Package() {0x0012ffff, 0, LNKB, 0}, > + Package() {0x0012ffff, 1, LNKC, 0}, > + Package() {0x0012ffff, 2, LNKD, 0}, > + Package() {0x0012ffff, 3, LNKA, 0}, > + > + // PCI Slot 19 > + Package() {0x0013ffff, 0, LNKC, 0}, > + Package() {0x0013ffff, 1, LNKD, 0}, > + Package() {0x0013ffff, 2, LNKA, 0}, > + Package() {0x0013ffff, 3, LNKB, 0}, > + > + // PCI Slot 20 > + Package() {0x0014ffff, 0, LNKD, 0}, > + Package() {0x0014ffff, 1, LNKA, 0}, 
> + Package() {0x0014ffff, 2, LNKB, 0}, > + Package() {0x0014ffff, 3, LNKC, 0}, > + > + // PCI Slot 21 > + Package() {0x0015ffff, 0, LNKA, 0}, > + Package() {0x0015ffff, 1, LNKB, 0}, > + Package() {0x0015ffff, 2, LNKC, 0}, > + Package() {0x0015ffff, 3, LNKD, 0}, > + > + // PCI Slot 22 > + Package() {0x0016ffff, 0, LNKB, 0}, > + Package() {0x0016ffff, 1, LNKC, 0}, > + Package() {0x0016ffff, 2, LNKD, 0}, > + Package() {0x0016ffff, 3, LNKA, 0}, > + > + // PCI Slot 23 > + Package() {0x0017ffff, 0, LNKC, 0}, > + Package() {0x0017ffff, 1, LNKD, 0}, > + Package() {0x0017ffff, 2, LNKA, 0}, > + Package() {0x0017ffff, 3, LNKB, 0}, > + > + // PCI Slot 24 > + Package() {0x0018ffff, 0, LNKD, 0}, > + Package() {0x0018ffff, 1, LNKA, 0}, > + Package() {0x0018ffff, 2, LNKB, 0}, > + Package() {0x0018ffff, 3, LNKC, 0}, > + > + // PCI Slot 25 > + Package() {0x0019ffff, 0, LNKA, 0}, > + Package() {0x0019ffff, 1, LNKB, 0}, > + Package() {0x0019ffff, 2, LNKC, 0}, > + Package() {0x0019ffff, 3, LNKD, 0}, > + > + // PCI Slot 26 > + Package() {0x001affff, 0, LNKB, 0}, > + Package() {0x001affff, 1, LNKC, 0}, > + Package() {0x001affff, 2, LNKD, 0}, > + Package() {0x001affff, 3, LNKA, 0}, > + > + // PCI Slot 27 > + Package() {0x001bffff, 0, LNKC, 0}, > + Package() {0x001bffff, 1, LNKD, 0}, > + Package() {0x001bffff, 2, LNKA, 0}, > + Package() {0x001bffff, 3, LNKB, 0}, > + > + // PCI Slot 28 > + Package() {0x001cffff, 0, LNKD, 0}, > + Package() {0x001cffff, 1, LNKA, 0}, > + Package() {0x001cffff, 2, LNKB, 0}, > + Package() {0x001cffff, 3, LNKC, 0}, > + > + // PCI Slot 29 > + Package() {0x001dffff, 0, LNKA, 0}, > + Package() {0x001dffff, 1, LNKB, 0}, > + Package() {0x001dffff, 2, LNKC, 0}, > + Package() {0x001dffff, 3, LNKD, 0}, > + > + // PCI Slot 30 > + Package() {0x001effff, 0, LNKB, 0}, > + Package() {0x001effff, 1, LNKC, 0}, > + Package() {0x001effff, 2, LNKD, 0}, > + Package() {0x001effff, 3, LNKA, 0}, > + > + // PCI Slot 31 > + Package() {0x001fffff, 0, LNKC, 0}, > + Package() {0x001fffff, 
1, LNKD, 0}, > + Package() {0x001fffff, 2, LNKA, 0}, > + Package() {0x001fffff, 3, LNKB, 0}, > + }) > Index: kvm-userspace.pci2/bios/acpi-pci-slots.dsl > =================================================================== > --- /dev/null > +++ kvm-userspace.pci2/bios/acpi-pci-slots.dsl > @@ -0,0 +1,385 @@ > + Device (S0) { // Slot 0 > + Name (_ADR, 0x00000000) > + Method (_EJ0,1) { > + Store(0x1, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 0, Local0) > + Return (Local0) > + } > + } > + > + Device (S1) { // Slot 1 > + Name (_ADR, 0x00010000) > + Method (_EJ0,1) { > + Store(0x2, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 1, Local0) > + Return (Local0) > + } > + } > + > + Device (S2) { // Slot 2 > + Name (_ADR, 0x00020000) > + Method (_EJ0,1) { > + Store(0x4, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 2, Local0) > + Return (Local0) > + } > + } > + > + Device (S3) { // Slot 3 > + Name (_ADR, 0x00030000) > + Method (_EJ0,1) { > + Store(0x4, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 3, Local0) > + Return (Local0) > + } > + } > + > + Device (S4) { // Slot 4 > + Name (_ADR, 0x00040000) > + Method (_EJ0,1) { > + Store(0x4, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 4, Local0) > + Return (Local0) > + } > + } > + > + Device (S5) { // Slot 5 > + Name (_ADR, 0x00050000) > + Method (_EJ0,1) { > + Store(0x4, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 5, Local0) > + Return (Local0) > + } > + } > + > + Device (S6) { // Slot 6 > + Name (_ADR, 0x00060000) > + Method (_EJ0,1) { > + Store(0x40, BEJ) > + Return (0x0) > + } > + > + Method(_SUN) { > + Add (SUN1, 6, Local0) > + Return (Local0) > + } > + } > + > + Device (S7) { // Slot 7 > + Name (_ADR, 0x00070000) > + Method (_EJ0,1) { > + Store(0x80, BEJ) > + Return (0x0) > + } > + > + Method(_SUN) { > + Add (SUN1, 7, Local0) > + Return (Local0) > + } > + } > + > + Device (S8) { // Slot 8 > + Name (_ADR, 0x00080000) > + Method 
(_EJ0,1) { > + Store(0x100, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 8, Local0) > + Return (Local0) > + } > + } > + > + Device (S9) { // Slot 9 > + Name (_ADR, 0x00090000) > + Method (_EJ0,1) { > + Store(0x200, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 9, Local0) > + Return (Local0) > + } > + } > + > + Device (S10) { // Slot 10 > + Name (_ADR, 0x000A0000) > + Method (_EJ0,1) { > + Store(0x400, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 10, Local0) > + Return (Local0) > + } > + } > + > + Device (S11) { // Slot 11 > + Name (_ADR, 0x000B0000) > + Method (_EJ0,1) { > + Store(0x800, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 11, Local0) > + Return (Local0) > + } > + } > + > + Device (S12) { // Slot 12 > + Name (_ADR, 0x000C0000) > + Method (_EJ0,1) { > + Store(0x1000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 12, Local0) > + Return (Local0) > + } > + } > + > + Device (S13) { // Slot 13 > + Name (_ADR, 0x000D0000) > + Method (_EJ0,1) { > + Store(0x2000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 13, Local0) > + Return (Local0) > + } > + } > + > + Device (S14) { // Slot 14 > + Name (_ADR, 0x000E0000) > + Method (_EJ0,1) { > + Store(0x4000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 14, Local0) > + Return (Local0) > + } > + } > + > + Device (S15) { // Slot 15 > + Name (_ADR, 0x000F0000) > + Method (_EJ0,1) { > + Store(0x8000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 15, Local0) > + Return (Local0) > + } > + } > + > + Device (S16) { // Slot 16 > + Name (_ADR, 0x00100000) > + Method (_EJ0,1) { > + Store(0x10000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 16, Local0) > + Return (Local0) > + } > + } > + > + Device (S17) { // Slot 17 > + Name (_ADR, 0x00110000) > + Method (_EJ0,1) { > + Store(0x20000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 17, Local0) > + Return (Local0) > + } > + 
} > + > + Device (S18) { // Slot 18 > + Name (_ADR, 0x00120000) > + Method (_EJ0,1) { > + Store(0x40000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 18, Local0) > + Return (Local0) > + } > + } > + > + Device (S19) { // Slot 19 > + Name (_ADR, 0x00130000) > + Method (_EJ0,1) { > + Store(0x80000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 19, Local0) > + Return (Local0) > + } > + } > + > + Device (S20) { // Slot 20 > + Name (_ADR, 0x00140000) > + Method (_EJ0,1) { > + Store(0x100000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 20, Local0) > + Return (Local0) > + } > + } > + > + Device (S21) { // Slot 21 > + Name (_ADR, 0x00150000) > + Method (_EJ0,1) { > + Store(0x200000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 21, Local0) > + Return (Local0) > + } > + } > + > + Device (S22) { // Slot 22 > + Name (_ADR, 0x00160000) > + Method (_EJ0,1) { > + Store(0x400000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 22, Local0) > + Return (Local0) > + } > + } > + > + Device (S23) { // Slot 23 > + Name (_ADR, 0x00170000) > + Method (_EJ0,1) { > + Store(0x800000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 23, Local0) > + Return (Local0) > + } > + } > + > + Device (S24) { // Slot 24 > + Name (_ADR, 0x00180000) > + Method (_EJ0,1) { > + Store(0x1000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 24, Local0) > + Return (Local0) > + } > + } > + > + Device (S25) { // Slot 25 > + Name (_ADR, 0x00190000) > + Method (_EJ0,1) { > + Store(0x2000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 25, Local0) > + Return (Local0) > + } > + } > + > + Device (S26) { // Slot 26 > + Name (_ADR, 0x001A0000) > + Method (_EJ0,1) { > + Store(0x4000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 26, Local0) > + Return (Local0) > + } > + } > + > + Device (S27) { // Slot 27 > + Name (_ADR, 0x001B0000) > + Method (_EJ0,1) { > + Store(0x8000000, 
BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 27, Local0) > + Return (Local0) > + } > + } > + > + Device (S28) { // Slot 28 > + Name (_ADR, 0x001C0000) > + Method (_EJ0,1) { > + Store(0x10000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 28, Local0) > + Return (Local0) > + } > + } > + > + Device (S29) { // Slot 29 > + Name (_ADR, 0x001D0000) > + Method (_EJ0,1) { > + Store(0x20000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 29, Local0) > + Return (Local0) > + } > + } > + > + Device (S30) { // Slot 30 > + Name (_ADR, 0x001E0000) > + Method (_EJ0,1) { > + Store(0x40000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 30, Local0) > + Return (Local0) > + } > + } > + > + Device (S31) { // Slot 31 > + Name (_ADR, 0x001F0000) > + Method (_EJ0,1) { > + Store(0x80000000, BEJ) > + Return (0x0) > + } > + Method(_SUN) { > + Add (SUN1, 31, Local0) > + Return (Local0) > + } > + } > Index: kvm-userspace.pci2/bios/rombios32.c > =================================================================== > --- kvm-userspace.pci2.orig/bios/rombios32.c > +++ kvm-userspace.pci2/bios/rombios32.c > @@ -652,6 +652,30 @@ static void bios_lock_shadow_ram(void) > pci_config_writeb(d, 0x59, v); > } > > +static int nr_bridges = 1; > +static int current_bridge = 0; > + > +static void pci_bios_count_p2p(PCIDevice *d) > +{ > + uint16_t vendor_id, device_id; > + > + vendor_id = pci_config_readw(d, PCI_VENDOR_ID); > + device_id = pci_config_readw(d, PCI_DEVICE_ID); > + if (vendor_id == 0x8086 && device_id == 0x244e) > + nr_bridges++; > +} > + > +int fls(int i) > +{ > + int bit; > + > + for (bit=31; bit >= 0; bit--) > + if (i & (1 << bit)) > + return bit+1; > + > + return 0; > +} > + > static void pci_bios_init_bridges(PCIDevice *d) > { > uint16_t vendor_id, device_id; > @@ -681,6 +705,27 @@ static void pci_bios_init_bridges(PCIDev > } else if (vendor_id == 0x8086 && device_id == 0x1237) { > /* i440 PCI bridge */ > 
bios_shadow_init(d); > + } else if (vendor_id == 0x8086 && device_id == 0x244e) { > + int len, base; > + > + len = (0xfebfffff - 0xf0000000) / nr_bridges; > + if (len & (len-1)) > + len = 1 << fls(len); > + > + /* memory IO */ > + base = (0xf0000000+len) + (current_bridge*len); > + base >>= 16; > + pci_config_writew(d, 0x20, base); > + pci_config_writew(d, 0x22, base); > + > + /* port IO */ > + len = 0x1000; > + base = 0x2000 + (current_bridge*len); > + base >>= 8; > + pci_config_writeb(d, 0x1c, base); > + pci_config_writeb(d, 0x1d, base); > + > + current_bridge++; > } > } > > @@ -775,6 +820,8 @@ static void pci_bios_init_device(PCIDevi > pci_set_io_region_addr(d, 0, 0x80800000); > } > break; > + case 0x0604: > + break; > default: > default_map: > /* default memory mappings */ > @@ -859,6 +906,8 @@ void pci_bios_init(void) > if (pci_bios_bigmem_addr < 0x90000000) > pci_bios_bigmem_addr = 0x90000000; > > + pci_for_each_device(pci_bios_count_p2p); > + > pci_for_each_device(pci_bios_init_bridges); > > pci_for_each_device(pci_bios_init_device); > Index: kvm-userspace.pci2/qemu/hw/acpi.c > =================================================================== > --- kvm-userspace.pci2.orig/qemu/hw/acpi.c > +++ kvm-userspace.pci2/qemu/hw/acpi.c > @@ -557,10 +557,11 @@ struct gpe_regs { > struct pci_status { > uint32_t up; > uint32_t down; > + unsigned long base; > }; > > static struct gpe_regs gpe; > -static struct pci_status pci0_status; > +static struct pci_status pci_bus_status[4]; > > static uint32_t gpe_readb(void *opaque, uint32_t addr) > { > @@ -630,16 +631,19 @@ static void gpe_writeb(void *opaque, uin > > static uint32_t pcihotplug_read(void *opaque, uint32_t addr) > { > - uint32_t val = 0; > struct pci_status *g = opaque; > - switch (addr) { > - case PCI_BASE: > + uint32_t val, offset; > + > + offset = addr - g->base; > + switch (offset) { > + case 0: > val = g->up; > break; > - case PCI_BASE + 4: > + case 4: > val = g->down; > break; > default: > + val = 0; > 
break; > } > > @@ -652,11 +656,13 @@ static uint32_t pcihotplug_read(voi... [truncated message content] |
From: Jes S. <je...@sg...> - 2008-04-25 09:18:28
|
>>>>> "Avi" == Avi Kivity <av...@qu...> writes:

Avi> kvm-devel doesn't do manual moderation.  If vger has the
Avi> infrastructure, I don't see why you can't continue doing this on
Avi> kvm-ppc-devel.

Please don't do this for the kvm-ia64 list either.

Avi> btw, we can probably shorten the names to kvm@ and kvm-$arch@
Avi> while we're at it.

Either way works IMHO.

Cheers,
Jes |
From: Yang, S. <she...@in...> - 2008-04-25 08:22:36
|
From 592b7855a88266fa19505f0d51fe12ec0eadfa62 Mon Sep 17 00:00:00 2001 From: Sheng Yang <she...@in...> Date: Fri, 25 Apr 2008 22:14:06 +0800 Subject: [PATCH 8/8] KVM: VMX: Enable EPT feature for KVM Signed-off-by: Sheng Yang <she...@in...> --- arch/x86/kvm/mmu.c | 11 ++- arch/x86/kvm/vmx.c | 227 ++++++++++++++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx.h | 9 ++ include/asm-x86/kvm_host.h | 1 + 4 files changed, 238 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 3dbedf1..7a8640a 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1177,8 +1177,15 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write, return -ENOMEM; } - table[index] = __pa(new_table->spt) | PT_PRESENT_MASK - | PT_WRITABLE_MASK | shadow_user_mask; + if (shadow_user_mask) + table[index] = __pa(new_table->spt) + | PT_PRESENT_MASK | PT_WRITABLE_MASK + | shadow_user_mask; + else + table[index] = __pa(new_table->spt) + | PT_PRESENT_MASK | PT_WRITABLE_MASK + | shadow_x_mask; + table[index] = __pa(new_table->spt) | 0x7; } table_addr = table[index] & PT64_BASE_ADDR_MASK; } diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index de5f615..8870c6f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -42,7 +42,7 @@ module_param(enable_vpid, bool, 0); static int flexpriority_enabled = 1; module_param(flexpriority_enabled, bool, 0); -static int enable_ept; +static int enable_ept = 1; module_param(enable_ept, bool, 0); struct vmcs { @@ -284,6 +284,18 @@ static inline void __invvpid(int ext, u16 vpid, gva_t gva) : : "a"(&operand), "c"(ext) : "cc", "memory"); } +static inline void __invept(int ext, u64 eptp, gpa_t gpa) +{ + struct { + u64 eptp, gpa; + } operand = {eptp, gpa}; + + asm volatile (ASM_VMX_INVEPT + /* CF==1 or ZF==1 --> rc = -1 */ + "; ja 1f ; ud2 ; 1:\n" + : : "a" (&operand), "c" (ext) : "cc", "memory"); +} + static struct kvm_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr) { int i; @@ -335,6 +347,33 @@ static inline 
void vpid_sync_vcpu_all(struct vcpu_vmx *vmx) __invvpid(VMX_VPID_EXTENT_SINGLE_CONTEXT, vmx->vpid, 0); } +static inline void ept_sync_global(void) +{ + if (cpu_has_vmx_invept_global()) + __invept(VMX_EPT_EXTENT_GLOBAL, 0, 0); +} + +static inline void ept_sync_context(u64 eptp) +{ + if (vm_need_ept()) { + if (cpu_has_vmx_invept_context()) + __invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0); + else + ept_sync_global(); + } +} + +static inline void ept_sync_individual_addr(u64 eptp, gpa_t gpa) +{ + if (vm_need_ept()) { + if (cpu_has_vmx_invept_individual_addr()) + __invept(VMX_EPT_EXTENT_INDIVIDUAL_ADDR, + eptp, gpa); + else + ept_sync_context(eptp); + } +} + static unsigned long vmcs_readl(unsigned long field) { unsigned long value; @@ -422,6 +461,8 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu) eb |= 1u << 1; if (vcpu->arch.rmode.active) eb = ~0; + if (vm_need_ept()) + eb &= ~(1u << PF_VECTOR); /* bypass_guest_pf = 0 */ vmcs_write32(EXCEPTION_BITMAP, eb); } @@ -1352,8 +1393,64 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu) vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & ~KVM_GUEST_CR4_MASK; } +static void ept_load_pdptrs(struct kvm_vcpu *vcpu) +{ + if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) { + if (!load_pdptrs(vcpu, vcpu->arch.cr3)) { + printk(KERN_ERR "EPT: Fail to load pdptrs!\n"); + return; + } + vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]); + vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]); + vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]); + vmcs_write64(GUEST_PDPTR3, vcpu->arch.pdptrs[3]); + } +} + +static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4); + +static void ept_update_paging_mode_cr0(unsigned long *hw_cr0, + unsigned long cr0, + struct kvm_vcpu *vcpu) +{ + if (!(cr0 & X86_CR0_PG)) { + /* From paging/starting to nonpaging */ + vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, + vmcs_config.cpu_based_exec_ctrl | + (CPU_BASED_CR3_LOAD_EXITING | + CPU_BASED_CR3_STORE_EXITING)); + vcpu->arch.cr0 = cr0; + 
vmx_set_cr4(vcpu, vcpu->arch.cr4); + *hw_cr0 |= X86_CR0_PE | X86_CR0_PG; + *hw_cr0 &= ~X86_CR0_WP; + } else if (!is_paging(vcpu)) { + /* From nonpaging to paging */ + vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, + vmcs_config.cpu_based_exec_ctrl & + ~(CPU_BASED_CR3_LOAD_EXITING | + CPU_BASED_CR3_STORE_EXITING)); + vcpu->arch.cr0 = cr0; + vmx_set_cr4(vcpu, vcpu->arch.cr4); + if (!(vcpu->arch.cr0 & X86_CR0_WP)) + *hw_cr0 &= ~X86_CR0_WP; + } +} + +static void ept_update_paging_mode_cr4(unsigned long *hw_cr4, + struct kvm_vcpu *vcpu) +{ + if (!is_paging(vcpu)) { + *hw_cr4 &= ~X86_CR4_PAE; + *hw_cr4 |= X86_CR4_PSE; + } else if (!(vcpu->arch.cr4 & X86_CR4_PAE)) + *hw_cr4 &= ~X86_CR4_PAE; +} + static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { + unsigned long hw_cr0 = (cr0 & ~KVM_GUEST_CR0_MASK) | + KVM_VM_CR0_ALWAYS_ON; + vmx_fpu_deactivate(vcpu); if (vcpu->arch.rmode.active && (cr0 & X86_CR0_PE)) @@ -1371,29 +1468,61 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) } #endif + if (vm_need_ept()) + ept_update_paging_mode_cr0(&hw_cr0, cr0, vcpu); + vmcs_writel(CR0_READ_SHADOW, cr0); - vmcs_writel(GUEST_CR0, - (cr0 & ~KVM_GUEST_CR0_MASK) | KVM_VM_CR0_ALWAYS_ON); + vmcs_writel(GUEST_CR0, hw_cr0); vcpu->arch.cr0 = cr0; if (!(cr0 & X86_CR0_TS) || !(cr0 & X86_CR0_PE)) vmx_fpu_activate(vcpu); } +static u64 construct_eptp(unsigned long root_hpa) +{ + u64 eptp; + + /* TODO write the value reading from MSR */ + eptp = VMX_EPT_DEFAULT_MT | + VMX_EPT_DEFAULT_GAW << VMX_EPT_GAW_EPTP_SHIFT; + eptp |= (root_hpa & PAGE_MASK); + + return eptp; +} + static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) { + unsigned long guest_cr3; + u64 eptp; + + guest_cr3 = cr3; + if (vm_need_ept()) { + eptp = construct_eptp(cr3); + vmcs_write64(EPT_POINTER, eptp); + ept_sync_context(eptp); + ept_load_pdptrs(vcpu); + guest_cr3 = is_paging(vcpu) ? 
vcpu->arch.cr3 : + VMX_EPT_IDENTITY_PAGETABLE_ADDR; + } + vmx_flush_tlb(vcpu); - vmcs_writel(GUEST_CR3, cr3); + vmcs_writel(GUEST_CR3, guest_cr3); if (vcpu->arch.cr0 & X86_CR0_PE) vmx_fpu_deactivate(vcpu); } static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { - vmcs_writel(CR4_READ_SHADOW, cr4); - vmcs_writel(GUEST_CR4, cr4 | (vcpu->arch.rmode.active ? - KVM_RMODE_VM_CR4_ALWAYS_ON : KVM_PMODE_VM_CR4_ALWAYS_ON)); + unsigned long hw_cr4 = cr4 | (vcpu->arch.rmode.active ? + KVM_RMODE_VM_CR4_ALWAYS_ON : KVM_PMODE_VM_CR4_ALWAYS_ON); + vcpu->arch.cr4 = cr4; + if (vm_need_ept()) + ept_update_paging_mode_cr4(&hw_cr4, vcpu); + + vmcs_writel(CR4_READ_SHADOW, cr4); + vmcs_writel(GUEST_CR4, hw_cr4); } static void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer) @@ -2116,6 +2245,9 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) if (intr_info & INTR_INFO_DELIVER_CODE_MASK) error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE); if (is_page_fault(intr_info)) { + /* EPT won't cause page fault directly */ + if (vm_need_ept()) + BUG(); cr2 = vmcs_readl(EXIT_QUALIFICATION); KVMTRACE_3D(PAGE_FAULT, vcpu, error_code, (u32)cr2, (u32)((u64)cr2 >> 32), handler); @@ -2445,6 +2577,64 @@ static int handle_task_switch(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) return kvm_task_switch(vcpu, tss_selector, reason); } +static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +{ + u64 exit_qualification; + enum emulation_result er; + gpa_t gpa; + unsigned long hva; + int gla_validity; + int r; + + exit_qualification = vmcs_read64(EXIT_QUALIFICATION); + + if (exit_qualification & (1 << 6)) { + printk(KERN_ERR "EPT: GPA exceeds GAW!\n"); + return -ENOTSUPP; + } + + gla_validity = (exit_qualification >> 7) & 0x3; + if (gla_validity != 0x3 && gla_validity != 0x1 && gla_validity != 0) { + printk(KERN_ERR "EPT: Handling EPT violation failed!\n"); + printk(KERN_ERR "EPT: GPA: 0x%lx, GVA: 0x%lx\n", + (long unsigned 
int)vmcs_read64(GUEST_PHYSICAL_ADDRESS), + (long unsigned int)vmcs_read64(GUEST_LINEAR_ADDRESS)); + printk(KERN_ERR "EPT: Exit qualification is 0x%lx\n", + (long unsigned int)exit_qualification); + kvm_run->exit_reason = KVM_EXIT_UNKNOWN; + kvm_run->hw.hardware_exit_reason = 0; + return -ENOTSUPP; + } + + gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS); + hva = gfn_to_hva(vcpu->kvm, gpa >> PAGE_SHIFT); + if (!kvm_is_error_hva(hva)) { + r = kvm_mmu_page_fault(vcpu, gpa & PAGE_MASK, 0); + if (r < 0) { + printk(KERN_ERR "EPT: Not enough memory!\n"); + return -ENOMEM; + } + return 1; + } else { + /* must be MMIO */ + er = emulate_instruction(vcpu, kvm_run, 0, 0, 0); + + if (er == EMULATE_FAIL) { + printk(KERN_ERR + "EPT: Fail to handle EPT violation vmexit!er is %d\n", + er); + printk(KERN_ERR "EPT: GPA: 0x%lx, GVA: 0x%lx\n", + (long unsigned int)vmcs_read64(GUEST_PHYSICAL_ADDRESS), + (long unsigned int)vmcs_read64(GUEST_LINEAR_ADDRESS)); + printk(KERN_ERR "EPT: Exit qualification is 0x%lx\n", + (long unsigned int)exit_qualification); + return -ENOTSUPP; + } else if (er == EMULATE_DO_MMIO) + return 0; + } + return 1; +} + /* * The exit handlers return 1 if the exit was handled fully and guest execution * may resume. 
Otherwise they set the kvm_run parameter to indicate what needs @@ -2468,6 +2658,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu, [EXIT_REASON_APIC_ACCESS] = handle_apic_access, [EXIT_REASON_WBINVD] = handle_wbinvd, [EXIT_REASON_TASK_SWITCH] = handle_task_switch, + [EXIT_REASON_EPT_VIOLATION] = handle_ept_violation, }; static const int kvm_vmx_max_exit_handlers = @@ -2494,7 +2685,8 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } if ((vectoring_info & VECTORING_INFO_VALID_MASK) && - exit_reason != EXIT_REASON_EXCEPTION_NMI) + (exit_reason != EXIT_REASON_EXCEPTION_NMI && + exit_reason != EXIT_REASON_EPT_VIOLATION)) printk(KERN_WARNING "%s: unexpected, valid vectoring info and " "exit reason is 0x%x\n", __func__, exit_reason); if (exit_reason < kvm_vmx_max_exit_handlers @@ -2741,6 +2933,11 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) #endif ); + /* Access CR3 don't cause VMExit in paging mode, so we need + * to sync with guest real CR3. 
*/ + if (vm_need_ept() && is_paging(vcpu)) + vcpu->arch.cr3 = vmcs_readl(GUEST_CR3); + vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD); if (vmx->rmode.irq.pending) fixup_rmode_irq(vmx); @@ -2796,6 +2993,15 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) return ERR_PTR(-ENOMEM); allocate_vpid(vmx); + if (id == 0 && vm_need_ept()) { + kvm_mmu_set_base_ptes(VMX_EPT_READABLE_MASK | + VMX_EPT_WRITABLE_MASK | + VMX_EPT_DEFAULT_MT << VMX_EPT_MT_EPTE_SHIFT); + kvm_mmu_set_mask_ptes(0ull, VMX_EPT_FAKE_ACCESSED_MASK, + VMX_EPT_FAKE_DIRTY_MASK, 0ull, + VMX_EPT_EXECUTABLE_MASK); + kvm_enable_tdp(); + } err = kvm_vcpu_init(&vmx->vcpu, kvm, id); if (err) @@ -2975,9 +3181,14 @@ static int __init vmx_init(void) vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_IA32_SYSENTER_ESP); vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_IA32_SYSENTER_EIP); + if (cpu_has_vmx_ept()) + bypass_guest_pf = 0; + if (bypass_guest_pf) kvm_mmu_set_nonpresent_ptes(~0xffeull, 0ull); + ept_sync_global(); + return 0; out2: diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h index f97eccc..79d94c6 100644 --- a/arch/x86/kvm/vmx.h +++ b/arch/x86/kvm/vmx.h @@ -353,6 +353,15 @@ enum vmcs_field { #define VMX_EPT_EXTENT_CONTEXT_BIT (1ull << 25) #define VMX_EPT_EXTENT_GLOBAL_BIT (1ull << 26) #define VMX_EPT_DEFAULT_GAW 3 +#define VMX_EPT_MAX_GAW 0x4 +#define VMX_EPT_MT_EPTE_SHIFT 3 +#define VMX_EPT_GAW_EPTP_SHIFT 3 +#define VMX_EPT_DEFAULT_MT 0x6ull +#define VMX_EPT_READABLE_MASK 0x1ull +#define VMX_EPT_WRITABLE_MASK 0x2ull +#define VMX_EPT_EXECUTABLE_MASK 0x4ull +#define VMX_EPT_FAKE_ACCESSED_MASK (1ull << 62) +#define VMX_EPT_FAKE_DIRTY_MASK (1ull << 63) #define VMX_EPT_IDENTITY_PAGETABLE_ADDR 0xfffbc000ul diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index f03ef75..54a8f77 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -650,6 +650,7 @@ static inline void kvm_inject_gp(struct kvm_vcpu *vcpu, u32 error_code) 
#define ASM_VMX_VMWRITE_RSP_RDX ".byte 0x0f, 0x79, 0xd4" #define ASM_VMX_VMXOFF ".byte 0x0f, 0x01, 0xc4" #define ASM_VMX_VMXON_RAX ".byte 0xf3, 0x0f, 0xc7, 0x30" +#define ASM_VMX_INVEPT ".byte 0x66, 0x0f, 0x38, 0x80, 0x08" #define ASM_VMX_INVVPID ".byte 0x66, 0x0f, 0x38, 0x81, 0x08" #define MSR_IA32_TIME_STAMP_COUNTER 0x010 -- 1.5.4.5 |
From: Yang, S. <she...@in...> - 2008-04-25 08:22:11
From 30448ffed0d5dad04d0538a53661764128cf05f5 Mon Sep 17 00:00:00 2001 From: Sheng Yang <she...@in...> Date: Fri, 25 Apr 2008 21:44:52 +0800 Subject: [PATCH 7/8] KVM: VMX: Perpare a identity page table for EPT in real mode Signed-off-by: Sheng Yang <she...@in...> --- arch/x86/kvm/vmx.c | 79 ++++++++++++++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx.h | 3 ++ include/asm-x86/kvm_host.h | 3 ++ 3 files changed, 82 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 98e4f2b..de5f615 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -87,7 +87,7 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_vmx, vcpu); } -static int init_rmode_tss(struct kvm *kvm); +static int init_rmode(struct kvm *kvm); static DEFINE_PER_CPU(struct vmcs *, vmxarea); static DEFINE_PER_CPU(struct vmcs *, current_vmcs); @@ -1304,7 +1304,7 @@ static void enter_rmode(struct kvm_vcpu *vcpu) fix_rmode_seg(VCPU_SREG_FS, &vcpu->arch.rmode.fs); kvm_mmu_reset_context(vcpu); - init_rmode_tss(vcpu->kvm); + init_rmode(vcpu->kvm); } #ifdef CONFIG_X86_64 @@ -1578,6 +1578,41 @@ out: return ret; } +static int init_rmode_identity_map(struct kvm *kvm) +{ + int i, r, ret; + pfn_t identity_map_pfn; + u32 tmp; + + if (!vm_need_ept()) + return 1; + if (unlikely(!kvm->arch.ept_identity_pagetable)) { + printk(KERN_ERR "EPT: identity-mapping pagetable " + "haven't been allocated!\n"); + return 0; + } + if (likely(kvm->arch.ept_identity_pagetable_done)) + return 1; + ret = 0; + identity_map_pfn = VMX_EPT_IDENTITY_PAGETABLE_ADDR >> PAGE_SHIFT; + r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE); + if (r < 0) + goto out; + /* Set up identity-mapping pagetable for EPT in real mode */ + for (i = 0; i < PT32_ENT_PER_PAGE; i++) { + tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | + _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); + r = kvm_write_guest_page(kvm, identity_map_pfn, + &tmp, i * sizeof(tmp), sizeof(tmp)); 
+ if (r < 0) + goto out; + } + kvm->arch.ept_identity_pagetable_done = true; + ret = 1; +out: + return ret; +} + static void seg_setup(int seg) { struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg]; @@ -1612,6 +1647,31 @@ out: return r; } +static int alloc_identity_pagetable(struct kvm *kvm) +{ + struct kvm_userspace_memory_region kvm_userspace_mem; + int r = 0; + + down_write(&kvm->slots_lock); + if (kvm->arch.ept_identity_pagetable) + goto out; + kvm_userspace_mem.slot = IDENTITY_PAGETABLE_PRIVATE_MEMSLOT; + kvm_userspace_mem.flags = 0; + kvm_userspace_mem.guest_phys_addr = VMX_EPT_IDENTITY_PAGETABLE_ADDR; + kvm_userspace_mem.memory_size = PAGE_SIZE; + r = __kvm_set_memory_region(kvm, &kvm_userspace_mem, 0); + if (r) + goto out; + + down_read(¤t->mm->mmap_sem); + kvm->arch.ept_identity_pagetable = gfn_to_page(kvm, + VMX_EPT_IDENTITY_PAGETABLE_ADDR >> PAGE_SHIFT); + up_read(¤t->mm->mmap_sem); +out: + up_write(&kvm->slots_lock); + return r; +} + static void allocate_vpid(struct vcpu_vmx *vmx) { int vpid; @@ -1775,6 +1835,15 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) return 0; } +static int init_rmode(struct kvm *kvm) +{ + if (!init_rmode_tss(kvm)) + return 0; + if (!init_rmode_identity_map(kvm)) + return 0; + return 1; +} + static int vmx_vcpu_reset(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -1782,7 +1851,7 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu) int ret; down_read(&vcpu->kvm->slots_lock); - if (!init_rmode_tss(vmx->vcpu.kvm)) { + if (!init_rmode(vmx->vcpu.kvm)) { ret = -ENOMEM; goto out; } @@ -2759,6 +2828,10 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) if (alloc_apic_access_page(kvm) != 0) goto free_vmcs; + if (vm_need_ept()) + if (alloc_identity_pagetable(kvm) != 0) + goto free_vmcs; + return &vmx->vcpu; free_vmcs: diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h index 093b085..f97eccc 100644 --- a/arch/x86/kvm/vmx.h +++ b/arch/x86/kvm/vmx.h @@ -340,6 +340,7 @@ enum 
vmcs_field { #define MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED 0x4 #define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT 9 +#define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT 10 #define VMX_NR_VPIDS (1 << 16) #define VMX_VPID_EXTENT_SINGLE_CONTEXT 1 @@ -353,4 +354,6 @@ enum vmcs_field { #define VMX_EPT_EXTENT_GLOBAL_BIT (1ull << 26) #define VMX_EPT_DEFAULT_GAW 3 +#define VMX_EPT_IDENTITY_PAGETABLE_ADDR 0xfffbc000ul + #endif diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 715f7b9..f03ef75 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -314,6 +314,9 @@ struct kvm_arch{ struct page *apic_access_page; gpa_t wall_clock; + + struct page *ept_identity_pagetable; + bool ept_identity_pagetable_done; }; struct kvm_vm_stat { -- 1.5.4.5 |
From: Yang, S. <she...@in...> - 2008-04-25 08:21:41
From 6edba459ef83a1f2c64c3b782ce67a14e74f9330 Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 25 Apr 2008 21:44:50 +0800
Subject: [PATCH 6/8] KVM: Export necessary function for EPT

Signed-off-by: Sheng Yang <she...@in...>
---
 virt/kvm/kvm_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5e7802e..1d7991a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -522,6 +522,7 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 		return bad_hva();
 	return (slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE);
 }
+EXPORT_SYMBOL_GPL(gfn_to_hva);
 
 /*
  * Requires current->mm->mmap_sem to be held
-- 
1.5.4.5
From: Yang, S. <she...@in...> - 2008-04-25 08:21:23
From 169c62e33ea1dbadc8d2fbc3d4880e63caa4d6ab Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 25 Apr 2008 21:44:42 +0800
Subject: [PATCH 5/8] KVM: MMU: Remove #ifdef CONFIG_X86_64 to support 4 level EPT

Currently the EPT level is 4 for both PAE and 32e. This patch removes
the #ifdefs around allocating and freeing root_hpa to support EPT.

Signed-off-by: Sheng Yang <she...@in...>
---
 arch/x86/kvm/mmu.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c28a36b..3dbedf1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1233,7 +1233,6 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
 	if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
 		return;
 	spin_lock(&vcpu->kvm->mmu_lock);
-#ifdef CONFIG_X86_64
 	if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
 		hpa_t root = vcpu->arch.mmu.root_hpa;
 
@@ -1245,7 +1244,6 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
 		spin_unlock(&vcpu->kvm->mmu_lock);
 		return;
 	}
-#endif
 	for (i = 0; i < 4; ++i) {
 		hpa_t root = vcpu->arch.mmu.pae_root[i];
 
@@ -1271,7 +1269,6 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 
 	root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;
 
-#ifdef CONFIG_X86_64
 	if (vcpu->arch.mmu.shadow_root_level == PT64_ROOT_LEVEL) {
 		hpa_t root = vcpu->arch.mmu.root_hpa;
 
@@ -1286,7 +1283,6 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		vcpu->arch.mmu.root_hpa = root;
 		return;
 	}
-#endif
 	metaphysical = !is_paging(vcpu);
 	if (tdp_enabled)
 		metaphysical = 1;
-- 
1.5.4.5
From: Yang, S. <she...@in...> - 2008-04-25 08:20:53
From 239f38236196c2321989c64d7c61ff28490b3f00 Mon Sep 17 00:00:00 2001 From: Sheng Yang <she...@in...> Date: Fri, 25 Apr 2008 21:13:50 +0800 Subject: [PATCH 4/8] KVM: MMU: Add EPT support Enable kvm_set_spte() to generate EPT entries. Signed-off-by: Sheng Yang <she...@in...> --- arch/x86/kvm/mmu.c | 47 ++++++++++++++++++++++++++++++++----------- arch/x86/kvm/x86.c | 3 ++ include/asm-x86/kvm_host.h | 3 ++ 3 files changed, 41 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bcfaf7e..c28a36b 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -152,6 +152,12 @@ static struct kmem_cache *mmu_page_header_cache; static u64 __read_mostly shadow_trap_nonpresent_pte; static u64 __read_mostly shadow_notrap_nonpresent_pte; +static u64 __read_mostly shadow_base_present_pte; +static u64 __read_mostly shadow_nx_mask; +static u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */ +static u64 __read_mostly shadow_user_mask; +static u64 __read_mostly shadow_accessed_mask; +static u64 __read_mostly shadow_dirty_mask; void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte) { @@ -160,6 +166,23 @@ void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte) } EXPORT_SYMBOL_GPL(kvm_mmu_set_nonpresent_ptes); +void kvm_mmu_set_base_ptes(u64 base_pte) +{ + shadow_base_present_pte = base_pte; +} +EXPORT_SYMBOL_GPL(kvm_mmu_set_base_ptes); + +void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, + u64 dirty_mask, u64 nx_mask, u64 x_mask) +{ + shadow_user_mask = user_mask; + shadow_accessed_mask = accessed_mask; + shadow_dirty_mask = dirty_mask; + shadow_nx_mask = nx_mask; + shadow_x_mask = x_mask; +} +EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes); + static int is_write_protection(struct kvm_vcpu *vcpu) { return vcpu->arch.cr0 & X86_CR0_WP; @@ -198,7 +221,7 @@ static int is_writeble_pte(unsigned long pte) static int is_dirty_pte(unsigned long pte) { - return pte & PT_DIRTY_MASK; + return pte & shadow_dirty_mask; } static 
int is_rmap_pte(u64 pte) @@ -513,7 +536,7 @@ static void rmap_remove(struct kvm *kvm, u64 *spte) return; sp = page_header(__pa(spte)); pfn = spte_to_pfn(*spte); - if (*spte & PT_ACCESSED_MASK) + if (*spte & shadow_accessed_mask) kvm_set_pfn_accessed(pfn); if (is_writeble_pte(*spte)) kvm_release_pfn_dirty(pfn); @@ -1039,17 +1062,17 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte, * whether the guest actually used the pte (in order to detect * demand paging). */ - spte = PT_PRESENT_MASK | PT_DIRTY_MASK; + spte = shadow_base_present_pte | shadow_dirty_mask; if (!speculative) pte_access |= PT_ACCESSED_MASK; if (!dirty) pte_access &= ~ACC_WRITE_MASK; - if (!(pte_access & ACC_EXEC_MASK)) - spte |= PT64_NX_MASK; - - spte |= PT_PRESENT_MASK; + if (pte_access & ACC_EXEC_MASK) + spte |= shadow_x_mask; + else + spte |= shadow_nx_mask; if (pte_access & ACC_USER_MASK) - spte |= PT_USER_MASK; + spte |= shadow_user_mask; if (largepage) spte |= PT_PAGE_SIZE_MASK; @@ -1155,7 +1178,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write, } table[index] = __pa(new_table->spt) | PT_PRESENT_MASK - | PT_WRITABLE_MASK | PT_USER_MASK; + | PT_WRITABLE_MASK | shadow_user_mask; } table_addr = table[index] & PT64_BASE_ADDR_MASK; } @@ -1343,7 +1366,7 @@ static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, spin_lock(&vcpu->kvm->mmu_lock); kvm_mmu_free_some_pages(vcpu); r = __direct_map(vcpu, gpa, error_code & PFERR_WRITE_MASK, - largepage, gfn, pfn, TDP_ROOT_LEVEL); + largepage, gfn, pfn, kvm_x86_ops->get_tdp_level()); spin_unlock(&vcpu->kvm->mmu_lock); return r; @@ -1450,7 +1473,7 @@ static int init_kvm_tdp_mmu(struct kvm_vcpu *vcpu) context->page_fault = tdp_page_fault; context->free = nonpaging_free; context->prefetch_page = nonpaging_prefetch_page; - context->shadow_root_level = TDP_ROOT_LEVEL; + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); context->root_hpa = INVALID_PAGE; if (!is_paging(vcpu)) { @@ -1599,7 +1622,7 @@ static bool 
last_updated_pte_accessed(struct kvm_vcpu *vcpu) { u64 *spte = vcpu->arch.last_pte_updated; - return !!(spte && (*spte & PT_ACCESSED_MASK)); + return !!(spte && (*spte & shadow_accessed_mask)); } static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0ce5563..0735efb 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2417,6 +2417,9 @@ int kvm_arch_init(void *opaque) kvm_x86_ops = ops; kvm_mmu_set_nonpresent_ptes(0ull, 0ull); + kvm_mmu_set_base_ptes(PT_PRESENT_MASK); + kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK, + PT_DIRTY_MASK, PT64_NX_MASK, 0); return 0; out: diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 65b27c9..715f7b9 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -433,6 +433,9 @@ void kvm_mmu_destroy(struct kvm_vcpu *vcpu); int kvm_mmu_create(struct kvm_vcpu *vcpu); int kvm_mmu_setup(struct kvm_vcpu *vcpu); void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte); +void kvm_mmu_set_base_ptes(u64 base_pte); +void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask, + u64 dirty_mask, u64 nx_mask, u64 x_mask); int kvm_mmu_reset_context(struct kvm_vcpu *vcpu); void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot); -- 1.5.4.5 |
From: Yang, S. <she...@in...> - 2008-04-25 08:19:46
From 143c1240c5e5391f4e108aedfeb4491e6347d04e Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 25 Apr 2008 10:20:22 +0800
Subject: [PATCH 3/8] KVM: Add kvm_x86_ops get_tdp_level()

The new get_tdp_level() callback provides the number of TDP page-table
levels for both EPT and NPT, replacing the NPT-specific macro.

Signed-off-by: Sheng Yang <she...@in...>
---
 arch/x86/kvm/mmu.h         |    6 ------
 arch/x86/kvm/svm.c         |   10 ++++++++++
 arch/x86/kvm/vmx.c         |    6 ++++++
 arch/x86/kvm/vmx.h         |    1 +
 include/asm-x86/kvm_host.h |    2 +-
 5 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a4fcb78..1730757 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -3,12 +3,6 @@
 #include <linux/kvm_host.h>
 
-#ifdef CONFIG_X86_64
-#define TDP_ROOT_LEVEL PT64_ROOT_LEVEL
-#else
-#define TDP_ROOT_LEVEL PT32E_ROOT_LEVEL
-#endif
-
 #define PT64_PT_BITS 9
 #define PT64_ENT_PER_PAGE (1 << PT64_PT_BITS)
 #define PT32_PT_BITS 10
 #define PT32_ENT_PER_PAGE (1 << PT32_PT_BITS)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 89e0be2..ab22615 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1863,6 +1863,15 @@ static bool svm_cpu_has_accelerated_tpr(void)
 	return false;
 }
 
+static int get_npt_level(void)
+{
+#ifdef CONFIG_X86_64
+	return PT64_ROOT_LEVEL;
+#else
+	return PT32E_ROOT_LEVEL;
+#endif
+}
+
 static struct kvm_x86_ops svm_x86_ops = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -1920,6 +1929,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.inject_pending_vectors = do_interrupt_requests,
 
 	.set_tss_addr = svm_set_tss_addr,
+	.get_tdp_level = get_npt_level,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d93250d..98e4f2b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2788,6 +2788,11 @@ static void __init vmx_check_processor_compat(void *rtn)
 	}
 }
 
+static int get_ept_level(void)
+{
+	return VMX_EPT_DEFAULT_GAW + 1;
+}
+
 static struct kvm_x86_ops vmx_x86_ops = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -2844,6 +2849,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.inject_pending_vectors = do_interrupt_requests,
 
 	.set_tss_addr = vmx_set_tss_addr,
+	.get_tdp_level = get_ept_level,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 5f7fdc9..093b085 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -351,5 +351,6 @@ enum vmcs_field {
 #define VMX_EPT_EXTENT_INDIVIDUAL_BIT (1ull << 24)
 #define VMX_EPT_EXTENT_CONTEXT_BIT (1ull << 25)
 #define VMX_EPT_EXTENT_GLOBAL_BIT (1ull << 26)
+#define VMX_EPT_DEFAULT_GAW 3
 
 #endif
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 9d963cd..65b27c9 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -422,8 +422,8 @@ struct kvm_x86_ops {
 			       struct kvm_run *run);
 
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
+	int (*get_tdp_level)(void);
 };
-
 extern struct kvm_x86_ops *kvm_x86_ops;
 
 int kvm_mmu_module_init(void);
-- 
1.5.4.5
From: Yang, S. <she...@in...> - 2008-04-25 08:19:06
From 75e9d921390a29ecf9c85dd1370103c921beacd7 Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 25 Apr 2008 10:17:08 +0800
Subject: [PATCH 2/8] KVM: MMU: Move some definitions

Move some definitions to mmu.h in order to build common table entries.

Signed-off-by: Sheng Yang <she...@in...>
---
 arch/x86/kvm/mmu.c |   34 ----------------------------------
 arch/x86/kvm/mmu.h |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 2ad6f54..bcfaf7e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -79,36 +79,6 @@ static int dbg = 1;
 }
 #endif
 
-#define PT64_PT_BITS 9
-#define PT64_ENT_PER_PAGE (1 << PT64_PT_BITS)
-#define PT32_PT_BITS 10
-#define PT32_ENT_PER_PAGE (1 << PT32_PT_BITS)
-
-#define PT_WRITABLE_SHIFT 1
-
-#define PT_PRESENT_MASK (1ULL << 0)
-#define PT_WRITABLE_MASK (1ULL << PT_WRITABLE_SHIFT)
-#define PT_USER_MASK (1ULL << 2)
-#define PT_PWT_MASK (1ULL << 3)
-#define PT_PCD_MASK (1ULL << 4)
-#define PT_ACCESSED_MASK (1ULL << 5)
-#define PT_DIRTY_MASK (1ULL << 6)
-#define PT_PAGE_SIZE_MASK (1ULL << 7)
-#define PT_PAT_MASK (1ULL << 7)
-#define PT_GLOBAL_MASK (1ULL << 8)
-#define PT64_NX_SHIFT 63
-#define PT64_NX_MASK (1ULL << PT64_NX_SHIFT)
-
-#define PT_PAT_SHIFT 7
-#define PT_DIR_PAT_SHIFT 12
-#define PT_DIR_PAT_MASK (1ULL << PT_DIR_PAT_SHIFT)
-
-#define PT32_DIR_PSE36_SIZE 4
-#define PT32_DIR_PSE36_SHIFT 13
-#define PT32_DIR_PSE36_MASK \
-	(((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
-
-
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
 #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
@@ -154,10 +124,6 @@ static int dbg = 1;
 #define PFERR_USER_MASK (1U << 2)
 #define PFERR_FETCH_MASK (1U << 4)
 
-#define PT64_ROOT_LEVEL 4
-#define PT32_ROOT_LEVEL 2
-#define PT32E_ROOT_LEVEL 3
-
 #define PT_DIRECTORY_LEVEL 2
 #define PT_PAGE_TABLE_LEVEL 1
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e64e9f5..a4fcb78 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -9,6 +9,39 @@
 #define TDP_ROOT_LEVEL PT32E_ROOT_LEVEL
 #endif
 
+#define PT64_PT_BITS 9
+#define PT64_ENT_PER_PAGE (1 << PT64_PT_BITS)
+#define PT32_PT_BITS 10
+#define PT32_ENT_PER_PAGE (1 << PT32_PT_BITS)
+
+#define PT_WRITABLE_SHIFT 1
+
+#define PT_PRESENT_MASK (1ULL << 0)
+#define PT_WRITABLE_MASK (1ULL << PT_WRITABLE_SHIFT)
+#define PT_USER_MASK (1ULL << 2)
+#define PT_PWT_MASK (1ULL << 3)
+#define PT_PCD_MASK (1ULL << 4)
+#define PT_ACCESSED_MASK (1ULL << 5)
+#define PT_DIRTY_MASK (1ULL << 6)
+#define PT_PAGE_SIZE_MASK (1ULL << 7)
+#define PT_PAT_MASK (1ULL << 7)
+#define PT_GLOBAL_MASK (1ULL << 8)
+#define PT64_NX_SHIFT 63
+#define PT64_NX_MASK (1ULL << PT64_NX_SHIFT)
+
+#define PT_PAT_SHIFT 7
+#define PT_DIR_PAT_SHIFT 12
+#define PT_DIR_PAT_MASK (1ULL << PT_DIR_PAT_SHIFT)
+
+#define PT32_DIR_PSE36_SIZE 4
+#define PT32_DIR_PSE36_SHIFT 13
+#define PT32_DIR_PSE36_MASK \
+	(((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
+
+#define PT64_ROOT_LEVEL 4
+#define PT32_ROOT_LEVEL 2
+#define PT32E_ROOT_LEVEL 3
+
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(vcpu->kvm->arch.n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
-- 
1.5.4.5
From: Yang, S. <she...@in...> - 2008-04-25 08:18:39
From cd0cf53ce955328f949893316b4717f051085f5a Mon Sep 17 00:00:00 2001
From: Sheng Yang <she...@in...>
Date: Fri, 25 Apr 2008 10:13:16 +0800
Subject: [PATCH 1/8] KVM: VMX: EPT Feature Detection

Signed-off-by: Sheng Yang <she...@in...>
---
 arch/x86/kvm/vmx.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++----
 arch/x86/kvm/vmx.h | 25 ++++++++++++++++++++
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8e5d664..d93250d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -42,6 +42,9 @@ module_param(enable_vpid, bool, 0);
 static int flexpriority_enabled = 1;
 module_param(flexpriority_enabled, bool, 0);

+static int enable_ept;
+module_param(enable_ept, bool, 0);
+
 struct vmcs {
 	u32 revision_id;
 	u32 abort;
@@ -107,6 +110,11 @@ static struct vmcs_config {
 	u32 vmentry_ctrl;
 } vmcs_config;

+struct vmx_capability {
+	u32 ept;
+	u32 vpid;
+} vmx_capability;
+
 #define VMX_SEGMENT_FIELD(seg)					\
 	[VCPU_SREG_##seg] = {					\
 		.selector = GUEST_##seg##_SELECTOR,		\
@@ -214,6 +222,32 @@ static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
 		SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
 }

+static inline int cpu_has_vmx_invept_individual_addr(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_INDIVIDUAL_BIT));
+}
+
+static inline int cpu_has_vmx_invept_context(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_CONTEXT_BIT));
+}
+
+static inline int cpu_has_vmx_invept_global(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_GLOBAL_BIT));
+}
+
+static inline int cpu_has_vmx_ept(void)
+{
+	return (vmcs_config.cpu_based_2nd_exec_ctrl &
+		SECONDARY_EXEC_ENABLE_EPT);
+}
+
+static inline int vm_need_ept(void)
+{
+	return (cpu_has_vmx_ept() && enable_ept);
+}
+
 static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
 {
 	return ((cpu_has_vmx_virtualize_apic_accesses()) &&
@@ -985,7 +1019,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 {
 	u32 vmx_msr_low, vmx_msr_high;
-	u32 min, opt;
+	u32 min, opt, min2, opt2;
 	u32 _pin_based_exec_control = 0;
 	u32 _cpu_based_exec_control = 0;
 	u32 _cpu_based_2nd_exec_control = 0;
@@ -1003,6 +1037,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	      CPU_BASED_CR8_LOAD_EXITING |
 	      CPU_BASED_CR8_STORE_EXITING |
 #endif
+	      CPU_BASED_CR3_LOAD_EXITING |
+	      CPU_BASED_CR3_STORE_EXITING |
 	      CPU_BASED_USE_IO_BITMAPS |
 	      CPU_BASED_MOV_DR_EXITING |
 	      CPU_BASED_USE_TSC_OFFSETING;
@@ -1018,11 +1054,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 					~CPU_BASED_CR8_STORE_EXITING;
 #endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
-		min = 0;
-		opt = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+		min2 = 0;
+		opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
 			SECONDARY_EXEC_WBINVD_EXITING |
-			SECONDARY_EXEC_ENABLE_VPID;
-		if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS2,
+			SECONDARY_EXEC_ENABLE_VPID |
+			SECONDARY_EXEC_ENABLE_EPT;
+		if (adjust_vmx_controls(min2, opt2,
+					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
 			return -EIO;
 	}
@@ -1031,6 +1069,16 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 				SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
 		_cpu_based_exec_control &= ~CPU_BASED_TPR_SHADOW;
 #endif
+	if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
+		/* CR3 accesses don't need to cause VM Exits when EPT enabled */
+		min &= ~(CPU_BASED_CR3_LOAD_EXITING |
+			 CPU_BASED_CR3_STORE_EXITING);
+		if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
+					&_cpu_based_exec_control) < 0)
+			return -EIO;
+		rdmsr(MSR_IA32_VMX_EPT_VPID_CAP,
+		      vmx_capability.ept, vmx_capability.vpid);
+	}

 	min = 0;
 #ifdef CONFIG_X86_64
@@ -1638,6 +1686,9 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 				CPU_BASED_CR8_LOAD_EXITING;
 #endif
 	}
+	if (!vm_need_ept())
+		exec_control |= CPU_BASED_CR3_STORE_EXITING |
+				CPU_BASED_CR3_LOAD_EXITING;
 	vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, exec_control);

 	if (cpu_has_secondary_exec_ctrls()) {
@@ -1647,6 +1698,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 			~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
 		if (vmx->vpid == 0)
 			exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
+		if (!vm_need_ept())
+			exec_control &= ~SECONDARY_EXEC_ENABLE_EPT;
 		vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_control);
 	}

diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 5dff460..5f7fdc9 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -35,6 +35,8 @@
 #define CPU_BASED_MWAIT_EXITING                 0x00000400
 #define CPU_BASED_RDPMC_EXITING                 0x00000800
 #define CPU_BASED_RDTSC_EXITING                 0x00001000
+#define CPU_BASED_CR3_LOAD_EXITING              0x00008000
+#define CPU_BASED_CR3_STORE_EXITING             0x00010000
 #define CPU_BASED_CR8_LOAD_EXITING              0x00080000
 #define CPU_BASED_CR8_STORE_EXITING             0x00100000
 #define CPU_BASED_TPR_SHADOW                    0x00200000
@@ -49,6 +51,7 @@
  * Definitions of Secondary Processor-Based VM-Execution Controls.
  */
 #define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
+#define SECONDARY_EXEC_ENABLE_EPT               0x00000002
 #define SECONDARY_EXEC_ENABLE_VPID              0x00000020
 #define SECONDARY_EXEC_WBINVD_EXITING           0x00000040

@@ -100,10 +103,22 @@ enum vmcs_field {
 	VIRTUAL_APIC_PAGE_ADDR_HIGH     = 0x00002013,
 	APIC_ACCESS_ADDR                = 0x00002014,
 	APIC_ACCESS_ADDR_HIGH           = 0x00002015,
+	EPT_POINTER                     = 0x0000201a,
+	EPT_POINTER_HIGH                = 0x0000201b,
+	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
+	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
 	VMCS_LINK_POINTER               = 0x00002800,
 	VMCS_LINK_POINTER_HIGH          = 0x00002801,
 	GUEST_IA32_DEBUGCTL             = 0x00002802,
 	GUEST_IA32_DEBUGCTL_HIGH        = 0x00002803,
+	GUEST_PDPTR0                    = 0x0000280a,
+	GUEST_PDPTR0_HIGH               = 0x0000280b,
+	GUEST_PDPTR1                    = 0x0000280c,
+	GUEST_PDPTR1_HIGH               = 0x0000280d,
+	GUEST_PDPTR2                    = 0x0000280e,
+	GUEST_PDPTR2_HIGH               = 0x0000280f,
+	GUEST_PDPTR3                    = 0x00002810,
+	GUEST_PDPTR3_HIGH               = 0x00002811,
 	PIN_BASED_VM_EXEC_CONTROL       = 0x00004000,
 	CPU_BASED_VM_EXEC_CONTROL       = 0x00004002,
 	EXCEPTION_BITMAP                = 0x00004004,
@@ -226,6 +241,8 @@ enum vmcs_field {
 #define EXIT_REASON_MWAIT_INSTRUCTION   36
 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43
 #define EXIT_REASON_APIC_ACCESS         44
+#define EXIT_REASON_EPT_VIOLATION       48
+#define EXIT_REASON_EPT_MISCONFIG       49
 #define EXIT_REASON_WBINVD              54

 /*
@@ -316,6 +333,7 @@ enum vmcs_field {
 #define MSR_IA32_VMX_CR4_FIXED1                 0x489
 #define MSR_IA32_VMX_VMCS_ENUM                  0x48a
 #define MSR_IA32_VMX_PROCBASED_CTLS2            0x48b
+#define MSR_IA32_VMX_EPT_VPID_CAP               0x48c

 #define MSR_IA32_FEATURE_CONTROL                0x3a
 #define MSR_IA32_FEATURE_CONTROL_LOCKED         0x1
@@ -327,4 +345,11 @@ enum vmcs_field {
 #define VMX_VPID_EXTENT_SINGLE_CONTEXT          1
 #define VMX_VPID_EXTENT_ALL_CONTEXT             2

+#define VMX_EPT_EXTENT_INDIVIDUAL_ADDR          0
+#define VMX_EPT_EXTENT_CONTEXT                  1
+#define VMX_EPT_EXTENT_GLOBAL                   2
+#define VMX_EPT_EXTENT_INDIVIDUAL_BIT           (1ull << 24)
+#define VMX_EPT_EXTENT_CONTEXT_BIT              (1ull << 25)
+#define VMX_EPT_EXTENT_GLOBAL_BIT               (1ull << 26)
+
 #endif
--
1.5.4.5
From: Yang, S. <she...@in...> - 2008-04-25 08:18:06
Hi, Avi

Here is the latest EPT patchset.

Changes from v3:
1. Build the identity-mapping pagetable from the kernel rather than
   userspace. The address was changed to 0xfffbc000ul.
2. Fix the S/R and LM problems.
3. EPT enabled on 32pae host now. The 2M page should also work, though
   testing is blocked by the current hugetlb bug.

--
Thanks
Yang, Sheng
From: Avi K. <av...@qu...> - 2008-04-25 07:31:49
Chris Lalancette wrote:
> Avi Kivity wrote:
>> Now it uses %rsi instead of %esi, and any junk in the upper bits will
>> cause the ja to be taken.
>>
>> We need to get a reduced testcase to the gcc folks, this is a serious
>> bug. Any changes in the code to work around this would be fragile.
>>
> Avi,
>      I've now filed a bug in the upstream gcc database:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36040
>
> And I came up with a reduced test case, available here:
>
> http://people.redhat.com/clalance/rsi-test-case.tar.bz2
>
> If I compile the code in the above and look at the disassembly, it
> shows the problem; however, I can't reproduce the bug by actually
> running the code. I suspect the %rsi register is always 0 when we start
> in this userland code, so I never run into the bogus ja, but I just
> thought I'd mention it.
>

Hmm, looking back at the dump:

    1811:  8d 86 00 00 ff 3f     lea    0x3fff0000(%rsi),%eax
    1817:  83 f8 03              cmp    $0x3,%eax
    181a:  0f 87 e2 01 00 00     ja     1a02 <svm_set_msr+0x27f>

So while gcc is using %rsi, it loads the result back into %eax, which has
the effect of dropping back into 32 bits.

So it looks like gcc was right here. Sorry for spreading confusion, and
apologies to gcc.

--
Do not meddle in the internals of kernels, for they are subtle and quick
to panic.