From: Avi K. <av...@qu...> - 2008-04-17 09:11:16
|
Third batch of the pending kvm updates. This one contains the s390 and ia64 ports. ppc will follow in part IV (and last). Note that a few arch preparation patches are expected to be merged through the arch trees; I will drop them before submitting. Documentation/ia64/kvm.txt | 82 ++ Documentation/s390/kvm.txt | 125 +++ MAINTAINERS | 10 + arch/ia64/Kconfig | 3 + arch/ia64/Makefile | 1 + arch/ia64/kernel/mca.c | 49 + arch/ia64/kernel/mca_asm.S | 5 + arch/ia64/kernel/smp.c | 82 ++ arch/ia64/kvm/Kconfig | 46 + arch/ia64/kvm/Makefile | 61 ++ arch/ia64/kvm/asm-offsets.c | 251 +++++ arch/ia64/kvm/kvm-ia64.c | 1789 +++++++++++++++++++++++++++++++++ arch/ia64/kvm/kvm_fw.c | 500 ++++++++++ arch/ia64/kvm/kvm_minstate.h | 273 +++++ arch/ia64/kvm/lapic.h | 25 + arch/ia64/kvm/misc.h | 93 ++ arch/ia64/kvm/mmio.c | 341 +++++++ arch/ia64/kvm/optvfault.S | 918 +++++++++++++++++ arch/ia64/kvm/process.c | 970 ++++++++++++++++++ arch/ia64/kvm/trampoline.S | 1038 +++++++++++++++++++ arch/ia64/kvm/vcpu.c | 2163 ++++++++++++++++++++++++++++++++++++++++ arch/ia64/kvm/vcpu.h | 740 ++++++++++++++ arch/ia64/kvm/vmm.c | 66 ++ arch/ia64/kvm/vmm_ivt.S | 1424 ++++++++++++++++++++++++++ arch/ia64/kvm/vti.h | 290 ++++++ arch/ia64/kvm/vtlb.c | 636 ++++++++++++ arch/ia64/mm/tlb.c | 198 ++++ arch/s390/Kconfig | 14 + arch/s390/Makefile | 2 +- arch/s390/kernel/early.c | 4 + arch/s390/kernel/setup.c | 14 +- arch/s390/kernel/vtime.c | 1 + arch/s390/kvm/Kconfig | 43 + arch/s390/kvm/Makefile | 14 + arch/s390/kvm/diag.c | 67 ++ arch/s390/kvm/gaccess.h | 274 +++++ arch/s390/kvm/intercept.c | 216 ++++ arch/s390/kvm/interrupt.c | 587 +++++++++++ arch/s390/kvm/kvm-s390.c | 673 +++++++++++++ arch/s390/kvm/kvm-s390.h | 64 ++ arch/s390/kvm/priv.c | 323 ++++++ arch/s390/kvm/sie64a.S | 47 + arch/s390/kvm/sigp.c | 288 ++++++ arch/s390/lib/uaccess_pt.c | 3 + arch/s390/mm/pgtable.c | 65 ++- arch/x86/kvm/mmu.c | 76 ++- arch/x86/kvm/paging_tmpl.h | 4 - arch/x86/kvm/vmx.c | 73 ++- arch/x86/kvm/x86.c | 55 +- drivers/s390/Makefile | 2 +- drivers/s390/kvm/Makefile | 9 + drivers/s390/kvm/kvm_virtio.c | 338 +++++++ drivers/s390/sysinfo.c | 100 +-- include/asm-ia64/gcc_intrin.h | 12 + include/asm-ia64/kregs.h | 3 + include/asm-ia64/kvm.h | 205 ++++- include/asm-ia64/kvm_host.h | 524 ++++++++++ include/asm-ia64/kvm_para.h | 29 + include/asm-ia64/processor.h | 63 ++ include/asm-ia64/smp.h | 3 + include/asm-ia64/tlb.h | 26 + include/asm-s390/Kbuild | 1 + include/asm-s390/kvm.h | 41 +- include/asm-s390/kvm_host.h | 234 +++++ include/asm-s390/kvm_para.h | 150 +++ include/asm-s390/kvm_virtio.h | 53 + include/asm-s390/lowcore.h | 15 +- include/asm-s390/mmu.h | 1 + include/asm-s390/mmu_context.h | 8 +- include/asm-s390/pgtable.h | 93 ++- include/asm-s390/setup.h | 1 + include/asm-s390/sysinfo.h | 111 ++ include/asm-x86/kvm_host.h | 8 +- include/linux/kvm.h | 49 + include/linux/kvm_host.h | 4 + include/linux/sched.h | 2 + kernel/fork.c | 2 +- mm/rmap.c | 7 +- virt/kvm/kvm_main.c | 17 +- 79 files changed, 17006 insertions(+), 191 deletions(-) create mode 100644 Documentation/ia64/kvm.txt create mode 100644 Documentation/s390/kvm.txt create mode 100644 arch/ia64/kvm/Kconfig create mode 100644 arch/ia64/kvm/Makefile create mode 100644 arch/ia64/kvm/asm-offsets.c create mode 100644 arch/ia64/kvm/kvm-ia64.c create mode 100644 arch/ia64/kvm/kvm_fw.c create mode 100644 arch/ia64/kvm/kvm_minstate.h create mode 100644 arch/ia64/kvm/lapic.h create mode 100644 arch/ia64/kvm/misc.h create mode 100644 arch/ia64/kvm/mmio.c create mode 100644 arch/ia64/kvm/optvfault.S create mode 100644 arch/ia64/kvm/process.c create mode 100644 arch/ia64/kvm/trampoline.S create mode 100644 arch/ia64/kvm/vcpu.c create mode 100644 arch/ia64/kvm/vcpu.h create mode 100644 arch/ia64/kvm/vmm.c create mode 100644 arch/ia64/kvm/vmm_ivt.S create mode 100644 arch/ia64/kvm/vti.h create mode 100644 arch/ia64/kvm/vtlb.c create mode 100644 arch/s390/kvm/Kconfig create mode 100644 arch/s390/kvm/Makefile create mode 100644 arch/s390/kvm/diag.c create mode 100644 arch/s390/kvm/gaccess.h create mode 100644 arch/s390/kvm/intercept.c create mode 100644 arch/s390/kvm/interrupt.c create mode 100644 arch/s390/kvm/kvm-s390.c create mode 100644 arch/s390/kvm/kvm-s390.h create mode 100644 arch/s390/kvm/priv.c create mode 100644 arch/s390/kvm/sie64a.S create mode 100644 arch/s390/kvm/sigp.c create mode 100644 drivers/s390/kvm/Makefile create mode 100644 drivers/s390/kvm/kvm_virtio.c create mode 100644 include/asm-ia64/kvm_host.h create mode 100644 include/asm-ia64/kvm_para.h create mode 100644 include/asm-s390/kvm_host.h create mode 100644 include/asm-s390/kvm_para.h create mode 100644 include/asm-s390/kvm_virtio.h create mode 100644 include/asm-s390/sysinfo.h |
From: Avi K. <av...@qu...> - 2008-04-17 09:10:52
|
From: Carsten Otte <co...@de...> Temporary commit, should appear in mainline shortly. Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/lib/uaccess_pt.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/s390/lib/uaccess_pt.c b/arch/s390/lib/uaccess_pt.c index 5efdfe9..d68b2ee 100644 --- a/arch/s390/lib/uaccess_pt.c +++ b/arch/s390/lib/uaccess_pt.c @@ -365,6 +365,9 @@ int futex_atomic_op_pt(int op, int __user *uaddr, int oparg, int *old) { int oldval = 0, newval, ret; + if (!current->mm) + return -EFAULT; + spin_lock(¤t->mm->page_table_lock); uaddr = (int __user *) __dat_user_addr((unsigned long) uaddr); if (!uaddr) { -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:10:54
|
From: Christian Borntraeger <bor...@de...> The address 0x11b8 is used by z/VM for pfault and diag 250 I/O to provide a 64 bit extint parameter. virtio uses the same address, so its time to update the lowcore structure. Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- include/asm-s390/lowcore.h | 15 ++++++++++----- 1 files changed, 10 insertions(+), 5 deletions(-) diff --git a/include/asm-s390/lowcore.h b/include/asm-s390/lowcore.h index 801a6fd..e0857c5 100644 --- a/include/asm-s390/lowcore.h +++ b/include/asm-s390/lowcore.h @@ -380,27 +380,32 @@ struct _lowcore /* whether the kernel died with panic() or not */ __u32 panic_magic; /* 0xe00 */ - __u8 pad13[0x1200-0xe04]; /* 0xe04 */ + __u8 pad13[0x11b8-0xe04]; /* 0xe04 */ + + /* 64 bit extparam used for pfault, diag 250 etc */ + __u64 ext_params2; /* 0x11B8 */ + + __u8 pad14[0x1200-0x11C0]; /* 0x11C0 */ /* System info area */ __u64 floating_pt_save_area[16]; /* 0x1200 */ __u64 gpregs_save_area[16]; /* 0x1280 */ __u32 st_status_fixed_logout[4]; /* 0x1300 */ - __u8 pad14[0x1318-0x1310]; /* 0x1310 */ + __u8 pad15[0x1318-0x1310]; /* 0x1310 */ __u32 prefixreg_save_area; /* 0x1318 */ __u32 fpt_creg_save_area; /* 0x131c */ - __u8 pad15[0x1324-0x1320]; /* 0x1320 */ + __u8 pad16[0x1324-0x1320]; /* 0x1320 */ __u32 tod_progreg_save_area; /* 0x1324 */ __u32 cpu_timer_save_area[2]; /* 0x1328 */ __u32 clock_comp_save_area[2]; /* 0x1330 */ - __u8 pad16[0x1340-0x1338]; /* 0x1338 */ + __u8 pad17[0x1340-0x1338]; /* 0x1338 */ __u32 access_regs_save_area[16]; /* 0x1340 */ __u64 cregs_save_area[16]; /* 0x1380 */ /* align to the top of the prefix area */ - __u8 pad17[0x2000-0x1400]; /* 0x1400 */ + __u8 pad18[0x2000-0x1400]; /* 0x1400 */ #endif /* !__s390x__ */ } __attribute__((packed)); /* End structure*/ -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:10:54
|
From: Heiko Carstens <hei...@de...> From: Christian Borntraeger <bor...@de...> This patch changes the s390 memory management defintions to use the pgste field for dirty and reference bit tracking of host and guest code. Usually on s390, dirty and referenced are tracked in storage keys, which belong to the physical page. This changes with virtualization: The guest and host dirty/reference bits are defined to be the logical OR of the values for the mapping and the physical page. This patch implements the necessary changes in pgtable.h for s390. There is a common code change in mm/rmap.c, the call to page_test_and_clear_young must be moved. This is a no-op for all architecture but s390. page_referenced checks the referenced bits for the physiscal page and for all mappings: o The physical page is checked with page_test_and_clear_young. o The mappings are checked with ptep_test_and_clear_young and friends. Without pgstes (the current implementation on Linux s390) the physical page check is implemented but the mapping callbacks are no-ops because dirty and referenced are not tracked in the s390 page tables. The pgstes introduces guest and host dirty and reference bits for s390 in the host mapping. These mapping must be checked before page_test_and_clear_young resets the reference bit. Signed-off-by: Heiko Carstens <hei...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Acked-by: Martin Schwidefsky <sch...@de...> Acked-by: Andrew Morton <ak...@li...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- include/asm-s390/pgtable.h | 92 ++++++++++++++++++++++++++++++++++++++++++- mm/rmap.c | 7 ++- 2 files changed, 93 insertions(+), 6 deletions(-) diff --git a/include/asm-s390/pgtable.h b/include/asm-s390/pgtable.h index 8e9a629..7fe5c4b 100644 --- a/include/asm-s390/pgtable.h +++ b/include/asm-s390/pgtable.h @@ -30,6 +30,7 @@ */ #ifndef __ASSEMBLY__ #include <linux/mm_types.h> +#include <asm/bitops.h> #include <asm/bug.h> #include <asm/processor.h> @@ -258,6 +259,13 @@ extern char empty_zero_page[PAGE_SIZE]; * swap pte is 1011 and 0001, 0011, 0101, 0111 are invalid. */ +/* Page status table bits for virtualization */ +#define RCP_PCL_BIT 55 +#define RCP_HR_BIT 54 +#define RCP_HC_BIT 53 +#define RCP_GR_BIT 50 +#define RCP_GC_BIT 49 + #ifndef __s390x__ /* Bits in the segment table address-space-control-element */ @@ -513,6 +521,48 @@ static inline int pte_file(pte_t pte) #define __HAVE_ARCH_PTE_SAME #define pte_same(a,b) (pte_val(a) == pte_val(b)) +static inline void rcp_lock(pte_t *ptep) +{ +#ifdef CONFIG_PGSTE + unsigned long *pgste = (unsigned long *) (ptep + PTRS_PER_PTE); + preempt_disable(); + while (test_and_set_bit(RCP_PCL_BIT, pgste)) + ; +#endif +} + +static inline void rcp_unlock(pte_t *ptep) +{ +#ifdef CONFIG_PGSTE + unsigned long *pgste = (unsigned long *) (ptep + PTRS_PER_PTE); + clear_bit(RCP_PCL_BIT, pgste); + preempt_enable(); +#endif +} + +/* forward declaration for SetPageUptodate in page-flags.h*/ +static inline void page_clear_dirty(struct page *page); +#include <linux/page-flags.h> + +static inline void ptep_rcp_copy(pte_t *ptep) +{ +#ifdef CONFIG_PGSTE + struct page *page = virt_to_page(pte_val(*ptep)); + unsigned int skey; + unsigned long *pgste = (unsigned long *) (ptep + PTRS_PER_PTE); + + skey = page_get_storage_key(page_to_phys(page)); + if (skey & _PAGE_CHANGED) + set_bit(RCP_GC_BIT, pgste); + if (skey & _PAGE_REFERENCED) + set_bit(RCP_GR_BIT, pgste); + if (test_and_clear_bit(RCP_HC_BIT, pgste)) + SetPageDirty(page); + if (test_and_clear_bit(RCP_HR_BIT, pgste)) + SetPageReferenced(page); +#endif +} + /* * query functions pte_write/pte_dirty/pte_young only work if * pte_present() is true. Undefined behaviour if not.. @@ -599,6 +649,8 @@ static inline void pmd_clear(pmd_t *pmd) static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { + if (mm->context.pgstes) + ptep_rcp_copy(ptep); pte_val(*ptep) = _PAGE_TYPE_EMPTY; if (mm->context.noexec) pte_val(ptep[PTRS_PER_PTE]) = _PAGE_TYPE_EMPTY; @@ -667,6 +719,24 @@ static inline pte_t pte_mkyoung(pte_t pte) static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { +#ifdef CONFIG_PGSTE + unsigned long physpage; + int young; + unsigned long *pgste; + + if (!vma->vm_mm->context.pgstes) + return 0; + physpage = pte_val(*ptep) & PAGE_MASK; + pgste = (unsigned long *) (ptep + PTRS_PER_PTE); + + young = ((page_get_storage_key(physpage) & _PAGE_REFERENCED) != 0); + rcp_lock(ptep); + if (young) + set_bit(RCP_GR_BIT, pgste); + young |= test_and_clear_bit(RCP_HR_BIT, pgste); + rcp_unlock(ptep); + return young; +#endif return 0; } @@ -674,7 +744,13 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, static inline int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { - /* No need to flush TLB; bits are in storage key */ + /* No need to flush TLB + * On s390 reference bits are in storage key and never in TLB + * With virtualization we handle the reference bit, without we + * we can simply return */ +#ifdef CONFIG_PGSTE + return ptep_test_and_clear_young(vma, address, ptep); +#endif return 0; } @@ -693,15 +769,25 @@ static inline void __ptep_ipte(unsigned long address, pte_t *ptep) : "=m" (*ptep) : "m" (*ptep), "a" (pto), "a" (address)); } - pte_val(*ptep) = _PAGE_TYPE_EMPTY; } static inline void ptep_invalidate(struct mm_struct *mm, unsigned long address, pte_t *ptep) { + if (mm->context.pgstes) { + rcp_lock(ptep); + __ptep_ipte(address, ptep); + ptep_rcp_copy(ptep); + pte_val(*ptep) = _PAGE_TYPE_EMPTY; + rcp_unlock(ptep); + return; + } __ptep_ipte(address, ptep); - if (mm->context.noexec) + pte_val(*ptep) = _PAGE_TYPE_EMPTY; + if (mm->context.noexec) { __ptep_ipte(address, ptep + PTRS_PER_PTE); + pte_val(*(ptep + PTRS_PER_PTE)) = _PAGE_TYPE_EMPTY; + } } /* diff --git a/mm/rmap.c b/mm/rmap.c index 997f069..e9bb6b1 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -413,9 +413,6 @@ int page_referenced(struct page *page, int is_locked, { int referenced = 0; - if (page_test_and_clear_young(page)) - referenced++; - if (TestClearPageReferenced(page)) referenced++; @@ -433,6 +430,10 @@ int page_referenced(struct page *page, int is_locked, unlock_page(page); } } + + if (page_test_and_clear_young(page)) + referenced++; + return referenced; } -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:10:54
|
From: Martin Schwidefsky <sch...@de...> From: Carsten Otte <co...@de...> The SIE instruction on s390 uses the 2nd half of the page table page to virtualize the storage keys of a guest. This patch offers the s390_enable_sie function, which reorganizes the page tables of a single-threaded process to reserve space in the page table: s390_enable_sie makes sure that the process is single threaded and then uses dup_mm to create a new mm with reorganized page tables. The old mm is freed and the process has now a page status extended field after every page table. Code that wants to exploit pgstes should SELECT CONFIG_PGSTE. This patch has a small common code hit, namely making dup_mm non-static. Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's review feedback. Now we do have the prototype for dup_mm in include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now call task_lock() to prevent race against ptrace modification of mm_users. Signed-off-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Carsten Otte <co...@de...> Acked-by: Andrew Morton <ak...@li...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/Kconfig | 4 ++ arch/s390/kernel/setup.c | 4 ++ arch/s390/mm/pgtable.c | 65 ++++++++++++++++++++++++++++++++++++++-- include/asm-s390/mmu.h | 1 + include/asm-s390/mmu_context.h | 8 ++++- include/asm-s390/pgtable.h | 1 + include/linux/sched.h | 2 + kernel/fork.c | 2 +- 8 files changed, 82 insertions(+), 5 deletions(-) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 1831833..3a6bfef 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -55,6 +55,10 @@ config GENERIC_LOCKBREAK default y depends on SMP && PREEMPT +config PGSTE + bool + default y if KVM + mainmenu "Linux Kernel Configuration" config S390 diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 290e504..526c135 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -315,7 +315,11 @@ static int __init early_parse_ipldelay(char *p) early_param("ipldelay", early_parse_ipldelay); #ifdef CONFIG_S390_SWITCH_AMODE +#ifdef CONFIG_PGSTE +unsigned int switch_amode = 1; +#else unsigned int switch_amode = 0; +#endif EXPORT_SYMBOL_GPL(switch_amode); static void set_amode_and_uaccess(unsigned long user_amode, diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index fd07201..5c1aea9 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -30,11 +30,27 @@ #define TABLES_PER_PAGE 4 #define FRAG_MASK 15UL #define SECOND_HALVES 10UL + +void clear_table_pgstes(unsigned long *table) +{ + clear_table(table, _PAGE_TYPE_EMPTY, PAGE_SIZE/4); + memset(table + 256, 0, PAGE_SIZE/4); + clear_table(table + 512, _PAGE_TYPE_EMPTY, PAGE_SIZE/4); + memset(table + 768, 0, PAGE_SIZE/4); +} + #else #define ALLOC_ORDER 2 #define TABLES_PER_PAGE 2 #define FRAG_MASK 3UL #define SECOND_HALVES 2UL + +void clear_table_pgstes(unsigned long *table) +{ + clear_table(table, _PAGE_TYPE_EMPTY, PAGE_SIZE/2); + memset(table + 256, 0, PAGE_SIZE/2); +} + #endif unsigned long *crst_table_alloc(struct mm_struct *mm, int noexec) @@ -153,7 +169,7 @@ unsigned long *page_table_alloc(struct mm_struct *mm) unsigned long *table; unsigned long bits; - bits = mm->context.noexec ? 3UL : 1UL; + bits = (mm->context.noexec || mm->context.pgstes) ? 3UL : 1UL; spin_lock(&mm->page_table_lock); page = NULL; if (!list_empty(&mm->context.pgtable_list)) { @@ -170,7 +186,10 @@ unsigned long *page_table_alloc(struct mm_struct *mm) pgtable_page_ctor(page); page->flags &= ~FRAG_MASK; table = (unsigned long *) page_to_phys(page); - clear_table(table, _PAGE_TYPE_EMPTY, PAGE_SIZE); + if (mm->context.pgstes) + clear_table_pgstes(table); + else + clear_table(table, _PAGE_TYPE_EMPTY, PAGE_SIZE); spin_lock(&mm->page_table_lock); list_add(&page->lru, &mm->context.pgtable_list); } @@ -191,7 +210,7 @@ void page_table_free(struct mm_struct *mm, unsigned long *table) struct page *page; unsigned long bits; - bits = mm->context.noexec ? 3UL : 1UL; + bits = (mm->context.noexec || mm->context.pgstes) ? 3UL : 1UL; bits <<= (__pa(table) & (PAGE_SIZE - 1)) / 256 / sizeof(unsigned long); page = pfn_to_page(__pa(table) >> PAGE_SHIFT); spin_lock(&mm->page_table_lock); @@ -228,3 +247,43 @@ void disable_noexec(struct mm_struct *mm, struct task_struct *tsk) mm->context.noexec = 0; update_mm(mm, tsk); } + +/* + * switch on pgstes for its userspace process (for kvm) + */ +int s390_enable_sie(void) +{ + struct task_struct *tsk = current; + struct mm_struct *mm; + int rc; + + task_lock(tsk); + + rc = 0; + if (tsk->mm->context.pgstes) + goto unlock; + + rc = -EINVAL; + if (!tsk->mm || atomic_read(&tsk->mm->mm_users) > 1 || + tsk->mm != tsk->active_mm || tsk->mm->ioctx_list) + goto unlock; + + tsk->mm->context.pgstes = 1; /* dirty little tricks .. */ + mm = dup_mm(tsk); + tsk->mm->context.pgstes = 0; + + rc = -ENOMEM; + if (!mm) + goto unlock; + mmput(tsk->mm); + tsk->mm = tsk->active_mm = mm; + preempt_disable(); + update_mm(mm, tsk); + cpu_set(smp_processor_id(), mm->cpu_vm_mask); + preempt_enable(); + rc = 0; +unlock: + task_unlock(tsk); + return rc; +} +EXPORT_SYMBOL_GPL(s390_enable_sie); diff --git a/include/asm-s390/mmu.h b/include/asm-s390/mmu.h index 1698e29..5dd5e7b 100644 --- a/include/asm-s390/mmu.h +++ b/include/asm-s390/mmu.h @@ -7,6 +7,7 @@ typedef struct { unsigned long asce_bits; unsigned long asce_limit; int noexec; + int pgstes; } mm_context_t; #endif diff --git a/include/asm-s390/mmu_context.h b/include/asm-s390/mmu_context.h index b5a34c6..4c2fbf4 100644 --- a/include/asm-s390/mmu_context.h +++ b/include/asm-s390/mmu_context.h @@ -20,7 +20,13 @@ static inline int init_new_context(struct task_struct *tsk, #ifdef CONFIG_64BIT mm->context.asce_bits |= _ASCE_TYPE_REGION3; #endif - mm->context.noexec = s390_noexec; + if (current->mm->context.pgstes) { + mm->context.noexec = 0; + mm->context.pgstes = 1; + } else { + mm->context.noexec = s390_noexec; + mm->context.pgstes = 0; + } mm->context.asce_limit = STACK_TOP_MAX; crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm)); return 0; diff --git a/include/asm-s390/pgtable.h b/include/asm-s390/pgtable.h index 65154dc..8e9a629 100644 --- a/include/asm-s390/pgtable.h +++ b/include/asm-s390/pgtable.h @@ -966,6 +966,7 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset) extern int add_shared_memory(unsigned long start, unsigned long size); extern int remove_shared_memory(unsigned long start, unsigned long size); +extern int s390_enable_sie(void); /* * No page table caches to initialise diff --git a/include/linux/sched.h b/include/linux/sched.h index 6a1e7af..cb67356 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1765,6 +1765,8 @@ extern void mmput(struct mm_struct *); extern struct mm_struct *get_task_mm(struct task_struct *task); /* Remove the current tasks stale references to the old mm_struct */ extern void mm_release(struct task_struct *, struct mm_struct *); +/* Allocate a new mm structure and copy contents from tsk->mm */ +extern struct mm_struct *dup_mm(struct task_struct *tsk); extern int copy_thread(int, unsigned long, unsigned long, unsigned long, struct task_struct *, struct pt_regs *); extern void flush_thread(void); diff --git a/kernel/fork.c b/kernel/fork.c index 9c042f9..4ca580a 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -498,7 +498,7 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm) * Allocate a new mm structure and copy contents from the * mm structure of the passed in task structure. */ -static struct mm_struct *dup_mm(struct task_struct *tsk) +struct mm_struct *dup_mm(struct task_struct *tsk) { struct mm_struct *mm, *oldmm = current->mm; int err; -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:03
|
From: Carsten Otte <co...@de...> This patch adds Documentation/s390/kvm.txt, which describes specifics of kvm's user interface that are unique to s390 architecture. Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- Documentation/s390/kvm.txt | 125 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 125 insertions(+), 0 deletions(-) create mode 100644 Documentation/s390/kvm.txt diff --git a/Documentation/s390/kvm.txt b/Documentation/s390/kvm.txt new file mode 100644 index 0000000..6f5ceb0 --- /dev/null +++ b/Documentation/s390/kvm.txt @@ -0,0 +1,125 @@ +*** BIG FAT WARNING *** +The kvm module is currently in EXPERIMENTAL state for s390. This means that +the interface to the module is not yet considered to remain stable. Thus, be +prepared that we keep breaking your userspace application and guest +compatibility over and over again until we feel happy with the result. Make sure +your guest kernel, your host kernel, and your userspace launcher are in a +consistent state. + +This Documentation describes the unique ioctl calls to /dev/kvm, the resulting +kvm-vm file descriptors, and the kvm-vcpu file descriptors that differ from x86. + +1. ioctl calls to /dev/kvm +KVM does support the following ioctls on s390 that are common with other +architectures and do behave the same: +KVM_GET_API_VERSION +KVM_CREATE_VM (*) see note +KVM_CHECK_EXTENSION +KVM_GET_VCPU_MMAP_SIZE + +Notes: +* KVM_CREATE_VM may fail on s390, if the calling process has multiple +threads and has not called KVM_S390_ENABLE_SIE before. + +In addition, on s390 the following architecture specific ioctls are supported: +ioctl: KVM_S390_ENABLE_SIE +args: none +see also: include/linux/kvm.h +This call causes the kernel to switch on PGSTE in the user page table. This +operation is needed in order to run a virtual machine, and it requires the +calling process to be single-threaded. Note that the first call to KVM_CREATE_VM +will implicitly try to switch on PGSTE if the user process has not called +KVM_S390_ENABLE_SIE before. User processes that want to launch multiple threads +before creating a virtual machine have to call KVM_S390_ENABLE_SIE, or will +observe an error calling KVM_CREATE_VM. Switching on PGSTE is a one-time +operation, is not reversible, and will persist over the entire lifetime of +the calling process. It does not have any user-visible effect other than a small +performance penalty. + +2. ioctl calls to the kvm-vm file descriptor +KVM does support the following ioctls on s390 that are common with other +architectures and do behave the same: +KVM_CREATE_VCPU +KVM_SET_USER_MEMORY_REGION (*) see note +KVM_GET_DIRTY_LOG (**) see note + +Notes: +* kvm does only allow exactly one memory slot on s390, which has to start + at guest absolute address zero and at a user address that is aligned on any + page boundary. This hardware "limitation" allows us to have a few unique + optimizations. The memory slot doesn't have to be filled + with memory actually, it may contain sparse holes. That said, with different + user memory layout this does still allow a large flexibility when + doing the guest memory setup. +** KVM_GET_DIRTY_LOG doesn't work properly yet. The user will receive an empty +log. This ioctl call is only needed for guest migration, and we intend to +implement this one in the future. + +In addition, on s390 the following architecture specific ioctls for the kvm-vm +file descriptor are supported: +ioctl: KVM_S390_INTERRUPT +args: struct kvm_s390_interrupt * +see also: include/linux/kvm.h +This ioctl is used to submit a floating interrupt for a virtual machine. +Floating interrupts may be delivered to any virtual cpu in the configuration. +Only some interrupt types defined in include/linux/kvm.h make sense when +submitted as floating interrupts. The following interrupts are not considered +to be useful as floating interrupts, and a call to inject them will result in +-EINVAL error code: program interrupts and interprocessor signals. Valid +floating interrupts are: +KVM_S390_INT_VIRTIO +KVM_S390_INT_SERVICE + +3. ioctl calls to the kvm-vcpu file descriptor +KVM does support the following ioctls on s390 that are common with other +architectures and do behave the same: +KVM_RUN +KVM_GET_REGS +KVM_SET_REGS +KVM_GET_SREGS +KVM_SET_SREGS +KVM_GET_FPU +KVM_SET_FPU + +In addition, on s390 the following architecture specific ioctls for the +kvm-vcpu file descriptor are supported: +ioctl: KVM_S390_INTERRUPT +args: struct kvm_s390_interrupt * +see also: include/linux/kvm.h +This ioctl is used to submit an interrupt for a specific virtual cpu. +Only some interrupt types defined in include/linux/kvm.h make sense when +submitted for a specific cpu. The following interrupts are not considered +to be useful, and a call to inject them will result in -EINVAL error code: +service processor calls and virtio interrupts. Valid interrupt types are: +KVM_S390_PROGRAM_INT +KVM_S390_SIGP_STOP +KVM_S390_RESTART +KVM_S390_SIGP_SET_PREFIX +KVM_S390_INT_EMERGENCY + +ioctl: KVM_S390_STORE_STATUS +args: unsigned long +see also: include/linux/kvm.h +This ioctl stores the state of the cpu at the guest real address given as +argument, unless one of the following values defined in include/linux/kvm.h +is given as arguement: +KVM_S390_STORE_STATUS_NOADDR - the CPU stores its status to the save area in +absolute lowcore as defined by the principles of operation +KVM_S390_STORE_STATUS_PREFIXED - the CPU stores its status to the save area in +its prefix page just like the dump tool that comes with zipl. This is useful +to create a system dump for use with lkcdutils or crash. + +ioctl: KVM_S390_SET_INITIAL_PSW +args: struct kvm_s390_psw * +see also: include/linux/kvm.h +This ioctl can be used to set the processor status word (psw) of a stopped cpu +prior to running it with KVM_RUN. Note that this call is not required to modify +the psw during sie intercepts that fall back to userspace because struct kvm_run +does contain the psw, and this value is evaluated during reentry of KVM_RUN +after the intercept exit was recognized. + +ioctl: KVM_S390_INITIAL_RESET +args: none +see also: include/linux/kvm.h +This ioctl can be used to perform an initial cpu reset as defined by the +principles of operation. The target cpu has to be in stopped state. -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:04
|
From: Carsten Otte <co...@de...> From: Christian Borntraeger <bor...@de...> This patch adds the virtualization submenu and the kvm option to the kernel config. It also defines HAVE_KVM for 64bit kernels. Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/Kconfig | 3 +++ arch/s390/kvm/Kconfig | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+), 0 deletions(-) create mode 100644 arch/s390/kvm/Kconfig diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 3a6bfef..c43b250 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -66,6 +66,7 @@ config S390 select HAVE_OPROFILE select HAVE_KPROBES select HAVE_KRETPROBES + select HAVE_KVM if 64BIT source "init/Kconfig" @@ -553,3 +554,5 @@ source "security/Kconfig" source "crypto/Kconfig" source "lib/Kconfig" + +source "arch/s390/kvm/Kconfig" diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig new file mode 100644 index 0000000..2489b34 --- /dev/null +++ b/arch/s390/kvm/Kconfig @@ -0,0 +1,43 @@ +# +# KVM configuration +# +config HAVE_KVM + bool + +menuconfig VIRTUALIZATION + bool "Virtualization" + default y + ---help--- + Say Y here to get to see options for using your Linux host to run other + operating systems inside virtual machines (guests). + This option alone does not add any kernel code. + + If you say N, all options in this submenu will be skipped and disabled. + +if VIRTUALIZATION + +config KVM + tristate "Kernel-based Virtual Machine (KVM) support" + depends on HAVE_KVM && EXPERIMENTAL + select PREEMPT_NOTIFIERS + select ANON_INODES + select S390_SWITCH_AMODE + select PREEMPT + ---help--- + Support hosting paravirtualized guest machines using the SIE + virtualization capability on the mainframe. This should work + on any 64bit machine. + + This module provides access to the hardware capabilities through + a character device node named /dev/kvm. + + To compile this as a module, choose M here: the module + will be called kvm. + + If unsure, say N. + +# OK, it's a little counter-intuitive to do this, but it puts it neatly under +# the virtualization menu. +source drivers/virtio/Kconfig + +endif # VIRTUALIZATION -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:03
|
From: Carsten Otte <co...@de...> From: Christian Borntraeger <bor...@de...> This patch introduces in-kernel handling of some intercepts for privileged instructions: handle_set_prefix() sets the prefix register of the local cpu handle_store_prefix() stores the content of the prefix register to memory handle_store_cpu_address() stores the cpu number of the current cpu to memory handle_skey() just decrements the instruction address and retries handle_stsch() delivers condition code 3 "operation not supported" handle_chsc() same here handle_stfl() stores the facility list which contains the capabilities of the cpu handle_stidp() stores cpu type/model/revision and such handle_stsi() stores information about the system topology Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Heiko Carstens <hei...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/kvm/Makefile | 2 +- arch/s390/kvm/intercept.c | 1 + arch/s390/kvm/kvm-s390.c | 11 ++ arch/s390/kvm/kvm-s390.h | 3 + arch/s390/kvm/priv.c | 323 +++++++++++++++++++++++++++++++++++++++++++ include/asm-s390/kvm_host.h | 13 ++ 6 files changed, 352 insertions(+), 1 deletions(-) create mode 100644 arch/s390/kvm/priv.c diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile index 7275a1a..82dde1e 100644 --- a/arch/s390/kvm/Makefile +++ b/arch/s390/kvm/Makefile @@ -10,5 +10,5 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o) EXTRA_CFLAGS += -Ivirt/kvm -Iarch/s390/kvm -kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o +kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o priv.o obj-$(CONFIG_KVM) += kvm.o diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c index 7f7347b..7a20d63 100644 --- a/arch/s390/kvm/intercept.c +++ b/arch/s390/kvm/intercept.c @@ -95,6 +95,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu) } static intercept_handler_t instruction_handlers[256] = { + [0xb2] = kvm_s390_handle_priv, [0xb7] = handle_lctl, [0xeb] = handle_lctg, }; diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 5e3473c..5a17176 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -48,6 +48,15 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { { "deliver_restart_signal", VCPU_STAT(deliver_restart_signal) }, { "deliver_program_interruption", VCPU_STAT(deliver_program_int) }, { "exit_wait_state", VCPU_STAT(exit_wait_state) }, + { "instruction_stidp", VCPU_STAT(instruction_stidp) }, + { "instruction_spx", VCPU_STAT(instruction_spx) }, + { "instruction_stpx", VCPU_STAT(instruction_stpx) }, + { "instruction_stap", VCPU_STAT(instruction_stap) }, + { "instruction_storage_key", VCPU_STAT(instruction_storage_key) }, + { "instruction_stsch", VCPU_STAT(instruction_stsch) }, + { "instruction_chsc", VCPU_STAT(instruction_chsc) }, + { "instruction_stsi", VCPU_STAT(instruction_stsi) }, + { "instruction_stfl", VCPU_STAT(instruction_stfl) }, { NULL } }; @@ -246,6 +255,8 @@ int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu) vcpu->arch.sie_block->eca = 0xC1002001U; setup_timer(&vcpu->arch.ckc_timer, kvm_s390_idle_wakeup, (unsigned long) vcpu); + get_cpu_id(&vcpu->arch.cpu_id); + vcpu->arch.cpu_id.version = 0xfe; return 0; } diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 8df745b..50f96b3 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -48,4 +48,7 @@ int kvm_s390_inject_vm(struct kvm *kvm, int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu, struct kvm_s390_interrupt *s390int); int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code); + +/* implemented in priv.c */ +int kvm_s390_handle_priv(struct kvm_vcpu *vcpu); #endif diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c new file mode 100644 index 0000000..c97e904 --- /dev/null +++ b/arch/s390/kvm/priv.c @@ -0,0 +1,323 @@ +/* + * priv.c - handling privileged instructions + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Carsten Otte <co...@de...> + * Christian Borntraeger <bor...@de...> + */ + +#include <linux/kvm.h> +#include <linux/errno.h> +#include <asm/current.h> +#include <asm/debug.h> +#include <asm/ebcdic.h> +#include <asm/sysinfo.h> +#include "gaccess.h" +#include "kvm-s390.h" + +static int handle_set_prefix(struct kvm_vcpu *vcpu) +{ + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u64 operand2; + u32 address = 0; + u8 tmp; + + vcpu->stat.instruction_spx++; + + operand2 = disp2; + if (base2) + operand2 += vcpu->arch.guest_gprs[base2]; + + /* must be word boundary */ + if (operand2 & 3) { + kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); + goto out; + } + + /* get the value */ + if (get_guest_u32(vcpu, operand2, &address)) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out; + } + + address = address & 0x7fffe000u; + + /* make sure that the new value is valid memory */ + if (copy_from_guest_absolute(vcpu, &tmp, address, 1) || + (copy_from_guest_absolute(vcpu, &tmp, address + PAGE_SIZE, 1))) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out; + } + + vcpu->arch.sie_block->prefix = address; + vcpu->arch.sie_block->ihcpu = 0xffff; + + VCPU_EVENT(vcpu, 5, "setting prefix to %x", address); +out: + return 0; +} + +static int handle_store_prefix(struct kvm_vcpu *vcpu) +{ + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u64 operand2; + u32 address; + + vcpu->stat.instruction_stpx++; + operand2 = disp2; + if (base2) + operand2 += vcpu->arch.guest_gprs[base2]; + + /* must be word boundary */ + if (operand2 & 3) { + kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); + goto out; + } + + address = vcpu->arch.sie_block->prefix; + address = address & 0x7fffe000u; + + /* get the value */ + if (put_guest_u32(vcpu, operand2, address)) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out; + } + + VCPU_EVENT(vcpu, 5, "storing prefix to %x", address); +out: + return 0; +} + +static int handle_store_cpu_address(struct kvm_vcpu *vcpu) +{ + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u64 useraddr; + int rc; + + vcpu->stat.instruction_stap++; + useraddr = disp2; + if (base2) + useraddr += vcpu->arch.guest_gprs[base2]; + + if (useraddr & 1) { + kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); + goto out; + } + + rc = put_guest_u16(vcpu, useraddr, vcpu->vcpu_id); + if (rc == -EFAULT) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out; + } + + VCPU_EVENT(vcpu, 5, "storing cpu address to %lx", useraddr); +out: + return 0; +} + +static int handle_skey(struct kvm_vcpu *vcpu) +{ + vcpu->stat.instruction_storage_key++; + vcpu->arch.sie_block->gpsw.addr -= 4; + VCPU_EVENT(vcpu, 4, "%s", "retrying storage key operation"); + return 0; +} + +static int handle_stsch(struct kvm_vcpu *vcpu) +{ + vcpu->stat.instruction_stsch++; + VCPU_EVENT(vcpu, 4, "%s", "store subchannel - CC3"); + /* condition code 3 */ + vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44); + vcpu->arch.sie_block->gpsw.mask |= (3 & 3ul) << 44; + return 0; +} + +static int handle_chsc(struct kvm_vcpu *vcpu) +{ + vcpu->stat.instruction_chsc++; + VCPU_EVENT(vcpu, 4, "%s", "channel subsystem call - CC3"); + /* condition code 3 */ + vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44); + vcpu->arch.sie_block->gpsw.mask |= (3 & 3ul) << 44; + return 0; +} + +static unsigned int stfl(void) +{ + asm volatile( + " .insn s,0xb2b10000,0(0)\n" /* stfl */ + "0:\n" + EX_TABLE(0b, 0b)); + return S390_lowcore.stfl_fac_list; +} + +static int handle_stfl(struct kvm_vcpu *vcpu) +{ + unsigned int facility_list = stfl(); + int rc; + + vcpu->stat.instruction_stfl++; + facility_list &= ~(1UL<<24); /* no stfle */ + + rc = copy_to_guest(vcpu, offsetof(struct _lowcore, stfl_fac_list), + &facility_list, sizeof(facility_list)); + if (rc == -EFAULT) + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + else + VCPU_EVENT(vcpu, 5, "store facility list value %x", + facility_list); + return 0; +} + +static int handle_stidp(struct kvm_vcpu *vcpu) +{ + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u64 operand2; + int rc; + + vcpu->stat.instruction_stidp++; + operand2 = disp2; + if (base2) + operand2 += vcpu->arch.guest_gprs[base2]; + + if (operand2 & 7) { + kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); + goto out; + } + + rc = put_guest_u64(vcpu, operand2, vcpu->arch.stidp_data); + if (rc == -EFAULT) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out; + } + + VCPU_EVENT(vcpu, 5, "%s", "store cpu id"); +out: + return 0; +} + +static void handle_stsi_3_2_2(struct kvm_vcpu *vcpu, struct sysinfo_3_2_2 *mem) +{ + struct float_interrupt *fi = &vcpu->kvm->arch.float_int; + int cpus = 0; + int n; + + spin_lock_bh(&fi->lock); + for (n = 0; n < KVM_MAX_VCPUS; n++) + if (fi->local_int[n]) + cpus++; + spin_unlock_bh(&fi->lock); + + /* deal with other level 3 hypervisors */ + if (stsi(mem, 3, 2, 2) == -ENOSYS) + mem->count = 0; + if (mem->count < 8) + mem->count++; + for (n = mem->count - 1; n > 0 ; n--) + memcpy(&mem->vm[n], &mem->vm[n - 1], sizeof(mem->vm[0])); + + mem->vm[0].cpus_total = cpus; + mem->vm[0].cpus_configured = cpus; + mem->vm[0].cpus_standby = 0; + mem->vm[0].cpus_reserved = 0; + mem->vm[0].caf = 1000; + memcpy(mem->vm[0].name, "KVMguest", 8); + ASCEBC(mem->vm[0].name, 8); + memcpy(mem->vm[0].cpi, "KVM/Linux ", 16); + ASCEBC(mem->vm[0].cpi, 16); +} + +static int handle_stsi(struct kvm_vcpu *vcpu) +{ + int fc = (vcpu->arch.guest_gprs[0] & 0xf0000000) >> 28; + int sel1 = vcpu->arch.guest_gprs[0] & 0xff; + int sel2 = vcpu->arch.guest_gprs[1] & 0xffff; + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u64 operand2; + unsigned long mem; + + vcpu->stat.instruction_stsi++; + VCPU_EVENT(vcpu, 4, "stsi: fc: %x sel1: %x sel2: %x", fc, sel1, sel2); + + operand2 = disp2; + if (base2) + operand2 += vcpu->arch.guest_gprs[base2]; + + if (operand2 & 0xfff && fc > 0) + return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION); + + switch (fc) { + case 0: + vcpu->arch.guest_gprs[0] = 3 << 28; + vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44); + return 0; + case 1: /* same handling for 1 and 2 */ + case 2: + mem = get_zeroed_page(GFP_KERNEL); + if (!mem) + goto out_fail; + if (stsi((void *) mem, fc, sel1, sel2) == -ENOSYS) + goto out_mem; + break; + case 3: + if (sel1 != 2 || sel2 != 2) + goto out_fail; + mem = get_zeroed_page(GFP_KERNEL); + if (!mem) + goto out_fail; + handle_stsi_3_2_2(vcpu, (void *) mem); + break; + default: + goto out_fail; + } + + if (copy_to_guest_absolute(vcpu, operand2, (void *) mem, PAGE_SIZE)) { + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + goto out_mem; + } + free_page(mem); + vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44); + vcpu->arch.guest_gprs[0] = 0; + return 0; +out_mem: + free_page(mem); +out_fail: + /* condition code 3 */ + vcpu->arch.sie_block->gpsw.mask |= 3ul << 44; + return 0; +} + +static intercept_handler_t priv_handlers[256] = { + [0x02] = handle_stidp, + [0x10] = handle_set_prefix, + [0x11] = handle_store_prefix, + [0x12] = handle_store_cpu_address, + [0x29] = handle_skey, + [0x2a] = handle_skey, + [0x2b] = handle_skey, + [0x34] = handle_stsch, + [0x5f] = handle_chsc, + [0x7d] = handle_stsi, + [0xb1] = handle_stfl, +}; + +int kvm_s390_handle_priv(struct kvm_vcpu *vcpu) +{ + intercept_handler_t handler; + + handler = priv_handlers[vcpu->arch.sie_block->ipa & 0x00ff]; + if (handler) + return handler(vcpu); + return -ENOTSUPP; +} diff --git a/include/asm-s390/kvm_host.h b/include/asm-s390/kvm_host.h index 4fe1930..2eaf6fe 100644 --- a/include/asm-s390/kvm_host.h +++ b/include/asm-s390/kvm_host.h @@ -119,6 +119,15 @@ struct kvm_vcpu_stat { u32 deliver_restart_signal; u32 deliver_program_int; u32 exit_wait_state; + u32 instruction_stidp; + u32 instruction_spx; + u32 instruction_stpx; + u32 instruction_stap; + u32 instruction_storage_key; + u32 instruction_stsch; + u32 instruction_chsc; + u32 instruction_stsi; + u32 instruction_stfl; }; struct io_info { @@ -188,6 +197,10 @@ struct kvm_vcpu_arch { unsigned int guest_acrs[NUM_ACRS]; struct local_interrupt local_int; struct timer_list ckc_timer; + union { + cpuid_t cpu_id; + u64 stidp_data; + }; }; struct kvm_vm_stat { -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:04
|
From: Christian Borntraeger <bor...@de...> drivers/s390/sysinfo.c uses the store system information intruction to query the system about information of the machine, the LPAR and additional hypervisors. KVM has to implement the host part for this instruction. To avoid code duplication, this patch splits the common definitions from sysinfo.c into a separate header file include/asm-s390/sysinfo.h for KVM use. Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- drivers/s390/sysinfo.c | 100 +--------------------------------------- include/asm-s390/sysinfo.h | 111 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 112 insertions(+), 99 deletions(-) create mode 100644 include/asm-s390/sysinfo.h diff --git a/drivers/s390/sysinfo.c b/drivers/s390/sysinfo.c index 291ff62..43fffa3 100644 --- a/drivers/s390/sysinfo.c +++ b/drivers/s390/sysinfo.c @@ -11,111 +11,13 @@ #include <linux/init.h> #include <linux/delay.h> #include <asm/ebcdic.h> +#include <asm/sysinfo.h> /* Sigh, math-emu. Don't ask. */ #include <asm/sfp-util.h> #include <math-emu/soft-fp.h> #include <math-emu/single.h> -struct sysinfo_1_1_1 { - char reserved_0[32]; - char manufacturer[16]; - char type[4]; - char reserved_1[12]; - char model_capacity[16]; - char sequence[16]; - char plant[4]; - char model[16]; -}; - -struct sysinfo_1_2_1 { - char reserved_0[80]; - char sequence[16]; - char plant[4]; - char reserved_1[2]; - unsigned short cpu_address; -}; - -struct sysinfo_1_2_2 { - char format; - char reserved_0[1]; - unsigned short acc_offset; - char reserved_1[24]; - unsigned int secondary_capability; - unsigned int capability; - unsigned short cpus_total; - unsigned short cpus_configured; - unsigned short cpus_standby; - unsigned short cpus_reserved; - unsigned short adjustment[0]; -}; - -struct sysinfo_1_2_2_extension { - unsigned int alt_capability; - unsigned short alt_adjustment[0]; -}; - -struct sysinfo_2_2_1 { - char reserved_0[80]; - char sequence[16]; - char plant[4]; - unsigned short cpu_id; - unsigned short cpu_address; -}; - -struct sysinfo_2_2_2 { - char reserved_0[32]; - unsigned short lpar_number; - char reserved_1; - unsigned char characteristics; - unsigned short cpus_total; - unsigned short cpus_configured; - unsigned short cpus_standby; - unsigned short cpus_reserved; - char name[8]; - unsigned int caf; - char reserved_2[16]; - unsigned short cpus_dedicated; - unsigned short cpus_shared; -}; - -#define LPAR_CHAR_DEDICATED (1 << 7) -#define LPAR_CHAR_SHARED (1 << 6) -#define LPAR_CHAR_LIMITED (1 << 5) - -struct sysinfo_3_2_2 { - char reserved_0[31]; - unsigned char count; - struct { - char reserved_0[4]; - unsigned short cpus_total; - unsigned short cpus_configured; - unsigned short cpus_standby; - unsigned short cpus_reserved; - char name[8]; - unsigned int caf; - char cpi[16]; - char reserved_1[24]; - - } vm[8]; -}; - -static inline int stsi(void *sysinfo, int fc, int sel1, int sel2) -{ - register int r0 asm("0") = (fc << 28) | sel1; - register int r1 asm("1") = sel2; - - asm volatile( - " stsi 0(%2)\n" - "0: jz 2f\n" - "1: lhi %0,%3\n" - "2:\n" - EX_TABLE(0b,1b) - : "+d" (r0) : "d" (r1), "a" (sysinfo), "K" (-ENOSYS) - : "cc", "memory" ); - return r0; -} - static inline int stsi_0(void) { int rc = stsi (NULL, 0, 0, 0); diff --git a/include/asm-s390/sysinfo.h b/include/asm-s390/sysinfo.h new file mode 100644 index 0000000..0e4d57e --- /dev/null +++ b/include/asm-s390/sysinfo.h @@ -0,0 +1,111 @@ +/* + * definition for store system information stsi + * + * Copyright IBM Corp. 2001,2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Ulrich Weigand <we...@de...> + * Christian Borntraeger <bor...@de...> + */ + +struct sysinfo_1_1_1 { + char reserved_0[32]; + char manufacturer[16]; + char type[4]; + char reserved_1[12]; + char model_capacity[16]; + char sequence[16]; + char plant[4]; + char model[16]; +}; + +struct sysinfo_1_2_1 { + char reserved_0[80]; + char sequence[16]; + char plant[4]; + char reserved_1[2]; + unsigned short cpu_address; +}; + +struct sysinfo_1_2_2 { + char format; + char reserved_0[1]; + unsigned short acc_offset; + char reserved_1[24]; + unsigned int secondary_capability; + unsigned int capability; + unsigned short cpus_total; + unsigned short cpus_configured; + unsigned short cpus_standby; + unsigned short cpus_reserved; + unsigned short adjustment[0]; +}; + +struct sysinfo_1_2_2_extension { + unsigned int alt_capability; + unsigned short alt_adjustment[0]; +}; + +struct sysinfo_2_2_1 { + char reserved_0[80]; + char sequence[16]; + char plant[4]; + unsigned short cpu_id; + unsigned short cpu_address; +}; + +struct sysinfo_2_2_2 { + char reserved_0[32]; + unsigned short lpar_number; + char reserved_1; + unsigned char characteristics; + unsigned short cpus_total; + unsigned short cpus_configured; + unsigned short cpus_standby; + unsigned short cpus_reserved; + char name[8]; + unsigned int caf; + char reserved_2[16]; + unsigned short cpus_dedicated; + unsigned short cpus_shared; +}; + +#define LPAR_CHAR_DEDICATED (1 << 7) +#define LPAR_CHAR_SHARED (1 << 6) +#define LPAR_CHAR_LIMITED (1 << 5) + +struct sysinfo_3_2_2 { + char reserved_0[31]; + unsigned char count; + struct { + char reserved_0[4]; + unsigned short cpus_total; + unsigned short cpus_configured; + unsigned short cpus_standby; + unsigned short cpus_reserved; + char name[8]; + unsigned int caf; + char cpi[16]; + char reserved_1[24]; + + } vm[8]; +}; + +static inline int stsi(void *sysinfo, int fc, int sel1, int sel2) +{ + register int r0 asm("0") = (fc << 28) | sel1; + register int r1 asm("1") = sel2; + + asm volatile( + " stsi 0(%2)\n" + "0: jz 2f\n" + "1: lhi %0,%3\n" + "2:\n" + EX_TABLE(0b, 1b) + : "+d" (r0) : "d" (r1), "a" (sysinfo), "K" (-ENOSYS) + : "cc", "memory"); + return r0; +} -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:03
|
From: Carsten Otte <co...@de...> From: Christian Borntraeger <bor...@de...> This patch introduces interpretation of some diagnose instruction intercepts. Diagnose is our classic architected way of doing a hypercall. This patch features the following diagnose codes: - vm storage size, that tells the guest about its memory layout - time slice end, which is used by the guest to indicate that it waits for a lock and thus cannot use up its time slice in a useful way - ipl functions, which a guest can use to reset and reboot itself In order to implement ipl functions, we also introduce an exit reason that causes userspace to perform various resets on the virtual machine. All resets are described in the principles of operation book, except KVM_S390_RESET_IPL which causes a reboot of the machine. Acked-by: Martin Schwidefsky <mar...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/kvm/Makefile | 2 +- arch/s390/kvm/diag.c | 67 +++++++++++++++++++++++++++++++++++++++++++ arch/s390/kvm/intercept.c | 1 + arch/s390/kvm/kvm-s390.c | 1 + arch/s390/kvm/kvm-s390.h | 3 ++ include/asm-s390/kvm_host.h | 5 ++- include/linux/kvm.h | 8 +++++ 7 files changed, 85 insertions(+), 2 deletions(-) create mode 100644 arch/s390/kvm/diag.c diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile index f3bf11a..e5221ec 100644 --- a/arch/s390/kvm/Makefile +++ b/arch/s390/kvm/Makefile @@ -10,5 +10,5 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o) EXTRA_CFLAGS += -Ivirt/kvm -Iarch/s390/kvm -kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o priv.o sigp.o +kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o priv.o sigp.o diag.o obj-$(CONFIG_KVM) += kvm.o diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c new file mode 100644 index 0000000..f639a15 --- /dev/null +++ b/arch/s390/kvm/diag.c @@ -0,0 +1,67 @@ +/* + * diag.c - handling diagnose instructions + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Carsten Otte <co...@de...> + * Christian Borntraeger <bor...@de...> + */ + +#include <linux/kvm.h> +#include <linux/kvm_host.h> +#include "kvm-s390.h" + +static int __diag_time_slice_end(struct kvm_vcpu *vcpu) +{ + VCPU_EVENT(vcpu, 5, "%s", "diag time slice end"); + vcpu->stat.diagnose_44++; + vcpu_put(vcpu); + schedule(); + vcpu_load(vcpu); + return 0; +} + +static int __diag_ipl_functions(struct kvm_vcpu *vcpu) +{ + unsigned int reg = vcpu->arch.sie_block->ipa & 0xf; + unsigned long subcode = vcpu->arch.guest_gprs[reg] & 0xffff; + + VCPU_EVENT(vcpu, 5, "diag ipl functions, subcode %lx", subcode); + switch (subcode) { + case 3: + vcpu->run->s390_reset_flags = KVM_S390_RESET_CLEAR; + break; + case 4: + vcpu->run->s390_reset_flags = 0; + break; + default: + return -ENOTSUPP; + } + + atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags); + vcpu->run->s390_reset_flags |= KVM_S390_RESET_SUBSYSTEM; + vcpu->run->s390_reset_flags |= KVM_S390_RESET_IPL; + vcpu->run->s390_reset_flags |= KVM_S390_RESET_CPU_INIT; + vcpu->run->exit_reason = KVM_EXIT_S390_RESET; + VCPU_EVENT(vcpu, 3, "requesting userspace resets %lx", + vcpu->run->s390_reset_flags); + return -EREMOTE; +} + +int kvm_s390_handle_diag(struct kvm_vcpu *vcpu) +{ + int code = (vcpu->arch.sie_block->ipb & 0xfff0000) >> 16; + + switch (code) { + case 0x44: + return __diag_time_slice_end(vcpu); + case 0x308: + return __diag_ipl_functions(vcpu); + default: + return -ENOTSUPP; + } +} diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c index 9f0d8b2..349581a 100644 --- a/arch/s390/kvm/intercept.c +++ b/arch/s390/kvm/intercept.c @@ -95,6 +95,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu) } static intercept_handler_t instruction_handlers[256] = { + [0x83] = kvm_s390_handle_diag, [0xae] = kvm_s390_handle_sigp, [0xb2] = kvm_s390_handle_priv, [0xb7] = handle_lctl, diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index c632180..d3b1de8 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -63,6 +63,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { { "instruction_sigp_set_arch", VCPU_STAT(instruction_sigp_arch) }, { "instruction_sigp_set_prefix", VCPU_STAT(instruction_sigp_prefix) }, { "instruction_sigp_restart", VCPU_STAT(instruction_sigp_restart) }, + { "diagnose_44", VCPU_STAT(diagnose_44) }, { NULL } }; diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index e6e5756..3893cf1 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -58,4 +58,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu); /* implemented in kvm-s390.c */ int __kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr); +/* implemented in diag.c */ +int kvm_s390_handle_diag(struct kvm_vcpu *vcpu); + #endif diff --git a/include/asm-s390/kvm_host.h b/include/asm-s390/kvm_host.h index 1c829bd..f8204a4 100644 --- a/include/asm-s390/kvm_host.h +++ b/include/asm-s390/kvm_host.h @@ -94,7 +94,9 @@ struct sie_block { psw_t gpsw; /* 0x0090 */ __u64 gg14; /* 0x00a0 */ __u64 gg15; /* 0x00a8 */ - __u8 reservedb0[80]; /* 0x00b0 */ + __u8 reservedb0[30]; /* 0x00b0 */ + __u16 iprcc; /* 0x00ce */ + __u8 reservedd0[48]; /* 0x00d0 */ __u64 gcr[16]; /* 0x0100 */ __u64 gbea; /* 0x0180 */ __u8 reserved188[120]; /* 0x0188 */ @@ -134,6 +136,7 @@ struct kvm_vcpu_stat { u32 instruction_sigp_arch; u32 instruction_sigp_prefix; u32 instruction_sigp_restart; + u32 diagnose_44; }; struct io_info { diff --git a/include/linux/kvm.h b/include/linux/kvm.h index b0dea38..37b963e 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -75,6 +75,7 @@ struct kvm_irqchip { #define KVM_EXIT_SET_TPR 11 #define KVM_EXIT_TPR_ACCESS 12 #define KVM_EXIT_S390_SIEIC 13 +#define KVM_EXIT_S390_RESET 14 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */ struct kvm_run { @@ -147,6 +148,13 @@ struct kvm_run { __u16 ipa; __u32 ipb; } s390_sieic; + /* KVM_EXIT_S390_RESET */ +#define KVM_S390_RESET_POR 1 +#define KVM_S390_RESET_CLEAR 2 +#define KVM_S390_RESET_SUBSYSTEM 4 +#define KVM_S390_RESET_CPU_INIT 8 +#define KVM_S390_RESET_IPL 16 + __u64 s390_reset_flags; /* Fix the size of the union. */ char padding[256]; }; -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:04
|
From: Carsten Otte <co...@de...> From: Christian Borntraeger <bor...@de...> This path introduces handling of sie intercepts in three flavors: Intercepts are either handled completely in-kernel by kvm_handle_sie_intercept(), or passed to userspace with corresponding data in struct kvm_run in case kvm_handle_sie_intercept() returns -ENOTSUPP. In case of partial execution in kernel with the need of userspace support, kvm_handle_sie_intercept() may choose to set up struct kvm_run and return -EREMOTE. The trivial intercept reasons are handled in this patch: handle_noop() just does nothing for intercepts that don't require our support at all handle_stop() is called when a cpu enters stopped state, and it drops out to userland after updating our vcpu state handle_validity() faults in the cpu lowcore if needed, or passes the request to userland Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/kvm/Makefile | 2 +- arch/s390/kvm/intercept.c | 80 +++++++++++++++++++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.c | 46 ++++++++++++++++++++++++- arch/s390/kvm/kvm-s390.h | 7 ++++ include/asm-s390/kvm_host.h | 4 ++ include/linux/kvm.h | 9 +++++ 6 files changed, 146 insertions(+), 2 deletions(-) create mode 100644 arch/s390/kvm/intercept.c diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile index 0d8d113..27882b3 100644 --- a/arch/s390/kvm/Makefile +++ b/arch/s390/kvm/Makefile @@ -10,5 +10,5 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o) EXTRA_CFLAGS += -Ivirt/kvm -Iarch/s390/kvm -kvm-objs := $(common-objs) kvm-s390.o sie64a.o +kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o obj-$(CONFIG_KVM) += kvm.o diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c new file mode 100644 index 0000000..e757230 --- /dev/null +++ b/arch/s390/kvm/intercept.c @@ -0,0 +1,80 @@ +/* + * intercept.c - in-kernel handling for sie intercepts + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Carsten Otte <co...@de...> + * Christian Borntraeger <bor...@de...> + */ + +#include <linux/kvm_host.h> +#include <linux/errno.h> +#include <linux/pagemap.h> + +#include <asm/kvm_host.h> + +#include "kvm-s390.h" + +static int handle_noop(struct kvm_vcpu *vcpu) +{ + switch (vcpu->arch.sie_block->icptcode) { + case 0x10: + vcpu->stat.exit_external_request++; + break; + case 0x14: + vcpu->stat.exit_external_interrupt++; + break; + default: + break; /* nothing */ + } + return 0; +} + +static int handle_stop(struct kvm_vcpu *vcpu) +{ + vcpu->stat.exit_stop_request++; + VCPU_EVENT(vcpu, 3, "%s", "cpu stopped"); + atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags); + return -ENOTSUPP; +} + +static int handle_validity(struct kvm_vcpu *vcpu) +{ + int viwhy = vcpu->arch.sie_block->ipb >> 16; + vcpu->stat.exit_validity++; + if (viwhy == 0x37) { + fault_in_pages_writeable((char __user *) + vcpu->kvm->arch.guest_origin + + vcpu->arch.sie_block->prefix, + PAGE_SIZE); + return 0; + } + VCPU_EVENT(vcpu, 2, "unhandled validity intercept code %d", + viwhy); + return -ENOTSUPP; +} + +static const intercept_handler_t intercept_funcs[0x48 >> 2] = { + [0x00 >> 2] = handle_noop, + [0x10 >> 2] = handle_noop, + [0x14 >> 2] = handle_noop, + [0x20 >> 2] = handle_validity, + [0x28 >> 2] = handle_stop, +}; + +int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu) +{ + intercept_handler_t func; + u8 code = vcpu->arch.sie_block->icptcode; + + if (code & 3 || code > 0x48) + return -ENOTSUPP; + func = intercept_funcs[code >> 2]; + if (func) + return func(vcpu); + return -ENOTSUPP; +} diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 6e1e1d3..a906499 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -23,12 +23,17 @@ #include <asm/lowcore.h> #include <asm/pgtable.h> +#include "kvm-s390.h" #include "gaccess.h" #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU struct kvm_stats_debugfs_item debugfs_entries[] = { { "userspace_handled", VCPU_STAT(exit_userspace) }, + { "exit_validity", VCPU_STAT(exit_validity) }, + { "exit_stop_request", VCPU_STAT(exit_stop_request) }, + { "exit_external_request", VCPU_STAT(exit_external_request) }, + { "exit_external_interrupt", VCPU_STAT(exit_external_interrupt) }, { NULL } }; @@ -380,6 +385,7 @@ static void __vcpu_run(struct kvm_vcpu *vcpu) int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) { + int rc; sigset_t sigsaved; vcpu_load(vcpu); @@ -389,7 +395,45 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) atomic_set_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags); - __vcpu_run(vcpu); + switch (kvm_run->exit_reason) { + case KVM_EXIT_S390_SIEIC: + vcpu->arch.sie_block->gpsw.mask = kvm_run->s390_sieic.mask; + vcpu->arch.sie_block->gpsw.addr = kvm_run->s390_sieic.addr; + break; + case KVM_EXIT_UNKNOWN: + case KVM_EXIT_S390_RESET: + break; + default: + BUG(); + } + + might_sleep(); + + do { + __vcpu_run(vcpu); + + rc = kvm_handle_sie_intercept(vcpu); + } while (!signal_pending(current) && !rc); + + if (signal_pending(current) && !rc) + rc = -EINTR; + + if (rc == -ENOTSUPP) { + /* intercept cannot be handled in-kernel, prepare kvm-run */ + kvm_run->exit_reason = KVM_EXIT_S390_SIEIC; + kvm_run->s390_sieic.icptcode = vcpu->arch.sie_block->icptcode; + kvm_run->s390_sieic.mask = vcpu->arch.sie_block->gpsw.mask; + kvm_run->s390_sieic.addr = vcpu->arch.sie_block->gpsw.addr; + kvm_run->s390_sieic.ipa = vcpu->arch.sie_block->ipa; + kvm_run->s390_sieic.ipb = vcpu->arch.sie_block->ipb; + rc = 0; + } + + if (rc == -EREMOTE) { + /* intercept was handled, but userspace support is needed + * kvm_run has been prepared by the handler */ + rc = 0; + } if (vcpu->sigset_active) sigprocmask(SIG_SETMASK, &sigsaved, NULL); diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index ed64a22..5b82527 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -13,6 +13,13 @@ #ifndef ARCH_S390_KVM_S390_H #define ARCH_S390_KVM_S390_H + +#include <linux/kvm_host.h> + +typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu); + +int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu); + #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\ do { \ debug_sprintf_event(d_kvm->arch.dbf, d_loglevel, d_string "\n", \ diff --git a/include/asm-s390/kvm_host.h b/include/asm-s390/kvm_host.h index c9d6533..8965b38 100644 --- a/include/asm-s390/kvm_host.h +++ b/include/asm-s390/kvm_host.h @@ -101,6 +101,10 @@ struct sie_block { struct kvm_vcpu_stat { u32 exit_userspace; + u32 exit_external_request; + u32 exit_external_interrupt; + u32 exit_stop_request; + u32 exit_validity; }; struct kvm_vcpu_arch { diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 2367ff0..f2acd6b 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -74,6 +74,7 @@ struct kvm_irqchip { #define KVM_EXIT_INTR 10 #define KVM_EXIT_SET_TPR 11 #define KVM_EXIT_TPR_ACCESS 12 +#define KVM_EXIT_S390_SIEIC 13 /* for KVM_RUN, returned by mmap(vcpu_fd, offset=0) */ struct kvm_run { @@ -138,6 +139,14 @@ struct kvm_run { __u32 is_write; __u32 pad; } tpr_access; + /* KVM_EXIT_S390_SIEIC */ + struct { + __u8 icptcode; + __u64 mask; /* psw upper half */ + __u64 addr; /* psw lower half */ + __u16 ipa; + __u32 ipb; + } s390_sieic; /* Fix the size of the union. */ char padding[256]; }; -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:05
|
From: Carsten Otte <co...@de...> From: Christian Borntraeger <bor...@de...> This patch introduces in-kernel handling of _some_ sigp interprocessor signals (similar to ipi). kvm_s390_handle_sigp() decodes the sigp instruction and calls individual handlers depending on the operation requested: - sigp sense tries to retrieve information such as existence or running state of the remote cpu - sigp emergency sends an external interrupt to the remove cpu - sigp stop stops a remove cpu - sigp stop store status stops a remote cpu, and stores its entire internal state to the cpus lowcore - sigp set arch sets the architecture mode of the remote cpu. setting to ESAME (s390x 64bit) is accepted, setting to ESA/S390 (s390, 31 or 24 bit) is denied, all others are passed to userland - sigp set prefix sets the prefix register of a remote cpu For implementation of this, the stop intercept indication starts to get reused on purpose: a set of action bits defines what to do once a cpu gets stopped: ACTION_STOP_ON_STOP really stops the cpu when a stop intercept is recognized ACTION_STORE_ON_STOP stores the cpu status to lowcore when a stop intercept is recognized Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/kvm/Makefile | 2 +- arch/s390/kvm/intercept.c | 22 +++- arch/s390/kvm/kvm-s390.c | 7 + arch/s390/kvm/kvm-s390.h | 7 + arch/s390/kvm/sigp.c | 288 +++++++++++++++++++++++++++++++++++++++++++ include/asm-s390/kvm_host.h | 12 ++ 6 files changed, 335 insertions(+), 3 deletions(-) create mode 100644 arch/s390/kvm/sigp.c diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile index 82dde1e..f3bf11a 100644 --- a/arch/s390/kvm/Makefile +++ b/arch/s390/kvm/Makefile @@ -10,5 +10,5 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o) EXTRA_CFLAGS += -Ivirt/kvm -Iarch/s390/kvm -kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o priv.o +kvm-objs := $(common-objs) kvm-s390.o sie64a.o intercept.o interrupt.o priv.o sigp.o obj-$(CONFIG_KVM) += kvm.o diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c index 7a20d63..9f0d8b2 100644 --- a/arch/s390/kvm/intercept.c +++ b/arch/s390/kvm/intercept.c @@ -95,6 +95,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu) } static intercept_handler_t instruction_handlers[256] = { + [0xae] = kvm_s390_handle_sigp, [0xb2] = kvm_s390_handle_priv, [0xb7] = handle_lctl, [0xeb] = handle_lctg, @@ -117,10 +118,27 @@ static int handle_noop(struct kvm_vcpu *vcpu) static int handle_stop(struct kvm_vcpu *vcpu) { + int rc; + vcpu->stat.exit_stop_request++; - VCPU_EVENT(vcpu, 3, "%s", "cpu stopped"); atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags); - return -ENOTSUPP; + spin_lock_bh(&vcpu->arch.local_int.lock); + if (vcpu->arch.local_int.action_bits & ACTION_STORE_ON_STOP) { + vcpu->arch.local_int.action_bits &= ~ACTION_STORE_ON_STOP; + rc = __kvm_s390_vcpu_store_status(vcpu, + KVM_S390_STORE_STATUS_NOADDR); + if (rc >= 0) + rc = -ENOTSUPP; + } + + if (vcpu->arch.local_int.action_bits & ACTION_STOP_ON_STOP) { + vcpu->arch.local_int.action_bits &= ~ACTION_STOP_ON_STOP; + VCPU_EVENT(vcpu, 3, "%s", "cpu stopped"); + rc = -ENOTSUPP; + } else + rc = 0; + spin_unlock_bh(&vcpu->arch.local_int.lock); + return rc; } static int handle_validity(struct kvm_vcpu *vcpu) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 5a17176..c632180 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -57,6 +57,12 @@ struct kvm_stats_debugfs_item debugfs_entries[] = { { "instruction_chsc", VCPU_STAT(instruction_chsc) }, { "instruction_stsi", VCPU_STAT(instruction_stsi) }, { "instruction_stfl", VCPU_STAT(instruction_stfl) }, + { "instruction_sigp_sense", VCPU_STAT(instruction_sigp_sense) }, + { "instruction_sigp_emergency", VCPU_STAT(instruction_sigp_emergency) }, + { "instruction_sigp_stop", VCPU_STAT(instruction_sigp_stop) }, + { "instruction_sigp_set_arch", VCPU_STAT(instruction_sigp_arch) }, + { "instruction_sigp_set_prefix", VCPU_STAT(instruction_sigp_prefix) }, + { "instruction_sigp_restart", VCPU_STAT(instruction_sigp_restart) }, { NULL } }; @@ -287,6 +293,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, spin_lock_bh(&kvm->arch.float_int.lock); kvm->arch.float_int.local_int[id] = &vcpu->arch.local_int; init_waitqueue_head(&vcpu->arch.local_int.wq); + vcpu->arch.local_int.cpuflags = &vcpu->arch.sie_block->cpuflags; spin_unlock_bh(&kvm->arch.float_int.lock); rc = kvm_vcpu_init(vcpu, kvm, id); diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 50f96b3..e6e5756 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -51,4 +51,11 @@ int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code); /* implemented in priv.c */ int kvm_s390_handle_priv(struct kvm_vcpu *vcpu); + +/* implemented in sigp.c */ +int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu); + +/* implemented in kvm-s390.c */ +int __kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, + unsigned long addr); #endif diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c new file mode 100644 index 0000000..0a236ac --- /dev/null +++ b/arch/s390/kvm/sigp.c @@ -0,0 +1,288 @@ +/* + * sigp.c - handlinge interprocessor communication + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Carsten Otte <co...@de...> + * Christian Borntraeger <bor...@de...> + */ + +#include <linux/kvm.h> +#include <linux/kvm_host.h> +#include "gaccess.h" +#include "kvm-s390.h" + +/* sigp order codes */ +#define SIGP_SENSE 0x01 +#define SIGP_EXTERNAL_CALL 0x02 +#define SIGP_EMERGENCY 0x03 +#define SIGP_START 0x04 +#define SIGP_STOP 0x05 +#define SIGP_RESTART 0x06 +#define SIGP_STOP_STORE_STATUS 0x09 +#define SIGP_INITIAL_CPU_RESET 0x0b +#define SIGP_CPU_RESET 0x0c +#define SIGP_SET_PREFIX 0x0d +#define SIGP_STORE_STATUS_ADDR 0x0e +#define SIGP_SET_ARCH 0x12 + +/* cpu status bits */ +#define SIGP_STAT_EQUIPMENT_CHECK 0x80000000UL +#define SIGP_STAT_INCORRECT_STATE 0x00000200UL +#define SIGP_STAT_INVALID_PARAMETER 0x00000100UL +#define SIGP_STAT_EXT_CALL_PENDING 0x00000080UL +#define SIGP_STAT_STOPPED 0x00000040UL +#define SIGP_STAT_OPERATOR_INTERV 0x00000020UL +#define SIGP_STAT_CHECK_STOP 0x00000010UL +#define SIGP_STAT_INOPERATIVE 0x00000004UL +#define SIGP_STAT_INVALID_ORDER 0x00000002UL +#define SIGP_STAT_RECEIVER_CHECK 0x00000001UL + + +static int __sigp_sense(struct kvm_vcpu *vcpu, u16 cpu_addr, u64 *reg) +{ + struct float_interrupt *fi = &vcpu->kvm->arch.float_int; + int rc; + + if (cpu_addr >= KVM_MAX_VCPUS) + return 3; /* not operational */ + + spin_lock_bh(&fi->lock); + if (fi->local_int[cpu_addr] == NULL) + rc = 3; /* not operational */ + else if (atomic_read(fi->local_int[cpu_addr]->cpuflags) + & CPUSTAT_RUNNING) { + *reg &= 0xffffffff00000000UL; + rc = 1; /* status stored */ + } else { + *reg &= 0xffffffff00000000UL; + *reg |= SIGP_STAT_STOPPED; + rc = 1; /* status stored */ + } + spin_unlock_bh(&fi->lock); + + VCPU_EVENT(vcpu, 4, "sensed status of cpu %x rc %x", cpu_addr, rc); + return rc; +} + +static int __sigp_emergency(struct kvm_vcpu *vcpu, u16 cpu_addr) +{ + struct float_interrupt *fi = &vcpu->kvm->arch.float_int; + struct local_interrupt *li; + struct interrupt_info *inti; + int rc; + + if (cpu_addr >= KVM_MAX_VCPUS) + return 3; /* not operational */ + + inti = kzalloc(sizeof(*inti), GFP_KERNEL); + if (!inti) + return -ENOMEM; + + inti->type = KVM_S390_INT_EMERGENCY; + + spin_lock_bh(&fi->lock); + li = fi->local_int[cpu_addr]; + if (li == NULL) { + rc = 3; /* not operational */ + kfree(inti); + goto unlock; + } + spin_lock_bh(&li->lock); + list_add_tail(&inti->list, &li->list); + atomic_set(&li->active, 1); + atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags); + if (waitqueue_active(&li->wq)) + wake_up_interruptible(&li->wq); + spin_unlock_bh(&li->lock); + rc = 0; /* order accepted */ +unlock: + spin_unlock_bh(&fi->lock); + VCPU_EVENT(vcpu, 4, "sent sigp emerg to cpu %x", cpu_addr); + return rc; +} + +static int __sigp_stop(struct kvm_vcpu *vcpu, u16 cpu_addr, int store) +{ + struct float_interrupt *fi = &vcpu->kvm->arch.float_int; + struct local_interrupt *li; + struct interrupt_info *inti; + int rc; + + if (cpu_addr >= KVM_MAX_VCPUS) + return 3; /* not operational */ + + inti = kzalloc(sizeof(*inti), GFP_KERNEL); + if (!inti) + return -ENOMEM; + + inti->type = KVM_S390_SIGP_STOP; + + spin_lock_bh(&fi->lock); + li = fi->local_int[cpu_addr]; + if (li == NULL) { + rc = 3; /* not operational */ + kfree(inti); + goto unlock; + } + spin_lock_bh(&li->lock); + list_add_tail(&inti->list, &li->list); + atomic_set(&li->active, 1); + atomic_set_mask(CPUSTAT_STOP_INT, li->cpuflags); + if (store) + li->action_bits |= ACTION_STORE_ON_STOP; + li->action_bits |= ACTION_STOP_ON_STOP; + if (waitqueue_active(&li->wq)) + wake_up_interruptible(&li->wq); + spin_unlock_bh(&li->lock); + rc = 0; /* order accepted */ +unlock: + spin_unlock_bh(&fi->lock); + VCPU_EVENT(vcpu, 4, "sent sigp stop to cpu %x", cpu_addr); + return rc; +} + +static int __sigp_set_arch(struct kvm_vcpu *vcpu, u32 parameter) +{ + int rc; + + switch (parameter & 0xff) { + case 0: + printk(KERN_WARNING "kvm: request to switch to ESA/390 mode" + " not supported"); + rc = 3; /* not operational */ + break; + case 1: + case 2: + rc = 0; /* order accepted */ + break; + default: + rc = -ENOTSUPP; + } + return rc; +} + +static int __sigp_set_prefix(struct kvm_vcpu *vcpu, u16 cpu_addr, u32 address, + u64 *reg) +{ + struct float_interrupt *fi = &vcpu->kvm->arch.float_int; + struct local_interrupt *li; + struct interrupt_info *inti; + int rc; + u8 tmp; + + /* make sure that the new value is valid memory */ + address = address & 0x7fffe000u; + if ((copy_from_guest(vcpu, &tmp, + (u64) (address + vcpu->kvm->arch.guest_origin) , 1)) || + (copy_from_guest(vcpu, &tmp, (u64) (address + + vcpu->kvm->arch.guest_origin + PAGE_SIZE), 1))) { + *reg |= SIGP_STAT_INVALID_PARAMETER; + return 1; /* invalid parameter */ + } + + inti = kzalloc(sizeof(*inti), GFP_KERNEL); + if (!inti) + return 2; /* busy */ + + spin_lock_bh(&fi->lock); + li = fi->local_int[cpu_addr]; + + if ((cpu_addr >= KVM_MAX_VCPUS) || (li == NULL)) { + rc = 1; /* incorrect state */ + *reg &= SIGP_STAT_INCORRECT_STATE; + kfree(inti); + goto out_fi; + } + + spin_lock_bh(&li->lock); + /* cpu must be in stopped state */ + if (atomic_read(li->cpuflags) & CPUSTAT_RUNNING) { + rc = 1; /* incorrect state */ + *reg &= SIGP_STAT_INCORRECT_STATE; + kfree(inti); + goto out_li; + } + + inti->type = KVM_S390_SIGP_SET_PREFIX; + inti->prefix.address = address; + + list_add_tail(&inti->list, &li->list); + atomic_set(&li->active, 1); + if (waitqueue_active(&li->wq)) + wake_up_interruptible(&li->wq); + rc = 0; /* order accepted */ + + VCPU_EVENT(vcpu, 4, "set prefix of cpu %02x to %x", cpu_addr, address); +out_li: + spin_unlock_bh(&li->lock); +out_fi: + spin_unlock_bh(&fi->lock); + return rc; +} + +int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu) +{ + int r1 = (vcpu->arch.sie_block->ipa & 0x00f0) >> 4; + int r3 = vcpu->arch.sie_block->ipa & 0x000f; + int base2 = vcpu->arch.sie_block->ipb >> 28; + int disp2 = ((vcpu->arch.sie_block->ipb & 0x0fff0000) >> 16); + u32 parameter; + u16 cpu_addr = vcpu->arch.guest_gprs[r3]; + u8 order_code; + int rc; + + order_code = disp2; + if (base2) + order_code += vcpu->arch.guest_gprs[base2]; + + if (r1 % 2) + parameter = vcpu->arch.guest_gprs[r1]; + else + parameter = vcpu->arch.guest_gprs[r1 + 1]; + + switch (order_code) { + case SIGP_SENSE: + vcpu->stat.instruction_sigp_sense++; + rc = __sigp_sense(vcpu, cpu_addr, + &vcpu->arch.guest_gprs[r1]); + break; + case SIGP_EMERGENCY: + vcpu->stat.instruction_sigp_emergency++; + rc = __sigp_emergency(vcpu, cpu_addr); + break; + case SIGP_STOP: + vcpu->stat.instruction_sigp_stop++; + rc = __sigp_stop(vcpu, cpu_addr, 0); + break; + case SIGP_STOP_STORE_STATUS: + vcpu->stat.instruction_sigp_stop++; + rc = __sigp_stop(vcpu, cpu_addr, 1); + break; + case SIGP_SET_ARCH: + vcpu->stat.instruction_sigp_arch++; + rc = __sigp_set_arch(vcpu, parameter); + break; + case SIGP_SET_PREFIX: + vcpu->stat.instruction_sigp_prefix++; + rc = __sigp_set_prefix(vcpu, cpu_addr, parameter, + &vcpu->arch.guest_gprs[r1]); + break; + case SIGP_RESTART: + vcpu->stat.instruction_sigp_restart++; + /* user space must know about restart */ + default: + return -ENOTSUPP; + } + + if (rc < 0) + return rc; + + vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44); + vcpu->arch.sie_block->gpsw.mask |= (rc & 3ul) << 44; + return 0; +} diff --git a/include/asm-s390/kvm_host.h b/include/asm-s390/kvm_host.h index 2eaf6fe..1c829bd 100644 --- a/include/asm-s390/kvm_host.h +++ b/include/asm-s390/kvm_host.h @@ -128,6 +128,12 @@ struct kvm_vcpu_stat { u32 instruction_chsc; u32 instruction_stsi; u32 instruction_stfl; + u32 instruction_sigp_sense; + u32 instruction_sigp_emergency; + u32 instruction_sigp_stop; + u32 instruction_sigp_arch; + u32 instruction_sigp_prefix; + u32 instruction_sigp_restart; }; struct io_info { @@ -169,6 +175,10 @@ struct interrupt_info { }; }; +/* for local_interrupt.action_flags */ +#define ACTION_STORE_ON_STOP 1 +#define ACTION_STOP_ON_STOP 2 + struct local_interrupt { spinlock_t lock; struct list_head list; @@ -176,6 +186,8 @@ struct local_interrupt { struct float_interrupt *float_int; int timer_due; /* event indicator for waitqueue below */ wait_queue_head_t wq; + atomic_t *cpuflags; + unsigned int action_bits; }; struct float_interrupt { -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:05
|
From: Sheng Yang <she...@in...> MSR Bitmap controls whether the accessing of an MSR causes VM Exit. Eliminating exits on automatically saved and restored MSRs yields a small performance gain. Signed-off-by: Sheng Yang <she...@in...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/vmx.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++----- 1 files changed, 60 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index cbca46a..87eee7a 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -91,6 +91,7 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs); static struct page *vmx_io_bitmap_a; static struct page *vmx_io_bitmap_b; +static struct page *vmx_msr_bitmap; static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS); static DEFINE_SPINLOCK(vmx_vpid_lock); @@ -185,6 +186,11 @@ static inline int is_external_interrupt(u32 intr_info) == (INTR_TYPE_EXT_INTR | INTR_INFO_VALID_MASK); } +static inline int cpu_has_vmx_msr_bitmap(void) +{ + return (vmcs_config.cpu_based_exec_ctrl & CPU_BASED_USE_MSR_BITMAPS); +} + static inline int cpu_has_vmx_tpr_shadow(void) { return (vmcs_config.cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW); @@ -1001,6 +1007,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) CPU_BASED_MOV_DR_EXITING | CPU_BASED_USE_TSC_OFFSETING; opt = CPU_BASED_TPR_SHADOW | + CPU_BASED_USE_MSR_BITMAPS | CPU_BASED_ACTIVATE_SECONDARY_CONTROLS; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS, &_cpu_based_exec_control) < 0) @@ -1575,6 +1582,30 @@ static void allocate_vpid(struct vcpu_vmx *vmx) spin_unlock(&vmx_vpid_lock); } +void vmx_disable_intercept_for_msr(struct page *msr_bitmap, u32 msr) +{ + void *va; + + if (!cpu_has_vmx_msr_bitmap()) + return; + + /* + * See Intel PRM Vol. 3, 20.6.9 (MSR-Bitmap Address). Early manuals + * have the write-low and read-high bitmap offsets the wrong way round. + * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff. + */ + va = kmap(msr_bitmap); + if (msr <= 0x1fff) { + __clear_bit(msr, va + 0x000); /* read-low */ + __clear_bit(msr, va + 0x800); /* write-low */ + } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) { + msr &= 0x1fff; + __clear_bit(msr, va + 0x400); /* read-high */ + __clear_bit(msr, va + 0xc00); /* write-high */ + } + kunmap(msr_bitmap); +} + /* * Sets up the vmcs for emulated real mode. */ @@ -1592,6 +1623,9 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) vmcs_write64(IO_BITMAP_A, page_to_phys(vmx_io_bitmap_a)); vmcs_write64(IO_BITMAP_B, page_to_phys(vmx_io_bitmap_b)); + if (cpu_has_vmx_msr_bitmap()) + vmcs_write64(MSR_BITMAP, page_to_phys(vmx_msr_bitmap)); + vmcs_write64(VMCS_LINK_POINTER, -1ull); /* 22.3.1.5 */ /* Control */ @@ -2728,7 +2762,7 @@ static struct kvm_x86_ops vmx_x86_ops = { static int __init vmx_init(void) { - void *iova; + void *va; int r; vmx_io_bitmap_a = alloc_page(GFP_KERNEL | __GFP_HIGHMEM); @@ -2741,30 +2775,48 @@ static int __init vmx_init(void) goto out; } + vmx_msr_bitmap = alloc_page(GFP_KERNEL | __GFP_HIGHMEM); + if (!vmx_msr_bitmap) { + r = -ENOMEM; + goto out1; + } + /* * Allow direct access to the PC debug port (it is often used for I/O * delays, but the vmexits simply slow things down). */ - iova = kmap(vmx_io_bitmap_a); - memset(iova, 0xff, PAGE_SIZE); - clear_bit(0x80, iova); + va = kmap(vmx_io_bitmap_a); + memset(va, 0xff, PAGE_SIZE); + clear_bit(0x80, va); kunmap(vmx_io_bitmap_a); - iova = kmap(vmx_io_bitmap_b); - memset(iova, 0xff, PAGE_SIZE); + va = kmap(vmx_io_bitmap_b); + memset(va, 0xff, PAGE_SIZE); kunmap(vmx_io_bitmap_b); + va = kmap(vmx_msr_bitmap); + memset(va, 0xff, PAGE_SIZE); + kunmap(vmx_msr_bitmap); + set_bit(0, vmx_vpid_bitmap); /* 0 is reserved for host */ r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx), THIS_MODULE); if (r) - goto out1; + goto out2; + + vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_FS_BASE); + vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_GS_BASE); + vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_IA32_SYSENTER_CS); + vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_IA32_SYSENTER_ESP); + vmx_disable_intercept_for_msr(vmx_msr_bitmap, MSR_IA32_SYSENTER_EIP); if (bypass_guest_pf) kvm_mmu_set_nonpresent_ptes(~0xffeull, 0ull); return 0; +out2: + __free_page(vmx_msr_bitmap); out1: __free_page(vmx_io_bitmap_b); out: @@ -2774,6 +2826,7 @@ out: static void __exit vmx_exit(void) { + __free_page(vmx_msr_bitmap); __free_page(vmx_io_bitmap_b); __free_page(vmx_io_bitmap_a); -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:07
|
From: Izik Eidus <iz...@qu...> Allow the Linux memory manager to reclaim memory in the kvm shadow cache. Signed-off-by: Izik Eidus <iz...@qu...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/mmu.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 files changed, 56 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index c563283..1594ee0 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -1966,7 +1966,53 @@ void kvm_mmu_zap_all(struct kvm *kvm) kvm_flush_remote_tlbs(kvm); } -void kvm_mmu_module_exit(void) +void kvm_mmu_remove_one_alloc_mmu_page(struct kvm *kvm) +{ + struct kvm_mmu_page *page; + + page = container_of(kvm->arch.active_mmu_pages.prev, + struct kvm_mmu_page, link); + kvm_mmu_zap_page(kvm, page); +} + +static int mmu_shrink(int nr_to_scan, gfp_t gfp_mask) +{ + struct kvm *kvm; + struct kvm *kvm_freed = NULL; + int cache_count = 0; + + spin_lock(&kvm_lock); + + list_for_each_entry(kvm, &vm_list, vm_list) { + int npages; + + spin_lock(&kvm->mmu_lock); + npages = kvm->arch.n_alloc_mmu_pages - + kvm->arch.n_free_mmu_pages; + cache_count += npages; + if (!kvm_freed && nr_to_scan > 0 && npages > 0) { + kvm_mmu_remove_one_alloc_mmu_page(kvm); + cache_count--; + kvm_freed = kvm; + } + nr_to_scan--; + + spin_unlock(&kvm->mmu_lock); + } + if (kvm_freed) + list_move_tail(&kvm_freed->vm_list, &vm_list); + + spin_unlock(&kvm_lock); + + return cache_count; +} + +static struct shrinker mmu_shrinker = { + .shrink = mmu_shrink, + .seeks = DEFAULT_SEEKS * 10, +}; + +void mmu_destroy_caches(void) { if (pte_chain_cache) kmem_cache_destroy(pte_chain_cache); @@ -1976,6 +2022,12 @@ void kvm_mmu_module_exit(void) kmem_cache_destroy(mmu_page_header_cache); } +void kvm_mmu_module_exit(void) +{ + mmu_destroy_caches(); + unregister_shrinker(&mmu_shrinker); +} + int kvm_mmu_module_init(void) { pte_chain_cache = kmem_cache_create("kvm_pte_chain", @@ -1995,10 +2047,12 @@ int kvm_mmu_module_init(void) if (!mmu_page_header_cache) goto nomem; + register_shrinker(&mmu_shrinker); + return 0; nomem: - kvm_mmu_module_exit(); + mmu_destroy_caches(); return -ENOMEM; } -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:07
|
If the accessed bit is not set, the guest has never accessed this page (at least through this spte), so there's no need to mark the page accessed. This provides more accurate data for the eviction algortithm. Noted by Andrea Arcangeli. Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/mmu.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 1594ee0..5c4c166 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -549,7 +549,8 @@ static void rmap_remove(struct kvm *kvm, u64 *spte) return; sp = page_header(__pa(spte)); page = spte_to_page(*spte); - mark_page_accessed(page); + if (*spte & PT_ACCESSED_MASK) + mark_page_accessed(page); if (is_writeble_pte(*spte)) kvm_release_page_dirty(page); else -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:07
|
From: Heiko Carstens <hei...@de...> kvm_arch_vcpu_ioctl_run currently incorrectly always returns 0. Signed-off-by: Heiko Carstens <hei...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/s390/kvm/kvm-s390.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index d3b1de8..d966137 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -497,7 +497,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) vcpu_put(vcpu); vcpu->stat.exit_userspace++; - return 0; + return rc; } static int __guestcopy(struct kvm_vcpu *vcpu, u64 guestdest, const void *from, -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:08
|
From: Christian Borntraeger <bor...@de...> This patch implements kvm guest kernel support for paravirtualized devices and contains two parts: o a basic virtio stub using virtio_ring and external interrupts and hypercalls o full hypercall implementation in kvm_para.h Currently we dont have PCI on s390. Making virtio_pci usable for s390 seems more complicated that providing an own stub. This virtio stub is similar to the lguest one, the memory for the descriptors and the device detection is made via additional mapped memory on top of the guest storage. We use an external interrupt with extint code 0x2603 for host->guest notification. The hypercall definition uses the diag instruction for issuing a hypercall. The parameters are written in R2-R7, the hypercall number is written in R1. This is similar to the system call ABI (svc) which can use R1 for the number and R2-R6 for the parameters. Signed-off-by: Christian Borntraeger <bor...@de...> Acked-by: Martin Schwidefsky <sch...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- drivers/s390/Makefile | 2 +- drivers/s390/kvm/Makefile | 9 + drivers/s390/kvm/kvm_virtio.c | 338 +++++++++++++++++++++++++++++++++++++++++ include/asm-s390/kvm_para.h | 124 +++++++++++++++- include/asm-s390/kvm_virtio.h | 53 +++++++ 5 files changed, 523 insertions(+), 3 deletions(-) create mode 100644 drivers/s390/kvm/Makefile create mode 100644 drivers/s390/kvm/kvm_virtio.c create mode 100644 include/asm-s390/kvm_virtio.h diff --git a/drivers/s390/Makefile b/drivers/s390/Makefile index 5a88870..4f4e7cf 100644 --- a/drivers/s390/Makefile +++ b/drivers/s390/Makefile @@ -5,7 +5,7 @@ CFLAGS_sysinfo.o += -Iinclude/math-emu -Iarch/s390/math-emu -w obj-y += s390mach.o sysinfo.o s390_rdev.o -obj-y += cio/ block/ char/ crypto/ net/ scsi/ +obj-y += cio/ block/ char/ crypto/ net/ scsi/ kvm/ drivers-y += drivers/s390/built-in.o diff --git a/drivers/s390/kvm/Makefile b/drivers/s390/kvm/Makefile new file mode 100644 index 0000000..4a5ec39 --- /dev/null +++ b/drivers/s390/kvm/Makefile @@ -0,0 +1,9 @@ +# Makefile for kvm guest drivers on s390 +# +# Copyright IBM Corp. 2008 +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License (version 2 only) +# as published by the Free Software Foundation. + +obj-$(CONFIG_VIRTIO) += kvm_virtio.o diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c new file mode 100644 index 0000000..bbef376 --- /dev/null +++ b/drivers/s390/kvm/kvm_virtio.c @@ -0,0 +1,338 @@ +/* + * kvm_virtio.c - virtio for kvm on s390 + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Christian Borntraeger <bor...@de...> + */ + +#include <linux/init.h> +#include <linux/bootmem.h> +#include <linux/err.h> +#include <linux/virtio.h> +#include <linux/virtio_config.h> +#include <linux/interrupt.h> +#include <linux/virtio_ring.h> +#include <asm/io.h> +#include <asm/kvm_para.h> +#include <asm/kvm_virtio.h> +#include <asm/setup.h> +#include <asm/s390_ext.h> + +#define VIRTIO_SUBCODE_64 0x0D00 + +/* + * The pointer to our (page) of device descriptions. + */ +static void *kvm_devices; + +/* + * Unique numbering for kvm devices. + */ +static unsigned int dev_index; + +struct kvm_device { + struct virtio_device vdev; + struct kvm_device_desc *desc; +}; + +#define to_kvmdev(vd) container_of(vd, struct kvm_device, vdev) + +/* + * memory layout: + * - kvm_device_descriptor + * struct kvm_device_desc + * - configuration + * struct kvm_vqconfig + * - feature bits + * - config space + */ +static struct kvm_vqconfig *kvm_vq_config(const struct kvm_device_desc *desc) +{ + return (struct kvm_vqconfig *)(desc + 1); +} + +static u8 *kvm_vq_features(const struct kvm_device_desc *desc) +{ + return (u8 *)(kvm_vq_config(desc) + desc->num_vq); +} + +static u8 *kvm_vq_configspace(const struct kvm_device_desc *desc) +{ + return kvm_vq_features(desc) + desc->feature_len * 2; +} + +/* + * The total size of the config page used by this device (incl. desc) + */ +static unsigned desc_size(const struct kvm_device_desc *desc) +{ + return sizeof(*desc) + + desc->num_vq * sizeof(struct kvm_vqconfig) + + desc->feature_len * 2 + + desc->config_len; +} + +/* + * This tests (and acknowleges) a feature bit. + */ +static bool kvm_feature(struct virtio_device *vdev, unsigned fbit) +{ + struct kvm_device_desc *desc = to_kvmdev(vdev)->desc; + u8 *features; + + if (fbit / 8 > desc->feature_len) + return false; + + features = kvm_vq_features(desc); + if (!(features[fbit / 8] & (1 << (fbit % 8)))) + return false; + + /* + * We set the matching bit in the other half of the bitmap to tell the + * Host we want to use this feature. + */ + features[desc->feature_len + fbit / 8] |= (1 << (fbit % 8)); + return true; +} + +/* + * Reading and writing elements in config space + */ +static void kvm_get(struct virtio_device *vdev, unsigned int offset, + void *buf, unsigned len) +{ + struct kvm_device_desc *desc = to_kvmdev(vdev)->desc; + + BUG_ON(offset + len > desc->config_len); + memcpy(buf, kvm_vq_configspace(desc) + offset, len); +} + +static void kvm_set(struct virtio_device *vdev, unsigned int offset, + const void *buf, unsigned len) +{ + struct kvm_device_desc *desc = to_kvmdev(vdev)->desc; + + BUG_ON(offset + len > desc->config_len); + memcpy(kvm_vq_configspace(desc) + offset, buf, len); +} + +/* + * The operations to get and set the status word just access + * the status field of the device descriptor. set_status will also + * make a hypercall to the host, to tell about status changes + */ +static u8 kvm_get_status(struct virtio_device *vdev) +{ + return to_kvmdev(vdev)->desc->status; +} + +static void kvm_set_status(struct virtio_device *vdev, u8 status) +{ + BUG_ON(!status); + to_kvmdev(vdev)->desc->status = status; + kvm_hypercall1(KVM_S390_VIRTIO_SET_STATUS, + (unsigned long) to_kvmdev(vdev)->desc); +} + +/* + * To reset the device, we use the KVM_VIRTIO_RESET hypercall, using the + * descriptor address. The Host will zero the status and all the + * features. + */ +static void kvm_reset(struct virtio_device *vdev) +{ + kvm_hypercall1(KVM_S390_VIRTIO_RESET, + (unsigned long) to_kvmdev(vdev)->desc); +} + +/* + * When the virtio_ring code wants to notify the Host, it calls us here and we + * make a hypercall. We hand the address of the virtqueue so the Host + * knows which virtqueue we're talking about. + */ +static void kvm_notify(struct virtqueue *vq) +{ + struct kvm_vqconfig *config = vq->priv; + + kvm_hypercall1(KVM_S390_VIRTIO_NOTIFY, config->address); +} + +/* + * This routine finds the first virtqueue described in the configuration of + * this device and sets it up. + */ +static struct virtqueue *kvm_find_vq(struct virtio_device *vdev, + unsigned index, + void (*callback)(struct virtqueue *vq)) +{ + struct kvm_device *kdev = to_kvmdev(vdev); + struct kvm_vqconfig *config; + struct virtqueue *vq; + int err; + + if (index >= kdev->desc->num_vq) + return ERR_PTR(-ENOENT); + + config = kvm_vq_config(kdev->desc)+index; + + if (add_shared_memory(config->address, + vring_size(config->num, PAGE_SIZE))) { + err = -ENOMEM; + goto out; + } + + vq = vring_new_virtqueue(config->num, vdev, (void *) config->address, + kvm_notify, callback); + if (!vq) { + err = -ENOMEM; + goto unmap; + } + + /* + * register a callback token + * The host will sent this via the external interrupt parameter + */ + config->token = (u64) vq; + + vq->priv = config; + return vq; +unmap: + remove_shared_memory(config->address, vring_size(config->num, + PAGE_SIZE)); +out: + return ERR_PTR(err); +} + +static void kvm_del_vq(struct virtqueue *vq) +{ + struct kvm_vqconfig *config = vq->priv; + + vring_del_virtqueue(vq); + remove_shared_memory(config->address, + vring_size(config->num, PAGE_SIZE)); +} + +/* + * The config ops structure as defined by virtio config + */ +static struct virtio_config_ops kvm_vq_configspace_ops = { + .feature = kvm_feature, + .get = kvm_get, + .set = kvm_set, + .get_status = kvm_get_status, + .set_status = kvm_set_status, + .reset = kvm_reset, + .find_vq = kvm_find_vq, + .del_vq = kvm_del_vq, +}; + +/* + * The root device for the kvm virtio devices. + * This makes them appear as /sys/devices/kvm_s390/0,1,2 not /sys/devices/0,1,2. + */ +static struct device kvm_root = { + .parent = NULL, + .bus_id = "kvm_s390", +}; + +/* + * adds a new device and register it with virtio + * appropriate drivers are loaded by the device model + */ +static void add_kvm_device(struct kvm_device_desc *d) +{ + struct kvm_device *kdev; + + kdev = kzalloc(sizeof(*kdev), GFP_KERNEL); + if (!kdev) { + printk(KERN_EMERG "Cannot allocate kvm dev %u\n", + dev_index++); + return; + } + + kdev->vdev.dev.parent = &kvm_root; + kdev->vdev.index = dev_index++; + kdev->vdev.id.device = d->type; + kdev->vdev.config = &kvm_vq_configspace_ops; + kdev->desc = d; + + if (register_virtio_device(&kdev->vdev) != 0) { + printk(KERN_ERR "Failed to register kvm device %u\n", + kdev->vdev.index); + kfree(kdev); + } +} + +/* + * scan_devices() simply iterates through the device page. + * The type 0 is reserved to mean "end of devices". + */ +static void scan_devices(void) +{ + unsigned int i; + struct kvm_device_desc *d; + + for (i = 0; i < PAGE_SIZE; i += desc_size(d)) { + d = kvm_devices + i; + + if (d->type == 0) + break; + + add_kvm_device(d); + } +} + +/* + * we emulate the request_irq behaviour on top of s390 extints + */ +static void kvm_extint_handler(u16 code) +{ + void *data = (void *) *(long *) __LC_PFAULT_INTPARM; + u16 subcode = S390_lowcore.cpu_addr; + + if ((subcode & 0xff00) != VIRTIO_SUBCODE_64) + return; + + vring_interrupt(0, data); +} + +/* + * Init function for virtio + * devices are in a single page above top of "normal" mem + */ +static int __init kvm_devices_init(void) +{ + int rc; + + if (!MACHINE_IS_KVM) + return -ENODEV; + + rc = device_register(&kvm_root); + if (rc) { + printk(KERN_ERR "Could not register kvm_s390 root device"); + return rc; + } + + if (add_shared_memory((max_pfn) << PAGE_SHIFT, PAGE_SIZE)) { + device_unregister(&kvm_root); + return -ENOMEM; + } + + kvm_devices = (void *) (max_pfn << PAGE_SHIFT); + + ctl_set_bit(0, 9); + register_external_interrupt(0x2603, kvm_extint_handler); + + scan_devices(); + return 0; +} + +/* + * We do this after core stuff, but before the drivers. + */ +postcore_initcall(kvm_devices_init); diff --git a/include/asm-s390/kvm_para.h b/include/asm-s390/kvm_para.h index e9bd3fb..2c50379 100644 --- a/include/asm-s390/kvm_para.h +++ b/include/asm-s390/kvm_para.h @@ -14,14 +14,134 @@ #define __S390_KVM_PARA_H /* - * No hypercalls for KVM on s390 + * Hypercalls for KVM on s390. The calling convention is similar to the + * s390 ABI, so we use R2-R6 for parameters 1-5. In addition we use R1 + * as hypercall number and R7 as parameter 6. The return value is + * written to R2. We use the diagnose instruction as hypercall. To avoid + * conflicts with existing diagnoses for LPAR and z/VM, we do not use + * the instruction encoded number, but specify the number in R1 and + * use 0x500 as KVM hypercall + * + * Copyright IBM Corp. 2007,2008 + * Author(s): Christian Borntraeger <bor...@de...> + * + * This work is licensed under the terms of the GNU GPL, version 2. */ +static inline long kvm_hypercall0(unsigned long nr) +{ + register unsigned long __nr asm("1") = nr; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr): "memory", "cc"); + return __rc; +} + +static inline long kvm_hypercall1(unsigned long nr, unsigned long p1) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1) : "memory", "cc"); + return __rc; +} + +static inline long kvm_hypercall2(unsigned long nr, unsigned long p1, + unsigned long p2) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register unsigned long __p2 asm("3") = p2; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2) + : "memory", "cc"); + return __rc; +} + +static inline long kvm_hypercall3(unsigned long nr, unsigned long p1, + unsigned long p2, unsigned long p3) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register unsigned long __p2 asm("3") = p2; + register unsigned long __p3 asm("4") = p3; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2), + "d" (__p3) : "memory", "cc"); + return __rc; +} + + +static inline long kvm_hypercall4(unsigned long nr, unsigned long p1, + unsigned long p2, unsigned long p3, + unsigned long p4) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register unsigned long __p2 asm("3") = p2; + register unsigned long __p3 asm("4") = p3; + register unsigned long __p4 asm("5") = p4; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2), + "d" (__p3), "d" (__p4) : "memory", "cc"); + return __rc; +} + +static inline long kvm_hypercall5(unsigned long nr, unsigned long p1, + unsigned long p2, unsigned long p3, + unsigned long p4, unsigned long p5) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register unsigned long __p2 asm("3") = p2; + register unsigned long __p3 asm("4") = p3; + register unsigned long __p4 asm("5") = p4; + register unsigned long __p5 asm("6") = p5; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2), + "d" (__p3), "d" (__p4), "d" (__p5) : "memory", "cc"); + return __rc; +} + +static inline long kvm_hypercall6(unsigned long nr, unsigned long p1, + unsigned long p2, unsigned long p3, + unsigned long p4, unsigned long p5, + unsigned long p6) +{ + register unsigned long __nr asm("1") = nr; + register unsigned long __p1 asm("2") = p1; + register unsigned long __p2 asm("3") = p2; + register unsigned long __p3 asm("4") = p3; + register unsigned long __p4 asm("5") = p4; + register unsigned long __p5 asm("6") = p5; + register unsigned long __p6 asm("7") = p6; + register long __rc asm("2"); + + asm volatile ("diag 2,4,0x500\n" + : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2), + "d" (__p3), "d" (__p4), "d" (__p5), "d" (__p6) + : "memory", "cc"); + return __rc; +} + +/* kvm on s390 is always paravirtualization enabled */ static inline int kvm_para_available(void) { - return 0; + return 1; } +/* No feature bits are currently assigned for kvm on s390 */ static inline unsigned int kvm_arch_para_features(void) { return 0; diff --git a/include/asm-s390/kvm_virtio.h b/include/asm-s390/kvm_virtio.h new file mode 100644 index 0000000..5c871a9 --- /dev/null +++ b/include/asm-s390/kvm_virtio.h @@ -0,0 +1,53 @@ +/* + * kvm_virtio.h - definition for virtio for kvm on s390 + * + * Copyright IBM Corp. 2008 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License (version 2 only) + * as published by the Free Software Foundation. + * + * Author(s): Christian Borntraeger <bor...@de...> + */ + +#ifndef __KVM_S390_VIRTIO_H +#define __KVM_S390_VIRTIO_H + +#include <linux/types.h> + +struct kvm_device_desc { + /* The device type: console, network, disk etc. Type 0 terminates. */ + __u8 type; + /* The number of virtqueues (first in config array) */ + __u8 num_vq; + /* + * The number of bytes of feature bits. Multiply by 2: one for host + * features and one for guest acknowledgements. + */ + __u8 feature_len; + /* The number of bytes of the config array after virtqueues. */ + __u8 config_len; + /* A status byte, written by the Guest. */ + __u8 status; + __u8 config[0]; +}; + +/* + * This is how we expect the device configuration field for a virtqueue + * to be laid out in config space. + */ +struct kvm_vqconfig { + /* The token returned with an interrupt. Set by the guest */ + __u64 token; + /* The address of the virtio ring */ + __u64 address; + /* The number of entries in the virtio_ring */ + __u16 num; + +}; + +#define KVM_S390_VIRTIO_NOTIFY 0 +#define KVM_S390_VIRTIO_RESET 1 +#define KVM_S390_VIRTIO_SET_STATUS 2 + +#endif -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:11
|
From: Xiantao Zhang <xia...@in...> This interface provides more flexible functionality for smp infrastructure. Signed-off-by: Xiantao Zhang <xia...@in...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/ia64/kernel/smp.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++ include/asm-ia64/smp.h | 3 ++ 2 files changed, 85 insertions(+), 0 deletions(-) diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c index 4e446aa..9a9d4c4 100644 --- a/arch/ia64/kernel/smp.c +++ b/arch/ia64/kernel/smp.c @@ -213,6 +213,19 @@ send_IPI_allbutself (int op) * Called with preemption disabled. */ static inline void +send_IPI_mask(cpumask_t mask, int op) +{ + unsigned int cpu; + + for_each_cpu_mask(cpu, mask) { + send_IPI_single(cpu, op); + } +} + +/* + * Called with preemption disabled. + */ +static inline void send_IPI_all (int op) { int i; @@ -401,6 +414,75 @@ smp_call_function_single (int cpuid, void (*func) (void *info), void *info, int } EXPORT_SYMBOL(smp_call_function_single); +/** + * smp_call_function_mask(): Run a function on a set of other CPUs. + * <mask> The set of cpus to run on. Must not include the current cpu. + * <func> The function to run. This must be fast and non-blocking. + * <info> An arbitrary pointer to pass to the function. + * <wait> If true, wait (atomically) until function + * has completed on other CPUs. + * + * Returns 0 on success, else a negative status code. + * + * If @wait is true, then returns once @func has returned; otherwise + * it returns just before the target cpu calls @func. + * + * You must not call this function with disabled interrupts or from a + * hardware interrupt handler or from a bottom half handler. + */ +int smp_call_function_mask(cpumask_t mask, + void (*func)(void *), void *info, + int wait) +{ + struct call_data_struct data; + cpumask_t allbutself; + int cpus; + + spin_lock(&call_lock); + allbutself = cpu_online_map; + cpu_clear(smp_processor_id(), allbutself); + + cpus_and(mask, mask, allbutself); + cpus = cpus_weight(mask); + if (!cpus) { + spin_unlock(&call_lock); + return 0; + } + + /* Can deadlock when called with interrupts disabled */ + WARN_ON(irqs_disabled()); + + data.func = func; + data.info = info; + atomic_set(&data.started, 0); + data.wait = wait; + if (wait) + atomic_set(&data.finished, 0); + + call_data = &data; + mb(); /* ensure store to call_data precedes setting of IPI_CALL_FUNC*/ + + /* Send a message to other CPUs */ + if (cpus_equal(mask, allbutself)) + send_IPI_allbutself(IPI_CALL_FUNC); + else + send_IPI_mask(mask, IPI_CALL_FUNC); + + /* Wait for response */ + while (atomic_read(&data.started) != cpus) + cpu_relax(); + + if (wait) + while (atomic_read(&data.finished) != cpus) + cpu_relax(); + call_data = NULL; + + spin_unlock(&call_lock); + return 0; + +} +EXPORT_SYMBOL(smp_call_function_mask); + /* * this function sends a 'generic call function' IPI to all other CPUs * in the system. diff --git a/include/asm-ia64/smp.h b/include/asm-ia64/smp.h index 4fa733d..ec5f355 100644 --- a/include/asm-ia64/smp.h +++ b/include/asm-ia64/smp.h @@ -38,6 +38,9 @@ ia64_get_lid (void) return lid.f.id << 8 | lid.f.eid; } +extern int smp_call_function_mask(cpumask_t mask, void (*func)(void *), + void *info, int wait); + #define hard_smp_processor_id() ia64_get_lid() #ifdef CONFIG_SMP -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:11
|
From: Xiantao Zhang <xia...@in...> vtlb.c includes tlb/VHPT virtulization. Signed-off-by: Anthony Xu <ant...@in...> Signed-off-by: Xiantao Zhang <xia...@in...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/ia64/kvm/vtlb.c | 636 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 636 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/vtlb.c diff --git a/arch/ia64/kvm/vtlb.c b/arch/ia64/kvm/vtlb.c new file mode 100644 index 0000000..def4576 --- /dev/null +++ b/arch/ia64/kvm/vtlb.c @@ -0,0 +1,636 @@ +/* + * vtlb.c: guest virtual tlb handling module. + * Copyright (c) 2004, Intel Corporation. + * Yaozu Dong (Eddie Dong) <Edd...@in...> + * Xuefei Xu (Anthony Xu) <ant...@in...> + * + * Copyright (c) 2007, Intel Corporation. + * Xuefei Xu (Anthony Xu) <ant...@in...> + * Xiantao Zhang <xia...@in...> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + */ + +#include "vcpu.h" + +#include <linux/rwsem.h> + +#include <asm/tlb.h> + +/* + * Check to see if the address rid:va is translated by the TLB + */ + +static int __is_tr_translated(struct thash_data *trp, u64 rid, u64 va) +{ + return ((trp->p) && (trp->rid == rid) + && ((va-trp->vadr) < PSIZE(trp->ps))); +} + +/* + * Only for GUEST TR format. + */ +static int __is_tr_overlap(struct thash_data *trp, u64 rid, u64 sva, u64 eva) +{ + u64 sa1, ea1; + + if (!trp->p || trp->rid != rid) + return 0; + + sa1 = trp->vadr; + ea1 = sa1 + PSIZE(trp->ps) - 1; + eva -= 1; + if ((sva > ea1) || (sa1 > eva)) + return 0; + else + return 1; + +} + +void machine_tlb_purge(u64 va, u64 ps) +{ + ia64_ptcl(va, ps << 2); +} + +void local_flush_tlb_all(void) +{ + int i, j; + unsigned long flags, count0, count1; + unsigned long stride0, stride1, addr; + + addr = current_vcpu->arch.ptce_base; + count0 = current_vcpu->arch.ptce_count[0]; + count1 = current_vcpu->arch.ptce_count[1]; + stride0 = current_vcpu->arch.ptce_stride[0]; + stride1 = current_vcpu->arch.ptce_stride[1]; + + local_irq_save(flags); + for (i = 0; i < count0; ++i) { + for (j = 0; j < count1; ++j) { + ia64_ptce(addr); + addr += stride1; + } + addr += stride0; + } + local_irq_restore(flags); + ia64_srlz_i(); /* srlz.i implies srlz.d */ +} + +int vhpt_enabled(struct kvm_vcpu *vcpu, u64 vadr, enum vhpt_ref ref) +{ + union ia64_rr vrr; + union ia64_pta vpta; + struct ia64_psr vpsr; + + vpsr = *(struct ia64_psr *)&VCPU(vcpu, vpsr); + vrr.val = vcpu_get_rr(vcpu, vadr); + vpta.val = vcpu_get_pta(vcpu); + + if (vrr.ve & vpta.ve) { + switch (ref) { + case DATA_REF: + case NA_REF: + return vpsr.dt; + case INST_REF: + return vpsr.dt && vpsr.it && vpsr.ic; + case RSE_REF: + return vpsr.dt && vpsr.rt; + + } + } + return 0; +} + +struct thash_data *vsa_thash(union ia64_pta vpta, u64 va, u64 vrr, u64 *tag) +{ + u64 index, pfn, rid, pfn_bits; + + pfn_bits = vpta.size - 5 - 8; + pfn = REGION_OFFSET(va) >> _REGION_PAGE_SIZE(vrr); + rid = _REGION_ID(vrr); + index = ((rid & 0xff) << pfn_bits)|(pfn & ((1UL << pfn_bits) - 1)); + *tag = ((rid >> 8) & 0xffff) | ((pfn >> pfn_bits) << 16); + + return (struct thash_data *)((vpta.base << PTA_BASE_SHIFT) + + (index << 5)); +} + +struct thash_data *__vtr_lookup(struct kvm_vcpu *vcpu, u64 va, int type) +{ + + struct thash_data *trp; + int i; + u64 rid; + + rid = vcpu_get_rr(vcpu, va); + rid = rid & RR_RID_MASK;; + if (type == D_TLB) { + if (vcpu_quick_region_check(vcpu->arch.dtr_regions, va)) { + for (trp = (struct thash_data *)&vcpu->arch.dtrs, i = 0; + i < NDTRS; i++, trp++) { + if (__is_tr_translated(trp, rid, va)) + return trp; + } + } + } else { + if (vcpu_quick_region_check(vcpu->arch.itr_regions, va)) { + for (trp = (struct thash_data *)&vcpu->arch.itrs, i = 0; + i < NITRS; i++, trp++) { + if (__is_tr_translated(trp, rid, va)) + return trp; + } + } + } + + return NULL; +} + +static void vhpt_insert(u64 pte, u64 itir, u64 ifa, u64 gpte) +{ + union ia64_rr rr; + struct thash_data *head; + unsigned long ps, gpaddr; + + ps = itir_ps(itir); + + gpaddr = ((gpte & _PAGE_PPN_MASK) >> ps << ps) | + (ifa & ((1UL << ps) - 1)); + + rr.val = ia64_get_rr(ifa); + head = (struct thash_data *)ia64_thash(ifa); + head->etag = INVALID_TI_TAG; + ia64_mf(); + head->page_flags = pte & ~PAGE_FLAGS_RV_MASK; + head->itir = rr.ps << 2; + head->etag = ia64_ttag(ifa); + head->gpaddr = gpaddr; +} + +void mark_pages_dirty(struct kvm_vcpu *v, u64 pte, u64 ps) +{ + u64 i, dirty_pages = 1; + u64 base_gfn = (pte&_PAGE_PPN_MASK) >> PAGE_SHIFT; + spinlock_t *lock = __kvm_va(v->arch.dirty_log_lock_pa); + void *dirty_bitmap = (void *)v - (KVM_VCPU_OFS + v->vcpu_id * VCPU_SIZE) + + KVM_MEM_DIRTY_LOG_OFS; + dirty_pages <<= ps <= PAGE_SHIFT ? 0 : ps - PAGE_SHIFT; + + vmm_spin_lock(lock); + for (i = 0; i < dirty_pages; i++) { + /* avoid RMW */ + if (!test_bit(base_gfn + i, dirty_bitmap)) + set_bit(base_gfn + i , dirty_bitmap); + } + vmm_spin_unlock(lock); +} + +void thash_vhpt_insert(struct kvm_vcpu *v, u64 pte, u64 itir, u64 va, int type) +{ + u64 phy_pte, psr; + union ia64_rr mrr; + + mrr.val = ia64_get_rr(va); + phy_pte = translate_phy_pte(&pte, itir, va); + + if (itir_ps(itir) >= mrr.ps) { + vhpt_insert(phy_pte, itir, va, pte); + } else { + phy_pte &= ~PAGE_FLAGS_RV_MASK; + psr = ia64_clear_ic(); + ia64_itc(type, va, phy_pte, itir_ps(itir)); + ia64_set_psr(psr); + } + + if (!(pte&VTLB_PTE_IO)) + mark_pages_dirty(v, pte, itir_ps(itir)); +} + +/* + * vhpt lookup + */ +struct thash_data *vhpt_lookup(u64 va) +{ + struct thash_data *head; + u64 tag; + + head = (struct thash_data *)ia64_thash(va); + tag = ia64_ttag(va); + if (head->etag == tag) + return head; + return NULL; +} + +u64 guest_vhpt_lookup(u64 iha, u64 *pte) +{ + u64 ret; + struct thash_data *data; + + data = __vtr_lookup(current_vcpu, iha, D_TLB); + if (data != NULL) + thash_vhpt_insert(current_vcpu, data->page_flags, + data->itir, iha, D_TLB); + + asm volatile ("rsm psr.ic|psr.i;;" + "srlz.d;;" + "ld8.s r9=[%1];;" + "tnat.nz p6,p7=r9;;" + "(p6) mov %0=1;" + "(p6) mov r9=r0;" + "(p7) extr.u r9=r9,0,53;;" + "(p7) mov %0=r0;" + "(p7) st8 [%2]=r9;;" + "ssm psr.ic;;" + "srlz.d;;" + /* "ssm psr.i;;" Once interrupts in vmm open, need fix*/ + : "=r"(ret) : "r"(iha), "r"(pte):"memory"); + + return ret; +} + +/* + * purge software guest tlb + */ + +static void vtlb_purge(struct kvm_vcpu *v, u64 va, u64 ps) +{ + struct thash_data *cur; + u64 start, curadr, size, psbits, tag, rr_ps, num; + union ia64_rr vrr; + struct thash_cb *hcb = &v->arch.vtlb; + + vrr.val = vcpu_get_rr(v, va); + psbits = VMX(v, psbits[(va >> 61)]); + start = va & ~((1UL << ps) - 1); + while (psbits) { + curadr = start; + rr_ps = __ffs(psbits); + psbits &= ~(1UL << rr_ps); + num = 1UL << ((ps < rr_ps) ? 0 : (ps - rr_ps)); + size = PSIZE(rr_ps); + vrr.ps = rr_ps; + while (num) { + cur = vsa_thash(hcb->pta, curadr, vrr.val, &tag); + if (cur->etag == tag && cur->ps == rr_ps) + cur->etag = INVALID_TI_TAG; + curadr += size; + num--; + } + } +} + + +/* + * purge VHPT and machine TLB + */ +static void vhpt_purge(struct kvm_vcpu *v, u64 va, u64 ps) +{ + struct thash_data *cur; + u64 start, size, tag, num; + union ia64_rr rr; + + start = va & ~((1UL << ps) - 1); + rr.val = ia64_get_rr(va); + size = PSIZE(rr.ps); + num = 1UL << ((ps < rr.ps) ? 0 : (ps - rr.ps)); + while (num) { + cur = (struct thash_data *)ia64_thash(start); + tag = ia64_ttag(start); + if (cur->etag == tag) + cur->etag = INVALID_TI_TAG; + start += size; + num--; + } + machine_tlb_purge(va, ps); +} + +/* + * Insert an entry into hash TLB or VHPT. + * NOTES: + * 1: When inserting VHPT to thash, "va" is a must covered + * address by the inserted machine VHPT entry. + * 2: The format of entry is always in TLB. + * 3: The caller need to make sure the new entry will not overlap + * with any existed entry. + */ +void vtlb_insert(struct kvm_vcpu *v, u64 pte, u64 itir, u64 va) +{ + struct thash_data *head; + union ia64_rr vrr; + u64 tag; + struct thash_cb *hcb = &v->arch.vtlb; + + vrr.val = vcpu_get_rr(v, va); + vrr.ps = itir_ps(itir); + VMX(v, psbits[va >> 61]) |= (1UL << vrr.ps); + head = vsa_thash(hcb->pta, va, vrr.val, &tag); + head->page_flags = pte; + head->itir = itir; + head->etag = tag; +} + +int vtr_find_overlap(struct kvm_vcpu *vcpu, u64 va, u64 ps, int type) +{ + struct thash_data *trp; + int i; + u64 end, rid; + + rid = vcpu_get_rr(vcpu, va); + rid = rid & RR_RID_MASK; + end = va + PSIZE(ps); + if (type == D_TLB) { + if (vcpu_quick_region_check(vcpu->arch.dtr_regions, va)) { + for (trp = (struct thash_data *)&vcpu->arch.dtrs, i = 0; + i < NDTRS; i++, trp++) { + if (__is_tr_overlap(trp, rid, va, end)) + return i; + } + } + } else { + if (vcpu_quick_region_check(vcpu->arch.itr_regions, va)) { + for (trp = (struct thash_data *)&vcpu->arch.itrs, i = 0; + i < NITRS; i++, trp++) { + if (__is_tr_overlap(trp, rid, va, end)) + return i; + } + } + } + return -1; +} + +/* + * Purge entries in VTLB and VHPT + */ +void thash_purge_entries(struct kvm_vcpu *v, u64 va, u64 ps) +{ + if (vcpu_quick_region_check(v->arch.tc_regions, va)) + vtlb_purge(v, va, ps); + vhpt_purge(v, va, ps); +} + +void thash_purge_entries_remote(struct kvm_vcpu *v, u64 va, u64 ps) +{ + u64 old_va = va; + va = REGION_OFFSET(va); + if (vcpu_quick_region_check(v->arch.tc_regions, old_va)) + vtlb_purge(v, va, ps); + vhpt_purge(v, va, ps); +} + +u64 translate_phy_pte(u64 *pte, u64 itir, u64 va) +{ + u64 ps, ps_mask, paddr, maddr; + union pte_flags phy_pte; + + ps = itir_ps(itir); + ps_mask = ~((1UL << ps) - 1); + phy_pte.val = *pte; + paddr = *pte; + paddr = ((paddr & _PAGE_PPN_MASK) & ps_mask) | (va & ~ps_mask); + maddr = kvm_lookup_mpa(paddr >> PAGE_SHIFT); + if (maddr & GPFN_IO_MASK) { + *pte |= VTLB_PTE_IO; + return -1; + } + maddr = ((maddr & _PAGE_PPN_MASK) & PAGE_MASK) | + (paddr & ~PAGE_MASK); + phy_pte.ppn = maddr >> ARCH_PAGE_SHIFT; + return phy_pte.val; +} + +/* + * Purge overlap TCs and then insert the new entry to emulate itc ops. + * Notes: Only TC entry can purge and insert. + * 1 indicates this is MMIO + */ +int thash_purge_and_insert(struct kvm_vcpu *v, u64 pte, u64 itir, + u64 ifa, int type) +{ + u64 ps; + u64 phy_pte; + union ia64_rr vrr, mrr; + int ret = 0; + + ps = itir_ps(itir); + vrr.val = vcpu_get_rr(v, ifa); + mrr.val = ia64_get_rr(ifa); + + phy_pte = translate_phy_pte(&pte, itir, ifa); + + /* Ensure WB attribute if pte is related to a normal mem page, + * which is required by vga acceleration since qemu maps shared + * vram buffer with WB. + */ + if (!(pte & VTLB_PTE_IO) && ((pte & _PAGE_MA_MASK) != _PAGE_MA_NAT)) { + pte &= ~_PAGE_MA_MASK; + phy_pte &= ~_PAGE_MA_MASK; + } + + if (pte & VTLB_PTE_IO) + ret = 1; + + vtlb_purge(v, ifa, ps); + vhpt_purge(v, ifa, ps); + + if (ps == mrr.ps) { + if (!(pte&VTLB_PTE_IO)) { + vhpt_insert(phy_pte, itir, ifa, pte); + } else { + vtlb_insert(v, pte, itir, ifa); + vcpu_quick_region_set(VMX(v, tc_regions), ifa); + } + } else if (ps > mrr.ps) { + vtlb_insert(v, pte, itir, ifa); + vcpu_quick_region_set(VMX(v, tc_regions), ifa); + if (!(pte&VTLB_PTE_IO)) + vhpt_insert(phy_pte, itir, ifa, pte); + } else { + u64 psr; + phy_pte &= ~PAGE_FLAGS_RV_MASK; + psr = ia64_clear_ic(); + ia64_itc(type, ifa, phy_pte, ps); + ia64_set_psr(psr); + } + if (!(pte&VTLB_PTE_IO)) + mark_pages_dirty(v, pte, ps); + + return ret; +} + +/* + * Purge all TCs or VHPT entries including those in Hash table. + * + */ + +void thash_purge_all(struct kvm_vcpu *v) +{ + int i; + struct thash_data *head; + struct thash_cb *vtlb, *vhpt; + vtlb = &v->arch.vtlb; + vhpt = &v->arch.vhpt; + + for (i = 0; i < 8; i++) + VMX(v, psbits[i]) = 0; + + head = vtlb->hash; + for (i = 0; i < vtlb->num; i++) { + head->page_flags = 0; + head->etag = INVALID_TI_TAG; + head->itir = 0; + head->next = 0; + head++; + }; + + head = vhpt->hash; + for (i = 0; i < vhpt->num; i++) { + head->page_flags = 0; + head->etag = INVALID_TI_TAG; + head->itir = 0; + head->next = 0; + head++; + }; + + local_flush_tlb_all(); +} + + +/* + * Lookup the hash table and its collision chain to find an entry + * covering this address rid:va or the entry. + * + * INPUT: + * in: TLB format for both VHPT & TLB. + */ + +struct thash_data *vtlb_lookup(struct kvm_vcpu *v, u64 va, int is_data) +{ + struct thash_data *cch; + u64 psbits, ps, tag; + union ia64_rr vrr; + + struct thash_cb *hcb = &v->arch.vtlb; + + cch = __vtr_lookup(v, va, is_data);; + if (cch) + return cch; + + if (vcpu_quick_region_check(v->arch.tc_regions, va) == 0) + return NULL; + + psbits = VMX(v, psbits[(va >> 61)]); + vrr.val = vcpu_get_rr(v, va); + while (psbits) { + ps = __ffs(psbits); + psbits &= ~(1UL << ps); + vrr.ps = ps; + cch = vsa_thash(hcb->pta, va, vrr.val, &tag); + if (cch->etag == tag && cch->ps == ps) + return cch; + } + + return NULL; +} + + +/* + * Initialize internal control data before service. + */ +void thash_init(struct thash_cb *hcb, u64 sz) +{ + int i; + struct thash_data *head; + + hcb->pta.val = (unsigned long)hcb->hash; + hcb->pta.vf = 1; + hcb->pta.ve = 1; + hcb->pta.size = sz; + head = hcb->hash; + for (i = 0; i < hcb->num; i++) { + head->page_flags = 0; + head->itir = 0; + head->etag = INVALID_TI_TAG; + head->next = 0; + head++; + } +} + +u64 kvm_lookup_mpa(u64 gpfn) +{ + u64 *base = (u64 *) KVM_P2M_BASE; + return *(base + gpfn); +} + +u64 kvm_gpa_to_mpa(u64 gpa) +{ + u64 pte = kvm_lookup_mpa(gpa >> PAGE_SHIFT); + return (pte >> PAGE_SHIFT << PAGE_SHIFT) | (gpa & ~PAGE_MASK); +} + + +/* + * Fetch guest bundle code. + * INPUT: + * gip: guest ip + * pbundle: used to return fetched bundle. + */ +int fetch_code(struct kvm_vcpu *vcpu, u64 gip, IA64_BUNDLE *pbundle) +{ + u64 gpip = 0; /* guest physical IP*/ + u64 *vpa; + struct thash_data *tlb; + u64 maddr; + + if (!(VCPU(vcpu, vpsr) & IA64_PSR_IT)) { + /* I-side physical mode */ + gpip = gip; + } else { + tlb = vtlb_lookup(vcpu, gip, I_TLB); + if (tlb) + gpip = (tlb->ppn >> (tlb->ps - 12) << tlb->ps) | + (gip & (PSIZE(tlb->ps) - 1)); + } + if (gpip) { + maddr = kvm_gpa_to_mpa(gpip); + } else { + tlb = vhpt_lookup(gip); + if (tlb == NULL) { + ia64_ptcl(gip, ARCH_PAGE_SHIFT << 2); + return IA64_FAULT; + } + maddr = (tlb->ppn >> (tlb->ps - 12) << tlb->ps) + | (gip & (PSIZE(tlb->ps) - 1)); + } + vpa = (u64 *)__kvm_va(maddr); + + pbundle->i64[0] = *vpa++; + pbundle->i64[1] = *vpa; + + return IA64_NO_FAULT; +} + + +void kvm_init_vhpt(struct kvm_vcpu *v) +{ + v->arch.vhpt.num = VHPT_NUM_ENTRIES; + thash_init(&v->arch.vhpt, VHPT_SHIFT); + ia64_set_pta(v->arch.vhpt.pta.val); + /*Enable VHPT here?*/ +} + +void kvm_init_vtlb(struct kvm_vcpu *v) +{ + v->arch.vtlb.num = VTLB_NUM_ENTRIES; + thash_init(&v->arch.vtlb, VTLB_SHIFT); +} -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:12
|
From: Heiko Carstens <hei...@de...> There is no need to use interlocked updates when the rcp lock is held. Therefore the simple bitops variants can be used. This should improve performance. Signed-off-by: Heiko Carstens <hei...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- include/asm-s390/pgtable.h | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/asm-s390/pgtable.h b/include/asm-s390/pgtable.h index 7fe5c4b..4c0698c 100644 --- a/include/asm-s390/pgtable.h +++ b/include/asm-s390/pgtable.h @@ -553,12 +553,12 @@ static inline void ptep_rcp_copy(pte_t *ptep) skey = page_get_storage_key(page_to_phys(page)); if (skey & _PAGE_CHANGED) - set_bit(RCP_GC_BIT, pgste); + set_bit_simple(RCP_GC_BIT, pgste); if (skey & _PAGE_REFERENCED) - set_bit(RCP_GR_BIT, pgste); - if (test_and_clear_bit(RCP_HC_BIT, pgste)) + set_bit_simple(RCP_GR_BIT, pgste); + if (test_and_clear_bit_simple(RCP_HC_BIT, pgste)) SetPageDirty(page); - if (test_and_clear_bit(RCP_HR_BIT, pgste)) + if (test_and_clear_bit_simple(RCP_HR_BIT, pgste)) SetPageReferenced(page); #endif } @@ -732,8 +732,8 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, young = ((page_get_storage_key(physpage) & _PAGE_REFERENCED) != 0); rcp_lock(ptep); if (young) - set_bit(RCP_GR_BIT, pgste); - young |= test_and_clear_bit(RCP_HR_BIT, pgste); + set_bit_simple(RCP_GR_BIT, pgste); + young |= test_and_clear_bit_simple(RCP_HR_BIT, pgste); rcp_unlock(ptep); return young; #endif -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:12
|
From: Izik Eidus <iz...@qu...> the main purpose of adding this functions is the abilaty to release the spinlock that protect the kvm list while still be able to do operations on a specific kvm in a safe way. Signed-off-by: Izik Eidus <iz...@qu...> Signed-off-by: Avi Kivity <av...@qu...> --- include/linux/kvm_host.h | 4 ++++ virt/kvm/kvm_main.c | 17 ++++++++++++++++- 2 files changed, 20 insertions(+), 1 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f4e1436..a2ceb51 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -114,6 +114,7 @@ struct kvm { struct kvm_io_bus pio_bus; struct kvm_vm_stat stat; struct kvm_arch arch; + atomic_t users_count; }; /* The guest did something we don't support. */ @@ -140,6 +141,9 @@ int kvm_init(void *opaque, unsigned int vcpu_size, struct module *module); void kvm_exit(void); +void kvm_get_kvm(struct kvm *kvm); +void kvm_put_kvm(struct kvm *kvm); + #define HPA_MSB ((sizeof(hpa_t) * 8) - 1) #define HPA_ERR_MASK ((hpa_t)1 << HPA_MSB) static inline int is_error_hpa(hpa_t hpa) { return hpa >> HPA_MSB; } diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 41d4b65..93ed78b 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -193,6 +193,7 @@ static struct kvm *kvm_create_vm(void) mutex_init(&kvm->lock); kvm_io_bus_init(&kvm->mmio_bus); init_rwsem(&kvm->slots_lock); + atomic_set(&kvm->users_count, 1); spin_lock(&kvm_lock); list_add(&kvm->vm_list, &vm_list); spin_unlock(&kvm_lock); @@ -242,11 +243,25 @@ static void kvm_destroy_vm(struct kvm *kvm) mmdrop(mm); } +void kvm_get_kvm(struct kvm *kvm) +{ + atomic_inc(&kvm->users_count); +} +EXPORT_SYMBOL_GPL(kvm_get_kvm); + +void kvm_put_kvm(struct kvm *kvm) +{ + if (atomic_dec_and_test(&kvm->users_count)) + kvm_destroy_vm(kvm); +} +EXPORT_SYMBOL_GPL(kvm_put_kvm); + + static int kvm_vm_release(struct inode *inode, struct file *filp) { struct kvm *kvm = filp->private_data; - kvm_destroy_vm(kvm); + kvm_put_kvm(kvm); return 0; } -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:17
|
Noticed by Marcelo Tosatti. Signed-off-by: Avi Kivity <av...@qu...> --- arch/x86/kvm/x86.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e6a38bf..c7ad235 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3853,6 +3853,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kfree(kvm->arch.vioapic); kvm_free_vcpus(kvm); kvm_free_physmem(kvm); + if (kvm->arch.apic_access_page) + put_page(kvm->arch.apic_access_page); kfree(kvm); } -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:17
|
From: Xiantao Zhang <xia...@in...> Some sal/pal calls would be traped to kvm for virtulization from guest firmware. Signed-off-by: Xiantao Zhang <xia...@in...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/ia64/kvm/kvm_fw.c | 500 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 500 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/kvm_fw.c diff --git a/arch/ia64/kvm/kvm_fw.c b/arch/ia64/kvm/kvm_fw.c new file mode 100644 index 0000000..091f936 --- /dev/null +++ b/arch/ia64/kvm/kvm_fw.c @@ -0,0 +1,500 @@ +/* + * PAL/SAL call delegation + * + * Copyright (c) 2004 Li Susie <sus...@in...> + * Copyright (c) 2005 Yu Ke <ke...@in...> + * Copyright (c) 2007 Xiantao Zhang <xia...@in...> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/kvm_host.h> +#include <linux/smp.h> + +#include "vti.h" +#include "misc.h" + +#include <asm/pal.h> +#include <asm/sal.h> +#include <asm/tlb.h> + +/* + * Handy macros to make sure that the PAL return values start out + * as something meaningful. + */ +#define INIT_PAL_STATUS_UNIMPLEMENTED(x) \ + { \ + x.status = PAL_STATUS_UNIMPLEMENTED; \ + x.v0 = 0; \ + x.v1 = 0; \ + x.v2 = 0; \ + } + +#define INIT_PAL_STATUS_SUCCESS(x) \ + { \ + x.status = PAL_STATUS_SUCCESS; \ + x.v0 = 0; \ + x.v1 = 0; \ + x.v2 = 0; \ + } + +static void kvm_get_pal_call_data(struct kvm_vcpu *vcpu, + u64 *gr28, u64 *gr29, u64 *gr30, u64 *gr31) { + struct exit_ctl_data *p; + + if (vcpu) { + p = &vcpu->arch.exit_data; + if (p->exit_reason == EXIT_REASON_PAL_CALL) { + *gr28 = p->u.pal_data.gr28; + *gr29 = p->u.pal_data.gr29; + *gr30 = p->u.pal_data.gr30; + *gr31 = p->u.pal_data.gr31; + return ; + } + } + printk(KERN_DEBUG"Failed to get vcpu pal data!!!\n"); +} + +static void set_pal_result(struct kvm_vcpu *vcpu, + struct ia64_pal_retval result) { + + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + if (p && p->exit_reason == EXIT_REASON_PAL_CALL) { + p->u.pal_data.ret = result; + return ; + } + INIT_PAL_STATUS_UNIMPLEMENTED(p->u.pal_data.ret); +} + +static void set_sal_result(struct kvm_vcpu *vcpu, + struct sal_ret_values result) { + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + if (p && p->exit_reason == EXIT_REASON_SAL_CALL) { + p->u.sal_data.ret = result; + return ; + } + printk(KERN_WARNING"Failed to set sal result!!\n"); +} + +struct cache_flush_args { + u64 cache_type; + u64 operation; + u64 progress; + long status; +}; + +cpumask_t cpu_cache_coherent_map; + +static void remote_pal_cache_flush(void *data) +{ + struct cache_flush_args *args = data; + long status; + u64 progress = args->progress; + + status = ia64_pal_cache_flush(args->cache_type, args->operation, + &progress, NULL); + if (status != 0) + args->status = status; +} + +static struct ia64_pal_retval pal_cache_flush(struct kvm_vcpu *vcpu) +{ + u64 gr28, gr29, gr30, gr31; + struct ia64_pal_retval result = {0, 0, 0, 0}; + struct cache_flush_args args = {0, 0, 0, 0}; + long psr; + + gr28 = gr29 = gr30 = gr31 = 0; + kvm_get_pal_call_data(vcpu, &gr28, &gr29, &gr30, &gr31); + + if (gr31 != 0) + printk(KERN_ERR"vcpu:%p called cache_flush error!\n", vcpu); + + /* Always call Host Pal in int=1 */ + gr30 &= ~PAL_CACHE_FLUSH_CHK_INTRS; + args.cache_type = gr29; + args.operation = gr30; + smp_call_function(remote_pal_cache_flush, + (void *)&args, 1, 1); + if (args.status != 0) + printk(KERN_ERR"pal_cache_flush error!," + "status:0x%lx\n", args.status); + /* + * Call Host PAL cache flush + * Clear psr.ic when call PAL_CACHE_FLUSH + */ + local_irq_save(psr); + result.status = ia64_pal_cache_flush(gr29, gr30, &result.v1, + &result.v0); + local_irq_restore(psr); + if (result.status != 0) + printk(KERN_ERR"vcpu:%p crashed due to cache_flush err:%ld" + "in1:%lx,in2:%lx\n", + vcpu, result.status, gr29, gr30); + +#if 0 + if (gr29 == PAL_CACHE_TYPE_COHERENT) { + cpus_setall(vcpu->arch.cache_coherent_map); + cpu_clear(vcpu->cpu, vcpu->arch.cache_coherent_map); + cpus_setall(cpu_cache_coherent_map); + cpu_clear(vcpu->cpu, cpu_cache_coherent_map); + } +#endif + return result; +} + +struct ia64_pal_retval pal_cache_summary(struct kvm_vcpu *vcpu) +{ + + struct ia64_pal_retval result; + + PAL_CALL(result, PAL_CACHE_SUMMARY, 0, 0, 0); + return result; +} + +static struct ia64_pal_retval pal_freq_base(struct kvm_vcpu *vcpu) +{ + + struct ia64_pal_retval result; + + PAL_CALL(result, PAL_FREQ_BASE, 0, 0, 0); + + /* + * PAL_FREQ_BASE may not be implemented in some platforms, + * call SAL instead. + */ + if (result.v0 == 0) { + result.status = ia64_sal_freq_base(SAL_FREQ_BASE_PLATFORM, + &result.v0, + &result.v1); + result.v2 = 0; + } + + return result; +} + +static struct ia64_pal_retval pal_freq_ratios(struct kvm_vcpu *vcpu) +{ + + struct ia64_pal_retval result; + + PAL_CALL(result, PAL_FREQ_RATIOS, 0, 0, 0); + return result; +} + +static struct ia64_pal_retval pal_logical_to_physica(struct kvm_vcpu *vcpu) +{ + struct ia64_pal_retval result; + + INIT_PAL_STATUS_UNIMPLEMENTED(result); + return result; +} + +static struct ia64_pal_retval pal_platform_addr(struct kvm_vcpu *vcpu) +{ + + struct ia64_pal_retval result; + + INIT_PAL_STATUS_SUCCESS(result); + return result; +} + +static struct ia64_pal_retval pal_proc_get_features(struct kvm_vcpu *vcpu) +{ + + struct ia64_pal_retval result = {0, 0, 0, 0}; + long in0, in1, in2, in3; + + kvm_get_pal_call_data(vcpu, &in0, &in1, &in2, &in3); + result.status = ia64_pal_proc_get_features(&result.v0, &result.v1, + &result.v2, in2); + + return result; +} + +static struct ia64_pal_retval pal_cache_info(struct kvm_vcpu *vcpu) +{ + + pal_cache_config_info_t ci; + long status; + unsigned long in0, in1, in2, in3, r9, r10; + + kvm_get_pal_call_data(vcpu, &in0, &in1, &in2, &in3); + status = ia64_pal_cache_config_info(in1, in2, &ci); + r9 = ci.pcci_info_1.pcci1_data; + r10 = ci.pcci_info_2.pcci2_data; + return ((struct ia64_pal_retval){status, r9, r10, 0}); +} + +#define GUEST_IMPL_VA_MSB 59 +#define GUEST_RID_BITS 18 + +static struct ia64_pal_retval pal_vm_summary(struct kvm_vcpu *vcpu) +{ + + pal_vm_info_1_u_t vminfo1; + pal_vm_info_2_u_t vminfo2; + struct ia64_pal_retval result; + + PAL_CALL(result, PAL_VM_SUMMARY, 0, 0, 0); + if (!result.status) { + vminfo1.pvi1_val = result.v0; + vminfo1.pal_vm_info_1_s.max_itr_entry = 8; + vminfo1.pal_vm_info_1_s.max_dtr_entry = 8; + result.v0 = vminfo1.pvi1_val; + vminfo2.pal_vm_info_2_s.impl_va_msb = GUEST_IMPL_VA_MSB; + vminfo2.pal_vm_info_2_s.rid_size = GUEST_RID_BITS; + result.v1 = vminfo2.pvi2_val; + } + + return result; +} + +static struct ia64_pal_retval pal_vm_info(struct kvm_vcpu *vcpu) +{ + struct ia64_pal_retval result; + + INIT_PAL_STATUS_UNIMPLEMENTED(result); + + return result; +} + +static u64 kvm_get_pal_call_index(struct kvm_vcpu *vcpu) +{ + u64 index = 0; + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + if (p && (p->exit_reason == EXIT_REASON_PAL_CALL)) + index = p->u.pal_data.gr28; + + return index; +} + +int kvm_pal_emul(struct kvm_vcpu *vcpu, struct kvm_run *run) +{ + + u64 gr28; + struct ia64_pal_retval result; + int ret = 1; + + gr28 = kvm_get_pal_call_index(vcpu); + /*printk("pal_call index:%lx\n",gr28);*/ + switch (gr28) { + case PAL_CACHE_FLUSH: + result = pal_cache_flush(vcpu); + break; + case PAL_CACHE_SUMMARY: + result = pal_cache_summary(vcpu); + break; + case PAL_HALT_LIGHT: + { + vcpu->arch.timer_pending = 1; + INIT_PAL_STATUS_SUCCESS(result); + if (kvm_highest_pending_irq(vcpu) == -1) + ret = kvm_emulate_halt(vcpu); + + } + break; + + case PAL_FREQ_RATIOS: + result = pal_freq_ratios(vcpu); + break; + + case PAL_FREQ_BASE: + result = pal_freq_base(vcpu); + break; + + case PAL_LOGICAL_TO_PHYSICAL : + result = pal_logical_to_physica(vcpu); + break; + + case PAL_VM_SUMMARY : + result = pal_vm_summary(vcpu); + break; + + case PAL_VM_INFO : + result = pal_vm_info(vcpu); + break; + case PAL_PLATFORM_ADDR : + result = pal_platform_addr(vcpu); + break; + case PAL_CACHE_INFO: + result = pal_cache_info(vcpu); + break; + case PAL_PTCE_INFO: + INIT_PAL_STATUS_SUCCESS(result); + result.v1 = (1L << 32) | 1L; + break; + case PAL_VM_PAGE_SIZE: + result.status = ia64_pal_vm_page_size(&result.v0, + &result.v1); + break; + case PAL_RSE_INFO: + result.status = ia64_pal_rse_info(&result.v0, + (pal_hints_u_t *)&result.v1); + break; + case PAL_PROC_GET_FEATURES: + result = pal_proc_get_features(vcpu); + break; + case PAL_DEBUG_INFO: + result.status = ia64_pal_debug_info(&result.v0, + &result.v1); + break; + case PAL_VERSION: + result.status = ia64_pal_version( + (pal_version_u_t *)&result.v0, + (pal_version_u_t *)&result.v1); + + break; + case PAL_FIXED_ADDR: + result.status = PAL_STATUS_SUCCESS; + result.v0 = vcpu->vcpu_id; + break; + default: + INIT_PAL_STATUS_UNIMPLEMENTED(result); + printk(KERN_WARNING"kvm: Unsupported pal call," + " index:0x%lx\n", gr28); + } + set_pal_result(vcpu, result); + return ret; +} + +static struct sal_ret_values sal_emulator(struct kvm *kvm, + long index, unsigned long in1, + unsigned long in2, unsigned long in3, + unsigned long in4, unsigned long in5, + unsigned long in6, unsigned long in7) +{ + unsigned long r9 = 0; + unsigned long r10 = 0; + long r11 = 0; + long status; + + status = 0; + switch (index) { + case SAL_FREQ_BASE: + status = ia64_sal_freq_base(in1, &r9, &r10); + break; + case SAL_PCI_CONFIG_READ: + printk(KERN_WARNING"kvm: Not allowed to call here!" + " SAL_PCI_CONFIG_READ\n"); + break; + case SAL_PCI_CONFIG_WRITE: + printk(KERN_WARNING"kvm: Not allowed to call here!" + " SAL_PCI_CONFIG_WRITE\n"); + break; + case SAL_SET_VECTORS: + if (in1 == SAL_VECTOR_OS_BOOT_RENDEZ) { + if (in4 != 0 || in5 != 0 || in6 != 0 || in7 != 0) { + status = -2; + } else { + kvm->arch.rdv_sal_data.boot_ip = in2; + kvm->arch.rdv_sal_data.boot_gp = in3; + } + printk("Rendvous called! iip:%lx\n\n", in2); + } else + printk(KERN_WARNING"kvm: CALLED SAL_SET_VECTORS %lu." + "ignored...\n", in1); + break; + case SAL_GET_STATE_INFO: + /* No more info. */ + status = -5; + r9 = 0; + break; + case SAL_GET_STATE_INFO_SIZE: + /* Return a dummy size. */ + status = 0; + r9 = 128; + break; + case SAL_CLEAR_STATE_INFO: + /* Noop. */ + break; + case SAL_MC_RENDEZ: + printk(KERN_WARNING + "kvm: called SAL_MC_RENDEZ. ignored...\n"); + break; + case SAL_MC_SET_PARAMS: + printk(KERN_WARNING + "kvm: called SAL_MC_SET_PARAMS.ignored!\n"); + break; + case SAL_CACHE_FLUSH: + if (1) { + /*Flush using SAL. + This method is faster but has a side + effect on other vcpu running on + this cpu. */ + status = ia64_sal_cache_flush(in1); + } else { + /*Maybe need to implement the method + without side effect!*/ + status = 0; + } + break; + case SAL_CACHE_INIT: + printk(KERN_WARNING + "kvm: called SAL_CACHE_INIT. ignored...\n"); + break; + case SAL_UPDATE_PAL: + printk(KERN_WARNING + "kvm: CALLED SAL_UPDATE_PAL. ignored...\n"); + break; + default: + printk(KERN_WARNING"kvm: called SAL_CALL with unknown index." + " index:%ld\n", index); + status = -1; + break; + } + return ((struct sal_ret_values) {status, r9, r10, r11}); +} + +static void kvm_get_sal_call_data(struct kvm_vcpu *vcpu, u64 *in0, u64 *in1, + u64 *in2, u64 *in3, u64 *in4, u64 *in5, u64 *in6, u64 *in7){ + + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + + if (p) { + if (p->exit_reason == EXIT_REASON_SAL_CALL) { + *in0 = p->u.sal_data.in0; + *in1 = p->u.sal_data.in1; + *in2 = p->u.sal_data.in2; + *in3 = p->u.sal_data.in3; + *in4 = p->u.sal_data.in4; + *in5 = p->u.sal_data.in5; + *in6 = p->u.sal_data.in6; + *in7 = p->u.sal_data.in7; + return ; + } + } + *in0 = 0; +} + +void kvm_sal_emul(struct kvm_vcpu *vcpu) +{ + + struct sal_ret_values result; + u64 index, in1, in2, in3, in4, in5, in6, in7; + + kvm_get_sal_call_data(vcpu, &index, &in1, &in2, + &in3, &in4, &in5, &in6, &in7); + result = sal_emulator(vcpu->kvm, index, in1, in2, in3, + in4, in5, in6, in7); + set_sal_result(vcpu, result); +} -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:16
|
From: Xiantao Zhang <xia...@in...> trampoline code targets for guest/host world switch. Signed-off-by: Xiantao Zhang <xia...@in...> Signed-off-by: Avi Kivity <av...@qu...> --- arch/ia64/kvm/trampoline.S | 1038 ++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 1038 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/trampoline.S diff --git a/arch/ia64/kvm/trampoline.S b/arch/ia64/kvm/trampoline.S new file mode 100644 index 0000000..30897d4 --- /dev/null +++ b/arch/ia64/kvm/trampoline.S @@ -0,0 +1,1038 @@ +/* Save all processor states + * + * Copyright (c) 2007 Fleming Feng <fle...@in...> + * Copyright (c) 2007 Anthony Xu <ant...@in...> + */ + +#include <asm/asmmacro.h> +#include "asm-offsets.h" + + +#define CTX(name) VMM_CTX_##name##_OFFSET + + /* + * r32: context_t base address + */ +#define SAVE_BRANCH_REGS \ + add r2 = CTX(B0),r32; \ + add r3 = CTX(B1),r32; \ + mov r16 = b0; \ + mov r17 = b1; \ + ;; \ + st8 [r2]=r16,16; \ + st8 [r3]=r17,16; \ + ;; \ + mov r16 = b2; \ + mov r17 = b3; \ + ;; \ + st8 [r2]=r16,16; \ + st8 [r3]=r17,16; \ + ;; \ + mov r16 = b4; \ + mov r17 = b5; \ + ;; \ + st8 [r2]=r16; \ + st8 [r3]=r17; \ + ;; + + /* + * r33: context_t base address + */ +#define RESTORE_BRANCH_REGS \ + add r2 = CTX(B0),r33; \ + add r3 = CTX(B1),r33; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov b0 = r16; \ + mov b1 = r17; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov b2 = r16; \ + mov b3 = r17; \ + ;; \ + ld8 r16=[r2]; \ + ld8 r17=[r3]; \ + ;; \ + mov b4=r16; \ + mov b5=r17; \ + ;; + + + /* + * r32: context_t base address + * bsw == 1 + * Save all bank1 general registers, r4 ~ r7 + */ +#define SAVE_GENERAL_REGS \ + add r2=CTX(R4),r32; \ + add r3=CTX(R5),r32; \ + ;; \ +.mem.offset 0,0; \ + st8.spill [r2]=r4,16; \ +.mem.offset 8,0; \ + st8.spill [r3]=r5,16; \ + ;; \ +.mem.offset 0,0; \ + st8.spill [r2]=r6,48; \ +.mem.offset 8,0; \ + st8.spill [r3]=r7,48; \ + ;; \ +.mem.offset 0,0; \ + st8.spill [r2]=r12; \ +.mem.offset 8,0; \ + st8.spill [r3]=r13; \ + ;; + + /* + * r33: context_t base address + * bsw == 1 + */ +#define RESTORE_GENERAL_REGS \ + add r2=CTX(R4),r33; \ + add r3=CTX(R5),r33; \ + ;; \ + ld8.fill r4=[r2],16; \ + ld8.fill r5=[r3],16; \ + ;; \ + ld8.fill r6=[r2],48; \ + ld8.fill r7=[r3],48; \ + ;; \ + ld8.fill r12=[r2]; \ + ld8.fill r13 =[r3]; \ + ;; + + + + + /* + * r32: context_t base address + */ +#define SAVE_KERNEL_REGS \ + add r2 = CTX(KR0),r32; \ + add r3 = CTX(KR1),r32; \ + mov r16 = ar.k0; \ + mov r17 = ar.k1; \ + ;; \ + st8 [r2] = r16,16; \ + st8 [r3] = r17,16; \ + ;; \ + mov r16 = ar.k2; \ + mov r17 = ar.k3; \ + ;; \ + st8 [r2] = r16,16; \ + st8 [r3] = r17,16; \ + ;; \ + mov r16 = ar.k4; \ + mov r17 = ar.k5; \ + ;; \ + st8 [r2] = r16,16; \ + st8 [r3] = r17,16; \ + ;; \ + mov r16 = ar.k6; \ + mov r17 = ar.k7; \ + ;; \ + st8 [r2] = r16; \ + st8 [r3] = r17; \ + ;; + + + + /* + * r33: context_t base address + */ +#define RESTORE_KERNEL_REGS \ + add r2 = CTX(KR0),r33; \ + add r3 = CTX(KR1),r33; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov ar.k0=r16; \ + mov ar.k1=r17; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov ar.k2=r16; \ + mov ar.k3=r17; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov ar.k4=r16; \ + mov ar.k5=r17; \ + ;; \ + ld8 r16=[r2],16; \ + ld8 r17=[r3],16; \ + ;; \ + mov ar.k6=r16; \ + mov ar.k7=r17; \ + ;; + + + + /* + * r32: context_t base address + */ +#define SAVE_APP_REGS \ + add r2 = CTX(BSPSTORE),r32; \ + mov r16 = ar.bspstore; \ + ;; \ + st8 [r2] = r16,CTX(RNAT)-CTX(BSPSTORE);\ + mov r16 = ar.rnat; \ + ;; \ + st8 [r2] = r16,CTX(FCR)-CTX(RNAT); \ + mov r16 = ar.fcr; \ + ;; \ + st8 [r2] = r16,CTX(EFLAG)-CTX(FCR); \ + mov r16 = ar.eflag; \ + ;; \ + st8 [r2] = r16,CTX(CFLG)-CTX(EFLAG); \ + mov r16 = ar.cflg; \ + ;; \ + st8 [r2] = r16,CTX(FSR)-CTX(CFLG); \ + mov r16 = ar.fsr; \ + ;; \ + st8 [r2] = r16,CTX(FIR)-CTX(FSR); \ + mov r16 = ar.fir; \ + ;; \ + st8 [r2] = r16,CTX(FDR)-CTX(FIR); \ + mov r16 = ar.fdr; \ + ;; \ + st8 [r2] = r16,CTX(UNAT)-CTX(FDR); \ + mov r16 = ar.unat; \ + ;; \ + st8 [r2] = r16,CTX(FPSR)-CTX(UNAT); \ + mov r16 = ar.fpsr; \ + ;; \ + st8 [r2] = r16,CTX(PFS)-CTX(FPSR); \ + mov r16 = ar.pfs; \ + ;; \ + st8 [r2] = r16,CTX(LC)-CTX(PFS); \ + mov r16 = ar.lc; \ + ;; \ + st8 [r2] = r16; \ + ;; + + /* + * r33: context_t base address + */ +#define RESTORE_APP_REGS \ + add r2=CTX(BSPSTORE),r33; \ + ;; \ + ld8 r16=[r2],CTX(RNAT)-CTX(BSPSTORE); \ + ;; \ + mov ar.bspstore=r16; \ + ld8 r16=[r2],CTX(FCR)-CTX(RNAT); \ + ;; \ + mov ar.rnat=r16; \ + ld8 r16=[r2],CTX(EFLAG)-CTX(FCR); \ + ;; \ + mov ar.fcr=r16; \ + ld8 r16=[r2],CTX(CFLG)-CTX(EFLAG); \ + ;; \ + mov ar.eflag=r16; \ + ld8 r16=[r2],CTX(FSR)-CTX(CFLG); \ + ;; \ + mov ar.cflg=r16; \ + ld8 r16=[r2],CTX(FIR)-CTX(FSR); \ + ;; \ + mov ar.fsr=r16; \ + ld8 r16=[r2],CTX(FDR)-CTX(FIR); \ + ;; \ + mov ar.fir=r16; \ + ld8 r16=[r2],CTX(UNAT)-CTX(FDR); \ + ;; \ + mov ar.fdr=r16; \ + ld8 r16=[r2],CTX(FPSR)-CTX(UNAT); \ + ;; \ + mov ar.unat=r16; \ + ld8 r16=[r2],CTX(PFS)-CTX(FPSR); \ + ;; \ + mov ar.fpsr=r16; \ + ld8 r16=[r2],CTX(LC)-CTX(PFS); \ + ;; \ + mov ar.pfs=r16; \ + ld8 r16=[r2]; \ + ;; \ + mov ar.lc=r16; \ + ;; + + /* + * r32: context_t base address + */ +#define SAVE_CTL_REGS \ + add r2 = CTX(DCR),r32; \ + mov r16 = cr.dcr; \ + ;; \ + st8 [r2] = r16,CTX(IVA)-CTX(DCR); \ + ;; \ + mov r16 = cr.iva; \ + ;; \ + st8 [r2] = r16,CTX(PTA)-CTX(IVA); \ + ;; \ + mov r16 = cr.pta; \ + ;; \ + st8 [r2] = r16 ; \ + ;; + + /* + * r33: context_t base address + */ +#define RESTORE_CTL_REGS \ + add r2 = CTX(DCR),r33; \ + ;; \ + ld8 r16 = [r2],CTX(IVA)-CTX(DCR); \ + ;; \ + mov cr.dcr = r16; \ + dv_serialize_data; \ + ;; \ + ld8 r16 = [r2],CTX(PTA)-CTX(IVA); \ + ;; \ + mov cr.iva = r16; \ + dv_serialize_data; \ + ;; \ + ld8 r16 = [r2]; \ + ;; \ + mov cr.pta = r16; \ + dv_serialize_data; \ + ;; + + + /* + * r32: context_t base address + */ +#define SAVE_REGION_REGS \ + add r2=CTX(RR0),r32; \ + mov r16=rr[r0]; \ + dep.z r18=1,61,3; \ + ;; \ + st8 [r2]=r16,8; \ + mov r17=rr[r18]; \ + dep.z r18=2,61,3; \ + ;; \ + st8 [r2]=r17,8; \ + mov r16=rr[r18]; \ + dep.z r18=3,61,3; \ + ;; \ + st8 [r2]=r16,8; \ + mov r17=rr[r18]; \ + dep.z r18=4,61,3; \ + ;; \ + st8 [r2]=r17,8; \ + mov r16=rr[r18]; \ + dep.z r18=5,61,3; \ + ;; \ + st8 [r2]=r16,8; \ + mov r17=rr[r18]; \ + dep.z r18=7,61,3; \ + ;; \ + st8 [r2]=r17,16; \ + mov r16=rr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + ;; + + /* + * r33:context_t base address + */ +#define RESTORE_REGION_REGS \ + add r2=CTX(RR0),r33;\ + mov r18=r0; \ + ;; \ + ld8 r20=[r2],8; \ + ;; /* rr0 */ \ + ld8 r21=[r2],8; \ + ;; /* rr1 */ \ + ld8 r22=[r2],8; \ + ;; /* rr2 */ \ + ld8 r23=[r2],8; \ + ;; /* rr3 */ \ + ld8 r24=[r2],8; \ + ;; /* rr4 */ \ + ld8 r25=[r2],16; \ + ;; /* rr5 */ \ + ld8 r27=[r2]; \ + ;; /* rr7 */ \ + mov rr[r18]=r20; \ + dep.z r18=1,61,3; \ + ;; /* rr1 */ \ + mov rr[r18]=r21; \ + dep.z r18=2,61,3; \ + ;; /* rr2 */ \ + mov rr[r18]=r22; \ + dep.z r18=3,61,3; \ + ;; /* rr3 */ \ + mov rr[r18]=r23; \ + dep.z r18=4,61,3; \ + ;; /* rr4 */ \ + mov rr[r18]=r24; \ + dep.z r18=5,61,3; \ + ;; /* rr5 */ \ + mov rr[r18]=r25; \ + dep.z r18=7,61,3; \ + ;; /* rr7 */ \ + mov rr[r18]=r27; \ + ;; \ + srlz.i; \ + ;; + + + + /* + * r32: context_t base address + * r36~r39:scratch registers + */ +#define SAVE_DEBUG_REGS \ + add r2=CTX(IBR0),r32; \ + add r3=CTX(DBR0),r32; \ + mov r16=ibr[r0]; \ + mov r17=dbr[r0]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=1,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=2,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=2,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=3,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=4,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=5,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=6,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + add r18=7,r0; \ + ;; \ + mov r16=ibr[r18]; \ + mov r17=dbr[r18]; \ + ;; \ + st8 [r2]=r16,8; \ + st8 [r3]=r17,8; \ + ;; + + +/* + * r33: point to context_t structure + * ar.lc are corrupted. + */ +#define RESTORE_DEBUG_REGS \ + add r2=CTX(IBR0),r33; \ + add r3=CTX(DBR0),r33; \ + mov r16=7; \ + mov r17=r0; \ + ;; \ + mov ar.lc = r16; \ + ;; \ +1: \ + ld8 r18=[r2],8; \ + ld8 r19=[r3],8; \ + ;; \ + mov ibr[r17]=r18; \ + mov dbr[r17]=r19; \ + ;; \ + srlz.i; \ + ;; \ + add r17=1,r17; \ + br.cloop.sptk 1b; \ + ;; + + + /* + * r32: context_t base address + */ +#define SAVE_FPU_LOW \ + add r2=CTX(F2),r32; \ + add r3=CTX(F3),r32; \ + ;; \ + stf.spill.nta [r2]=f2,32; \ + stf.spill.nta [r3]=f3,32; \ + ;; \ + stf.spill.nta [r2]=f4,32; \ + stf.spill.nta [r3]=f5,32; \ + ;; \ + stf.spill.nta [r2]=f6,32; \ + stf.spill.nta [r3]=f7,32; \ + ;; \ + stf.spill.nta [r2]=f8,32; \ + stf.spill.nta [r3]=f9,32; \ + ;; \ + stf.spill.nta [r2]=f10,32; \ + stf.spill.nta [r3]=f11,32; \ + ;; \ + stf.spill.nta [r2]=f12,32; \ + stf.spill.nta [r3]=f13,32; \ + ;; \ + stf.spill.nta [r2]=f14,32; \ + stf.spill.nta [r3]=f15,32; \ + ;; \ + stf.spill.nta [r2]=f16,32; \ + stf.spill.nta [r3]=f17,32; \ + ;; \ + stf.spill.nta [r2]=f18,32; \ + stf.spill.nta [r3]=f19,32; \ + ;; \ + stf.spill.nta [r2]=f20,32; \ + stf.spill.nta [r3]=f21,32; \ + ;; \ + stf.spill.nta [r2]=f22,32; \ + stf.spill.nta [r3]=f23,32; \ + ;; \ + stf.spill.nta [r2]=f24,32; \ + stf.spill.nta [r3]=f25,32; \ + ;; \ + stf.spill.nta [r2]=f26,32; \ + stf.spill.nta [r3]=f27,32; \ + ;; \ + stf.spill.nta [r2]=f28,32; \ + stf.spill.nta [r3]=f29,32; \ + ;; \ + stf.spill.nta [r2]=f30; \ + stf.spill.nta [r3]=f31; \ + ;; + + /* + * r32: context_t base address + */ +#define SAVE_FPU_HIGH \ + add r2=CTX(F32),r32; \ + add r3=CTX(F33),r32; \ + ;; \ + stf.spill.nta [r2]=f32,32; \ + stf.spill.nta [r3]=f33,32; \ + ;; \ + stf.spill.nta [r2]=f34,32; \ + stf.spill.nta [r3]=f35,32; \ + ;; \ + stf.spill.nta [r2]=f36,32; \ + stf.spill.nta [r3]=f37,32; \ + ;; \ + stf.spill.nta [r2]=f38,32; \ + stf.spill.nta [r3]=f39,32; \ + ;; \ + stf.spill.nta [r2]=f40,32; \ + stf.spill.nta [r3]=f41,32; \ + ;; \ + stf.spill.nta [r2]=f42,32; \ + stf.spill.nta [r3]=f43,32; \ + ;; \ + stf.spill.nta [r2]=f44,32; \ + stf.spill.nta [r3]=f45,32; \ + ;; \ + stf.spill.nta [r2]=f46,32; \ + stf.spill.nta [r3]=f47,32; \ + ;; \ + stf.spill.nta [r2]=f48,32; \ + stf.spill.nta [r3]=f49,32; \ + ;; \ + stf.spill.nta [r2]=f50,32; \ + stf.spill.nta [r3]=f51,32; \ + ;; \ + stf.spill.nta [r2]=f52,32; \ + stf.spill.nta [r3]=f53,32; \ + ;; \ + stf.spill.nta [r2]=f54,32; \ + stf.spill.nta [r3]=f55,32; \ + ;; \ + stf.spill.nta [r2]=f56,32; \ + stf.spill.nta [r3]=f57,32; \ + ;; \ + stf.spill.nta [r2]=f58,32; \ + stf.spill.nta [r3]=f59,32; \ + ;; \ + stf.spill.nta [r2]=f60,32; \ + stf.spill.nta [r3]=f61,32; \ + ;; \ + stf.spill.nta [r2]=f62,32; \ + stf.spill.nta [r3]=f63,32; \ + ;; \ + stf.spill.nta [r2]=f64,32; \ + stf.spill.nta [r3]=f65,32; \ + ;; \ + stf.spill.nta [r2]=f66,32; \ + stf.spill.nta [r3]=f67,32; \ + ;; \ + stf.spill.nta [r2]=f68,32; \ + stf.spill.nta [r3]=f69,32; \ + ;; \ + stf.spill.nta [r2]=f70,32; \ + stf.spill.nta [r3]=f71,32; \ + ;; \ + stf.spill.nta [r2]=f72,32; \ + stf.spill.nta [r3]=f73,32; \ + ;; \ + stf.spill.nta [r2]=f74,32; \ + stf.spill.nta [r3]=f75,32; \ + ;; \ + stf.spill.nta [r2]=f76,32; \ + stf.spill.nta [r3]=f77,32; \ + ;; \ + stf.spill.nta [r2]=f78,32; \ + stf.spill.nta [r3]=f79,32; \ + ;; \ + stf.spill.nta [r2]=f80,32; \ + stf.spill.nta [r3]=f81,32; \ + ;; \ + stf.spill.nta [r2]=f82,32; \ + stf.spill.nta [r3]=f83,32; \ + ;; \ + stf.spill.nta [r2]=f84,32; \ + stf.spill.nta [r3]=f85,32; \ + ;; \ + stf.spill.nta [r2]=f86,32; \ + stf.spill.nta [r3]=f87,32; \ + ;; \ + stf.spill.nta [r2]=f88,32; \ + stf.spill.nta [r3]=f89,32; \ + ;; \ + stf.spill.nta [r2]=f90,32; \ + stf.spill.nta [r3]=f91,32; \ + ;; \ + stf.spill.nta [r2]=f92,32; \ + stf.spill.nta [r3]=f93,32; \ + ;; \ + stf.spill.nta [r2]=f94,32; \ + stf.spill.nta [r3]=f95,32; \ + ;; \ + stf.spill.nta [r2]=f96,32; \ + stf.spill.nta [r3]=f97,32; \ + ;; \ + stf.spill.nta [r2]=f98,32; \ + stf.spill.nta [r3]=f99,32; \ + ;; \ + stf.spill.nta [r2]=f100,32; \ + stf.spill.nta [r3]=f101,32; \ + ;; \ + stf.spill.nta [r2]=f102,32; \ + stf.spill.nta [r3]=f103,32; \ + ;; \ + stf.spill.nta [r2]=f104,32; \ + stf.spill.nta [r3]=f105,32; \ + ;; \ + stf.spill.nta [r2]=f106,32; \ + stf.spill.nta [r3]=f107,32; \ + ;; \ + stf.spill.nta [r2]=f108,32; \ + stf.spill.nta [r3]=f109,32; \ + ;; \ + stf.spill.nta [r2]=f110,32; \ + stf.spill.nta [r3]=f111,32; \ + ;; \ + stf.spill.nta [r2]=f112,32; \ + stf.spill.nta [r3]=f113,32; \ + ;; \ + stf.spill.nta [r2]=f114,32; \ + stf.spill.nta [r3]=f115,32; \ + ;; \ + stf.spill.nta [r2]=f116,32; \ + stf.spill.nta [r3]=f117,32; \ + ;; \ + stf.spill.nta [r2]=f118,32; \ + stf.spill.nta [r3]=f119,32; \ + ;; \ + stf.spill.nta [r2]=f120,32; \ + stf.spill.nta [r3]=f121,32; \ + ;; \ + stf.spill.nta [r2]=f122,32; \ + stf.spill.nta [r3]=f123,32; \ + ;; \ + stf.spill.nta [r2]=f124,32; \ + stf.spill.nta [r3]=f125,32; \ + ;; \ + stf.spill.nta [r2]=f126; \ + stf.spill.nta [r3]=f127; \ + ;; + + /* + * r33: point to context_t structure + */ +#define RESTORE_FPU_LOW \ + add r2 = CTX(F2), r33; \ + add r3 = CTX(F3), r33; \ + ;; \ + ldf.fill.nta f2 = [r2], 32; \ + ldf.fill.nta f3 = [r3], 32; \ + ;; \ + ldf.fill.nta f4 = [r2], 32; \ + ldf.fill.nta f5 = [r3], 32; \ + ;; \ + ldf.fill.nta f6 = [r2], 32; \ + ldf.fill.nta f7 = [r3], 32; \ + ;; \ + ldf.fill.nta f8 = [r2], 32; \ + ldf.fill.nta f9 = [r3], 32; \ + ;; \ + ldf.fill.nta f10 = [r2], 32; \ + ldf.fill.nta f11 = [r3], 32; \ + ;; \ + ldf.fill.nta f12 = [r2], 32; \ + ldf.fill.nta f13 = [r3], 32; \ + ;; \ + ldf.fill.nta f14 = [r2], 32; \ + ldf.fill.nta f15 = [r3], 32; \ + ;; \ + ldf.fill.nta f16 = [r2], 32; \ + ldf.fill.nta f17 = [r3], 32; \ + ;; \ + ldf.fill.nta f18 = [r2], 32; \ + ldf.fill.nta f19 = [r3], 32; \ + ;; \ + ldf.fill.nta f20 = [r2], 32; \ + ldf.fill.nta f21 = [r3], 32; \ + ;; \ + ldf.fill.nta f22 = [r2], 32; \ + ldf.fill.nta f23 = [r3], 32; \ + ;; \ + ldf.fill.nta f24 = [r2], 32; \ + ldf.fill.nta f25 = [r3], 32; \ + ;; \ + ldf.fill.nta f26 = [r2], 32; \ + ldf.fill.nta f27 = [r3], 32; \ + ;; \ + ldf.fill.nta f28 = [r2], 32; \ + ldf.fill.nta f29 = [r3], 32; \ + ;; \ + ldf.fill.nta f30 = [r2], 32; \ + ldf.fill.nta f31 = [r3], 32; \ + ;; + + + + /* + * r33: point to context_t structure + */ +#define RESTORE_FPU_HIGH \ + add r2 = CTX(F32), r33; \ + add r3 = CTX(F33), r33; \ + ;; \ + ldf.fill.nta f32 = [r2], 32; \ + ldf.fill.nta f33 = [r3], 32; \ + ;; \ + ldf.fill.nta f34 = [r2], 32; \ + ldf.fill.nta f35 = [r3], 32; \ + ;; \ + ldf.fill.nta f36 = [r2], 32; \ + ldf.fill.nta f37 = [r3], 32; \ + ;; \ + ldf.fill.nta f38 = [r2], 32; \ + ldf.fill.nta f39 = [r3], 32; \ + ;; \ + ldf.fill.nta f40 = [r2], 32; \ + ldf.fill.nta f41 = [r3], 32; \ + ;; \ + ldf.fill.nta f42 = [r2], 32; \ + ldf.fill.nta f43 = [r3], 32; \ + ;; \ + ldf.fill.nta f44 = [r2], 32; \ + ldf.fill.nta f45 = [r3], 32; \ + ;; \ + ldf.fill.nta f46 = [r2], 32; \ + ldf.fill.nta f47 = [r3], 32; \ + ;; \ + ldf.fill.nta f48 = [r2], 32; \ + ldf.fill.nta f49 = [r3], 32; \ + ;; \ + ldf.fill.nta f50 = [r2], 32; \ + ldf.fill.nta f51 = [r3], 32; \ + ;; \ + ldf.fill.nta f52 = [r2], 32; \ + ldf.fill.nta f53 = [r3], 32; \ + ;; \ + ldf.fill.nta f54 = [r2], 32; \ + ldf.fill.nta f55 = [r3], 32; \ + ;; \ + ldf.fill.nta f56 = [r2], 32; \ + ldf.fill.nta f57 = [r3], 32; \ + ;; \ + ldf.fill.nta f58 = [r2], 32; \ + ldf.fill.nta f59 = [r3], 32; \ + ;; \ + ldf.fill.nta f60 = [r2], 32; \ + ldf.fill.nta f61 = [r3], 32; \ + ;; \ + ldf.fill.nta f62 = [r2], 32; \ + ldf.fill.nta f63 = [r3], 32; \ + ;; \ + ldf.fill.nta f64 = [r2], 32; \ + ldf.fill.nta f65 = [r3], 32; \ + ;; \ + ldf.fill.nta f66 = [r2], 32; \ + ldf.fill.nta f67 = [r3], 32; \ + ;; \ + ldf.fill.nta f68 = [r2], 32; \ + ldf.fill.nta f69 = [r3], 32; \ + ;; \ + ldf.fill.nta f70 = [r2], 32; \ + ldf.fill.nta f71 = [r3], 32; \ + ;; \ + ldf.fill.nta f72 = [r2], 32; \ + ldf.fill.nta f73 = [r3], 32; \ + ;; \ + ldf.fill.nta f74 = [r2], 32; \ + ldf.fill.nta f75 = [r3], 32; \ + ;; \ + ldf.fill.nta f76 = [r2], 32; \ + ldf.fill.nta f77 = [r3], 32; \ + ;; \ + ldf.fill.nta f78 = [r2], 32; \ + ldf.fill.nta f79 = [r3], 32; \ + ;; \ + ldf.fill.nta f80 = [r2], 32; \ + ldf.fill.nta f81 = [r3], 32; \ + ;; \ + ldf.fill.nta f82 = [r2], 32; \ + ldf.fill.nta f83 = [r3], 32; \ + ;; \ + ldf.fill.nta f84 = [r2], 32; \ + ldf.fill.nta f85 = [r3], 32; \ + ;; \ + ldf.fill.nta f86 = [r2], 32; \ + ldf.fill.nta f87 = [r3], 32; \ + ;; \ + ldf.fill.nta f88 = [r2], 32; \ + ldf.fill.nta f89 = [r3], 32; \ + ;; \ + ldf.fill.nta f90 = [r2], 32; \ + ldf.fill.nta f91 = [r3], 32; \ + ;; \ + ldf.fill.nta f92 = [r2], 32; \ + ldf.fill.nta f93 = [r3], 32; \ + ;; \ + ldf.fill.nta f94 = [r2], 32; \ + ldf.fill.nta f95 = [r3], 32; \ + ;; \ + ldf.fill.nta f96 = [r2], 32; \ + ldf.fill.nta f97 = [r3], 32; \ + ;; \ + ldf.fill.nta f98 = [r2], 32; \ + ldf.fill.nta f99 = [r3], 32; \ + ;; \ + ldf.fill.nta f100 = [r2], 32; \ + ldf.fill.nta f101 = [r3], 32; \ + ;; \ + ldf.fill.nta f102 = [r2], 32; \ + ldf.fill.nta f103 = [r3], 32; \ + ;; \ + ldf.fill.nta f104 = [r2], 32; \ + ldf.fill.nta f105 = [r3], 32; \ + ;; \ + ldf.fill.nta f106 = [r2], 32; \ + ldf.fill.nta f107 = [r3], 32; \ + ;; \ + ldf.fill.nta f108 = [r2], 32; \ + ldf.fill.nta f109 = [r3], 32; \ + ;; \ + ldf.fill.nta f110 = [r2], 32; \ + ldf.fill.nta f111 = [r3], 32; \ + ;; \ + ldf.fill.nta f112 = [r2], 32; \ + ldf.fill.nta f113 = [r3], 32; \ + ;; \ + ldf.fill.nta f114 = [r2], 32; \ + ldf.fill.nta f115 = [r3], 32; \ + ;; \ + ldf.fill.nta f116 = [r2], 32; \ + ldf.fill.nta f117 = [r3], 32; \ + ;; \ + ldf.fill.nta f118 = [r2], 32; \ + ldf.fill.nta f119 = [r3], 32; \ + ;; \ + ldf.fill.nta f120 = [r2], 32; \ + ldf.fill.nta f121 = [r3], 32; \ + ;; \ + ldf.fill.nta f122 = [r2], 32; \ + ldf.fill.nta f123 = [r3], 32; \ + ;; \ + ldf.fill.nta f124 = [r2], 32; \ + ldf.fill.nta f125 = [r3], 32; \ + ;; \ + ldf.fill.nta f126 = [r2], 32; \ + ldf.fill.nta f127 = [r3], 32; \ + ;; + + /* + * r32: context_t base address + */ +#define SAVE_PTK_REGS \ + add r2=CTX(PKR0), r32; \ + mov r16=7; \ + ;; \ + mov ar.lc=r16; \ + mov r17=r0; \ + ;; \ +1: \ + mov r18=pkr[r17]; \ + ;; \ + srlz.i; \ + ;; \ + st8 [r2]=r18, 8; \ + ;; \ + add r17 =1,r17; \ + ;; \ + br.cloop.sptk 1b; \ + ;; + +/* + * r33: point to context_t structure + * ar.lc are corrupted. + */ +#define RESTORE_PTK_REGS \ + add r2=CTX(PKR0), r33; \ + mov r16=7; \ + ;; \ + mov ar.lc=r16; \ + mov r17=r0; \ + ;; \ +1: \ + ld8 r18=[r2], 8; \ + ;; \ + mov pkr[r17]=r18; \ + ;; \ + srlz.i; \ + ;; \ + add r17 =1,r17; \ + ;; \ + br.cloop.sptk 1b; \ + ;; + + +/* + * void vmm_trampoline( context_t * from, + * context_t * to) + * + * from: r32 + * to: r33 + * note: interrupt disabled before call this function. + */ +GLOBAL_ENTRY(vmm_trampoline) + mov r16 = psr + adds r2 = CTX(PSR), r32 + ;; + st8 [r2] = r16, 8 // psr + mov r17 = pr + ;; + st8 [r2] = r17, 8 // pr + mov r18 = ar.unat + ;; + st8 [r2] = r18 + mov r17 = ar.rsc + ;; + adds r2 = CTX(RSC),r32 + ;; + st8 [r2]= r17 + mov ar.rsc =0 + flushrs + ;; + SAVE_GENERAL_REGS + ;; + SAVE_KERNEL_REGS + ;; + SAVE_APP_REGS + ;; + SAVE_BRANCH_REGS + ;; + SAVE_CTL_REGS + ;; + SAVE_REGION_REGS + ;; + //SAVE_DEBUG_REGS + ;; + rsm psr.dfl + ;; + srlz.d + ;; + SAVE_FPU_LOW + ;; + rsm psr.dfh + ;; + srlz.d + ;; + SAVE_FPU_HIGH + ;; + SAVE_PTK_REGS + ;; + RESTORE_PTK_REGS + ;; + RESTORE_FPU_HIGH + ;; + RESTORE_FPU_LOW + ;; + //RESTORE_DEBUG_REGS + ;; + RESTORE_REGION_REGS + ;; + RESTORE_CTL_REGS + ;; + RESTORE_BRANCH_REGS + ;; + RESTORE_APP_REGS + ;; + RESTORE_KERNEL_REGS + ;; + RESTORE_GENERAL_REGS + ;; + adds r2=CTX(PSR), r33 + ;; + ld8 r16=[r2], 8 // psr + ;; + mov psr.l=r16 + ;; + srlz.d + ;; + ld8 r16=[r2], 8 // pr + ;; + mov pr =r16,-1 + ld8 r16=[r2] // unat + ;; + mov ar.unat=r16 + ;; + adds r2=CTX(RSC),r33 + ;; + ld8 r16 =[r2] + ;; + mov ar.rsc = r16 + ;; + br.ret.sptk.few b0 +END(vmm_trampoline) -- 1.5.5 |
From: Avi K. <av...@qu...> - 2008-04-17 09:11:17
|
From: Christian Borntraeger <bor...@de...> This patch adds an entry for kvm on s390 to the MAINTAINERS file :-). We intend to push all patches regarding this via Avi's kvm.git. Signed-off-by: Christian Borntraeger <bor...@de...> Signed-off-by: Carsten Otte <co...@de...> Signed-off-by: Avi Kivity <av...@qu...> --- MAINTAINERS | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index e467758..27e42b9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2311,6 +2311,16 @@ L: kvm...@li... W: kvm.sourceforge.net S: Supported +KERNEL VIRTUAL MACHINE for s390 (KVM/s390) +P: Carsten Otte +M: co...@de... +P: Christian Borntraeger +M: bor...@de... +M: lin...@de... +L: lin...@vg... +W: http://www.ibm.com/developerworks/linux/linux390/ +S: Supported + KEXEC P: Eric Biederman M: ebi...@xm... -- 1.5.5 |