You can subscribe to this list here.
2007 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(22) |
Sep
(45) |
Oct
(165) |
Nov
(149) |
Dec
(53) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2008 |
Jan
(155) |
Feb
(71) |
Mar
(219) |
Apr
(262) |
May
(21) |
Jun
(5) |
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
From: Jerone Y. <jy...@us...> - 2008-04-10 21:08:59
|
1 file changed, 5 insertions(+) qemu/qemu-kvm.c | 5 +++++ This patch adds a call to load_kvm_registers after creation of vcpu. This is required for ppc since we are required to set certain registers before boot. This should not have any effect on the curren x86 code (though I need to test this to make sure). What I would like though are some comments on the fix. Is this the right place for this? We had this in our platform setup code, but with recent code changes it will not work there anymore). Signed-off-by: Jerone Young <jy...@us...> diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -353,6 +353,11 @@ static void *ap_main_loop(void *_env) sigdelset(&signals, SIG_IPI); sigprocmask(SIG_BLOCK, &signals, NULL); kvm_create_vcpu(kvm_context, env->cpu_index); + if (env->cpu_index == 0) { + /* load any registers set in env into + kvm for the first guest vcpu */ + kvm_load_registers(env); + } kvm_qemu_init_env(env); if (kvm_irqchip_in_kernel(kvm_context)) env->hflags &= ~HF_HALTED_MASK; |
From: Jerone Y. <jy...@us...> - 2008-04-10 21:07:35
|
1 file changed, 3 deletions(-) qemu/hw/ppc440_bamboo.c | 3 --- This patch removes the call to kvm_load_registers while in board platform setup code. This must now be done later in vcpu initialization. Signed-off-by: Jerone Young <jy...@us...> diff --git a/qemu/hw/ppc440_bamboo.c b/qemu/hw/ppc440_bamboo.c --- a/qemu/hw/ppc440_bamboo.c +++ b/qemu/hw/ppc440_bamboo.c @@ -174,9 +174,6 @@ void bamboo_init(ram_addr_t ram_size, in env->gpr[3] = dt_base; #endif env->nip = ep; - - printf("%s: loading kvm registers\n", __func__); - kvm_load_registers(env); } for (i = 0; i < nb_nics; i++) { |
From: Josh B. <jw...@li...> - 2008-04-10 16:49:54
|
On Tue, 8 Apr 2008 13:54:41 +1000 David Gibson <dw...@au...> wrote: > On Mon, Apr 07, 2008 at 10:25:32PM -0500, Hollis Blanchard wrote: > > On Monday 07 April 2008 20:11:28 David Gibson wrote: > > > On Mon, Apr 07, 2008 at 03:53:33PM -0500, Hollis Blanchard wrote: > > > > 1 file changed, 7 insertions(+) > > > > include/linux/kvm.h | 7 +++++++ > > > > > > > > > > > > Device Control Registers are essentially another address space found on > > > > PowerPC 4xx processors, analogous to PIO on x86. DCRs are always 32 bits, > > > > and are identified by a 32-bit number. > > > > > > Well... 10-bit, actually. > > > > The mtdcrux description in the ppc440x6 user manual says the following: > > > > Let the contents of register RA denote a Device Control Register. > > The contents of GPR[RS] are placed into the designated Device Control > > Register. > > > > I take that to mean that we must worry about 32 bits worth of DCR numbers. > > Perhaps I should say "no more than" rather than "always". > > I think that's less misleading. mtdcrux is very new, anything which > only has the mtdcr instruction certainly can't take DCR numbers above > 10 bits, and I would expect that even on chips with mtdcrux the DCR > bus is probably still only 10-bits, although it could be extended. http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/C94B06BE313211B887257110006EFFBD/$file/460migrate.pdf page 4. "DCR Address Space Increased to 32 bits". I realize that the above is for 460 cores, but I would not be surprised at all if that shows up in a future 440 core. 440x6 already seems to be a conglomeration of some of the features 460 has. josh |
From: <ehr...@li...> - 2008-04-10 15:13:36
|
From: Christian Ehrhardt <ehr...@li...> This extends the kvm_stat reports for kvm on embedded powerpc. Since kvmppc is using emulation (no hardware support yet) this gives people interested in performance a detailed split of the emul_instruction counter already available. The kvmstat script I posted will follow in an updated version that can read the new values. This statistic does not cover e.g. operants of the commands, but that way it should have only a small perf impact (never break what you want to measure). I intend to implement a full relayfs based instruction strem soon which will (I hope) support some filter mechanism to reduce the ammount of data moved. This more detailed statistic can be used e.g. after some special instructions are identified using the function of this patch. Both can coexist and are configurable in .config. New kvmstat script and an example measurement follow in separate mails. Signed-off-by: Christian Ehrhardt <ehr...@li...> --- [diffstat] arch/powerpc/kvm/Kconfig | 9 ++++ arch/powerpc/kvm/emulate.c | 57 +++++++++++++++++++++++++++ arch/powerpc/kvm/powerpc.c | 84 +++++++++++++++++++++++++++++++++++++++++ include/asm-powerpc/kvm_host.h | 40 +++++++++++++++++++ include/asm-powerpc/kvm_ppc.h | 50 ++++++++++++++++++++++++ 5 files changed, 240 insertions(+) [diff] diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig --- a/arch/powerpc/kvm/Kconfig +++ b/arch/powerpc/kvm/Kconfig @@ -41,6 +41,15 @@ config KVM_POWERPC_440 ---help--- Provides support for KVM on 440 processors. +config KVM_POWERPC_440_INSTRUCTION_STAT + bool "ppc440 instruction emulation statistics" + depends on KVM && 44x && KVM_POWERPC_440 + select KVM_POWERPC + ---help--- + Adds the tracking of the different emulated instructions. This is + used to debug performance issues and adds a slight runtime overhead. + If unsure, say N. + config KVM_PPC_VIRTIO bool "Virtio Support" select VIRTIO diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -87,6 +87,14 @@ static inline unsigned int get_d(u32 ins static inline unsigned int get_d(u32 inst) { return inst & 0xffff; +} + +static inline void emulinstr_stat(struct kvm_vcpu *vcpu, + enum kvmppc_emulated_instructions kvmppc_emulinstr) +{ +#ifdef CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT + (*(u32 *)((void *)vcpu + kvmppc_emulinstr_offset[kvmppc_emulinstr]))++; +#endif } static int tlbe_is_host_safe(const struct kvm_vcpu *vcpu, @@ -229,6 +237,7 @@ int kvmppc_emulate_instruction(struct kv switch (get_op(inst)) { case 3: /* trap */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_TRAP); if (get_d(inst) == 1) { /* FIXME port to final hypercall API when defined */ printk(KERN_INFO"Guest requested shutdown\n"); @@ -242,6 +251,7 @@ int kvmppc_emulate_instruction(struct kv case 19: switch (get_xop(inst)) { case 50: /* rfi */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_RFI); kvmppc_emul_rfi(vcpu); advance = 0; break; @@ -256,32 +266,39 @@ int kvmppc_emulate_instruction(struct kv switch (get_xop(inst)) { case 83: /* mfmsr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MFMSR); rt = get_rt(inst); vcpu->arch.gpr[rt] = vcpu->arch.msr; break; case 87: /* lbzx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LBZX); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1); break; case 131: /* wrtee */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_WRTEE); rs = get_rs(inst); vcpu->arch.msr = (vcpu->arch.msr & ~MSR_EE) | (vcpu->arch.gpr[rs] & MSR_EE); break; case 146: /* mtmsr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MTMSR); rs = get_rs(inst); kvmppc_set_msr(vcpu, vcpu->arch.gpr[rs]); break; case 163: /* wrteei */ + emulinstr_stat(vcpu, + KVMPPC_EMULATED_INSTRUCTION_WRTEEI); vcpu->arch.msr = (vcpu->arch.msr & ~MSR_EE) | (inst & MSR_EE); break; case 215: /* stbx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STBX); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], @@ -289,6 +306,7 @@ int kvmppc_emulate_instruction(struct kv break; case 247: /* stbux */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STBUX); rs = get_rs(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -304,11 +322,13 @@ int kvmppc_emulate_instruction(struct kv break; case 279: /* lhzx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LHZX); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1); break; case 311: /* lhzux */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LHZUX); rt = get_rt(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -322,6 +342,7 @@ int kvmppc_emulate_instruction(struct kv break; case 323: /* mfdcr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MFDCR); dcrn = get_dcrn(inst); rt = get_rt(inst); @@ -349,6 +370,7 @@ int kvmppc_emulate_instruction(struct kv break; case 339: /* mfspr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MFSPR); sprn = get_sprn(inst); rt = get_rt(inst); @@ -438,6 +460,7 @@ int kvmppc_emulate_instruction(struct kv break; case 407: /* sthx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STHX); rs = get_rs(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -448,6 +471,7 @@ int kvmppc_emulate_instruction(struct kv break; case 439: /* sthux */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STHUX); rs = get_rs(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -463,6 +487,7 @@ int kvmppc_emulate_instruction(struct kv break; case 451: /* mtdcr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MTDCR); dcrn = get_dcrn(inst); rs = get_rs(inst); @@ -482,6 +507,7 @@ int kvmppc_emulate_instruction(struct kv break; case 467: /* mtspr */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_MTSPR); sprn = get_sprn(inst); rs = get_rs(inst); switch (sprn) { @@ -588,6 +614,7 @@ int kvmppc_emulate_instruction(struct kv break; case 470: /* dcbi */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_DCBI); /* Do nothing. The guest is performing dcbi because * hardware DMA is not snooped by the dcache, but * emulated DMA either goes through the dcache as @@ -596,14 +623,19 @@ int kvmppc_emulate_instruction(struct kv break; case 534: /* lwbrx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LWBRX); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 4, 0); break; case 566: /* tlbsync */ + emulinstr_stat(vcpu, + KVMPPC_EMULATED_INSTRUCTION_TLBSYNC); break; case 662: /* stwbrx */ + emulinstr_stat(vcpu, + KVMPPC_EMULATED_INSTRUCTION_STWBRX); rs = get_rs(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -614,6 +646,7 @@ int kvmppc_emulate_instruction(struct kv break; case 978: /* tlbwe */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_TLBWE); emulated = kvmppc_emul_tlbwe(vcpu, inst); break; @@ -621,6 +654,7 @@ int kvmppc_emulate_instruction(struct kv int index; unsigned int as = get_mmucr_sts(vcpu); unsigned int pid = get_mmucr_stid(vcpu); + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_TLBSX); rt = get_rt(inst); ra = get_ra(inst); @@ -644,11 +678,14 @@ int kvmppc_emulate_instruction(struct kv break; case 790: /* lhbrx */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LHBRX); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 2, 0); break; case 918: /* sthbrx */ + emulinstr_stat(vcpu, + KVMPPC_EMULATED_INSTRUCTION_STHBRX); rs = get_rs(inst); ra = get_ra(inst); rb = get_rb(inst); @@ -659,6 +696,7 @@ int kvmppc_emulate_instruction(struct kv break; case 966: /* iccci */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_ICCCI); break; default: @@ -670,11 +708,14 @@ int kvmppc_emulate_instruction(struct kv break; case 32: /* lwz */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LWZ); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1); break; case 33: /* lwzu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LWZU); + rt = get_rt(inst); ra = get_ra(inst); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 4, 1); @@ -682,11 +723,15 @@ int kvmppc_emulate_instruction(struct kv break; case 34: /* lbz */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LBZ); + rt = get_rt(inst); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1); break; case 35: /* lbzu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LBZU); + rt = get_rt(inst); ra = get_ra(inst); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 1, 1); @@ -694,12 +739,16 @@ int kvmppc_emulate_instruction(struct kv break; case 36: /* stw */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STW); + rt = get_rt(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], 4, 1); break; case 37: /* stwu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STWU); + rt = get_rt(inst); ra = get_ra(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], @@ -708,12 +757,16 @@ int kvmppc_emulate_instruction(struct kv break; case 38: /* stb */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STB); + rt = get_rt(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], 1, 1); break; case 39: /* stbu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STBU); + rt = get_rt(inst); ra = get_ra(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], @@ -722,11 +775,15 @@ int kvmppc_emulate_instruction(struct kv break; case 40: /* lhz */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LHZ); + rt = get_rt(inst); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1); break; case 41: /* lhzu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_LHZU); + rt = get_rt(inst); ra = get_ra(inst); rt = get_rt(inst); emulated = kvmppc_handle_load(run, vcpu, rt, 2, 1); @@ -734,12 +791,16 @@ int kvmppc_emulate_instruction(struct kv break; case 44: /* sth */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STH); + rt = get_rt(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], 2, 1); break; case 45: /* sthu */ + emulinstr_stat(vcpu, KVMPPC_EMULATED_INSTRUCTION_STHU); + rt = get_rt(inst); ra = get_ra(inst); rs = get_rs(inst); emulated = kvmppc_handle_store(run, vcpu, vcpu->arch.gpr[rs], diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -56,8 +56,92 @@ struct kvm_stats_debugfs_item debugfs_en { "inst_emu", VCPU_STAT(emulated_inst_exits) }, { "dec", VCPU_STAT(dec_exits) }, { "ext_intr", VCPU_STAT(ext_intr_exits) }, +#ifdef CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT + { "instr_trap", VCPU_STAT(instr_trap) }, + { "instr_rfi", VCPU_STAT(instr_rfi) }, + { "instr_mfmsr", VCPU_STAT(instr_mfmsr) }, + { "instr_lbzx", VCPU_STAT(instr_lbzx) }, + { "instr_wrtee", VCPU_STAT(instr_wrtee) }, + { "instr_mtmsr", VCPU_STAT(instr_mtmsr) }, + { "instr_wrteei", VCPU_STAT(instr_wrteei) }, + { "instr_stbx", VCPU_STAT(instr_stbx) }, + { "instr_stbux", VCPU_STAT(instr_stbux) }, + { "instr_lhzx", VCPU_STAT(instr_lhzx) }, + { "instr_lhzux", VCPU_STAT(instr_lhzux) }, + { "instr_mfdcr", VCPU_STAT(instr_mfdcr) }, + { "instr_mfspr", VCPU_STAT(instr_mfspr) }, + { "instr_sthx", VCPU_STAT(instr_sthx) }, + { "instr_sthux", VCPU_STAT(instr_sthux) }, + { "instr_mtdcr", VCPU_STAT(instr_mtdcr) }, + { "instr_mtspr", VCPU_STAT(instr_mtspr) }, + { "instr_dcbi", VCPU_STAT(instr_dcbi) }, + { "instr_lwbrx", VCPU_STAT(instr_lwbrx) }, + { "instr_tlbsync", VCPU_STAT(instr_tlbsync) }, + { "instr_stwbrx", VCPU_STAT(instr_stwbrx) }, + { "instr_tlbwe", VCPU_STAT(instr_tlbwe) }, + { "instr_tlbsx", VCPU_STAT(instr_tlbsx) }, + { "instr_lhbrx", VCPU_STAT(instr_lhbrx) }, + { "instr_sthbrx", VCPU_STAT(instr_sthbrx) }, + { "instr_iccci", VCPU_STAT(instr_iccci) }, + { "instr_lwz", VCPU_STAT(instr_lwz) }, + { "instr_lwzu", VCPU_STAT(instr_lwzu) }, + { "instr_lbz", VCPU_STAT(instr_lbz) }, + { "instr_lbzu", VCPU_STAT(instr_lbzu) }, + { "instr_stw", VCPU_STAT(instr_stw) }, + { "instr_stwu", VCPU_STAT(instr_stwu) }, + { "instr_stb", VCPU_STAT(instr_stb) }, + { "instr_stbu", VCPU_STAT(instr_stbu) }, + { "instr_lhz", VCPU_STAT(instr_lhz) }, + { "instr_lhzu", VCPU_STAT(instr_lhzu) }, + { "instr_sth", VCPU_STAT(instr_sth) }, + { "instr_sthu", VCPU_STAT(instr_sthu) }, +#endif /* CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT */ { NULL } }; + +#ifdef CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT +#define INST_STAT_OFFSET(x) offsetof(struct kvm_vcpu, stat.x) +const int kvmppc_emulinstr_offset[] = { + [KVMPPC_EMULATED_INSTRUCTION_TRAP] = INST_STAT_OFFSET(instr_trap), + [KVMPPC_EMULATED_INSTRUCTION_RFI] = INST_STAT_OFFSET(instr_rfi), + [KVMPPC_EMULATED_INSTRUCTION_MFMSR] = INST_STAT_OFFSET(instr_mfmsr), + [KVMPPC_EMULATED_INSTRUCTION_LBZX] = INST_STAT_OFFSET(instr_lbzx), + [KVMPPC_EMULATED_INSTRUCTION_WRTEE] = INST_STAT_OFFSET(instr_wrtee), + [KVMPPC_EMULATED_INSTRUCTION_MTMSR] = INST_STAT_OFFSET(instr_mtmsr), + [KVMPPC_EMULATED_INSTRUCTION_WRTEEI] = INST_STAT_OFFSET(instr_wrteei), + [KVMPPC_EMULATED_INSTRUCTION_STBX] = INST_STAT_OFFSET(instr_stbx), + [KVMPPC_EMULATED_INSTRUCTION_STBUX] = INST_STAT_OFFSET(instr_stbux), + [KVMPPC_EMULATED_INSTRUCTION_LHZX] = INST_STAT_OFFSET(instr_lhzx), + [KVMPPC_EMULATED_INSTRUCTION_LHZUX] = INST_STAT_OFFSET(instr_lhzux), + [KVMPPC_EMULATED_INSTRUCTION_MFDCR] = INST_STAT_OFFSET(instr_mfdcr), + [KVMPPC_EMULATED_INSTRUCTION_MFSPR] = INST_STAT_OFFSET(instr_mfspr), + [KVMPPC_EMULATED_INSTRUCTION_STHX] = INST_STAT_OFFSET(instr_sthx), + [KVMPPC_EMULATED_INSTRUCTION_STHUX] = INST_STAT_OFFSET(instr_sthux), + [KVMPPC_EMULATED_INSTRUCTION_MTDCR] = INST_STAT_OFFSET(instr_mtdcr), + [KVMPPC_EMULATED_INSTRUCTION_MTSPR] = INST_STAT_OFFSET(instr_mtspr), + [KVMPPC_EMULATED_INSTRUCTION_DCBI] = INST_STAT_OFFSET(instr_dcbi), + [KVMPPC_EMULATED_INSTRUCTION_LWBRX] = INST_STAT_OFFSET(instr_lwbrx), + [KVMPPC_EMULATED_INSTRUCTION_TLBSYNC] = INST_STAT_OFFSET(instr_tlbsync), + [KVMPPC_EMULATED_INSTRUCTION_STWBRX] = INST_STAT_OFFSET(instr_stwbrx), + [KVMPPC_EMULATED_INSTRUCTION_TLBWE] = INST_STAT_OFFSET(instr_tlbwe), + [KVMPPC_EMULATED_INSTRUCTION_TLBSX] = INST_STAT_OFFSET(instr_tlbsx), + [KVMPPC_EMULATED_INSTRUCTION_LHBRX] = INST_STAT_OFFSET(instr_lhbrx), + [KVMPPC_EMULATED_INSTRUCTION_STHBRX] = INST_STAT_OFFSET(instr_sthbrx), + [KVMPPC_EMULATED_INSTRUCTION_ICCCI] = INST_STAT_OFFSET(instr_iccci), + [KVMPPC_EMULATED_INSTRUCTION_LWZ] = INST_STAT_OFFSET(instr_lwz), + [KVMPPC_EMULATED_INSTRUCTION_LWZU] = INST_STAT_OFFSET(instr_lwzu), + [KVMPPC_EMULATED_INSTRUCTION_LBZ] = INST_STAT_OFFSET(instr_lbz), + [KVMPPC_EMULATED_INSTRUCTION_LBZU] = INST_STAT_OFFSET(instr_lbzu), + [KVMPPC_EMULATED_INSTRUCTION_STW] = INST_STAT_OFFSET(instr_stw), + [KVMPPC_EMULATED_INSTRUCTION_STWU] = INST_STAT_OFFSET(instr_stwu), + [KVMPPC_EMULATED_INSTRUCTION_STB] = INST_STAT_OFFSET(instr_stb), + [KVMPPC_EMULATED_INSTRUCTION_STBU] = INST_STAT_OFFSET(instr_stbu), + [KVMPPC_EMULATED_INSTRUCTION_LHZ] = INST_STAT_OFFSET(instr_lhz), + [KVMPPC_EMULATED_INSTRUCTION_LHZU] = INST_STAT_OFFSET(instr_lhzu), + [KVMPPC_EMULATED_INSTRUCTION_STH] = INST_STAT_OFFSET(instr_sth), + [KVMPPC_EMULATED_INSTRUCTION_STHU] = INST_STAT_OFFSET(instr_sthu) +}; +#endif /* CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT */ static int kvmppc_debug; diff --git a/include/asm-powerpc/kvm_host.h b/include/asm-powerpc/kvm_host.h --- a/include/asm-powerpc/kvm_host.h +++ b/include/asm-powerpc/kvm_host.h @@ -59,6 +59,46 @@ struct kvm_vcpu_stat { u32 emulated_inst_exits; u32 dec_exits; u32 ext_intr_exits; +#ifdef CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT + u32 instr_trap; + u32 instr_rfi; + u32 instr_mfmsr; + u32 instr_lbzx; + u32 instr_wrtee; + u32 instr_mtmsr; + u32 instr_wrteei; + u32 instr_stbx; + u32 instr_stbux; + u32 instr_lhzx; + u32 instr_lhzux; + u32 instr_mfdcr; + u32 instr_mfspr; + u32 instr_sthx; + u32 instr_sthux; + u32 instr_mtdcr; + u32 instr_mtspr; + u32 instr_dcbi; + u32 instr_lwbrx; + u32 instr_tlbsync; + u32 instr_stwbrx; + u32 instr_tlbwe; + u32 instr_tlbsx; + u32 instr_lhbrx; + u32 instr_sthbrx; + u32 instr_iccci; + u32 instr_lwz; + u32 instr_lwzu; + u32 instr_lbz; + u32 instr_lbzu; + u32 instr_stw; + u32 instr_stwu; + u32 instr_stb; + u32 instr_stbu; + u32 instr_lhz; + u32 instr_lhzu; + u32 instr_sth; + u32 instr_sthu; +#endif /* CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT */ }; struct tlbe { diff --git a/include/asm-powerpc/kvm_ppc.h b/include/asm-powerpc/kvm_ppc.h --- a/include/asm-powerpc/kvm_ppc.h +++ b/include/asm-powerpc/kvm_ppc.h @@ -41,6 +41,56 @@ enum emulation_result { EMULATE_FAIL, /* can't emulate this instruction */ EMULATE_SHUTDOWN, /* shutdown guest requested (no kvm_run data) */ }; + +enum kvmppc_emulated_instructions { + KVMPPC_EMULATED_INSTRUCTION_TRAP, + KVMPPC_EMULATED_INSTRUCTION_RFI, + KVMPPC_EMULATED_INSTRUCTION_MFMSR, + KVMPPC_EMULATED_INSTRUCTION_LBZX, + KVMPPC_EMULATED_INSTRUCTION_WRTEE, + KVMPPC_EMULATED_INSTRUCTION_MTMSR, + KVMPPC_EMULATED_INSTRUCTION_WRTEEI, + KVMPPC_EMULATED_INSTRUCTION_STBX, + KVMPPC_EMULATED_INSTRUCTION_STBUX, + KVMPPC_EMULATED_INSTRUCTION_LHZX, + KVMPPC_EMULATED_INSTRUCTION_LHZUX, + KVMPPC_EMULATED_INSTRUCTION_MFDCR, + KVMPPC_EMULATED_INSTRUCTION_MFSPR, + KVMPPC_EMULATED_INSTRUCTION_STHX, + KVMPPC_EMULATED_INSTRUCTION_STHUX, + KVMPPC_EMULATED_INSTRUCTION_MTDCR, + KVMPPC_EMULATED_INSTRUCTION_MTSPR, + KVMPPC_EMULATED_INSTRUCTION_DCBI, + KVMPPC_EMULATED_INSTRUCTION_LWBRX, + KVMPPC_EMULATED_INSTRUCTION_TLBSYNC, + KVMPPC_EMULATED_INSTRUCTION_STWBRX, + KVMPPC_EMULATED_INSTRUCTION_TLBWE, + KVMPPC_EMULATED_INSTRUCTION_TLBSX, + KVMPPC_EMULATED_INSTRUCTION_LHBRX, + KVMPPC_EMULATED_INSTRUCTION_STHBRX, + KVMPPC_EMULATED_INSTRUCTION_ICCCI, + KVMPPC_EMULATED_INSTRUCTION_LWZ, + KVMPPC_EMULATED_INSTRUCTION_LWZU, + KVMPPC_EMULATED_INSTRUCTION_LBZ, + KVMPPC_EMULATED_INSTRUCTION_LBZU, + KVMPPC_EMULATED_INSTRUCTION_STW, + KVMPPC_EMULATED_INSTRUCTION_STWU, + KVMPPC_EMULATED_INSTRUCTION_STB, + KVMPPC_EMULATED_INSTRUCTION_STBU, + KVMPPC_EMULATED_INSTRUCTION_LHZ, + KVMPPC_EMULATED_INSTRUCTION_LHZU, + KVMPPC_EMULATED_INSTRUCTION_STH, + KVMPPC_EMULATED_INSTRUCTION_STHU, +}; + +#ifdef CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT +/* + * offset definitions to map&update counters with low runtime overhead + * directly in vcpu->stat.x (no new op/xop -> counter mapping needed, this + * uses the already implemented switch/case in emulate_instruction). +*/ +extern const int kvmppc_emulinstr_offset[]; +#endif /* CONFIG_KVM_POWERPC_440_INSTRUCTION_STAT */ extern const unsigned char exception_priority[]; extern const unsigned char priority_exception[]; |
From: Christian E. <ehr...@li...> - 2008-04-10 14:48:01
|
We all saw in prior kvm_stat counters that listed the different exit reasons the high count of instruction emulation. The recent patch to kvm-ppc-devel splits up those into the different instructions. I attached a log of a simple guest boot&halt FYI. E.g. see the high number of m[ft]msr instructions. The file is attached because no one wants to look at it line wrapped in the mail client anyway. -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization |
From: Christian E. <ehr...@li...> - 2008-04-10 14:45:12
|
Our kvmstat script that works without python got updated to display the new instruction statistics. The new command line switch is "-i" and you should redirect the output to a file because it won't fit a console ;-) New script attached -- Grüsse / regards, Christian Ehrhardt IBM Linux Technology Center, Open Virtualization |
From: Josh B. <jw...@li...> - 2008-04-10 14:33:05
|
On Mon, 07 Apr 2008 15:53:32 -0500 Hollis Blanchard <ho...@us...> wrote: > 1 file changed, 2 insertions(+) > include/asm-powerpc/mmu-44x.h | 2 ++ > > > PowerPC 440 KVM needs to know how many TLB entries are used for the host kernel > linear mapping (it does not modify these mappings when switching between guest > and host execution). > > Signed-off-by: Hollis Blanchard <ho...@us...> Acked-by: Josh Boyer <jw...@li...> > > diff --git a/include/asm-powerpc/mmu-44x.h b/include/asm-powerpc/mmu-44x.h > --- a/include/asm-powerpc/mmu-44x.h > +++ b/include/asm-powerpc/mmu-44x.h > @@ -53,6 +53,8 @@ > > #ifndef __ASSEMBLY__ > > +extern unsigned int tlb_44x_hwater; > + > typedef unsigned long long phys_addr_t; > > typedef struct { > _______________________________________________ > Linuxppc-dev mailing list > Lin...@oz... > https://ozlabs.org/mailman/listinfo/linuxppc-dev |
From: Hollis B. <ho...@us...> - 2008-04-10 14:28:44
|
On Thursday 10 April 2008 06:55:18 Josh Boyer wrote: > On Mon, 07 Apr 2008 15:53:31 -0500 > > Hollis Blanchard <ho...@us...> wrote: > > Implement initial support for KVM for PowerPC 440. There are just two > > small prerequisite patches, and then the bulk of the code can't be split > > easily. > > > > Please review; I would like to submit these for 2.6.26. There is plenty > > of work to do, both functional and optimization, but this code is > > sufficient to run unmodified 440 Linux guests on a 440 Linux host. What's > > your favorite bike shed color? > > Who's tree are you looking to get this patch set into? Since the only PPC-specific change is patch 1, if I can get your Acked-by for it I will send it via Avi. -- Hollis Blanchard IBM Linux Technology Center |
From: Arnd B. <ar...@ar...> - 2008-04-10 13:50:39
|
On Tuesday 08 April 2008, Jerone Young wrote: > +static struct sleep_mode modes[] = { > + { .name = "wait", .entry = &ppc44x_idle }, > + { .name = "spin", .entry = NULL }, > +}; > + > +int __init ppc44x_idle_init(void) > +{ > + void *func = modes[current_mode].entry; > + ppc_md.power_save = func; > + return 0; > +} > + > +arch_initcall(ppc44x_idle_init); > + > +static int __init idle_param(char *p) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(modes); i++) { > + if (!strcmp(modes[i].name, p)) { > + current_mode = i; > + break; > + } > + } > + > + return 0; > +} > + > +early_param("idle", idle_param); ok, sorry to steal the show again, now that everyone seems to be happy with the current code, but isn't this equivalent to the simpler static int __init idle_param(char *p) { if (!strcmp(modes[i].name, "spin")) ppc_md.power_save = NULL; } early_param("idle", idle_param); if you statically initialize the ppc_md.power_save function to ppc44x_idle in the platform setup files? Arnd <>< |
From: Josh B. <jw...@li...> - 2008-04-10 11:57:20
|
On Mon, 07 Apr 2008 15:53:31 -0500 Hollis Blanchard <ho...@us...> wrote: > Implement initial support for KVM for PowerPC 440. There are just two small > prerequisite patches, and then the bulk of the code can't be split easily. > > Please review; I would like to submit these for 2.6.26. There is plenty of work > to do, both functional and optimization, but this code is sufficient to run > unmodified 440 Linux guests on a 440 Linux host. What's your favorite bike shed > color? Who's tree are you looking to get this patch set into? josh |
From: Josh B. <jw...@li...> - 2008-04-10 11:57:11
|
On Tue, 08 Apr 2008 11:49:14 -0500 Jerone Young <jy...@us...> wrote: > 2 files changed, 77 insertions(+), 1 deletion(-) > arch/powerpc/platforms/44x/Makefile | 2 > arch/powerpc/platforms/44x/idle.c | 76 +++++++++++++++++++++++++++++++++++ > > > Updates: Now setting MSR_WE is now default > Tested on hardware platforms bamboo & sequioa and appears > things are working fine on actually hardware! > > This patch adds the ability for the CPU to go into wait state while in cpu_idle loop. This helps virtulization solutions know when the guest Linux kernel is in an idle state. There are two ways to do it. > > Command line > idle=spin <-- CPU will spin > idle=wait <-- set CPU into wait state when idle (default) > > > Signed-off-by: Jerone Young <jy...@us...> Looks great. I'll add it to my tree shortly. josh |
From: Jerone Y. <jy...@us...> - 2008-04-08 22:16:39
|
# HG changeset patch # User Jerone Young <jy...@us...> # Date 1207692873 18000 # Branch merge # Node ID 8ddf560729aac228cd84068e1227e601e68a6840 # Parent 94cbc19df0f0fcab150599b10d859f1a3bc1b7cb [v2] Move kvm_get_pit to libkvm.c common code - I am resending this patch removing ia64. It apprently fell through the cracks. Don't compile kvm_*_pit() on architectures whose currently supported platforms do not contain a PIT. Signed-off-by: Hollis Blanchard <ho...@us...> Signed-off-by: Jerone Young <jy...@us...> diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h --- a/libkvm/libkvm.h +++ b/libkvm/libkvm.h @@ -549,6 +549,7 @@ int kvm_pit_in_kernel(kvm_context_t kvm) #ifdef KVM_CAP_PIT +#if defined(__i386__) || defined(__x86_64__) /*! * \brief Get in kernel PIT of the virtual domain * @@ -569,6 +570,7 @@ int kvm_get_pit(kvm_context_t kvm, struc * \param s PIT state of the virtual domain */ int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s); +#endif #endif |
From: Hollis B. <ho...@us...> - 2008-04-08 19:44:39
|
On Tuesday 08 April 2008 11:49:14 Jerone Young wrote: > 2 files changed, 77 insertions(+), 1 deletion(-) > arch/powerpc/platforms/44x/Makefile | 2 > arch/powerpc/platforms/44x/idle.c | 76 > +++++++++++++++++++++++++++++++++++ > > > Updates: Now setting MSR_WE is now default > Tested on hardware platforms bamboo & sequioa and appears > things are working fine on actually hardware! > > This patch adds the ability for the CPU to go into wait state while in > cpu_idle loop. This helps virtulization solutions know when the guest Linux > kernel is in an idle state. There are two ways to do it. > > Command line > idle=spin <-- CPU will spin > idle=wait <-- set CPU into wait state when idle (default) > > > Signed-off-by: Jerone Young <jy...@us...> Acked-by: Hollis Blanchard <ho...@us...> -- Hollis Blanchard IBM Linux Technology Center |
From: Jerone Y. <jy...@us...> - 2008-04-08 16:49:37
|
2 files changed, 77 insertions(+), 1 deletion(-) arch/powerpc/platforms/44x/Makefile | 2 arch/powerpc/platforms/44x/idle.c | 76 +++++++++++++++++++++++++++++++++++ Updates: Now setting MSR_WE is now default Tested on hardware platforms bamboo & sequioa and appears things are working fine on actually hardware! This patch adds the ability for the CPU to go into wait state while in cpu_idle loop. This helps virtulization solutions know when the guest Linux kernel is in an idle state. There are two ways to do it. Command line idle=spin <-- CPU will spin idle=wait <-- set CPU into wait state when idle (default) Signed-off-by: Jerone Young <jy...@us...> diff --git a/arch/powerpc/platforms/44x/Makefile b/arch/powerpc/platforms/44x/Makefile --- a/arch/powerpc/platforms/44x/Makefile +++ b/arch/powerpc/platforms/44x/Makefile @@ -1,4 +1,4 @@ obj-$(CONFIG_44x) := misc_44x.o -obj-$(CONFIG_44x) := misc_44x.o +obj-$(CONFIG_44x) := misc_44x.o idle.o obj-$(CONFIG_EBONY) += ebony.o obj-$(CONFIG_TAISHAN) += taishan.o obj-$(CONFIG_BAMBOO) += bamboo.o diff --git a/arch/powerpc/platforms/44x/idle.c b/arch/powerpc/platforms/44x/idle.c new file mode 100644 --- /dev/null +++ b/arch/powerpc/platforms/44x/idle.c @@ -0,0 +1,76 @@ +/* + * Copyright 2008 IBM Corp. + * + * Based on arch/powerpc/platforms/pasemi/idle.c: + * Copyright (C) 2006-2007 PA Semi, Inc + * + * Added by: Jerone Young <jy...@us...> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include <linux/of.h> +#include <linux/kernel.h> +#include <asm/machdep.h> + +static int current_mode; + +struct sleep_mode { + char *name; + void (*entry)(void); +}; + +static void ppc44x_idle(void) +{ + unsigned long msr_save; + + msr_save = mfmsr(); + /* set wait state MSR */ + mtmsr(msr_save|MSR_WE|MSR_EE|MSR_CE|MSR_DE); + isync(); + /* return to initial state */ + mtmsr(msr_save); + isync(); +} + +static struct sleep_mode modes[] = { + { .name = "wait", .entry = &ppc44x_idle }, + { .name = "spin", .entry = NULL }, +}; + +int __init ppc44x_idle_init(void) +{ + void *func = modes[current_mode].entry; + ppc_md.power_save = func; + return 0; +} + +arch_initcall(ppc44x_idle_init); + +static int __init idle_param(char *p) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(modes); i++) { + if (!strcmp(modes[i].name, p)) { + current_mode = i; + break; + } + } + + return 0; +} + +early_param("idle", idle_param); |
From: Arnd B. <ar...@ar...> - 2008-04-08 05:09:37
|
On Tuesday 08 April 2008, Hollis Blanchard wrote: > If there is one thing I have learned in my various porting efforts, it's that > using a variable-sized type in an interface is just begging for trouble. > > x86 uses fixed 64-bit variables here (even with x86-32), so that might be the > right solution here. Right, I think there was even a conference presentation about this topic done by one of us, right? Arnd <>< |
From: David G. <dw...@au...> - 2008-04-08 04:19:55
|
On Mon, Apr 07, 2008 at 11:06:10PM -0500, Hollis Blanchard wrote: > On Monday 07 April 2008 22:54:41 David Gibson wrote: > > On Mon, Apr 07, 2008 at 10:25:32PM -0500, Hollis Blanchard wrote: > > > On Monday 07 April 2008 20:11:28 David Gibson wrote: > > > > On Mon, Apr 07, 2008 at 03:53:33PM -0500, Hollis Blanchard wrote: > > > > > 1 file changed, 7 insertions(+) > > > > > include/linux/kvm.h | 7 +++++++ > > > > > > > > > > > > > > > Device Control Registers are essentially another address space found > > > > > on PowerPC 4xx processors, analogous to PIO on x86. DCRs are always > > > > > 32 bits, and are identified by a 32-bit number. > > > > > > > > Well... 10-bit, actually. > > > > > > The mtdcrux description in the ppc440x6 user manual says the following: > > > > > > Let the contents of register RA denote a Device Control Register. > > > The contents of GPR[RS] are placed into the designated Device Control > > > Register. > > > > > > I take that to mean that we must worry about 32 bits worth of DCR > > > numbers. Perhaps I should say "no more than" rather than "always". > > > > I think that's less misleading. mtdcrux is very new, anything which > > only has the mtdcr instruction certainly can't take DCR numbers above > > 10 bits, and I would expect that even on chips with mtdcrux the DCR > > bus is probably still only 10-bits, although it could be extended. > > We're defining a kernel/userspace interface here, and since the hardware is > capable of 32-bit DCR numbers, I don't think it makes any sense to not > support that. Also, we would just end up placing that number into a u32 > anyways, so... :) Oh, of course you should represent it as a u32 and support 32-bit addresses, it's only the patch comment I'm objecting to. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson |
From: Hollis B. <ho...@us...> - 2008-04-08 04:19:12
|
On Monday 07 April 2008 21:58:17 Arnd Bergmann wrote: > On Monday 07 April 2008, Hollis Blanchard wrote: > > --- a/include/asm-powerpc/kvm.h > > +++ b/include/asm-powerpc/kvm.h > > @@ -1,6 +1,55 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License, version 2, as > > + * published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public License > > + * along with this program; if not, write to the Free Software > > + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, > > USA. + * > > + * Copyright IBM Corp. 2007 > > + * > > + * Authors: Hollis Blanchard <ho...@us...> > > + */ > > + > > #ifndef __LINUX_KVM_POWERPC_H > > #define __LINUX_KVM_POWERPC_H > > > > -/* powerpc does not support KVM */ > > +#include <asm/types.h> > > > > -#endif > > +struct kvm_regs { > > + __u32 pc; > > + __u32 cr; > > + __u32 ctr; > > + __u32 lr; > > + __u32 xer; > > + __u32 msr; > > + __u32 srr0; > > + __u32 srr1; > > + __u32 pid; > > + > > + __u32 sprg0; > > + __u32 sprg1; > > + __u32 sprg2; > > + __u32 sprg3; > > + __u32 sprg4; > > + __u32 sprg5; > > + __u32 sprg6; > > + __u32 sprg7; > > + > > + __u64 fpr[32]; > > + __u32 gpr[32]; > > +}; > > + > > +struct kvm_sregs { > > +}; > > + > > +struct kvm_fpu { > > +}; > > + > > +#endif /* __LINUX_KVM_POWERPC_H */ > > Since this defines part of the ABI, it would be nice if it's possible > to have it in a platform independent way. Most of the registers here > should probably become "unsigned long" instead of "__u32" so that > the definition can be used for a potential 64 bit port. If there is one thing I have learned in my various porting efforts, it's that using a variable-sized type in an interface is just begging for trouble. x86 uses fixed 64-bit variables here (even with x86-32), so that might be the right solution here. > Also, I noticed that you lump everything into kvm_regs, instead of > using sregs for stuff like srr0 and kvm_fpu for the fprs. What is > the reason for that? The FPRs and SPRs are only really useful for two things here: debugger support and migration. We don't really support either at the moment, so this part of the user/kernel ABI will need change as we implement those. I will move the FPR stuff into kvm_fpu though. (I think when I originally wrote this, kvm_fpu was defined to be x86 stuff, but it obviously isn't now...) -- Hollis Blanchard IBM Linux Technology Center |
From: Hollis B. <ho...@us...> - 2008-04-08 04:08:51
|
On Monday 07 April 2008 22:54:41 David Gibson wrote: > On Mon, Apr 07, 2008 at 10:25:32PM -0500, Hollis Blanchard wrote: > > On Monday 07 April 2008 20:11:28 David Gibson wrote: > > > On Mon, Apr 07, 2008 at 03:53:33PM -0500, Hollis Blanchard wrote: > > > > 1 file changed, 7 insertions(+) > > > > include/linux/kvm.h | 7 +++++++ > > > > > > > > > > > > Device Control Registers are essentially another address space found > > > > on PowerPC 4xx processors, analogous to PIO on x86. DCRs are always > > > > 32 bits, and are identified by a 32-bit number. > > > > > > Well... 10-bit, actually. > > > > The mtdcrux description in the ppc440x6 user manual says the following: > > > > Let the contents of register RA denote a Device Control Register. > > The contents of GPR[RS] are placed into the designated Device Control > > Register. > > > > I take that to mean that we must worry about 32 bits worth of DCR > > numbers. Perhaps I should say "no more than" rather than "always". > > I think that's less misleading. mtdcrux is very new, anything which > only has the mtdcr instruction certainly can't take DCR numbers above > 10 bits, and I would expect that even on chips with mtdcrux the DCR > bus is probably still only 10-bits, although it could be extended. We're defining a kernel/userspace interface here, and since the hardware is capable of 32-bit DCR numbers, I don't think it makes any sense to not support that. Also, we would just end up placing that number into a u32 anyways, so... :) -- Hollis Blanchard IBM Linux Technology Center |
From: Hollis B. <ho...@us...> - 2008-04-08 04:00:58
|
On Monday 07 April 2008 21:12:40 Josh Boyer wrote: > On Mon, 07 Apr 2008 15:53:34 -0500 > > Hollis Blanchard <ho...@us...> wrote: > > Currently supports only PowerPC 440 Linux guests on 440 hosts. (Only > > tested with 440EP "Bamboo" guests so far, but with appropriate userspace > > support other SoC/board combinations should work.) > > I haven't reviewed the whole patch yet, but lots of it looks pretty > clean. A couple comments on some things that made me scratch my head > below. > > > Interrupt handling: We use IVPR to hijack the host interrupt vectors > > while running the guest, but hand off decrementer and external interrupts > > for normal guest processing. > > > > Address spaces: We take advantage of the fact that Linux doesn't use the > > AS=1 address space (in host or guest), which gives us virtual address > > space to use for guest mappings. While the guest is running, the host > > kernel remains mapped in AS=0, but the guest can only use AS=1 mappings. > > > > TLB entries: The TLB entries covering the host linear address space > > remain present while running the guest (which reduces the overhead of > > lightweight exits). We keep three copies of the TLB: > > - guest TLB: contents of the TLB as the guest sees it > > - shadow TLB: the TLB that is actually in hardware while guest is > > running - host TLB: to restore TLB state when context switching guest -> > > host When a TLB miss occurs because a mapping was not present in the > > shadow TLB, but was present in the guest TLB, KVM handles the fault > > without invoking the guest. Large guest pages are backed by multiple 4KB > > shadow pages through this mechanism. > > > > Instruction emulation: The guest kernel executes at user level, so > > executing privileged instructions trap into KVM, where we decode and > > emulate them. Future performance work will focus on reducing the overhead > > and frequency of these traps. > > > > IO: MMIO and DCR accesses are emulated by userspace. We use virtio for > > network and block IO, so those drivers must be enabled in the guest. It's > > possible that some qemu device emulation (e.g. e1000 or rtl8139) may also > > work with little effort. > > > > Signed-off-by: Hollis Blanchard <ho...@us...> > > As Olof pointed out, having this in Documentation might be nice. Yeah, the trouble is that a book could be written on the subject. (OK, a short book. At least a paper.) Anyways, I will provide something... > > diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug > > --- a/arch/powerpc/Kconfig.debug > > +++ b/arch/powerpc/Kconfig.debug > > @@ -151,6 +151,7 @@ > > > > config PPC_EARLY_DEBUG > > bool "Early debugging (dangerous)" > > + depends on !KVM > > help > > Say Y to enable some early debugging facilities that may be available > > for your processor/board combination. Those facilities are hacks > > Might want to add a brief explanation as to why this doesn't work with > KVM due to the AS conflict. Will do. > > diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c > > new file mode 100644 > > --- /dev/null > > +++ b/arch/powerpc/kvm/44x_tlb.c > > @@ -0,0 +1,224 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License, version 2, as > > + * published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public License > > + * along with this program; if not, write to the Free Software > > + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, > > USA. + * > > + * Copyright IBM Corp. 2007 > > + * > > + * Authors: Hollis Blanchard <ho...@us...> > > + */ > > + > > +#include <linux/types.h> > > +#include <linux/string.h> > > +#include <linux/kvm_host.h> > > +#include <linux/highmem.h> > > +#include <asm/mmu-44x.h> > > +#include <asm/kvm_ppc.h> > > + > > +#include "44x_tlb.h" > > + > > +#define PPC44x_TLB_USER_PERM_MASK > > (PPC44x_TLB_UX|PPC44x_TLB_UR|PPC44x_TLB_UW) +#define > > PPC44x_TLB_SUPER_PERM_MASK (PPC44x_TLB_SX|PPC44x_TLB_SR|PPC44x_TLB_SW) + > > +static unsigned int kvmppc_tlb_44x_pos; > > + > > +static u32 kvmppc_44x_tlb_shadow_attrib(u32 attrib, int usermode) > > +{ > > + /* XXX remove mask when Linux is fixed */ > > + attrib &= 0xf03f; > > What does that mean? Is this the "44x TLB handler writes to reserved > fields" issue? If so, could you comment that a little more verbosely? Yup, you're right. Actually, what I should really do is this: attrib &= PPC44x_TLB_ATTR_MASK | PPC44x_TLB_PERM_MASK; > Or you could just send a patch to fix Linux... ;). I had a look at it once, but didn't have time to grok the assembly bit manipulations. I guess nobody else has either. :) > > + > > + if (!usermode) { > > + /* Guest is in supervisor mode, so we need to translate guest > > + * supervisor permissions into user permissions. */ > > + attrib &= ~PPC44x_TLB_USER_PERM_MASK; > > + attrib |= (attrib & PPC44x_TLB_SUPER_PERM_MASK) << 3; > > + } > > + > > + /* Make sure host can always access this memory. */ > > + attrib |= PPC44x_TLB_SX|PPC44x_TLB_SR|PPC44x_TLB_SW; > > + > > + return attrib; > > +} > > + > > <snip> > > > diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c > > new file mode 100644 > > --- /dev/null > > +++ b/arch/powerpc/kvm/emulate.c > > @@ -0,0 +1,753 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License, version 2, as > > + * published by the Free Software Foundation. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public License > > + * along with this program; if not, write to the Free Software > > + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, > > USA. + * > > + * Copyright IBM Corp. 2007 > > + * > > + * Authors: Hollis Blanchard <ho...@us...> > > + */ > > + > > +#include <linux/jiffies.h> > > +#include <linux/timer.h> > > +#include <linux/types.h> > > +#include <linux/string.h> > > +#include <linux/kvm_host.h> > > + > > +#include <asm/dcr.h> > > +#include <asm/time.h> > > +#include <asm/byteorder.h> > > +#include <asm/kvm_ppc.h> > > #include <asm/dcr-regs.h> > > > + > > +#include "44x_tlb.h" > > + > > +#define DCRN_CPR0_CFGADDR 0xc > > +#define DCRN_CPR0_CFGDATA 0xd > > Remove these. OK. > <snip> > > > +int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu > > *vcpu) +{ > > + u32 inst = vcpu->arch.last_inst; > > + u32 ea; > > + int ra; > > + int rb; > > + int rc; > > + int rs; > > + int rt; > > + int sprn; > > + int dcrn; > > + enum emulation_result emulated = EMULATE_DONE; > > + int advance = 1; > > + > > + switch (get_op(inst)) { > > <snip> > > > + case 323: /* mfdcr */ > > + dcrn = get_dcrn(inst); > > + rt = get_rt(inst); > > + > > + /* emulate some access in kernel */ > > + switch (dcrn) { > > + case DCRN_CPR0_CFGADDR: > > + vcpu->arch.gpr[rt] = vcpu->arch.cpr0_cfgaddr; > > + break; > > + case DCRN_CPR0_CFGDATA: > > + local_irq_disable(); > > + mtdcr(DCRN_CPR0_CFGADDR, > > + vcpu->arch.cpr0_cfgaddr); > > + vcpu->arch.gpr[rt] = mfdcr(DCRN_CPR0_CFGDATA); > > + local_irq_enable(); > > + break; > > + default: > > + run->dcr.dcrn = dcrn; > > + run->dcr.data = 0; > > + run->dcr.is_write = 0; > > + vcpu->arch.io_gpr = rt; > > + vcpu->arch.dcr_needed = 1; > > + emulated = EMULATE_DO_DCR; > > + } > > + > > + break; > > <snip> > > > + case 451: /* mtdcr */ > > + dcrn = get_dcrn(inst); > > + rs = get_rs(inst); > > + > > + /* emulate some access in kernel */ > > + switch (dcrn) { > > + case DCRN_CPR0_CFGADDR: > > + vcpu->arch.cpr0_cfgaddr = vcpu->arch.gpr[rs]; > > + break; > > + default: > > + run->dcr.dcrn = dcrn; > > + run->dcr.data = vcpu->arch.gpr[rs]; > > + run->dcr.is_write = 1; > > + vcpu->arch.dcr_needed = 1; > > + emulated = EMULATE_DO_DCR; > > + } > > + > > + break; > > What exactly are those doing? I'm confused as to why the CPR0 DCRs are > special cased. Also, you don't seem to actually be doing much on > storing guest DCRs, so can I assume they aren't really needed at > runtime? A 440 cuImage accesses CPR0 DCRs to determine the timebase frequency. Ordinarily, these DCR accesses would be handed to userspace (qemu) to emulate. To emulate this in userspace, we could extract the TB freq from the host's /proc/device-tree (userspace can't directly access DCRs), but honestly I didn't want to deal with the math necessary to decompose that back into the various CPR0 register fields. So instead I special-case CPR0 reads in the kernel. The guest needs to know the real/host timebase frequency anyways, since it can execute mftb without host involvement, and will get very confused if the timebase increments at a different frequency than it expects. -- Hollis Blanchard IBM Linux Technology Center |
From: David G. <dw...@au...> - 2008-04-08 03:56:05
|
On Mon, Apr 07, 2008 at 10:25:32PM -0500, Hollis Blanchard wrote: > On Monday 07 April 2008 20:11:28 David Gibson wrote: > > On Mon, Apr 07, 2008 at 03:53:33PM -0500, Hollis Blanchard wrote: > > > 1 file changed, 7 insertions(+) > > > include/linux/kvm.h | 7 +++++++ > > > > > > > > > Device Control Registers are essentially another address space found on > > > PowerPC 4xx processors, analogous to PIO on x86. DCRs are always 32 bits, > > > and are identified by a 32-bit number. > > > > Well... 10-bit, actually. > > The mtdcrux description in the ppc440x6 user manual says the following: > > Let the contents of register RA denote a Device Control Register. > The contents of GPR[RS] are placed into the designated Device Control > Register. > > I take that to mean that we must worry about 32 bits worth of DCR numbers. > Perhaps I should say "no more than" rather than "always". I think that's less misleading. mtdcrux is very new, anything which only has the mtdcr instruction certainly can't take DCR numbers above 10 bits, and I would expect that even on chips with mtdcrux the DCR bus is probably still only 10-bits, although it could be extended. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson |
From: Hollis B. <ho...@us...> - 2008-04-08 03:26:03
|
On Monday 07 April 2008 20:11:28 David Gibson wrote: > On Mon, Apr 07, 2008 at 03:53:33PM -0500, Hollis Blanchard wrote: > > 1 file changed, 7 insertions(+) > > include/linux/kvm.h | 7 +++++++ > > > > > > Device Control Registers are essentially another address space found on > > PowerPC 4xx processors, analogous to PIO on x86. DCRs are always 32 bits, > > and are identified by a 32-bit number. > > Well... 10-bit, actually. The mtdcrux description in the ppc440x6 user manual says the following: Let the contents of register RA denote a Device Control Register. The contents of GPR[RS] are placed into the designated Device Control Register. I take that to mean that we must worry about 32 bits worth of DCR numbers. Perhaps I should say "no more than" rather than "always". -- Hollis Blanchard IBM Linux Technology Center |
From: Arnd B. <ar...@ar...> - 2008-04-08 02:58:22
|
On Monday 07 April 2008, Hollis Blanchard wrote: > --- a/include/asm-powerpc/kvm.h > +++ b/include/asm-powerpc/kvm.h > @@ -1,6 +1,55 @@ > +/* > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License, version 2, as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. > + * > + * Copyright IBM Corp. 2007 > + * > + * Authors: Hollis Blanchard <ho...@us...> > + */ > + > #ifndef __LINUX_KVM_POWERPC_H > #define __LINUX_KVM_POWERPC_H > > -/* powerpc does not support KVM */ > +#include <asm/types.h> > > -#endif > +struct kvm_regs { > + __u32 pc; > + __u32 cr; > + __u32 ctr; > + __u32 lr; > + __u32 xer; > + __u32 msr; > + __u32 srr0; > + __u32 srr1; > + __u32 pid; > + > + __u32 sprg0; > + __u32 sprg1; > + __u32 sprg2; > + __u32 sprg3; > + __u32 sprg4; > + __u32 sprg5; > + __u32 sprg6; > + __u32 sprg7; > + > + __u64 fpr[32]; > + __u32 gpr[32]; > +}; > + > +struct kvm_sregs { > +}; > + > +struct kvm_fpu { > +}; > + > +#endif /* __LINUX_KVM_POWERPC_H */ Since this defines part of the ABI, it would be nice if it's possible to have it in a platform independent way. Most of the registers here should probably become "unsigned long" instead of "__u32" so that the definition can be used for a potential 64 bit port. Also, I noticed that you lump everything into kvm_regs, instead of using sregs for stuff like srr0 and kvm_fpu for the fprs. What is the reason for that? Arnd <>< |
From: Josh B. <jw...@li...> - 2008-04-08 02:45:49
|
On Tue, 8 Apr 2008 04:41:28 +0200 Arnd Bergmann <ar...@ar...> wrote: > On Tuesday 08 April 2008, Josh Boyer wrote: > > > > > > Actually, a static assignment to 0 has not caused the symbol to end up > > > in .data for many gcc versions, it always goes into .bss now unless you > > > assign it a value other than 0 or use explicit section attributes. > > > > IIRC, gcc 3.2 is still supported and it didn't do that. Old toolchains > > still exist. > > Ok, I thought it was before 3.2. The oldest version I had around was > gcc-3.3 and that had the new behaviour. http://www.gnu.org/software/gcc/gcc-3.3/changes.html gcc 3.3.1 it seems. I thought I included that in the original reply, but it's late and I suck :). josh |
From: Arnd B. <ar...@ar...> - 2008-04-08 02:43:32
|
On Tuesday 08 April 2008, Josh Boyer wrote: > > > > Actually, a static assignment to 0 has not caused the symbol to end up > > in .data for many gcc versions, it always goes into .bss now unless you > > assign it a value other than 0 or use explicit section attributes. > > IIRC, gcc 3.2 is still supported and it didn't do that. Old toolchains > still exist. Ok, I thought it was before 3.2. The oldest version I had around was gcc-3.3 and that had the new behaviour. Arnd <>< |
From: Josh B. <jw...@li...> - 2008-04-08 02:37:31
|
On Tue, 8 Apr 2008 04:21:44 +0200 Arnd Bergmann <ar...@ar...> wrote: > On Friday 04 April 2008, Jerone Young wrote: > > > > +int __init ppc44x_idle_init(void) > > > > +{ > > > > + void *func = modes[current_mode].entry; > > > > + struct device_node *node; > > > > + > > > > + node = of_find_node_by_path("/hypervisor"); > > > > + if (node) { > > > > + /* if we find /hypervisor node is in device tree, > > > > + set idle mode to wait */ > > > > + func = &ppc44x_idle; /* wait */ > > > > + of_node_put(node); > > > > + } > > > > + > > > > + ppc_md.power_save = func; > > > > + return 0; > > > > +} > > > > + > > > > +arch_initcall(ppc44x_idle_init); > > > > > > IIRC, this would over-ride the idle_param() below, is that the > > > intended behavior? > > > > Yes. At the moment if it detects a hypervisor in the kernel tree it > > overrides what the command line says. > > > > This is unusual behavior. Normally, we try to come up with reasonable > defaults and give the user the chance to override it. The /hypervisor check is going away completely in the next version of the patch. I've spoken to the HW guys and am more comfortable with wait being the default. The spin version should still be an option, specified by the kernel parameter. josh |