From: Guillaume T. <gui...@ex...> - 2008-04-29 13:07:13
Hello,

This patch should solve the problem observed during protected mode
transitions that appears for example during the installation of
openSuse-10.3. Unfortunately there is an issue that crashes
kvm-userspace. I'm not sure if it's a problem introduced by the
patch or if the patch is good and raises a new issue.

Here is what I'm doing:

1) Remove the SS patching that modifies SS_SELECTOR in enter_pmode()
   to see vmentry failure.
2) Add the handler that catches the VMentry failure. It is called
   handle_vmentry_failure()
3) while CS.RPL != SS.RPL, emulate the instruction.
4) Add the emulation of "ljmp", "mov r, imm", "mov sreg, r/m16" and
   "mov r/m16, sreg" that have respectively opcode 0xea, 0xb8, 0x8e and
   0x8c.

Normally, it should be sufficient to boot openSuse-10.3 because
instructions that need to be emulated are:

  0x0000000000046e53: ljmp $0x18,$0x6e18
  0x0000000000046e58: mov $0x20,%ax
  0x0000000000046e5c: mov %eax,%ds
  0x0000000000046e5e: mov %ss,%eax
  0x0000000000046e60: and $0xffff,%esp
  0x0000000000046e66: shl $0x4,%eax
  0x0000000000046e69: add %eax,%esp
  0x0000000000046e6b: mov $0x8,%ax
  0x0000000000046e6f: mov %eax,%ss

At this point, cs.rpl is equal to ss.rpl.
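The condition in step 3 can be sketched in isolation. The helper names below are hypothetical and not part of the patch; the sketch only assumes that the RPL is the low two bits of a selector, which is also what the patch's `(ss & 0x03) != (cs & 0x03)` test relies on.

```c
#include <stdbool.h>
#include <stdint.h>

/* The requested privilege level (RPL) is the low two bits of a selector. */
static inline unsigned int selector_rpl(uint16_t sel)
{
	return sel & 0x3;
}

/*
 * Hypothetical helper mirroring the test in invalid_guest_state():
 * keep emulating while CS.RPL and SS.RPL differ.
 */
static inline bool needs_emulation(uint16_t cs, uint16_t ss)
{
	return selector_rpl(cs) != selector_rpl(ss);
}
```

With the selectors from the instruction list, CS = 0x18 and SS = 0x53e1 give RPL 0 and RPL 1, so emulation is needed; once SS is loaded with 0x8 both RPLs are 0.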
I added trace in handle_vmentry_failure() and also in writeback() to
see what functions are emulated and I observe:

[82766.614575] Failed vm entry (exit reason 0x21) invalid guest state
[82766.651046] emulation at (46e53) rip 6e13: ea 18 6e 18
[82766.682611] writeback: dst.byte 0
[82766.706180] writeback: dst.ptr 0x0000000000000000
[82766.734890] writeback: dst.val 0x0
[82766.758591] writeback: src.ptr 0x0000000000000000
[82766.790594] writeback: src.val 0x0
[82766.855058] successfully emulated instruction
[82766.882695] Failed vm entry (exit reason 0x21) invalid guest state
[82766.923061] emulation at (46e58) rip 6e18: 66 b8 20 00
[82766.951079] writeback: dst.byte 2
[82766.975074] writeback: dst.ptr 0xffff810324d07400
[82767.003112] writeback: dst.val 0x20
[82767.027100] writeback: src.ptr 0x0000000000006e1a
[82767.059092] writeback: src.val 0x20
[82767.127094] successfully emulated instruction
[82767.151111] Failed vm entry (exit reason 0x21) invalid guest state
[82767.191099] emulation at (46e5c) rip 6e1c: 8e d8 8c d0
[82767.219156] writeback: dst.byte 4
[82767.243118] writeback: dst.ptr 0xffff810324d07418
[82767.275091] writeback: dst.val 0x800000
[82767.299122] writeback: src.ptr 0x0000000000000000
[82767.331106] writeback: src.val 0x20
[82767.395255] successfully emulated instruction
[82767.423135] Failed vm entry (exit reason 0x21) invalid guest state
[82767.459260] emulation at (46e5e) rip 6e1e: 8c d0 81 e4
[82767.491137] writeback: dst.byte 2
[82767.515117] writeback: dst.ptr 0xffff810324d07400
[82767.543138] writeback: dst.val 0x53e1
[82767.567264] writeback: src.ptr 0xffff810324d07410
[82767.599142] writeback: src.val 0x20
[82767.667146] successfully emulated instruction
[82767.691277] Failed vm entry (exit reason 0x21) invalid guest state
[82767.731152] emulation at (46e60) rip 6e20: 81 e4 ff ff
[82767.763136] writeback: dst.byte 0
[82767.783154] writeback: dst.ptr 0x0000000000000000
[82767.815157] writeback: dst.val 0x2004
[82767.839156] writeback: src.ptr 0x0000000000006e22
[82767.871140] writeback: src.val 0xffff
[82767.939170] successfully emulated instruction
[82767.963307] Failed vm entry (exit reason 0x21) invalid guest state
[82768.003174] emulation at (46e66) rip 6e26: c1 e0 04 01
[82768.035153] writeback: dst.byte 0
[82768.055174] writeback: dst.ptr 0x0000000000000000
[82768.087177] writeback: dst.val 0x53e1
[82768.111178] writeback: src.ptr 0x0000000000006e28
[82768.143157] writeback: src.val 0x4
[82768.211151] successfully emulated instruction
[82768.235189] Failed vm entry (exit reason 0x21) invalid guest state
[82768.271311] emulation at (46e69) rip 6e29: 01 c4 66 b8
[82768.303214] writeback: dst.byte 0
[82768.327213] writeback: dst.ptr 0x0000000000000000
[82768.355238] writeback: dst.val 0x2004
[82768.379316] writeback: src.ptr 0xffff810324d07400
[82768.411227] writeback: src.val 0x53e1
[82768.483168] successfully emulated instruction
[82768.507240] Failed vm entry (exit reason 0x21) invalid guest state
[82768.543329] emulation at (46e6b) rip 6e2b: 66 b8 08 00
[82768.575239] writeback: dst.byte 2
[82768.599233] writeback: dst.ptr 0xffff810324d07400
[82768.627257] writeback: dst.val 0x8
[82768.651246] writeback: src.ptr 0x0000000000006e2d
[82768.683245] writeback: src.val 0x8
[82768.751250] successfully emulated instruction
[82768.775331] Failed vm entry (exit reason 0x21) invalid guest state
[82768.815256] emulation at (46e6f) rip 6e2f: 8e d0 8e c0
[82768.843348] writeback: dst.byte 4
[82768.867268] writeback: dst.ptr 0xffff810324d07410
[82768.899204] writeback: dst.val 0x53e1
[82768.923259] writeback: src.ptr 0x0000000000000000
[82768.951351] writeback: src.val 0x8
[82769.019279] successfully emulated instruction

So everything seems ok but after the emulation of "mov %eax,%ss"
instruction, it seems that cs.rpl == ss.rpl but the guest is still in a
VT-unfriendly state because I have the following error in kvm-userspace:

[guill@enterprise][~/local/kvm-userspace.git/bin]$ ./qemu-system-x86_64
-hda ~/disk_images/hd_50G.qcow2
-cdrom /images_iso/openSUSE-10.3-GM-x86_64-mini.iso -boot d -s -m 1024

exception 13 (33)
rax 0000000000000673 rbx 0000000000800000 rcx 0000000000000000
rdx 00000000000013ca rsi 0000000000055e1c rdi 0000000000055e1d
rsp 00000000fffa0080 rbp 000000000000200b r8 0000000000000000
r9 0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000
r15 0000000000000000 rip 000000000000b071 rflags 00033092
cs 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es 00ff (00000ff0/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss ff11 (000ff110/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 3002 (00030020/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0000 (fffbd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 40920/47 idt 0/ffff cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
code: 17 06 29 4b 01 18 eb 18 a8 25 aa 19 28 4c 01 28 4d 01 01 17 -->
0f 17 0f 01 17 0f 17 12 01 17 2c 25 4b 19 21 00 02 17 1a 94 0a 76 67 61
3d 30 78 25 78 20
Aborted

It's strange because handle_vmentry_failure() is not called.
I'm trying to see where is the problem, any comments are welcome Regards, Guillaume arch/x86/kvm/vmx.c | 68 +++++++++++++++++++++++++++ arch/x86/kvm/vmx.h | 3 + arch/x86/kvm/x86.c | 12 ++-- arch/x86/kvm/x86_emulate.c | 112 +++++++++++++++++++++++++++++++++++++++++++-- include/asm-x86/kvm_host.h | 4 + 5 files changed, 190 insertions(+), 9 deletions(-) --- diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 79cdbe8..a0a13b8 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1272,7 +1272,9 @@ static void enter_pmode(struct kvm_vcpu *vcpu) fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs); fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs); +#if 0 vmcs_write16(GUEST_SS_SELECTOR, 0); +#endif vmcs_write32(GUEST_SS_AR_BYTES, 0x93); vmcs_write16(GUEST_CS_SELECTOR, @@ -2635,6 +2637,66 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) return 1; } +static int invalid_guest_state(struct kvm_vcpu *vcpu, + struct kvm_run *kvm_run, u32 failure_reason) +{ + u16 ss, cs; + u8 opcodes[4]; + unsigned long rip = vcpu->arch.rip; + unsigned long rip_linear; + + ss = vmcs_read16(GUEST_SS_SELECTOR); + cs = vmcs_read16(GUEST_CS_SELECTOR); + + if ((ss & 0x03) != (cs & 0x03)) { + int err; + rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS); + emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu); + printk(KERN_INFO "emulation at (%lx) rip %lx: %02x %02x %02x %02x\n", + rip_linear, + rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]); + err = emulate_instruction(vcpu, kvm_run, 0, 0, 0); + switch (err) { + case EMULATE_DONE: + printk(KERN_INFO "successfully emulated instruction\n"); + return 1; + case EMULATE_DO_MMIO: + printk(KERN_INFO "mmio?\n"); + return 0; + default: + kvm_report_emulation_failure(vcpu, "vmentry failure"); + break; + } + } + + kvm_run->exit_reason = KVM_EXIT_UNKNOWN; + kvm_run->hw.hardware_exit_reason = failure_reason; + return 0; +} + +static int handle_vmentry_failure(struct kvm_vcpu *vcpu, + struct 
kvm_run *kvm_run, + u32 failure_reason) +{ + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); + + printk(KERN_INFO "Failed vm entry (exit reason 0x%x) ", failure_reason); + switch (failure_reason) { + case EXIT_REASON_INVALID_GUEST_STATE: + printk("invalid guest state \n"); + return invalid_guest_state(vcpu, kvm_run, failure_reason); + case EXIT_REASON_MSR_LOADING: + printk("caused by MSR entry %ld loading.\n", exit_qualification); + break; + case EXIT_REASON_MACHINE_CHECK: + printk("caused by machine check.\n"); + break; + default: + printk("reason not known yet!\n"); + break; + } + return 0; +} /* * The exit handlers return 1 if the exit was handled fully and guest execution * may resume. Otherwise they set the kvm_run parameter to indicate what needs @@ -2696,6 +2758,12 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) exit_reason != EXIT_REASON_EPT_VIOLATION)) printk(KERN_WARNING "%s: unexpected, valid vectoring info and " "exit reason is 0x%x\n", __func__, exit_reason); + + if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) { + exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY; + return handle_vmentry_failure(vcpu, kvm_run, exit_reason); + } + if (exit_reason < kvm_vmx_max_exit_handlers && kvm_vmx_exit_handlers[exit_reason]) return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run); diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h index 79d94c6..2cebf48 100644 --- a/arch/x86/kvm/vmx.h +++ b/arch/x86/kvm/vmx.h @@ -238,7 +238,10 @@ enum vmcs_field { #define EXIT_REASON_IO_INSTRUCTION 30 #define EXIT_REASON_MSR_READ 31 #define EXIT_REASON_MSR_WRITE 32 +#define EXIT_REASON_INVALID_GUEST_STATE 33 +#define EXIT_REASON_MSR_LOADING 34 #define EXIT_REASON_MWAIT_INSTRUCTION 36 +#define EXIT_REASON_MACHINE_CHECK 41 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43 #define EXIT_REASON_APIC_ACCESS 44 #define EXIT_REASON_EPT_VIOLATION 48 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 578a0c1..9e5d687 100644 --- 
a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3027,8 +3027,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) return 0; } -static void get_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void get_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->get_segment(vcpu, var, seg); } @@ -3111,8 +3111,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, return 0; } -static void set_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void set_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->set_segment(vcpu, var, seg); } @@ -3270,8 +3270,8 @@ static int load_segment_descriptor_to_kvm_desct(struct kvm_vcpu *vcpu, return 0; } -static int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, - int type_bits, int seg) +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg) { struct kvm_segment kvm_seg; diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 2ca0838..f6b9dad 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -138,7 +138,8 @@ static u16 opcode_table[256] = { /* 0x88 - 0x8F */ ByteOp | DstMem | SrcReg | ModRM | Mov, DstMem | SrcReg | ModRM | Mov, ByteOp | DstReg | SrcMem | ModRM | Mov, DstReg | SrcMem | ModRM | Mov, - 0, ModRM | DstReg, 0, Group | Group1A, + DstMem | SrcReg | ModRM | Mov, ModRM | DstReg, + DstReg | SrcMem | ModRM | Mov, Group | Group1A, /* 0x90 - 0x9F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, @@ -152,7 +153,8 @@ static u16 opcode_table[256] = { ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String, ByteOp | ImplicitOps | String, ImplicitOps | String, /* 0xB0 - 0xBF */ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + DstReg | SrcImm | Mov, 0, 0, 0, 0, 0, 0, 0, /* 0xC0 - 0xC7 */ ByteOp | DstMem | SrcImm | ModRM, DstMem | SrcImmByte | 
ModRM, 0, ImplicitOps | Stack, 0, 0, @@ -168,7 +170,7 @@ static u16 opcode_table[256] = { /* 0xE0 - 0xE7 */ 0, 0, 0, 0, 0, 0, 0, 0, /* 0xE8 - 0xEF */ - ImplicitOps | Stack, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, + ImplicitOps | Stack, SrcImm | ImplicitOps, ImplicitOps, SrcImmByte | ImplicitOps, 0, 0, 0, 0, /* 0xF0 - 0xF7 */ 0, 0, 0, 0, @@ -1511,14 +1513,90 @@ special_insn: break; case 0x88 ... 0x8b: /* mov */ goto mov; + case 0x8c: { /* mov r/m, sreg */ + struct kvm_segment segreg; + + if (c->modrm_mod == 0x3) + c->src.val = c->modrm_val; + + switch ( c->modrm_reg ) { + case 0: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_ES); + break; + case 1: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_CS); + break; + case 2: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_SS); + break; + case 3: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_DS); + break; + case 4: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_FS); + break; + case 5: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_GS); + break; + default: + printk(KERN_INFO "0x8c: Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + c->dst.val = segreg.selector; + c->dst.bytes = 2; + c->dst.ptr = (unsigned long *)decode_register(c->modrm_rm, c->regs, + c->d & ByteOp); + break; + } case 0x8d: /* lea r16/r32, m */ c->dst.val = c->modrm_ea; break; + case 0x8e: { /* mov seg, r/m16 */ + uint16_t sel; + + sel = c->src.val; + switch ( c->modrm_reg ) { + case 0: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_ES) < 0) + goto cannot_emulate; + break; + case 1: + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) + goto cannot_emulate; + break; + case 2: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_SS) < 0) + goto cannot_emulate; + break; + case 3: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_DS) < 0) + goto cannot_emulate; + break; + case 4: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_FS) < 0) + goto cannot_emulate; + break; + case 5: 
+ if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_GS) < 0) + goto cannot_emulate; + break; + default: + printk(KERN_INFO "Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + + c->dst.type = OP_NONE; /* Disable writeback. */ + break; + } case 0x8f: /* pop (sole member of Grp1a) */ rc = emulate_grp1a(ctxt, ops); if (rc != 0) goto done; break; + case 0xb8: /* mov r, imm */ + goto mov; case 0x9c: /* pushf */ c->src.val = (unsigned long) ctxt->eflags; emulate_push(ctxt); @@ -1657,6 +1735,34 @@ special_insn: break; } case 0xe9: /* jmp rel */ + jmp_rel(c, c->src.val); + c->dst.type = OP_NONE; /* Disable writeback. */ + break; + case 0xea: /* jmp far */ { + uint32_t eip; + uint16_t sel; + + switch (c->op_bytes) { + case 2: + eip = insn_fetch(u16, 2, c->eip); + eip = eip & 0x0000FFFF; /* clear upper 16 bits */ + break; + case 4: + eip = insn_fetch(u32, 4, c->eip); + break; + default: + DPRINTF("jmp far: Invalid op_bytes\n"); + goto cannot_emulate; + } + sel = insn_fetch(u16, 2, c->eip); + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { + DPRINTF("jmp far: Failed to load CS descriptor\n"); + goto cannot_emulate; + } + + c->eip = eip; + break; + } case 0xeb: /* jmp rel short */ jmp_rel(c, c->src.val); c->dst.type = OP_NONE; /* Disable writeback. 
*/ diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 4baa9c9..7a0846a 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -495,6 +495,10 @@ int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value); +void set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +void get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg); int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason); void kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); |
From: Anthony L. <an...@co...> - 2008-04-29 16:41:49
Guillaume Thouvenin wrote:
> Hello,
>
> This patch should solve the problem observed during protected mode
> transitions that appears for example during the installation of
> openSuse-10.3. Unfortunately there is an issue that crashes
> kvm-userspace. I'm not sure if it's a problem introduced by the
> patch or if the patch is good and raises a new issue.
>

You still aren't emulating the instructions correctly I think.  Running
your patch, I see:

[  979.755349] Failed vm entry (exit reason 0x21) invalid guest state
[  979.755354] emulation at (46e4b) rip 6e0b: ea 10 6e 18
[  979.755358] successfully emulated instruction
[  979.756105] Failed vm entry (exit reason 0x21) invalid guest state
[  979.756109] emulation at (46e50) rip 6e10: 66 b8 20 00
[  979.756111] successfully emulated instruction
[  979.756749] Failed vm entry (exit reason 0x21) invalid guest state
[  979.756752] emulation at (46e54) rip 6e14: 8e d8 8c d0
[  979.756755] successfully emulated instruction
[  979.757427] Failed vm entry (exit reason 0x21) invalid guest state
[  979.757430] emulation at (46e56) rip 6e16: 8c d0 81 e4
[  979.757433] successfully emulated instruction
[  979.758074] Failed vm entry (exit reason 0x21) invalid guest state
[  979.758077] emulation at (46e58) rip 6e18: 81 e4 ff ff

The corresponding gfxboot code is:

 16301 00006E0B EA[106E]1800        jmp pm_seg.prog_c32:switch_to_pm_20
 16302                              switch_to_pm_20:
 16303
 16304                              bits 32
 16305
 16306 00006E10 66B82000            mov ax,pm_seg.prog_d16
 16307 00006E14 8ED8                mov ds,ax
 16308
 16309 00006E16 8CD0                mov eax,ss
 16310 00006E18 81E4FFFF0000        and esp,0ffffh

The VT state should be correct after executing the instruction at RIP 6E16
(mov eax, ss).  The next instruction should not cause a vmentry
failure.  The fact that it does for you indicates that you're not updating
guest state correctly.

My guess would be that load_segment_descriptor is not updating the
values within the VMCS.

Regards,

Anthony Liguori
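The "exit reason 0x21" in these logs is what remains after the patch clears the failed-VM-entry flag from the raw exit reason. A minimal sketch of that decoding follows; the macro and function names are hypothetical, bit 31 as the failed-entry flag is assumed to match the kernel's VMX_EXIT_REASONS_FAILED_VMENTRY, and the raw value 0x80000021 is inferred rather than taken from the logs, which only print the masked value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bit 31 of the exit reason flags a failed VM entry. */
#define FAILED_VMENTRY_FLAG       0x80000000u
/* Value the patch defines as EXIT_REASON_INVALID_GUEST_STATE. */
#define REASON_INVALID_GUEST_STATE 33u

static inline bool is_failed_vmentry(uint32_t exit_reason)
{
	return (exit_reason & FAILED_VMENTRY_FLAG) != 0;
}

/* Same masking as the patch: clear the failed-entry flag, keep the rest. */
static inline uint32_t masked_exit_reason(uint32_t exit_reason)
{
	return exit_reason & ~FAILED_VMENTRY_FLAG;
}
```

Under these assumptions a raw value of 0x80000021 is a failed VM entry whose masked reason is 0x21, that is 33, the invalid-guest-state case handled by invalid_guest_state().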
From: Laurent V. <Lau...@bu...> - 2008-04-29 17:22:20
Le mardi 29 avril 2008 à 19:09 +0200, Laurent Vivier a écrit : > Le mardi 29 avril 2008 à 11:41 -0500, Anthony Liguori a écrit : > > Guillaume Thouvenin wrote: > > > Hello, > > > > > > This patch should solve the problem observed during protected mode > > > transitions that appears for example during the installation of > > > openSuse-10.3. Unfortunately there is an issue that crashes > > > kvm-userspace. I'm not sure if it's a problem introduced by the > > > patch or if the patch is good and raises a new issue. > > > > > > > You still aren't emulating the instructions correctly I think. Running > > your patch, I see: > > > > [ 979.755349] Failed vm entry (exit reason 0x21) invalid guest state > > [ 979.755354] emulation at (46e4b) rip 6e0b: ea 10 6e 18 > > [ 979.755358] successfully emulated instruction > > [ 979.756105] Failed vm entry (exit reason 0x21) invalid guest state > > [ 979.756109] emulation at (46e50) rip 6e10: 66 b8 20 00 > > [ 979.756111] successfully emulated instruction > > [ 979.756749] Failed vm entry (exit reason 0x21) invalid guest state > > [ 979.756752] emulation at (46e54) rip 6e14: 8e d8 8c d0 > > [ 979.756755] successfully emulated instruction > > [ 979.757427] Failed vm entry (exit reason 0x21) invalid guest state > > [ 979.757430] emulation at (46e56) rip 6e16: 8c d0 81 e4 > > [ 979.757433] successfully emulated instruction > > [ 979.758074] Failed vm entry (exit reason 0x21) invalid guest state > > [ 979.758077] emulation at (46e58) rip 6e18: 81 e4 ff ff > > > > > > The corresponding gfxboot code is: > > > > 16301 00006E0B EA[106E]1800 jmp > > pm_seg.prog_c32:switch_to_pm_20 > > 16302 switch_to_pm_20: > > 16303 > > 16304 bits 32 > > 16305 > > 16306 00006E10 66B82000 mov ax,pm_seg.prog_d16 > > 16307 00006E14 8ED8 mov ds,ax > > 16308 > > 16309 00006E16 8CD0 mov eax,ss > > 16310 00006E18 81E4FFFF0000 and esp,0ffffh > > > > > > The VT state should be correct after executing instruction an RIP 6E16 > > (mov eax, ss). 
The next instruction should not cause a vmentry > > Are you sure ? It is intel notation (opcode dst,src) , so it updates > eax, not ss. Guillaumes gives us (with gdb notation, opcode src,dst): > > 0x0000000000046e53: ljmp $0x18,$0x6e18 > > 0x0000000000046e58: mov $0x20,%ax > > %EAX = 0x20 > > 0x0000000000046e5c: mov %eax,%ds > > %DS = 0x20 > > 0x0000000000046e5e: mov %ss,%eax > > %EAX = %SS = 0x53E1 (in this particular case) > > For me the issue is with instructions with "dst.byte = 0". > for instance: > > 0x0000000000046e66: shl $0x4,%eax > > [82768.003174] emulation at (46e66) rip 6e26: c1 e0 04 01 > [82768.035153] writeback: dst.byte 0 > [82768.055174] writeback: dst.ptr 0x0000000000000000 > [82768.087177] writeback: dst.val 0x53e1 > [82768.111178] writeback: src.ptr 0x0000000000006e28 > [82768.143157] writeback: src.val 0x4 > > So my questions are: > > Why dst.val is not 0x53e10 ? I can answer myself to this one: emulate_2op_SrcB("sal", c->src, c->dst, ctxt->eflags); does nothing if dst.byte == 0 So next question is the good question... > Why dst.byte is 0 ? > > > failure. The fact that it is for you indicates that you're not updating > > guest state correctly. > > > > My guess would be that load_segment_descriptor is not updating the > > values within the VMCS. > > > > Regards, > > > > Anthony Liguori > > Regards > Laurent -- ------------- Lau...@bu... --------------- "The best way to predict the future is to invent it." - Alan Kay |
From: Avi K. <av...@qu...> - 2008-04-29 23:22:24
Laurent Vivier wrote:
>> Why dst.val is not 0x53e10 ?
>>
>
> I can answer myself to this one:
>
> emulate_2op_SrcB("sal", c->src, c->dst, ctxt->eflags);
>
> does nothing if dst.byte == 0
>
> So next question is the good question...
>
>> Why dst.byte is 0 ?
>>

Because dst.bytes is only set if dst.type == OP_MEM, or ad hoc in the
instruction itself.  Better to set it unconditionally (and adjust in the
instruction if necessary).

-- 
Any sufficiently difficult bug is indistinguishable from a feature.
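Why an unset width silently turns the shift into a no-op can be illustrated with a small stand-alone sketch. It assumes, consistent with Laurent's observation, that the emulator's two-operand helper dispatches on the destination width; `struct operand` and `shl_operand()` are hypothetical stand-ins, not the emulator's real types.

```c
#include <stdint.h>

/* Hypothetical stand-in for the emulator's operand descriptor. */
struct operand {
	unsigned int bytes;	/* operand width: 1, 2, 4 or 8 */
	uint64_t val;
};

/* Shift left, dispatching on the operand width. */
static void shl_operand(struct operand *dst, unsigned int count)
{
	switch (dst->bytes) {
	case 1:
		dst->val = (uint8_t)(dst->val << count);
		break;
	case 2:
		dst->val = (uint16_t)(dst->val << count);
		break;
	case 4:
		dst->val = (uint32_t)(dst->val << count);
		break;
	case 8:
		dst->val <<= count;
		break;
	/* bytes == 0 matches no case: dst->val is left unchanged. */
	}
}
```

With bytes left at 0, shifting 0x53e1 by 4 leaves 0x53e1, which is what the trace shows; with bytes set to 4 the same call yields 0x53e10.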
From: David M. <dm...@ma...> - 2008-04-29 16:55:59
Guillaume Thouvenin wrote: > Hello, > > This patch should solve the problem observed during protected mode > transitions that appears for example during the installation of > openSuse-10.3. Unfortunately there is an issue that crashes > kvm-userspace. I'm not sure if it's a problem introduced by the > patch or if the patch is good and raises a new issue. > > Here is what I'm doing: > > 1) Remove the SS patching that modifies SS_SELECTOR in enter_pmode() > to see vmentry failure. > 2) Add the handler that catches the VMentry failure. It is called > handle_vmentry_failure() > 3) while CS.RPL != SS.RPL, emulate the instruction. > 4) Add the emulation of "ljmp", "mov r, imm", "mov sreg, r/m16" and > "mov r/m16, sreg" that have respectively opcode 0xea, 0xb8, 0x8e and > 0x8c. > > Normally, it should be sufficient to boot openSuse-10.3 because > instructions that need to be emulated are: > > 0x0000000000046e53: ljmp $0x18,$0x6e18 > 0x0000000000046e58: mov $0x20,%ax > 0x0000000000046e5c: mov %eax,%ds > 0x0000000000046e5e: mov %ss,%eax > 0x0000000000046e60: and $0xffff,%esp > 0x0000000000046e66: shl $0x4,%eax > 0x0000000000046e69: add %eax,%esp > 0x0000000000046e6b: mov $0x8,%ax > 0x0000000000046e6f: mov %eax,%ss > > At this point, cs.rpl is equal to ss.rpl. 
> > I added trace in handle_vmentry_failure() and also in writeback() to > see what functions are emulated and I observe: > <snip trace> > > So everything seems ok but after the emulation of "mov %eax,%ss" > instruction, it seems that cs.rpl == ss.rpl but the guest is still in a > VT-unfriendly state because I have the following error in kvm-userspace: > > [guill@enterprise][~/local/kvm-userspace.git/bin]$ ./qemu-system-x86_64 > -hda ~/disk_images/hd_50G.qcow2 > -cdrom /images_iso/openSUSE-10.3-GM-x86_64-mini.iso -boot d -s -m 1024 > > exception 13 (33) > rax 0000000000000673 rbx 0000000000800000 rcx 0000000000000000 > rdx 00000000000013ca rsi 0000000000055e1c rdi 0000000000055e1d > rsp 00000000fffa0080 rbp 000000000000200b r8 0000000000000000 > r9 0000000000000000 r10 0000000000000000 r11 0000000000000000 > r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 > r15 0000000000000000 rip 000000000000b071 rflags 00033092 > cs 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > ds 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > es 00ff (00000ff0/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > ss ff11 (000ff110/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > fs 3002 (00030020/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > tr 0000 (fffbd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) > ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0) > gdt 40920/47 idt 0/ffff cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0 > code: 17 06 29 4b 01 18 eb 18 a8 25 aa 19 28 4c 01 28 4d 01 01 17 --> > 0f 17 0f 01 17 0f 17 12 01 17 2c 25 4b 19 21 00 02 17 1a 94 0a 76 67 61 > 3d 30 78 25 78 20 Aborted My memory of x86 protected mode is flaky so I apologise if this is wasted time. Are we looking at the runtime registers for the VM or the registers for the host? Isn't PE clear in CR0 (which I think is real mode and there should be no cpl or rpl). 
If this is in protected mode (or cpl/rpl are carried over as a side
effect of big real mode), are you sure cs.rpl == ss.rpl? I think I read
cs.rpl == 0 and ss.rpl == 1. The opcode with the exception is pop %ss I
believe (assuming 32 bit code). Is the value dumped for ss the value
loaded by the pop or the value from before the pop?

I think cpl is zero and I thought it was ok for code at some cpl to use
selectors with rpls equal to its cpl or lower (higher rpl number). That
made me wonder if the loaded ss is not the value shown but the value
that would have been loaded by the pop. In which case I wonder if it
would be a selector for an invalid descriptor. It's a shame we don't see
the stack.

Beyond that I risk confusion so I'll leave it there, I hope it helps.

---
David Mair.
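The mode bits David asks about can be read off the dumped cr0 and rflags values mechanically. The sketch below uses hypothetical helper names and relies only on the architectural bit positions: CR0.PE is bit 0, EFLAGS.VM is bit 17 and EFLAGS.IOPL is bits 12-13.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE		(1u << 0)	/* protection enable */
#define EFLAGS_VM	(1u << 17)	/* virtual-8086 mode */

static inline bool cr0_pe_set(uint32_t cr0)
{
	return (cr0 & CR0_PE) != 0;
}

static inline bool eflags_vm_set(uint32_t eflags)
{
	return (eflags & EFLAGS_VM) != 0;
}

/* I/O privilege level, bits 12-13 of EFLAGS. */
static inline unsigned int eflags_iopl(uint32_t eflags)
{
	return (eflags >> 12) & 0x3;
}
```

For the dumped values cr0 = 0x10 and rflags = 0x33092 this reports PE clear, VM set and IOPL 3. Whether the dump reflects the state the guest sees or the state KVM set up to run it is the open question above, and the sketch does not settle it.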
From: Laurent V. <Lau...@bu...> - 2008-04-29 17:09:39
Le mardi 29 avril 2008 à 11:41 -0500, Anthony Liguori a écrit : > Guillaume Thouvenin wrote: > > Hello, > > > > This patch should solve the problem observed during protected mode > > transitions that appears for example during the installation of > > openSuse-10.3. Unfortunately there is an issue that crashes > > kvm-userspace. I'm not sure if it's a problem introduced by the > > patch or if the patch is good and raises a new issue. > > > > You still aren't emulating the instructions correctly I think. Running > your patch, I see: > > [ 979.755349] Failed vm entry (exit reason 0x21) invalid guest state > [ 979.755354] emulation at (46e4b) rip 6e0b: ea 10 6e 18 > [ 979.755358] successfully emulated instruction > [ 979.756105] Failed vm entry (exit reason 0x21) invalid guest state > [ 979.756109] emulation at (46e50) rip 6e10: 66 b8 20 00 > [ 979.756111] successfully emulated instruction > [ 979.756749] Failed vm entry (exit reason 0x21) invalid guest state > [ 979.756752] emulation at (46e54) rip 6e14: 8e d8 8c d0 > [ 979.756755] successfully emulated instruction > [ 979.757427] Failed vm entry (exit reason 0x21) invalid guest state > [ 979.757430] emulation at (46e56) rip 6e16: 8c d0 81 e4 > [ 979.757433] successfully emulated instruction > [ 979.758074] Failed vm entry (exit reason 0x21) invalid guest state > [ 979.758077] emulation at (46e58) rip 6e18: 81 e4 ff ff > > > The corresponding gfxboot code is: > > 16301 00006E0B EA[106E]1800 jmp > pm_seg.prog_c32:switch_to_pm_20 > 16302 switch_to_pm_20: > 16303 > 16304 bits 32 > 16305 > 16306 00006E10 66B82000 mov ax,pm_seg.prog_d16 > 16307 00006E14 8ED8 mov ds,ax > 16308 > 16309 00006E16 8CD0 mov eax,ss > 16310 00006E18 81E4FFFF0000 and esp,0ffffh > > > The VT state should be correct after executing instruction an RIP 6E16 > (mov eax, ss). The next instruction should not cause a vmentry Are you sure ? It is intel notation (opcode dst,src) , so it updates eax, not ss. 
Guillaume gives us (with gdb notation, opcode src,dst):

0x0000000000046e53: ljmp $0x18,$0x6e18

0x0000000000046e58: mov $0x20,%ax

%EAX = 0x20

0x0000000000046e5c: mov %eax,%ds

%DS = 0x20

0x0000000000046e5e: mov %ss,%eax

%EAX = %SS = 0x53E1 (in this particular case)

For me the issue is with instructions with "dst.byte = 0".
for instance:

0x0000000000046e66: shl $0x4,%eax

[82768.003174] emulation at (46e66) rip 6e26: c1 e0 04 01
[82768.035153] writeback: dst.byte 0
[82768.055174] writeback: dst.ptr 0x0000000000000000
[82768.087177] writeback: dst.val 0x53e1
[82768.111178] writeback: src.ptr 0x0000000000006e28
[82768.143157] writeback: src.val 0x4

So my questions are:

Why dst.val is not 0x53e10 ?
Why dst.byte is 0 ?

> failure.  The fact that it is for you indicates that you're not updating
> guest state correctly.
>
> My guess would be that load_segment_descriptor is not updating the
> values within the VMCS.
>
> Regards,
>
> Anthony Liguori

Regards
Laurent
-- 
------------- Lau...@bu... ---------------
"The best way to predict the future is to invent it."
- Alan Kay
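The value 0x53e10 expected above is one step of the arithmetic the guest code performs: it converts the real-mode SS:SP pair into a flat stack pointer. A minimal sketch of the whole and/shl/add sequence follows; the function name is hypothetical, and 0x2004 is taken as the low 16 bits of ESP only because the trace suggests it.

```c
#include <stdint.h>

/*
 * Flat stack pointer as computed by the traced sequence:
 *   mov %ss,%eax ; and $0xffff,%esp ; shl $0x4,%eax ; add %eax,%esp
 */
static uint32_t flat_stack_pointer(uint16_t ss, uint32_t esp)
{
	uint32_t base = (uint32_t)ss << 4;	/* shl $0x4,%eax */

	return base + (esp & 0xffff);		/* and, then add */
}
```

With SS = 0x53e1 the shifted base is 0x53e10, and adding 0x2004 gives 0x55e14. If the shift is skipped because the operand width is 0, the sum becomes 0x53e1 + 0x2004 instead, so the guest continues with a wrong stack pointer.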
From: Anthony L. <an...@co...> - 2008-04-29 18:17:15
Laurent Vivier wrote: > Le mardi 29 avril 2008 à 11:41 -0500, Anthony Liguori a écrit : > >> Guillaume Thouvenin wrote: >> >>> Hello, >>> >>> This patch should solve the problem observed during protected mode >>> transitions that appears for example during the installation of >>> openSuse-10.3. Unfortunately there is an issue that crashes >>> kvm-userspace. I'm not sure if it's a problem introduced by the >>> patch or if the patch is good and raises a new issue. >>> >>> >> You still aren't emulating the instructions correctly I think. Running >> your patch, I see: >> >> [ 979.755349] Failed vm entry (exit reason 0x21) invalid guest state >> [ 979.755354] emulation at (46e4b) rip 6e0b: ea 10 6e 18 >> [ 979.755358] successfully emulated instruction >> [ 979.756105] Failed vm entry (exit reason 0x21) invalid guest state >> [ 979.756109] emulation at (46e50) rip 6e10: 66 b8 20 00 >> [ 979.756111] successfully emulated instruction >> [ 979.756749] Failed vm entry (exit reason 0x21) invalid guest state >> [ 979.756752] emulation at (46e54) rip 6e14: 8e d8 8c d0 >> [ 979.756755] successfully emulated instruction >> [ 979.757427] Failed vm entry (exit reason 0x21) invalid guest state >> [ 979.757430] emulation at (46e56) rip 6e16: 8c d0 81 e4 >> [ 979.757433] successfully emulated instruction >> [ 979.758074] Failed vm entry (exit reason 0x21) invalid guest state >> [ 979.758077] emulation at (46e58) rip 6e18: 81 e4 ff ff >> >> >> The corresponding gfxboot code is: >> >> 16301 00006E0B EA[106E]1800 jmp >> pm_seg.prog_c32:switch_to_pm_20 >> 16302 switch_to_pm_20: >> 16303 >> 16304 bits 32 >> 16305 >> 16306 00006E10 66B82000 mov ax,pm_seg.prog_d16 >> 16307 00006E14 8ED8 mov ds,ax >> 16308 >> 16309 00006E16 8CD0 mov eax,ss >> 16310 00006E18 81E4FFFF0000 and esp,0ffffh >> >> >> The VT state should be correct after executing instruction an RIP 6E16 >> (mov eax, ss). The next instruction should not cause a vmentry >> > > Are you sure ? 
It is intel notation (opcode dst,src) , so it updates > eax, not ss. Guillaumes gives us (with gdb notation, opcode src,dst): > You're right, it's a fair bit down the code before the ss move happens. Regards, Anthony Liguori > 0x0000000000046e53: ljmp $0x18,$0x6e18 > > 0x0000000000046e58: mov $0x20,%ax > > %EAX = 0x20 > > 0x0000000000046e5c: mov %eax,%ds > > %DS = 0x20 > > 0x0000000000046e5e: mov %ss,%eax > > %EAX = %SS = 0x53E1 (in this particular case) > > For me the issue is with instructions with "dst.byte = 0". > for instance: > > 0x0000000000046e66: shl $0x4,%eax > > [82768.003174] emulation at (46e66) rip 6e26: c1 e0 04 01 > [82768.035153] writeback: dst.byte 0 > [82768.055174] writeback: dst.ptr 0x0000000000000000 > [82768.087177] writeback: dst.val 0x53e1 > [82768.111178] writeback: src.ptr 0x0000000000006e28 > [82768.143157] writeback: src.val 0x4 > > So my questions are: > > Why dst.val is not 0x53e10 ? > Why dst.byte is 0 ? > > >> failure. The fact that it is for you indicates that you're not updating >> guest state correctly. >> >> My guess would be that load_segment_descriptor is not updating the >> values within the VMCS. >> >> Regards, >> >> Anthony Liguori >> > > Regards > Laurent > |
From: Anthony L. <an...@co...> - 2008-04-29 18:16:59
|
Guillaume Thouvenin wrote: > Hello, > > It's strange because handle_vmentry_failure() is not called. I'm trying > to see where is the problem, any comments are welcome > [ 979.761321] handle_exception: unexpected, vectoring info 0x80000306 intr info 0x80000b0d Is the error I'm seeing. Regards, Anthony Liguori > Regards, > Guillaume > > > > arch/x86/kvm/vmx.c | 68 +++++++++++++++++++++++++++ > arch/x86/kvm/vmx.h | 3 + > arch/x86/kvm/x86.c | 12 ++-- > arch/x86/kvm/x86_emulate.c | 112 +++++++++++++++++++++++++++++++++++++++++++-- > include/asm-x86/kvm_host.h | 4 + > 5 files changed, 190 insertions(+), 9 deletions(-) > > --- > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index 79cdbe8..a0a13b8 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -1272,7 +1272,9 @@ static void enter_pmode(struct kvm_vcpu *vcpu) > fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs); > fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs); > > +#if 0 > vmcs_write16(GUEST_SS_SELECTOR, 0); > +#endif > vmcs_write32(GUEST_SS_AR_BYTES, 0x93); > > vmcs_write16(GUEST_CS_SELECTOR, > @@ -2635,6 +2637,66 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) > return 1; > } > > +static int invalid_guest_state(struct kvm_vcpu *vcpu, > + struct kvm_run *kvm_run, u32 failure_reason) > +{ > + u16 ss, cs; > + u8 opcodes[4]; > + unsigned long rip = vcpu->arch.rip; > + unsigned long rip_linear; > + > + ss = vmcs_read16(GUEST_SS_SELECTOR); > + cs = vmcs_read16(GUEST_CS_SELECTOR); > + > + if ((ss & 0x03) != (cs & 0x03)) { > + int err; > + rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS); > + emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu); > + printk(KERN_INFO "emulation at (%lx) rip %lx: %02x %02x %02x %02x\n", > + rip_linear, > + rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]); > + err = emulate_instruction(vcpu, kvm_run, 0, 0, 0); > + switch (err) { > + case EMULATE_DONE: > + printk(KERN_INFO "successfully emulated 
instruction\n"); > + return 1; > + case EMULATE_DO_MMIO: > + printk(KERN_INFO "mmio?\n"); > + return 0; > + default: > + kvm_report_emulation_failure(vcpu, "vmentry failure"); > + break; > + } > + } > + > + kvm_run->exit_reason = KVM_EXIT_UNKNOWN; > + kvm_run->hw.hardware_exit_reason = failure_reason; > + return 0; > +} > + > +static int handle_vmentry_failure(struct kvm_vcpu *vcpu, > + struct kvm_run *kvm_run, > + u32 failure_reason) > +{ > + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); > + > + printk(KERN_INFO "Failed vm entry (exit reason 0x%x) ", failure_reason); > + switch (failure_reason) { > + case EXIT_REASON_INVALID_GUEST_STATE: > + printk("invalid guest state \n"); > + return invalid_guest_state(vcpu, kvm_run, failure_reason); > + case EXIT_REASON_MSR_LOADING: > + printk("caused by MSR entry %ld loading.\n", exit_qualification); > + break; > + case EXIT_REASON_MACHINE_CHECK: > + printk("caused by machine check.\n"); > + break; > + default: > + printk("reason not known yet!\n"); > + break; > + } > + return 0; > +} > /* > * The exit handlers return 1 if the exit was handled fully and guest execution > * may resume. 
Otherwise they set the kvm_run parameter to indicate what needs > @@ -2696,6 +2758,12 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) > exit_reason != EXIT_REASON_EPT_VIOLATION)) > printk(KERN_WARNING "%s: unexpected, valid vectoring info and " > "exit reason is 0x%x\n", __func__, exit_reason); > + > + if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) { > + exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY; > + return handle_vmentry_failure(vcpu, kvm_run, exit_reason); > + } > + > if (exit_reason < kvm_vmx_max_exit_handlers > && kvm_vmx_exit_handlers[exit_reason]) > return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run); > diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h > index 79d94c6..2cebf48 100644 > --- a/arch/x86/kvm/vmx.h > +++ b/arch/x86/kvm/vmx.h > @@ -238,7 +238,10 @@ enum vmcs_field { > #define EXIT_REASON_IO_INSTRUCTION 30 > #define EXIT_REASON_MSR_READ 31 > #define EXIT_REASON_MSR_WRITE 32 > +#define EXIT_REASON_INVALID_GUEST_STATE 33 > +#define EXIT_REASON_MSR_LOADING 34 > #define EXIT_REASON_MWAIT_INSTRUCTION 36 > +#define EXIT_REASON_MACHINE_CHECK 41 > #define EXIT_REASON_TPR_BELOW_THRESHOLD 43 > #define EXIT_REASON_APIC_ACCESS 44 > #define EXIT_REASON_EPT_VIOLATION 48 > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 578a0c1..9e5d687 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3027,8 +3027,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) > return 0; > } > > -static void get_segment(struct kvm_vcpu *vcpu, > - struct kvm_segment *var, int seg) > +void get_segment(struct kvm_vcpu *vcpu, > + struct kvm_segment *var, int seg) > { > kvm_x86_ops->get_segment(vcpu, var, seg); > } > @@ -3111,8 +3111,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, > return 0; > } > > -static void set_segment(struct kvm_vcpu *vcpu, > - struct kvm_segment *var, int seg) > +void set_segment(struct kvm_vcpu *vcpu, > + struct kvm_segment *var, int seg) > { > 
kvm_x86_ops->set_segment(vcpu, var, seg); > } > @@ -3270,8 +3270,8 @@ static int load_segment_descriptor_to_kvm_desct(struct kvm_vcpu *vcpu, > return 0; > } > > -static int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, > - int type_bits, int seg) > +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, > + int type_bits, int seg) > { > struct kvm_segment kvm_seg; > > diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c > index 2ca0838..f6b9dad 100644 > --- a/arch/x86/kvm/x86_emulate.c > +++ b/arch/x86/kvm/x86_emulate.c > @@ -138,7 +138,8 @@ static u16 opcode_table[256] = { > /* 0x88 - 0x8F */ > ByteOp | DstMem | SrcReg | ModRM | Mov, DstMem | SrcReg | ModRM | Mov, > ByteOp | DstReg | SrcMem | ModRM | Mov, DstReg | SrcMem | ModRM | Mov, > - 0, ModRM | DstReg, 0, Group | Group1A, > + DstMem | SrcReg | ModRM | Mov, ModRM | DstReg, > + DstReg | SrcMem | ModRM | Mov, Group | Group1A, > /* 0x90 - 0x9F */ > 0, 0, 0, 0, 0, 0, 0, 0, > 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, > @@ -152,7 +153,8 @@ static u16 opcode_table[256] = { > ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String, > ByteOp | ImplicitOps | String, ImplicitOps | String, > /* 0xB0 - 0xBF */ > - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > + 0, 0, 0, 0, 0, 0, 0, 0, > + DstReg | SrcImm | Mov, 0, 0, 0, 0, 0, 0, 0, > /* 0xC0 - 0xC7 */ > ByteOp | DstMem | SrcImm | ModRM, DstMem | SrcImmByte | ModRM, > 0, ImplicitOps | Stack, 0, 0, > @@ -168,7 +170,7 @@ static u16 opcode_table[256] = { > /* 0xE0 - 0xE7 */ > 0, 0, 0, 0, 0, 0, 0, 0, > /* 0xE8 - 0xEF */ > - ImplicitOps | Stack, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, > + ImplicitOps | Stack, SrcImm | ImplicitOps, ImplicitOps, SrcImmByte | ImplicitOps, > 0, 0, 0, 0, > /* 0xF0 - 0xF7 */ > 0, 0, 0, 0, > @@ -1511,14 +1513,90 @@ special_insn: > break; > case 0x88 ... 
0x8b: /* mov */ > goto mov; > + case 0x8c: { /* mov r/m, sreg */ > + struct kvm_segment segreg; > + > + if (c->modrm_mod == 0x3) > + c->src.val = c->modrm_val; > + > + switch ( c->modrm_reg ) { > + case 0: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_ES); > + break; > + case 1: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_CS); > + break; > + case 2: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_SS); > + break; > + case 3: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_DS); > + break; > + case 4: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_FS); > + break; > + case 5: > + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_GS); > + break; > + default: > + printk(KERN_INFO "0x8c: Invalid segreg in modrm byte 0x%02x\n", > + c->modrm); > + goto cannot_emulate; > + } > + c->dst.val = segreg.selector; > + c->dst.bytes = 2; > + c->dst.ptr = (unsigned long *)decode_register(c->modrm_rm, c->regs, > + c->d & ByteOp); > + break; > + } > case 0x8d: /* lea r16/r32, m */ > c->dst.val = c->modrm_ea; > break; > + case 0x8e: { /* mov seg, r/m16 */ > + uint16_t sel; > + > + sel = c->src.val; > + switch ( c->modrm_reg ) { > + case 0: > + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_ES) < 0) > + goto cannot_emulate; > + break; > + case 1: > + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) > + goto cannot_emulate; > + break; > + case 2: > + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_SS) < 0) > + goto cannot_emulate; > + break; > + case 3: > + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_DS) < 0) > + goto cannot_emulate; > + break; > + case 4: > + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_FS) < 0) > + goto cannot_emulate; > + break; > + case 5: > + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_GS) < 0) > + goto cannot_emulate; > + break; > + default: > + printk(KERN_INFO "Invalid segreg in modrm byte 0x%02x\n", > + c->modrm); > + goto cannot_emulate; > + } > + > + c->dst.type = OP_NONE; /* Disable 
writeback. */ > + break; > + } > case 0x8f: /* pop (sole member of Grp1a) */ > rc = emulate_grp1a(ctxt, ops); > if (rc != 0) > goto done; > break; > + case 0xb8: /* mov r, imm */ > + goto mov; > case 0x9c: /* pushf */ > c->src.val = (unsigned long) ctxt->eflags; > emulate_push(ctxt); > @@ -1657,6 +1735,34 @@ special_insn: > break; > } > case 0xe9: /* jmp rel */ > + jmp_rel(c, c->src.val); > + c->dst.type = OP_NONE; /* Disable writeback. */ > + break; > + case 0xea: /* jmp far */ { > + uint32_t eip; > + uint16_t sel; > + > + switch (c->op_bytes) { > + case 2: > + eip = insn_fetch(u16, 2, c->eip); > + eip = eip & 0x0000FFFF; /* clear upper 16 bits */ > + break; > + case 4: > + eip = insn_fetch(u32, 4, c->eip); > + break; > + default: > + DPRINTF("jmp far: Invalid op_bytes\n"); > + goto cannot_emulate; > + } > + sel = insn_fetch(u16, 2, c->eip); > + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { > + DPRINTF("jmp far: Failed to load CS descriptor\n"); > + goto cannot_emulate; > + } > + > + c->eip = eip; > + break; > + } > case 0xeb: /* jmp rel short */ > jmp_rel(c, c->src.val); > c->dst.type = OP_NONE; /* Disable writeback. */ > diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h > index 4baa9c9..7a0846a 100644 > --- a/include/asm-x86/kvm_host.h > +++ b/include/asm-x86/kvm_host.h > @@ -495,6 +495,10 @@ int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, > int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, > unsigned long value); > > +void set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); > +void get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); > +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, > + int type_bits, int seg); > int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason); > > void kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); > |
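The two `switch (c->modrm_reg)` blocks in the patch above both map the ModRM reg field to a segment register (0 = ES, 1 = CS, 2 = SS, 3 = DS, 4 = FS, 5 = GS). A minimal sketch of that decoding follows. The enum and helpers are hypothetical and use the encoding order directly; they are not the kernel's `VCPU_SREG_*` constants, which may be numbered differently.

```c
#include <stdint.h>

/* Segment registers in ModRM reg-field order for opcodes 0x8c / 0x8e. */
enum sreg {
	SREG_ES, SREG_CS, SREG_SS, SREG_DS, SREG_FS, SREG_GS,
	SREG_INVALID
};

struct modrm {
	uint8_t mod;	/* bits 7:6 */
	uint8_t reg;	/* bits 5:3 */
	uint8_t rm;	/* bits 2:0 */
};

/* Split a ModRM byte into its three fields. */
static struct modrm decode_modrm(uint8_t byte)
{
	struct modrm m = {
		.mod = (byte >> 6) & 0x3,
		.reg = (byte >> 3) & 0x7,
		.rm  = byte & 0x7,
	};

	return m;
}

/* Map the reg field to a segment register; 6 and 7 are not valid. */
static enum sreg modrm_to_sreg(uint8_t byte)
{
	uint8_t reg = decode_modrm(byte).reg;

	return reg <= SREG_GS ? (enum sreg)reg : SREG_INVALID;
}
```

For the bytes in the trace, `8e d8` decodes to mod 3, reg 3 (DS), rm 0, and `8c d0` decodes to mod 3, reg 2 (SS), rm 0, which matches `mov %eax,%ds` and `mov %ss,%eax`.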
From: Marcelo T. <mto...@re...> - 2008-05-01 19:11:10
|
Hi Guillaume, On Tue, Apr 29, 2008 at 03:02:36PM +0200, Guillaume Thouvenin wrote: > Hello, <snip> > -hda ~/disk_images/hd_50G.qcow2 > -cdrom /images_iso/openSUSE-10.3-GM-x86_64-mini.iso -boot d -s -m 1024 > > exception 13 (33) > rax 0000000000000673 rbx 0000000000800000 rcx 0000000000000000 > rdx 00000000000013ca rsi 0000000000055e1c rdi 0000000000055e1d > rsp 00000000fffa0080 rbp 000000000000200b r8 0000000000000000 > r9 0000000000000000 r10 0000000000000000 r11 0000000000000000 > r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 > r15 0000000000000000 rip 000000000000b071 rflags 00033092 > cs 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > ds 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > es 00ff (00000ff0/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > ss ff11 (000ff110/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > fs 3002 (00030020/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) > tr 0000 (fffbd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) > ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0) > gdt 40920/47 idt 0/ffff cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0 > code: 17 06 29 4b 01 18 eb 18 a8 25 aa 19 28 4c 01 28 4d 01 01 17 --> > 0f 17 0f 01 17 0f 17 12 01 17 2c 25 4b 19 21 00 02 17 1a 94 0a 76 67 61 > 3d 30 78 25 78 20 Aborted > > It's strange because handle_vmentry_failure() is not called. 
I'm trying > to see where is the problem, any comments are welcome Not sure if this is the same problem you're seeing, but with your patch Plan9 triggers: exception 13 (6b) rax 0000000000010010 rbx 0000000000000001 rcx 00000000f0012000 rdx 00000000000000a1 rsi 00000000f0101000 rdi 00000000f0009000 rsp 0000000000007bfc rbp 00000000f0001320 r8 0000000000000000 r9 0000000000000000 r10 0000000000000000 r11 0000000000000000 r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000 rip 000000000000023e rflags 00033002 cs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ds 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) es 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ss 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) fs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) tr 0000 (fffbd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0) gdt 14000/4f idt 0/3ff cr0 10010 cr2 0 cr3 12000 cr4 d0 cr8 0 efer 0 code: 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff --> 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 53 ff 00 f0 The code sequence is: 8235: 66 data16 8236: 0f 22 c0 mov %eax,%cr0 8239: ea 3e 02 00 08 b8 00 ljmp $0xb8,$0x800023e So it switches to realmode and then does a ljmp. Problem is that you're using the segment selector as a GDT index, but in realmode it should be shifted left by 4 to determine the segment base address. Following patch makes Plan9 happy. Other than that, load_segment_descriptor() can return a positive error on failure, should do a proper check. 
Index: kvm/arch/x86/kvm/x86_emulate.c =================================================================== --- kvm.orig/arch/x86/kvm/x86_emulate.c +++ kvm/arch/x86/kvm/x86_emulate.c @@ -1755,7 +1755,10 @@ special_insn: goto cannot_emulate; } sel = insn_fetch(u16, 2, c->eip); - if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { + if (ctxt->mode == X86EMUL_MODE_REAL) + eip |= (sel << 4); + else if (load_segment_descriptor(ctxt->vcpu, sel, 9, + VCPU_SREG_CS) < 0) { DPRINTF("jmp far: Failed to load CS descriptor\n"); goto cannot_emulate; } |
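The real-mode rule used by this patch (segment base is the selector shifted left by 4) can be checked against the register dumps in this message. A minimal sketch with hypothetical helper names, assuming the Plan9 bytes `ea 3e 02 00 08` are read as a 16-bit far jump to 0x0800:0x023e:

```c
#include <stdint.h>

/* Real-mode segment base: the selector value shifted left by 4. */
static uint32_t real_mode_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/* Linear address of sel:off in real mode: base plus offset. */
static uint32_t real_mode_linear(uint16_t sel, uint16_t off)
{
	return real_mode_base(sel) + off;
}
```

`real_mode_base(0x4004)` gives 0x40040 and `real_mode_base(0xff11)` gives 0xff110, the bases shown for cs and ss in the quoted dump. Under the assumption above, `real_mode_linear(0x0800, 0x023e)` is 0x823e, the address right after the 5-byte ljmp at 0x8239. Note that the linear address is a sum: OR-ing the shifted selector into the offset gives the same number only when the two have no overlapping bits.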
From: Balaji R. <bal...@gm...> - 2008-05-03 08:27:08
|
On Friday 02 May 2008 12:43:31 am Marcelo Tosatti wrote: Hi Guillaume, With your patch applied ubuntu 8.04 livecd fails to boot. Not any better with Marcelo's patch on top. exception 13 (33) rax 000000000000007f rbx 0000000000800000 rcx 0000000000000000 rdx 0000000000000000 rsi 000000000005a81c rdi 000000000005a820 rsp 00000000fffa97cc rbp 000000000000200c r8 0000000000000000 r9 0000000000000000 r10 0000000000000000 r11 0000000000000000 r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000 rip 000000000000b02c rflags 00033882 cs 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ds 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) es 4004 (00040040/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ss 5881 (00058810/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) fs 3002 (00030020/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) gs 0000 (00000000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) tr 0000 (fffbd000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0) gdt 40920/47 idt 0/ffff cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0 code: 10 28 6d 01 28 1e 01 28 6d 01 28 1f 01 28 6d 01 28 73 01 17 --> 0f 28 6d 01 28 74 01 17 0f 17 3b 28 6d 01 28 75 01 17 0f 28 6d 01 28 76 01 17 0f 11 1c 17 Aborted -- Warm Regards, Balaji Rao Dept. of Mechanical Engineering NITK |
From: Guillaume T. <gui...@ex...> - 2008-05-05 12:41:24
|
On Sat, 3 May 2008 13:56:56 +0530 Balaji Rao <bal...@gm...> wrote: > With your patch applied ubuntu 8.04 livecd fails to boot. Not any better > with Marcelo's patch on top. Hi Balaji, And without the patch, can you boot the ubuntu 8.04 livecd? Regards, Guillaume |
From: Balaji R. <bal...@gm...> - 2008-05-05 12:44:35
|
On Monday 05 May 2008 06:10:08 pm Guillaume Thouvenin wrote: > On Sat, 3 May 2008 13:56:56 +0530 > > Balaji Rao <bal...@gm...> wrote: > > With your patch applied ubuntu 8.04 livecd fails to boot. Not any better > > with Marcelo's patch on top. > > Hi Balaji, > > And without the patch, can you boot the ubuntu 8.04 livecd? Yes, I can. :) > > Regards, > Guillaume -- Warm Regards, Balaji Rao Dept. of Mechanical Engineering NITK |
From: Anthony L. <an...@co...> - 2008-05-05 12:57:06
|
Guillaume Thouvenin wrote: > On Sat, 3 May 2008 13:56:56 +0530 > Balaji Rao <bal...@gm...> wrote: > > > >> With your patch applied ubuntu 8.04 livecd fails to boot. Not any better >> with Marcelo's patch on top. >> > > Hi Balaji, > > And without the patch, can you boot the ubuntu 8.04 livecd? > WinXP fails to boot with your patch applied too. FWIW, Ubuntu 8.04 has a fixed version of gfxboot that doesn't do nasty things with SS on privileged mode transitions. Regards, Anthony Liguori > Regards, > Guillaume > |
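The SS problem mentioned here can be made concrete with the condition from Guillaume's patch, `(ss & 0x03) != (cs & 0x03)`. A minimal sketch with hypothetical helper names, using the selector values from the openSUSE trace in the opening message (CS 0x18 after the ljmp, SS still 0x53e1 from real mode, then SS 0x8 after the final mov):

```c
#include <stdbool.h>
#include <stdint.h>

#define SELECTOR_RPL_MASK 0x03	/* RPL lives in the low two selector bits */

static unsigned int selector_rpl(uint16_t sel)
{
	return sel & SELECTOR_RPL_MASK;
}

/* True while CS.RPL and SS.RPL differ, i.e. while emulation must continue. */
static bool rpl_mismatch(uint16_t cs, uint16_t ss)
{
	return selector_rpl(cs) != selector_rpl(ss);
}
```

`rpl_mismatch(0x18, 0x53e1)` is true (RPL 0 against RPL 1), while `rpl_mismatch(0x18, 0x08)` is false, which is the point where the patch stops emulating.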
From: Mohammed G. <m.g...@gm...> - 2008-05-05 13:29:19
|
On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <an...@co...> wrote: > WinXP fails to boot with your patch applied too. FWIW, Ubuntu 8.04 has > a fixed version of gfxboot that doesn't do nasty things with SS on > privileged mode transitions. > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and FreeDOS don't boot but complain about instruction mov 0x11,sreg not being emulated. |
From: Guillaume T. <gui...@ex...> - 2008-05-06 13:38:43
|
On Mon, 5 May 2008 16:29:21 +0300 "Mohammed Gamal" <m.g...@gm...> wrote: > On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <an...@co...> wrote: > > > WinXP fails to boot with your patch applied too. FWIW, Ubuntu 8.04 has > > a fixed version of gfxboot that doesn't do nasty things with SS on > > privileged mode transitions. > > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and > FreeDOS don't boot but complain about instruction mov 0x11,sreg not > being emulated. Can you try with this one please? On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also openSUSE-10.3-GM-x86_64-mini.iso I will try FreeDOS and WinXP if I can find one ;) Regards, Guillaume --- diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 26c4f02..6e76c2e 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1272,7 +1272,9 @@ static void enter_pmode(struct kvm_vcpu *vcpu) fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs); fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs); +#if 0 vmcs_write16(GUEST_SS_SELECTOR, 0); +#endif vmcs_write32(GUEST_SS_AR_BYTES, 0x93); vmcs_write16(GUEST_CS_SELECTOR, @@ -2633,6 +2635,73 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) return 1; } +static int invalid_guest_state(struct kvm_vcpu *vcpu, + struct kvm_run *kvm_run, u32 failure_reason) +{ + u16 ss, cs; + u8 opcodes[4]; + unsigned long rip = vcpu->arch.rip; + unsigned long rip_linear; + + ss = vmcs_read16(GUEST_SS_SELECTOR); + cs = vmcs_read16(GUEST_CS_SELECTOR); + + if ((ss & 0x03) != (cs & 0x03)) { + int err; + rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS); + emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu); +#if 0 + printk(KERN_INFO "emulation at (%lx) rip %lx: %02x %02x %02x %02x\n", + rip_linear, + rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]); +#endif + err = emulate_instruction(vcpu, kvm_run, 0, 0, 0); + switch (err) { + case EMULATE_DONE: +#if 0 + printk(KERN_INFO "successfully emulated 
instruction\n"); +#endif + return 1; + case EMULATE_DO_MMIO: + printk(KERN_INFO "mmio?\n"); + return 0; + default: + kvm_report_emulation_failure(vcpu, "vmentry failure"); + break; + } + } + + kvm_run->exit_reason = KVM_EXIT_UNKNOWN; + kvm_run->hw.hardware_exit_reason = failure_reason; + return 0; +} + +static int handle_vmentry_failure(struct kvm_vcpu *vcpu, + struct kvm_run *kvm_run, + u32 failure_reason) +{ + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); +#if 0 + printk(KERN_INFO "Failed vm entry (exit reason 0x%x) ", failure_reason); +#endif + switch (failure_reason) { + case EXIT_REASON_INVALID_GUEST_STATE: +#if 0 + printk("invalid guest state \n"); +#endif + return invalid_guest_state(vcpu, kvm_run, failure_reason); + case EXIT_REASON_MSR_LOADING: + printk("caused by MSR entry %ld loading.\n", exit_qualification); + break; + case EXIT_REASON_MACHINE_CHECK: + printk("caused by machine check.\n"); + break; + default: + printk("reason not known yet!\n"); + break; + } + return 0; +} /* * The exit handlers return 1 if the exit was handled fully and guest execution * may resume. 
Otherwise they set the kvm_run parameter to indicate what needs @@ -2694,6 +2763,12 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) exit_reason != EXIT_REASON_EPT_VIOLATION)) printk(KERN_WARNING "%s: unexpected, valid vectoring info and " "exit reason is 0x%x\n", __func__, exit_reason); + + if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) { + exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY; + return handle_vmentry_failure(vcpu, kvm_run, exit_reason); + } + if (exit_reason < kvm_vmx_max_exit_handlers && kvm_vmx_exit_handlers[exit_reason]) return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run); diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h index 79d94c6..2cebf48 100644 --- a/arch/x86/kvm/vmx.h +++ b/arch/x86/kvm/vmx.h @@ -238,7 +238,10 @@ enum vmcs_field { #define EXIT_REASON_IO_INSTRUCTION 30 #define EXIT_REASON_MSR_READ 31 #define EXIT_REASON_MSR_WRITE 32 +#define EXIT_REASON_INVALID_GUEST_STATE 33 +#define EXIT_REASON_MSR_LOADING 34 #define EXIT_REASON_MWAIT_INSTRUCTION 36 +#define EXIT_REASON_MACHINE_CHECK 41 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43 #define EXIT_REASON_APIC_ACCESS 44 #define EXIT_REASON_EPT_VIOLATION 48 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 979f983..c84c5ec 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3044,8 +3044,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) return 0; } -static void get_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void get_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->get_segment(vcpu, var, seg); } @@ -3128,8 +3128,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, return 0; } -static void set_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void set_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->set_segment(vcpu, var, seg); } @@ -3287,8 +3287,8 @@ static int 
load_segment_descriptor_to_kvm_desct(struct kvm_vcpu *vcpu, return 0; } -static int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, - int type_bits, int seg) +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg) { struct kvm_segment kvm_seg; diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 8a96320..581d18e 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -69,6 +69,7 @@ #define GroupDual (1<<15) /* Alternate decoding of mod == 3 */ #define GroupMask 0xff /* Group number stored in bits 0:7 */ +int switch_perso = 0; enum { Group1_80, Group1_81, Group1_82, Group1_83, Group1A, Group3_Byte, Group3, Group4, Group5, Group7, @@ -138,7 +139,8 @@ static u16 opcode_table[256] = { /* 0x88 - 0x8F */ ByteOp | DstMem | SrcReg | ModRM | Mov, DstMem | SrcReg | ModRM | Mov, ByteOp | DstReg | SrcMem | ModRM | Mov, DstReg | SrcMem | ModRM | Mov, - 0, ModRM | DstReg, 0, Group | Group1A, + DstMem | SrcReg | ModRM | Mov, ModRM | DstReg, + DstReg | SrcMem | ModRM | Mov, Group | Group1A, /* 0x90 - 0x9F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, @@ -152,7 +154,8 @@ static u16 opcode_table[256] = { ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String, ByteOp | ImplicitOps | String, ImplicitOps | String, /* 0xB0 - 0xBF */ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + DstReg | SrcImm | Mov, 0, 0, 0, 0, 0, 0, 0, /* 0xC0 - 0xC7 */ ByteOp | DstMem | SrcImm | ModRM, DstMem | SrcImmByte | ModRM, 0, ImplicitOps | Stack, 0, 0, @@ -168,7 +171,7 @@ static u16 opcode_table[256] = { /* 0xE0 - 0xE7 */ 0, 0, 0, 0, 0, 0, 0, 0, /* 0xE8 - 0xEF */ - ImplicitOps | Stack, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, + ImplicitOps | Stack, SrcImm | ImplicitOps, ImplicitOps, SrcImmByte | ImplicitOps, 0, 0, 0, 0, /* 0xF0 - 0xF7 */ 0, 0, 0, 0, @@ -1246,6 +1249,19 @@ static inline int writeback(struct x86_emulate_ctxt *ctxt, 
default: break; } +#if 0 + if (switch_perso) { + printk(KERN_INFO " writeback: dst.byte %d\n" , c->dst.bytes); + printk(KERN_INFO " writeback: dst.ptr 0x%p\n" , c->dst.ptr); + printk(KERN_INFO " writeback: dst.val 0x%lx\n", c->dst.val); + printk(KERN_INFO " writeback: src.ptr 0x%p\n", c->src.ptr); + printk(KERN_INFO " writeback: src.val 0x%lx\n", c->src.val); + printk(KERN_INFO " writeback: RAX 0x%lx\n", c->regs[VCPU_REGS_RAX]); + printk(KERN_INFO " writeback: RSP 0x%lx\n", c->regs[VCPU_REGS_RSP]); + printk(KERN_INFO " writeback: CS 0x%lx\n", c->regs[VCPU_SREG_CS]); + printk(KERN_INFO " writeback: SS 0x%lx\n", c->regs[VCPU_SREG_SS]); + } +#endif return 0; } @@ -1342,6 +1358,10 @@ special_insn: switch (c->b) { case 0x00 ... 0x05: add: /* add */ + if ((c->d & ModRM) && c->modrm_mod == 3) { + c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp); + } emulate_2op_SrcV("add", c->src, c->dst, ctxt->eflags); break; case 0x08 ... 0x0d: @@ -1514,14 +1534,90 @@ special_insn: break; case 0x88 ... 
0x8b: /* mov */ goto mov; + case 0x8c: { /* mov r/m, sreg */ + struct kvm_segment segreg; + + if (c->modrm_mod == 0x3) + c->src.val = c->modrm_val; + + switch ( c->modrm_reg ) { + case 0: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_ES); + break; + case 1: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_CS); + break; + case 2: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_SS); + break; + case 3: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_DS); + break; + case 4: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_FS); + break; + case 5: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_GS); + break; + default: + printk(KERN_INFO "0x8c: Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + c->dst.val = segreg.selector; + c->dst.bytes = 2; + c->dst.ptr = (unsigned long *)decode_register(c->modrm_rm, c->regs, + c->d & ByteOp); + break; + } case 0x8d: /* lea r16/r32, m */ c->dst.val = c->modrm_ea; break; + case 0x8e: { /* mov seg, r/m16 */ + uint16_t sel; + + sel = c->src.val; + switch ( c->modrm_reg ) { + case 0: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_ES) < 0) + goto cannot_emulate; + break; + case 1: + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) + goto cannot_emulate; + break; + case 2: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_SS) < 0) + goto cannot_emulate; + break; + case 3: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_DS) < 0) + goto cannot_emulate; + break; + case 4: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_FS) < 0) + goto cannot_emulate; + break; + case 5: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_GS) < 0) + goto cannot_emulate; + break; + default: + printk(KERN_INFO "Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + + c->dst.type = OP_NONE; /* Disable writeback. 
*/ + break; + } case 0x8f: /* pop (sole member of Grp1a) */ rc = emulate_grp1a(ctxt, ops); if (rc != 0) goto done; break; + case 0xb8: /* mov r, imm */ + goto mov; case 0x9c: /* pushf */ c->src.val = (unsigned long) ctxt->eflags; emulate_push(ctxt); @@ -1623,6 +1719,10 @@ special_insn: DPRINTF("Urk! I don't handle SCAS.\n"); goto cannot_emulate; case 0xc0 ... 0xc1: + if ((c->d & ModRM) && c->modrm_mod == 3) { + c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp); + } emulate_grp2(ctxt); break; case 0xc3: /* ret */ @@ -1660,6 +1760,39 @@ special_insn: break; } case 0xe9: /* jmp rel */ + jmp_rel(c, c->src.val); + c->dst.type = OP_NONE; /* Disable writeback. */ + break; + case 0xea: /* jmp far */ { + uint32_t eip; + uint16_t sel; + + /* enable switch_perso */ + switch_perso = 1; + + switch (c->op_bytes) { + case 2: + eip = insn_fetch(u16, 2, c->eip); + eip = eip & 0x0000FFFF; /* clear upper 16 bits */ + break; + case 4: + eip = insn_fetch(u32, 4, c->eip); + break; + default: + DPRINTF("jmp far: Invalid op_bytes\n"); + goto cannot_emulate; + } + sel = insn_fetch(u16, 2, c->eip); + if (ctxt->mode == X86EMUL_MODE_REAL) + eip |= (sel << 4); + else if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { + DPRINTF("jmp far: Failed to load CS descriptor\n"); + goto cannot_emulate; + } + + c->eip = eip; + break; + } case 0xeb: /* jmp rel short */ jmp_rel(c, c->src.val); c->dst.type = OP_NONE; /* Disable writeback. 
*/ diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 1d8cd01..29254b4 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -495,6 +495,10 @@ int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value); +void set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +void get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg); int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason); void kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); |
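Compared with the first version, this patch also fills in `c->dst.bytes` and `c->dst.ptr` for register operands (`modrm_mod == 3`) in the `add` and `0xc0 ... 0xc1` cases. Why that matters can be shown with a simplified model of a size-dispatched register writeback. The struct and function below are hypothetical stand-ins, not the kernel's `writeback()`, and they assume that an operand size of 0 falls through without storing anything.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the emulator's destination operand. */
struct reg_operand {
	unsigned int bytes;	/* operand size; 0 means "never set up" */
	uint64_t *ptr;		/* register slot to update */
	uint64_t val;		/* result computed by the instruction */
};

/* Store the result into the register slot according to operand size. */
static void writeback_reg(const struct reg_operand *dst)
{
	switch (dst->bytes) {
	case 1:
		*dst->ptr = (*dst->ptr & ~0xffULL) | (dst->val & 0xff);
		break;
	case 2:
		*dst->ptr = (*dst->ptr & ~0xffffULL) | (dst->val & 0xffff);
		break;
	case 4:
		*dst->ptr = (uint32_t)dst->val;	/* 32-bit writes zero-extend */
		break;
	case 8:
		*dst->ptr = dst->val;
		break;
	default:
		break;	/* bytes == 0: the result is dropped */
	}
}
```

In this model a `shl $0x4,%eax` result of 0x53e10 with `bytes == 0` leaves the register at 0x53e1, which would be consistent with the "dst.byte 0" lines in the earlier trace; with `bytes == 4` the register becomes 0x53e10.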
From: Guillaume T. <gui...@ex...> - 2008-05-05 06:30:56
|
On Thu, 1 May 2008 16:13:31 -0300 Marcelo Tosatti <mto...@re...> wrote: > The code sequence is: > > 8235: 66 data16 > 8236: 0f 22 c0 mov %eax,%cr0 > 8239: ea 3e 02 00 08 b8 00 ljmp $0xb8,$0x800023e > > So it switches to realmode and then does a ljmp. Problem is that you're > using the segment selector as a GDT index, but in realmode it should be > shifted left by 4 to determine the segment base address. Following patch > makes Plan9 happy. > > Other than that, load_segment_descriptor() can return a positive error > on failure, should do a proper check. > > Index: kvm/arch/x86/kvm/x86_emulate.c > =================================================================== > --- kvm.orig/arch/x86/kvm/x86_emulate.c > +++ kvm/arch/x86/kvm/x86_emulate.c > @@ -1755,7 +1755,10 @@ special_insn: > goto cannot_emulate; > } > sel = insn_fetch(u16, 2, c->eip); > - if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { > + if (ctxt->mode == X86EMUL_MODE_REAL) > + eip |= (sel << 4); > + else if (load_segment_descriptor(ctxt->vcpu, sel, 9, > + VCPU_SREG_CS) < 0) { > DPRINTF("jmp far: Failed to load CS descriptor\n"); > goto cannot_emulate; > } > Thank you Marcelo for the report. Unfortunately it is not the same problem I'm seeing. 
The problem I have now is that I can boot up to the gfxboot screen, but when I choose to install openSUSE the guest kernel panics like this:

[guill@enterprise][~/local/kvm-userspace.git/bin]$ ./qemu-system-x86_64 -hda ~/disk_images/hd_50G.qcow2 -cdrom /images_iso/openSUSE-10.3-GM-x86_64-mini.iso -boot d -s -m 1024 -serial stdio

Linux version 2.6.22.5-31-default (geeko@buildhost) (gcc version 4.2.1 (SUSE Linux)) #1 SMP 2007/09/21 22:29:00 UTC
Command line: BOOT_IMAGE=linux initrd=initrd,08000600.spl splash=silent vga=0x314 install=slp:/ console=ttyS0
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
 BIOS-e820: 000000003fff0000 - 0000000040000000 (ACPI data)
 BIOS-e820: 00000000fffbd000 - 0000000100000000 (reserved)
end_pfn_map = 1048576
DMI 2.4 present.
ACPI: RSDP 000FB450, 0014 (r0 QEMU )
ACPI: RSDT 3FFF0000, 002C (r1 QEMU QEMURSDT 1 QEMU 1)
ACPI: FACP 3FFF002C, 0074 (r1 QEMU QEMUFACP 1 QEMU 1)
ACPI: DSDT 3FFF0100, 2464 (r1 BXPC BXDSDT 1 INTL 20061109)
ACPI: FACS 3FFF00C0, 0040
ACPI: APIC 3FFF2568, 00E0 (r1 QEMU QEMUAPIC 1 QEMU 1)
No NUMA configuration found
Faking a node at 0000000000000000-000000003fff0000
Bootmem setup node 0 0000000000000000-000000003fff0000
No mptable found.
Zone PFN ranges:
 DMA 0 -> 4096
 DMA32 4096 -> 1048576
 Normal 1048576 -> 1048576
early_node_map[2] active PFN ranges
 0: 0 -> 159
 0: 256 -> 262128
ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] disabled)
ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] disabled)
ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x0a] disabled)
ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x0b] disabled)
ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0c] disabled)
ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x0d] disabled)
ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x0e] disabled)
ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x0f] disabled)
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000e8000
swsusp: Registered nosave memory region: 00000000000e8000 - 0000000000100000
Allocating PCI resources starting at 50000000 (gap: 40000000:bffbd000)
SMP: Allowing 16 CPUs, 15 hotplug CPUs
PERCPU: Allocating 50296 bytes of per cpu data
Built 1 zonelists.
Total pages: 257180
Kernel command line: BOOT_IMAGE=linux initrd=initrd,08000600.spl splash=silent vga=0x314 install=slp:/ console=ttyS0
bootsplash: silent mode.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Detected 3002.939 MHz processor.
Console: colour dummy device 80x25
Checking aperture...
Memory: 1012688k/1048512k available (2050k kernel code, 35436k reserved, 1017k data, 316k init)
Calibrating delay using timer specific routine.. 6034.80 BogoMIPS (lpj=12069613)
Security Framework v1.0.0 initialized
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU 0/0 -> Node 0
invalid opcode: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.22.5-31-default #1
RIP: 0010:[<ffffffff80283be4>] [<ffffffff80283be4>] kmem_cache_zalloc+0x8d/0xad
RSP: 0018:ffffffff805c7f18 EFLAGS: 00010046
RAX: 000000000000000a RBX: 0000000000000046 RCX: 0000000000000000
RDX: ffff8100015dfa40 RSI: 0000000000000001 RDI: ffff81003ffd33d8
RBP: 00000000000000d0 R08: 0000000000000000 R09: ffffffff804b6870
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100015d2080
R13: ffffffff805cf298 R14: ffffffff805c9000 R15: ffffffff804673bd
FS: 0000000000000000(0000) GS:ffffffff804ff000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff805c6000, task ffffffff804b6870)
Stack: 0000000000000282 ffffffff804009a5 ffffffff80200000 ffffffff80210e10
 0000000000000000 ffffffff802f3841 0000000000000000 0000000000000282
 0000000000000000 0000000000000000 ffffffffffffffff ffffffff805f2700
Call Trace:
 [<ffffffff804009a5>] _etext+0x0/0x1cf65b
 [<ffffffff80210e10>] alternatives_smp_module_add+0x77/0x149
 [<ffffffff802f3841>] __bitmap_weight+0x39/0x80
 [<ffffffff805d607e>] alternative_instructions+0xdf/0xea
 [<ffffffff805d076c>] start_kernel+0x2c0/0x2db
 [<ffffffff805d0148>] _sinittext+0x148/0x14c

Code: 0f 0d 0a 48 85 d2 74 10 41 8b 8c 24 0c 04 00 00 31 c0 48 89
RIP [<ffffffff80283be4>] kmem_cache_zalloc+0x8d/0xad
 RSP <ffffffff805c7f18>
Kernel panic - not syncing: Attempted to kill the idle task!
....................

Anyway, your remark about how the segment selector must be used in real mode is correct, and I added your patch to my series of patches. I will also properly check the return value of load_segment_descriptor().

Best regards,
Guillaume
|
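A side note on the real-mode point from the quoted report: in real mode the selector is not a GDT index, and the segment base is simply the selector shifted left by 4. The following is only a minimal sketch of that arithmetic in plain C. The helper names are hypothetical and nothing here is KVM code; it also ignores A20 wrap-around.

```c
#include <stdint.h>

/*
 * Real-mode segment base: selector * 16.
 * Hypothetical helper for illustration, not a KVM function.
 */
static inline uint32_t rmode_segment_base(uint16_t sel)
{
	return (uint32_t)sel << 4;
}

/*
 * Linear address of sel:off in real mode.
 * Assumption: no A20 masking, so the result may exceed 1 MiB - 1.
 */
static inline uint32_t rmode_linear(uint16_t sel, uint16_t off)
{
	return rmode_segment_base(sel) + off;
}
```

For example, a selector of 0x2000 gives a base of 0x20000, so 0x2000:0x0269 is linear address 0x20269.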
From: Anthony L. <an...@co...> - 2008-05-06 14:31:12
|
Guillaume Thouvenin wrote:
> On Mon, 5 May 2008 16:29:21 +0300
> "Mohammed Gamal" <m.g...@gm...> wrote:
>
>> On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <an...@co...> wrote:
>>
>>> WinXP fails to boot with your patch applied too. FWIW, Ubuntu 8.04 has
>>> a fixed version of gfxboot that doesn't do nasty things with SS on
>>> privileged mode transitions.
>>>
>> WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
>> FreeDOS don't boot but complain about instruction mov 0x11,sreg not
>> being emulated.
>>
>
> Can you try with this one please?
> On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also
> openSUSE-10.3-GM-x86_64-mini.iso
>

8.04 is not a good test-case. 7.10 is what you want to try.

The good news is, 7.10 appears to work! The bad news is that about 20% of the time, it crashes and displays the following:

kvm_run: failed entry, reason 5
kvm_run returned -8

So something appears to be a bit buggy. Still, very good work!

Regards,

Anthony Liguori
|
From: Guillaume T. <gui...@ex...> - 2008-05-07 05:57:13
|
On Tue, 06 May 2008 09:30:44 -0500
Anthony Liguori <an...@co...> wrote:

> 8.04 is not a good test-case. 7.10 is what you want to try.

Oh yes, you're right. I tried 8.04 because Balaji had problems booting it with the patch.

> The good news is, 7.10 appears to work! The bad news is that about 20%
> of the time, it crashes and displays the following:
>
> kvm_run: failed entry, reason 5
> kvm_run returned -8
>
> So something appears to be a bit buggy. Still, very good work!

I can see the problem with openSUSE 10.3 too, but not so often. I'm looking into this issue.

Thank you for the help,
Regards,
Guillaume
|
From: Mohammed G. <m.g...@gm...> - 2008-05-06 17:05:41
|
On Tue, May 6, 2008 at 5:30 PM, Anthony Liguori <an...@co...> wrote:
> Guillaume Thouvenin wrote:
>
> > On Mon, 5 May 2008 16:29:21 +0300
> > "Mohammed Gamal" <m.g...@gm...> wrote:
> >
> > > On Mon, May 5, 2008 at 3:57 PM, Anthony Liguori <an...@co...> wrote:
> > >
> > > > WinXP fails to boot with your patch applied too. FWIW, Ubuntu 8.04 has
> > > > a fixed version of gfxboot that doesn't do nasty things with SS on
> > > > privileged mode transitions.
> > >
> > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
> > > FreeDOS don't boot but complain about instruction mov 0x11,sreg not
> > > being emulated.
> >
> > Can you try with this one please?
> > On my computer it boots ubuntu-8.04-desktop-i386.iso liveCD and also
> > openSUSE-10.3-GM-x86_64-mini.iso
>
> 8.04 is not a good test-case. 7.10 is what you want to try.
>
> The good news is, 7.10 appears to work! The bad news is that about 20% of
> the time, it crashes and displays the following:
>
> kvm_run: failed entry, reason 5
> kvm_run returned -8
>
> So something appears to be a bit buggy. Still, very good work!
>
> Regards,
>
> Anthony Liguori

The 7.10 live CD doesn't work for me at all. It only works with -no-kvm.
|
From: Guillaume T. <gui...@ex...> - 2008-05-14 07:29:18
|
On Tue, 6 May 2008 20:05:39 +0300
"Mohammed Gamal" <m.g...@gm...> wrote:

> > > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
> > > > FreeDOS don't boot but complain about instruction mov 0x11,sreg not
> > > > being emulated.

Mohammed, can you try the patch at the end of this mail? Here it now works with FreeDOS (I added emulation of opcode 0x90, which is an xchg instruction). I can also boot WinXP Professional X64 edition. I still have a weird issue with Ubuntu 7.10, which sometimes crashes with the error:

kvm_run: failed entry, reason 5
kvm_run returned -8

It's a little bit strange because this error appears very often with the wmii window manager but never with XFCE. And with wmii, it only occurs when I move the mouse over the Qemu/KVM window. If I wait 30s for the automatic boot, it works.

So to give a summary, on my box:

openSUSE 10.3 -> OK
WinXP Pro X64 -> OK
FreeDOS -> OK
Ubuntu 7.10 -> NOK

Regards,
Guillaume

---
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e94a8c3..efde223 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1287,7 +1287,9 @@ static void enter_pmode(struct kvm_vcpu *vcpu) fix_pmode_dataseg(VCPU_SREG_GS, &vcpu->arch.rmode.gs); fix_pmode_dataseg(VCPU_SREG_FS, &vcpu->arch.rmode.fs); +#if 0 vmcs_write16(GUEST_SS_SELECTOR, 0); +#endif vmcs_write32(GUEST_SS_AR_BYTES, 0x93); vmcs_write16(GUEST_CS_SELECTOR, @@ -2648,6 +2650,73 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) return 1; } +static int invalid_guest_state(struct kvm_vcpu *vcpu, + struct kvm_run *kvm_run, u32 failure_reason) +{ + u16 ss, cs; + u8 opcodes[4]; + unsigned long rip = vcpu->arch.rip; + unsigned long rip_linear; + + ss = vmcs_read16(GUEST_SS_SELECTOR); + cs = vmcs_read16(GUEST_CS_SELECTOR); + + if ((ss & 0x03) != (cs & 0x03)) { + int err; + rip_linear = rip + vmx_get_segment_base(vcpu, VCPU_SREG_CS); + emulator_read_std(rip_linear, (void *)opcodes, 4, vcpu); +#if 0 + printk(KERN_INFO "emulation at
(%lx) rip %lx: %02x %02x %02x %02x\n", + rip_linear, + rip, opcodes[0], opcodes[1], opcodes[2], opcodes[3]); +#endif + err = emulate_instruction(vcpu, kvm_run, 0, 0, 0); + switch (err) { + case EMULATE_DONE: +#if 0 + printk(KERN_INFO "successfully emulated instruction\n"); +#endif + return 1; + case EMULATE_DO_MMIO: + printk(KERN_INFO "mmio?\n"); + return 0; + default: + kvm_report_emulation_failure(vcpu, "vmentry failure"); + break; + } + } + + kvm_run->exit_reason = KVM_EXIT_UNKNOWN; + kvm_run->hw.hardware_exit_reason = failure_reason; + return 0; +} + +static int handle_vmentry_failure(struct kvm_vcpu *vcpu, + struct kvm_run *kvm_run, + u32 failure_reason) +{ + unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION); +#if 0 + printk(KERN_INFO "Failed vm entry (exit reason 0x%x) ", failure_reason); +#endif + switch (failure_reason) { + case EXIT_REASON_INVALID_GUEST_STATE: +#if 0 + printk("invalid guest state \n"); +#endif + return invalid_guest_state(vcpu, kvm_run, failure_reason); + case EXIT_REASON_MSR_LOADING: + printk("caused by MSR entry %ld loading.\n", exit_qualification); + break; + case EXIT_REASON_MACHINE_CHECK: + printk("caused by machine check.\n"); + break; + default: + printk("reason not known yet!\n"); + break; + } + return 0; +} /* * The exit handlers return 1 if the exit was handled fully and guest execution * may resume. 
Otherwise they set the kvm_run parameter to indicate what needs @@ -2709,6 +2778,12 @@ static int kvm_handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) exit_reason != EXIT_REASON_EPT_VIOLATION)) printk(KERN_WARNING "%s: unexpected, valid vectoring info and " "exit reason is 0x%x\n", __func__, exit_reason); + + if ((exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) { + exit_reason &= ~VMX_EXIT_REASONS_FAILED_VMENTRY; + return handle_vmentry_failure(vcpu, kvm_run, exit_reason); + } + if (exit_reason < kvm_vmx_max_exit_handlers && kvm_vmx_exit_handlers[exit_reason]) return kvm_vmx_exit_handlers[exit_reason](vcpu, kvm_run); diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h index 79d94c6..2cebf48 100644 --- a/arch/x86/kvm/vmx.h +++ b/arch/x86/kvm/vmx.h @@ -238,7 +238,10 @@ enum vmcs_field { #define EXIT_REASON_IO_INSTRUCTION 30 #define EXIT_REASON_MSR_READ 31 #define EXIT_REASON_MSR_WRITE 32 +#define EXIT_REASON_INVALID_GUEST_STATE 33 +#define EXIT_REASON_MSR_LOADING 34 #define EXIT_REASON_MWAIT_INSTRUCTION 36 +#define EXIT_REASON_MACHINE_CHECK 41 #define EXIT_REASON_TPR_BELOW_THRESHOLD 43 #define EXIT_REASON_APIC_ACCESS 44 #define EXIT_REASON_EPT_VIOLATION 48 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index dab3d4f..eb7db67 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3009,8 +3009,8 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs) return 0; } -static void get_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void get_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->get_segment(vcpu, var, seg); } @@ -3093,8 +3093,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, return 0; } -static void set_segment(struct kvm_vcpu *vcpu, - struct kvm_segment *var, int seg) +void set_segment(struct kvm_vcpu *vcpu, + struct kvm_segment *var, int seg) { kvm_x86_ops->set_segment(vcpu, var, seg); } @@ -3252,8 +3252,8 @@ static int 
load_segment_descriptor_to_kvm_desct(struct kvm_vcpu *vcpu, return 0; } -static int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, - int type_bits, int seg) +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg) { struct kvm_segment kvm_seg; diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 8a96320..40ebb46 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -69,6 +69,7 @@ #define GroupDual (1<<15) /* Alternate decoding of mod == 3 */ #define GroupMask 0xff /* Group number stored in bits 0:7 */ +int switch_perso = 0; enum { Group1_80, Group1_81, Group1_82, Group1_83, Group1A, Group3_Byte, Group3, Group4, Group5, Group7, @@ -138,9 +139,10 @@ static u16 opcode_table[256] = { /* 0x88 - 0x8F */ ByteOp | DstMem | SrcReg | ModRM | Mov, DstMem | SrcReg | ModRM | Mov, ByteOp | DstReg | SrcMem | ModRM | Mov, DstReg | SrcMem | ModRM | Mov, - 0, ModRM | DstReg, 0, Group | Group1A, + DstMem | SrcReg | ModRM | Mov, ModRM | DstReg, + DstReg | SrcMem | ModRM | Mov, Group | Group1A, /* 0x90 - 0x9F */ - 0, 0, 0, 0, 0, 0, 0, 0, + ImplicitOps, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, /* 0xA0 - 0xA7 */ ByteOp | DstReg | SrcMem | Mov | MemAbs, DstReg | SrcMem | Mov | MemAbs, @@ -152,7 +154,8 @@ static u16 opcode_table[256] = { ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String, ByteOp | ImplicitOps | String, ImplicitOps | String, /* 0xB0 - 0xBF */ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, + DstReg | SrcImm | Mov, 0, 0, 0, 0, 0, 0, 0, /* 0xC0 - 0xC7 */ ByteOp | DstMem | SrcImm | ModRM, DstMem | SrcImmByte | ModRM, 0, ImplicitOps | Stack, 0, 0, @@ -168,7 +171,7 @@ static u16 opcode_table[256] = { /* 0xE0 - 0xE7 */ 0, 0, 0, 0, 0, 0, 0, 0, /* 0xE8 - 0xEF */ - ImplicitOps | Stack, SrcImm|ImplicitOps, 0, SrcImmByte|ImplicitOps, + ImplicitOps | Stack, SrcImm | ImplicitOps, ImplicitOps, SrcImmByte | 
ImplicitOps, 0, 0, 0, 0, /* 0xF0 - 0xF7 */ 0, 0, 0, 0, @@ -1246,6 +1249,19 @@ static inline int writeback(struct x86_emulate_ctxt *ctxt, default: break; } +#if 0 + if (switch_perso) { + printk(KERN_INFO " writeback: dst.byte %d\n" , c->dst.bytes); + printk(KERN_INFO " writeback: dst.ptr 0x%p\n" , c->dst.ptr); + printk(KERN_INFO " writeback: dst.val 0x%lx\n", c->dst.val); + printk(KERN_INFO " writeback: src.ptr 0x%p\n", c->src.ptr); + printk(KERN_INFO " writeback: src.val 0x%lx\n", c->src.val); + printk(KERN_INFO " writeback: RAX 0x%lx\n", c->regs[VCPU_REGS_RAX]); + printk(KERN_INFO " writeback: RSP 0x%lx\n", c->regs[VCPU_REGS_RSP]); + printk(KERN_INFO " writeback: CS 0x%lx\n", c->regs[VCPU_SREG_CS]); + printk(KERN_INFO " writeback: SS 0x%lx\n", c->regs[VCPU_SREG_SS]); + } +#endif return 0; } @@ -1342,6 +1358,10 @@ special_insn: switch (c->b) { case 0x00 ... 0x05: add: /* add */ + if ((c->d & ModRM) && c->modrm_mod == 3) { + c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp); + } emulate_2op_SrcV("add", c->src, c->dst, ctxt->eflags); break; case 0x08 ... 0x0d: @@ -1514,14 +1534,119 @@ special_insn: break; case 0x88 ... 
0x8b: /* mov */ goto mov; + case 0x8c: { /* mov r/m, sreg */ + struct kvm_segment segreg; + + if (c->modrm_mod == 0x3) + c->src.val = c->modrm_val; + + switch ( c->modrm_reg ) { + case 0: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_ES); + break; + case 1: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_CS); + break; + case 2: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_SS); + break; + case 3: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_DS); + break; + case 4: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_FS); + break; + case 5: + get_segment(ctxt->vcpu, &segreg, VCPU_SREG_GS); + break; + default: + printk(KERN_INFO "0x8c: Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + c->dst.val = segreg.selector; + c->dst.bytes = 2; + c->dst.ptr = (unsigned long *)decode_register(c->modrm_rm, c->regs, + c->d & ByteOp); + break; + } case 0x8d: /* lea r16/r32, m */ c->dst.val = c->modrm_ea; break; + case 0x8e: { /* mov seg, r/m16 */ + uint16_t sel; + + sel = c->src.val; + switch ( c->modrm_reg ) { + case 0: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_ES) < 0) + goto cannot_emulate; + break; + case 1: + if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) + goto cannot_emulate; + break; + case 2: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_SS) < 0) + goto cannot_emulate; + break; + case 3: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_DS) < 0) + goto cannot_emulate; + break; + case 4: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_FS) < 0) + goto cannot_emulate; + break; + case 5: + if (load_segment_descriptor(ctxt->vcpu, sel, 1, VCPU_SREG_GS) < 0) + goto cannot_emulate; + break; + default: + printk(KERN_INFO "Invalid segreg in modrm byte 0x%02x\n", + c->modrm); + goto cannot_emulate; + } + + c->dst.type = OP_NONE; /* Disable writeback. 
*/ + break; + } case 0x8f: /* pop (sole member of Grp1a) */ rc = emulate_grp1a(ctxt, ops); if (rc != 0) goto done; break; + case 0x90: /* xhcg r8, rAx */ + c->src.ptr = & c->regs[c->b & 0x7]; + c->dst.ptr = & c->regs[VCPU_REGS_RAX]; + + switch (c->op_bytes) { + case 2: + c->dst.val = *(u16*) c->dst.ptr; + c->src.val = *(u16*) c->src.ptr; + *(u16 *) c->dst.ptr = (u16) c->src.val; + *(u16 *) c->src.ptr = (u16) c->dst.val; + break; + case 4: + c->dst.val = *(u32*) c->dst.ptr; + c->src.val = *(u32*) c->src.ptr; + *(u32 *) c->dst.ptr = (u32) c->src.val; + *(u32 *) c->src.ptr = (u32) c->dst.val; + break; + case 8: + c->dst.val = *(u64*) c->dst.ptr; + c->src.val = *(u64*) c->src.ptr; + *(u64 *) c->dst.ptr = (u64) c->src.val; + *(u64 *) c->src.ptr = (u64) c->dst.val; + break; + default: + printk("xchg: op_bytes=%d is not supported.\n", c->op_bytes); + goto cannot_emulate; + } + c->dst.type = OP_NONE; /* Disable writeback. */ + break; + case 0xb8: /* mov r, imm */ + goto mov; case 0x9c: /* pushf */ c->src.val = (unsigned long) ctxt->eflags; emulate_push(ctxt); @@ -1623,6 +1748,10 @@ special_insn: DPRINTF("Urk! I don't handle SCAS.\n"); goto cannot_emulate; case 0xc0 ... 0xc1: + if ((c->d & ModRM) && c->modrm_mod == 3) { + c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes; + c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp); + } emulate_grp2(ctxt); break; case 0xc3: /* ret */ @@ -1660,6 +1789,39 @@ special_insn: break; } case 0xe9: /* jmp rel */ + jmp_rel(c, c->src.val); + c->dst.type = OP_NONE; /* Disable writeback. 
*/ + break; + case 0xea: /* jmp far */ { + uint32_t eip; + uint16_t sel; + + /* enable switch_perso */ + switch_perso = 1; + + switch (c->op_bytes) { + case 2: + eip = insn_fetch(u16, 2, c->eip); + eip = eip & 0x0000FFFF; /* clear upper 16 bits */ + break; + case 4: + eip = insn_fetch(u32, 4, c->eip); + break; + default: + DPRINTF("jmp far: Invalid op_bytes\n"); + goto cannot_emulate; + } + sel = insn_fetch(u16, 2, c->eip); + if (ctxt->mode == X86EMUL_MODE_REAL) + eip |= (sel << 4); + else if (load_segment_descriptor(ctxt->vcpu, sel, 9, VCPU_SREG_CS) < 0) { + DPRINTF("jmp far: Failed to load CS descriptor\n"); + goto cannot_emulate; + } + + c->eip = eip; + break; + } case 0xeb: /* jmp rel short */ jmp_rel(c, c->src.val); c->dst.type = OP_NONE; /* Disable writeback. */ diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 1466c3f..99e343e 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -494,6 +494,10 @@ int emulator_get_dr(struct x86_emulate_ctxt *ctxt, int dr, int emulator_set_dr(struct x86_emulate_ctxt *ctxt, int dr, unsigned long value); +void set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +void get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +int load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, + int type_bits, int seg); int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason); void kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); |
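For readers following the 0x8c/0x8e cases in the patch above: they switch on the reg field of the ModRM byte to pick the segment register. Below is a minimal standalone sketch of how a ModRM byte splits into mod/reg/rm. The names decode_modrm and sreg_name are hypothetical and not part of the emulator; only the segment order (ES, CS, SS, DS, FS, GS for reg 0 to 5) is taken from the switch statements in the patch.

```c
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct modrm {
	uint8_t mod;	/* bits 7:6, 3 means register-direct operand */
	uint8_t reg;	/* bits 5:3, register or opcode extension */
	uint8_t rm;	/* bits 2:0, register or memory operand */
};

/* Hypothetical helper: split a raw ModRM byte into its fields. */
static inline struct modrm decode_modrm(uint8_t byte)
{
	struct modrm m = {
		.mod = (uint8_t)((byte >> 6) & 0x3),
		.reg = (uint8_t)((byte >> 3) & 0x7),
		.rm  = (uint8_t)(byte & 0x7),
	};
	return m;
}

/*
 * Segment register selected by the reg field for mov to/from sreg,
 * in the same order as the switch in the patch. Values 6 and 7 are
 * rejected there, so they map to "invalid" here.
 */
static inline const char *sreg_name(uint8_t reg)
{
	static const char *const names[] = {
		"es", "cs", "ss", "ds", "fs", "gs"
	};
	return reg < 6 ? names[reg] : "invalid";
}
```

With the bytes from the trace earlier in the thread, 0xd8 (from `8e d8`) decodes to mod=3, reg=3 (ds), rm=0, and 0xd0 (from `8c d0`) decodes to mod=3, reg=2 (ss), rm=0.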
From: Mohammed G. <m.g...@gm...> - 2008-05-15 18:14:05
Attachments:
real_mode_support_20080605.patch
|
On Wed, May 14, 2008 at 10:29 AM, Guillaume Thouvenin <gui...@ex...> wrote:
> On Tue, 6 May 2008 20:05:39 +0300
> "Mohammed Gamal" <m.g...@gm...> wrote:
>
>> > > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
>> > > > FreeDOS don't boot but complain about instruction mov 0x11,sreg not
>> > > > being emulated.
>
> Mohammed, can you try the patch at the end of this mail? Here it's
> working with FreeDOS now (I added the emulation of 0x90 that is an xchg
> instruction). I can also boot winXP Professional X64 edition. I still
> have a weird issue with Ubuntu 7.10 that crashes sometimes with the
> error:
>
> kvm_run: failed entry, reason 5
> kvm_run returned -8
>
> It's a little bit strange because this error appears very often with
> the wmii window manager but never with XFCE. And with wmii, it only
> occurs when I move the mouse above the Qemu/KVM window. If I wait 30s
> until the automatic boot it works...
>
> So to give a summary, on my box:
>
> openSUSE 10.3 -> OK
> WinXP Pro X64 -> OK
> FreeDOS -> OK
> Ubuntu 7.10 -> NOK
>
> Regards,
> Guillaume
>

Hi Guillaume,

I haven't applied the patch you just sent yet. I'm still using the previous patch you sent me (it's attached in case anyone wants to have a look). I'm having the same problem with the Ubuntu 7.10 Live CD under GNOME.

Regarding WinXP, I'm using 32-bit WinXP Pro and it crashes with this error:

unhandled vm exit: 0x21 vcpu_id 0
rax 0000000000000011 rbx 00000000000014fc rcx 000000000000ffff rdx 00000000534d0000
rsi 00000000ffff1d68 rdi 000000000008164f rsp 00000000000014fa rbp 0000000000001522
r8 0000000000000000 r9 0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 0000000000000269 rflags 00010006
cs 2000 (00020000/0000ffff p 1 dpl 0 db 0 s 1 type b l 0 g 0 avl 0)
ds 22f3 (00022f30/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 22f3 (00022f30/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 0030 (00000300/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
gs 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 17000/3ff
idt 17400/7ff
cr0 11 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
Aborted

and dmesg outputs this:

emulation failed (vmentry failure) rip 269 68 6d 02 cb

The output is the same on every run. I'll give this patch (and Marcelo's) a try and report on what happens.
|
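The 0x21 in the error above is exit reason 33, which the patch under discussion names EXIT_REASON_INVALID_GUEST_STATE. The following is a rough sketch of how a raw VMX exit reason splits into a failed-entry flag and a basic reason, mirroring the masking done in kvm_handle_exit() in the patch. The 0x80000000 flag value and all helper names are assumptions made for illustration; the reason numbers are the ones from the vmx.h hunk of the patch.

```c
#include <stdint.h>

/* Assumed value: bit 31 of the exit reason marks a failed VM entry. */
#define FAILED_VMENTRY_FLAG		0x80000000u

/* Basic exit reasons as defined in the vmx.h hunk of the patch. */
#define REASON_INVALID_GUEST_STATE	33
#define REASON_MSR_LOADING		34
#define REASON_MACHINE_CHECK		41

/* Hypothetical helper: non-zero if the raw reason flags a failed entry. */
static inline int is_failed_vmentry(uint32_t raw)
{
	return (raw & FAILED_VMENTRY_FLAG) != 0;
}

/* Hypothetical helper: strip the flag, as the patch does before dispatch. */
static inline uint32_t basic_exit_reason(uint32_t raw)
{
	return raw & ~FAILED_VMENTRY_FLAG;
}

/* Hypothetical helper: short description of a VM-entry failure reason. */
static inline const char *vmentry_failure_name(uint32_t reason)
{
	switch (reason) {
	case REASON_INVALID_GUEST_STATE:
		return "invalid guest state";
	case REASON_MSR_LOADING:
		return "MSR loading";
	case REASON_MACHINE_CHECK:
		return "machine check";
	default:
		return "unknown";
	}
}
```

Under these assumptions, a raw value of 0x80000021 is a failed entry with basic reason 0x21 (33), that is, invalid guest state.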
From: Marcelo T. <mto...@re...> - 2008-05-14 21:27:10
|
Hi Guillaume,

On Wed, May 14, 2008 at 09:29:11AM +0200, Guillaume Thouvenin wrote:
> On Tue, 6 May 2008 20:05:39 +0300
> "Mohammed Gamal" <m.g...@gm...> wrote:
>
> > > > > WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
> > > > > FreeDOS don't boot but complain about instruction mov 0x11,sreg not
> > > > > being emulated.
>
> Mohammed, can you try the patch at the end of this mail? Here it's
> working with FreeDOS now (I added the emulation of 0x90 that is an xchg
> instruction). I can also boot winXP Professional X64 edition. I still
> have a weird issue with Ubuntu 7.10 that crashes sometimes with the
> error:
>
> kvm_run: failed entry, reason 5
> kvm_run returned -8
>
> It's a little bit strange because this error appears very often with
> the wmii window manager but never with XFCE. And with wmii, it only
> occurs when I move the mouse above the Qemu/KVM window. If I wait 30s
> until the automatic boot it works...

This appears to be due to the vmport save/load bug:

https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/219165

I'll look into it if nobody beats me to it.

Regarding FreeDOS, it is necessary to emulate software interrupts and NOP to get the "HIMEM XMS-memory driver" version to boot (with the FreeOSZOO image).

The "maximum RAM free, using EMM86" version is more complicated, requiring ldt, ltr and a few other things.

There are two problems remaining:

1) add is storing the result in the wrong register

6486: 66 64 89 3e 72 01 mov %edi,%fs:0x172
648c: 66 be 8d 03 00 00 mov $0x38d,%esi
6492: 66 c1 e6 04 shl $0x4,%esi
6496: 66 b8 98 0a 00 00 mov $0xa98,%eax
649c: 66 03 f0 add %eax,%esi

The destination for the add is "%esi", but the emulation stores the result in eax, because:

if ((c->d & ModRM) && c->modrm_mod == 3) {
	u8 reg;
	c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
	c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
}

modrm_reg contains "6", which is the correct register index, but modrm_rm contains 0, so the result is stored in "eax" (see hack).

2) iretl generates pagefaults

1226df: 0f 06 clts
1226e1: b8 14 00 mov $0x14,%ax
1226e4: 8e e0 mov %ax,%fs
1226e6: 66 64 a1 50 01 mov %fs:0x150,%eax
1226eb: 0f 22 d8 mov %eax,%cr3
1226ee: 0f 20 c0 mov %cr0,%eax
1226f1: 66 0d 00 00 00 80 or $0x80000000,%eax
1226f7: 0f 22 c0 mov %eax,%cr0
1226fa: 66 cf iretl

The iretl, which happens after enabling paging, faults in different ways:

kvm_inject_page_fault: EIP=1226fa
kvm_inject_page_fault: ADDR=1226fa
kvm_inject_page_fault: EIP=1226fa
kvm_inject_page_fault: ADDR=1237d1
kvm: inject_page_fault: double fault 0x1237d1

Index: kvm.tip/arch/x86/kvm/vmx.c =================================================================== --- kvm.tip.orig/arch/x86/kvm/vmx.c +++ kvm.tip/arch/x86/kvm/vmx.c @@ -194,6 +194,12 @@ static inline int is_external_interrupt( == (INTR_TYPE_EXT_INTR | INTR_INFO_VALID_MASK); } +static inline int is_software_interrupt(u32 intr_info) +{ + return (intr_info & (INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK)) + == (INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK); +} + static inline int cpu_has_vmx_msr_bitmap(void) { return (vmcs_config.cpu_based_exec_ctrl & CPU_BASED_USE_MSR_BITMAPS); @@ -2190,8 +2196,10 @@ static void kvm_guest_debug_pre(struct k } static int handle_rmode_exception(struct kvm_vcpu *vcpu, - int vec, u32 err_code) + u32 intr_info, u32 err_code) { + int vec = intr_info & INTR_INFO_VECTOR_MASK; + if (!vcpu->arch.rmode.active) return 0; @@ -2202,6 +2210,10 @@ static int handle_rmode_exception(struct if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0) if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_DONE) return 1; + if (is_software_interrupt(intr_info) && err_code == 0) { + if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_DONE) + return 1; + } return 0; } @@ -2257,8
+2269,7 @@ static int handle_exception(struct kvm_v
 	}
 
 	if (vcpu->arch.rmode.active &&
-	    handle_rmode_exception(vcpu, intr_info & INTR_INFO_VECTOR_MASK,
-								error_code)) {
+	    handle_rmode_exception(vcpu, intr_info, error_code)) {
 		if (vcpu->arch.halt_request) {
 			vcpu->arch.halt_request = 0;
 			return kvm_emulate_halt(vcpu);
Index: kvm.tip/arch/x86/kvm/x86.c
===================================================================
--- kvm.tip.orig/arch/x86/kvm/x86.c
+++ kvm.tip/arch/x86/kvm/x86.c
@@ -3294,13 +3294,21 @@ int load_segment_descriptor(struct kvm_v
 
 	if (load_segment_descriptor_to_kvm_desct(vcpu, selector, &kvm_seg))
 		return 1;
-	kvm_seg.type |= type_bits;
 
 	if (seg != VCPU_SREG_SS && seg != VCPU_SREG_CS &&
 	    seg != VCPU_SREG_LDTR)
 		if (!kvm_seg.s)
 			kvm_seg.unusable = 1;
 
+	if (seg == VCPU_SREG_CS && !kvm_seg.s) {
+		switch (kvm_seg.type) {
+		case 9: /* TSS */
+			return kvm_task_switch(vcpu, selector, TASK_SWITCH_JMP);
+		default:
+		}
+	}
+
+	kvm_seg.type |= type_bits;
 	set_segment(vcpu, &kvm_seg, seg);
 	return 0;
 }
Index: kvm.tip/arch/x86/kvm/x86_emulate.c
===================================================================
--- kvm.tip.orig/arch/x86/kvm/x86_emulate.c
+++ kvm.tip/arch/x86/kvm/x86_emulate.c
@@ -99,7 +99,7 @@ static u16 opcode_table[256] = {
 	/* 0x28 - 0x2F */
 	ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 	ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
-	0, 0, 0, 0,
+	DstReg | SrcImm, DstReg | SrcImm, 0, 0,
 	/* 0x30 - 0x37 */
 	ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 	ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
@@ -107,7 +107,7 @@ static u16 opcode_table[256] = {
 	/* 0x38 - 0x3F */
 	ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM,
 	ByteOp | DstReg | SrcMem | ModRM, DstReg | SrcMem | ModRM,
-	0, 0, 0, 0,
+	0, ByteOp | DstReg | SrcImm, 0, 0,
 	/* 0x40 - 0x47 */
 	DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg, DstReg,
 	/* 0x48 - 0x4F */
@@ -154,8 +154,10 @@ static u16 opcode_table[256] = {
 	ByteOp | ImplicitOps | Mov | String, ImplicitOps | Mov | String,
 	ByteOp | ImplicitOps | String, ImplicitOps | String,
 	/* 0xB0 - 0xBF */
-	0, 0, 0, 0, 0, 0, 0, 0, DstReg | SrcImm | Mov, 0, 0, 0, 0, 0, 0, 0,
+	DstReg | SrcImm | Mov, DstReg | SrcImm | Mov, DstReg | SrcImm | Mov,
+	DstReg | SrcImm | Mov, DstReg | SrcImm | Mov, DstReg | SrcImm | Mov,
+	DstReg | SrcImm | Mov, DstReg | SrcImm | Mov,
 	/* 0xC0 - 0xC7 */
 	ByteOp | DstMem | SrcImm | ModRM, DstMem | SrcImmByte | ModRM,
 	0, ImplicitOps | Stack, 0, 0,
@@ -169,7 +171,7 @@ static u16 opcode_table[256] = {
 	/* 0xD8 - 0xDF */
 	0, 0, 0, 0, 0, 0, 0, 0,
 	/* 0xE0 - 0xE7 */
-	0, 0, 0, 0, 0, 0, 0, 0,
+	0, 0, SrcImmByte, 0, 0, 0, 0, 0,
 	/* 0xE8 - 0xEF */
 	ImplicitOps | Stack, SrcImm | ImplicitOps, ImplicitOps, SrcImmByte | ImplicitOps,
 	0, 0, 0, 0,
@@ -183,7 +185,8 @@ static u16 opcode_table[256] = {
 
 static u16 twobyte_table[256] = {
 	/* 0x00 - 0x0F */
-	0, Group | GroupDual | Group7, 0, 0, 0, 0, ImplicitOps, 0,
+	SrcReg|SrcMem16|ModRM,
+	Group | GroupDual | Group7, 0, 0, 0, 0, ImplicitOps, 0,
 	ImplicitOps, ImplicitOps, 0, 0, 0, ImplicitOps | ModRM, 0, 0,
 	/* 0x10 - 0x1F */
 	0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | ModRM, 0, 0, 0, 0, 0, 0, 0,
@@ -275,7 +278,8 @@ static u16 group_table[] = {
 	0, 0, 0, 0, 0, 0,
 	[Group5*8] =
 	DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM, 0, 0,
-	SrcMem | ModRM, 0, SrcMem | ModRM | Stack, 0,
+	SrcMem | ModRM, ImplicitOps | ModRM, SrcMem | ModRM | Stack, 0,
+
 	[Group7*8] =
 	0, 0, ModRM | SrcMem, ModRM | SrcMem,
 	SrcNone | ModRM | DstMem | Mov, 0,
@@ -951,8 +955,8 @@ done_prefixes:
 	}
 
 	/* Unrecognised? */
-	if (c->d == 0) {
-		DPRINTF("Cannot emulate %02x\n", c->b);
+	if (c->d == 0 && (c->b != 0xcc) && (c->b != 0x90) && (c->b != 0xf)) {
+		DPRINTF("Cannot emulate %02x %x\n", c->b, c->eip);
 		return -1;
 	}
 
@@ -1359,8 +1363,15 @@ special_insn:
 	case 0x00 ... 0x05:
 	      add:		/* add */
 		if ((c->d & ModRM) && c->modrm_mod == 3) {
+			u8 reg;
 			c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
-			c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
+
+			if (ctxt->cs_base + c->eip == 0x649f)
+				reg = c->modrm_rm|c->modrm_reg;
+			else
+				reg = c->modrm_rm;
+
+			c->dst.ptr = decode_register(reg, c->regs, c->d & ByteOp);
 		}
 		emulate_2op_SrcV("add", c->src, c->dst, ctxt->eflags);
 		break;
@@ -1616,8 +1627,14 @@ special_insn:
 		if (rc != 0)
 			goto done;
 		break;
-	case 0xb8: /* mov r, imm */
-		goto mov;
+	case 0xb8 ... 0xbf: /* mov r, imm */
+	{
+		int reg = c->b & 0x7;
+		c->dst.ptr = (unsigned long *)&c->regs[VCPU_REGS_RAX + reg];
+		goto mov;
+	}
+	case 0x90: /* nop */
+		break;
 	case 0x9c: /* pushf */
 		c->src.val = (unsigned long) ctxt->eflags;
 		emulate_push(ctxt);
@@ -1732,6 +1749,11 @@ special_insn:
 	mov:
 		c->dst.val = c->src.val;
 		break;
+	case 0xcc ... 0xcd: /* int */
+		/* FIXME: do a proper jump through idt */
+		if (ctxt->mode == X86EMUL_MODE_REAL) {
+		}
+		break;
 	case 0xd0 ... 0xd1:	/* Grp2 */
 		c->src.val = 1;
 		emulate_grp2(ctxt);
@@ -1740,6 +1762,12 @@ special_insn:
 		c->src.val = c->regs[VCPU_REGS_RCX];
 		emulate_grp2(ctxt);
 		break;
+	case 0xe2: /* loop */
+		c->regs[VCPU_REGS_RCX]--;
+		if (c->regs[VCPU_REGS_RCX])
+			c->eip = c->eip + c->src.val;
+		c->dst.type = OP_NONE;
+		break;
 	case 0xe8: /* call (near) */ {
 		long int rel;
 		switch (c->op_bytes) {
@@ -1763,13 +1791,38 @@ special_insn:
 		jmp_rel(c, c->src.val);
 		c->dst.type = OP_NONE; /* Disable writeback. */
 		break;
-	case 0xea: /* jmp far */ {
+	case 0xea:
+	jmpfar: /* jmp far */ {
 		uint32_t eip;
 		uint16_t sel;
 
 		/* enable switch_perso */
 		switch_perso = 1;
 
+		if (c->b == 0xff) {
+			rc = ops->read_emulated(c->modrm_ea, &eip,
+						c->op_bytes, ctxt->vcpu);
+			if (rc != 0)
+				goto cannot_emulate;
+
+			c->modrm_ea += c->op_bytes;
+			rc = ops->read_emulated(c->modrm_ea, &sel,
+						2, ctxt->vcpu);
+			if (rc != 0)
+				goto cannot_emulate;
+
+			c->eip = eip;
+			if (load_segment_descriptor(ctxt->vcpu, sel, 9,
+						    VCPU_SREG_CS) < 0) {
+				printk("failed to load cs!\n");
+				goto cannot_emulate;
+			}
+			goto done;
+			/* FIXME: if this is not a TSS jump need to
+			 * perform register writeback.
+			 * break;
+			 */
+		}
 		switch (c->op_bytes) {
 		case 2:
 			eip = insn_fetch(u16, 2, c->eip);
@@ -1823,6 +1876,8 @@ special_insn:
 		c->dst.type = OP_NONE;	/* Disable writeback. */
 		break;
 	case 0xfe ... 0xff:	/* Grp4/Grp5 */
+		if (c->modrm_reg == 5)
+			goto jmpfar;
 		rc = emulate_grp45(ctxt, ops);
 		if (rc != 0)
 			goto done;
@@ -1847,6 +1902,22 @@ done:
 
 twobyte_insn:
 	switch (c->b) {
+	case 0x0:
+		switch (c->modrm_reg) {
+		case 2: /* ldt */
+			if (load_segment_descriptor(ctxt->vcpu, c->src.val,
+				0, VCPU_SREG_LDTR))
+				goto cannot_emulate;
+			break;
+		case 3: /* ltr */
+			if (load_segment_descriptor(ctxt->vcpu, c->src.val,
+				1, VCPU_SREG_TR))
+				goto cannot_emulate;
+			break;
+		default:
+			goto cannot_emulate;
+		}
+		break;
 	case 0x01: /* lgdt, lidt, lmsw */
 		switch (c->modrm_reg) {
 		u16 size;
|
From: Avi K. <av...@qu...> - 2008-05-15 07:33:46
|
Marcelo Tosatti wrote:
> 1) add is storing the result in the wrong register
>
> 6486: 66 64 89 3e 72 01    mov %edi,%fs:0x172
> 648c: 66 be 8d 03 00 00    mov $0x38d,%esi
> 6492: 66 c1 e6 04          shl $0x4,%esi
> 6496: 66 b8 98 0a 00 00    mov $0xa98,%eax
> 649c: 66 03 f0             add %eax,%esi
>
> The destination for the add is "%esi", but the emulation stores the
> result in eax, because:
>
> if ((c->d & ModRM) && c->modrm_mod == 3) {
>         u8 reg;
>         c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
>         c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
> }
>
> modrm_reg contains "6", which is the correct register index, but
> modrm_rm contains 0, so the result is stored in "eax" (see hack).
>

What version are you looking at? Current code doesn't have exactly this.

But register-in-modrm decoding is a mess, yes. I think the best thing
is to have decode_modrm() accept a struct operand parameter and decode
into that.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
|
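The register mix-up Marcelo describes can be checked by hand from the ModRM byte. The following is a minimal standalone sketch, not code from the KVM emulator: the names split_modrm() and add_dst_index() are hypothetical, and it only models the ADD opcodes 0x00..0x03, where bit 1 of the opcode selects whether the reg field or the r/m field names the destination.

```c
#include <stdint.h>

/* The three fields packed into an x86 ModRM byte. */
struct modrm {
	uint8_t mod;	/* bits 7..6: 3 means "r/m is a register" */
	uint8_t reg;	/* bits 5..3: register operand */
	uint8_t rm;	/* bits 2..0: register or memory operand */
};

/* Hypothetical helper: split a ModRM byte into its fields. */
static struct modrm split_modrm(uint8_t byte)
{
	struct modrm m;

	m.mod = byte >> 6;
	m.reg = (byte >> 3) & 7;
	m.rm = byte & 7;
	return m;
}

/*
 * Hypothetical helper for ADD opcodes 0x00..0x03 only.
 * Bit 1 of the opcode is the direction bit: when it is set
 * (0x02, 0x03) the reg field is the destination, otherwise
 * (0x00, 0x01) the r/m field is.
 */
static uint8_t add_dst_index(uint8_t opcode, struct modrm m)
{
	return (opcode & 0x02) ? m.reg : m.rm;
}
```

For the instruction at 649c (66 03 f0), split_modrm(0xf0) gives mod=3, reg=6, rm=0, and add_dst_index(0x03, ...) returns 6, which is %esi in the usual register numbering (eax=0 ... esi=6, edi=7). Using the r/m field for opcode 0x03 would select index 0, %eax, which matches the wrong result described above.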
From: Guillaume T. <gui...@ex...> - 2008-05-15 08:13:25
|
On Thu, 15 May 2008 10:33:38 +0300
Avi Kivity <av...@qu...> wrote:

> Marcelo Tosatti wrote:
> > 1) add is storing the result in the wrong register
> >
> > 6486: 66 64 89 3e 72 01    mov %edi,%fs:0x172
> > 648c: 66 be 8d 03 00 00    mov $0x38d,%esi
> > 6492: 66 c1 e6 04          shl $0x4,%esi
> > 6496: 66 b8 98 0a 00 00    mov $0xa98,%eax
> > 649c: 66 03 f0             add %eax,%esi
> >
> > The destination for the add is "%esi", but the emulation stores the
> > result in eax, because:
> >
> > if ((c->d & ModRM) && c->modrm_mod == 3) {
> >         u8 reg;
> >         c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
> >         c->dst.ptr = decode_register(c->modrm_rm, c->regs, c->d & ByteOp);
> > }
> >
> > modrm_reg contains "6", which is the correct register index, but
> > modrm_rm contains 0, so the result is stored in "eax" (see hack).
> >
>
> What version are you looking at? Current code doesn't have exactly this.

It's in my patch. I added this because in the gfxboot code there is an
instruction "add %eax, %esp" that needs to be emulated and, with the
normal path, if I remember correctly, we have c->dst.bytes == 0 and thus
the emulate_2op_SrcV() function simply does nothing.

Regards,
Guillaume
|
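Guillaume's point about c->dst.bytes == 0 can be illustrated with a simplified model of a width-dispatched two-operand helper. This is only a sketch under assumptions: struct operand and add_by_width() are hypothetical stand-ins, not the emulator's emulate_2op_SrcV() macro (which, as far as the thread indicates, dispatches on the destination width and also updates eflags). The model just shows how an operand whose width was never filled in matches no case and is left unchanged.

```c
#include <stdint.h>

/* Hypothetical, reduced operand: width in bytes plus a value. */
struct operand {
	unsigned int bytes;	/* 1, 2, 4 or 8; 0 if never initialised */
	uint64_t val;
};

/*
 * Hypothetical stand-in for a width-dispatched "add".
 * Each case truncates the sum to the operand width. There is
 * deliberately no default case, so bytes == 0 falls through
 * the switch and dst is not modified.
 */
static void add_by_width(const struct operand *src, struct operand *dst)
{
	switch (dst->bytes) {
	case 1:
		dst->val = (uint8_t)(dst->val + src->val);
		break;
	case 2:
		dst->val = (uint16_t)(dst->val + src->val);
		break;
	case 4:
		dst->val = (uint32_t)(dst->val + src->val);
		break;
	case 8:
		dst->val = dst->val + src->val;
		break;
	}
}
```

With dst.bytes left at 0 the call returns without touching dst.val, which is the silent no-op described above; once dst.bytes is set (as the mod == 3 block in the patch does), the same call performs the addition.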