Activity for Christopher

  • Christopher Christopher posted a comment on ticket #564

    Disregard this for now, there's another bug to work out then will submit a newer patch.

  • Christopher Christopher created ticket #564

    AMD PauseFilterThreshold support

  • Christopher Christopher created ticket #1443

    Out of bounds memory access in memory.cc

  • Christopher Christopher created ticket #1440

    AMD SVM VMCBPTR not saved on snapshot

  • Christopher Christopher posted a comment on ticket #1428

    Ok I'm kinda getting closer. What's happening is first the MSR KernelGSBase (0xC0000102) is being accessed and this is in the MSR bitmap and NOT being intercepted, so that occurs in the guest without VMEXIT. Now when the MSR 0x40000071 follows, this somehow leads to a memory access exception (still need to trace where exactly). But if I intercept all MSRs, meaning the KernelGSBase MSR leads to a VMEXIT instead, then it works fine and the following 0x40000071 MSR VMEXITs and doesn't cause any more...

  • Christopher Christopher posted a comment on ticket #1428

    Actually no, I'm so confused. When the guest does wrmsr to 0x40000071, MSR is 0x40000071. However, somehow when I check "if(msr == 0x40000071)" the if fails? And I know MSR is 0x40000071 from BX_INFO printing it, yet somehow it also passes the check "else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff))" and enters this branch, when it shouldn't? So confused about what's going on here at the moment haha.

  • Christopher Christopher posted a comment on ticket #1428

    Hey man, new issue to fix, this one is why I hate C lol. In svm.cc you have SvmInterceptMSR, and there's a bunch of if and else if statements like: if (msr <= 0x1fff) msr_map_offset = 0; else if (msr >= 0xc0000000 && msr <= 0xc0001fff) msr_map_offset = 2048; else if (msr >= 0xc0010000 && msr <= 0xc0011fff) msr_map_offset = 4096; There's a problem here, specifically the double conditions inside the brackets like "msr >= 0xc0000000 && msr <= 0xc0001fff". It's not being calculated properly because you need...
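For reference, the range checks quoted above pick one of three 2 KB regions in the SVM MSR permission map. The following is only a rough sketch of that lookup, not the actual Bochs code: the names `msrpm_region_offset`, `msrpm_locate` and `MsrpmBit` are made up for illustration, and it assumes the usual layout of two bits per MSR (read intercept, then write intercept). It does not attempt to reproduce or explain the comparison problem described in the comment.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of the 2 KB permission-map region that
// covers `msr`, or -1 if the MSR lies outside the three mapped ranges.
int msrpm_region_offset(uint32_t msr) {
  if (msr <= 0x00001fffu) return 0;                             // region 0
  if (msr >= 0xc0000000u && msr <= 0xc0001fffu) return 2048;    // region 1
  if (msr >= 0xc0010000u && msr <= 0xc0011fffu) return 4096;    // region 2
  return -1;                                                    // not mapped
}

// Hypothetical result type: where the intercept bit for an access lives.
struct MsrpmBit {
  bool mapped;           // false when the MSR is outside the mapped ranges
  uint32_t byte_offset;  // byte offset into the permission map
  uint32_t bit;          // bit number (0..7) within that byte
};

// Assumption: each MSR owns two consecutive bits, read first, write second.
MsrpmBit msrpm_locate(uint32_t msr, bool is_write) {
  int region = msrpm_region_offset(msr);
  if (region < 0) return MsrpmBit{false, 0, 0};
  uint32_t index = msr & 0x1fffu;                  // position inside region
  uint32_t bitpos = index * 2 + (is_write ? 1u : 0u);
  return MsrpmBit{true, static_cast<uint32_t>(region) + bitpos / 8, bitpos % 8};
}
```

With this sketch, KernelGSBase (0xC0000102) lands in the second region, while 0x40000071 falls outside all three ranges and is reported as unmapped, so a caller would have to treat it as always intercepted.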

  • Christopher Christopher posted a comment on ticket #1428

    Yep, with these changes AMD Hyper-V boots and works! Thanks for your assistance with finding the bugs and patches.

  • Christopher Christopher posted a comment on ticket #1428

    Yeah that works

  • Christopher Christopher posted a comment on ticket #1428

    Yeah the hack works, Hyper-V AMD boots fine with that alongside the Guest EFER.SVME change I mentioned earlier too.

  • Christopher Christopher posted a comment on ticket #1428

    I checked with your changes in event.cc: it doesn't have the intercept set, so it doesn't VMEXIT, it goes the "take it normally" route. Also:
    - The SMI comes via apic_bus_deliver_smi, called from generate_smi in iodev/acpi.cc, with the value 0xf1
    - When doing the problematic page walk, the status is: GUEST_NXE:0, HOST_NXE:1, IN_SVM:1, IN_SMM:1, RW:2, is_page_walk:0

  • Christopher Christopher posted a comment on ticket #1428

    I'll look into it more tomorrow too, but for now I can provide a dump of the debug prints right before it happens (with some extras in there):
    07119718387i[CPU0 ] GUEST_NXE:1, HOST_NXE:1, IN_SVM:1
    07119718387d[CPU0 ] Nested walk for guest paddr 0x000004605000
    07119718387i[CPU0 ] GUEST_NXE:1, HOST_NXE:1, IN_SVM:1
    07119718387d[CPU0 ] Nested walk for guest paddr 0x000004606070
    07119718387i[CPU0 ] GUEST_NXE:1, HOST_NXE:1, IN_SVM:1
    07119718387d[CPU0 ] Nested walk for guest paddr 0x00000010e04c
    07119718393i[CPU0...

  • Christopher Christopher posted a comment on ticket #1428

    Hey, so when the NX fault occurs, the EFER.NXE status for guest and host is: GUEST_NXE:0, HOST_NXE:1. Also, yes, your fix for the nested_page_fault issue above seems good.

  • Christopher Christopher modified a comment on ticket #1428

    Yeah, all the changes were necessary, reasons being: For the paging.cc nested_walk change, without this change, as mentioned, the exitinfo1 was 0000000200000004, and when this went to Hyper-V it must have injected an exception or something, because code execution jumps to a BSOD in the guest. With this change it doesn't: the exitinfo1 becomes 0000000100000004 and the guest continues. This code is hit when the guest attempts to read addresses like 0xfee00320 and 0xfee00340. For the paging.cc *nx_fault...

  • Christopher Christopher modified a comment on ticket #1428

    Yeah, all the changes were necessary, reasons being: For the paging.cc nested_walk change, without this change, as mentioned, the exitinfo1 was 0000000200000004, and when this went to Hyper-V it must have injected an exception or something, because code execution jumps to a BSOD in the guest. With this change it doesn't, and the guest continues. This code is hit when the guest attempts to read addresses like 0xfee00320 and 0xfee00340. For the paging.cc *nx_fault change, this is required because the...

  • Christopher Christopher posted a comment on ticket #1428

    Yeah, all the changes were necessary, reasons being: - For the paging.cc nested_walk change, without this change, as mentioned, the exitinfo1 was 0000000200000004, and when this went to Hyper-V it must have injected an exception or something, because code execution jumps to a BSOD in the guest. With this change it doesn't, and the guest continues. This code is hit when the guest attempts to read addresses like 0xfee00320 and 0xfee00340. For the paging.cc *nx_fault change, this is required because the...

  • Christopher Christopher posted a comment on ticket #1428

    Well, I actually got Hyper-V AMD to boot properly with the following changes:
    - Change paging.cc line ~1384 from nested_walk(paddress, rw, 0); to nested_walk(paddress, rw, 1);
    - Change paging.cc line ~657 by removing the "*nx_fault = 1;" line, as this was being hit for some unknown reason
    - Change cpu/svm.cc line ~430 by removing the guest EFER.SVME requirement, commenting out the lines "BX_ERROR(("VMRUN: Guest EFER.SVME = 0"));" and "return 0;"
    With those changes, I can actually boot into Windows...

  • Christopher Christopher posted a comment on ticket #1428

    Note that patching line 1384 in cpu/paging.cc to "nested_walk(paddress, rw, 1);" instead of "nested_walk(paddress, rw, 0);" does make Bochs send the proper exitinfo1 value, which allows Windows to boot further. However, then somehow we get a "PAE PTE: non-executable page fault occurred" and "SVM VMEXIT reason=1024 exitinfo1=0000000100000015 exitinfo2=00000000000a8000", and then further down a panic happens: "[BXVGA ] >>PANIC<< update: select_high_bank != 1". Not sure if the patch I mentioned is actually correct,...

  • Christopher Christopher posted a comment on ticket #1428

    Hey, so I looked into this, and it turns out the ExitInfo1 value Bochs provides when an NPF occurs doesn't match hardware. For example, on the same build of Windows, the guest accesses to 0xfee00320 cause an ExitInfo1 code of "0000000100000004" on hardware, whereas under Bochs we see "0000000200000004". I'm not familiar with how the codes are used for NPF yet. In the meantime, could you reconfirm the accuracy of how Bochs sets the ExitInfo1 code for nested page faults under SVM, as they're different from hardware? Though I also...
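To make the two ExitInfo1 values above easier to compare, here is a small decoding sketch. It is not taken from Bochs: `NpfInfo` and `decode_npf_exitinfo1` are hypothetical names, and the bit meanings are my reading of the AMD nested page fault error code (low bits as in a normal page fault error code, bit 32 for a fault on the final guest physical address translation, bit 33 for a fault during the guest page table walk), so treat them as an assumption to check against the AMD manual.

```cpp
#include <cstdint>

// Hypothetical decoded view of the ExitInfo1 value for an NPF VMEXIT.
struct NpfInfo {
  bool present;            // bit 0: fault on a present page
  bool write;              // bit 1: access was a write
  bool user;               // bit 2: user-mode access
  bool reserved_bit;       // bit 3: reserved bit set in a table entry
  bool instr_fetch;        // bit 4: access was an instruction fetch
  bool final_translation;  // bit 32 (assumed): fault on final guest paddr
  bool guest_table_walk;   // bit 33 (assumed): fault while walking guest tables
};

NpfInfo decode_npf_exitinfo1(uint64_t exitinfo1) {
  NpfInfo info;
  info.present           = (exitinfo1 >> 0) & 1;
  info.write             = (exitinfo1 >> 1) & 1;
  info.user              = (exitinfo1 >> 2) & 1;
  info.reserved_bit      = (exitinfo1 >> 3) & 1;
  info.instr_fetch       = (exitinfo1 >> 4) & 1;
  info.final_translation = (exitinfo1 >> 32) & 1;
  info.guest_table_walk  = (exitinfo1 >> 33) & 1;
  return info;
}
```

Under that reading, the hardware value 0000000100000004 has only bit 32 and the user bit set, while the Bochs value 0000000200000004 has bit 33 set instead, so the two differ only in which stage of the nested translation is blamed for the fault.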

  • Christopher Christopher posted a comment on ticket #1428

    Hey, so I've got another crash. In Windows with AMD SVM enabled alongside nested paging, the Windows guest will try to access the APIC address 0xfee00320. This leads to a nested page fault VMEXIT, and the hypervisor ends up injecting a general protection fault exception that crashes the guest. Trying to look into it, but if you have any thoughts on this that'd be great. Thanks.

  • Christopher Christopher posted a comment on ticket #1428

    Ok, so with the code in the repo now, it's like my last dot point above. Execution continues until the kernel panics due to corrupted registers, which I think is from the stale VMCB bug I described. Thanks for the fix, I figure the PAT bug / incorrect handling was just unrelated to the VMCB issues I'm having. I'm continuing to look into this, but any ideas are appreciated.

  • Christopher Christopher posted a comment on ticket #1428

    Ok, so a couple of updates:
    - With the current code in the repo, booting AMD Hyper-V in Bochs results in a CPU exception that panics and kills execution
    - By adding in the hardcoding of the PAT, no panics occur but the host just seems to hang and nothing really happens
    - By adding in code to set the guest PAT properly before vmrun (setting msr.pat to the saved guest PAT in the VMCB in SvmEnterLoadCheckControls), code progresses further to the original BSOD I was getting when I started this thread.
    - By adding...

  • Christopher Christopher posted a comment on ticket #1428

    Sorry, I should have confirmed: no, Hyper-V will not load without nested paging support (it just loads into base Windows without Hyper-V if it doesn't detect nested paging support).

  • Christopher Christopher posted a comment on ticket #1428

    Oh, also I don't see the guest PAT actually being loaded before vmrun? I see it being checked in SvmEnterLoadCheckControls, but not actually applied to msr.pat?

  • Christopher Christopher posted a comment on ticket #1428

    Actually, you will want to save/restore the host PAT. Right now, with the guest PAT change just made, Hyper-V AMD doesn't boot nearly as far as it did: at the first SVM exit it just resets. I think the host PAT being corrupted by not being restored is definitely affecting this.

  • Christopher Christopher posted a comment on ticket #1428

    Hello, it looks like there are MSRs (like the PAT) that aren't saved/restored for the guest and host in svm.cc. This seems to apply to all MSRs defined after SVM_GUEST_PAT too. I think this is the issue preventing Hyper-V from booting properly. Thanks.
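The save/restore being asked for above can be pictured roughly as follows. This is a toy model only and not the Bochs implementation: `Msrs`, `VmcbSave`, `HostSave`, `svm_enter` and `svm_exit` are hypothetical names, and the sketch assumes the guest PAT is only swapped in when nested paging is enabled.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the emulator's state; not Bochs structures.
struct Msrs     { uint64_t pat; };    // live PAT MSR of the emulated CPU
struct VmcbSave { uint64_t g_pat; };  // guest PAT field in the VMCB
struct HostSave { uint64_t pat; };    // host state saved across VMRUN

// On VMRUN: remember the host PAT, then load the guest PAT from the VMCB.
void svm_enter(Msrs& msr, HostSave& host, const VmcbSave& vmcb,
               bool nested_paging) {
  host.pat = msr.pat;
  if (nested_paging) msr.pat = vmcb.g_pat;  // assumption: only with NPT
}

// On VMEXIT: write the guest PAT back to the VMCB, then restore the host PAT.
void svm_exit(Msrs& msr, const HostSave& host, VmcbSave& vmcb,
              bool nested_paging) {
  if (nested_paging) vmcb.g_pat = msr.pat;
  msr.pat = host.pat;
}
```

The point of the sketch is the symmetry: whatever is loaded for the guest on entry has to be undone on exit, otherwise the host keeps running with the guest's value.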

  • Christopher Christopher created ticket #1428

    AMD SVM Hyper-V fails (bug)
