Thread: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

Brought to you by: avik, mtosatti

kvm-devel

[kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-23 18:23:30

Attachments: kvm-66-msr-k7.patch

Avi, Joerg,
     While trying to boot a RHEL-4 guest on latest KVM tip on an AMD machine, I
found that the guest would consistently crash when trying to setup the NMI
watchdog.  I traced it down to the following commit:

51ef1ac7b23ee32bfcc61c229d634fdc1c68b38a

It seems that in that commit, the K7_EVNTSEL MSR's were set to fail if the data
!= 0.  That test is actually fine, the problem is how the code around it is
generated.  That is, we are only supposed to go to unhandled if data != 0; but
for some reason, we are *always* going to unhandled, even when the data == 0.
That causes RHEL-4 kernel to crash.  If I rearrange the code to look like this:

	case MSR_K7_EVNTSEL0:
	case MSR_K7_EVNTSEL1:
	case MSR_K7_EVNTSEL2:
	case MSR_K7_EVNTSEL3:

		if (data != 0)
			return kvm_set_msr_common(vcpu, ecx, data);
		
	default:
		return kvm_set_msr_common(vcpu, ecx, data);
	}

Then everything works again.  A patch that does just this is attached.  It might
be slightly nicer to say "if (data == 0) return 0" and then just fall through to
the default case, but I don't much care either way.

Signed-off-by: Chris Lalancette <cla...@re...>

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Avi K. <av...@qu...> - 2008-04-24 07:35:49

Chris Lalancette wrote:
> Avi, Joerg,
>      While trying to boot a RHEL-4 guest on latest KVM tip on an AMD machine, I
> found that the guest would consistently crash when trying to setup the NMI
> watchdog.  I traced it down to the following commit:
>
> 51ef1ac7b23ee32bfcc61c229d634fdc1c68b38a
>
> It seems that in that commit, the K7_EVNTSEL MSR's were set to fail if the data
> != 0.  That test is actually fine, the problem is how the code around it is
> generated.  That is, we are only supposed to go to unhandled if data != 0; but
> for some reason, we are *always* going to unhandled, even when the data == 0.
> That causes RHEL-4 kernel to crash.  If I rearrange the code to look like this:
>
> 	case MSR_K7_EVNTSEL0:
> 	case MSR_K7_EVNTSEL1:
> 	case MSR_K7_EVNTSEL2:
> 	case MSR_K7_EVNTSEL3:
>
> 		if (data != 0)
> 			return kvm_set_msr_common(vcpu, ecx, data);
> 		
> 	default:
> 		return kvm_set_msr_common(vcpu, ecx, data);
> 	}
>
> Then everything works again.  A patch that does just this is attached.  It might
> be slightly nicer to say "if (data == 0) return 0" and then just fall through to
> the default case, but I don't much care either way.
>   

You mean the gcc generates wrong code?  It seems fine here (though 
wonderfully obfuscated).

Can you attach an objdump -Sr svm.o?  Also, what gcc version are you using?

-- 
error compiling committee.c: too many arguments to function

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Avi K. <av...@qu...> - 2008-04-24 15:20:41

Chris Lalancette wrote:
> Avi Kivity wrote:
>   
>> You mean the gcc generates wrong code?  It seems fine here (though 
>> wonderfully obfuscated).
>>
>> Can you attach an objdump -Sr svm.o?  Also, what gcc version are you using?
>>
>>     
>
> (sending attachment in private mail, so I don't spam the whole list with 189K of
> objdump).  Note that this is an objdump -Sr of the current code, with my patch
> *not* applied.
>
> gcc is gcc-4.3.0-7 in Fedora 9.
>
>   


It's a gcc bug.  svm_set_msr() places ecx in %rsi, and consistently uses 
%esi to refer to the first 32 bits.  But when it compiles this bit:

>     case MSR_K7_EVNTSEL0:
>     case MSR_K7_EVNTSEL1:
>     case MSR_K7_EVNTSEL2:
>     case MSR_K7_EVNTSEL3:
>         /*
>          * only support writing 0 to the performance counters for now
>          * to make Windows happy. Should be replaced by a real
>          * performance counter emulation later.
>          */
>         if (data != 0)
>             goto unhandled;
>         break;

(where MSR_K7_EVENTSEL[0123] == 0xc001000[0123])

it compiles it into

>     1811:	8d 86 00 00 ff 3f    	lea    0x3fff0000(%rsi),%eax
>     1817:	83 f8 03             	cmp    $0x3,%eax
>     181a:	0f 87 e2 01 00 00    	ja     1a02 <svm_set_msr+0x27f>
Now it uses %rsi instead of %esi, and any junk in the upper bits will 
cause the ja to be taken.

We need to get a reduced testcase to the gcc folks, this is a serious 
bug.  Any changes in the code to work around this would be fragile.

-- 
error compiling committee.c: too many arguments to function

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-24 15:44:07

Avi Kivity wrote:
> 
>>     1811:	8d 86 00 00 ff 3f    	lea    0x3fff0000(%rsi),%eax
>>     1817:	83 f8 03             	cmp    $0x3,%eax
>>     181a:	0f 87 e2 01 00 00    	ja     1a02 <svm_set_msr+0x27f>
> Now it uses %rsi instead of %esi, and any junk in the upper bits will 
> cause the ja to be taken.
> 
> We need to get a reduced testcase to the gcc folks, this is a serious 
> bug.  Any changes in the code to work around this would be fragile.
> 

Ouch, I missed this on my reading of it.  I'll try to come up with a standalone
program that shows this.

Thanks, Avi.

Chris Lalancette

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-24 22:13:13

Avi Kivity wrote:
> Now it uses %rsi instead of %esi, and any junk in the upper bits will 
> cause the ja to be taken.
> 
> We need to get a reduced testcase to the gcc folks, this is a serious 
> bug.  Any changes in the code to work around this would be fragile.
> 

Avi,
    I've now filed a bug in the upstream gcc database:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36040

And I came up with a reduced test case, available here:

http://people.redhat.com/clalance/rsi-test-case.tar.bz2

If I compile the code in the above and look at the disassembly, it shows the
problem; however, I can't reproduce the bug by actually running the code.  I
suspect the %rsi register is always 0 when we start in this userland code, so I
never run into the bogus ja, but I just thought I'd mention it.

Chris Lalancette

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Avi K. <av...@qu...> - 2008-04-25 07:31:49

Chris Lalancette wrote:
> Avi Kivity wrote:
>   
>> Now it uses %rsi instead of %esi, and any junk in the upper bits will 
>> cause the ja to be taken.
>>
>> We need to get a reduced testcase to the gcc folks, this is a serious 
>> bug.  Any changes in the code to work around this would be fragile.
>>
>>     
>
> Avi,
>     I've now filed a bug in the upstream gcc database:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36040
>
> And I came up with a reduced test case, available here:
>
> http://people.redhat.com/clalance/rsi-test-case.tar.bz2
>
> If I compile the code in the above and look at the disassembly, it shows the
> problem; however, I can't reproduce the bug by actually running the code.  I
> suspect the %rsi register is always 0 when we start in this userland code, so I
> never run into the bogus ja, but I just thought I'd mention it.
>
>   

Hmm, looking back at the dump:

>     1811:	8d 86 00 00 ff 3f    	lea    0x3fff0000(%rsi),%eax
>     1817:	83 f8 03             	cmp    $0x3,%eax
>     181a:	0f 87 e2 01 00 00    	ja     1a02 <svm_set_msr+0x27f>

So while gcc is using %rsi, it loads the result back into %eax, which 
has the effect of dropping back into 32-bits.  So looks like gcc was 
right here.  Sorry for spreading confusion and apologies to gcc.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-25 13:08:08

Avi Kivity wrote:
> 
> Hmm, looking back at the dump:
> 
>>     1811:	8d 86 00 00 ff 3f    	lea    0x3fff0000(%rsi),%eax
>>     1817:	83 f8 03             	cmp    $0x3,%eax
>>     181a:	0f 87 e2 01 00 00    	ja     1a02 <svm_set_msr+0x27f>
> 
> So while gcc is using %rsi, it loads the result back into %eax, which 
> has the effect of dropping back into 32-bits.  So looks like gcc was 
> right here.  Sorry for spreading confusion and apologies to gcc.
> 

OK.  Well, then I can't explain why we are unconditionally calling
kvm_set_msr_common(), regardless of whether data == 0 or not.  Avi, you said it
works for you; what version of gcc are you using, and can you send me your
objdump -Sr?  I'd like to compare the assembly output with what 4.3.0 is
spitting out.

Chris Lalancette

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-25 18:43:36

Avi Kivity wrote:
> 
> Hmm, looking back at the dump:
> 
>>     1811:	8d 86 00 00 ff 3f    	lea    0x3fff0000(%rsi),%eax
>>     1817:	83 f8 03             	cmp    $0x3,%eax
>>     181a:	0f 87 e2 01 00 00    	ja     1a02 <svm_set_msr+0x27f>
> 
> So while gcc is using %rsi, it loads the result back into %eax, which 
> has the effect of dropping back into 32-bits.  So looks like gcc was 
> right here.  Sorry for spreading confusion and apologies to gcc.
> 

Avi,
     Arg.  I was completely, utterly wrong about the problem here (although
there is definitely still a problem).  I'm sorry for making a confusing mess out
of this.  Here is what is actually happening:

During startup, the RHEL-4 x86_64 kernel (2.6.9-67.EL, if you care) setups up
the NMI watchdog.  It does the following:

	for(i = 0; i < 4; ++i) {
		/* Simulator may not support it */
		if (checking_wrmsrl(MSR_K7_EVNTSEL0+i, 0UL))
			return;
		wrmsrl(MSR_K7_PERFCTR0+i, 0UL);
	}

checking_wrmsrl() just does a "test write" to the msr; because of the code that
is currently in there, this succeeds.  However, when it tries to do the
MSR_K7_PERFCTR0 wrmsr, *that* is where it fails, since we don't currently handle
that MSR, and KVM injects a GPF into the guest (which kills it).  My previous
patch just happened to fix this because it was making checking_wrmsrl() fail on
the EVNTSEL0, so we just returned out of this loop rather than trying to write
to the PERFCTR0.

Unfortunately, we can't just "fake emulate" MSR_K7_PERFCTR[0-3] like we are
doing for MSR_K7_EVNTSEL[0-3]; if they are there, linux expects to be able to
put values into them.  I think the correct solution here is to emulate
MSR_K7_PERFCTR[0-3] and MSR_K7_EVNTSEL[0-3] for real.  I'm working on a patch to
do this now.

Chris Lalancette

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Joerg R. <joe...@am...> - 2008-04-28 12:46:17

On Fri, Apr 25, 2008 at 02:43:19PM -0400, Chris Lalancette wrote:
> Unfortunately, we can't just "fake emulate" MSR_K7_PERFCTR[0-3] like we are
> doing for MSR_K7_EVNTSEL[0-3]; if they are there, linux expects to be able to
> put values into them.  I think the correct solution here is to emulate
> MSR_K7_PERFCTR[0-3] and MSR_K7_EVNTSEL[0-3] for real.  I'm working on a patch to
> do this now.

We already discussed the emulation of the performance counter registers
in the past. The conclusion is, that we loose live migration with that
emulation because performance monitoring is implemented differently
between AMD and Intel systems. Maybe discarding _any_ writes to
the performance counter MSRs will fix the guest crash. What we should
not do is injecting GPF on zero writes to the MSRs because that will
break Windows XP 64bit installation.

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-28 14:19:57

Attachments: kvm-66-fix-k7-msr.patch

Joerg Roedel wrote:
> We already discussed the emulation of the performance counter registers
> in the past. The conclusion is, that we loose live migration with that
> emulation because performance monitoring is implemented differently
> between AMD and Intel systems. Maybe discarding _any_ writes to
> the performance counter MSRs will fix the guest crash. What we should
> not do is injecting GPF on zero writes to the MSRs because that will
> break Windows XP 64bit installation.

     OK, yeah, I read some of those threads over the weekend.  It is a larger
problem than just this crash, and probably one to be solved later.
     I have attached a patch which just discards all writes, as you suggested;
this allows my RHEL-4 guest to successfully boot, although obviously the NMI
watchdog which it thinks it set up is not actually working.  Joerg, can you test
this on your 64-bit Windows XP guest to make sure it is still working?

Thanks,
Chris Lalancette

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Andi K. <an...@fi...> - 2008-04-28 16:42:10

Chris Lalancette <cla...@re...> writes:
> this allows my RHEL-4 guest to successfully boot, although obviously the NMI
> watchdog which it thinks it set up is not actually working.

That's fine. On a virtualized environment you don't really need a NMI watchdog
because you can always debug from the outside

-Andi

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Joerg R. <joe...@am...> - 2008-04-28 15:58:30

On Mon, Apr 28, 2008 at 10:17:37AM -0400, Chris Lalancette wrote:
> Joerg Roedel wrote:
> > We already discussed the emulation of the performance counter registers
> > in the past. The conclusion is, that we loose live migration with that
> > emulation because performance monitoring is implemented differently
> > between AMD and Intel systems. Maybe discarding _any_ writes to
> > the performance counter MSRs will fix the guest crash. What we should
> > not do is injecting GPF on zero writes to the MSRs because that will
> > break Windows XP 64bit installation.
> 
>      OK, yeah, I read some of those threads over the weekend.  It is a larger
> problem than just this crash, and probably one to be solved later.
>      I have attached a patch which just discards all writes, as you suggested;
> this allows my RHEL-4 guest to successfully boot, although obviously the NMI
> watchdog which it thinks it set up is not actually working.  Joerg, can you test
> this on your 64-bit Windows XP guest to make sure it is still working?

XP 64 bit installs and works fine with your patch.

Joerg


-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

Re: [kvm-devel] [PATCH]: Fix MSR_K7_EVNTSEL{0,3} for SVM

From: Chris L. <cla...@re...> - 2008-04-28 16:41:33

Joerg Roedel wrote:
>>      OK, yeah, I read some of those threads over the weekend.  It is a larger
>> problem than just this crash, and probably one to be solved later.
>>      I have attached a patch which just discards all writes, as you suggested;
>> this allows my RHEL-4 guest to successfully boot, although obviously the NMI
>> watchdog which it thinks it set up is not actually working.  Joerg, can you test
>> this on your 64-bit Windows XP guest to make sure it is still working?
> 
> XP 64 bit installs and works fine with your patch.

Great, thanks for testing, I'll post the patch again with a proper changelog for
Avi.

Chris Lalancette