|
From: Eliot M. <mo...@cs...> - 2012-01-30 03:13:27
|
Dear valgrind-ers -- I am enjoying getting started with valgrind and have
been trying to apply it to a couple of Java VMs, Jikes RVM and HotSpot (the
usual Oracle/Sun JVM). While valgrind seems to play well with Jikes RVM, I
have run into two unimplemented instructions for x64 used by HotSpot:

0x66 0x0F 0x3A 0x61 0x06 0x0D: PCMPESTRI for 16-bit characters
  For this one I am attempting to add the 16-bit versions of those
  PCMPxSTRx instructions already supported for 8-bit. Once
  I have tested it some, I will see about submitting a patch.
  It's relatively straightforward.

0x66 0xDD 0x4: FRSTOR with size-change prefix.
  For AMD guest FRSTOR is completely commented out. I am guessing
  it was originally copied from x86 (since the contents of the
  commented out code correspond with guest_x86 code) but never
  really "ported" to AMD. Several runs of HotSpot on DaCapo
  benchmarks fail under valgrind because this instruction is
  missing.

Could someone perhaps clarify whether it simply hasn't been
done or if there is some deeper reason why the code was
commented out?

Regards -- Eliot Moss

PS -- I did search the archives on these but failed to find
anything immediately relevant. Sorry if it's there but I
missed it ... EM
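As an illustration of what the first byte sequence above asks for, here is a minimal C sketch that splits a PCMPxSTRx control byte into its fields. It assumes the final byte of the sequence, 0x0D, is the imm8 control byte and follows the field layout in the Intel instruction reference. The names `pcmpstr_ctrl`, `decode_pcmpstr_imm8` and `aggregation_name` are hypothetical and do not come from the Valgrind sources.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of the PCMPxSTRx imm8 control byte (hypothetical type). */
typedef struct {
    int is_words;     /* bit 0: 0 = 8-bit elements, 1 = 16-bit elements */
    int is_signed;    /* bit 1: 0 = unsigned, 1 = signed elements */
    int aggregation;  /* bits 3:2: comparison mode, see aggregation_name() */
    int polarity;     /* bits 5:4: how the intermediate result is negated */
    int msb_index;    /* bit 6: for the index forms, 0 = lowest set bit,
                         1 = highest set bit */
} pcmpstr_ctrl;

/* Split the control byte into its bit fields. */
pcmpstr_ctrl decode_pcmpstr_imm8(uint8_t imm8)
{
    pcmpstr_ctrl c;
    c.is_words    = imm8 & 1;
    c.is_signed   = (imm8 >> 1) & 1;
    c.aggregation = (imm8 >> 2) & 3;
    c.polarity    = (imm8 >> 4) & 3;
    c.msb_index   = (imm8 >> 6) & 1;
    return c;
}

/* Human-readable name of the aggregation (comparison) mode. */
const char *aggregation_name(int aggregation)
{
    static const char *const names[4] = {
        "equal any", "ranges", "equal each", "equal ordered"
    };
    assert(aggregation >= 0 && aggregation <= 3);
    return names[aggregation];
}
```

Under that assumption, `decode_pcmpstr_imm8(0x0D)` reports 16-bit unsigned elements in "equal ordered" (substring search) mode, which is consistent with the "16-bit characters" description in the message.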
|
From: Julian S. <js...@ac...> - 2012-01-30 08:29:53
|
> 0x66 0x0F 0x3A 0x61 0x06 0x0D: PCMPESTRI for 16-bit characters
> For this one I am attempting to add the 16-bit versions of those
> PCMPxSTRx instructions already supported for 8-bit. Once
> I have tested it some, I will see about submitting a patch.
> It's relatively straightforward.

Great. Please be sure to add test cases to
none/tests/amd64/pcmpstr64.c as that's the only way you can be
(even remotely) confident you have it working correctly.

> 0x66 0xDD 0x4: FRSTOR with size-change prefix.
> For AMD guest FRSTOR is completely commented out. I am guessing
> it was originally copied from x86 (since the contents of the
> commented out code correspond with guest_x86 code) but never
> really "ported" to AMD. Several runs of HotSpot on DaCapo
> benchmarks fail under valgrind because this instruction is
> missing.

I would have thought that FSAVE/FRSTOR in 64-bit mode is essentially
useless, and the fact that we've not so far needed to implement them
kind of supports that suspicion. Reason is that they don't save or
restore the XMM registers. The x86-64 ELF ABI requires floating point
values to be in the XMM registers; hence FSAVE/FRSTOR don't succeed in
correctly saving and restoring the CPU's floating point state. What you
need for that is FXSAVE and FXRSTOR, and they are indeed implemented.

> Could someone perhaps clarify whether it simply hasn't been
> done or if there is some deeper reason why the code was
> commented out?

Instructions don't get implemented until a need arises and a test
case is created. Experience shows that implementing instructions
without proper test cases leads to emulation bugs that are (literally)
nearly impossible to find.

Oh, and be sure to do your hacking on the svn trunk, not on a tarball
tree. The x86_64 instruction decoding framework got overhauled recently
and you'll have rebasing difficulties if you don't work in the new
framework.

J
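To make the FSAVE versus FXSAVE point concrete, the following C sketch lays out the two memory images side by side. It assumes the 32-bit protected-mode FSAVE format (108 bytes) and the 64-bit-mode FXSAVE area (512 bytes) as documented in the vendor manuals. The struct names `fsave32_image` and `fxsave64_area` are hypothetical, and the structs only model the layout; they do not execute either instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Image written by FSAVE in its 32-bit protected-mode format.
   It holds x87 state only: there is no slot for XMM registers. */
typedef struct {
    uint32_t env[7];      /* control/status/tag words, last instruction
                             and operand pointers: 28 bytes */
    uint8_t  st[8][10];   /* ST0..ST7, 80 bits each */
} fsave32_image;

/* Area written by FXSAVE in 64-bit mode (must be 16-byte aligned
   when the real instruction is used). */
typedef struct {
    uint8_t header[32];   /* FCW, FSW, FTW, FOP, FIP, FDP, MXCSR, MXCSR_MASK */
    uint8_t st[8][16];    /* ST0..ST7 / MM0..MM7, each padded to 16 bytes */
    uint8_t xmm[16][16];  /* XMM0..XMM15 */
    uint8_t reserved[96]; /* reserved / available to software */
} fxsave64_area;

/* Compile-time checks of the sizes and of where the XMM block starts. */
static_assert(sizeof(fsave32_image) == 108, "FSAVE image is 108 bytes");
static_assert(sizeof(fxsave64_area) == 512, "FXSAVE area is 512 bytes");
static_assert(offsetof(fxsave64_area, xmm) == 160, "XMM0 starts at byte 160");
```

The absence of any `xmm` member in the first struct is the whole argument in layout form: a save/restore pair built on FSAVE/FRSTOR cannot round-trip values that live in XMM registers.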
|
From: Eliot M. <mo...@cs...> - 2012-01-30 23:09:23
|
On 1/30/2012 3:29 AM, Julian Seward wrote:

> I would have thought that FSAVE/FRSTOR in 64-bit mode is essentially
> useless, and the fact that we've not so far needed to implement them
> kind of supports that suspicion. Reason is that they don't save or
> restore the XMM registers. The x86-64 ELF ABI requires floating point
> values to be in the XMM registers; hence FSAVE/FRSTOR don't succeed in
> correctly saving and restoring the CPU's floating point state. What you
> need for that is FXSAVE and FXRSTOR, and they are indeed implemented.

I couldn't find an ELF ABI requirement that floating point values
*must* be in XMM regs versus MMX regs. The difference I could find
is that MMX registers are not saved across calls -- but saving and
restoring might still be useful if you are doing MMX computations ...

Anyway, HotSpot uses them :-) ... They may not be interesting for
*OS* use though ...

Does this seem right?

Regards -- Eliot
|
From: Eliot M. <mo...@cs...> - 2012-01-31 12:04:32
|
On 1/31/2012 4:15 AM, Julian Seward wrote:
>
> (You didn't reply-all on this; I assume that was intended)

I did; thanks ...

>> Your comments on the probable (non)utility of the instruction are
>> suggestive as to why the copied-over x86 code is commented out.
>> Unsurprisingly, simply uncommenting it did not work :-) .
>
> It might be that the 32- and 64-bit versions generate slightly different
> in-memory formats. You'll need to read the Intel docs closely. You may
> find it useful to play with some or all of the following test programs
> (ymmv, #include<disclaimer.h> etc)
>
> ./memcheck/tests/amd64/fxsave-amd64.c
> ./memcheck/tests/x86/fxsave.c
> ./VEX/test/fsave.c
> ./VEX/test/fxsave.c
>
> One thing that's important to understand is that Valgrind simulates
> x87 style FP only to 64 bit precision. Hence the memory image that its
> fsave instruction makes will differ in the lowest, uh, 16 or so mantissa
> bits compared to running on real hardware. Some of the test cases above
> take that into account.

All very helpful, Julian. What I meant was that the commented out
code would not compile. Among other things, it refers to helpers
over in x86 code not visible to the amd64 code I was trying to
modify.

You are correct that the (AMD) docs to which I referred describe
several different in-memory formats and it will be necessary to
determine which one happens in 64-bit mode. That is something I
think I had best verify on an actual machine as well, since the
docs are not simple to follow. In particular I want to make sure
that I understand the effect of the 0x66 size prefix on these
instructions!

Regards -- Eliot
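The precision remark quoted above can be illustrated with a small C sketch that builds the 10-byte x87 extended-precision image of a value held only as a 64-bit double. This is an illustration under stated assumptions, not Valgrind's code: it assumes IEEE 754 binary64 doubles, handles only zero and normal numbers, and the function name `double_to_x87_image` is hypothetical. A double carries 53 significand bits against 64 in the extended format, so in this model the lowest 11 mantissa bits of the image are always zero, whereas real hardware could hold non-zero bits there.

```c
#include <stdint.h>
#include <string.h>

/* Write the little-endian 80-bit extended image of x into out[0..9].
   Returns 0 on success, -1 for inputs this sketch does not handle
   (subnormals, infinities, NaNs). */
int double_to_x87_image(double x, uint8_t out[10])
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* raw binary64 encoding */

    uint16_t sign  = (uint16_t)(bits >> 63);
    uint16_t exp11 = (uint16_t)((bits >> 52) & 0x7FF);
    uint64_t frac  = bits & ((1ULL << 52) - 1); /* 52 explicit fraction bits */

    uint64_t mant;   /* 64-bit significand with explicit integer bit */
    uint16_t exp15;  /* 15-bit biased exponent */
    if (exp11 == 0 && frac == 0) {
        mant = 0;                               /* +0.0 or -0.0 */
        exp15 = 0;
    } else if (exp11 == 0 || exp11 == 0x7FF) {
        return -1;                              /* not handled here */
    } else {
        mant  = (1ULL << 63) | (frac << 11);    /* low 11 bits stay zero */
        exp15 = (uint16_t)(exp11 - 1023 + 16383); /* rebias the exponent */
    }

    uint16_t sign_exp = (uint16_t)((sign << 15) | exp15);
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(mant >> (8 * i));
    out[8] = (uint8_t)(sign_exp & 0xFF);
    out[9] = (uint8_t)(sign_exp >> 8);
    return 0;
}
```

A test that compares a simulated fsave image against real hardware would therefore need to mask or ignore those low mantissa bits, which is presumably what "take that into account" refers to.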