|
From: Florian K. <fl...@ei...> - 2013-10-04 17:47:46
Hi,
this testcase guards itself like so:
prereq: test -x tm1 && ../../../tests/ amd64-avx
But that is not good enough. My laptop has the AVX feature according to
x86_amd64_features but it certainly does not have transactional memory
insns.
Intel docs say:
Note: Applications using instructions from the RTM subset of
Intel TSX extension need to guard the code by checking the
CPUID.(EAX=07H, ECX=0H).EBX.RTM[bit 11]==1.
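A minimal sketch of the check the Intel note describes, separate from the patch itself. The helper names `ebx_has_rtm` and `host_has_rtm` are made up for illustration, and the code assumes a GCC- or Clang-style `<cpuid.h>`; on a non-x86 build the query simply reports 0.

```c
#include <assert.h>  /* only needed for the sanity checks in the note below */

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed available) */
#endif

/* RTM is reported in CPUID.(EAX=07H, ECX=0H):EBX, bit 11. */
#define CPUID7_EBX_RTM (1u << 11)

/* Pure helper (hypothetical name): does a leaf-7 EBX value advertise RTM? */
int ebx_has_rtm(unsigned int ebx)
{
    return (ebx & CPUID7_EBX_RTM) != 0;
}

/* Query the host (hypothetical name). Returns 1 if RTM is advertised,
   0 otherwise, including on CPUs that do not implement leaf 7. */
int host_has_rtm(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int a = 0, b = 0, c = 0, d = 0;

    /* Leaf 7 must exist before it can be queried. */
    if (__get_cpuid_max(0, 0) < 7)
        return 0;

    /* Sub-leaf 0 has to be selected explicitly through ECX. */
    __cpuid_count(7, 0, a, b, c, d);
    return ebx_has_rtm(b);
#else
    return 0;
#endif
}
```

With these definitions `assert(ebx_has_rtm(1u << 11))` holds, while an EBX value with only some other bit set, such as `1u << 5`, does not count as RTM. The point of the sketch is that ECX is set to 0 explicitly and the maximum supported leaf is checked first.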
OK, let's have a new feature test: amd64-tsx.
Can somebody with x86-64 foo look at the patch below and tell me whether
I got it right?
Thanks,
Florian
Index: tests/x86_amd64_features.c
===================================================================
--- tests/x86_amd64_features.c (revision 13617)
+++ tests/x86_amd64_features.c (working copy)
@@ -70,7 +70,7 @@
 static Bool go(char* cpu)
 {
-   unsigned int level = 0, cmask = 0, dmask = 0, a, b, c, d;
+   unsigned int level = 0, bmask = 0, cmask = 0, dmask = 0, a, b, c, d;
    Bool require_amd = False;
    Bool require_xgetbv = False;
    if ( strcmp( cpu, "x86-fpu" ) == 0 ) {
@@ -125,13 +125,21 @@
       level = 1;
       cmask = (1 << 27) | (1 << 28);
       require_xgetbv = True;
+   } else if ( strcmp( cpu, "amd64-tsx" ) == 0 ) {
+      level = 7;
+      bmask = 1 << 11;
+      require_xgetbv = True;
 #endif
    } else {
       return UNRECOGNISED_FEATURE;
    }
-   assert( !(cmask != 0 && dmask != 0) );
-   assert( !(cmask == 0 && dmask == 0) );
+   /* Only one mask should be != 0 */
+   int count = 0;
+   if (bmask) ++count;
+   if (cmask) ++count;
+   if (dmask) ++count;
+   assert(count == 1);
    if (require_amd && !vendorStringEquals("AuthenticAMD"))
       return FEATURE_NOT_PRESENT;
@@ -154,6 +162,12 @@
          else
             return FEATURE_PRESENT;
       }
+      if (bmask > 0 && (b & bmask) == bmask) {
+         if (require_xgetbv && !have_xgetbv())
+            return FEATURE_NOT_PRESENT;
+         else
+            return FEATURE_PRESENT;
+      }
    }
    return FEATURE_NOT_PRESENT;
 }
|
|
From: Mark W. <mj...@re...> - 2013-10-04 21:45:15
On Fri, 2013-10-04 at 19:47 +0200, Florian Krohm wrote:
> this testcase guards itself like so:
>
> prereq: test -x tm1 && ../../../tests/ amd64-avx
>
> But that is not good enough. My laptop has the AVX feature according to
> x86_amd64_features but it certainly does not have transactional memory
> insns.
The funny thing with valgrind is that it can emulate tm even if the host
doesn't (just like it emulates avx2 if you just have avx).
But there is a typo in VEX/priv/guest_amd64_toIR.c:
diff --git a/priv/guest_amd64_toIR.c b/priv/guest_amd64_toIR.c
index a29e175..c421007 100644
--- a/priv/guest_amd64_toIR.c
+++ b/priv/guest_amd64_toIR.c
@@ -20067,7 +20067,7 @@ Long dis_ESC_NONE (
       }
       /* BEGIN HACKY SUPPORT FOR xbegin */
       if (modrm == 0xF8 && !have66orF2orF3(pfx) && sz == 4
-          && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX2)) {
+          && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX)) {
          delta++; /* mod/rm byte */
          d64 = getSDisp(4,delta);
          delta += 4;
@@ -20725,7 +20725,7 @@ Long dis_ESC_0F (
       }
       /* BEGIN HACKY SUPPORT FOR xtest */
       /* 0F 01 D6 = XTEST */
-      if (modrm == 0xD6 && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX2)) {
+      if (modrm == 0xD6 && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX)) {
          /* Sets ZF because there never is a transaction, and all
             CF, OF, SF, PF and AF are always cleared by xtest. */
          delta += 1;
That should make it work as is with the prereq given in the test.
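To illustrate what the xtest comment in the hunk above describes, here is a small standalone sketch of the flag result when no transaction is ever active. It is not the VEX implementation; the function name `xtest_flags_no_transaction` and the `FLAG_*` macros are invented for the example, and only the standard RFLAGS bit positions are assumed.

```c
#include <assert.h>  /* only needed for the sanity checks in the note below */
#include <stdint.h>

/* Bit positions of the status flags in RFLAGS. */
#define FLAG_CF (1ull << 0)
#define FLAG_PF (1ull << 2)
#define FLAG_AF (1ull << 4)
#define FLAG_ZF (1ull << 6)
#define FLAG_SF (1ull << 7)
#define FLAG_OF (1ull << 11)

/* Hypothetical helper: flag state after xtest when the emulator never
   runs a transaction. ZF is set ("not in a transaction"), the other
   five status flags are cleared, and all remaining bits are preserved. */
uint64_t xtest_flags_no_transaction(uint64_t rflags)
{
    const uint64_t cleared = FLAG_CF | FLAG_PF | FLAG_AF | FLAG_SF | FLAG_OF;

    rflags &= ~cleared;  /* CF, PF, AF, SF, OF := 0 */
    rflags |= FLAG_ZF;   /* ZF := 1 */
    return rflags;
}
```

For instance, `assert(xtest_flags_no_transaction(0) == FLAG_ZF)` holds, and bits that are not status flags pass through unchanged.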
Cheers,
Mark
|
|
From: Florian K. <fl...@ei...> - 2013-10-07 20:04:43
On 10/04/2013 11:44 PM, Mark Wielaard wrote:
> On Fri, 2013-10-04 at 19:47 +0200, Florian Krohm wrote:
>> this testcase guards itself like so:
>>
>> prereq: test -x tm1 && ../../../tests/ amd64-avx
>>
>> But that is not good enough. My laptop has the AVX feature according to
>> x86_amd64_features but it certainly does not have transactional memory
>> insns.
>
> The funny thing with valgrind is that it can emulate tm even if the host
> doesn't (just like it emulates avx2 if you just have avx).
>
> But there is a typo in VEX/priv/guest_amd64_toIR.c:
>
.. snip..
>
> That should make it work as is with the prereq given in the test.
Yes, I confirm your change fixes my problem.
Thx
Florian
|
|
From: Julian S. <js...@ac...> - 2013-10-13 00:34:29
[arriving way too late at the party ..]
On 10/04/2013 11:44 PM, Mark Wielaard wrote:
> The funny thing with valgrind is that it can emulate tm even if the host
> doesn't (just like it emulates avx2 if you just have avx).
>
> But there is a typo in VEX/priv/guest_amd64_toIR.c:
>
> diff --git a/priv/guest_amd64_toIR.c b/priv/guest_amd64_toIR.c
> index a29e175..c421007 100644
> --- a/priv/guest_amd64_toIR.c
> +++ b/priv/guest_amd64_toIR.c
> @@ -20067,7 +20067,7 @@ Long dis_ESC_NONE (
> }
> /* BEGIN HACKY SUPPORT FOR xbegin */
> if (modrm == 0xF8 && !have66orF2orF3(pfx) && sz == 4
> - && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX2)) {
> + && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX)) {
> delta++; /* mod/rm byte */
> d64 = getSDisp(4,delta);
> delta += 4;
So .. _AVX2 was what I intended, in the sense that I wanted only to provide
TM if the host could do AVX2. But if changing to _AVX makes the prereq
problem go away, then fine.
btw I suspect I made this more complex/inaccurate/confusing than it needed
to be by assuming that CPUs that can do AVX2 can also do TM. But that's
not so -- all Haswell-branded CPUs can do AVX2 (IIUC) but only the higher
spec ones can do TM.
J
|
|
From: Mark W. <mj...@re...> - 2013-10-13 09:13:04
> On 10/04/2013 11:44 PM, Mark Wielaard wrote:
>
> > The funny thing with valgrind is that it can emulate tm even if the host
> > doesn't (just like it emulates avx2 if you just have avx).
> >
> > But there is a typo in VEX/priv/guest_amd64_toIR.c:
> >
> > diff --git a/priv/guest_amd64_toIR.c b/priv/guest_amd64_toIR.c
> > index a29e175..c421007 100644
> > --- a/priv/guest_amd64_toIR.c
> > +++ b/priv/guest_amd64_toIR.c
> > @@ -20067,7 +20067,7 @@ Long dis_ESC_NONE (
> > }
> > /* BEGIN HACKY SUPPORT FOR xbegin */
> > if (modrm == 0xF8 && !have66orF2orF3(pfx) && sz == 4
> > - && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX2)) {
> > + && (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX)) {
> > delta++; /* mod/rm byte */
> > d64 = getSDisp(4,delta);
> > delta += 4;
>
> So .. _AVX2 was what I intended, in the sense that I wanted only to provide
> TM if the host could do AVX2. But if changing to _AVX makes the prereq
> problem go away, then fine.
>
> btw I suspect I made this more complex/inaccurate/confusing than it needed
> to be by assuming that CPUs that can do AVX2 can also do TM. But that's
> not so -- all Haswell-branded CPUs can do AVX2 (IIUC) but only the higher
> spec ones can do TM.
I think we need to rethink our VEX capabilities vs host capabilities a
bit (after 3.9.0). Currently the change makes the prereq problem go away
since prereq checks the host capabilities. But those do not translate
one-to-one to valgrind capabilities. Valgrind (on x86) advertises
its own capabilities through fixed emulated cpuid "families" that only
map coarsely onto the host cpuid. But in guest_toIR we check against the
individual host hardware capabilities. We need to connect those
a little better (and IMHO give the user a way to select which ones
they want - e.g. we can always emulate TM currently, but the user
might not always want it, even on hardware that does have it).
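Purely as an illustration of that idea, and not an existing Valgrind or VEX interface: a sketch in which the capabilities shown to the guest are whatever can be emulated, minus whatever the user has switched off. The `CAP_*` constants and `guest_caps` are hypothetical names invented for this example.

```c
#include <assert.h>  /* only needed for the sanity checks in the note below */

/* Hypothetical capability bits, for illustration only. */
enum {
    CAP_AVX  = 1u << 0,
    CAP_AVX2 = 1u << 1,
    CAP_TM   = 1u << 2
};

/* Hypothetical helper: the set the guest gets to see is the set that
   can be emulated with anything the user disabled masked out. */
unsigned int guest_caps(unsigned int emulatable, unsigned int user_disabled)
{
    return emulatable & ~user_disabled;
}
```

With this, `guest_caps(CAP_AVX | CAP_AVX2 | CAP_TM, CAP_TM)` yields `CAP_AVX | CAP_AVX2`, so TM stays hidden from the guest even though it could be emulated.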
Cheers,
Mark
|