|
From: Florian K. <br...@ac...> - 2013-01-31 02:19:54
|
We should change the definition of HReg to make mixups of integers
(typically register number) and HReg less likely. Like so:
Index: VEX/priv/host_generic_regs.h
===================================================================
--- VEX/priv/host_generic_regs.h (revision 2671)
+++ VEX/priv/host_generic_regs.h (working copy)
@@ -116,22 +116,22 @@
occupy 24 bits. */
if (r24 != regno)
vpanic("mkHReg: regno exceeds 2^24");
- return regno | (((UInt)rc) << 28) | (virtual ? (1<<24) : 0);
+ return (regno << 5) | (((UInt)rc) << 1) | (virtual ? 1 : 0);
}
static inline HRegClass hregClass ( HReg r ) {
UInt rc = r;
- rc = (rc >> 28) & 0x0F;
+ rc = (rc >> 1) & 0x0F;
vassert(rc >= HRcInt32 && rc <= HRcVec128);
return (HRegClass)rc;
}
static inline UInt hregNumber ( HReg r ) {
- return ((UInt)r) & 0x00FFFFFF;
+ return ((UInt)r) >> 5;
}
static inline Bool hregIsVirtual ( HReg r ) {
- return toBool(((UInt)r) & (1<<24));
+ return toBool(((UInt)r) & 1);
}
While it would be preferred to find such mixups so at compile time, I
cannot think of a non-intrusive way to do it. So.. the above patch is
most likely to find those mixups at runtime as assigning a HReg to an
integer will no longer give the register number in the low order bits.
Florian
|
|
From: Julian S. <js...@ac...> - 2013-01-31 08:38:36
|
On 01/31/2013 03:19 AM, Florian Krohm wrote:
> We should change the definition of HReg to make mixups of integers
> (typically register number) and HReg less likely.
Yes.
> While it would be preferred to find such mixups so at compile time, I
> cannot think of a non-intrusive way to do it.
Didn't you have some previous proposal that did this?
typedef struct { UInt r; } HReg;
so that the typechecker can find all mixups? Changing the encoding
as you suggest is also possible, but the failures will appear as incorrect
generated code, which is a lot harder to diagnose than compiler errors.
What's the objection to the struct scheme? Provided that gcc can inline
suitably, so that there is no extra performance overhead -- important since
regalloc and hence HReg related stuff is the biggest JIT cost -- I would
be happy with it.
But I'd also be happy with your different-encoding scheme too, if you
prefer that. The current situation is for sure not good.
J
|
|
From: Florian K. <br...@ac...> - 2013-01-31 16:50:20
|
On 01/31/2013 03:37 AM, Julian Seward wrote:
>
> Didn't you have some previous proposal that did this?
>
> typedef struct { UInt r; } HReg;
>
> so that the typechecker can find all mixups?
Yes. The thing is that we compare HREgs for equality using == or != and
you cannot do that if HReg is a struct. Won't compile.
We do that 91 times in current code. If we want to go this route we need
to define a function sameHReg or so and change all comparisons to call
that function. That is what I meant that it will be somewhat intrusive.
If the function is inlined it probably will generate the same quality of
code although we would have to make sure.
> But I'd also be happy with your different-encoding scheme too, if you
> prefer that. The current situation is for sure not good.
>
Hmm. I made the encoding change and ran regtest on x86-64 with no new
regression. However, there are still two issues in amd64 insn selection.
I must have overlooked them first time around..:
priv/host_amd64_isel.c:3446:7: error: invalid initializer
priv/host_amd64_isel.c:3447:43: error: invalid operands to binary ^
(have 'HReg' and 'int')
Hmm, maybe we should bite the bullet and make HReg a struct after all.
Let me see how generated code looks like....
Florian
|
|
From: Florian K. <br...@ac...> - 2013-02-01 20:27:00
Attachments:
hreg-wrapper-patch
|
On 01/31/2013 11:50 AM, Florian Krohm wrote:
>
> Hmm, maybe we should bite the bullet and make HReg a struct after all.
> Let me see how generated code looks like....
>
I looked at two implementation choices:
1) make HReg a proper struct
typedef struct {
UInt regno : 24;
HRegClass class : 4;
Bool virtual : 1;
} HReg;
2) Wrap the current register encoding into a struct:
typedef struct {
UInt reg;
} HReg;
For both choices a function sameHReg will have to be introduced to
compare two HRegs for equality. In the case of a proper struct (#1
above) that function will do this:
return toBool(r1.regno == r2.regno && r1.class == r2.class &&
r1.virtual == r2.virtual)
That generates:
movl %esi, %edx
xorl %eax, %eax
xorl %edi, %edx
andl $16777215, %edx
jne .L2
xorq %rsi, %rdi
testl $520093696, %edi
sete %al
.L2:
rep
ret
as opposed to
cmpl %edi, %esi
sete %al
ret
for #2 where we simply return toBool(r1.reg == r2.reg)
Because comparison gets used a lot in the register allocator I chose #2.
Patch attached and regtested on x86-x64, s390, and ppc with no new
regressions.
Florian
|
|
From: Greg P. <gp...@ap...> - 2013-02-01 22:52:13
|
On Feb 1, 2013, at 12:26 PM, Florian Krohm <br...@ac...> wrote:
> I looked at two implementation choices:
>
> 1) make HReg a proper struct
>
> typedef struct {
> UInt regno : 24;
> HRegClass class : 4;
> Bool virtual : 1;
> } HReg;
>
> 2) Wrap the current register encoding into a struct:
>
> typedef struct {
> UInt reg;
> } HReg;
>
> Because comparison gets used a lot in the register allocator I chose #2.
> Patch attached and regtested on x86-x64, s390, and ppc with no new
> regressions.
You can optimize #1.
First, make sure the bitfield spans all 32 bits. (I'd also recommend not using the C++ keywords "class" and "virtual" in the names.)
typedef struct {
UInt regno : 27;
HRegClass regclass : 4;
Bool isvirtual : 1;
} HReg;
Second, write the comparison function with memcmp(). Your compiler optimizer should be able to transform the call into a single compare.
(Curiously, `clang -Os` was unable to optimize the field-by-field comparison, even after expanding the struct to 32 bits.)
--
Greg Parker gp...@ap... Runtime Wrangler
|
|
From: Florian K. <br...@ac...> - 2013-02-02 15:46:41
|
On 02/01/2013 05:52 PM, Greg Parker wrote:
>
> First, make sure the bitfield spans all 32 bits. (I'd also recommend not using the C++ keywords "class" and "virtual" in the names.)
>
> typedef struct {
> UInt regno : 27;
> HRegClass regclass : 4;
> Bool isvirtual : 1;
> } HReg;
>
> Second, write the comparison function with memcmp(). Your compiler optimizer should be able to transform the call into a single compare.
>
> (Curiously, `clang -Os` was unable to optimize the field-by-field comparison, even after expanding the struct to 32 bits.)
>
>
Thanks for the pointers. Unfortunately, neither of the suggestions work.
With gcc 4.6.3 -O2 that is. Comparing two HRegs field by field will
actually generate worse code than before (where regno was only 24 bits
wide).
And using __builtin_memcmp will generate a call into memcmp. So we get
an unresolved reference as we don't link against glibc..
I tried a few other optimisation options -O3, -Os, but they didn't make
a difference.
Florian
|
|
From: Eric P. <eri...@or...> - 2013-02-03 12:03:44
|
Le 01/02/2013 21:26, Florian Krohm a écrit :
> On 01/31/2013 11:50 AM, Florian Krohm wrote:
>> Hmm, maybe we should bite the bullet and make HReg a struct after all.
>> Let me see how generated code looks like....
>>
> I looked at two implementation choices:
>
> 1) make HReg a proper struct
>
> typedef struct {
> UInt regno : 24;
> HRegClass class : 4;
> Bool virtual : 1;
> } HReg;
and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ?
(assuming sizeof(HReg)==sizeof(UInt))
A+
|
|
From: John R. <jr...@bi...> - 2013-02-03 15:46:20
|
>> typedef struct {
>> UInt regno : 24;
>> HRegClass class : 4;
>> Bool virtual : 1;
>> } HReg;
> and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ?
> (assuming sizeof(HReg)==sizeof(UInt))
Here that definition of sameHReg provides a bug, because the declaration
of HReg has 3 uninitialized bits whose value is not controlled.
To fix the bug, increase the width of some field(s) of HReg until
the widths sum to 32 bits (the bit width of UInt.)
--
|
|
From: Дмитрий Д. <di...@gm...> - 2013-02-03 12:48:18
|
Fedora 17/x64
gcc-4.7.2 -O2 -S tst.c
typedef struct {
unsigned regno : 27;
unsigned class : 4;
unsigned virtual : 1;
} HReg;
static inline int cmp_Hreg(const HReg* lhs, const HReg* rhs)
{
const int a = *(int*)lhs;
const int b = *(int*)rhs;
return a == b;
}
int foo(const HReg* lhs, const HReg* rhs)
{
return cmp_Hreg(lhs, rhs);
}
.globl foo
.type foo, @function
foo:
.LFB1:
.cfi_startproc
movl (%rsi), %eax
cmpl %eax, (%rdi)
sete %al
movzbl %al, %eax
ret
Tnanks,
Dmitry
2013/2/3 Eric Pouech <eri...@or...>:
> Le 01/02/2013 21:26, Florian Krohm a écrit :
>> On 01/31/2013 11:50 AM, Florian Krohm wrote:
>>> Hmm, maybe we should bite the bullet and make HReg a struct after all.
>>> Let me see how generated code looks like....
>>>
>> I looked at two implementation choices:
>>
>> 1) make HReg a proper struct
>>
>> typedef struct {
>> UInt regno : 24;
>> HRegClass class : 4;
>> Bool virtual : 1;
>> } HReg;
> and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ?
> (assuming sizeof(HReg)==sizeof(UInt))
> A+
>
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_jan
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
|
|
From: Patrick J. L. <lop...@gm...> - 2013-02-04 17:25:57
|
Very bad idea. Accessing pointers via the "wrong" type is undefined
behavior, and the compiler is free to assume you do not do that.
In simple cases, GCC with -Wall will warn you about this
("Dereferencing type-punned pointer will break strict-aliasing
rules"). Here is a simple test program illustrating the principle:
=====
#include <stdio.h>
struct A { int x; };
struct B { int x; };
int
foo(struct A *a, struct B *b)
{
a->x = 1;
b->x = 2;
return a->x;
}
int
main(int argc, char *argv[])
{
struct A a;
printf("%d\n", foo(&a, (struct B *)&a));
return 0;
}
=====
Compile this with optimization and it will probably print "1"; compile
without optimization, and it will probably print "2". (I say
"probably" because in the presence of undefined behavior the compiler
is free to emit any code whatsoever.)
Never access an object via the wrong pointer type. The only exception
is "char *", which can alias anything by definition. (You can also
use a union to access the same memory as different types. I believe
this takes you into the realm of implementation-defined behavior.
That is also bad, in my opinion, but it is not the disaster of
undefined behavior.)
Speaking of "char *", the memcmp() proposal is also a bad idea.
Compilers are free to insert padding into structures however they
like. Are you sure no compiler on any platform ever would choose to
pad this structure to 64 bits? In that case, memcmp() would be
comparing uninitialized bits.
If you want to go the struct route, just compare the structs
field-wise and convince the compiler writers to optimize it. Or maybe
use a union if you want to compare the structures as integers, with
assertions that everything has the size you expect.
Finally, as long as I am on my soap box... In my humble opinion, it
is generally a bad idea to use memcmp(), memcpy(), or memset() on
anything other than actual byte arrays. To zero-initialize a
structure, instead of using memset(), just use:
struct HReg foo = { };
This is shorter to write, easier to read, and it will properly
initialize floats/doubles/pointers, even on platforms that do not use
all-zero as the byte representation for 0.0 or NULL.
To reinitialize a structure to zero:
static const struct HReg empty = { };
struct HReg foo;
...
foo = empty;
Any compiler worth its salt will generate optimal code for these.
- Pat
On Sun, Feb 3, 2013 at 4:47 AM, Дмитрий Дьяченко <di...@gm...> wrote:
> Fedora 17/x64
> gcc-4.7.2 -O2 -S tst.c
>
> typedef struct {
> unsigned regno : 27;
> unsigned class : 4;
> unsigned virtual : 1;
> } HReg;
>
> static inline int cmp_Hreg(const HReg* lhs, const HReg* rhs)
> {
> const int a = *(int*)lhs;
> const int b = *(int*)rhs;
> return a == b;
> }
>
> int foo(const HReg* lhs, const HReg* rhs)
> {
> return cmp_Hreg(lhs, rhs);
> }
>
>
> .globl foo
> .type foo, @function
> foo:
> .LFB1:
> .cfi_startproc
> movl (%rsi), %eax
> cmpl %eax, (%rdi)
> sete %al
> movzbl %al, %eax
> ret
>
>
> Tnanks,
> Dmitry
>
> 2013/2/3 Eric Pouech <eri...@or...>:
>> Le 01/02/2013 21:26, Florian Krohm a écrit :
>>> On 01/31/2013 11:50 AM, Florian Krohm wrote:
>>>> Hmm, maybe we should bite the bullet and make HReg a struct after all.
>>>> Let me see how generated code looks like....
>>>>
>>> I looked at two implementation choices:
>>>
>>> 1) make HReg a proper struct
>>>
>>> typedef struct {
>>> UInt regno : 24;
>>> HRegClass class : 4;
>>> Bool virtual : 1;
>>> } HReg;
>> and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ?
>> (assuming sizeof(HReg)==sizeof(UInt))
>> A+
>>
>>
>> ------------------------------------------------------------------------------
>> Everyone hates slow websites. So do we.
>> Make your web apps faster with AppDynamics
>> Download AppDynamics Lite for free today:
>> http://p.sf.net/sfu/appdyn_d2d_jan
>> _______________________________________________
>> Valgrind-developers mailing list
>> Val...@li...
>> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
> ------------------------------------------------------------------------------
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_d2d_jan
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
|
|
From: John R. <jr...@bi...> - 2013-02-11 01:06:13
|
On 02/04/2013, Patrick J. LoPresti wrote:
> Finally, as long as I am on my soap box... In my humble opinion, it
> is generally a bad idea to use memcmp(), memcpy(), or memset() on
> anything other than actual byte arrays. To zero-initialize a
> structure, instead of using memset(), just use:
>
> struct HReg foo = { };
>
> This is shorter to write, easier to read, and it will properly
> initialize floats/doubles/pointers, even on platforms that do not use
> all-zero as the byte representation for 0.0 or NULL.
>
> To reinitialize a structure to zero:
>
> static const struct HReg empty = { };
> struct HReg foo;
> ...
> foo = empty;
>
> Any compiler worth its salt will generate optimal code for these.
How do you recommend to (re-)initialize an array of structs to zero,
where the length of the array varies at runtime?
Which platforms do not use all-zero as the representation of 0.0 ?
In IEEE 794, -0.0 uses 0x800...; but -0.0 causes various trouble.
The last instance I know where memset(,0,) did not work was the Transputer,
which had signed pointers and used 0x80000000 as NULL. The 4KB of
on-chip RAM was 0x80000000 to 0x80001000.
The Transputer died in the early 1990s.
--
|
|
From: Tom H. <to...@co...> - 2013-02-04 17:45:53
|
On 04/02/13 17:25, Patrick J. LoPresti wrote: > Never access an object via the wrong pointer type. The only exception > is "char *", which can alias anything by definition. (You can also > use a union to access the same memory as different types. I believe > this takes you into the realm of implementation-defined behavior. > That is also bad, in my opinion, but it is not the disaster of > undefined behavior.) The C standard says you can only read a union through the member it was last written through. My guess would be that doing otherwise is invokes undefined behaviour, but you could be right that is is "just" implementation defined. Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
|
From: Patrick J. L. <lop...@gm...> - 2013-02-04 18:03:21
|
Well, this appears to be a bit complicated. According to the GCC manual, using unions for "type punning" is permitted, provided the underlying type really is a union: <http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict_002daliasing-879> On the other hand, a little searching on StackOverflow suggests that this is technically still undefined behavior. So a union will work on GCC, which probably means it will work on Clang and the Intel compiler... But "by the book", it's still no good. Whether this is good enough for Valgrind is up to you :-) - Pat On Mon, Feb 4, 2013 at 9:45 AM, Tom Hughes <to...@co...> wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: > >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > > The C standard says you can only read a union through the member it was last > written through. My guess would be that doing otherwise is invokes undefined > behaviour, but you could be right that is is "just" implementation defined. > > Tom > > -- > Tom Hughes (to...@co...) > http://compton.nu/ |
|
From: Eliot M. <mo...@cs...> - 2013-02-04 20:11:28
|
On 2/4/2013 12:45 PM, Tom Hughes wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: > >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > The C standard says you can only read a union through the member it was > last written through. My guess would be that doing otherwise is invokes > undefined behaviour, but you could be right that is is "just" > implementation defined. Of course it is sometimes useful. For example, if you wish to serial and IEEE floating point number into a file or over a pipe or network connection, you need to reduce it to bytes and to control the order. From the standpoint of general C, it is "undefined" since they cannot give a definition apart from the machine (and apart from any given compiler's rules about how it lays out struct fields, etc.), but I agree that "implementation defined" is a more useful way to think of it. -- Eliot Moss |
|
From: Greg P. <gp...@ap...> - 2013-02-04 18:32:36
|
On Feb 4, 2013, at 9:45 AM, Tom Hughes <to...@co...> wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > The C standard says you can only read a union through the member it was > last written through. My guess would be that doing otherwise is invokes > undefined behaviour, but you could be right that is is "just" > implementation defined. C99 fixed this. Type punning through a union is now officially supported. (The result depends on the implementation-defined bit representation of the objects of course, but I expect valgrind already requires the implementation to be two's-complement integers and IEEE float.) valgrind probably performs formally-undefined aliasing and type punning everywhere anyway, which is why it builds with -fno-strict-aliasing. -- Greg Parker gp...@ap... Runtime Wrangler |