From: Florian K. <br...@ac...> - 2013-01-31 02:19:54
|
We should change the definition of HReg to make mixups of integers (typically register number) and HReg less likely. Like so: Index: VEX/priv/host_generic_regs.h =================================================================== --- VEX/priv/host_generic_regs.h (revision 2671) +++ VEX/priv/host_generic_regs.h (working copy) @@ -116,22 +116,22 @@ occupy 24 bits. */ if (r24 != regno) vpanic("mkHReg: regno exceeds 2^24"); - return regno | (((UInt)rc) << 28) | (virtual ? (1<<24) : 0); + return (regno << 5) | (((UInt)rc) << 1) | (virtual ? 1 : 0); } static inline HRegClass hregClass ( HReg r ) { UInt rc = r; - rc = (rc >> 28) & 0x0F; + rc = (rc >> 1) & 0x0F; vassert(rc >= HRcInt32 && rc <= HRcVec128); return (HRegClass)rc; } static inline UInt hregNumber ( HReg r ) { - return ((UInt)r) & 0x00FFFFFF; + return ((UInt)r) >> 5; } static inline Bool hregIsVirtual ( HReg r ) { - return toBool(((UInt)r) & (1<<24)); + return toBool(((UInt)r) & 1); } While it would be preferred to find such mixups so at compile time, I cannot think of a non-intrusive way to do it. So.. the above patch is most likely to find those mixups at runtime as assigning a HReg to an integer will no longer give the register number in the low order bits. Florian |
From: Julian S. <js...@ac...> - 2013-01-31 08:38:36
|
On 01/31/2013 03:19 AM, Florian Krohm wrote: > We should change the definition of HReg to make mixups of integers > (typically register number) and HReg less likely. Yes. > While it would be preferred to find such mixups so at compile time, I > cannot think of a non-intrusive way to do it. Didn't you have some previous proposal that did this? typedef struct { UInt r; } HReg; so that the typechecker can find all mixups? Changing the encoding as you suggest is also possible, but the failures will appear as incorrect generated code, which is a lot harder to diagnose than compiler errors. What's the objection to the struct scheme? Provided that gcc can inline suitably, so that there is no extra performance overhead -- important since regalloc and hence HReg related stuff is the biggest JIT cost -- I would be happy with it. But I'd also be happy with your different-encoding scheme too, if you prefer that. The current situation is for sure not good. J |
From: Florian K. <br...@ac...> - 2013-01-31 16:50:20
|
On 01/31/2013 03:37 AM, Julian Seward wrote: > > Didn't you have some previous proposal that did this? > > typedef struct { UInt r; } HReg; > > so that the typechecker can find all mixups? Yes. The thing is that we compare HREgs for equality using == or != and you cannot do that if HReg is a struct. Won't compile. We do that 91 times in current code. If we want to go this route we need to define a function sameHReg or so and change all comparisons to call that function. That is what I meant that it will be somewhat intrusive. If the function is inlined it probably will generate the same quality of code although we would have to make sure. > But I'd also be happy with your different-encoding scheme too, if you > prefer that. The current situation is for sure not good. > Hmm. I made the encoding change and ran regtest on x86-64 with no new regression. However, there are still two issues in amd64 insn selection. I must have overlooked them first time around..: priv/host_amd64_isel.c:3446:7: error: invalid initializer priv/host_amd64_isel.c:3447:43: error: invalid operands to binary ^ (have 'HReg' and 'int') Hmm, maybe we should bite the bullet and make HReg a struct after all. Let me see how generated code looks like.... Florian |
From: Florian K. <br...@ac...> - 2013-02-01 20:27:00
Attachments:
hreg-wrapper-patch
|
On 01/31/2013 11:50 AM, Florian Krohm wrote: > > Hmm, maybe we should bite the bullet and make HReg a struct after all. > Let me see how generated code looks like.... > I looked at two implementation choices: 1) make HReg a proper struct typedef struct { UInt regno : 24; HRegClass class : 4; Bool virtual : 1; } HReg; 2) Wrap the current register encoding into a struct: typedef struct { UInt reg; } HReg; For both choices a function sameHReg will have to be introduced to compare two HRegs for equality. In the case of a proper struct (#1 above) that function will do this: return toBool(r1.regno == r2.regno && r1.class == r2.class && r1.virtual == r2.virtual) That generates: movl %esi, %edx xorl %eax, %eax xorl %edi, %edx andl $16777215, %edx jne .L2 xorq %rsi, %rdi testl $520093696, %edi sete %al .L2: rep ret as opposed to cmpl %edi, %esi sete %al ret for #2 where we simply return toBool(r1.reg == r2.reg) Because comparison gets used a lot in the register allocator I chose #2. Patch attached and regtested on x86-x64, s390, and ppc with no new regressions. Florian |
From: Greg P. <gp...@ap...> - 2013-02-01 22:52:13
|
On Feb 1, 2013, at 12:26 PM, Florian Krohm <br...@ac...> wrote: > I looked at two implementation choices: > > 1) make HReg a proper struct > > typedef struct { > UInt regno : 24; > HRegClass class : 4; > Bool virtual : 1; > } HReg; > > 2) Wrap the current register encoding into a struct: > > typedef struct { > UInt reg; > } HReg; > > Because comparison gets used a lot in the register allocator I chose #2. > Patch attached and regtested on x86-x64, s390, and ppc with no new > regressions. You can optimize #1. First, make sure the bitfield spans all 32 bits. (I'd also recommend not using the C++ keywords "class" and "virtual" in the names.) typedef struct { UInt regno : 27; HRegClass regclass : 4; Bool isvirtual : 1; } HReg; Second, write the comparison function with memcmp(). Your compiler optimizer should be able to transform the call into a single compare. (Curiously, `clang -Os` was unable to optimize the field-by-field comparison, even after expanding the struct to 32 bits.) -- Greg Parker gp...@ap... Runtime Wrangler |
From: Florian K. <br...@ac...> - 2013-02-02 15:46:41
|
On 02/01/2013 05:52 PM, Greg Parker wrote: > > First, make sure the bitfield spans all 32 bits. (I'd also recommend not using the C++ keywords "class" and "virtual" in the names.) > > typedef struct { > UInt regno : 27; > HRegClass regclass : 4; > Bool isvirtual : 1; > } HReg; > > Second, write the comparison function with memcmp(). Your compiler optimizer should be able to transform the call into a single compare. > > (Curiously, `clang -Os` was unable to optimize the field-by-field comparison, even after expanding the struct to 32 bits.) > > Thanks for the pointers. Unfortunately, neither of the suggestions work. With gcc 4.6.3 -O2 that is. Comparing two HRegs field by field will actually generate worse code than before (where regno was only 24 bits wide). And using __builtin_memcmp will generate a call into memcmp. So we get an unresolved reference as we don't link against glibc.. I tried a few other optimisation options -O3, -Os, but they didn't make a difference. Florian |
From: Eric P. <eri...@or...> - 2013-02-03 12:03:44
|
Le 01/02/2013 21:26, Florian Krohm a écrit : > On 01/31/2013 11:50 AM, Florian Krohm wrote: >> Hmm, maybe we should bite the bullet and make HReg a struct after all. >> Let me see how generated code looks like.... >> > I looked at two implementation choices: > > 1) make HReg a proper struct > > typedef struct { > UInt regno : 24; > HRegClass class : 4; > Bool virtual : 1; > } HReg; and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ? (assuming sizeof(HReg)==sizeof(UInt)) A+ |
From: John R. <jr...@bi...> - 2013-02-03 15:46:20
|
>> typedef struct { >> UInt regno : 24; >> HRegClass class : 4; >> Bool virtual : 1; >> } HReg; > and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ? > (assuming sizeof(HReg)==sizeof(UInt)) Here that definition of sameHReg provides a bug, because the declaration of HReg has 3 uninitialized bits whose value is not controlled. To fix the bug, increase the width of some field(s) of HReg until the widths sum to 32 bits (the bit width of UInt.) -- |
From: Дмитрий Д. <di...@gm...> - 2013-02-03 12:48:18
|
Fedora 17/x64 gcc-4.7.2 -O2 -S tst.c typedef struct { unsigned regno : 27; unsigned class : 4; unsigned virtual : 1; } HReg; static inline int cmp_Hreg(const HReg* lhs, const HReg* rhs) { const int a = *(int*)lhs; const int b = *(int*)rhs; return a == b; } int foo(const HReg* lhs, const HReg* rhs) { return cmp_Hreg(lhs, rhs); } .globl foo .type foo, @function foo: .LFB1: .cfi_startproc movl (%rsi), %eax cmpl %eax, (%rdi) sete %al movzbl %al, %eax ret Tnanks, Dmitry 2013/2/3 Eric Pouech <eri...@or...>: > Le 01/02/2013 21:26, Florian Krohm a écrit : >> On 01/31/2013 11:50 AM, Florian Krohm wrote: >>> Hmm, maybe we should bite the bullet and make HReg a struct after all. >>> Let me see how generated code looks like.... >>> >> I looked at two implementation choices: >> >> 1) make HReg a proper struct >> >> typedef struct { >> UInt regno : 24; >> HRegClass class : 4; >> Bool virtual : 1; >> } HReg; > and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ? > (assuming sizeof(HReg)==sizeof(UInt)) > A+ > > > ------------------------------------------------------------------------------ > Everyone hates slow websites. So do we. > Make your web apps faster with AppDynamics > Download AppDynamics Lite for free today: > http://p.sf.net/sfu/appdyn_d2d_jan > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
From: Patrick J. L. <lop...@gm...> - 2013-02-04 17:25:57
|
Very bad idea. Accessing pointers via the "wrong" type is undefined behavior, and the compiler is free to assume you do not do that. In simple cases, GCC with -Wall will warn you about this ("Dereferencing type-punned pointer will break strict-aliasing rules"). Here is a simple test program illustrating the principle: ===== #include <stdio.h> struct A { int x; }; struct B { int x; }; int foo(struct A *a, struct B *b) { a->x = 1; b->x = 2; return a->x; } int main(int argc, char *argv[]) { struct A a; printf("%d\n", foo(&a, (struct B *)&a)); return 0; } ===== Compile this with optimization and it will probably print "1"; compile without optimization, and it will probably print "2". (I say "probably" because in the presence of undefined behavior the compiler is free to emit any code whatsoever.) Never access an object via the wrong pointer type. The only exception is "char *", which can alias anything by definition. (You can also use a union to access the same memory as different types. I believe this takes you into the realm of implementation-defined behavior. That is also bad, in my opinion, but it is not the disaster of undefined behavior.) Speaking of "char *", the memcmp() proposal is also a bad idea. Compilers are free to insert padding into structures however they like. Are you sure no compiler on any platform ever would choose to pad this structure to 64 bits? In that case, memcmp() would be comparing uninitialized bits. If you want to go the struct route, just compare the structs field-wise and convince the compiler writers to optimize it. Or maybe use a union if you want to compare the structures as integers, with assertions that everything has the size you expect. Finally, as long as I am on my soap box... In my humble opinion, it is generally a bad idea to use memcmp(), memcpy(), or memset() on anything other than actual byte arrays. To zero-initialize a structure, instead of using memset(), just use: struct HReg foo = { }; This is shorter to write, easier to read, and it will properly initialize floats/doubles/pointers, even on platforms that do not use all-zero as the byte representation for 0.0 or NULL. To reinitialize a structure to zero: static const struct HReg empty = { }; struct HReg foo; ... foo = empty; Any compiler worth its salt will generate optimal code for these. - Pat On Sun, Feb 3, 2013 at 4:47 AM, Дмитрий Дьяченко <di...@gm...> wrote: > Fedora 17/x64 > gcc-4.7.2 -O2 -S tst.c > > typedef struct { > unsigned regno : 27; > unsigned class : 4; > unsigned virtual : 1; > } HReg; > > static inline int cmp_Hreg(const HReg* lhs, const HReg* rhs) > { > const int a = *(int*)lhs; > const int b = *(int*)rhs; > return a == b; > } > > int foo(const HReg* lhs, const HReg* rhs) > { > return cmp_Hreg(lhs, rhs); > } > > > .globl foo > .type foo, @function > foo: > .LFB1: > .cfi_startproc > movl (%rsi), %eax > cmpl %eax, (%rdi) > sete %al > movzbl %al, %eax > ret > > > Tnanks, > Dmitry > > 2013/2/3 Eric Pouech <eri...@or...>: >> Le 01/02/2013 21:26, Florian Krohm a écrit : >>> On 01/31/2013 11:50 AM, Florian Krohm wrote: >>>> Hmm, maybe we should bite the bullet and make HReg a struct after all. >>>> Let me see how generated code looks like.... >>>> >>> I looked at two implementation choices: >>> >>> 1) make HReg a proper struct >>> >>> typedef struct { >>> UInt regno : 24; >>> HRegClass class : 4; >>> Bool virtual : 1; >>> } HReg; >> and what does sameHReg: toBool(*(UInt*)&r1 == *(UInt*)&r2)) provides ? >> (assuming sizeof(HReg)==sizeof(UInt)) >> A+ >> >> >> ------------------------------------------------------------------------------ >> Everyone hates slow websites. So do we. >> Make your web apps faster with AppDynamics >> Download AppDynamics Lite for free today: >> http://p.sf.net/sfu/appdyn_d2d_jan >> _______________________________________________ >> Valgrind-developers mailing list >> Val...@li... >> https://lists.sourceforge.net/lists/listinfo/valgrind-developers > > ------------------------------------------------------------------------------ > Everyone hates slow websites. So do we. > Make your web apps faster with AppDynamics > Download AppDynamics Lite for free today: > http://p.sf.net/sfu/appdyn_d2d_jan > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
From: John R. <jr...@bi...> - 2013-02-11 01:06:13
|
On 02/04/2013, Patrick J. LoPresti wrote: > Finally, as long as I am on my soap box... In my humble opinion, it > is generally a bad idea to use memcmp(), memcpy(), or memset() on > anything other than actual byte arrays. To zero-initialize a > structure, instead of using memset(), just use: > > struct HReg foo = { }; > > This is shorter to write, easier to read, and it will properly > initialize floats/doubles/pointers, even on platforms that do not use > all-zero as the byte representation for 0.0 or NULL. > > To reinitialize a structure to zero: > > static const struct HReg empty = { }; > struct HReg foo; > ... > foo = empty; > > Any compiler worth its salt will generate optimal code for these. How do you recommend to (re-)initialize an array of structs to zero, where the length of the array varies at runtime? Which platforms do not use all-zero as the representation of 0.0 ? In IEEE 794, -0.0 uses 0x800...; but -0.0 causes various trouble. The last instance I know where memset(,0,) did not work was the Transputer, which had signed pointers and used 0x80000000 as NULL. The 4KB of on-chip RAM was 0x80000000 to 0x80001000. The Transputer died in the early 1990s. -- |
From: Tom H. <to...@co...> - 2013-02-04 17:45:53
|
On 04/02/13 17:25, Patrick J. LoPresti wrote: > Never access an object via the wrong pointer type. The only exception > is "char *", which can alias anything by definition. (You can also > use a union to access the same memory as different types. I believe > this takes you into the realm of implementation-defined behavior. > That is also bad, in my opinion, but it is not the disaster of > undefined behavior.) The C standard says you can only read a union through the member it was last written through. My guess would be that doing otherwise is invokes undefined behaviour, but you could be right that is is "just" implementation defined. Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
From: Patrick J. L. <lop...@gm...> - 2013-02-04 18:03:21
|
Well, this appears to be a bit complicated. According to the GCC manual, using unions for "type punning" is permitted, provided the underlying type really is a union: <http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict_002daliasing-879> On the other hand, a little searching on StackOverflow suggests that this is technically still undefined behavior. So a union will work on GCC, which probably means it will work on Clang and the Intel compiler... But "by the book", it's still no good. Whether this is good enough for Valgrind is up to you :-) - Pat On Mon, Feb 4, 2013 at 9:45 AM, Tom Hughes <to...@co...> wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: > >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > > The C standard says you can only read a union through the member it was last > written through. My guess would be that doing otherwise is invokes undefined > behaviour, but you could be right that is is "just" implementation defined. > > Tom > > -- > Tom Hughes (to...@co...) > http://compton.nu/ |
From: Eliot M. <mo...@cs...> - 2013-02-04 20:11:28
|
On 2/4/2013 12:45 PM, Tom Hughes wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: > >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > The C standard says you can only read a union through the member it was > last written through. My guess would be that doing otherwise is invokes > undefined behaviour, but you could be right that is is "just" > implementation defined. Of course it is sometimes useful. For example, if you wish to serial and IEEE floating point number into a file or over a pipe or network connection, you need to reduce it to bytes and to control the order. From the standpoint of general C, it is "undefined" since they cannot give a definition apart from the machine (and apart from any given compiler's rules about how it lays out struct fields, etc.), but I agree that "implementation defined" is a more useful way to think of it. -- Eliot Moss |
From: Greg P. <gp...@ap...> - 2013-02-04 18:32:36
|
On Feb 4, 2013, at 9:45 AM, Tom Hughes <to...@co...> wrote: > On 04/02/13 17:25, Patrick J. LoPresti wrote: >> Never access an object via the wrong pointer type. The only exception >> is "char *", which can alias anything by definition. (You can also >> use a union to access the same memory as different types. I believe >> this takes you into the realm of implementation-defined behavior. >> That is also bad, in my opinion, but it is not the disaster of >> undefined behavior.) > > The C standard says you can only read a union through the member it was > last written through. My guess would be that doing otherwise is invokes > undefined behaviour, but you could be right that is is "just" > implementation defined. C99 fixed this. Type punning through a union is now officially supported. (The result depends on the implementation-defined bit representation of the objects of course, but I expect valgrind already requires the implementation to be two's-complement integers and IEEE float.) valgrind probably performs formally-undefined aliasing and type punning everywhere anyway, which is why it builds with -fno-strict-aliasing. -- Greg Parker gp...@ap... Runtime Wrangler |