From: John R. <jr...@bi...> - 2011-10-08 22:40:47
LibVEX_Alloc is confused about 4- versus 8-byte alignment, and it matters
on some hardware such as ARM. (On most x86/x86_64 misaligned access works
but costs about one extra cycle per access.)
On a 32-bit machine, the results of LibVEX_Alloc (VEX/pub/libvex.h)
are only 4-aligned:
   Int ALIGN;
   ALIGN = sizeof(void*) - 1;  // 3 for 32-bit, 7 for 64-bit
   nbytes = (nbytes + ALIGN) & ~ALIGN;
   curr = private_LibVEX_alloc_curr;
   next = curr + nbytes;
   if (next >= private_LibVEX_alloc_last)
      private_LibVEX_alloc_OOM();
   private_LibVEX_alloc_curr = next;
   return curr;
yet the code in VEX/priv/main_util.c had intentions for 8-aligned:
static HChar temporary[N_TEMPORARY_BYTES] __attribute__((aligned(8)));
static HChar permanent[N_PERMANENT_BYTES] __attribute__((aligned(8)));
so gcc-4.6 on armv5te (gcc -O) emits code that assumes 8-alignment, without proof of safety:
(gdb) x/23i IRStmt_IMark
0x380e8804 <IRStmt_IMark>: push {r4, r5, r6, lr}
0x380e8808 <IRStmt_IMark+4>: ldr r5, [pc, #56] ; &private_LibVEX_alloc_curr
0x380e880c <IRStmt_IMark+8>: ldr r4, [pc, #56] ; &private_LibVEX_alloc_last
0x380e8810 <IRStmt_IMark+12>: ldr r12, [r5] ; private_LibVEX_alloc_curr
0x380e8814 <IRStmt_IMark+16>: ldr r6, [r4] ; private_LibVEX_alloc_last
0x380e8818 <IRStmt_IMark+20>: add r4, r12, #24
0x380e881c <IRStmt_IMark+24>: cmp r4, r6
0x380e8820 <IRStmt_IMark+28>: bcs 0x380e8844 <IRStmt_IMark+64>
0x380e8824 <IRStmt_IMark+32>: strd r0, [r12, #8] ; garbage unless 8-aligned
On armv5te this works so far for memcheck (regression tests pass except for
minor errors in debug info, ldrex/strex atomics, floating-point hardware, and
mmap of a fractional page [beginning of .data]), but it fails miserably for callgrind.
I'm going to try forcing 8-alignment by:
   ALIGN = 8 - 1;
--