From: <sv...@va...> - 2016-08-05 17:22:31
Author: sewardj
Date: Fri Aug 5 18:22:21 2016
New Revision: 15929
Log:
Fix completely bogus array indexing introduced in r15927 -- how did this
ever work? Spotted by UBSAN. Yay UBSAN! Also update comments.
Modified:
trunk/memcheck/mc_translate.c
Modified: trunk/memcheck/mc_translate.c
==============================================================================
--- trunk/memcheck/mc_translate.c (original)
+++ trunk/memcheck/mc_translate.c Fri Aug 5 18:22:21 2016
@@ -6612,35 +6612,35 @@
bz2-32
- 1 111,840 -> 1,702,810; ratio 15.2
- 2 111,840 -> 1,656,644; ratio 14.8
- 3 111,840 -> 1,650,457; ratio 14.7
- 4 111,840 -> 1,649,103; ratio 14.7
- 5 111,840 -> 1,648,655; ratio 14.7
- 6 111,840 -> 1,648,435; ratio 14.7
- 7 111,840 -> 1,648,304; ratio 14.7
- 8 111,840 -> 1,648,304; ratio 14.7
- 10 111,840 -> 1,648,171; ratio 14.7
- 12 111,840 -> 1,648,109 ratio 14.7
- 16 111,840 -> 1,647,947; ratio 14.7
- 32 111,840 -> 1,647,881; ratio 14.7
- inf 111,840 -> 1,647,881; ratio 14.7
+ 1 4,336 (112,212 -> 1,709,473; ratio 15.2)
+ 2 4,336 (112,194 -> 1,669,895; ratio 14.9)
+ 3 4,336 (112,194 -> 1,660,713; ratio 14.8)
+ 4 4,336 (112,194 -> 1,658,555; ratio 14.8)
+ 5 4,336 (112,194 -> 1,655,447; ratio 14.8)
+ 6 4,336 (112,194 -> 1,655,101; ratio 14.8)
+ 7 4,336 (112,194 -> 1,654,858; ratio 14.7)
+ 8 4,336 (112,194 -> 1,654,810; ratio 14.7)
+ 10 4,336 (112,194 -> 1,654,621; ratio 14.7)
+ 12 4,336 (112,194 -> 1,654,678; ratio 14.7)
+ 16 4,336 (112,194 -> 1,654,494; ratio 14.7)
+ 32 4,336 (112,194 -> 1,654,602; ratio 14.7)
+ inf 4,336 (112,194 -> 1,654,602; ratio 14.7)
bz2-64
- 1 106,628 -> 1,811,992; ratio 17.0
- 2 106,628 -> 1,797,805; ratio 16.9
- 3 106,628 -> 1,792,429; ratio 16.8
- 4 106,628 -> 1,791,037; ratio 16.8
- 5 106,628 -> 1,790,929; ratio 16.8
- 6 106,628 -> 1,790,810; ratio 16.8
- 7 106,628 -> 1,790,764; ratio 16.8
- 8 106,628 -> 1,790,764; ratio 16.8
- 10 106,628 -> 1,790,764; ratio 16.8
- 12 106,628 -> 1,790,764; ratio 16.8
- 16 106,628 -> 1,790,701; ratio 16.8
- 32 106,628 -> 1,790,671; ratio 16.8
- inf 106,628 -> 1,790,671; ratio 16.8
+ 1 4,113 (107,329 -> 1,822,171; ratio 17.0)
+ 2 4,113 (107,329 -> 1,806,443; ratio 16.8)
+ 3 4,113 (107,329 -> 1,803,967; ratio 16.8)
+ 4 4,113 (107,329 -> 1,802,785; ratio 16.8)
+ 5 4,113 (107,329 -> 1,802,412; ratio 16.8)
+ 6 4,113 (107,329 -> 1,802,062; ratio 16.8)
+ 7 4,113 (107,329 -> 1,801,976; ratio 16.8)
+ 8 4,113 (107,329 -> 1,801,886; ratio 16.8)
+ 10 4,113 (107,329 -> 1,801,653; ratio 16.8)
+ 12 4,113 (107,329 -> 1,801,526; ratio 16.8)
+ 16 4,113 (107,329 -> 1,801,298; ratio 16.8)
+ 32 4,113 (107,329 -> 1,800,827; ratio 16.8)
+ inf 4,113 (107,329 -> 1,800,827; ratio 16.8)
*/
/* Structs for recording which (helper, guard) pairs we have already
@@ -6734,7 +6734,7 @@
tl_assert(i == n);
if (n == N_TIDYING_PAIRS) {
for (i = 1; i < N_TIDYING_PAIRS; i++) {
- tidyingEnv[n-1] = tidyingEnv[n];
+ tidyingEnv->pairs[i-1] = tidyingEnv->pairs[i];
}
tidyingEnv->pairs[N_TIDYING_PAIRS-1].entry = entry;
tidyingEnv->pairs[N_TIDYING_PAIRS-1].guard = guard;
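For readers outside the Valgrind tree, the repaired slide-back loop can be sketched in isolation; the struct names and capacity below are illustrative stand-ins for memcheck's Pairs type, not the real definitions. The bug was indexing the environment struct itself (tidyingEnv[n-1]) instead of its embedded array:

```c
#define N_PAIRS 4

typedef struct { int entry; int guard; } pair_t;
typedef struct { pair_t pairs[N_PAIRS]; unsigned used; } pairs_t;

/* Append (entry, guard); when the array is full, evict the oldest
   element by sliding everything back one slot -- the loop r15929
   repairs.  Note the indexing is into env->pairs, not env itself. */
static void add_pair(pairs_t *env, int entry, int guard)
{
    if (env->used == N_PAIRS) {
        for (unsigned i = 1; i < N_PAIRS; i++)
            env->pairs[i-1] = env->pairs[i];   /* not env[i-1] = env[i] */
        env->pairs[N_PAIRS-1].entry = entry;
        env->pairs[N_PAIRS-1].guard = guard;
    } else {
        env->pairs[env->used].entry = entry;
        env->pairs[env->used].guard = guard;
        env->used++;
    }
}
```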
From: <sv...@va...> - 2016-08-05 15:15:27
Author: sewardj
Date: Fri Aug 5 16:15:20 2016
New Revision: 15928
Log:
Update.
Modified:
trunk/NEWS
trunk/docs/internals/3_11_BUGSTATUS.txt
Modified: trunk/NEWS
==============================================================================
--- trunk/NEWS (original)
+++ trunk/NEWS Fri Aug 5 16:15:20 2016
@@ -40,6 +40,10 @@
- zlib ELF gABI format with SHF_COMPRESSED flag (gcc option -gz=zlib)
- zlib GNU format with .zdebug sections (gcc option -gz=zlib-gnu)
+* Modest JIT-cost improvements: the cost of instrumenting code blocks
+ for the most common use case (x86_64-linux, Memcheck) has been
+ reduced by 10%-15%.
+
* ==================== FIXED BUGS ====================
The following bugs have been fixed or resolved. Note that "n-i-bz"
@@ -116,6 +120,7 @@
360425 arm64 unsupported instruction ldpsw
== 364435
360519 none/tests/arm64/memory.vgtest might fail with newer gcc
+360574 Wrong parameter type for an ashmem ioctl() call on Android and ARM64
360749 kludge for multiple .rodata sections on Solaris no longer needed
360752 raise the number of reserved fds in m_main.c from 10 to 12
361207 Valgrind does not support the IBM POWER ISA 3.0 instructions, part 2
@@ -136,6 +141,7 @@
get_otrack_shadow_offset_wrk()
365273 Invalid write to stack location reported after signal handler runs
365912 ppc64BE segfault during jm-insns test (RELRO)
+366344 Multiple unhandled instruction for Aarch64
n-i-bz Fix incorrect (or infinite loop) unwind on RHEL7 x86 and amd64
n-i-bz massif --pages-as-heap=yes does not report peak caused by mmap+munmap
Modified: trunk/docs/internals/3_11_BUGSTATUS.txt
==============================================================================
--- trunk/docs/internals/3_11_BUGSTATUS.txt (original)
+++ trunk/docs/internals/3_11_BUGSTATUS.txt Fri Aug 5 16:15:20 2016
@@ -90,7 +90,6 @@
359705 memcheck causes segfault on a dynamically-linked test from
rustlang's test suite on i686
360429 Warning: noted but unhandled ioctl 0x530d with no size/direction hints.
-360574 Wrong parameter type for an ashmem ioctl() call on Android and ARM64
361615 Inconsistent termination when an instrumented multithreaded process
is terminated by signal
361726 WARNING:unhandled syscall on ppc64
From: <sv...@va...> - 2016-08-05 15:02:55
Author: sewardj
Date: Fri Aug 5 16:02:48 2016
New Revision: 3239
Log:
Reduce the number of IR sanity checks from 4 per block to 2 per block.
Also relax assertion checking in the register allocator.
Together with valgrind r15927 this reduces per-block JITting cost by 10%-15%.
Modified:
trunk/priv/host_generic_reg_alloc2.c
trunk/priv/main_main.c
Modified: trunk/priv/host_generic_reg_alloc2.c
==============================================================================
--- trunk/priv/host_generic_reg_alloc2.c (original)
+++ trunk/priv/host_generic_reg_alloc2.c Fri Aug 5 16:02:48 2016
@@ -993,13 +993,13 @@
/* ------------ Sanity checks ------------ */
/* Sanity checks are expensive. So they are done only once
- every 13 instructions, and just before the last
+ every 17 instructions, and just before the last
instruction. */
do_sanity_check
= toBool(
False /* Set to True for sanity checking of all insns. */
|| ii == instrs_in->arr_used-1
- || (ii > 0 && (ii % 13) == 0)
+ || (ii > 0 && (ii % 17) == 0)
);
if (do_sanity_check) {
Modified: trunk/priv/main_main.c
==============================================================================
--- trunk/priv/main_main.c (original)
+++ trunk/priv/main_main.c Fri Aug 5 16:02:48 2016
@@ -916,8 +916,13 @@
irsb = do_iropt_BB ( irsb, specHelper, preciseMemExnsFn, pxControl,
vta->guest_bytes_addr,
vta->arch_guest );
- sanityCheckIRSB( irsb, "after initial iropt",
- True/*must be flat*/, guest_word_type );
+
+ // JRS 2016 Aug 03: Sanity checking is expensive, we already checked
+ // the output of the front end, and iropt never screws up the IR by
+ // itself, unless it is being hacked on. So remove this post-iropt
+ // check in "production" use.
+ // sanityCheckIRSB( irsb, "after initial iropt",
+ // True/*must be flat*/, guest_word_type );
if (vex_traceflags & VEX_TRACE_OPT1) {
vex_printf("\n------------------------"
@@ -953,9 +958,12 @@
vex_printf("\n");
}
- if (vta->instrument1 || vta->instrument2)
- sanityCheckIRSB( irsb, "after instrumentation",
- True/*must be flat*/, guest_word_type );
+ // JRS 2016 Aug 03: as above, this never actually fails in practice.
+ // And we'll sanity check anyway after the post-instrumentation
+ // cleanup pass. So skip this check in "production" use.
+ // if (vta->instrument1 || vta->instrument2)
+ // sanityCheckIRSB( irsb, "after instrumentation",
+ // True/*must be flat*/, guest_word_type );
/* Do a post-instrumentation cleanup pass. */
if (vta->instrument1 || vta->instrument2) {
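The gating logic in the first hunk is tiny, and a hypothetical standalone counter (not VEX code) shows why moving the period from 13 to 17 thins the checks: over a 100-instruction block the allocator checks 8 times at a period of 13 but only 6 at 17.

```c
#include <stdbool.h>

/* Count how often the register allocator would run its sanity check
   over a block of n_instrs instructions, using the gating logic from
   the diff: check on the last instruction and on every `period`-th. */
static unsigned checks_per_block(unsigned n_instrs, unsigned period)
{
    unsigned count = 0;
    for (unsigned ii = 0; ii < n_instrs; ii++) {
        bool do_check = (ii == n_instrs - 1)
                     || (ii > 0 && (ii % period) == 0);
        if (do_check) count++;
    }
    return count;
}
```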
From: <sv...@va...> - 2016-08-05 14:59:58
Author: sewardj
Date: Fri Aug 5 15:59:50 2016
New Revision: 15927
Log:
Reimplement MC_(final_tidy) much more efficiently. This reduces its instruction
count by a factor of about 4.
Modified:
trunk/memcheck/mc_include.h
trunk/memcheck/mc_main.c
trunk/memcheck/mc_translate.c
Modified: trunk/memcheck/mc_include.h
==============================================================================
--- trunk/memcheck/mc_include.h (original)
+++ trunk/memcheck/mc_include.h Fri Aug 5 15:59:50 2016
@@ -788,6 +788,9 @@
IRSB* MC_(final_tidy) ( IRSB* );
+/* Check some assertions to do with the instrumentation machinery. */
+void MC_(do_instrumentation_startup_checks)( void );
+
#endif /* ndef __MC_INCLUDE_H */
/*--------------------------------------------------------------------*/
Modified: trunk/memcheck/mc_main.c
==============================================================================
--- trunk/memcheck/mc_main.c (original)
+++ trunk/memcheck/mc_main.c Fri Aug 5 15:59:50 2016
@@ -8142,6 +8142,9 @@
tl_assert(MASK(4) == 0xFFFFFFF000000003ULL);
tl_assert(MASK(8) == 0xFFFFFFF000000007ULL);
# endif
+
+ /* Check some assertions to do with the instrumentation machinery. */
+ MC_(do_instrumentation_startup_checks)();
}
STATIC_ASSERT(sizeof(UWord) == sizeof(SizeT));
Modified: trunk/memcheck/mc_translate.c
==============================================================================
--- trunk/memcheck/mc_translate.c (original)
+++ trunk/memcheck/mc_translate.c Fri Aug 5 15:59:50 2016
@@ -6584,6 +6584,7 @@
return sb_out;
}
+
/*------------------------------------------------------------*/
/*--- Post-tree-build final tidying ---*/
/*------------------------------------------------------------*/
@@ -6602,17 +6603,69 @@
reference, which is kinda pointless. MC_(final_tidy) therefore
looks for such repeated calls and removes all but the first. */
-/* A struct for recording which (helper, guard) pairs we have already
+
+/* With some testing on perf/bz2.c, on amd64 and x86, compiled with
+ gcc-5.3.1 -O2, it appears that 16 entries in the array are enough to
+ get almost all the benefits of this transformation whilst causing
+ the slide-back case to just often enough to be verifiably
+ correct. For posterity, the numbers are:
+
+ bz2-32
+
+ 1 111,840 -> 1,702,810; ratio 15.2
+ 2 111,840 -> 1,656,644; ratio 14.8
+ 3 111,840 -> 1,650,457; ratio 14.7
+ 4 111,840 -> 1,649,103; ratio 14.7
+ 5 111,840 -> 1,648,655; ratio 14.7
+ 6 111,840 -> 1,648,435; ratio 14.7
+ 7 111,840 -> 1,648,304; ratio 14.7
+ 8 111,840 -> 1,648,304; ratio 14.7
+ 10 111,840 -> 1,648,171; ratio 14.7
+ 12 111,840 -> 1,648,109 ratio 14.7
+ 16 111,840 -> 1,647,947; ratio 14.7
+ 32 111,840 -> 1,647,881; ratio 14.7
+ inf 111,840 -> 1,647,881; ratio 14.7
+
+ bz2-64
+
+ 1 106,628 -> 1,811,992; ratio 17.0
+ 2 106,628 -> 1,797,805; ratio 16.9
+ 3 106,628 -> 1,792,429; ratio 16.8
+ 4 106,628 -> 1,791,037; ratio 16.8
+ 5 106,628 -> 1,790,929; ratio 16.8
+ 6 106,628 -> 1,790,810; ratio 16.8
+ 7 106,628 -> 1,790,764; ratio 16.8
+ 8 106,628 -> 1,790,764; ratio 16.8
+ 10 106,628 -> 1,790,764; ratio 16.8
+ 12 106,628 -> 1,790,764; ratio 16.8
+ 16 106,628 -> 1,790,701; ratio 16.8
+ 32 106,628 -> 1,790,671; ratio 16.8
+ inf 106,628 -> 1,790,671; ratio 16.8
+*/
+
+/* Structs for recording which (helper, guard) pairs we have already
seen. */
+
+#define N_TIDYING_PAIRS 16
+
typedef
struct { void* entry; IRExpr* guard; }
Pair;
+typedef
+ struct {
+ Pair pairs[N_TIDYING_PAIRS +1/*for bounds checking*/];
+ UInt pairsUsed;
+ }
+ Pairs;
+
+
/* Return True if e1 and e2 definitely denote the same value (used to
compare guards). Return False if unknown; False is the safe
answer. Since guest registers and guest memory do not have the
SSA property we must return False if any Gets or Loads appear in
- the expression. */
+ the expression. This implicitly assumes that e1 and e2 have the
+ same IR type, which is always true here -- the type is Ity_I1. */
static Bool sameIRValue ( IRExpr* e1, IRExpr* e2 )
{
@@ -6661,45 +6714,98 @@
True if so. If not, add an entry. */
static
-Bool check_or_add ( XArray* /*of Pair*/ pairs, IRExpr* guard, void* entry )
+Bool check_or_add ( Pairs* tidyingEnv, IRExpr* guard, void* entry )
{
- Pair p;
- Pair* pp;
- Int i, n = VG_(sizeXA)( pairs );
+ UInt i, n = tidyingEnv->pairsUsed;
+ tl_assert(n <= N_TIDYING_PAIRS);
for (i = 0; i < n; i++) {
- pp = VG_(indexXA)( pairs, i );
- if (pp->entry == entry && sameIRValue(pp->guard, guard))
+ if (tidyingEnv->pairs[i].entry == entry
+ && sameIRValue(tidyingEnv->pairs[i].guard, guard))
return True;
}
- p.guard = guard;
- p.entry = entry;
- VG_(addToXA)( pairs, &p );
+ /* (guard, entry) wasn't found in the array. Add it at the end.
+ If the array is already full, slide the entries one slot
+ backwards. This means we will lose to ability to detect
+ duplicates from the pair in slot zero, but that happens so
+ rarely that it's unlikely to have much effect on overall code
+ quality. Also, this strategy loses the check for the oldest
+ tracked exit (memory reference, basically) and so that is (I'd
+ guess) least likely to be re-used after this point. */
+ tl_assert(i == n);
+ if (n == N_TIDYING_PAIRS) {
+ for (i = 1; i < N_TIDYING_PAIRS; i++) {
+ tidyingEnv[n-1] = tidyingEnv[n];
+ }
+ tidyingEnv->pairs[N_TIDYING_PAIRS-1].entry = entry;
+ tidyingEnv->pairs[N_TIDYING_PAIRS-1].guard = guard;
+ } else {
+ tl_assert(n < N_TIDYING_PAIRS);
+ tidyingEnv->pairs[n].entry = entry;
+ tidyingEnv->pairs[n].guard = guard;
+ n++;
+ tidyingEnv->pairsUsed = n;
+ }
return False;
}
static Bool is_helperc_value_checkN_fail ( const HChar* name )
{
- return
- 0==VG_(strcmp)(name, "MC_(helperc_value_check0_fail_no_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check1_fail_no_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check4_fail_no_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check8_fail_no_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check0_fail_w_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check1_fail_w_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check4_fail_w_o)")
- || 0==VG_(strcmp)(name, "MC_(helperc_value_check8_fail_w_o)");
+ /* This is expensive because it happens a lot. We are checking to
+ see whether |name| is one of the following 8 strings:
+
+ MC_(helperc_value_check8_fail_no_o)
+ MC_(helperc_value_check4_fail_no_o)
+ MC_(helperc_value_check0_fail_no_o)
+ MC_(helperc_value_check1_fail_no_o)
+ MC_(helperc_value_check8_fail_w_o)
+ MC_(helperc_value_check0_fail_w_o)
+ MC_(helperc_value_check1_fail_w_o)
+ MC_(helperc_value_check4_fail_w_o)
+
+ To speed it up, check the common prefix just once, rather than
+ all 8 times.
+ */
+ const HChar* prefix = "MC_(helperc_value_check";
+
+ HChar n, p;
+ while (True) {
+ n = *name;
+ p = *prefix;
+ if (p == 0) break; /* ran off the end of the prefix */
+ /* We still have some prefix to use */
+ if (n == 0) return False; /* have prefix, but name ran out */
+ if (n != p) return False; /* have both pfx and name, but no match */
+ name++;
+ prefix++;
+ }
+
+ /* Check the part after the prefix. */
+ tl_assert(*prefix == 0 && *name != 0);
+ return 0==VG_(strcmp)(name, "8_fail_no_o)")
+ || 0==VG_(strcmp)(name, "4_fail_no_o)")
+ || 0==VG_(strcmp)(name, "0_fail_no_o)")
+ || 0==VG_(strcmp)(name, "1_fail_no_o)")
+ || 0==VG_(strcmp)(name, "8_fail_w_o)")
+ || 0==VG_(strcmp)(name, "4_fail_w_o)")
+ || 0==VG_(strcmp)(name, "0_fail_w_o)")
+ || 0==VG_(strcmp)(name, "1_fail_w_o)");
}
IRSB* MC_(final_tidy) ( IRSB* sb_in )
{
- Int i;
+ Int i;
IRStmt* st;
IRDirty* di;
IRExpr* guard;
IRCallee* cee;
Bool alreadyPresent;
- XArray* pairs = VG_(newXA)( VG_(malloc), "mc.ft.1",
- VG_(free), sizeof(Pair) );
+ Pairs pairs;
+
+ pairs.pairsUsed = 0;
+
+ pairs.pairs[N_TIDYING_PAIRS].entry = (void*)0x123;
+ pairs.pairs[N_TIDYING_PAIRS].guard = (IRExpr*)0x456;
+
/* Scan forwards through the statements. Each time a call to one
of the relevant helpers is seen, check if we have made a
previous call to the same helper using the same guard
@@ -6720,16 +6826,21 @@
guard 'guard'. Check if we have already seen a call to this
function with the same guard. If so, delete it. If not,
add it to the set of calls we do know about. */
- alreadyPresent = check_or_add( pairs, guard, cee->addr );
+ alreadyPresent = check_or_add( &pairs, guard, cee->addr );
if (alreadyPresent) {
sb_in->stmts[i] = IRStmt_NoOp();
if (0) VG_(printf)("XX\n");
}
}
- VG_(deleteXA)( pairs );
+
+ tl_assert(pairs.pairs[N_TIDYING_PAIRS].entry == (void*)0x123);
+ tl_assert(pairs.pairs[N_TIDYING_PAIRS].guard == (IRExpr*)0x456);
+
return sb_in;
}
+#undef N_TIDYING_PAIRS
+
/*------------------------------------------------------------*/
/*--- Origin tracking stuff ---*/
@@ -7485,6 +7596,62 @@
}
+/*------------------------------------------------------------*/
+/*--- Startup assertion checking ---*/
+/*------------------------------------------------------------*/
+
+void MC_(do_instrumentation_startup_checks)( void )
+{
+ /* Make a best-effort check to see that is_helperc_value_checkN_fail
+ is working as we expect. */
+
+# define CHECK(_expected, _string) \
+ tl_assert((_expected) == is_helperc_value_checkN_fail(_string))
+
+ /* It should identify these 8, and no others, as targets. */
+ CHECK(True, "MC_(helperc_value_check8_fail_no_o)");
+ CHECK(True, "MC_(helperc_value_check4_fail_no_o)");
+ CHECK(True, "MC_(helperc_value_check0_fail_no_o)");
+ CHECK(True, "MC_(helperc_value_check1_fail_no_o)");
+ CHECK(True, "MC_(helperc_value_check8_fail_w_o)");
+ CHECK(True, "MC_(helperc_value_check0_fail_w_o)");
+ CHECK(True, "MC_(helperc_value_check1_fail_w_o)");
+ CHECK(True, "MC_(helperc_value_check4_fail_w_o)");
+
+ /* Ad-hoc selection of other strings gathered via a quick test. */
+ CHECK(False, "amd64g_dirtyhelper_CPUID_avx2");
+ CHECK(False, "amd64g_dirtyhelper_RDTSC");
+ CHECK(False, "MC_(helperc_b_load1)");
+ CHECK(False, "MC_(helperc_b_load2)");
+ CHECK(False, "MC_(helperc_b_load4)");
+ CHECK(False, "MC_(helperc_b_load8)");
+ CHECK(False, "MC_(helperc_b_load16)");
+ CHECK(False, "MC_(helperc_b_load32)");
+ CHECK(False, "MC_(helperc_b_store1)");
+ CHECK(False, "MC_(helperc_b_store2)");
+ CHECK(False, "MC_(helperc_b_store4)");
+ CHECK(False, "MC_(helperc_b_store8)");
+ CHECK(False, "MC_(helperc_b_store16)");
+ CHECK(False, "MC_(helperc_b_store32)");
+ CHECK(False, "MC_(helperc_LOADV8)");
+ CHECK(False, "MC_(helperc_LOADV16le)");
+ CHECK(False, "MC_(helperc_LOADV32le)");
+ CHECK(False, "MC_(helperc_LOADV64le)");
+ CHECK(False, "MC_(helperc_LOADV128le)");
+ CHECK(False, "MC_(helperc_LOADV256le)");
+ CHECK(False, "MC_(helperc_STOREV16le)");
+ CHECK(False, "MC_(helperc_STOREV32le)");
+ CHECK(False, "MC_(helperc_STOREV64le)");
+ CHECK(False, "MC_(helperc_STOREV8)");
+ CHECK(False, "track_die_mem_stack_8");
+ CHECK(False, "track_new_mem_stack_8_w_ECU");
+ CHECK(False, "MC_(helperc_MAKE_STACK_UNINIT_w_o)");
+ CHECK(False, "VG_(unknown_SP_update_w_ECU)");
+
+# undef CHECK
+}
+
+
/*--------------------------------------------------------------------*/
/*--- end mc_translate.c ---*/
/*--------------------------------------------------------------------*/
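The string-matching trick in this commit generalises: test the shared prefix once, then compare only the distinct tails. A standalone restatement using strncmp, equivalent to (though less hand-tuned than) the committed character loop:

```c
#include <string.h>
#include <stdbool.h>

/* Match "MC_(helperc_value_check<N>_fail_<suffix>)" by testing the
   shared 23-character prefix once, then comparing the 8 short tails. */
static bool is_value_checkN_fail(const char *name)
{
    static const char prefix[] = "MC_(helperc_value_check";
    size_t plen = sizeof(prefix) - 1;
    if (strncmp(name, prefix, plen) != 0)
        return false;
    name += plen;
    return strcmp(name, "8_fail_no_o)") == 0
        || strcmp(name, "4_fail_no_o)") == 0
        || strcmp(name, "0_fail_no_o)") == 0
        || strcmp(name, "1_fail_no_o)") == 0
        || strcmp(name, "8_fail_w_o)") == 0
        || strcmp(name, "4_fail_w_o)") == 0
        || strcmp(name, "0_fail_w_o)") == 0
        || strcmp(name, "1_fail_w_o)") == 0;
}
```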
From: <sv...@va...> - 2016-08-05 14:54:35
Author: sewardj
Date: Fri Aug 5 15:54:27 2016
New Revision: 15926
Log:
Reimplement pszB_to_listNo using a binary search rather than a linear search.
Unlikely as it seems, this saves a considerable number of instructions (2% of total)
on very heap-intensive code (perf/heap.c).
Modified:
trunk/coregrind/m_mallocfree.c
Modified: trunk/coregrind/m_mallocfree.c
==============================================================================
--- trunk/coregrind/m_mallocfree.c (original)
+++ trunk/coregrind/m_mallocfree.c Fri Aug 5 15:54:27 2016
@@ -1011,66 +1011,125 @@
// payload size, not block size.
// Convert a payload size in bytes to a freelist number.
-static
+static __attribute__((noinline))
+UInt pszB_to_listNo_SLOW ( SizeT pszB__divided_by__VG_MIN_MALLOC_SZB )
+{
+ SizeT n = pszB__divided_by__VG_MIN_MALLOC_SZB;
+
+ if (n < 299) {
+ if (n < 114) {
+ if (n < 85) {
+ if (n < 74) {
+ /* -- Exponential slope up, factor 1.05 -- */
+ if (n < 67) return 64;
+ if (n < 70) return 65;
+ /* else */ return 66;
+ } else {
+ if (n < 77) return 67;
+ if (n < 81) return 68;
+ /* else */ return 69;
+ }
+ } else {
+ if (n < 99) {
+ if (n < 90) return 70;
+ if (n < 94) return 71;
+ /* else */ return 72;
+ } else {
+ if (n < 104) return 73;
+ if (n < 109) return 74;
+ /* else */ return 75;
+ }
+ }
+ } else {
+ if (n < 169) {
+ if (n < 133) {
+ if (n < 120) return 76;
+ if (n < 126) return 77;
+ /* else */ return 78;
+ } else {
+ if (n < 139) return 79;
+ /* -- Exponential slope up, factor 1.10 -- */
+ if (n < 153) return 80;
+ /* else */ return 81;
+ }
+ } else {
+ if (n < 224) {
+ if (n < 185) return 82;
+ if (n < 204) return 83;
+ /* else */ return 84;
+ } else {
+ if (n < 247) return 85;
+ if (n < 272) return 86;
+ /* else */ return 87;
+ }
+ }
+ }
+ } else {
+ if (n < 1331) {
+ if (n < 530) {
+ if (n < 398) {
+ if (n < 329) return 88;
+ if (n < 362) return 89;
+ /* else */ return 90;
+ } else {
+ if (n < 438) return 91;
+ if (n < 482) return 92;
+ /* else */ return 93;
+ }
+ } else {
+ if (n < 770) {
+ if (n < 583) return 94;
+ if (n < 641) return 95;
+ /* -- Exponential slope up, factor 1.20 -- */
+ /* else */ return 96;
+ } else {
+ if (n < 924) return 97;
+ if (n < 1109) return 98;
+ /* else */ return 99;
+ }
+ }
+ } else {
+ if (n < 3974) {
+ if (n < 2300) {
+ if (n < 1597) return 100;
+ if (n < 1916) return 101;
+ return 102;
+ } else {
+ if (n < 2760) return 103;
+ if (n < 3312) return 104;
+ /* else */ return 105;
+ }
+ } else {
+ if (n < 6868) {
+ if (n < 4769) return 106;
+ if (n < 5723) return 107;
+ /* else */ return 108;
+ } else {
+ if (n < 8241) return 109;
+ if (n < 9890) return 110;
+ /* else */ return 111;
+ }
+ }
+ }
+ }
+ /*NOTREACHED*/
+ vg_assert(0);
+}
+
+static inline
UInt pszB_to_listNo ( SizeT pszB )
{
SizeT n = pszB / VG_MIN_MALLOC_SZB;
- vg_assert(0 == pszB % VG_MIN_MALLOC_SZB);
+ vg_assert(0 == (pszB % VG_MIN_MALLOC_SZB));
// The first 64 lists hold blocks of size VG_MIN_MALLOC_SZB * list_num.
- // The final 48 hold bigger blocks.
- if (n < 64) return (UInt)n;
- /* Exponential slope up, factor 1.05 */
- if (n < 67) return 64;
- if (n < 70) return 65;
- if (n < 74) return 66;
- if (n < 77) return 67;
- if (n < 81) return 68;
- if (n < 85) return 69;
- if (n < 90) return 70;
- if (n < 94) return 71;
- if (n < 99) return 72;
- if (n < 104) return 73;
- if (n < 109) return 74;
- if (n < 114) return 75;
- if (n < 120) return 76;
- if (n < 126) return 77;
- if (n < 133) return 78;
- if (n < 139) return 79;
- /* Exponential slope up, factor 1.10 */
- if (n < 153) return 80;
- if (n < 169) return 81;
- if (n < 185) return 82;
- if (n < 204) return 83;
- if (n < 224) return 84;
- if (n < 247) return 85;
- if (n < 272) return 86;
- if (n < 299) return 87;
- if (n < 329) return 88;
- if (n < 362) return 89;
- if (n < 398) return 90;
- if (n < 438) return 91;
- if (n < 482) return 92;
- if (n < 530) return 93;
- if (n < 583) return 94;
- if (n < 641) return 95;
- /* Exponential slope up, factor 1.20 */
- if (n < 770) return 96;
- if (n < 924) return 97;
- if (n < 1109) return 98;
- if (n < 1331) return 99;
- if (n < 1597) return 100;
- if (n < 1916) return 101;
- if (n < 2300) return 102;
- if (n < 2760) return 103;
- if (n < 3312) return 104;
- if (n < 3974) return 105;
- if (n < 4769) return 106;
- if (n < 5723) return 107;
- if (n < 6868) return 108;
- if (n < 8241) return 109;
- if (n < 9890) return 110;
- return 111;
+ // The final 48 hold bigger blocks and are dealt with by the _SLOW
+ // case.
+ if (LIKELY(n < 64)) {
+ return (UInt)n;
+ } else {
+ return pszB_to_listNo_SLOW(n);
+ }
}
// What is the minimum payload size for a given list?
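The committed code is a hand-unrolled binary search over the freelist thresholds. The same idea in its generic form, with a short hypothetical threshold table rather than the real freelist bounds:

```c
#include <stddef.h>

/* Map a size to a bucket number via binary search over ascending
   upper bounds: bucket i covers sizes below bounds[i]; sizes at or
   above the last bound fall into bucket n_bounds. */
static unsigned bucket_for(size_t n, const size_t *bounds, unsigned n_bounds)
{
    unsigned lo = 0, hi = n_bounds;
    while (lo < hi) {                    /* answer lies in [lo, hi] */
        unsigned mid = lo + (hi - lo) / 2;
        if (n < bounds[mid]) hi = mid; else lo = mid + 1;
    }
    return lo;
}
```

A linear scan over the same table gives identical answers; the binary version just does O(log n) comparisons per lookup instead of O(n), which is where the instruction savings come from on allocation-heavy code.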
From: Ruurd B. <Ruu...@in...> - 2016-08-05 14:39:31
Hi,
I am a software developer at Infor, where we maintain a complex application (30+ years old, many millions of lines), most of it written in C/C++.
I have used valgrind with memcheck to find and fix memory related issues and have become a great fan of the product.
However, we use a custom allocator that caused me considerable problems because it has memory pool features not supported by the "loose model" of valgrind.
1. Specifically, it allows me to create a memory pool, allocate many items from that pool and then destroy the pool.
The applications know that all pool items are automatically freed when the pool is destroyed, so it saves time and code by not doing so explicitly.
Valgrind reports all items in such a pool as memory leaks, because that is the model it assumes.
I understand that this is a design choice: Either such application memory pools are considered "auto-freed" or not, and when not, they are considered leaks.
2. Another problem is that our allocator uses itself to allocate large chunks for the memory pools.
Those chunks are used to dole out smaller pieces for the applications.
Valgrind sees that as an error: Overlapping memory blocks because both types of blocks (memory pool and allocations from the pool) are marked as VALGRIND_MALLOCLIKE_BLOCK.
That triggers an error:
Block 0x%lx..0x%lx overlaps with block 0x%lx..0x%lx, this is usually caused by using VALGRIND_MALLOCLIKE_BLOCK in an inappropriate way
plus an assert in memcheck.
3. Our (admittedly ancient) allocator uses sbrk() to get the memory (and not mmap).
Valgrind (on linux) limits this to 8MB. That is not enough for our applications. The 8MB is hardcoded in valgrind.
4. We use Oracle as a database, which executes as setuid-to-oracle on Linux (we have our own database wrapper software layers for Oracle, DB2, MySql, Microsoft SQL and so on).
To be able to valgrind such executables, I've created a setuid-oracle copy of Valgrind.
That works, but the reports valgrind creates are owned by Oracle in such cases and our test framework got "Permission denied" when it wanted to analyze and modify the valgrind reports.
So I have modified valgrind to support our model, address the problems, and tried not to break anything in the process:
1. Added a VALGRIND_CREATE_META_MEMPOOL macro in valgrind.h, modelled after VALGRIND_CREATE_MEMPOOL.
It takes a flag parameter, with 2 (or-able) options: MEMPOOL_AUTO_FREE and MEMPOOL_METABLOCKS.
When AUTO_FREE is set at pool creation, valgrind frees all the allocations inside a memory-pool block when MEMPOOL_FREE is used on that block.
For a non-auto-free pool, everything is as before. This prevents the false memory-leak reports.
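The pool model described above can be sketched in plain C; pool_t, pool_alloc and pool_destroy are hypothetical illustrations of the allocation pattern, not Infor's allocator or Valgrind's client-request API:

```c
#include <stdlib.h>
#include <stddef.h>

/* A pool that records its allocations so destroying the pool frees
   everything at once -- the auto-free semantics the applications rely
   on, and which a leak checker must model to avoid false positives. */
typedef struct pool_item { struct pool_item *next; } pool_item;
typedef struct { pool_item *items; size_t n_items; } pool_t;

static void *pool_alloc(pool_t *p, size_t sz)
{
    pool_item *it = malloc(sizeof(pool_item) + sz);
    if (!it) return NULL;
    it->next = p->items;               /* link into the pool's list */
    p->items = it;
    p->n_items++;
    return (void *)(it + 1);           /* payload follows the header */
}

static void pool_destroy(pool_t *p)    /* auto-frees every item */
{
    while (p->items) {
        pool_item *next = p->items->next;
        free(p->items);
        p->items = next;
    }
    p->n_items = 0;
}
```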
2. When METABLOCKS is used, it will not complain about overlapping blocks as long as the overlap is with a memory-pool chunk from a METABLOCKS pool.
Also, when reporting the location of a problem, the "describe_addr" function favored custom memory pool blocks (our 64 KB chunks for the pool) over all else.
That caused almost all reports to say "Address XXX is many bytes in a block of 64K alloc'd", and the alloc location-stack would be the place where the pool was extended.
Not very useful.
So I've modified the describe_addr function to take the "meta-blocks" into account and report the underlying smaller allocation.
When no such meta-blocks exists, everything is as before.
3. For the sbrk problem, I've added a new command line option, --main-sbrksize, patterned after --main-stacksize. The default is the old (hard-coded) 8MB.
In the initimg modules for Linux and Solaris I have changed the code to use the command line value.
So the behavior is modified only when the new command line option is used. Our test framework passes 1GB and that works well.
That change caused a few valgrind regression tests that check the "help" output to fail; I've fixed those as well.
4. I've added group-write permissions to the default file-creation mask for the valgrind reports.
BTW: Those reports are altered because I could not figure out how to write suppression rules that are based only on the allocation stack of a problem.
For example, we link against OpenSSL crypto libraries which (intentionally) do all sorts of things with uninitialized memory (for randomness).
Valgrind spots that, but I want to suppress those messages.
The numerous different error-locations all have the same allocation spot, but suppressions insist on using the location of the error (use of uninitialized memory).
Those are far too many and often change when a new OpenSSL version is released.
So I've written a Perl post-processor to delete (suppress) the OpenSSL stuff based on arbitrary patterns in a valgrind error message.
The "permission denied" occurred when it wanted to write the edited report back.
Suggestion: It would be nice to be able to write suppression rules for this kind of problem, with regular expressions on the complete valgrind message.
I've created a new version of valgrind for this: 3.11.1.
I've attached a patch file to alter a 3.11.0 tree to 3.11.1. Apply it by going to the root of a 3.11.0 tree and running "patch -p0 < metamempool.patch".
I've tried to do all the changes in the style of the existing code.
I've run all the regression tests of valgrind and the results of 3.11.0 and 3.11.1 are identical.
The patch also includes altered manual pages, I've been unable to build those on my system, so I hope they're OK.
I've installed this altered valgrind on various development and test systems at Infor and used it for a few months to make sure I have not broken anything.
This version 3.11.1 is used on both normal programs and ones using our custom allocator (or both). Everything works the way it should.
I'd appreciate it if this patch could be applied to the standard distribution so I will not have to maintain a separate version of valgrind/memcheck for Infor.
Comment / feedback appreciated,
Regards,
Ruurd Beerstra
Ruurd Beerstra | Software Engineer, Sr.
office: 0342-427289 | mobile: +31 22 42 7478 | Ruu...@in... | http://www.infor.com
From: <sv...@va...> - 2016-08-05 10:34:23
Author: sewardj
Date: Fri Aug 5 11:34:15 2016
New Revision: 3238
Log:
Fix two invalid signed left shifts picked up by ubsan.
Modified:
trunk/priv/guest_arm64_toIR.c
Modified: trunk/priv/guest_arm64_toIR.c
==============================================================================
--- trunk/priv/guest_arm64_toIR.c (original)
+++ trunk/priv/guest_arm64_toIR.c Fri Aug 5 11:34:15 2016
@@ -148,8 +148,9 @@
static ULong sx_to_64 ( ULong x, UInt n )
{
vassert(n > 1 && n < 64);
+ x <<= (64-n);
Long r = (Long)x;
- r = (r << (64-n)) >> (64-n);
+ r >>= (64-n);
return (ULong)r;
}
@@ -2590,7 +2591,7 @@
IRTemp old = newTemp(Ity_I32);
assign(old, getIReg32orZR(dd));
vassert(hw <= 1);
- UInt mask = 0xFFFF << (16 * hw);
+ UInt mask = ((UInt)0xFFFF) << (16 * hw);
IRExpr* res
= binop(Iop_Or32,
binop(Iop_And32, mkexpr(old), mkU32(~mask)),
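Both hunks avoid left-shifting a signed (or int-promoted) value, which is undefined behaviour in C when the shift overflows the type. A standalone version of the corrected sign-extension: shift left only in the unsigned domain, then do the arithmetic right shift on a signed copy. Like the original, this relies on signed right shift being arithmetic, which is implementation-defined but true on mainstream compilers.

```c
#include <stdint.h>

/* Sign-extend the low n bits of x to 64 bits (1 < n < 64).  The left
   shift happens on an unsigned value, so it is well defined even when
   bit n-1 is set; the signed right shift then replicates the sign. */
static uint64_t sx_to_64(uint64_t x, unsigned n)
{
    x <<= (64 - n);              /* unsigned shift: no UB */
    int64_t r = (int64_t)x;
    r >>= (64 - n);              /* arithmetic shift fills with sign */
    return (uint64_t)r;
}
```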