From: Sérgio D. J. <ser...@li...> - 2008-10-15 18:41:45

Hello Bart,

On Wed, 2008-10-15 at 19:40 +0200, Bart Van Assche wrote:
> On Wed, Oct 15, 2008 at 3:42 PM, Sérgio Durigan Júnior
> <ser...@li...> wrote:
> > 1) If we could ask GCC guys to generate debugging information telling
> > where each inline function is, would that help Valgrind to intercept the
> > calls?
>
> A possible alternative is to make sure that none of the
> synchronization primitives in libgomp is declared inline, and to ask
> Valgrind users to install the libgomp debuginfo package.

IMHO this is not really an alternative. Actually, I think that as multithreaded applications evolve, we will see more and more things like inlined primitive functions. Also, I don't think the GCC guys will ever accept a request to remove the inline :-)

Regards,

--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil |
|
From: Bart V. A. <bar...@gm...> - 2008-10-15 17:58:40

On Wed, Oct 15, 2008 at 6:11 PM, Julian Seward <js...@ac...> wrote:
>> 2) Julian said that detecting locking primitives using only instructions
>> is too complex, maybe impossible. Well, but as far as I understood, you
>> are assuming a "general locking primitives detector". What if we limit
>> this problem only to the locking primitives present in the libgomp?
>> Would that be easier to do? (Of course it has a down side because every
>> time the libgomp changed, we would have to change Valgrind too... But I
>> think it's a valid question anyway)
[ ... ]
> From my brief investigation of the libgomp primitives, they are the same or
> similar to that which libpthread uses. So a solution to libgomp would also
> allow us to see inside libpthread, which would be good. But to be honest,
> overall I simply don't understand enough about the problem at this point to
> answer this question properly.

My opinion is that extracting information about locking primitives from binary executables is a really powerful technique and would be a very interesting addition to Valgrind. The big question here is whether this is possible at all; it is at least a challenging research topic.

In order to detect as many programming errors as possible, tools like Helgrind and DRD distinguish, among others, the following primitives:
* atomic modifications of variables;
* mutex lock, unlock, and trylock operations;
* semaphore post, wait, and trywait operations;
* condition variables;
* reader-writer locks;
* barriers.

One issue that puzzles me is the following: it is possible to implement a semaphore using one mutex and one condition variable, and it is possible to implement a mutex using one semaphore. libpthread implements semaphores, mutexes, and condition variables via futexes. So how could one tell, by analyzing only futex calls and control flow, whether the (library) programmer has implemented a mutex or a semaphore?

Bart. |
|
From: Bart V. A. <bar...@gm...> - 2008-10-15 17:52:50

On Wed, Oct 15, 2008 at 3:42 PM, Sérgio Durigan Júnior
<ser...@li...> wrote:
> 1) If we could ask GCC guys to generate debugging information telling
> where each inline function is, would that help Valgrind to intercept the
> calls?

A possible alternative is to make sure that none of the synchronization primitives in libgomp is declared inline, and to ask Valgrind users to install the libgomp debuginfo package.

Bart. |
|
From: <sv...@va...> - 2008-10-15 16:44:06
|
Author: sewardj
Date: 2008-10-15 17:38:03 +0100 (Wed, 15 Oct 2008)
New Revision: 8674
Log:
Speedups to stack unwinding on amd64-linux. Back out the caching
mechanism introduced by r8668 and replace it with something more
powerful, that caches IP-to-DiCfSI mappings for the entire system
instead of on a per-DebugInfo basis.
To make this safe, some previously unstated invariants relating to
DiCfSI records have been documented and are now checked after every
debuginfo load/unload. See the comment labelled
Comment_on_IMPORTANT_CFSI_REPRESENTATIONAL_INVARIANTS in
priv_storage.h.
Modified:
branches/YARD/coregrind/m_debuginfo/debuginfo.c
branches/YARD/coregrind/m_debuginfo/misc.c
branches/YARD/coregrind/m_debuginfo/priv_storage.h
branches/YARD/coregrind/m_debuginfo/storage.c
branches/YARD/coregrind/m_main.c
branches/YARD/coregrind/pub_core_debuginfo.h
Modified: branches/YARD/coregrind/m_debuginfo/debuginfo.c
===================================================================
--- branches/YARD/coregrind/m_debuginfo/debuginfo.c 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/m_debuginfo/debuginfo.c 2008-10-15 16:38:03 UTC (rev 8674)
@@ -99,6 +99,13 @@
/*------------------------------------------------------------*/
+/*--- fwdses ---*/
+/*------------------------------------------------------------*/
+
+static void cfsi_cache__invalidate ( void );
+
+
+/*------------------------------------------------------------*/
/*--- Root structure ---*/
/*------------------------------------------------------------*/
@@ -171,8 +178,8 @@
di->memname = memname ? ML_(dinfo_strdup)("di.debuginfo.aDI.3", memname)
: NULL;
- /* Everything else -- pointers, sizes, arrays -- is zeroed by
- ML_(dinfo_zalloc). Now set up the debugging-output flags. */
+ /* Everything else -- pointers, sizes, arrays -- is zeroed by calloc.
+ Now set up the debugging-output flags. */
traceme
= VG_(string_match)( VG_(clo_trace_symtab_patt), filename )
|| (memname && VG_(string_match)( VG_(clo_trace_symtab_patt),
@@ -314,10 +321,11 @@
/* Repeatedly scan debugInfo_list, looking for DebugInfos with text
AVMAs intersecting [start,start+length), and call discard_DebugInfo
to get rid of them. This modifies the list, hence the multiple
- iterations.
+ iterations. Returns True iff any such DebugInfos were found.
*/
-static void discard_syms_in_range ( Addr start, SizeT length )
+static Bool discard_syms_in_range ( Addr start, SizeT length )
{
+ Bool anyFound = False;
Bool found;
DebugInfo* curr;
@@ -341,8 +349,11 @@
}
if (!found) break;
+ anyFound = True;
discard_DebugInfo( curr );
}
+
+ return anyFound;
}
@@ -472,8 +483,86 @@
}
+/* Debuginfo reading for 'di' has just been successfully completed.
+ Check that the invariants stated in
+ "Comment_on_IMPORTANT_CFSI_REPRESENTATIONAL_INVARIANTS" in
+ priv_storage.h are observed. */
+static void check_CFSI_related_invariants ( DebugInfo* di )
+{
+ DebugInfo* di2 = NULL;
+ vg_assert(di);
+ /* This fn isn't called until after debuginfo for this object has
+ been successfully read. And that shouldn't happen until we have
+ both a r-x and rw- mapping for the object. Hence: */
+ vg_assert(di->have_rx_map);
+ vg_assert(di->have_rw_map);
+ /* degenerate case: r-x section is empty */
+ if (di->rx_map_size == 0) {
+ vg_assert(di->cfsi == NULL);
+ return;
+ }
+ /* normal case: r-x section is nonempty */
+ /* invariant (0) */
+ vg_assert(di->rx_map_size > 0);
+ /* invariant (1) */
+ for (di2 = debugInfo_list; di2; di2 = di2->next) {
+ if (di2 == di)
+ continue;
+ if (di2->rx_map_size == 0)
+ continue;
+ vg_assert(di->rx_map_avma + di->rx_map_size <= di2->rx_map_avma
+ || di2->rx_map_avma + di2->rx_map_size <= di->rx_map_avma);
+ }
+ di2 = NULL;
+ /* invariant (2) */
+ if (di->cfsi) {
+ vg_assert(di->cfsi_minavma <= di->cfsi_maxavma); /* duh! */
+ vg_assert(di->cfsi_minavma >= di->rx_map_avma);
+ vg_assert(di->cfsi_maxavma < di->rx_map_avma + di->rx_map_size);
+ }
+ /* invariants (3) and (4) */
+ if (di->cfsi) {
+ Word i;
+ vg_assert(di->cfsi_used > 0);
+ vg_assert(di->cfsi_size > 0);
+ for (i = 0; i < di->cfsi_used; i++) {
+ DiCfSI* cfsi = &di->cfsi[i];
+ vg_assert(cfsi->len > 0);
+ vg_assert(cfsi->base >= di->cfsi_minavma);
+ vg_assert(cfsi->base + cfsi->len - 1 <= di->cfsi_maxavma);
+ if (i > 0) {
+ DiCfSI* cfsip = &di->cfsi[i-1];
+ vg_assert(cfsip->base + cfsip->len <= cfsi->base);
+ }
+ }
+ } else {
+ vg_assert(di->cfsi_used == 0);
+ vg_assert(di->cfsi_size == 0);
+ }
+}
+
+
/*--------------------------------------------------------------*/
/*--- ---*/
+/*--- TOP LEVEL: INITIALISE THE DEBUGINFO SYSTEM ---*/
+/*--- ---*/
+/*--------------------------------------------------------------*/
+
+void VG_(di_initialise) ( void )
+{
+ /* There's actually very little to do here, since everything
+ centers around the DebugInfos in debugInfo_list, they are
+ created and destroyed on demand, and each one is treated more or
+ less independently. */
+ vg_assert(debugInfo_list == NULL);
+
+ /* flush the CFI fast query cache. */
+ cfsi_cache__invalidate();
+}
+
+
+/*--------------------------------------------------------------*/
+/*--- ---*/
/*--- TOP LEVEL: NOTIFICATION (ACQUIRE/DISCARD INFO) (LINUX) ---*/
/*--- ---*/
/*--------------------------------------------------------------*/
@@ -695,6 +784,9 @@
/* .. and acquire new info. */
ok = ML_(read_elf_debug_info)( di );
+ /* .. and invalidate the CFI unwind cache. */
+ cfsi_cache__invalidate();
+
if (ok) {
TRACE_SYMTAB("\n------ Canonicalising the "
"acquired info ------\n");
@@ -705,11 +797,16 @@
VG_(redir_notify_new_DebugInfo)( di );
/* Note that we succeeded */
di->have_dinfo = True;
+ /* Check invariants listed in
+ Comment_on_IMPORTANT_REPRESENTATIONAL_INVARIANTS in
+ priv_storage.h. */
+ check_CFSI_related_invariants(di);
} else {
TRACE_SYMTAB("\n------ ELF reading failed ------\n");
/* Something went wrong (eg. bad ELF file). Should we delete
this DebugInfo? No - it contains info on the rw/rx
mappings, at least. */
+ vg_assert(di->have_dinfo == False);
}
TRACE_SYMTAB("\n");
@@ -726,8 +823,11 @@
[a, a+len). */
void VG_(di_notify_munmap)( Addr a, SizeT len )
{
+ Bool anyFound;
if (0) VG_(printf)("DISCARD %#lx %#lx\n", a, a+len);
- discard_syms_in_range(a, len);
+ anyFound = discard_syms_in_range(a, len);
+ if (anyFound)
+ cfsi_cache__invalidate();
}
@@ -741,8 +841,11 @@
# if defined(VGP_x86_linux)
exe_ok = exe_ok || toBool(prot & VKI_PROT_READ);
# endif
- if (0 && !exe_ok)
- discard_syms_in_range(a, len);
+ if (0 && !exe_ok) {
+ Bool anyFound = discard_syms_in_range(a, len);
+ if (anyFound)
+ cfsi_cache__invalidate();
+ }
}
#endif /* defined(VGO_linux) */
@@ -771,6 +874,10 @@
Bool is_mainexe,
Bool acquire )
{
+ /* play safe; always invalidate the CFI cache. Not
+ that it should be used on AIX, but still .. */
+ cfsi_cache__invalidate();
+
if (acquire) {
Bool ok;
@@ -812,6 +919,10 @@
VG_(redir_notify_new_DebugInfo)( di );
/* Note that we succeeded */
di->have_dinfo = True;
+ /* Check invariants listed in
+ Comment_on_IMPORTANT_REPRESENTATIONAL_INVARIANTS in
+ priv_storage.h. */
+ check_CFSI_related_invariants(di);
} else {
/* Something went wrong (eg. bad XCOFF file). */
discard_DebugInfo( di );
@@ -822,8 +933,11 @@
/* Dump all the debugInfos whose text segments intersect
code_start/code_len. */
+ /* CFI cache is always invalidated at start of this routine.
+ Hence it's safe to ignore the return value of
+ discard_syms_in_range. */
if (code_len > 0)
- discard_syms_in_range( code_start, code_len );
+ (void)discard_syms_in_range( code_start, code_len );
}
}
@@ -1511,6 +1625,122 @@
}
+/* Search all the DebugInfos in the entire system, to find the DiCfSI
+ that pertains to 'ip'.
+
+ If found, set *diP to the DebugInfo in which it resides, and
+ *ixP to the index in that DebugInfo's cfsi array.
+
+ If not found, set *diP to (DebugInfo*)1 and *ixP to zero.
+*/
+__attribute__((noinline))
+static void find_DiCfSI ( /*OUT*/DebugInfo** diP,
+ /*OUT*/Word* ixP,
+ Addr ip )
+{
+ DebugInfo* di;
+ Word i = -1;
+
+ static UWord n_search = 0;
+ static UWord n_steps = 0;
+ n_search++;
+
+ if (0) VG_(printf)("search for %#lx\n", ip);
+
+ for (di = debugInfo_list; di != NULL; di = di->next) {
+ Word j;
+ n_steps++;
+
+ /* Use the per-DebugInfo summary address ranges to skip
+ inapplicable DebugInfos quickly. */
+ if (di->cfsi_used == 0)
+ continue;
+ if (ip < di->cfsi_minavma || ip > di->cfsi_maxavma)
+ continue;
+
+ /* It might be in this DebugInfo. Search it. */
+ j = ML_(search_one_cfitab)( di, ip );
+ vg_assert(j >= -1 && j < di->cfsi_used);
+
+ if (j != -1) {
+ i = j;
+ break; /* found it */
+ }
+ }
+
+ if (i == -1) {
+
+ /* we didn't find it. */
+ *diP = (DebugInfo*)1;
+ *ixP = 0;
+
+ } else {
+
+ /* found it. */
+ /* ensure that di is 4-aligned (at least), so it can't possibly
+ be equal to (DebugInfo*)1. */
+ vg_assert(di && VG_IS_4_ALIGNED(di));
+ vg_assert(i >= 0 && i < di->cfsi_used);
+ *diP = di;
+ *ixP = i;
+
+ /* Start of performance-enhancing hack: once every 64 (chosen
+ hackily after profiling) successful searches, move the found
+ DebugInfo one step closer to the start of the list. This
+ makes future searches cheaper. For starting konqueror on
+ amd64, this in fact reduces the total amount of searching
+ done by the above find-the-right-DebugInfo loop by more than
+ a factor of 20. */
+ if ((n_search & 0xF) == 0) {
+ /* Move di one step closer to the start of the list. */
+ move_DebugInfo_one_step_forward( di );
+ }
+ /* End of performance-enhancing hack. */
+
+ if (0 && ((n_search & 0x7FFFF) == 0))
+ VG_(printf)("find_DiCfSI: %lu searches, "
+ "%lu DebugInfos looked at\n",
+ n_search, n_steps);
+
+ }
+
+}
+
+
+/* Now follows a mechanism for caching queries to find_DiCfSI, since
+ they are extremely frequent on amd64-linux, during stack unwinding.
+
+ Each cache entry binds an ip value to a (di, ix) pair. Possible
+ values:
+
+ di is non-null, ix >= 0 ==> cache slot in use, "di->cfsi[ix]"
+ di is (DebugInfo*)1 ==> cache slot in use, no associated di
+ di is NULL ==> cache slot not in use
+
+ Hence simply zeroing out the entire cache invalidates all
+ entries.
+
+ Why not map ip values directly to DiCfSI*'s? Because this would
+ cause problems if/when the cfsi array is moved due to resizing.
+ Instead we cache .cfsi array index value, which should be invariant
+ across resizing. (That said, I don't think the current
+ implementation will resize whilst during queries, since the DiCfSI
+ records are added all at once, when the debuginfo for an object is
+ read, and is not changed ever thereafter. */
+
+#define N_CFSI_CACHE 511
+
+typedef
+ struct { Addr ip; DebugInfo* di; Word ix; }
+ CFSICacheEnt;
+
+static CFSICacheEnt cfsi_cache[N_CFSI_CACHE];
+
+static void cfsi_cache__invalidate ( void ) {
+ VG_(memset)(&cfsi_cache, 0, sizeof(cfsi_cache));
+}
+
+
/* The main function for DWARF2/3 CFI-based stack unwinding.
Given an IP/SP/FP triple, produce the IP/SP/FP values for the
previous frame, if possible. */
@@ -1523,61 +1753,47 @@
Addr min_accessible,
Addr max_accessible )
{
- Bool ok;
- Int i;
- DebugInfo* si;
- DiCfSI* cfsi = NULL;
- Addr cfa, ipHere, spHere, fpHere, ipPrev, spPrev, fpPrev;
+ Bool ok;
+ DebugInfo* di;
+ DiCfSI* cfsi = NULL;
+ Addr cfa, ipHere, spHere, fpHere, ipPrev, spPrev, fpPrev;
CfiExprEvalContext eec;
- static UInt n_search = 0;
- static UInt n_steps = 0;
- n_search++;
+ static UWord n_q = 0, n_m = 0;
+ n_q++;
+ if (0 && 0 == (n_q & 0x1FFFFF))
+ VG_(printf)("QQQ %lu %lu\n", n_q, n_m);
- if (0) VG_(printf)("search for %#lx\n", *ipP);
+ { UWord hash = (*ipP) % N_CFSI_CACHE;
+ CFSICacheEnt* ce = &cfsi_cache[hash];
- for (si = debugInfo_list; si != NULL; si = si->next) {
- n_steps++;
+ if (LIKELY(ce->ip == *ipP) && LIKELY(ce->di != NULL)) {
+ /* found an entry in the cache .. */
+ } else {
+ /* not found in cache. Search and update. */
+ n_m++;
+ ce->ip = *ipP;
+ find_DiCfSI( &ce->di, &ce->ix, *ipP );
+ }
- /* Use the per-DebugInfo summary address ranges to skip
- inapplicable DebugInfos quickly. */
- if (si->cfsi_used == 0)
- continue;
- if (*ipP < si->cfsi_minavma || *ipP > si->cfsi_maxavma)
- continue;
-
- i = ML_(search_one_cfitab)( si, *ipP );
- if (i != -1) {
- vg_assert(i >= 0 && i < si->cfsi_used);
- cfsi = &si->cfsi[i];
- break;
- }
+ if (UNLIKELY(ce->di == (DebugInfo*)1)) {
+ /* no DiCfSI for this address */
+ cfsi = NULL;
+ di = NULL;
+ } else {
+ /* found a DiCfSI for this address */
+ di = ce->di;
+ cfsi = &di->cfsi[ ce->ix ];
+ }
}
- if (cfsi == NULL)
- return False;
+ if (UNLIKELY(cfsi == NULL))
+ return False; /* no info. Nothing we can do. */
- if (0 && ((n_search & 0x7FFFF) == 0))
- VG_(printf)("VG_(use_CF_info): %u searches, "
- "%u DebugInfos looked at\n",
- n_search, n_steps);
-
- /* Start of performance-enhancing hack: once every 64 (chosen
- hackily after profiling) successful searches, move the found
- DebugInfo one step closer to the start of the list. This makes
- future searches cheaper. For starting konqueror on amd64, this
- in fact reduces the total amount of searching done by the above
- find-the-right-DebugInfo loop by more than a factor of 20. */
- if ((n_search & 0x3F) == 0) {
- /* Move si one step closer to the start of the list. */
- move_DebugInfo_one_step_forward( si );
- }
- /* End of performance-enhancing hack. */
-
if (0) {
VG_(printf)("found cfisi: ");
- ML_(ppDiCfSI)(si->cfsi_exprs, cfsi);
+ ML_(ppDiCfSI)(di->cfsi_exprs, cfsi);
}
ipPrev = spPrev = fpPrev = 0;
@@ -1598,7 +1814,7 @@
case CFIC_EXPR:
if (0) {
VG_(printf)("CFIC_EXPR: ");
- ML_(ppCfiExpr)(si->cfsi_exprs, cfsi->cfa_off);
+ ML_(ppCfiExpr)(di->cfsi_exprs, cfsi->cfa_off);
VG_(printf)("\n");
}
eec.ipHere = ipHere;
@@ -1607,7 +1823,7 @@
eec.min_accessible = min_accessible;
eec.max_accessible = max_accessible;
ok = True;
- cfa = evalCfiExpr(si->cfsi_exprs, cfsi->cfa_off, &eec, &ok );
+ cfa = evalCfiExpr(di->cfsi_exprs, cfsi->cfa_off, &eec, &ok );
if (!ok) return False;
break;
default:
@@ -1637,14 +1853,14 @@
break; \
case CFIR_EXPR: \
if (0) \
- ML_(ppCfiExpr)(si->cfsi_exprs,_off); \
+ ML_(ppCfiExpr)(di->cfsi_exprs,_off); \
eec.ipHere = ipHere; \
eec.spHere = spHere; \
eec.fpHere = fpHere; \
eec.min_accessible = min_accessible; \
eec.max_accessible = max_accessible; \
ok = True; \
- _prev = evalCfiExpr(si->cfsi_exprs, _off, &eec, &ok ); \
+ _prev = evalCfiExpr(di->cfsi_exprs, _off, &eec, &ok ); \
if (!ok) return False; \
break; \
default: \
Modified: branches/YARD/coregrind/m_debuginfo/misc.c
===================================================================
--- branches/YARD/coregrind/m_debuginfo/misc.c 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/m_debuginfo/misc.c 2008-10-15 16:38:03 UTC (rev 8674)
@@ -42,8 +42,6 @@
#include "priv_misc.h" /* self */
-/* Various functions rely on this returning zeroed memory.
- alloc_DebugInfo is one of them. */
void* ML_(dinfo_zalloc) ( HChar* cc, SizeT szB ) {
void* v;
vg_assert(szB > 0);
Modified: branches/YARD/coregrind/m_debuginfo/priv_storage.h
===================================================================
--- branches/YARD/coregrind/m_debuginfo/priv_storage.h 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/m_debuginfo/priv_storage.h 2008-10-15 16:38:03 UTC (rev 8674)
@@ -241,8 +241,6 @@
#define SEGINFO_STRCHUNKSIZE (64*1024)
-#define N_CFSI_SEARCH_CACHE 8
-
struct _DebugInfo {
/* Admin stuff */
@@ -300,7 +298,46 @@
in some obscure circumstances (to do with data/sdata/bss) it is
possible for the mapping to be present but have zero size.
Certainly text_ is mandatory on all platforms; not sure about
- the rest though. */
+ the rest though.
+
+ Comment_on_IMPORTANT_CFSI_REPRESENTATIONAL_INVARIANTS: we require that
+
+ either (rx_map_size == 0 && cfsi == NULL) (the degenerate case)
+
+ or the normal case, which is the AND of the following:
+ (0) rx_map_size > 0
+ (1) no two DebugInfos with rx_map_size > 0
+ have overlapping [rx_map_avma,+rx_map_size)
+ (2) [cfsi_minavma,cfsi_maxavma] does not extend
+ beyond [rx_map_avma,+rx_map_size); that is, the former is a
+ subrange or equal to the latter.
+ (3) all DiCfSI in the cfsi array all have ranges that fall within
+ [rx_map_avma,+rx_map_size).
+ (4) all DiCfSI in the cfsi array are non-overlapping
+
+ The cumulative effect of these restrictions is to ensure that
+ all the DiCfSI records in the entire system are non overlapping.
+ Hence any address falls into either exactly one DiCfSI record,
+ or none. Hence it is safe to cache the results of searches for
+ DiCfSI records. This is the whole point of these restrictions.
+ The caching of DiCfSI searches is done in VG_(use_CF_info). The
+ cache is flushed after any change to debugInfo_list. DiCfSI
+ searches are cached because they are central to stack unwinding
+ on amd64-linux.
+
+ Where are these invariants imposed and checked?
+
+ They are checked after a successful read of debuginfo into
+ a DebugInfo*, in check_CFSI_related_invariants.
+
+ (1) is not really imposed anywhere. We simply assume that the
+ kernel will not map the text segments from two different objects
+ into the same space. Sounds reasonable.
+
+ (2) follows from (4) and (3). It is ensured by canonicaliseCFI.
+ (3) is ensured by ML_(addDiCfSI).
+ (4) is ensured by canonicaliseCFI.
+ */
/* .text */
Bool text_present;
Addr text_avma;
@@ -364,20 +401,11 @@
records require any expression nodes, they are stored in
cfsi_exprs. */
DiCfSI* cfsi;
- UWord cfsi_used;
- UWord cfsi_size;
+ UInt cfsi_used;
+ UInt cfsi_size;
Addr cfsi_minavma;
Addr cfsi_maxavma;
XArray* cfsi_exprs; /* XArray of CfiExpr */
- /* Stack unwinding on amd64 causes a lot of searching in .cfsi to
- find the DiCfSI record that covers a particular address. To
- speed up the searches we add a small (8-entry) cache containing
- cached results from ML_(search_one_cfitab). This speeds up the
- searching by about a factor of 3 and overall increases the stack
- unwind speed by about 50% on amd64-linux on large C++ apps. */
- UWord cfsi_search_cache_used; /* 0 .. N_CFSI_SEARCH_CACHE */
- struct { Addr aMin; Addr aMax; Word ix; }
- cfsi_search_cache[N_CFSI_SEARCH_CACHE];
/* Expandable arrays of characters -- the string table. Pointers
into this are stable (the arrays are not reallocated). */
@@ -475,7 +503,7 @@
/* Find a CFI-table index containing the specified pointer, or -1 if
not found. Binary search. */
-extern Word ML_(search_one_cfitab) ( struct _DebugInfo* di, Addr ptr );
+extern Int ML_(search_one_cfitab) ( struct _DebugInfo* di, Addr ptr );
/* ------ Misc ------ */
Modified: branches/YARD/coregrind/m_debuginfo/storage.c
===================================================================
--- branches/YARD/coregrind/m_debuginfo/storage.c 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/m_debuginfo/storage.c 2008-10-15 16:38:03 UTC (rev 8674)
@@ -365,9 +365,6 @@
ML_(ppDiCfSI)(di->cfsi_exprs, cfsi);
}
- /* invalidate the lookup cache */
- di->cfsi_search_cache_used = 0;
-
/* sanity */
vg_assert(cfsi->len > 0);
/* If this fails, the implication is you have a single procedure
@@ -1288,11 +1285,8 @@
di->cfsi_maxavma = here_max;
}
- /* invalidate the lookup cache */
- di->cfsi_search_cache_used = 0;
-
if (di->trace_cfi)
- VG_(printf)("canonicaliseCfiSI: %ld entries, %#lx .. %#lx\n",
+ VG_(printf)("canonicaliseCfiSI: %d entries, %#lx .. %#lx\n",
di->cfsi_used,
di->cfsi_minavma, di->cfsi_maxavma);
@@ -1426,12 +1420,11 @@
/* Find a CFI-table index containing the specified pointer, or -1
if not found. Binary search. */
-__attribute__((noinline))
-static
-Word search_one_cfitab_WRK ( struct _DebugInfo* di, Addr ptr )
+
+Int ML_(search_one_cfitab) ( struct _DebugInfo* di, Addr ptr )
{
Addr a_mid_lo, a_mid_hi;
- Word mid, size,
+ Int mid, size,
lo = 0,
hi = di->cfsi_used-1;
while (True) {
@@ -1449,57 +1442,7 @@
}
}
-/* Find the CFI-table index containing the specified pointer,
- or -1 if not found. It uses search_one_cfitab_WRK to do the
- real work, but caches the results so as to avoid calling
- search_one_cfitab_WRK nost of the time. */
-Word ML_(search_one_cfitab) ( struct _DebugInfo* di, Addr ptr )
-{
- static UWord nq = 0;
- static UWord nm = 0;
- Word i, w;
-
- if (0 && 0 == (nq & 0x3FFFF))
- VG_(printf)("ZZZZZ %lu qs %lu misses\n", nq,nm);
-
- nq++;
- /* Check the cache first */
- for (i = 0; i < di->cfsi_search_cache_used; i++) {
- if (di->cfsi_search_cache[i].aMin <= ptr
- && ptr <= di->cfsi_search_cache[i].aMax) {
- /* found it. Once in every 16 searches, move the found element
- one step closer to the front. */
- if (i > 0 && 0 == (nq & 0xF)) {
- __typeof__(di->cfsi_search_cache[0]) tmp;
- tmp = di->cfsi_search_cache[i-1];
- di->cfsi_search_cache[i-1] = di->cfsi_search_cache[i];
- di->cfsi_search_cache[i] = tmp;
- i--;
- }
- return di->cfsi_search_cache[i].ix;
- }
- }
- /* not found in the cache; do slow search */
- nm++;
- w = search_one_cfitab_WRK(di, ptr);
- if (w >= 0) {
- /* We got a result, so update the cache. Slide all entries
- along one and insert new one at [0]. */
- for (i = N_CFSI_SEARCH_CACHE-1; i >= 1; i--) {
- di->cfsi_search_cache[i] = di->cfsi_search_cache[i-1];
- }
- if (di->cfsi_search_cache_used < N_CFSI_SEARCH_CACHE)
- di->cfsi_search_cache_used++;
- tl_assert(di->cfsi[w].len > 0);
- di->cfsi_search_cache[0].aMin = di->cfsi[w].base;
- di->cfsi_search_cache[0].aMax = di->cfsi[w].base + di->cfsi[w].len - 1;
- di->cfsi_search_cache[0].ix = w;
- }
- return w;
-}
-
-
/*--------------------------------------------------------------------*/
/*--- end ---*/
/*--------------------------------------------------------------------*/
Modified: branches/YARD/coregrind/m_main.c
===================================================================
--- branches/YARD/coregrind/m_main.c 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/m_main.c 2008-10-15 16:38:03 UTC (rev 8674)
@@ -1345,6 +1345,12 @@
//============================================================
//--------------------------------------------------------------
+ // Initialise m_debuginfo
+ // p: dynamic memory allocation
+ VG_(debugLog)(1, "main", "Initialise m_debuginfo\n");
+ VG_(di_initialise)();
+
+ //--------------------------------------------------------------
// Look for alternative libdir
{ HChar *cp = VG_(getenv)(VALGRIND_LIB);
if (cp != NULL)
@@ -1714,6 +1720,7 @@
// p: setup_code_redirect_table [so that redirs can be recorded]
// p: mallocfree
// p: probably: setup fds and process CLOs, so that logging works
+ // p: initialise m_debuginfo
//--------------------------------------------------------------
VG_(debugLog)(1, "main", "Load initial debug info\n");
# if defined(VGO_linux)
Modified: branches/YARD/coregrind/pub_core_debuginfo.h
===================================================================
--- branches/YARD/coregrind/pub_core_debuginfo.h 2008-10-15 16:30:55 UTC (rev 8673)
+++ branches/YARD/coregrind/pub_core_debuginfo.h 2008-10-15 16:38:03 UTC (rev 8674)
@@ -39,6 +39,9 @@
#include "pub_tool_debuginfo.h"
+/* Initialise the entire module. Must be called first of all. */
+extern void VG_(di_initialise) ( void );
+
/* LINUX: Notify the debuginfo system about a new mapping, or the
disappearance of such, or a permissions change on an existing
mapping. This is the way new debug information gets loaded. If
|
|
From: <sv...@va...> - 2008-10-15 16:35:18
|
Author: sewardj
Date: 2008-10-15 17:30:55 +0100 (Wed, 15 Oct 2008)
New Revision: 8673

Log:
OProfile is a really useful tool for profiling Valgrind, but it is a
fragile PITA and often doesn't work for completely non-obvious reasons.
This file contains some hints on how to get a working OProfile setup.

Added:
   branches/YARD/docs/internals/howto_oprofile.txt
Modified:
   branches/YARD/docs/internals/Makefile.am

Modified: branches/YARD/docs/internals/Makefile.am
===================================================================
--- branches/YARD/docs/internals/Makefile.am	2008-10-15 12:19:07 UTC (rev 8672)
+++ branches/YARD/docs/internals/Makefile.am	2008-10-15 16:30:55 UTC (rev 8673)
@@ -3,6 +3,7 @@
 	3_2_BUGSTATUS.txt 3_3_BUGSTATUS.txt \
 	darwin-notes.txt darwin-syscalls.txt \
 	directory-structure.txt \
+	howto_oprofile.txt \
 	m_replacemalloc.txt \
 	m_syswrap.txt module-structure.txt notes.txt porting-HOWTO.txt \
 	mpi2entries.txt \

Added: branches/YARD/docs/internals/howto_oprofile.txt
===================================================================
--- branches/YARD/docs/internals/howto_oprofile.txt	                        (rev 0)
+++ branches/YARD/docs/internals/howto_oprofile.txt	2008-10-15 16:30:55 UTC (rev 8673)
@@ -0,0 +1,41 @@
+
+# Note that you must do all the following as root (I believe).
+# Although the program to be profiled can be run by anybody.
+
+# start the profiler
+opcontrol --stop ; opcontrol --reset ; opcontrol --callgraph=5 --start
+
+# now run the program(s) to be profiled
+
+# stop the profiler and dump results to .. um .. some file somewhere
+opcontrol --stop ; opcontrol --dump
+
+# produce a flat profile
+opreport --merge=tgid --symbols -x \
+   /home/sewardj/VgTRUNK/hgdev/Inst/lib/valgrind/x86-linux/helgrind \
+   | less
+
+# produce a profile w/ callgraph
+opreport --merge=tgid --callgraph \
+   /home/sewardj/VgTRUNK/hgdev/Inst/lib/valgrind/x86-linux/helgrind \
+   | less
+
+#### notes.
+
+1. on amd64, need to build V with -fno-omit-frame-pointer, else the
+   w/ callgraph profiles are useless.  (oprofile doesn't do CFI based
+   stack unwinding, I guess).  Add -fno-omit-frame-pointer to
+   AM_CFLAGS_BASE in Makefile.flags.am, and rebuild from clean.
+
+2. even at the best of times the callgraph profiles seem pretty
+   flaky to me.
+
+3. Even oprofile 0.9.4 (the latest) on amd64-linux doesn't work
+   for callgraph profiling.  There is however a patch that
+   makes it work.  See
+
+   http://sourceforge.net/tracker/index.php?func=detail&aid=1685267&group_id=16191&atid=116191
+
+   for details.  Even then it sometimes fails at the "opcontrol
+   --dump" phase, complaining that the daemon died (or something like
+   that).  But apart from that, it seems usable.
|
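[Editorial note: the opcontrol/opreport steps in the howto above can be wrapped into a single script. The following is a sketch, assuming the legacy oprofile 0.9.x `opcontrol`/`opreport` interface described in the howto; `TOOL` is a hypothetical placeholder for the tool binary to report on. By default it only echoes the commands (`DRY_RUN=1`); set `DRY_RUN=0` and run as root to actually profile.]

```shell
#!/bin/sh
# Sketch wrapping the opcontrol/opreport steps from the howto above.
# Assumptions: legacy oprofile 0.9.x opcontrol/opreport, as in the
# howto; TOOL is a hypothetical placeholder for the binary to report
# on.  DRY_RUN=1 (the default) only prints each command; set
# DRY_RUN=0 and run as root to actually profile.
DRY_RUN=${DRY_RUN:-1}
TOOL=${TOOL:-/path/to/valgrind/tool-binary}

run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

# start the profiler, recording callgraphs to depth 5
run opcontrol --stop
run opcontrol --reset
run opcontrol --callgraph=5 --start

# ... run the program(s) to be profiled here ...

# stop the profiler and dump the collected samples
run opcontrol --stop
run opcontrol --dump

# flat profile of the tool binary, per-thread samples merged by
# thread group id
run opreport --merge=tgid --symbols -x "$TOOL"
```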
|
From: Julian S. <js...@ac...> - 2008-10-15 16:31:35
|
The server is running again.  It appears there was a power cut to the
machine about 4am Tuesday, resulting in corruption of the repo
databases.  A bit of "svnadmin recover"-ing fixed it.

J

On Wednesday 15 October 2008, Julian Seward wrote:
> The svn server is currently not running, having failed with some
> strange messages from the svn server process presumably at some point
> in the past 24 hours or so.  I'm not sure what the problem is, but
> am trying to fix it.
>
> J
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's
> challenge Build the coolest Linux based applications with Moblin SDK & win
> great prizes Grand prize is a trip for two to an Open Source event anywhere
> in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
|
|
From: Julian S. <js...@ac...> - 2008-10-15 16:27:11
|
> 1) If we could ask GCC guys to generate debugging information telling
> where each inline function is, would that help Valgrind to intercept the
> calls?
We need to see the entry and exit of these functions, along with the
arguments and return value. This is all very difficult because the
point of inlining is only partially to avoid a function call. More
important is that the compiler can then transform the merged caller-callee
pair arbitrarily, as it wants. This means there may not really be any
well defined boundary between the caller and callee when it's done.
That's all a bit abstract. Example. Before:
void foo ( int x ) {
   if (x) { A; } else { B; }
}
void bar ( int x ) {
   foo(x);
   C;
}
The "obvious" result of inlining is
void bar ( int x ) {
   if (x) { A; } else { B; }
   C;
}
But suppose C is small enough to duplicate; or for whatever reason, placing
it directly after A and B is beneficial.  Then this might be the result:
void bar ( int x ) {
   if (x) { A; C; } else { B; C; }
}
Now you really have to mark _two_ different exit points from the inlined
"foo". etc, etc.
In the general case I don't think this idea is likely to work, unfortunately.
A simpler solution would be simply not to inline this stuff.
Maybe the gcc people can make a better suggestion.
> 2) Julian said that detecting locking primitives using only instructions
> is too complex, maybe impossible. Well, but as far as I understood, you
> are assuming a "general locking primitives detector". What if we limit
> this problem only to the locking primitives present in the libgomp?
> Would that be easier to do? (Of course it has a down side because every
> time the libgomp changed, we would have to change Valgrind too... But I
> think it's a valid question anyway)
A general locking primitives detector would be really useful, although
(as Bart suggests) maybe impossible. Maybe it is equivalent to solving the
halting problem. I don't know.
From my brief investigation of the libgomp primitives, they are the same or
similar to that which libpthread uses. So a solution to libgomp would also
allow us to see inside libpthread, which would be good. But to be honest,
overall I simply don't understand enough about the problem at this point to
answer this question properly.
J
|
|
From: Sérgio D. J. <ser...@li...> - 2008-10-15 14:47:54
|
Hello Julian and Bart,

On Sun, 2008-10-12 at 10:48 +0200, Julian Seward wrote:
> If you can think of a solution to this ...

So, based on what you two said to me (by the way, thanks for all the
explanation), I have two questions.

1) If we could ask GCC guys to generate debugging information telling
where each inline function is, would that help Valgrind to intercept
the calls?

2) Julian said that detecting locking primitives using only
instructions is too complex, maybe impossible.  Well, as far as I
understood, you are assuming a "general locking primitives detector".
What if we limit this problem only to the locking primitives present
in libgomp?  Would that be easier to do?  (Of course it has a down
side, because every time libgomp changed we would have to change
Valgrind too... but I think it's a valid question anyway.)

So, basically that's it.  Regarding the first question, it'd be good
to know how much of the debugging information available in a binary
Valgrind can use.

Thanks :-)

--
Sérgio Durigan Júnior
Linux on Power Toolchain - Software Engineer
Linux Technology Center - LTC
IBM Brazil
|
|
From: <sv...@va...> - 2008-10-15 14:08:04
|
Author: sewardj
Date: 2008-10-15 13:19:07 +0100 (Wed, 15 Oct 2008)
New Revision: 8672

Log:
Test commit, to see if post-commit email works again following recent
svn server problems.

Modified:
   branches/YARD/NEWS

Modified: branches/YARD/NEWS
===================================================================
--- branches/YARD/NEWS	2008-10-15 12:10:11 UTC (rev 8671)
+++ branches/YARD/NEWS	2008-10-15 12:19:07 UTC (rev 8672)
@@ -1,6 +1,6 @@
 Release 3.4.0 (???)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Changes include:

 - SSE3 insns now supported.  Changed CPUID output to be Core-2, so
   now it claims to be a Core 2 E6600.
|
|
From: <sv...@va...> - 2008-10-15 14:04:42
|
Author: sewardj
Date: 2008-10-15 13:10:11 +0100 (Wed, 15 Oct 2008)
New Revision: 8671
Log:
Print a reasonable error message (like Memcheck) when running out of
memory, instead of asserting.
Modified:
branches/YARD/helgrind/libhb_core.c
Modified: branches/YARD/helgrind/libhb_core.c
===================================================================
--- branches/YARD/helgrind/libhb_core.c 2008-10-13 19:22:35 UTC (rev 8670)
+++ branches/YARD/helgrind/libhb_core.c 2008-10-15 12:10:11 UTC (rev 8671)
@@ -358,6 +358,9 @@
VG_(printf)("XXXXX bigchunk: abandoning %d bytes\n",
(Int)(shmem__bigchunk_end1 - shmem__bigchunk_next));
shmem__bigchunk_next = VG_(am_shadow_alloc)( sHMEM__BIGCHUNK_SIZE );
+ if (shmem__bigchunk_next == NULL)
+ VG_(out_of_memory_NORETURN)(
+ "helgrind:shmem__bigchunk_alloc", sHMEM__BIGCHUNK_SIZE );
shmem__bigchunk_end1 = shmem__bigchunk_next + sHMEM__BIGCHUNK_SIZE;
}
tl_assert(shmem__bigchunk_next);
|
|
From: Julian S. <js...@ac...> - 2008-10-15 03:14:32
|
The svn server is currently not running, having failed with some
strange messages from the svn server process presumably at some point
in the past 24 hours or so.  I'm not sure what the problem is, but
am trying to fix it.

J
|