From: Nicholas N. <n.n...@gm...> - 2010-12-06 22:32:54
On Tue, Dec 7, 2010 at 7:59 AM, Bruce Merry <bm...@gm...> wrote:
>
> The intended use is to be able to visually identify poor memory access
> patterns. Cachegrind is all well and good for telling you that one
> small area of code is hammering the cache, but there are a lot of
> things it doesn't show so clearly. For example, code that simply has
> to iterate across megabytes of data but has a nice linear access
> pattern will show up much hotter in cachegrind than code that touches
> one byte in each of a few hundred cache lines, even though the latter
> is the only one you can do much about. I've already used it to spot
> several places where data structure layout can be improved in a
> performance-critical application.

Fair enough, I can believe that would be useful.

> Perhaps at this stage a more appropriate way to advertise this to
> interested parties would be with a link on
> http://valgrind.org/downloads/variants.html? Obviously I'd need to
> actually write a web page for it to link to first.

Yes, I'd be happy to put a link there once you have something to link to.

Nick
From: Bruce M. <bm...@gm...> - 2010-12-06 20:59:11
On 6 December 2010 05:16, Nicholas Nethercote <n.n...@gm...> wrote:
> On Mon, Dec 6, 2010 at 2:02 AM, Bruce Merry <bm...@gm...> wrote:
>>
>> I've been developing a Valgrind tool for logging addresses of data
>> accesses (with a more compact binary format than the lackey output)
>> and a corresponding graphical viewer. I'd like to find out whether
>> there is interest from the Valgrind developer community for this to be
>> incorporated into Valgrind at some point.
>
> First question: what's it useful for? I see some pictures of sorting
> algorithms which are pretty but don't seem terribly useful...

Hi

To some extent, the sorting is more pretty than useful, although there
may be some applications in reverse engineering (I only thought of
running it on sorting algorithms because one of my first tests of the
viewer was running it on "ls", and when I looked at the output I could
clearly identify a merge sort).

The intended use is to be able to visually identify poor memory access
patterns. Cachegrind is all well and good for telling you that one
small area of code is hammering the cache, but there are a lot of
things it doesn't show so clearly. For example, code that simply has
to iterate across megabytes of data but has a nice linear access
pattern will show up much hotter in cachegrind than code that touches
one byte in each of a few hundred cache lines, even though the latter
is the only one you can do much about. I've already used it to spot
several places where data structure layout can be improved in a
performance-critical application.

> After that, it would need good docs, preferably some reasonable number
> of tests, and a commitment from you to maintain it.

Useful info, thanks. I've got some docs written - they're probably not
fully up to date at this point, but they're reasonably detailed and
written in the DocBook framework that the rest of the Valgrind docs
use. I haven't gotten around to writing tests yet.

In the short term I wouldn't be able to make any commitments about
maintaining it, since I have my fingers in too many pies at the moment,
but that may change by the time I get around to writing tests and
fixing up the known bugs. Perhaps at this stage a more appropriate way
to advertise this to interested parties would be with a link on
http://valgrind.org/downloads/variants.html? Obviously I'd need to
actually write a web page for it to link to first.

Thanks
Bruce

--
Dr Bruce Merry
bmerry <@> gmail <.> com
http://www.brucemerry.org.za/
http://blog.brucemerry.org.za/
From: <sv...@va...> - 2010-12-06 11:40:14
Author: sewardj
Date: 2010-12-06 11:40:04 +0000 (Mon, 06 Dec 2010)
New Revision: 11483
Log:
New command line option: --trace-children-skip-by-arg, which allows
chase/nochase decisions for child processes to be made on the basis
of their argv[] entries rather than on the name of their executables.
Modified:
trunk/coregrind/m_main.c
trunk/coregrind/m_options.c
trunk/coregrind/m_syswrap/syswrap-generic.c
trunk/coregrind/pub_core_options.h
trunk/docs/xml/manual-core.xml
trunk/none/tests/cmdline1.stdout.exp
trunk/none/tests/cmdline2.stdout.exp
Modified: trunk/coregrind/m_main.c
===================================================================
--- trunk/coregrind/m_main.c 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/coregrind/m_main.c 2010-12-06 11:40:04 UTC (rev 11483)
@@ -124,6 +124,9 @@
" --trace-children=no|yes Valgrind-ise child processes (follow execve)? [no]\n"
" --trace-children-skip=patt1,patt2,... specifies a list of executables\n"
" that --trace-children=yes should not trace into\n"
+" --trace-children-skip-by-arg=patt1,patt2,... same as --trace-children-skip=\n"
+" but check the argv[] entries for children, rather\n"
+" than the exe name, to make a follow/no-follow decision\n"
" --child-silent-after-fork=no|yes omit child output between fork & exec? [no]\n"
" --track-fds=no|yes track open file descriptors? [no]\n"
" --time-stamp=no|yes add timestamps to log messages? [no]\n"
@@ -503,7 +506,10 @@
else if VG_BOOL_CLO(arg, "--dsymutil", VG_(clo_dsymutil)) {}
- else if VG_STR_CLO (arg, "--trace-children-skip", VG_(clo_trace_children_skip)) {}
+ else if VG_STR_CLO (arg, "--trace-children-skip",
+ VG_(clo_trace_children_skip)) {}
+ else if VG_STR_CLO (arg, "--trace-children-skip-by-arg",
+ VG_(clo_trace_children_skip_by_arg)) {}
else if VG_BINT_CLO(arg, "--vex-iropt-verbosity",
VG_(clo_vex_control).iropt_verbosity, 0, 10) {}
Modified: trunk/coregrind/m_options.c
===================================================================
--- trunk/coregrind/m_options.c 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/coregrind/m_options.c 2010-12-06 11:40:04 UTC (rev 11483)
@@ -57,6 +57,7 @@
Bool VG_(clo_demangle) = True;
Bool VG_(clo_trace_children) = False;
HChar* VG_(clo_trace_children_skip) = NULL;
+HChar* VG_(clo_trace_children_skip_by_arg) = NULL;
Bool VG_(clo_child_silent_after_fork) = False;
Char* VG_(clo_log_fname_expanded) = NULL;
Char* VG_(clo_xml_fname_expanded) = NULL;
@@ -255,9 +256,13 @@
}
/* Should we trace into this child executable (across execve etc) ?
- This involves considering --trace-children=, --trace-children-skip=
- and the name of the executable. */
-Bool VG_(should_we_trace_this_child) ( HChar* child_exe_name )
+ This involves considering --trace-children=,
+ --trace-children-skip=, --trace-children-skip-by-arg=, and the name
+ of the executable. 'child_argv' must not include the name of the
+ executable itself; iow child_argv[0] must be the first arg, if any,
+ for the child. */
+Bool VG_(should_we_trace_this_child) ( HChar* child_exe_name,
+ HChar** child_argv )
{
// child_exe_name is pulled out of the guest's space. We
// should be at least marginally cautious with it, lest it
@@ -265,13 +270,13 @@
if (child_exe_name == NULL || VG_(strlen)(child_exe_name) == 0)
return VG_(clo_trace_children); // we know narfink
- // the main logic
// If --trace-children=no, the answer is simply NO.
if (! VG_(clo_trace_children))
return False;
- // otherwise, return True, unless the exe name matches any of the
- // patterns specified by --trace-children-skip=.
+ // Otherwise, look for other reasons to say NO. First,
+ // see if the exe name matches any of the patterns specified
+ // by --trace-children-skip=.
if (VG_(clo_trace_children_skip)) {
HChar const* last = VG_(clo_trace_children_skip);
HChar const* name = (HChar const*)child_exe_name;
@@ -294,7 +299,36 @@
return False;
}
}
-
+
+ // Check if any of the args match any of the patterns specified
+ // by --trace-children-skip-by-arg=.
+ if (VG_(clo_trace_children_skip_by_arg) && child_argv != NULL) {
+ HChar const* last = VG_(clo_trace_children_skip_by_arg);
+ while (*last) {
+ Int i;
+ Bool matches;
+ HChar* patt;
+ HChar const* first = consume_commas(last);
+ last = consume_field(first);
+ if (first == last)
+ break;
+ vg_assert(last > first);
+ /* copy the candidate string into a temporary malloc'd block
+ so we can use VG_(string_match) on it. */
+ patt = VG_(calloc)("m_options.swttc.1", last - first + 1, 1);
+ VG_(memcpy)(patt, first, last - first);
+ vg_assert(patt[last-first] == 0);
+ for (i = 0; child_argv[i]; i++) {
+ matches = VG_(string_match)(patt, child_argv[i]);
+ if (matches) {
+ VG_(free)(patt);
+ return False;
+ }
+ }
+ VG_(free)(patt);
+ }
+ }
+
// --trace-children=yes, and this particular executable isn't
// excluded
return True;
Modified: trunk/coregrind/m_syswrap/syswrap-generic.c
===================================================================
--- trunk/coregrind/m_syswrap/syswrap-generic.c 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/coregrind/m_syswrap/syswrap-generic.c 2010-12-06 11:40:04 UTC (rev 11483)
@@ -2550,8 +2550,29 @@
return;
}
+ // debug-only printing
+ if (0) {
+ VG_(printf)("ARG1 = %p(%s)\n", (void*)ARG1, (HChar*)ARG1);
+ if (ARG2) {
+ VG_(printf)("ARG2 = ");
+ Int q;
+ HChar** vec = (HChar**)ARG2;
+ for (q = 0; vec[q]; q++)
+ VG_(printf)("%p(%s) ", vec[q], vec[q]);
+ VG_(printf)("\n");
+ } else {
+ VG_(printf)("ARG2 = null\n");
+ }
+ }
+
// Decide whether or not we want to follow along
- trace_this_child = VG_(should_we_trace_this_child)( (HChar*)ARG1 );
+ { // Make 'child_argv' be a pointer to the child's arg vector
+ // (skipping the exe name)
+ HChar** child_argv = (HChar**)ARG2;
+ if (child_argv && child_argv[0] == NULL)
+ child_argv = NULL;
+ trace_this_child = VG_(should_we_trace_this_child)( (HChar*)ARG1, child_argv );
+ }
// Do the important checks: it is a file, is executable, permissions are
// ok, etc. We allow setuid executables to run only in the case when
Modified: trunk/coregrind/pub_core_options.h
===================================================================
--- trunk/coregrind/pub_core_options.h 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/coregrind/pub_core_options.h 2010-12-06 11:40:04 UTC (rev 11483)
@@ -70,6 +70,10 @@
/* String containing comma-separated patterns for executable names
that should not be traced into even when --trace-children=yes */
extern HChar* VG_(clo_trace_children_skip);
+/* The same as VG_(clo_trace_children), except that these patterns are
+ tested against the arguments for child processes, rather than the
+ executable name. */
+extern HChar* VG_(clo_trace_children_skip_by_arg);
/* After a fork, the child's output can become confusingly
intermingled with the parent's output. This is especially
problematic when VG_(clo_xml) is True. Setting
@@ -220,9 +224,13 @@
extern Bool VG_(clo_dsymutil);
/* Should we trace into this child executable (across execve etc) ?
- This involves considering --trace-children=, --trace-children-skip=
- and the name of the executable. */
-extern Bool VG_(should_we_trace_this_child) ( HChar* child_exe_name );
+ This involves considering --trace-children=,
+ --trace-children-skip=, --trace-children-skip-by-arg=, and the name
+ of the executable. 'child_argv' must not include the name of the
+ executable itself; iow child_argv[0] must be the first arg, if any,
+ for the child. */
+extern Bool VG_(should_we_trace_this_child) ( HChar* child_exe_name,
+ HChar** child_argv );
#endif // __PUB_CORE_OPTIONS_H
Modified: trunk/docs/xml/manual-core.xml
===================================================================
--- trunk/docs/xml/manual-core.xml 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/docs/xml/manual-core.xml 2010-12-06 11:40:04 UTC (rev 11483)
@@ -663,7 +663,7 @@
<varlistentry id="opt.trace-children-skip" xreflabel="--trace-children-skip">
<term>
- <option><![CDATA[--trace-children-skip=patt1,patt2 ]]></option>
+ <option><![CDATA[--trace-children-skip=patt1,patt2,... ]]></option>
</term>
<listitem>
<para>This option only has an effect when
@@ -687,6 +687,20 @@
</listitem>
</varlistentry>
+ <varlistentry id="opt.trace-children-skip-by-arg"
+ xreflabel="--trace-children-skip-by-arg">
+ <term>
+ <option><![CDATA[--trace-children-skip-by-arg=patt1,patt2,... ]]></option>
+ </term>
+ <listitem>
+ <para>This is the same as
+ <option>--trace-children-skip</option>, with one difference:
+ the decision as to whether to trace into a child process is
+ made by examining the arguments to the child process, rather
+ than the name of its executable.</para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="opt.child-silent-after-fork"
xreflabel="--child-silent-after-fork">
<term>
Modified: trunk/none/tests/cmdline1.stdout.exp
===================================================================
--- trunk/none/tests/cmdline1.stdout.exp 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/none/tests/cmdline1.stdout.exp 2010-12-06 11:40:04 UTC (rev 11483)
@@ -12,6 +12,9 @@
--trace-children=no|yes Valgrind-ise child processes (follow execve)? [no]
--trace-children-skip=patt1,patt2,... specifies a list of executables
that --trace-children=yes should not trace into
+ --trace-children-skip-by-arg=patt1,patt2,... same as --trace-children-skip=
+ but check the argv[] entries for children, rather
+ than the exe name, to make a follow/no-follow decision
--child-silent-after-fork=no|yes omit child output between fork & exec? [no]
--track-fds=no|yes track open file descriptors? [no]
--time-stamp=no|yes add timestamps to log messages? [no]
Modified: trunk/none/tests/cmdline2.stdout.exp
===================================================================
--- trunk/none/tests/cmdline2.stdout.exp 2010-12-06 11:11:29 UTC (rev 11482)
+++ trunk/none/tests/cmdline2.stdout.exp 2010-12-06 11:40:04 UTC (rev 11483)
@@ -12,6 +12,9 @@
--trace-children=no|yes Valgrind-ise child processes (follow execve)? [no]
--trace-children-skip=patt1,patt2,... specifies a list of executables
that --trace-children=yes should not trace into
+ --trace-children-skip-by-arg=patt1,patt2,... same as --trace-children-skip=
+ but check the argv[] entries for children, rather
+ than the exe name, to make a follow/no-follow decision
--child-silent-after-fork=no|yes omit child output between fork & exec? [no]
--track-fds=no|yes track open file descriptors? [no]
--time-stamp=no|yes add timestamps to log messages? [no]
From: <sv...@va...> - 2010-12-06 11:11:39
Author: sewardj
Date: 2010-12-06 11:11:29 +0000 (Mon, 06 Dec 2010)
New Revision: 11482
Log:
Minor improvements to PDB reading:
* better progress messages, to make it clear that reading of a
PDB is finished, and how much stuff was read from it
* don't mmap PDB files to read them -- instead use VG_(read).
This is because CIFS filesystem mounting only works reliably on
Linux when mounted with option '-o directio', and that
disallows mmap-ing files.
Modified:
trunk/coregrind/m_debuginfo/debuginfo.c
Modified: trunk/coregrind/m_debuginfo/debuginfo.c
===================================================================
--- trunk/coregrind/m_debuginfo/debuginfo.c 2010-12-06 11:05:29 UTC (rev 11481)
+++ trunk/coregrind/m_debuginfo/debuginfo.c 2010-12-06 11:11:29 UTC (rev 11482)
@@ -914,8 +914,8 @@
if (VG_(clo_verbosity) > 0) {
VG_(message)(Vg_UserMsg, "\n");
VG_(message)(Vg_UserMsg,
- "LOAD_PDB_DEBUGINFO(fd=%d, avma=%#lx, total_size=%lu, "
- "uu_reloc=%#lx)\n",
+ "LOAD_PDB_DEBUGINFO: clreq: fd=%d, avma=%#lx, total_size=%lu, "
+ "uu_reloc=%#lx\n",
fd_obj, avma_obj, total_size, unknown_purpose__reloc
);
}
@@ -1056,16 +1056,39 @@
goto out;
}
- /* Looks promising; go on to try and read stuff from it. */
+ /* Looks promising; go on to try and read stuff from it. But don't
+ mmap the file. Instead mmap free space and read the file into
+ it. This is because files on CIFS filesystems that are mounted
+ '-o directio' can't be mmap'd, and that mount option is needed
+ to make CIFS work reliably. (See
+ http://www.nabble.com/Corrupted-data-on-write-to-
+ Windows-2003-Server-t2782623.html)
+ This is slower, but at least it works reliably. */
fd_pdbimage = sr_Res(sres);
n_pdbimage = stat_buf.size;
- sres = VG_(am_mmap_file_float_valgrind)( n_pdbimage, VKI_PROT_READ,
- fd_pdbimage, 0 );
+ if (n_pdbimage == 0 || n_pdbimage > 0x7FFFFFFF) {
+ // 0x7FFFFFFF: why? Because the VG_(read) just below only
+ // can deal with a signed int as the size of data to read,
+ // so we can't reliably check for read failure for files
+ // greater than that size. Hence just skip them; we're
+ // unlikely to encounter a PDB that large anyway.
+ VG_(close)(fd_pdbimage);
+ goto out;
+ }
+ sres = VG_(am_mmap_anon_float_valgrind)( n_pdbimage );
if (sr_isError(sres)) {
VG_(close)(fd_pdbimage);
goto out;
}
+ void* pdbimage = (void*)sr_Res(sres);
+ r = VG_(read)( fd_pdbimage, pdbimage, (Int)n_pdbimage );
+ if (r < 0 || r != (Int)n_pdbimage) {
+ VG_(am_munmap_valgrind)( (Addr)pdbimage, n_pdbimage );
+ VG_(close)(fd_pdbimage);
+ goto out;
+ }
+
if (VG_(clo_verbosity) > 0)
VG_(message)(Vg_UserMsg, "LOAD_PDB_DEBUGINFO: pdbname: %s\n", pdbname);
@@ -1075,8 +1098,7 @@
/* dump old info for this range, if any */
discard_syms_in_range( avma_obj, total_size );
- { void* pdbimage = (void*)sr_Res(sres);
- DebugInfo* di = find_or_create_DebugInfo_for(exename, NULL/*membername*/ );
+ { DebugInfo* di = find_or_create_DebugInfo_for(exename, NULL/*membername*/ );
/* this di must be new, since we just nuked any old stuff in the range */
vg_assert(di && !di->have_rx_map && !di->have_rw_map);
@@ -1091,6 +1113,12 @@
vg_assert(di->have_dinfo); // fails if PDB read failed
VG_(am_munmap_valgrind)( (Addr)pdbimage, n_pdbimage );
VG_(close)(fd_pdbimage);
+
+ if (VG_(clo_verbosity) > 0) {
+ VG_(message)(Vg_UserMsg, "LOAD_PDB_DEBUGINFO: done: "
+ "%lu syms, %lu src locs, %lu fpo recs\n",
+ di->symtab_used, di->loctab_used, di->fpo_size);
+ }
}
out:
From: <sv...@va...> - 2010-12-06 11:05:38
Author: sewardj
Date: 2010-12-06 11:05:29 +0000 (Mon, 06 Dec 2010)
New Revision: 11481
Log:
Add tests for ROUNDPD and ROUNDPS.
Modified:
trunk/none/tests/amd64/sse4-64.c
Modified: trunk/none/tests/amd64/sse4-64.c
===================================================================
--- trunk/none/tests/amd64/sse4-64.c 2010-12-06 10:56:09 UTC (rev 11480)
+++ trunk/none/tests/amd64/sse4-64.c 2010-12-06 11:05:29 UTC (rev 11481)
@@ -2551,6 +2551,518 @@
}
}
+/* ------------ ROUNDPD ------------ */
+
+void do_ROUNDPD_000 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundpd $0, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundpd $0, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPD_001 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundpd $1, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundpd $1, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPD_010 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundpd $2, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundpd $2, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPD_011 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundpd $3, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundpd $3, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+
+void test_ROUNDPD_w_immediate_rounding ( void )
+{
+ double vals[22];
+ Int i = 0;
+ vals[i++] = 0.0;
+ vals[i++] = -0.0;
+ vals[i++] = mkPosInf();
+ vals[i++] = mkNegInf();
+ vals[i++] = mkPosNan();
+ vals[i++] = mkNegNan();
+ vals[i++] = -1.3;
+ vals[i++] = -1.1;
+ vals[i++] = -0.9;
+ vals[i++] = -0.7;
+ vals[i++] = -0.50001;
+ vals[i++] = -0.49999;
+ vals[i++] = -0.3;
+ vals[i++] = -0.1;
+ vals[i++] = 0.1;
+ vals[i++] = 0.3;
+ vals[i++] = 0.49999;
+ vals[i++] = 0.50001;
+ vals[i++] = 0.7;
+ vals[i++] = 0.9;
+ vals[i++] = 1.1;
+ vals[i++] = 1.3;
+ assert(i == 22);
+
+ for (i = 0; i < sizeof(vals)/sizeof(vals[0]); i++) {
+ V128 src, dst;
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_000(False/*reg*/, &src, &dst);
+ printf("r roundpd_000 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_000(True/*mem*/, &src, &dst);
+ printf("m roundpd_000 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_001(False/*reg*/, &src, &dst);
+ printf("r roundpd_001 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_001(True/*mem*/, &src, &dst);
+ printf("m roundpd_001 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_010(False/*reg*/, &src, &dst);
+ printf("r roundpd_010 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_010(True/*mem*/, &src, &dst);
+ printf("m roundpd_010 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_011(False/*reg*/, &src, &dst);
+ printf("r roundpd_011 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 8);
+ memcpy(&src[8], &vals[(i+11)%22], 8);
+ do_ROUNDPD_011(True/*mem*/, &src, &dst);
+ printf("m roundpd_011 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %10f -> %10f", vals[i], *(double*)(&dst[0]));
+ printf(" %10f -> %10f", vals[(i+11)%22], *(double*)(&dst[8]));
+ printf("\n");
+ }
+}
+
+/* ------------ ROUNDPS ------------ */
+
+void do_ROUNDPS_000 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundps $0, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundps $0, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPS_001 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundps $1, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundps $1, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPS_010 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundps $2, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundps $2, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+void do_ROUNDPS_011 ( Bool mem, V128* src, /*OUT*/V128* dst )
+{
+ if (mem) {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "roundps $3, (%0), %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11"
+ );
+ } else {
+ __asm__ __volatile__(
+ "movupd (%1), %%xmm11" "\n\t"
+ "movupd (%0), %%xmm2" "\n\t"
+ "roundps $3, %%xmm2, %%xmm11" "\n\t"
+ "movupd %%xmm11, (%1)" "\n"
+ : /*OUT*/
+ : /*IN*/ "r"(src), "r"(dst)
+ : /*TRASH*/ "xmm11","xmm2"
+ );
+ }
+}
+
+
+void test_ROUNDPS_w_immediate_rounding ( void )
+{
+ float vals[22];
+ Int i = 0;
+ vals[i++] = 0.0;
+ vals[i++] = -0.0;
+ vals[i++] = mkPosInf();
+ vals[i++] = mkNegInf();
+ vals[i++] = mkPosNan();
+ vals[i++] = mkNegNan();
+ vals[i++] = -1.3;
+ vals[i++] = -1.1;
+ vals[i++] = -0.9;
+ vals[i++] = -0.7;
+ vals[i++] = -0.50001;
+ vals[i++] = -0.49999;
+ vals[i++] = -0.3;
+ vals[i++] = -0.1;
+ vals[i++] = 0.1;
+ vals[i++] = 0.3;
+ vals[i++] = 0.49999;
+ vals[i++] = 0.50001;
+ vals[i++] = 0.7;
+ vals[i++] = 0.9;
+ vals[i++] = 1.1;
+ vals[i++] = 1.3;
+ assert(i == 22);
+
+ for (i = 0; i < sizeof(vals)/sizeof(vals[0]); i++) {
+ V128 src, dst;
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_000(False/*reg*/, &src, &dst);
+ printf("r roundps_000 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_000(True/*mem*/, &src, &dst);
+ printf("m roundps_000 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_001(False/*reg*/, &src, &dst);
+ printf("r roundps_001 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_001(True/*mem*/, &src, &dst);
+ printf("m roundps_001 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_010(False/*reg*/, &src, &dst);
+ printf("r roundps_010 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_010(True/*mem*/, &src, &dst);
+ printf("m roundps_010 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_011(False/*reg*/, &src, &dst);
+ printf("r roundps_011 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+
+ randV128(&src);
+ randV128(&dst);
+ memcpy(&src[0], &vals[i], 4);
+ memcpy(&src[4], &vals[(i+5)%22], 4);
+ memcpy(&src[8], &vals[(i+11)%22], 4);
+ memcpy(&src[12], &vals[(i+17)%22], 4);
+ do_ROUNDPS_011(True/*mem*/, &src, &dst);
+ printf("m roundps_011 ");
+ showV128(&src);
+ printf(" ");
+ showV128(&dst);
+ printf(" %9f:%9f", vals[i], (double)*(float*)(&dst[0]));
+ printf(" %9f:%9f", vals[(i+5)%22], (double)*(float*)(&dst[4]));
+ printf(" %9f:%9f", vals[(i+11)%22], (double)*(float*)(&dst[8]));
+ printf(" %9f:%9f", vals[(i+17)%22], (double)*(float*)(&dst[12]));
+ printf("\n");
+ }
+}
+
+/* ------------ PTEST ------------ */
+
void test_PTEST ( void )
{
const Int ntests = 8;
@@ -2642,12 +3154,10 @@
//test_PMULDQ();
test_PMULLD();
test_PTEST();
- // ROUNDPD
- // ROUNDPS
- // ROUNDSD
- // ROUNDSS
test_ROUNDSD_w_immediate_rounding();
test_ROUNDSS_w_immediate_rounding();
+ test_ROUNDPD_w_immediate_rounding();
+ test_ROUNDPS_w_immediate_rounding();
// ------ SSE 4.2 ------
test_PCMPGTQ();
#else
From: <sv...@va...> - 2010-12-06 10:56:18
Author: sewardj
Date: 2010-12-06 10:56:09 +0000 (Mon, 06 Dec 2010)
New Revision: 11480
Log:
Speedups and fixes:
* (speedup) addMemEvent: generate inline code to check whether a
memory access is within 16k of the stack pointer, and if so
don't bother to call the helper
* (speedup) find_Block_containing: cache the most recently seen 2
blocks, and check new references in them first. This gives a
worthwhile speedup.
* (fix) at the end of the run, merge stats from un-freed blocks
back into APs. This fixes misleading stats that cause un-freed
blocks to appear to not have been accessed at all.
Modified:
trunk/exp-dhat/dh_main.c
Modified: trunk/exp-dhat/dh_main.c
===================================================================
--- trunk/exp-dhat/dh_main.c 2010-11-12 10:40:20 UTC (rev 11479)
+++ trunk/exp-dhat/dh_main.c 2010-12-06 10:56:09 UTC (rev 11480)
@@ -101,8 +101,33 @@
return 0;
}
+// 2-entry cache for find_Block_containing
+static Block* fbc_cache0 = NULL;
+static Block* fbc_cache1 = NULL;
+
+static UWord stats__n_fBc_cached = 0;
+static UWord stats__n_fBc_uncached = 0;
+static UWord stats__n_fBc_notfound = 0;
+
static Block* find_Block_containing ( Addr a )
{
+ if (LIKELY(fbc_cache0
+ && fbc_cache0->payload <= a
+ && a < fbc_cache0->payload + fbc_cache0->req_szB)) {
+ // found at 0
+ stats__n_fBc_cached++;
+ return fbc_cache0;
+ }
+ if (LIKELY(fbc_cache1
+ && fbc_cache1->payload <= a
+ && a < fbc_cache1->payload + fbc_cache1->req_szB)) {
+ // found at 1; swap 0 and 1
+ Block* tmp = fbc_cache0;
+ fbc_cache0 = fbc_cache1;
+ fbc_cache1 = tmp;
+ stats__n_fBc_cached++;
+ return fbc_cache0;
+ }
Block fake;
fake.payload = a;
fake.req_szB = 1;
@@ -110,12 +135,18 @@
UWord foundval = 1;
Bool found = VG_(lookupFM)( interval_tree,
&foundkey, &foundval, (UWord)&fake );
- if (!found)
+ if (!found) {
+ stats__n_fBc_notfound++;
return NULL;
+ }
tl_assert(foundval == 0); // we don't store vals in the interval tree
tl_assert(foundkey != 1);
Block* res = (Block*)foundkey;
tl_assert(res != &fake);
+ // put at the top position
+ fbc_cache1 = fbc_cache0;
+ fbc_cache0 = res;
+ stats__n_fBc_uncached++;
return res;
}
@@ -129,6 +160,7 @@
Bool found = VG_(delFromFM)( interval_tree,
NULL, NULL, (Addr)&fake );
tl_assert(found);
+ fbc_cache0 = fbc_cache1 = NULL;
}
@@ -250,8 +282,13 @@
/* 'bk' is retiring (being freed). Find the relevant APInfo entry for
it, which must already exist. Then, fold info from 'bk' into that
- entry. */
-static void retire_Block ( Block* bk )
+ entry. 'because_freed' is True if the block is retiring because
+ the client has freed it. If it is False then the block is retiring
+ because the program has finished, in which case we want to skip the
+ updates of the total blocks live etc for this AP, but still fold in
+ the access counts and histo data that have so far accumulated for
+ the block. */
+static void retire_Block ( Block* bk, Bool because_freed )
{
tl_assert(bk);
tl_assert(bk->ap);
@@ -271,44 +308,49 @@
VG_(printf)("ec %p api->c_by_l %llu bk->rszB %llu\n",
bk->ap, api->cur_bytes_live, (ULong)bk->req_szB);
- tl_assert(api->cur_blocks_live >= 1);
- tl_assert(api->cur_bytes_live >= bk->req_szB);
- api->cur_blocks_live--;
- api->cur_bytes_live -= bk->req_szB;
+ // update total blocks live etc for this AP
+ if (because_freed) {
+ tl_assert(api->cur_blocks_live >= 1);
+ tl_assert(api->cur_bytes_live >= bk->req_szB);
+ api->cur_blocks_live--;
+ api->cur_bytes_live -= bk->req_szB;
- api->deaths++;
+ api->deaths++;
- tl_assert(bk->allocd_at <= g_guest_instrs_executed);
- api->death_ages_sum += (g_guest_instrs_executed - bk->allocd_at);
+ tl_assert(bk->allocd_at <= g_guest_instrs_executed);
+ api->death_ages_sum += (g_guest_instrs_executed - bk->allocd_at);
+ // update global summary stats
+ tl_assert(g_cur_blocks_live > 0);
+ g_cur_blocks_live--;
+ tl_assert(g_cur_bytes_live >= bk->req_szB);
+ g_cur_bytes_live -= bk->req_szB;
+ }
+
+ // access counts
api->n_reads += bk->n_reads;
api->n_writes += bk->n_writes;
- // update global summary stats
- tl_assert(g_cur_blocks_live > 0);
- g_cur_blocks_live--;
- tl_assert(g_cur_bytes_live >= bk->req_szB);
- g_cur_bytes_live -= bk->req_szB;
-
// histo stuff. First, do state transitions for xsize/xsize_tag.
switch (api->xsize_tag) {
case Unknown:
tl_assert(api->xsize == 0);
- tl_assert(api->deaths == 1);
+ tl_assert(api->deaths == 1 || api->deaths == 0);
tl_assert(!api->histo);
api->xsize_tag = Exactly;
api->xsize = bk->req_szB;
if (0) VG_(printf)("api %p --> Exactly(%lu)\n", api, api->xsize);
// and allocate the histo
if (bk->histoW) {
- api->histo = VG_(malloc)("dh.main.retire_Block.1", api->xsize * sizeof(UInt));
+ api->histo = VG_(malloc)("dh.main.retire_Block.1",
+ api->xsize * sizeof(UInt));
VG_(memset)(api->histo, 0, api->xsize * sizeof(UInt));
}
break;
case Exactly:
- tl_assert(api->deaths > 1);
+ //tl_assert(api->deaths > 1);
if (bk->req_szB != api->xsize) {
if (0) VG_(printf)("api %p --> Mixed(%lu -> %lu)\n",
api, api->xsize, bk->req_szB);
@@ -323,7 +365,7 @@
break;
case Mixed:
- tl_assert(api->deaths > 1);
+ //tl_assert(api->deaths > 1);
break;
default:
@@ -392,6 +434,8 @@
g_max_bytes_live = g_cur_bytes_live;
g_max_blocks_live = g_cur_blocks_live;
}
+ if (delta > 0)
+ g_tot_bytes += delta;
// adjust total allocation size
if (delta > 0)
@@ -446,6 +490,7 @@
Bool present = VG_(addToFM)( interval_tree, (UWord)bk, (UWord)0/*no val*/);
tl_assert(!present);
+ fbc_cache0 = fbc_cache1 = NULL;
intro_Block(bk);
@@ -477,7 +522,7 @@
if (0) VG_(printf)(" FREE %p %llu\n",
p, g_guest_instrs_executed - bk->allocd_at);
- retire_Block(bk);
+ retire_Block(bk, True/*because_freed*/);
VG_(cli_free)( (void*)bk->payload );
delete_Block_starting_at( bk->payload );
@@ -559,6 +604,7 @@
Bool present
= VG_(addToFM)( interval_tree, (UWord)bk, (UWord)0/*no val*/);
tl_assert(!present);
+ fbc_cache0 = fbc_cache1 = NULL;
return p_new;
}
@@ -717,6 +763,12 @@
//--- Instrumentation ---//
//------------------------------------------------------------//
+#define binop(_op, _arg1, _arg2) IRExpr_Binop((_op),(_arg1),(_arg2))
+#define mkexpr(_tmp) IRExpr_RdTmp((_tmp))
+#define mkU32(_n) IRExpr_Const(IRConst_U32(_n))
+#define mkU64(_n) IRExpr_Const(IRConst_U64(_n))
+#define assign(_t, _e) IRStmt_WrTmp((_t), (_e))
+
static
void add_counter_update(IRSB* sbOut, Int n)
{
@@ -735,12 +787,9 @@
IRTemp t2 = newIRTemp(sbOut->tyenv, Ity_I64);
IRExpr* counter_addr = mkIRExpr_HWord( (HWord)&g_guest_instrs_executed );
- IRStmt* st1 = IRStmt_WrTmp(t1, IRExpr_Load(END, Ity_I64, counter_addr));
- IRStmt* st2 =
- IRStmt_WrTmp(t2,
- IRExpr_Binop(Iop_Add64, IRExpr_RdTmp(t1),
- IRExpr_Const(IRConst_U64(n))));
- IRStmt* st3 = IRStmt_Store(END, counter_addr, IRExpr_RdTmp(t2));
+ IRStmt* st1 = assign(t1, IRExpr_Load(END, Ity_I64, counter_addr));
+ IRStmt* st2 = assign(t2, binop(Iop_Add64, mkexpr(t1), mkU64(n)));
+ IRStmt* st3 = IRStmt_Store(END, counter_addr, mkexpr(t2));
addStmtToIRSB( sbOut, st1 );
addStmtToIRSB( sbOut, st2 );
@@ -748,7 +797,8 @@
}
static
-void addMemEvent(IRSB* sbOut, Bool isWrite, Int szB, IRExpr* addr )
+void addMemEvent(IRSB* sbOut, Bool isWrite, Int szB, IRExpr* addr,
+ Int goff_sp)
{
IRType tyAddr = Ity_INVALID;
HChar* hName = NULL;
@@ -756,6 +806,9 @@
IRExpr** argv = NULL;
IRDirty* di = NULL;
+ const Int THRESH = 4096 * 4; // somewhat arbitrary
+ const Int rz_szB = VG_STACK_REDZONE_SZB;
+
tyAddr = typeOfIRExpr( sbOut->tyenv, addr );
tl_assert(tyAddr == Ity_I32 || tyAddr == Ity_I64);
@@ -777,6 +830,42 @@
hName, VG_(fnptr_to_fnentry)( hAddr ),
argv );
+ /* Generate the guard condition: "(addr - (SP - RZ)) >u N", for
+ some arbitrary N. If that fails then addr is in the range (SP -
+ RZ .. SP + N - RZ). If N is smallish (a page?) then we can say
+ addr is within a page of SP and so can't possibly be a heap
+ access, and so can be skipped. */
+ IRTemp sp = newIRTemp(sbOut->tyenv, tyAddr);
+ addStmtToIRSB( sbOut, assign(sp, IRExpr_Get(goff_sp, tyAddr)));
+
+ IRTemp sp_minus_rz = newIRTemp(sbOut->tyenv, tyAddr);
+ addStmtToIRSB(
+ sbOut,
+ assign(sp_minus_rz,
+ tyAddr == Ity_I32
+ ? binop(Iop_Sub32, mkexpr(sp), mkU32(rz_szB))
+ : binop(Iop_Sub64, mkexpr(sp), mkU64(rz_szB)))
+ );
+
+ IRTemp diff = newIRTemp(sbOut->tyenv, tyAddr);
+ addStmtToIRSB(
+ sbOut,
+ assign(diff,
+ tyAddr == Ity_I32
+ ? binop(Iop_Sub32, addr, mkexpr(sp_minus_rz))
+ : binop(Iop_Sub64, addr, mkexpr(sp_minus_rz)))
+ );
+
+ IRTemp guard = newIRTemp(sbOut->tyenv, Ity_I1);
+ addStmtToIRSB(
+ sbOut,
+ assign(guard,
+ tyAddr == Ity_I32
+ ? binop(Iop_CmpLT32U, mkU32(THRESH), mkexpr(diff))
+ : binop(Iop_CmpLT64U, mkU64(THRESH), mkexpr(diff)))
+ );
+ di->guard = mkexpr(guard);
+
addStmtToIRSB( sbOut, IRStmt_Dirty(di) );
}
@@ -791,6 +880,8 @@
IRSB* sbOut;
IRTypeEnv* tyenv = sbIn->tyenv;
+ const Int goff_sp = layout->offset_SP;
+
// We increment the instruction count in two places:
// - just before any Ist_Exit statements;
// - just before the IRSB's end.
@@ -833,7 +924,8 @@
// Note also, endianness info is ignored. I guess
// that's not interesting.
addMemEvent( sbOut, False/*!isWrite*/,
- sizeofIRType(data->Iex.Load.ty), aexpr );
+ sizeofIRType(data->Iex.Load.ty),
+ aexpr, goff_sp );
}
break;
}
@@ -842,7 +934,8 @@
IRExpr* data = st->Ist.Store.data;
IRExpr* aexpr = st->Ist.Store.addr;
addMemEvent( sbOut, True/*isWrite*/,
- sizeofIRType(typeOfIRExpr(tyenv, data)), aexpr );
+ sizeofIRType(typeOfIRExpr(tyenv, data)),
+ aexpr, goff_sp );
break;
}
@@ -860,10 +953,10 @@
// than two cache lines in the simulation.
if (d->mFx == Ifx_Read || d->mFx == Ifx_Modify)
addMemEvent( sbOut, False/*!isWrite*/,
- dataSize, d->mAddr );
+ dataSize, d->mAddr, goff_sp );
if (d->mFx == Ifx_Write || d->mFx == Ifx_Modify)
addMemEvent( sbOut, True/*isWrite*/,
- dataSize, d->mAddr );
+ dataSize, d->mAddr, goff_sp );
} else {
tl_assert(d->mAddr == NULL);
tl_assert(d->mSize == 0);
@@ -884,8 +977,10 @@
dataSize = sizeofIRType(typeOfIRExpr(tyenv, cas->dataLo));
if (cas->dataHi != NULL)
dataSize *= 2; /* since it's a doubleword-CAS */
- addMemEvent( sbOut, False/*!isWrite*/, dataSize, cas->addr );
- addMemEvent( sbOut, True/*isWrite*/, dataSize, cas->addr );
+ addMemEvent( sbOut, False/*!isWrite*/,
+ dataSize, cas->addr, goff_sp );
+ addMemEvent( sbOut, True/*isWrite*/,
+ dataSize, cas->addr, goff_sp );
break;
}
@@ -895,12 +990,14 @@
/* LL */
dataTy = typeOfIRTemp(tyenv, st->Ist.LLSC.result);
addMemEvent( sbOut, False/*!isWrite*/,
- sizeofIRType(dataTy), st->Ist.LLSC.addr );
+ sizeofIRType(dataTy),
+ st->Ist.LLSC.addr, goff_sp );
} else {
/* SC */
dataTy = typeOfIRExpr(tyenv, st->Ist.LLSC.storedata);
addMemEvent( sbOut, True/*isWrite*/,
- sizeofIRType(dataTy), st->Ist.LLSC.addr );
+ sizeofIRType(dataTy),
+ st->Ist.LLSC.addr, goff_sp );
}
break;
}
@@ -919,7 +1016,13 @@
return sbOut;
}
+#undef binop
+#undef mkexpr
+#undef mkU32
+#undef mkU64
+#undef assign
+
//------------------------------------------------------------//
//--- Command line args ---//
//------------------------------------------------------------//
@@ -1008,10 +1111,20 @@
tl_assert(api->tot_bytes >= api->max_bytes_live);
if (api->deaths > 0) {
- VG_(umsg)("deaths: %'llu, at avg age %'llu\n",
- api->deaths,
- api->deaths == 0
- ? 0 : (api->death_ages_sum / api->deaths));
+ // Average Age at Death
+ ULong aad = api->deaths == 0
+ ? 0 : (api->death_ages_sum / api->deaths);
+ // AAD as a fraction of the total program lifetime (so far)
+ // measured in ten-thousand-ths (aad_frac_10k == 10000 means the
+ // complete lifetime of the program.
+ ULong aad_frac_10k
+ = g_guest_instrs_executed == 0
+ ? 0 : (10000ULL * aad) / g_guest_instrs_executed;
+ HChar buf[16];
+ show_N_div_100(buf, aad_frac_10k);
+ VG_(umsg)("deaths: %'llu, at avg age %'llu "
+ "(%s%% of prog lifetime)\n",
+ api->deaths, aad, buf );
} else {
VG_(umsg)("deaths: none (none of these blocks were freed)\n");
}
@@ -1154,6 +1267,22 @@
static void dh_fini(Int exit_status)
{
+ // Before printing statistics, we must harvest access counts for
+ // all the blocks that are still alive. Not doing so gives
+ // access ratios which are too low (zero, in the worst case)
+ // for such blocks, since the accesses that do get made will
+ // (if we skip this step) not get folded into the AP summaries.
+ UWord keyW, valW;
+ VG_(initIterFM)( interval_tree );
+ while (VG_(nextIterFM)( interval_tree, &keyW, &valW )) {
+ Block* bk = (Block*)keyW;
+ tl_assert(valW == 0);
+ tl_assert(bk);
+ retire_Block(bk, False/*!because_freed*/);
+ }
+ VG_(doneIterFM)( interval_tree );
+
+ // show results
VG_(umsg)("======== SUMMARY STATISTICS ========\n");
VG_(umsg)("\n");
VG_(umsg)("guest_insns: %'llu\n", g_guest_instrs_executed);
@@ -1192,6 +1321,16 @@
VG_(umsg)(" over far too many alloc points. I strongly suggest using\n");
VG_(umsg)(" --num-callers=4 or some such, to reduce the spreading.\n");
VG_(umsg)("\n");
+
+ if (VG_(clo_stats)) {
+ VG_(dmsg)(" dhat: find_Block_containing:\n");
+ VG_(dmsg)(" found: %'lu (%'lu cached + %'lu uncached)\n",
+ stats__n_fBc_cached + stats__n_fBc_uncached,
+ stats__n_fBc_cached,
+ stats__n_fBc_uncached);
+ VG_(dmsg)(" notfound: %'lu\n", stats__n_fBc_notfound);
+ VG_(dmsg)("\n");
+ }
}
@@ -1242,6 +1381,8 @@
VG_(track_post_mem_write) ( dh_handle_noninsn_write );
tl_assert(!interval_tree);
+ tl_assert(!fbc_cache0);
+ tl_assert(!fbc_cache1);
interval_tree = VG_(newFM)( VG_(malloc),
"dh.main.interval_tree.1",
|
From: Nicholas N. <n.n...@gm...> - 2010-12-06 05:17:21
On Mon, Dec 6, 2010 at 2:02 AM, Bruce Merry <bm...@gm...> wrote:
> I've been developing a Valgrind tool for logging addresses of data
> accesses (with a more compact binary format than the lackey output)
> and a corresponding graphical viewer. I'd like to find out whether
> there is interest from the Valgrind developer community for this to
> be incorporated into Valgrind at some point.

First question: what's it useful for?  I see some pictures of sorting
algorithms which are pretty but don't seem terribly useful...

After that, it would need good docs, preferably some reasonable number
of tests, and a commitment from you to maintain it.

Nick