|
From: Christian B. <bor...@de...> - 2013-01-08 03:20:03
|
valgrind revision: 13215
VEX revision: 2628
C compiler: gcc (GCC) 4.6.1 20110908 (Red Hat 4.6.1-9bb4)
Assembler: GNU assembler version 2.21.51.0.6-6bb6.fc15 20110118
C library: GNU C Library stable release version 2.14.1
uname -mrs: Linux 3.6.8-57.x.20121204-s390xperformance s390x
Vendor version: unknown
Nightly build on fedora390 ( Fedora 15 with devel libc/toolchain on z196 (s390x) )
Started at 2013-01-08 03:45:01 CET
Ended at 2013-01-08 04:20:08 CET
Results differ from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 621 tests, 3 stderr failures, 1 stdout failure, 6 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stderrB)
gdbserver_tests/mcclean_after_fork (stderrB)
gdbserver_tests/mcleak (stderrB)
gdbserver_tests/mcmain_pic (stderrB)
gdbserver_tests/mcvabits (stderrB)
gdbserver_tests/mssnapshot (stderrB)
memcheck/tests/linux/timerfd-syscall (stderr)
none/tests/s390x/test_clone (stdout)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc20_verifywrap (stderr)
=================================================
== Results from 24 hours ago ==
=================================================
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 621 tests, 3 stderr failures, 0 stdout failures, 6 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stderrB)
gdbserver_tests/mcclean_after_fork (stderrB)
gdbserver_tests/mcleak (stderrB)
gdbserver_tests/mcmain_pic (stderrB)
gdbserver_tests/mcvabits (stderrB)
gdbserver_tests/mssnapshot (stderrB)
memcheck/tests/linux/timerfd-syscall (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc20_verifywrap (stderr)
=================================================
== Difference between 24 hours ago and now ==
=================================================
*** old.short Tue Jan 8 04:02:29 2013
--- new.short Tue Jan 8 04:20:08 2013
***************
*** 8,10 ****
! == 621 tests, 3 stderr failures, 0 stdout failures, 6 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stderrB)
--- 8,10 ----
! == 621 tests, 3 stderr failures, 1 stdout failure, 6 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcbreak (stderrB)
***************
*** 16,17 ****
--- 16,18 ----
memcheck/tests/linux/timerfd-syscall (stderr)
+ none/tests/s390x/test_clone (stdout)
helgrind/tests/tc18_semabuse (stderr)
|
|
From: Christian B. <bor...@de...> - 2013-01-08 03:13:39
|
valgrind revision: 13215
VEX revision: 2628
C compiler: gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973]
Assembler: GNU assembler (GNU Binutils; SUSE Linux Enterprise 11) 2.21.1
C library: GNU C Library stable release version 2.11.3 (20110527)
uname -mrs: Linux 3.0.42-0.7-default s390x
Vendor version: Welcome to SUSE Linux Enterprise Server 11 SP2 (s390x) - Kernel %r (%t).
Nightly build on sless390 ( SUSE Linux Enterprise Server 11 SP1 gcc 4.3.4 on z196 (s390x) )
Started at 2013-01-08 03:45:01 CET
Ended at 2013-01-08 04:13:28 CET
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... done
Regression test results follow
== 620 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
|
|
From: Tom H. <to...@co...> - 2013-01-08 03:13:15
|
valgrind revision: 13215
VEX revision: 2628
C compiler: gcc (GCC) 4.7.2 20120921 (Red Hat 4.7.2-2)
Assembler: GNU assembler version 2.22.52.0.1-10.fc17 20120131
C library: GNU C Library stable release version 2.15
uname -mrs: Linux 3.5.3-1.fc17.x86_64 x86_64
Vendor version: Fedora release 17 (Beefy Miracle)
Nightly build on bristol ( x86_64, Fedora 17 (Beefy Miracle) )
Started at 2013-01-08 02:41:05 GMT
Ended at 2013-01-08 03:13:00 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 642 tests, 5 stderr failures, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
gdbserver_tests/mcinfcallRU (stderr)
gdbserver_tests/mcinfcallWSRU (stderr)
gdbserver_tests/mcmain_pic (stderr)
memcheck/tests/origin5-bz2 (stderr)
exp-sgcheck/tests/preen_invars (stdout)
exp-sgcheck/tests/preen_invars (stderr)
|
|
From: Tom H. <to...@co...> - 2013-01-08 03:04:05
|
valgrind revision: 13215
VEX revision: 2628
C compiler: gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-8)
Assembler: GNU assembler version 2.23.51.0.1-3.fc18 20120806
C library: GNU C Library stable release version 2.16
uname -mrs: Linux 3.5.3-1.fc17.x86_64 x86_64
Vendor version: Fedora release 18 (Spherical Cow)
Nightly build on bristol ( x86_64, Fedora 18 (Spherical Cow) )
Started at 2013-01-08 02:31:08 GMT
Ended at 2013-01-08 03:03:51 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 642 tests, 2 stderr failures, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/origin5-bz2 (stderr)
exp-sgcheck/tests/preen_invars (stdout)
exp-sgcheck/tests/preen_invars (stderr)
|
|
From: Tom H. <to...@co...> - 2013-01-08 02:23:58
|
valgrind revision: 13215
VEX revision: 2628
C compiler: gcc (GCC) 4.7.2 20121109 (Red Hat 4.7.2-9)
Assembler: GNU assembler version 2.23.51.0.8-2.fc19 20121218
C library: GNU C Library (GNU libc) stable release version 2.17
uname -mrs: Linux 3.5.3-1.fc17.x86_64 x86_64
Vendor version: Fedora release 19 (Rawhide)
Nightly build on bristol ( x86_64, Fedora 19 )
Started at 2013-01-08 02:23:13 GMT
Ended at 2013-01-08 02:23:48 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... failed
Last 20 lines of verbose log follow echo
checking for use as an inner Valgrind... no
checking for Pagesize... 4k
checking for shared memory alignment... 2*PAGE_SIZE
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking features.h usability... yes
checking features.h presence... yes
checking for features.h... yes
checking the GLIBC_VERSION version... unsupported version 2.17
configure: error: Valgrind requires glibc version 2.2 - 2.16
|
|
From: Florian K. <br...@ac...> - 2013-01-07 20:18:58
|
On 01/07/2013 03:03 AM, Julian Seward wrote:
> On Sunday, January 06, 2013, Florian Krohm wrote:
>
>> OK, let's do it then. Do you want to merge COMEM first ?
>
> Yes. (Unfortunately) I think it would be wise to merge COMEM
> first, and also to finish off and commit the guard-type change
> too. Without those in place, there is going to be a lot of
> conflicts with a name-and-arg-order change. I hope to get
> them done this week.
No rush here. I have a largish DFP patch waiting for me in BZ ...
> ITE as a name seems fine. w.r.t. the comment ..
>
>> + /* A ternary if-then-else operator. It returns iftrue if cond is
>> + zero, iffalse otherwise.
>
> s/zero/nonzero, I think.
Righto.
Florian
|
|
From: Timur I. <tim...@go...> - 2013-01-07 15:00:34
|
2013/1/7 Julian Seward <js...@ac...>:
>
>>> (== some measure of the cost of running the code). What numbers do you
>>> see?
>>
>> transtab: new 173,061 (4,555,266 -> 33,374,561; ratio 73:10) [0 scs]
>> scheduler: 79,073,144 event checks.
>
> That is quite a lot of translation work for a relatively small amount of
> execution.
>
> What optimisation level is your test code built at? -O0 is probably bad;
> -O1 works pretty well with Memcheck.
The main binary is -O1, Release build (i.e. fewer asserts).
Please note that a substantial part of the code executed is likely in
the system libraries (e.g. fontconfig).
> Also it might be useful to check if the JIT has any bad optimisation
> cases in your code. For the above test, rerun with
> --tool=none --profile-flags=10001000 and send the results (will be
> very large).
Done [sent off the list as it's 1MB which is larger than the 100KB limit].
Not sure how to read this profile though - can you please check?
|
|
From: Julian S. <js...@ac...> - 2013-01-07 13:11:10
|
>> (== some measure of the cost of running the code). What numbers do you see?
> transtab: new 173,061 (4,555,266 -> 33,374,561; ratio 73:10) [0 scs]
> scheduler: 79,073,144 event checks.
That is quite a lot of translation work for a relatively small amount of
execution.
What optimisation level is your test code built at? -O0 is probably bad;
-O1 works pretty well with Memcheck.
Also it might be useful to check if the JIT has any bad optimisation
cases in your code. For the above test, rerun with
--tool=none --profile-flags=10001000 and send the results (will be
very large).
J
|
|
From: Timur I. <tim...@go...> - 2013-01-07 12:51:16
|
Hi Julian,
Thanks for your detailed reply!
2013/1/7 Julian Seward <js...@ac...>:
>
>> This is roughly a 50x slowdown, I'd expect it to be much smaller when
>> doing little to no additional instrumentation.
>
> For no-instrumentation (--tool=none) I'd expect a slowdown in the range
> 3-4. But that's only in the steady state, when the cost of doing the
> JITting is small compared to the cost of running the generated code.
>
> It may be that the case you mentioned is the worst case -- JITting a huge
> amount of code (half of Chromium) and then doing very little work before
> exiting.
Yes, this was my reasoning too.
I wonder if it's possible to optimize this worst case.
We observe very slow startup of Chromium tests under Memcheck, and
startup takes ~50% of the run time on the build bots.
The particular DumpRenderTree test I mentioned before actually uses
little Chromium code (it is mostly a WebKit target), so it's not half of
Chromium, but it still loads very slowly.
> You can assess that by looking at these lines in the --stats=yes output:
>
> transtab: new 3,171 (69,078 -> 1,143,658; ratio 165:10) [0 scs]
>
> which shows how much code is translated (== some measure of the JIT cost),
> and
>
> scheduler: 82,744 event checks.
>
> which shows how many backwards branches are counted by the simulator
> (== some measure of the cost of running the code). What numbers do you see?
transtab: new 173,061 (4,555,266 -> 33,374,561; ratio 73:10) [0 scs]
scheduler: 79,073,144 event checks.
See the full stats attached.
>> Is there any way to optimize Valgrind for such usage?
> Do more useful work per program-start.
Unfortunately, sometimes this is not an option.
E.g. in browser security testing (generate random HTMLs until the
browser crashes, then minimize) we have to start a full browser, give it
a small HTML and restart if the browser crashes. Even if the browser
survives, it may be in a broken state, so we restart it anyway.
You might imagine 90% of our testing is wasted on startup when using
Valgrind :(
And Chromium's multi-process architecture makes it hard to use fork()
tricks to save the startup time...
>> What I want is basically to instrument only a couple of .so modules,
>> leaving anything else unchanged.
>
> Not doable for Memcheck/Helgrind -- they need to track the complete
> memory state from startup.
I totally understand this Memcheck restriction.
Interestingly, the memcheck startup+execution time is just 2x-3x more
than the startup+execution time of nonetool. I wonder if the nonetool
worst case optimization could improve the memcheck startup time by 1.5-2x?
Let me disagree with the Helgrind consideration, as data race detection
works totally fine when memory accesses are sampled, e.g. see [1] and
ThreadSanitizer. Of course, one still needs to see all the
synchronization in the app, but this is doable using more lightweight
methods (e.g. at compile time or at dynamic linking time).
Other tools may benefit from partial instrumentation too.
E.g. AddressSanitizer can be sped up by only instrumenting "interesting"
modules; if you had unaddr-only or leak-only Memcheck versions, they'd
probably benefit from sampling too.
[1] Daniel Marino, Madanlal Musuvathi, Satish Narayanasamy, LiteRace:
effective sampling for lightweight data-race detection, Proceedings of
the 2009 ACM SIGPLAN conference on Programming language design and
implementation, June 15-21, 2009, Dublin, Ireland
|
|
From: <sv...@va...> - 2013-01-07 11:17:54
|
sewardj 2013-01-07 11:17:43 +0000 (Mon, 07 Jan 2013)
New Revision: 2628
Log:
Remove unused function.
Modified files:
trunk/priv/ir_opt.c
Modified: trunk/priv/ir_opt.c (+0 -8)
===================================================================
--- trunk/priv/ir_opt.c 2013-01-03 23:34:18 +00:00 (rev 2627)
+++ trunk/priv/ir_opt.c 2013-01-07 11:17:43 +00:00 (rev 2628)
@@ -1151,14 +1151,6 @@
&& e->Iex.Const.con->Ico.U32 == 0xFFFFFFFF );
}
-/* Is this literally IRExpr_Const(IRConst_U64(0)) ? */
-static Bool isZeroU64 ( IRExpr* e )
-{
- return toBool( e->tag == Iex_Const
- && e->Iex.Const.con->tag == Ico_U64
- && e->Iex.Const.con->Ico.U64 == 0);
-}
-
/* Is this an integer constant with value 0 ? */
static Bool isZeroU ( IRExpr* e )
{
|
|
From: David B. <dav...@gm...> - 2013-01-07 11:11:12
|
On Mon, Jan 7, 2013 at 2:01 AM, John Reiser <jr...@bi...> wrote:
> On 01/06/2013 01:24 PM, David Bar wrote:
> > That's true for memcheck if interested in memory corruption issues.
> >
> > What about massif? It seems that if we only want memory measurements,
> > wouldn't it be enough to just handle malloc/free/etc.?
>
> If memory leaks matter then it is not obvious that just handling
> "malloc/free/etc" suffices. If all you want is
> {max,min,integral}(malloc - free)
> and massif doesn't perform as you like, then talk directly to massif.
>
I'm not using Massif as a memory leak detection tool (well, sometimes, but
usually not).
I want all the functionality Massif gives, to get detailed memory analysis,
not just the overall consumption.
However, I fail to see why translation and JITing are required
to do what Massif does.
I work in a large enterprise, with huge programs with hundreds of DLLs
being loaded. I can tell you that there are currently two projects going on
in my enterprise which mimic Massif functionality using only malloc/free
hooks. Yes, it's idiotic, especially with two such projects - I'm not
calling the shots here...
Valgrind is just deemed too slow and annoying to run, as we have to wait
for several minutes until the program starts to run, and then it
works sluggishly, which is also problematic for daemons which are expected
to respond to requests under a reasonable time limit.
I see the development of such other tools as a waste of time. Massif
already does a great job, and already has great infrastructure in place for
getting called on hooks, getting the backtrace along with debug symbols,
etc.
>
> > What if I want to run memcheck just to find memory leaks? Doesn't seem
> > to me that I need actual instrumentation here of all code, no?
>
> Show me the code, or at least explain in detail why this should be true.
> I don't believe the claim. In particular, I have programs which
> use stem+leaf storage for collections of pointers, and these programs
> require _more_ than 100% "observation" just to find memory leaks.
>
> > Even for the complete memcheck functionality, some use cases may live
> > well enough with just instrumenting a part of the program.
>
> Extremely unlikely. Sabotaging only log(n) of the memory references can
> totally invalidate nearly every program ever written.
>
> > Say I have a huge program, which contain tons of DLLs, most of them not
> > under my responsibility and/or I can't fix the bugs in them. If I just want
> > to memcheck my DLL, and am willing to live with the assumption that memory
> > my DLL allocates is only read/written by my DLL, it seems reasonable to only
> > translate and handle my DLL, no?
>
> If you are willing to assume that some DLL is an island unto itself,
> then why not test it that way? Also, unless the DLL has been through
> an actual theorem prover or an extensive battery of tests then I
> don't believe the claim "my DLL is an island."
>
>
I'm not saying the theoretical DLL in question is "an island". It is being
used by the application, and it uses various I/Ss from other DLLs.
I sometimes write new code in 1 or 2 DLLs that run within huge programs,
along with hundreds of other DLLs, which were already tested in the past.
It's not trivial to write a test program which uses my new code in the
specific DLL, and properly reinitialize all the other DLLs so that my DLL
would be able to use them.
If all I want to check is that my code behaved well on memory it allocated
and free'd, it seems reasonable to me to tell Valgrind to avoid all the
hard work on the other parts of the program. Yes, I understand, this will
not be accurate, but it may be good enough. For a large application it may
take minutes for the initialization to finish under Valgrind, which is
frustrating.
Perhaps memcheck isn't a good example. But I believe that Massif is, and
perhaps also Cachegrind. Not familiar enough with other Valgrind tools to
say if it may apply to them also.
Again, I know that cachegrind would give inaccurate results if it only sees
some of the instructions executed, but again, if I just want to check a new
fancy algorithm in my specific DLL, it is good enough.
You wanted a specific example - here's one - say I have implemented a table
which is supposed to be cache-efficient. All memory sitting on the table is
allocated within the examined DLL, and when my API is called I don't give
back pointers to the actual data, but only give out copies. And the same
for stores in the table - I copy the data to memory my DLL allocates, and
put pointers in the hash table to the copied data. Assume I provide
other functionality, such as iteration on the table, automatic expiration
of entries, etc.
I may not know the exact use pattern of my hash table by the application. I
could go and gather stats, and write a simulator, but why bother?
If I could just fire up cachegrind on the application, and it worked
fast, then I could quickly compare the cache behavior of my new
implementation with that of the old implementation of the same
functionality.
Perhaps it would be good enough if Valgrind kept a cache of code
after translation/JIT, and reused it if the checksum/timestamp
of the DLL hasn't changed. This would mean that only the first run would be
slow, and other runs would at least have fast initialization time.
Was caching ever considered?
|
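The caching idea in the message above hinges on deciding whether a saved translation still matches the DLL on disk. A minimal sketch of that validity check in C follows. It is purely illustrative: nothing in this thread says Valgrind has such a cache, and `ObjectIdentity`, `fnv1a32` and `cache_entry_usable` are hypothetical names chosen for the example. FNV-1a stands in for "the checksum" only because it is short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identity record stored next to cached translations. */
typedef struct {
    uint64_t size;      /* file size in bytes              */
    int64_t  mtime;     /* modification time (seconds)     */
    uint32_t checksum;  /* FNV-1a hash of the file content */
} ObjectIdentity;

/* 32-bit FNV-1a hash over a byte buffer. */
uint32_t fnv1a32(const unsigned char *bytes, size_t len)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    assert(bytes != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= bytes[i];
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* A cached translation is reusable only if the object is unchanged. */
bool cache_entry_usable(const ObjectIdentity *cached,
                        const ObjectIdentity *current)
{
    return cached->size == current->size
        && cached->mtime == current->mtime
        && cached->checksum == current->checksum;
}
```

Comparing size and timestamp first is cheap; the checksum guards against a rebuilt DLL that happens to keep both.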
|
From: Julian S. <js...@ac...> - 2013-01-07 09:07:53
|
> Say I have a huge program, which contain tons of DLLs, most of them not
> under my responsibility and/or I can't fix the bugs in them. If I just want
> to memcheck my DLL, and am willing to live with the assumption that memory
> my DLL allocates is only read/written by my DLL, it seems reasonable to
> only translate and handle my DLL, no?
There have been proposals along these lines in the past, but it seems
to be difficult to implement, and generally the perceived (at least by me)
benefit vs added-extra-complexity ratio hasn't made it something worth
chasing, compared to the other problems the system has.
IIRC Dragos Tatulea has some patches that can be used to tell Memcheck
"we're in trusted code now, don't check so much". See
https://bugs.kde.org/show_bug.cgi?id=301269
J
|
|
From: Julian S. <js...@ac...> - 2013-01-07 09:00:45
|
> This is roughly a 50x slowdown, I'd expect it to be much smaller when
> doing little to no additional instrumentation.
For no-instrumentation (--tool=none) I'd expect a slowdown in the range
3-4. But that's only in the steady state, when the cost of doing the
JITting is small compared to the cost of running the generated code.
It may be that the case you mentioned is the worst case -- JITting a huge
amount of code (half of Chromium) and then doing very little work before
exiting.
You can assess that by looking at these lines in the --stats=yes output:
transtab: new 3,171 (69,078 -> 1,143,658; ratio 165:10) [0 scs]
which shows how much code is translated (== some measure of the JIT cost),
and
scheduler: 82,744 event checks.
which shows how many backwards branches are counted by the simulator
(== some measure of the cost of running the code). What numbers do you see?
> Is there any way to optimize Valgrind for such usage?
Do more useful work per program-start.
> What I want is basically to instrument only a couple of .so modules,
> leaving anything else unchanged.
Not doable for Memcheck/Helgrind -- they need to track the complete
memory state from startup.
J
|
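As a rough illustration of the two measures described above, the following C sketch recomputes the expansion ratio and an event-checks-per-translation figure from the numbers quoted in this thread. It assumes that the two values in `transtab: new N (A -> B; ratio R:10)` are input and output code sizes and that R is 10*B/A truncated, which is consistent with both quoted lines. The struct and function names are invented for the example and are not part of Valgrind.

```c
#include <assert.h>

/* The figures from the two --stats=yes lines (names are illustrative). */
typedef struct {
    unsigned long long translations;  /* "transtab: new N"           */
    unsigned long long code_in;       /* left side of "A -> B"       */
    unsigned long long code_out;      /* right side of "A -> B"      */
    unsigned long long event_checks;  /* "scheduler: N event checks" */
} RunStats;

/* Output code size per 10 units of input code, truncated. */
unsigned long long expansion_per_10(const RunStats *s)
{
    assert(s->code_in > 0);
    return (10ULL * s->code_out) / s->code_in;
}

/* Execution work (event checks) per translation made, truncated. */
unsigned long long checks_per_translation(const RunStats *s)
{
    assert(s->translations > 0);
    return s->event_checks / s->translations;
}
```

With the figures quoted in the thread, the small example run gives 165 and 26, and the DumpRenderTree run gives 73 and 456. The second figure is only one possible way to relate JIT work to execution work.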
|
From: Julian S. <js...@ac...> - 2013-01-07 08:04:10
|
On Sunday, January 06, 2013, Florian Krohm wrote:
> >> I'm willing to make the change if you and/or other port maintainers are
> >> willing to double-check.
> >
> > Yes, I'm willing to double-check/review.
>
> OK, let's do it then. Do you want to merge COMEM first ?
Yes. (Unfortunately) I think it would be wise to merge COMEM
first, and also to finish off and commit the guard-type change
too. Without those in place, there is going to be a lot of
conflicts with a name-and-arg-order change. I hope to get
them done this week.
> How about this (with ITE meaning if-then-else)?
ITE as a name seems fine. w.r.t. the comment ..
> + /* A ternary if-then-else operator. It returns iftrue if cond is
> + zero, iffalse otherwise.
s/zero/nonzero, I think.
J
|
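To make the corrected wording concrete, here is a small C sketch of the semantics being discussed: the result is `iftrue` when the condition is nonzero, and the operator is strict, so both arms are evaluated regardless of the condition. This is a model in plain C, not VEX code. `ite_strict`, `operand` and the counting helpers are names made up for the illustration, and an `int` stands in for the guard.

```c
#include <assert.h>

/* Number of operand expressions evaluated so far. */
static int evaluations = 0;

/* Stand-in for evaluating an operand expression; counts each evaluation. */
static int operand(int value)
{
    evaluations++;
    return value;
}

/* Strict if-then-else: as an ordinary function call, both arms are
   evaluated before the choice is made. Returns iftrue if cond is nonzero. */
int ite_strict(int cond, int iftrue, int iffalse)
{
    return cond != 0 ? iftrue : iffalse;
}

/* Operand evaluations needed by the strict form. */
int count_strict(int cond)
{
    evaluations = 0;
    int r = ite_strict(cond, operand(7), operand(8));
    assert(r == (cond ? 7 : 8));
    return evaluations;
}

/* Operand evaluations needed by C's lazy ?: operator, for contrast. */
int count_lazy(int cond)
{
    evaluations = 0;
    int r = cond ? operand(7) : operand(8);
    assert(r == (cond ? 7 : 8));
    return evaluations;
}
```

The strict form always evaluates two operands, whereas C's own `?:` evaluates one. That is why the comment in the patch spells out strictness separately from the selection rule.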
|
From: John R. <jr...@bi...> - 2013-01-07 00:00:14
|
On 01/06/2013 01:24 PM, David Bar wrote:
> That's true for memcheck if interested in memory corruption issues.
DO NOT "TOP POST" IN A TECHNICAL DISCUSSION.
In a technical discussion the responses must be in chronological order
from top to bottom so that the flow of ideas is clear without requiring
back-and-forth scanning. "Top posting" (putting the reply above antecedent)
may be OK for some management discussions within a small group who know
each other personally and are having a quick conversation. In a technical
discussion among possibly many strangers whose language fluency may vary
a lot, and which may span a long time, including searches weeks and months
later, then top posting is a severe impediment.
>
> What about massif? It seems that if we only want memory measurements, wouldn't it be enough to just handle malloc/free/etc.?
If memory leaks matter then it is not obvious that just handling
"malloc/free/etc" suffices. If all you want is
{max,min,integral}(malloc - free)
and massif doesn't perform as you like, then talk directly to massif.
> What if I want to run memcheck just to find memory leaks? Doesn't seem to me that I need actual instrumentation here of all code, no?
Show me the code, or at least explain in detail why this should be true.
I don't believe the claim. In particular, I have programs which
use stem+leaf storage for collections of pointers, and these programs
require _more_ than 100% "observation" just to find memory leaks.
> Even for the complete memcheck functionality, some use cases may live well enough with just instrumenting a part of the program.
Extremely unlikely. Sabotaging only log(n) of the memory references can
totally invalidate nearly every program ever written.
>
> Say I have a huge program, which contain tons of DLLs, most of them not under my responsibility and/or I can't fix the bugs in them. If I just want to memcheck my DLL, and am willing to live with the assumption that memory my DLL allocates is only read/written by my DLL, it seems reasonable to only
> translate and handle my DLL, no?
If you are willing to assume that some DLL is an island unto itself,
then why not test it that way? Also, unless the DLL has been through
an actual theorem prover or an extensive battery of tests then I
don't believe the claim "my DLL is an island."
--
|
|
From: David B. <dav...@gm...> - 2013-01-06 21:24:10
|
John,
That's true for memcheck if interested in memory corruption issues.
What about massif? It seems that if we only want memory measurements,
wouldn't it be enough to just handle malloc/free/etc.?
What if I want to run memcheck just to find memory leaks? Doesn't seem to
me that I need actual instrumentation here of all code, no?
Even for the complete memcheck functionality, some use cases may live
well enough with just instrumenting a part of the program.
Say I have a huge program, which contain tons of DLLs, most of them not
under my responsibility and/or I can't fix the bugs in them. If I just
want to memcheck my DLL, and am willing to live with the assumption that
memory my DLL allocates is only read/written by my DLL, it seems
reasonable to only translate and handle my DLL, no?
On Sun, Jan 6, 2013 at 11:06 PM, John Reiser <jr...@bi...> wrote:
> > What I want is basically to instrument only a couple of .so modules,
> > leaving anything else unchanged.
>
> In order to compute correct answers, memcheck must observe
> every instruction whose input(s) derive from memory,
> or whose output(s) eventually go to memory.
> "Partial instrumentation" would be possible only for
> a program which directly spews constants.
>
> --
|
|
From: John R. <jr...@bi...> - 2013-01-06 21:05:40
|
> What I want is basically to instrument only a couple of .so modules,
> leaving anything else unchanged.
In order to compute correct answers, memcheck must observe
every instruction whose input(s) derive from memory,
or whose output(s) eventually go to memory.
"Partial instrumentation" would be possible only for
a program which directly spews constants.
--
|
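A toy model of the point made above, assuming a tool that keeps one "defined" flag per byte of memory. This is a heavily simplified stand-in for Memcheck's shadow state, and none of the names below come from Valgrind. A store performed by code the tool never observed changes the data but not the flag, so a later checked load draws the wrong conclusion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MEM_SIZE 16

/* Toy address space: real data plus one shadow flag per byte. */
typedef struct {
    unsigned char data[MEM_SIZE];
    bool defined[MEM_SIZE];   /* shadow state: was this byte written? */
} ToyMemory;

/* Store seen by the tool: data and shadow state are both updated. */
void store_instrumented(ToyMemory *m, size_t addr, unsigned char value)
{
    assert(addr < MEM_SIZE);
    m->data[addr] = value;
    m->defined[addr] = true;
}

/* Store in code the tool skipped: only the data changes. */
void store_uninstrumented(ToyMemory *m, size_t addr, unsigned char value)
{
    assert(addr < MEM_SIZE);
    m->data[addr] = value;
}

/* Checked load: true if the tool would report an uninitialised read. */
bool load_reports_error(const ToyMemory *m, size_t addr)
{
    assert(addr < MEM_SIZE);
    return !m->defined[addr];
}
```

In this model, a byte written through `store_uninstrumented` holds valid data yet `load_reports_error` still returns true for it, which is a false report. The opposite failure, a missed error, arises the same way when skipped code frees or invalidates memory.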
|
From: Timur I. <tim...@go...> - 2013-01-06 20:17:26
|
Hi,
I'm benchmarking Valgrind (as an instrumentation framework) on large
apps and noticed it's sometimes too slow.
For example, running DumpRenderTree* takes ~16.5 seconds under an
"empty" tool (similar to "none") on my machine.
The same binary runs for just 0.3 seconds without Valgrind.
This is roughly a 50x slowdown, I'd expect it to be much smaller when
doing little to no additional instrumentation.
See a tool I've created for my benchmarking attached, it runs the same
test in 17 seconds.
Flipping "#if 0" -> "#if 1" gives the 16.5 seconds mentioned above.
That means, almost all the overhead is inside Valgrind.
Is there any way to optimize Valgrind for such usage?
What I want is basically to instrument only a couple of .so modules,
leaving anything else unchanged.
Thanks!
* - DumpRenderTree is one of the targets in a Chromium checkout.
I've built a Release build of DRT and ran it like this:
xvfb-run time ./valgrind-3.8.1/BUILD/bin/valgrind --tool=vgtool ../DumpRenderTree ../hello.html
You can find the build instructions here:
http://code.google.com/p/chromium/wiki/LinuxBuildInstructions
and hello.html is just "<html><body>Hello world!</body></html>".
I ran the tests on Intel Xeon E5620, Ubuntu 12.04, gcc 4.6.3.
--
Timur Iskhodzhanov,
Google Russia
|
|
From: Florian K. <br...@ac...> - 2013-01-06 16:50:20
|
Allow the tree builder to move load expressions past statements that modify the guest state, unless the guest state modification requires precise exceptions. This improvement was already suggested in a comment in the code and this patch implements that suggestion. s390 will get more opportunities to use memory-to-memory move insns. regtested on x86-64 and s390 with no new regressions. I'll wait for approval before checking this in. Florian |
|
From: Florian K. <br...@ac...> - 2013-01-06 16:04:09
|
On 01/06/2013 10:48 AM, Julian Seward wrote:
>
>>> One change I am looking at is changing the guard type of Mux0X
>>> from I8 to I1. This makes it consistent with I1 guard types for
>>> Dirty helpers and for Exits.
>>
>> Seconded.
>
> I have this working now for ARM, and, obviously, the generic IR
> stuff has been fixed up too. I can do the front/back end fixes
> for x86/amd64/ppc32/ppc64, and possibly MIPS, but I can't do the
> s390 bits. If I make available a patch containing the above stuff,
> can you do the s390 bits?
Sure, no problem.
>> I'm willing to make the change if you and/or other port maintainers are
>> willing to double-check.
>
> Yes, I'm willing to double-check/review.
OK, let's do it then. Do you want to merge COMEM first?
How about this (with ITE meaning if-then-else)?
Index: VEX/pub/libvex_ir.h
===================================================================
--- VEX/pub/libvex_ir.h (revision 2627)
+++ VEX/pub/libvex_ir.h (working copy)
@@ -1578,7 +1578,7 @@
Iex_Unop,
Iex_Load,
Iex_Const,
- Iex_Mux0X,
+ Iex_ITE,
Iex_CCall
}
IRExprTag;
@@ -1762,18 +1762,18 @@
IRExpr** args; /* Vector of argument expressions. */
} CCall;
- /* A ternary if-then-else operator. It returns expr0 if cond is
- zero, exprX otherwise. Note that it is STRICT, ie. both
- expr0 and exprX are evaluated in all cases.
+ /* A ternary if-then-else operator. It returns iftrue if cond is
+ nonzero, iffalse otherwise. Note that it is STRICT, ie. both
+ iftrue and iffalse are evaluated in all cases.
- ppIRExpr output: Mux0X(<cond>,<expr0>,<exprX>),
- eg. Mux0X(t6,t7,t8)
+ ppIRExpr output: ITE(<cond>,<iftrue>,<iffalse>),
+ eg. ITE(t6,t7,t8)
*/
struct {
IRExpr* cond; /* Condition */
- IRExpr* expr0; /* True expression */
- IRExpr* exprX; /* False expression */
- } Mux0X;
+ IRExpr* iftrue; /* True expression */
+ IRExpr* iffalse; /* False expression */
+ } ITE;
} Iex;
};
@@ -1808,7 +1808,7 @@
extern IRExpr* IRExpr_Load ( IREndness end, IRType ty, IRExpr* addr );
extern IRExpr* IRExpr_Const ( IRConst* con );
extern IRExpr* IRExpr_CCall ( IRCallee* cee, IRType retty, IRExpr** args );
-extern IRExpr* IRExpr_Mux0X ( IRExpr* cond, IRExpr* expr0, IRExpr* exprX );
+extern IRExpr* IRExpr_ITE ( IRExpr* cond, IRExpr* expr0, IRExpr* exprX );
/* Deep-copy an IRExpr. */
extern IRExpr* deepCopyIRExpr ( IRExpr* );
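
To make the difference in operand order concrete, here is a small standalone C model of the two conventions discussed in this thread. The names mux0x and ite are hypothetical helpers written only for illustration; they are not VEX functions, and plain integers stand in for IR expressions. It assumes the renamed operator follows the C ?: ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old convention: the FIRST value operand (expr0) is
   selected when cond is zero, the second (exprX) otherwise. */
static uint32_t mux0x(uint8_t cond, uint32_t expr0, uint32_t exprX)
{
    return cond == 0 ? expr0 : exprX;
}

/* Model of the proposed convention: same operand order as C's
   cond ? iftrue : iffalse. */
static uint32_t ite(int cond, uint32_t iftrue, uint32_t iffalse)
{
    return cond ? iftrue : iffalse;
}

/* A mechanical conversion swaps the two value operands:
   mux0x(c, a, b) gives the same result as ite(c != 0, b, a). */
static int conversion_agrees(uint8_t c, uint32_t a, uint32_t b)
{
    return mux0x(c, a, b) == ite(c != 0, b, a);
}
```

Because C evaluates every function argument before the call, both value operands are always computed in this model, which loosely mirrors the STRICT note in the header comment. A proof-reader checking a conversion of this kind would look for exactly the operand swap shown in conversion_agrees at every call site.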
|
|
From: Julian S. <js...@ac...> - 2013-01-06 15:48:19
|
> > One change I am looking at is changing the guard type of Mux0X
> > from I8 to I1. This makes it consistent with I1 guard types for
> > Dirty helpers and for Exits.
>
> Seconded.

I have this working now for ARM, and, obviously, the generic IR stuff has been fixed up too. I can do the front/back end fixes for x86/amd64/ppc32/ppc64, and possibly MIPS, but I can't do the s390 bits. If I make available a patch containing the above stuff, can you do the s390 bits?

Unfortunately this is not something that can be committed incrementally without breaking various front/back ends.

> I'm willing to make the change if you and/or other port maintainers are
> willing to double-check.

Yes, I'm willing to double-check/review.

J |
|
From: Florian K. <br...@ac...> - 2013-01-06 14:48:22
|
On 01/04/2013 06:21 AM, Julian Seward wrote:
>
> One change I am looking at is changing the guard type of Mux0X
> from I8 to I1. This makes it consistent with I1 guard types for
> Dirty helpers and for Exits.

Seconded.

> Changing the order of the second and third args in Mux0X, so it
> matches the familiar C ?-: syntax, and possibly renaming it,
> would be a nice thing, but does not change the generated code.
> If anyone wants to volunteer to do that (and verify the change is
> correct :) please speak up.

Yes this would be good to do. I never understood why this IROp was defined this way and I made a few mistakes using it at the time. The change is obviously mechanical. But I would not know how to verify it other than having somebody else proof-read the patch. Passing regtest is too weak a criterion...

I'm willing to make the change if you and/or other port maintainers are willing to double-check.

Florian |
|
From: John R. <jr...@bi...> - 2013-01-04 23:28:03
|
> The patch simply uses the modulo operation for mapping addresses to
> cache sets for the LL.
> I was undecided on what to do.

Shift right by log2(line_size), then do a table lookup using the bottom K bits as index. K and the table are computed during initialization, and never modified after that.

Typical line size is 32, 64, or 128 bytes; thus shift is 5, 6, or 7. Typical K is 6, 7, or 8. Statically allocate unsigned char table[512], and just ignore what is not used.

Using compile-time constants 64==line_size and 6==K (computing only the table at runtime) does OK for many of today's caches.

-- |
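
As a rough illustration of the shift-then-lookup scheme described above, here is a minimal C sketch. The names cache_map_init, cache_set_for, and set_table are hypothetical and are not taken from Cachegrind or from the patch under discussion. Filling the table with i % n_sets is an assumption about how the table might be computed; the message does not say what the table contains.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 512   /* statically allocated; the unused tail is ignored */

static unsigned char set_table[TABLE_SIZE];
static unsigned line_shift;   /* log2(line_size) */
static unsigned index_mask;   /* (1 << K) - 1 */

/* Hypothetical one-time setup. n_sets need not be a power of two.
   Returns 0 on success, -1 if a parameter is out of range; on failure
   the previous configuration is left untouched. */
static int cache_map_init(unsigned line_size, unsigned k_bits, unsigned n_sets)
{
    if (line_size == 0 || (line_size & (line_size - 1)) != 0)
        return -1;                      /* line size must be a power of two */
    if (k_bits == 0 || k_bits > 9)
        return -1;                      /* 2^K entries must fit in the table */
    if (n_sets == 0 || n_sets > 256)
        return -1;                      /* set index must fit in unsigned char */

    line_shift = 0;
    while ((1u << line_shift) < line_size)
        line_shift++;
    index_mask = (1u << k_bits) - 1;

    /* Assumption: fold the 2^K possible index values onto the sets. */
    for (unsigned i = 0; i <= index_mask; i++)
        set_table[i] = (unsigned char)(i % n_sets);
    return 0;
}

/* Hot path: one shift, one mask, one byte load; no division. */
static unsigned cache_set_for(uintptr_t addr)
{
    return set_table[(addr >> line_shift) & index_mask];
}
```

Note that this is not identical to taking the full line number modulo n_sets, because only the bottom K bits of the line number reach the table. Whether that approximation is acceptable for the LL simulation is a judgment the sketch does not settle.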
|
From: Josef W. <Jos...@gm...> - 2013-01-04 22:33:38
|
On 27.12.2012 13:12, Julian Seward wrote:
>
> On Friday, December 21, 2012, Josef Weidendorfer wrote:
>> On x86, for sure they are rare.
>
> On x86 they do not exist at all for any instruction set < AVX, IIUC,
> since AVX has conditional scatter/gather loads. Did you mean something
> else?

I thought you might use this also on x86/amd64 for cmov...

>> I am not so sure for ARM.
>
> From my experimentation so far, they do exist on ARM (obviously) but are
> still quite rare.

Ok.

> I think so, but it would be good if you could look at what I just
> committed for Callgrind (r13206).

Looks fine aside from one issue: The handlers CLG_(cachesim).log_0I1Dw/Dr may be 0 if cache simulation is off. Thus, in addEvent_D_guarded it should not add the helper call if helperAddr is 0.

> It does not produce the same
> numbers for insn reads or data accesses for "perf/bz2 x" compiled
> -g -O, which worries me. But IIRC you fixed some bug on the trunk
> relating to cachegrind/callgrind inconsistencies, and maybe you
> fixed it after I made the COMEM branch from the trunk. So maybe
> that is the reason for the differences.

Maybe. That fix was for the case when cache simulation is off in callgrind. If you start callgrind with --cache-sim=yes, it should show the same numbers.

Josef |
|
From: Siddharth N. <si...@gm...> - 2013-01-04 21:40:18
|
Hi All,

I am trying to instrument syscalls that read from and modify memory. I want to use the pre_mem_read and post_mem_write functions, but I also need the correct function execution context when the read and write occur (the user program function where the syscall occurs).

So my idea was to use Callgrind, record all pre_mem_reads and query the context during the next syscall, and similarly, during post_mem_writes, query the context of the previous syscall that has occurred.

I couldn't find any documentation on how pre_mem_read functions work in conjunction with syscalls. Am I going about this right?

Thanks,
Siddharth |