From: Tom H. <th...@cy...> - 2004-03-08 16:05:58
|
CVS commit by thughes:
Remove function which is no longer used following the changes to
the handling of libc thread specific data.
M +0 -16 vg_libpthread.c 1.147
--- valgrind/coregrind/vg_libpthread.c #1.146:1.147
@@ -1675,20 +1675,4 @@ void * __pthread_getspecific(pthread_key
-#ifdef GLIBC_2_3
-static
-void ** __pthread_getspecific_addr(pthread_key_t key)
-{
- void** specifics_ptr;
- ensure_valgrind("pthread_getspecific_addr");
-
- if (!key_is_valid(key))
- return NULL;
-
- specifics_ptr = get_or_allocate_specifics_ptr(pthread_self());
- return &(specifics_ptr[key]);
-}
-#endif
-
-
/* ---------------------------------------------------
ONCEry
|
|
From: Nicholas N. <nj...@ca...> - 2004-03-08 15:47:53
|
CVS commit by nethercote:
Remove warning with -q.
[from HEAD]
M +7 -5 mc_main.c 1.38.2.2
--- valgrind/memcheck/mc_main.c #1.38.2.1:1.38.2.2
@@ -288,4 +288,5 @@ static void set_address_range_perms ( Ad
return;
+ if (VG_(clo_verbosity) > 0) {
if (len > 100 * 1000 * 1000) {
VG_(message)(Vg_UserMsg,
@@ -294,4 +295,5 @@ static void set_address_range_perms ( Ad
len, example_a_bit, example_v_bit );
}
+ }
VGP_PUSHCC(VgpSetMem);
|
|
From: Nicholas N. <nj...@ca...> - 2004-03-08 15:46:38
|
CVS commit by nethercote:
Turn off warning with -q.
M +7 -5 mc_main.c 1.47
--- valgrind/memcheck/mc_main.c #1.46:1.47
@@ -285,4 +285,5 @@ static void set_address_range_perms ( Ad
return;
+ if (VG_(clo_verbosity) > 0) {
if (len > 100 * 1000 * 1000) {
VG_(message)(Vg_UserMsg,
@@ -291,4 +292,5 @@ static void set_address_range_perms ( Ad
len, example_a_bit, example_v_bit );
}
+ }
VGP_PUSHCC(VgpSetMem);
|
|
From: Jeremy F. <je...@go...> - 2004-03-08 10:30:09
|
On Mon, 2004-03-08 at 01:26, Doug Rabson wrote:
> I recently had problems with valgrind running out of memory loading the
> debug information for a program which loaded a large number of shared
> libraries, all of which had lots of C++ debugging information. We ended
> up re-arranging things to give valgrind more memory for malloc.
>
> From a quick reading of the code, we seem to only decode type
> information for stabs. Does anything actually use the type information?
> It certainly seems to take up a lot of space.

The VG_(describe_addr)() function uses it to generate a symbolic description for a particular address and execution context, using whatever variables happen to be in scope at the time. The intent is to present a much more useful description of a failing address than just the numeric value, or the offset into the malloc block. Helgrind is the only tool to use it really consistently; we should make the rest use it too, where possible (memcheck/addrcheck use it a bit, but not very well).

It is pretty large though. I guess the options are to work out how to represent it more compactly, or have an option to not load it. For DWARF2, it should be possible to incrementally load it as necessary, rather than in bulk like stabs forces us to do.

J
|
From: Jeremy F. <je...@go...> - 2004-03-08 10:25:41
|
On Sun, 2004-03-07 at 23:19, Tom Hughes wrote:
> I don't think that's true at all - our x86-64 box seems to map code
> outside the 4G range. Look at libc in this map:

I think the restriction is on executables; they can't be outside the 2G limit. Shared objects use EIP-relative addressing, so they don't really care where they're placed.

The x86-64 ABI talks about small, kernel, medium and large models: small is where text and data are below 2G; kernel is mapped into the negative 2G part of the address space; medium forces text to be under 2G, but data can be higher; large has no restrictions. gcc/binutils doesn't implement large.

The upshot is that I think we can fit all of Valgrind's static text below the start of the client executable (4 MBytes should be enough space), and put the .so's and data way above the client address space.

J
|
From: Doug R. <df...@nl...> - 2004-03-08 09:35:24
|
On Monday 08 March 2004 01:00, Jeremy Fitzhardinge wrote:
> On Fri, 2004-03-05 at 18:22, Josef Weidendorfer wrote:
> > Hi,
> >
> > I just tried to get Calltree working with Valgrind CVS.
> > The version can be found at
> > http://kcachegrind.sourceforge.net/calltree-0.9.7.tar.gz
> >
> > It is almost working fine with "--pointercheck=no", but for
> > large programs I get the following error when the valgrind process
> > grows over 160M in size:
>
> Yes, the heap size for Valgrind is set to a relatively small value.
> The idea is that if a tool needs to allocate large chunks of memory,
> it could use the shadow memory pool. For most tools, this is sized
> by a well-understood ratio to the client memory size - but for
> calltree this doesn't necessarily make a lot of sense.
>
> We could increase the Valgrind heap size pretty easily, but it eats
> into the available client address space.
>
> How does Calltree use memory? What does it allocate?

I recently had problems with valgrind running out of memory loading the debug information for a program which loaded a large number of shared libraries, all of which had lots of C++ debugging information. We ended up re-arranging things to give valgrind more memory for malloc.

From a quick reading of the code, we seem to only decode type information for stabs. Does anything actually use the type information? It certainly seems to take up a lot of space.
|
From: Nicholas N. <nj...@ca...> - 2004-03-08 09:25:05
|
On Mon, 8 Mar 2004, Paul Mackerras wrote:
> However, it is painfully slow and it detects a lot of errors.
> Mozilla, for instance, runs about 500 times slower under valgrind than
> it does natively on my G5. It's not executing 500x as many
> instructions, so there must be something about the kinds of
> instruction sequences I am generating that cause the CPU to run a lot
> more slowly than it does on "normal" code. Maybe I am getting a lot
> of cache misses.

How does Memcheck compare with Nulgrind (--skin=none) and Addrcheck? A comparison there could be instructive.

N
|
From: Tom H. <th...@cy...> - 2004-03-08 07:28:14
|
In message <1078708420.28976.15.camel@localhost.localdomain>
Jeremy Fitzhardinge <je...@go...> wrote:
> Tell me if I can help in any way. I'm interested to know if I've made
> any assumptions which won't work for ppc (32 or 64). I already know
> that we're going to have to do things slightly differently for x86-64,
> because we can't put the Valgrind code at a very high address (the
> toolchain doesn't support code outside of 4G, even if pointers are
> large), so we're going to have to do something like move Valgrind very
> low, reserving all the low addresses for its code (which would also work
> for most x86-32 programs... hmm).
I don't think that's true at all - our x86-64 box seems to map code
outside the 4G range. Look at libc in this map:
gill [~] % uname -a
Linux gill.uk.cyberscience.com 2.4.22-1.2166.nptl #1 Fri Jan 30 13:44:52 EST 2004 x86_64 x86_64 x86_64 GNU/Linux
gill [~] % cat /proc/self/maps
0000000000400000-0000000000404000 r-xp 0000000000000000 03:41 868386 /bin/cat
0000000000504000-0000000000505000 rw-p 0000000000004000 03:41 868386 /bin/cat
0000000000505000-0000000000526000 rwxp 0000000000000000 00:00 0
0000002a95556000-0000002a9556b000 r-xp 0000000000000000 03:41 786436 /lib64/ld-2.3.2.so
0000002a9556b000-0000002a9556c000 rw-p 0000000000000000 00:00 0
0000002a9557d000-0000002a9557e000 rw-p 0000000000000000 00:00 0
0000002a9566a000-0000002a9566b000 rw-p 0000000000014000 03:41 786436 /lib64/ld-2.3.2.so
0000002a9566b000-0000002a957a6000 r-xp 0000000000000000 03:41 6307844 /lib64/tls/libc-2.3.2.so
0000002a957a6000-0000002a9586b000 ---p 000000000013b000 03:41 6307844 /lib64/tls/libc-2.3.2.so
0000002a9586b000-0000002a958ab000 rw-p 0000000000100000 03:41 6307844 /lib64/tls/libc-2.3.2.so
0000002a958ab000-0000002a958af000 rw-p 0000000000000000 00:00 0
0000007fbfffd000-0000007fc0000000 rwxp ffffffffffffe000 00:00 0
Tom
--
Tom Hughes (th...@cy...)
Software Engineer, Cyberscience Corporation
http://www.cyberscience.com/
|
|
From: <js...@ac...> - 2004-03-08 04:15:40
|
Nightly build on phoenix ( SuSE 8.2 ) started at 2004-03-08 04:00:00 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
resolv: valgrind ./resolv
seg_override: valgrind ./seg_override
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 127 tests, 5 stderr failures, 0 stdout failures =================
corecheck/tests/as_mmap (stderr)
corecheck/tests/fdleak_fcntl (stderr)
helgrind/tests/inherit (stderr)
memcheck/tests/writev (stderr)
memcheck/tests/zeropage (stderr)
make: *** [regtest] Error 1
|
From: <js...@ac...> - 2004-03-08 03:57:19
|
Nightly build on nemesis ( SuSE 9.0 ) started at 2004-03-08 03:50:00 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 127 tests, 13 stderr failures, 0 stdout failures =================
corecheck/tests/as_mmap (stderr)
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_creat (stderr)
corecheck/tests/fdleak_dup (stderr)
corecheck/tests/fdleak_dup2 (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_open (stderr)
corecheck/tests/fdleak_pipe (stderr)
corecheck/tests/fdleak_socketpair (stderr)
helgrind/tests/inherit (stderr)
memcheck/tests/writev (stderr)
memcheck/tests/zeropage (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <to...@co...> - 2004-03-08 03:30:50
|
Nightly build on dunsmere ( Fedora Core 1 ) started at 2004-03-08 03:20:02 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
== 128 tests, 15 stderr failures, 1 stdout failure =================
corecheck/tests/fdleak_cmsg (stderr)
corecheck/tests/fdleak_creat (stderr)
corecheck/tests/fdleak_dup (stderr)
corecheck/tests/fdleak_dup2 (stderr)
corecheck/tests/fdleak_fcntl (stderr)
corecheck/tests/fdleak_ipv4 (stderr)
corecheck/tests/fdleak_open (stderr)
corecheck/tests/fdleak_pipe (stderr)
corecheck/tests/fdleak_socketpair (stderr)
helgrind/tests/inherit (stderr)
memcheck/tests/buflen_check (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/fwrite (stderr)
memcheck/tests/weirdioctl (stderr)
memcheck/tests/writev (stderr)
none/tests/exec-sigmask (stdout)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-03-08 03:26:08
|
Nightly build on audi ( Red Hat 9 ) started at 2004-03-08 03:15:01 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
pushpopseg: valgrind ./pushpopseg
rcl_assert: valgrind ./rcl_assert
rcrl: valgrind ./rcrl
readline1: valgrind ./readline1
resolv: valgrind ./resolv
seg_override: valgrind ./seg_override
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 128 tests, 1 stderr failure, 0 stdout failures =================
helgrind/tests/inherit (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-03-08 03:21:01
|
Nightly build on ginetta ( Red Hat 8.0 ) started at 2004-03-08 03:10:01 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
seg_override: valgrind ./seg_override
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 128 tests, 6 stderr failures, 0 stdout failures =================
helgrind/tests/deadlock (stderr)
helgrind/tests/inherit (stderr)
helgrind/tests/race (stderr)
helgrind/tests/race2 (stderr)
memcheck/tests/nanoleak (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-03-08 03:18:32
|
Nightly build on standard ( Red Hat 7.2 ) started at 2004-03-08 03:00:02 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
resolv: valgrind ./resolv
seg_override: valgrind ./seg_override
sha1_test: valgrind ./sha1_test
shortpush: valgrind ./shortpush
shorts: valgrind ./shorts
smc1: valgrind ./smc1
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 128 tests, 5 stderr failures, 0 stdout failures =================
corecheck/tests/fdleak_fcntl (stderr)
helgrind/tests/inherit (stderr)
memcheck/tests/badfree-2trace (stderr)
memcheck/tests/execve (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1
|
From: Tom H. <th...@cy...> - 2004-03-08 03:15:58
|
Nightly build on alvis ( Red Hat 7.3 ) started at 2004-03-08 03:05:02 GMT
Checking out source tree ... done
Configuring ... done
Building ... done
Running regression tests ... done
Last 20 lines of log.verbose follow
smc1: valgrind ./smc1
syscall-restart1: valgrind ./syscall-restart1
syscall-restart2: valgrind ./syscall-restart2
system: valgrind ./system
yield: valgrind ./yield
-- Finished tests in none/tests ----------------------------------------
== 128 tests, 9 stderr failures, 1 stdout failure =================
helgrind/tests/deadlock (stderr)
helgrind/tests/inherit (stderr)
helgrind/tests/race (stderr)
helgrind/tests/race2 (stderr)
memcheck/tests/badfree-2trace (stderr)
memcheck/tests/badjump (stderr)
memcheck/tests/brk (stderr)
memcheck/tests/error_counts (stdout)
memcheck/tests/new_nothrow (stderr)
memcheck/tests/writev (stderr)
make: *** [regtest] Error 1
|
From: Jeremy F. <je...@go...> - 2004-03-08 01:44:46
|
On Fri, 2004-03-05 at 09:10, KJK::Hyperion wrote:
> At 01.14 02/03/2004, Jeremy Fitzhardinge wrote:
> > Well, the thing which Valgrind ideally wants is two complete address
> > spaces: one for the client, and one for Valgrind.
>
> are you 100% sure of what you're saying? if I understand correctly, at
> least the JIT compiler needs to run in the same process as the client.
> Otherwise we'd have an insane amount of inter-process memory copying, which
> doesn't exactly come for free

Well, there is the question of which address space the generated code should go into. You're right, it needs quick access to the client memory for client execution, and quick access to the shadow memory for the instrumentation. I guess in principle you could play segment games for this or something... I haven't thought about this too deeply.

We definitely need the notion of multiple address spaces, but Valgrind's precise requirements are somewhat more demanding than the normal uses of address spaces. It wants a distinct address space to keep the client in, and one for the core Valgrind code, but since the client code doesn't execute directly, we need some union space in which generated code can get to both the client and Valgrind address spaces equally easily.

This is doable but awkward in a single linear address space: you just partition the address space at some point, and say everything below is client, and everything above is Valgrind - this is what we do now, and so long as client-visible objects don't have high fixed addresses, it all works. The main problem is that 4G is a bit of a squeeze. This model should work a lot better for 64-bit address spaces, since there's a lot more room to play with. (x86-64 is a bit odd, since the toolchain doesn't support code above 4G, and is "only" a 48-bit address space anyway.)

Two linear address spaces (i.e., two separate unix processes) would allow the client to have full run of one whole address space, but there's no clear way in which generated code could have efficient access to both address spaces. Some kind of segmentation scheme, in which we map two segments to two distinct linear address spaces, would work, since client code could use some kind of segment prefix on memory operations to distinguish which address space it wants to access. Unfortunately, I don't think this works on x86 (since segmentation is layered on top of paging, so all segments ultimately map onto the same underlying 4G paged address space). PPC's notion of segments is somewhat different, but I don't think it does the right thing either. In other words, if someone wanted to make a CPU and OS optimised for running Valgrind, it would look somewhat different from the Unix model or the Windows model, I think.

> this sounds like good news, finally. Is there an updated technical
> overview? last time I looked, it said Valgrind didn't do multithreading

I think that overview was obsolete even then. Multithreading has been in there a long time. But it needs a lot more updating, since things changed quite a bit in 2.1.0, and 2.1.1 (when it appears). 2.1.2 will rearrange the source tree and probably include at least the FreeBSD port, but I don't think there'll be much deep architectural change.

> > Does this mean that if we translate this code into the instrumented code
> > cache, then things will care because the EIP isn't within the
> > kernel/user/gdi.dll?
>
> no, this shouldn't be a problem, it only hurt full virtualization. The
> kernel-mode windowing and graphics code does call back to user-mode, but it
> only does so through a well-defined entry point - one of the entry points
> Valgrind will have to catch anyway not to lose control

I don't really understand what the problem with FV is. If kernel32.dll, user32.dll and gdi32.dll need to be loaded once at a fixed address, there's still no reason why the client and Valgrind couldn't share the same copy. All the client uses of code in those .dlls will be virtualized, of course. Hm, I guess if those libraries are holding state, we need to make sure the client and Valgrind versions are distinct. If that is the case, it doesn't seem like FV itself is the problem - it's the more general problem of how to multiplex one "process" state/context between two somewhat independent separate programs.

> Apropos, some details for the other Windows guys:
> - the entry points (exported by ntdll.dll) are:
>   - KiRaiseUserExceptionDispatcher
>   - KiUserApcDispatcher
>   - KiUserCallbackDispatcher
>   - KiUserExceptionDispatcher
> KiUserApcDispatcher will always be hit at least once per thread, as an
> APC is queued to all threads to run the initialization code. We won't need
> to do special handling of any of them, though. We'll just switch to the JIT
> upon entering one of them. The first thread entering the JIT will
> initialize it, the others will spin until initialization is done. We could
> detect new threads by checking the flat address of their TEB against an
> internal table, or by allocating a TLS slot and checking for its value
> (NULL -> new thread)

What's an APC? Async procedure call? What does that mean?

> - the entry points above aren't enough. Some control transfers happen
> outside of them - luckily there aren't many. I've counted three:
> ZwContinue, ZwSetContextThread and ZwVdmControl. The first two are easy,
> the third is a mystery. I know it causes control transfers to and from V86
> virtual machines, but how it does that is not known - luckily, only
> NTVDM uses it

I guess this would be an elaboration of the games we play currently with signals? That's the only place in Unix where the kernel asynchronously changes the process context.

> - catching system calls is a mess. Hooking system calls directly in
> kernel mode, as dangerous as it is, is the best way for several reasons. I
> don't like how that strace for Windows does it, though. To signal a call
> I'd raise an exception: it's semantically correct, so it will work well
> with existing (and future) code

Is there some distinct instruction or class of instructions used for calling into the kernel? int? lcall through some special call gate? Any of those we can identify at translation time and do the right thing.

J
|
From: Jeremy F. <je...@go...> - 2004-03-08 01:21:33
|
On Sun, 2004-03-07 at 14:26, Paul Mackerras wrote:
> I can now successfully valgrind Mozilla and OpenOffice on PPC.

Good news.

> However, it is painfully slow and it detects a lot of errors.

What kinds of errors? The error paths, even for suppressed or duplicate errors, are pretty slow compared to the non-error paths; I could imagine a pretty significant performance hit from just those.

> Mozilla, for instance, runs about 500 times slower under valgrind than
> it does natively on my G5. It's not executing 500x as many
> instructions, so there must be something about the kinds of
> instruction sequences I am generating that cause the CPU to run a lot
> more slowly than it does on "normal" code. Maybe I am getting a lot
> of cache misses.

500x is a bit of a surprise - it could just be a result of "lots of errors". I'd look to see if there are issues with sharing code and data on the same page. The current translation cache puts a structure immediately before each BB. I don't think it's modified much, but there could be issues. I think we should probably consider separating the data and code pieces of the TC anyway.

Also, I presume you flush the icache when generating new blocks of code; is that a global flush, or just parts of the icache?

Does linux-ppc support oprofile? I have a little hack which allocates the TC with mmap to a file rather than in anonymous memory, which allows oprofile to give good overall results to see whether time is being spent in generated code or in Valgrind core code (though mapping the generated code addresses to something meaningful is trickier).

> I have started merging my changes into the current CVS version. It's
> going to take me a little while, though, to understand the new startup
> sequence.

Tell me if I can help in any way. I'm interested to know if I've made any assumptions which won't work for ppc (32 or 64). I already know that we're going to have to do things slightly differently for x86-64, because we can't put the Valgrind code at a very high address (the toolchain doesn't support code outside of 4G, even if pointers are large), so we're going to have to do something like move Valgrind very low, reserving all the low addresses for its code (which would also work for most x86-32 programs... hmm).

J
|
From: Jeremy F. <je...@go...> - 2004-03-08 01:08:46
|
On Fri, 2004-03-05 at 18:22, Josef Weidendorfer wrote:
> Hi,
>
> I just tried to get Calltree working with Valgrind CVS.
> The version can be found at
> http://kcachegrind.sourceforge.net/calltree-0.9.7.tar.gz
>
> It is almost working fine with "--pointercheck=no", but for large
> programs I get the following error when the valgrind process grows
> over 160M in size:

Yes, the heap size for Valgrind is set to a relatively small value. The idea is that if a tool needs to allocate large chunks of memory, it could use the shadow memory pool. For most tools, this is sized by a well-understood ratio to the client memory size - but for calltree this doesn't necessarily make a lot of sense.

We could increase the Valgrind heap size pretty easily, but it eats into the available client address space.

How does Calltree use memory? What does it allocate?

J
|
From: Julian S. <js...@ac...> - 2004-03-08 01:05:27
|
On Monday 08 March 2004 00:04, Paul Mackerras wrote: > Johan Rydberg writes: > > What about chaining between blocks? That normally increases performance > > by a magnitude. No, not for us. > Yes, I do block chaining. Usually I see about 80% of jumps being > chained (i.e. 20% unchained). But the chaining only seems to increase > performance by about 5%. I'm sure I must be doing something wrong > somewhere but I just can't put my finger on it. 5% gain with block chaining sounds roughly on a par with what we got in x86 land, so looks like you're OK there at least. Have you tried with and without your new ultra-accurate add-tracking sequence -- the one with the min and max? That looks a bit expensive to me. J |
|
From: Julian S. <js...@ac...> - 2004-03-08 01:03:37
|
On Monday 08 March 2004 00:04, Paul Mackerras wrote:
> Johan Rydberg writes:
> > What about chaining between blocks? That normally increases performance
> > by a magnitude.
>
> Yes, I do block chaining. Usually I see about 80% of jumps being
> chained (i.e. 20% unchained). But the chaining only seems to increase
> performance by about 5%. I'm sure I must be doing something wrong
> somewhere but I just can't put my finger on it.
On x86 we unexpectedly got hammered (lost a huge number of cycles) due to
instructions to save and restore the cpu's flags register in memory.
Surprisingly this apparently-trivial action seems to cause PIII,
P4 and Athlon to stop until all pipelines are empty, losing 20-40
cycles for what is basically a simple load or store.
I wonder if something bad like that is happening to you. I tracked
this down by running an ultra-trivial loop on V; something like
for (i = 0; i < 100000000; i++)
;
and so you get a single translated bb jumping back to itself.
That means the code in it is simple enough to inspect and perhaps
that might lead you to something.
Other things I can think of are some kind of Icache coherency
problem due to dynamic code generation? Does writing at some
address invalidate all Icache entries in the vicinity?
J
|
|
From: Paul M. <pa...@sa...> - 2004-03-08 00:19:31
|
Johan Rydberg writes:
> What about chaining between blocks? That normally increases performance
> by a magnitude.

Yes, I do block chaining. Usually I see about 80% of jumps being chained (i.e. 20% unchained). But the chaining only seems to increase performance by about 5%. I'm sure I must be doing something wrong somewhere but I just can't put my finger on it.

Paul.