|
From: Mark W. <ma...@kl...> - 2023-04-22 01:47:41
|
An RC2 tarball for 3.21.0 is now available at
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC2.tar.bz2
(md5sum = f33407fdffbfa78f5014781cc92297cf)
(sha1sum = c520ee0c28d9e20d28aa25d05ce2525c39a69135)
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC2.tar.bz2.asc

Please give it a try in configurations that are important for you and
report any problems you have, either on this mailing list, or
(preferably) via our bug tracker at
https://bugs.kde.org/enter_bug.cgi?product=valgrind

Please check the NEWS entry below for new features that could use some
extra testing. Note that there has also been a dhat extension which
hasn't yet been added to NEWS. There is now a client request for DHAT
to mark memory to be histogrammed:
https://bugs.kde.org/464103
https://snapshots.sourceware.org/valgrind/trunk/latest/html/dh-manual.html#dh-access-counts

If nothing critical emerges, a final release will happen on Friday 28 April.

* ==================== CORE CHANGES ===================

* When GDB is used to debug a program running under valgrind using
  the valgrind gdbserver, GDB will automatically load some
  python code provided in valgrind defining GDB front end commands
  corresponding to the valgrind monitor commands.
  These GDB front end commands accept the same format as
  the monitor commands directly sent to the Valgrind gdbserver.
  These GDB front end commands provide a better integration
  in the GDB command line interface, so as to use for example
  GDB auto-completion, command specific help, searching for
  a command or command help matching a regexp, ...
  For relevant monitor commands, GDB will evaluate arguments
  to make the use of monitor commands easier.
  For example, instead of having to print the address of a variable
  to pass it to a subsequent monitor command, the GDB front end
  command will evaluate the address argument. It is for example
  possible to do:
    (gdb) memcheck who_point_at &some_struct sizeof(some_struct)
  instead of:
    (gdb) p &some_struct
    $2 = (some_struct_type *) 0x1130a0 <some_struct>
    (gdb) p sizeof(some_struct)
    $3 = 40
    (gdb) monitor who_point_at 0x1130a0 40

* The vgdb utility now supports extended-remote protocol when
  invoked with --multi. In this mode the GDB run command is
  supported, which means you don't need to run gdb and valgrind
  from different terminals. So for example to start your program
  in gdb and run it under valgrind you can do:
    $ gdb prog
    (gdb) set remote exec-file prog
    (gdb) set sysroot /
    (gdb) target extended-remote | vgdb --multi
    (gdb) start

* The behaviour of realloc with a size of zero can now
  be changed for tools that intercept malloc. Those
  tools are memcheck, helgrind, drd, massif and dhat.
  Realloc implementations generally do one of two things
    - free the memory like free() and return NULL
      (GNU libc and ptmalloc).
    - either free the memory and then allocate a
      minimum sized block or just return the
      original pointer. Return NULL if the
      allocation of the minimum sized block fails
      (jemalloc, musl, snmalloc, Solaris, macOS).
  When Valgrind is configured and built it will
  try to match the OS and libc behaviour. However
  if you are using a non-default library to replace
  malloc and family (e.g., musl on a glibc Linux or
  tcmalloc on FreeBSD) then you can use a command line
  option to change the behaviour of Valgrind:
    --realloc-zero-bytes-frees=yes|no [yes on Linux glibc, no otherwise]

* ================== PLATFORM CHANGES =================

* Make the address space limit on FreeBSD amd64 128Gbytes
  (the same as Linux and Solaris, it was 32Gbytes)

* ==================== TOOL CHANGES ===================

* Memcheck:
  - When doing a delta leak_search, it is now possible to only
    output the new loss records compared to the previous leak search.
    This is available in the memcheck monitor command 'leak_search'
    by specifying the "new" keyword or in your program by using
    the client request VALGRIND_DO_NEW_LEAK_CHECK.
    Whenever a "delta" leak search is done (i.e. when specifying
    "new" or "increased" or "changed" in the monitor command),
    the new loss records have a "new" marker.
  - Valgrind now contains python code that defines GDB memcheck
    front end monitor commands. See CORE CHANGES.
  - Performs checks for the use of realloc with a size of zero.
    This is non-portable and a source of errors. If memcheck
    detects such a usage it will generate an error
      realloc() with size 0
    followed by the usual callstacks.
    A switch has been added to allow this to be turned off:
      --show-realloc-size-zero=yes|no [yes]

* Helgrind:
  - The option --history-backtrace-size=<number> lets you configure
    the number of entries to record in the stack traces of "old"
    accesses. Previously, this number was hardcoded to 8.
  - Valgrind now contains python code that defines GDB helgrind
    front end monitor commands. See CORE CHANGES.

* Cachegrind:
  - `--cache-sim=no` is now the default. The cache simulation is old and
    unlikely to match any real modern machine. This means only `Ir`
    events are gathered by default, but that is by far the most useful
    event.
  - `cg_annotate`, `cg_diff`, and `cg_merge` have been rewritten in
    Python. As a result, they all have more flexible command line
    argument handling, e.g. supporting `--show-percs` and
    `--no-show-percs` forms as well as the existing `--show-percs=yes`
    and `--show-percs=no`.
  - `cg_annotate` has some functional changes.
    - It's much faster, e.g. 3-4x on common cases.
    - It now supports diffing (with `--diff`, `--mod-filename`, and
      `--mod-funcname`) and merging (by passing multiple data files).
    - It now provides more information at the file and function level.
      There are now "File:function" and "Function:file" sections. These
      are very useful for programs that use inlining a lot.
    - Support for user-annotated files and the `-I`/`--include` option
      has been removed, because it was of little use and blocked other
      improvements.
    - The `--auto` option is renamed `--annotate`, though the old
      `--auto=yes`/`--auto=no` forms are still supported.
  - `cg_diff` and `cg_merge` are now deprecated, because `cg_annotate`
    now does a better job of diffing and merging.
  - The Cachegrind output file format has changed very slightly, but in
    ways nobody is likely to notice.

* Callgrind:
  - Valgrind now contains python code that defines GDB callgrind
    front end monitor commands. See CORE CHANGES.

* Massif:
  - Valgrind now contains python code that defines GDB massif
    front end monitor commands. See CORE CHANGES. |
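For anyone testing the RC, the md5sum and sha1sum quoted in the announcement can be checked before unpacking. The following is only a minimal sketch in Python using the standard `hashlib` module; the function names `file_digests` and `matches_announcement` are made up for illustration and are not part of any Valgrind tooling. It does not replace verifying the `.asc` signature.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, sha1_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_announcement(path, expected_md5, expected_sha1):
    """Hypothetical helper: compare a file against announced checksums."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == expected_md5.strip().lower()
            and sha1_hex == expected_sha1.strip().lower())
```

Usage would be along the lines of `matches_announcement("valgrind-3.21.0.RC2.tar.bz2", "f33407fdffbfa78f5014781cc92297cf", "c520ee0c28d9e20d28aa25d05ce2525c39a69135")`, assuming the tarball has been downloaded to the current directory.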
|
From: Mark W. <ma...@kl...> - 2023-04-20 20:16:42
|
On Wed, Apr 19, 2023 at 11:46:34AM +0200, folkert wrote:
> > > The 2 calls it does are:
> > >
> > > print_char:
> > > movb (%esi), %al
> > > movb %al, buffer
> > > movl $4, %eax
> > > movl $1, %ebx
> > > movl $buffer, %ecx
> > > movl $1, %edx
> > > int $0x80
> > > ret
> > >
> > > exit:
> > > movl $1, %eax
> > > movl $0, %ebx
> > > int $0x80
> >
> > Valgrind can't run just any executable binary. It has quite a lot of hard
> > coded limitations that correspond (mostly) to what compilers and link
> > editors will produce. So if you use assembler and use opcodes not normally
> > generated by compilers then it won't work.
> ...
> > So int 0x80 results in a decode error.
> >
> > Can you use syscall?
>
> That solves the problem.

Glad that resolved it. I also didn't know int 0x80 worked on amd64 as a
syscall (but not under valgrind).

Note that this is also https://bugs.kde.org/show_bug.cgi?id=342988

Cheers,

Mark |
|
From: folkert <fo...@va...> - 2023-04-19 09:46:48
|
> > The 2 calls it does are:
> >
> > print_char:
> > movb (%esi), %al
> > movb %al, buffer
> > movl $4, %eax
> > movl $1, %ebx
> > movl $buffer, %ecx
> > movl $1, %edx
> > int $0x80
> > ret
> >
> > exit:
> > movl $1, %eax
> > movl $0, %ebx
> > int $0x80
>
> Valgrind can't run just any executable binary. It has quite a lot of hard
> coded limitations that correspond (mostly) to what compilers and link
> editors will produce. So if you use assembler and use opcodes not normally
> generated by compilers then it won't work.
...
> So int 0x80 results in a decode error.
>
> Can you use syscall?

That solves the problem.
Thanks! |
|
From: Floyd, P. <pj...@wa...> - 2023-04-18 16:34:19
|
On 18/04/2023 17:46, folkert wrote:
> The 2 calls it does are:
>
> print_char:
> movb (%esi), %al
> movb %al, buffer
> movl $4, %eax
> movl $1, %ebx
> movl $buffer, %ecx
> movl $1, %edx
> int $0x80
> ret
>
> exit:
> movl $1, %eax
> movl $0, %ebx
> int $0x80
Valgrind can't run just any executable binary. It has quite a lot of
hard coded limitations that correspond (mostly) to what compilers and
link editors will produce. So if you use assembler and use opcodes not
normally generated by compilers then it won't work.

The code that handles this is

   case 0xCD: /* INT imm8 */
      d64 = getUChar(delta); delta++;
      /* Handle int $0xD2 (Solaris fasttrap syscalls). */
      if (d64 == 0xD2) {
         jmp_lit(dres, Ijk_Sys_int210, guest_RIP_bbstart + delta);
         vassert(dres->whatNext == Dis_StopHere);
         DIP("int $0xD2\n");
         return delta;
      }
      goto decode_failure;

So int 0x80 results in a decode error.

Can you use syscall?
A+
Paul
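To connect the decoder branch quoted above with the "unhandled instruction bytes" line in the original report, here is a small illustrative sketch in Python. It is not Valgrind code: `parse_unhandled_bytes` and `classify_int_imm8` are hypothetical names, and the classifier only mirrors the single `case 0xCD` branch shown above (accept `int $0xD2`, fail on any other immediate).

```python
def parse_unhandled_bytes(line):
    """Extract byte values from a 'vex ...: unhandled instruction bytes: ...' line."""
    _, _, tail = line.partition("unhandled instruction bytes:")
    # Each token looks like 0xCD; int(..., 16) accepts the 0x prefix.
    return [int(token, 16) for token in tail.split()]


def classify_int_imm8(code):
    """Mimic the quoted branch: opcode 0xCD is INT imm8, only imm8 == 0xD2 is handled."""
    if len(code) < 2 or code[0] != 0xCD:
        return "not an INT imm8 instruction"
    imm8 = code[1]
    if imm8 == 0xD2:
        return "int $0xD2: handled (Solaris fasttrap syscall)"
    return f"int $0x{imm8:X}: decode failure"
```

Feeding it the line from the report (bytes starting `0xCD 0x80`) gives the `int $0x80` decode failure described above, which is why switching the generated code to `syscall` avoids the problem.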
|
|
From: Eliot M. <mo...@cs...> - 2023-04-18 15:50:31
|
On 4/18/2023 10:51 AM, folkert wrote:
> Hi,
>
> I wrote a compiler for brainfuck to x86.
> The result is quite fast but I was curious if I could tune it even more.
> So I ran it in callgrind but this resulted in:
>
> folkert@snsv ~/Projects/bf-compiler (master)$ valgrind --tool=callgrind ./test
> ==77043== Callgrind, a call-graph generating cache profiler
> ==77043== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
> ==77043== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
> ==77043== Command: ./test
> ==77043==
> ==77043== For interactive control, run 'callgrind_control -h'.
> vex amd64->IR: unhandled instruction bytes: 0xCD 0x80 0xC3 0x67 0x80 0x3E 0x0 0x74 0x5 0x83
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> ==77043== valgrind: Unrecognised instruction at address 0x40274e.
> ==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
> ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
> ==77043== Your program just tried to execute an instruction that Valgrind
> ==77043== did not recognise. There are two possible reasons for this.
> ==77043== 1. Your program has a bug and erroneously jumped to a non-code
> ==77043== location. If you are running Memcheck and you just saw a
> ==77043== warning about a bad jump, it's probably your program's fault.
> ==77043== 2. The instruction is legitimate but Valgrind doesn't handle it,
> ==77043== i.e. it's Valgrind's fault. If you think this is the case or
> ==77043== you are not sure, please let us know and we'll try to fix it.
> ==77043== Either way, Valgrind will now raise a SIGILL signal which will
> ==77043== probably kill your program.
> ==77043==
> ==77043== Process terminating with default action of signal 4 (SIGILL)
> ==77043== Illegal opcode at address 0x40274E
> ==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
> ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
> ==77043==
> ==77043== Events : Ir
> ==77043== Collected : 28836
> ==77043==
> ==77043== I refs: 28,836
> Illegal instruction (core dumped)
>
> If you're curious what is going wrong here, the source assembly and the
> x86 binary can be retrieved from
> https://vanheusden.com/permshare/callgrind-error.tar.xz
>
> Oh and if you would like to assemble the assembly yourself:
>
> as -g mandelbrot.s
> ld -g a.out -o test
>
> ./test then results in the mandelbrot-fractal.

Using an online disassembler, I found that the initial bytes decode to
int 0x80, which (under Linux) is a system call. Maybe you're making a
system call that valgrind does not recognize? One would need to know
register contents to go further with that.

Btw, naming a program "test" is not necessarily a wonderful idea if
the current directory happens to be on your path, since "test" is a
program often used by scripts.

Cheers - Eliot Moss |
|
From: folkert <fo...@va...> - 2023-04-18 15:46:41
|
> > I wrote a compiler for brainfuck to x86.
> > The result is quite fast but I was curious if I could tune it even more.
> > So I ran it in callgrind but this resulted in:
...
> > ==77043== Process terminating with default action of signal 4 (SIGILL)
> > ==77043== Illegal opcode at address 0x40274E
> > ==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
> > ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
...
> > If you're curious what is going wrong here, the source assembly and the
> > x86 binary can be retrieved from
> > https://vanheusden.com/permshare/callgrind-error.tar.xz
...
> Using an online disassembler, I found that the initial bytes decode to
> int 0x80, which (under Linux) is a system call. Maybe you're making a
> system call that valgrind does not recognize? One would need to know
> register contents to go further with that.

The 2 calls it does are:

print_char:
movb (%esi), %al
movb %al, buffer
movl $4, %eax
movl $1, %ebx
movl $buffer, %ecx
movl $1, %edx
int $0x80
ret

exit:
movl $1, %eax
movl $0, %ebx
int $0x80

When the program is run directly from the command line, it runs fine.
So that's not the problem. |
|
From: folkert <fo...@va...> - 2023-04-18 15:12:47
|
Hi,

I wrote a compiler for brainfuck to x86.
The result is quite fast but I was curious if I could tune it even more.
So I ran it in callgrind but this resulted in:

folkert@snsv ~/Projects/bf-compiler (master)$ valgrind --tool=callgrind ./test
==77043== Callgrind, a call-graph generating cache profiler
==77043== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==77043== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==77043== Command: ./test
==77043==
==77043== For interactive control, run 'callgrind_control -h'.
vex amd64->IR: unhandled instruction bytes: 0xCD 0x80 0xC3 0x67 0x80 0x3E 0x0 0x74 0x5 0x83
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==77043== valgrind: Unrecognised instruction at address 0x40274e.
==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== Your program just tried to execute an instruction that Valgrind
==77043== did not recognise. There are two possible reasons for this.
==77043== 1. Your program has a bug and erroneously jumped to a non-code
==77043== location. If you are running Memcheck and you just saw a
==77043== warning about a bad jump, it's probably your program's fault.
==77043== 2. The instruction is legitimate but Valgrind doesn't handle it,
==77043== i.e. it's Valgrind's fault. If you think this is the case or
==77043== you are not sure, please let us know and we'll try to fix it.
==77043== Either way, Valgrind will now raise a SIGILL signal which will
==77043== probably kill your program.
==77043==
==77043== Process terminating with default action of signal 4 (SIGILL)
==77043== Illegal opcode at address 0x40274E
==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043==
==77043== Events : Ir
==77043== Collected : 28836
==77043==
==77043== I refs: 28,836
Illegal instruction (core dumped)

If you're curious what is going wrong here, the source assembly and the
x86 binary can be retrieved from
https://vanheusden.com/permshare/callgrind-error.tar.xz

Oh and if you would like to assemble the assembly yourself:

as -g mandelbrot.s
ld -g a.out -o test

./test then results in the mandelbrot-fractal.

Regards,

Folkert van Heusden |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-17 07:20:13
|
I am planning to also remove the `-I`/`--include` option from cg_annotate,
for much the same reasons that I removed user annotated files: it's an
option that made sense in the very early days of cg_annotate, but is of
little or no use today, and it's getting in the way of some other changes I
want to make.
Nick
On Tue, 4 Apr 2023 at 15:52, Nicholas Nethercote <n.n...@gm...>
wrote:
> There were no objections, and I have now removed user annotations from
> `cg_annotate`.
>
> Nick
>
> On Wed, 29 Mar 2023 at 09:03, Nicholas Nethercote <n.n...@gm...>
> wrote:
>
>> Hi,
>>
>> I recently rewrote `cg_annotate`, `cg_diff`, and `cg_merge` in Python.
>> The old versions were written in Perl, Perl, and C, respectively. The new
>> versions are much nicer and easier to modify, and I have various ideas for
>> improving `cg_annotate`. This email is about one of those ideas.
>>
>> A typical way to invoke `cg_annotate` is like this:
>>
>> > cg_annotate cachegrind.out.12345
>>
>> This implies `--auto=yes`, which requests line-by-line "auto-annotation"
>> of source files. I.e. `cg_annotate` will automatically annotate all files
>> in the profile that meet the significance threshold.
>>
>> It's also possible to do something like this:
>>
>> > cg_annotate --auto=no cachegrind.out.12345 a.c b.c
>>
>> Which instead requests "user annotation" of the files `a.c` and `b.c`.
>>
>> My thesis is that auto-annotation suffices in practice for all reasonable
>> use cases, and that user annotation is unnecessary and can be removed.
>>
>> When I first wrote `cg_annotate` in 2002, only user annotation was
>> implemented. Shortly after, I added the `--auto={yes,no}` option. Since
>> then I've never used user annotation, and I suspect nobody else has either.
>> User annotation is ok when dealing with tiny programs, but as soon as you
>> are profiling a program with more than a handful of source files it becomes
>> impractical.
>>
>> The only possible use cases I can think of for user annotation are as
>> follows.
>>
>> - If you want to see a particular file(s) annotated but you don't
>> want to see any others, then you can use user annotation in combination
>> with `--auto=no`. But it's trivial to search through the output for the
>> particular file, so this doesn't seem important.
>> - If the path to a file is somehow really messed up in the debug
>> info, it might be possible that auto-annotation would fail to find it, but
>> user annotation could find it, possibly in combination with `-I`. But this
>> seems unlikely. Some basic testing shows that gcc, clang and rustc all
>> default to using full paths in debug info. gcc supports
>> `-fdebug-prefix-map` but that seems to mostly be used for changing full
>> paths to relative paths, which will still work fine.
>>
>> Removing user annotation would (a) simplify the code and docs, and (b)
>> enable the possibility of moving the merge functionality from `cg_merge`
>> into `cg_annotate`, by allowing the user to specify multiple cachegrind.out
>> files as input.
>>
>> So: is anybody using user annotation? Does anybody see any problems with
>> this proposal?
>>
>> Thanks.
>>
>> Nick
>>
>
|
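The change discussed in this thread (dropping user annotation so that several cachegrind.out files can be passed as input) can be pictured with a small command-line builder. This is a sketch only: `cg_annotate_argv` is a hypothetical helper, and it assumes that passing several data files merges them and that `--diff` is given exactly two profiles, as the 3.21.0 NEWS entry suggests. It builds an argument list and does not run anything.

```python
def cg_annotate_argv(data_files, diff=False):
    """Build a cg_annotate command line for merging or diffing profiles.

    Assumption: several positional data files are merged, and --diff
    compares exactly two profiles.
    """
    files = list(data_files)
    if not files:
        raise ValueError("at least one cachegrind.out file is required")
    if diff and len(files) != 2:
        raise ValueError("diffing is assumed to take exactly two profiles")
    argv = ["cg_annotate"]
    if diff:
        argv.append("--diff")
    argv.extend(files)
    return argv
```

The resulting list could be handed to `subprocess.run` on a machine with Valgrind 3.21 or later installed.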
|
From: Nicholas N. <n.n...@gm...> - 2023-04-16 21:15:32
|
Hi,

My plans for the release:

- I have one more significant improvement to `cg_annotate` to come, which
  will add merge and diff capability to it, in a way that is better than
  the merge/diff capability provided by `cg_merge` and `cg_diff`.
- I need to update the Cachegrind docs and the NEWS file for all the
  changes I've made.

I know these will be happening late in the release cycle, but because it's
all Python code it should require less testing. The likelihood of
platform-specific differences in behaviour is much lower than in most
other code within Valgrind.

Nick

On Sat, 15 Apr 2023 at 12:07, Mark Wielaard <ma...@kl...> wrote:
> An RC1 tarball for 3.21.0 is now available at
> https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC1.tar.bz2
> (md5sum = a3c7eeff47262cecdf5f1d68b38710b7)
> (sha1sum = 46fc5898415001e045abc1b4e2909a41144ed9c4)
> https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC1.tar.bz2.asc
>
> Please give it a try in configurations that are important for you and
> report any problems you have, either on this mailing list, or
> (preferably) via our bug tracker at
> https://bugs.kde.org/enter_bug.cgi?product=valgrind
>
> There are still some patches being reviewed and a RC2 will appear end
> of next week. If nothing critical emerges after that, a final release
> will happen on Friday 28 April.
>
>
>
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
> |
|
From: Mark W. <ma...@kl...> - 2023-04-15 02:07:05
|
An RC1 tarball for 3.21.0 is now available at
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC1.tar.bz2
(md5sum = a3c7eeff47262cecdf5f1d68b38710b7)
(sha1sum = 46fc5898415001e045abc1b4e2909a41144ed9c4)
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC1.tar.bz2.asc

Please give it a try in configurations that are important for you and
report any problems you have, either on this mailing list, or
(preferably) via our bug tracker at
https://bugs.kde.org/enter_bug.cgi?product=valgrind

There are still some patches being reviewed and a RC2 will appear end
of next week. If nothing critical emerges after that, a final release
will happen on Friday 28 April. |
|
From: Mark W. <ma...@kl...> - 2023-04-11 07:58:06
|
Hi,

Working towards a new release (3.21.0 currently planned for April 28)
there is a bit more automation to show pre-releases and documentation:
https://snapshots.sourceware.org/valgrind/trunk/

Every 15 minutes a buildbot will check for new commits and create a
dist, html manual pages and documentation downloads from latest git
trunk (currently the git master branch, after the release we'll switch
to using the main branch).

Be careful, these aren't official releases, it is as if getting a
random git checkout, but hopefully it is useful to see what is coming
and have the latest (draft) documentation for new features for the
next release.

If you try these out please explicitly mention which snapshot you used
in any bug reports.

Cheers,

Mark |
|
From: David F. <fa...@kd...> - 2023-04-04 10:49:32
|
On Tuesday 4 April 2023 12:11:18 CEST Nicholas Nethercote wrote:
> On Tue, 4 Apr 2023 at 19:24, David Faure <fa...@kd...> wrote:
> > But then, with no cache simulation and no call stacks, what's left in
> > `cachegrind --cache-sim=no`?
>
> From the email that started this thread:
> > If you run with `--cache-sim=no` then the cache simulation is disabled and
> > you just get one event: Ir. (This is "instruction cache reads", which is
> > equivalent to "instructions executed".)

Ah, right, sorry.

So to summarize the big picture:

cachegrind -> instructions count, without call stacks, useful for overall numbers or with cg_annotate
callgrind -> instructions count, with call stacks, best viewed in kcachegrind

I wish those two could do cycles and not just instructions, but I guess
this requires a good cache simulator again, back to square one ;)
(perf does cycles, but doesn't give exact number of method calls, that's
one benefit of cachegrind/callgrind)

--
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE Frameworks 5 |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-04 10:11:36
|
On Tue, 4 Apr 2023 at 19:24, David Faure <fa...@kd...> wrote:
>
> But then, with no cache simulation and no call stacks, what's left in
> `cachegrind --cache-sim=no`?
>

From the email that started this thread:

> If you run with `--cache-sim=no` then the cache simulation is disabled and
> you just get one event: Ir. (This is "instruction cache reads", which is
> equivalent to "instructions executed".)
> |
|
From: David F. <fa...@kd...> - 2023-04-04 09:25:06
|
On Monday 3 April 2023 23:46:46 CEST Nicholas Nethercote wrote:
> On Mon, 3 Apr 2023 at 21:36, David Faure <fa...@kd...> wrote:
> > But then, what's the difference between `cachegrind --cache-sim=no`
> > and `callgrind`?
> >
> > https://accu.org/journals/overload/20/111/floyd_1886/ says
> > "The main differences are that Callgrind has more information about the
> > callstack whilst cachegrind gives more information about cache hit rates."
> >
> > Wouldn't one want callstacks? (if this means stack traces).
> > I know I must be missing something, thanks for enlightening me.
>
> Callgrind is a forked and extended version of Cachegrind. It also simulates
> a cache, with a slightly different simulation to Cachegrind's. The fact
> that both tools exist is due to historical reasons; if starting from
> scratch today you wouldn't deliberately split them.

Thanks for the information.
This is indeed confusing - like anything that is "due to historical reasons" ;-)

> Call stacks are often useful (I regularly use Callgrind as well as
> Cachegrind) but they aren't always necessary. Without them, Cachegrind runs
> faster than Callgrind and produces smaller data files. Cachegrind also
> supports diffing and merging different files, while Callgrind does not.

OK. I thought call stacks were mandatory for any tool to be useful
(they certainly are for KCachegrind (*)), but I now found the
documentation on cg_annotate.

But then, with no cache simulation and no call stacks, what's left in
`cachegrind --cache-sim=no`?

(*) This naming adds to the confusion: kcachegrind requires callgrind,
it can't work with cachegrind... I know, historical reasons :-)

--
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE Frameworks 5 |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-04 05:52:49
|
There were no objections, and I have now removed user annotations from
`cg_annotate`.
Nick
On Wed, 29 Mar 2023 at 09:03, Nicholas Nethercote <n.n...@gm...>
wrote:
> Hi,
>
> I recently rewrote `cg_annotate`, `cg_diff`, and `cg_merge` in Python. The
> old versions were written in Perl, Perl, and C, respectively. The new
> versions are much nicer and easier to modify, and I have various ideas for
> improving `cg_annotate`. This email is about one of those ideas.
>
> A typical way to invoke `cg_annotate` is like this:
>
> > cg_annotate cachegrind.out.12345
>
> This implies `--auto=yes`, which requests line-by-line "auto-annotation"
> of source files. I.e. `cg_annotate` will automatically annotate all files
> in the profile that meet the significance threshold.
>
> It's also possible to do something like this:
>
> > cg_annotate --auto=no cachegrind.out.12345 a.c b.c
>
> Which instead requests "user annotation" of the files `a.c` and `b.c`.
>
> My thesis is that auto-annotation suffices in practice for all reasonable
> use cases, and that user annotation is unnecessary and can be removed.
>
> When I first wrote `cg_annotate` in 2002, only user annotation was
> implemented. Shortly after, I added the `--auto={yes,no}` option. Since
> then I've never used user annotation, and I suspect nobody else has either.
> User annotation is ok when dealing with tiny programs, but as soon as you
> are profiling a program with more than a handful of source files it becomes
> impractical.
>
> The only possible use cases I can think of for user annotation are as
> follows.
>
> - If you want to see a particular file(s) annotated but you don't want
> to see any others, then you can use user annotation in combination with
> `--auto=no`. But it's trivial to search through the output for the
> particular file, so this doesn't seem important.
> - If the path to a file is somehow really messed up in the debug info,
> it might be possible that auto-annotation would fail to find it, but user
> annotation could find it, possibly in combination with `-I`. But this seems
> unlikely. Some basic testing shows that gcc, clang and rustc all default to
> using full paths in debug info. gcc supports `-fdebug-prefix-map` but that
> seems to mostly be used for changing full paths to relative paths, which
> will still work fine.
>
> Removing user annotation would (a) simplify the code and docs, and (b)
> enable the possibility of moving the merge functionality from `cg_merge`
> into `cg_annotate`, by allowing the user to specify multiple cachegrind.out
> files as input.
>
> So: is anybody using user annotation? Does anybody see any problems with
> this proposal?
>
> Thanks.
>
> Nick
>
|
|
From: Nicholas N. <n.n...@gm...> - 2023-04-03 21:47:06
|
On Mon, 3 Apr 2023 at 21:36, David Faure <fa...@kd...> wrote:
>
> But then, what's the difference between `cachegrind --cache-sim=no`
> and `callgrind`?
>
> https://accu.org/journals/overload/20/111/floyd_1886/ says
> "The main differences are that Callgrind has more information about the
> callstack whilst cachegrind gives more information about cache hit rates."
>
> Wouldn't one want callstacks? (if this means stack traces).
> I know I must be missing something, thanks for enlightening me.
>

Callgrind is a forked and extended version of Cachegrind. It also simulates
a cache, with a slightly different simulation to Cachegrind's. The fact
that both tools exist is due to historical reasons; if starting from
scratch today you wouldn't deliberately split them.

Call stacks are often useful (I regularly use Callgrind as well as
Cachegrind) but they aren't always necessary. Without them, Cachegrind runs
faster than Callgrind and produces smaller data files. Cachegrind also
supports diffing and merging different files, while Callgrind does not.

Nick |
|
From: David F. <fa...@kd...> - 2023-04-03 11:56:17
|
[removing valgrind-developers, since I guess I can't post there]

On Monday 3 April 2023 11:29:25 CEST Nicholas Nethercote wrote:
> I have been using `--cache-sim=no` almost exclusively for a long time. The
> cache simulation done by Valgrind is an approximation of the memory
> hierarchy of a 2002 AMD Athlon processor. Its accuracy for a modern memory
> hierarchy with three levels of cache, prefetching, non-LRU replacement, and
> who-knows-what-else is likely to be low. If you want to accurately know
> about cache behaviour you'd be much better off using hardware counters via
> `perf` or some other profiler.
>
> But `--cache-sim=no` is still very useful because instruction execution
> counts are still very useful.
>
> Therefore, I propose changing the default to `--cache-sim=no`. Does anyone
> have any objections to this?

I agree that simulating a cache from 2002 isn't very useful.

But then, what's the difference between `cachegrind --cache-sim=no`
and `callgrind`?

https://accu.org/journals/overload/20/111/floyd_1886/ says
"The main differences are that Callgrind has more information about the
callstack whilst cachegrind gives more information about cache hit rates."

Wouldn't one want callstacks? (if this means stack traces).
I know I must be missing something, thanks for enlightening me.

--
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE Frameworks 5 |
|
From: Nicholas N. <n.n...@gm...> - 2023-04-03 09:29:43
|
Hi,

Cachegrind has an option `--cache-sim`. If you run with `--cache-sim=yes`
(the default) it tells Cachegrind to do a full cache simulation with lots
of events: Ir, I1mr, ILmr, Dr, D1mr, DLmr, Dw, D1mw, DLmw. If you run with
`--cache-sim=no` then the cache simulation is disabled and you just get one
event: Ir. (This is "instruction cache reads", which is equivalent to
"instructions executed".)

I have been using `--cache-sim=no` almost exclusively for a long time. The
cache simulation done by Valgrind is an approximation of the memory
hierarchy of a 2002 AMD Athlon processor. Its accuracy for a modern memory
hierarchy with three levels of cache, prefetching, non-LRU replacement, and
who-knows-what-else is likely to be low. If you want to accurately know
about cache behaviour you'd be much better off using hardware counters via
`perf` or some other profiler.

But `--cache-sim=no` is still very useful because instruction execution
counts are still very useful.

Therefore, I propose changing the default to `--cache-sim=no`. Does anyone
have any objections to this?

Thanks.

Nick |
|
From: Paul F. <pj...@wa...> - 2023-03-30 07:15:41
|
On 29-03-23 04:41, John Reiser wrote:
>>> Could it be possible to add an option like --heap-up-fill
>>> --heap-down-fill (like for stack with malloc), that fills heap memory
>>> with a specified values (when entering a function and leave a function)?
>
>> tl;dr 2
>>
>> See
>> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2723r0.html
>
> The value zero is the worst possible value to use for such initialization,
> from the viewpoint of quickly producing better software by discovering
> and identifying bugs sooner. Using the value zero tends to hide many bugs.
>
> A much better value is 0x8181...81. This value is non-zero, odd, negative
> as a signed integer, a very unlikely floating-point value, very often
> not a valid pointer value, and instantly recognizable in any dump of
> memory.
> It was used to great success as the "core constant" (the value of
> uninitialized RAM) by the Michigan Terminal System for IBM 360/67 and
> successors, from the early 1970s (50 years ago!) until the demise of MTS
> around 2000.

Hi John

The value 0 isn't all bad. I quite often write code that uses enums that
start with KIND_INVALID so then I can write asserts like

assert(kind != KIND_INVALID);

0 is also the NULL pointer so if your code defends against NULL it will at
least not crash.

I agree with you about early detection of errors - in a previous job we had
loads of "pass the parcel" code that just ignored errors and returned from
functions without reporting an error. It was a nightmare to debug.

The value used with 'pattern' is 0xAA which isn't too bad either - fairly
well known as being a test pattern.

A+
Paul |
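The zero-valued sentinel idiom described in this message can be sketched roughly as follows. Only `KIND_INVALID` and the `assert` come from the message; `struct item`, the other enum members, and both helper functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum: only KIND_INVALID appears in the message.
   Because it is the first member, it has the value 0. */
enum kind { KIND_INVALID = 0, KIND_NUMBER, KIND_STRING };

/* Hypothetical record carrying a kind tag. */
struct item {
    enum kind kind;
    int payload;
};

/* Returns 1 once the item has been given a real kind,
   0 while it is still zero-filled. */
static int item_is_initialized(const struct item *it)
{
    return it->kind != KIND_INVALID;
}

/* Guarded accessor in the style of the assert quoted in the message:
   a zero-filled item trips the assert instead of being used silently. */
static int item_payload(const struct item *it)
{
    assert(it->kind != KIND_INVALID);
    return it->payload;
}
```

With this layout, a `struct item` that was cleared with `memset(&it, 0, sizeof it)` and never assigned a kind is reported by `item_is_initialized` as uninitialized, which is the sense in which zero fill "isn't all bad" here.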
|
From: John R. <jr...@bi...> - 2023-03-29 02:41:49
|
>> Could it be possible to add an option like --heap-up-fill --heap-down-fill
>> (like for stack with malloc), that fills heap memory with a specified values
>> (when entering a function and leave a function)?

> tl;dr 2
>
> See
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2723r0.html

The value zero is the worst possible value to use for such initialization,
from the viewpoint of quickly producing better software by discovering
and identifying bugs sooner. Using the value zero tends to hide many bugs.

A much better value is 0x8181...81. This value is non-zero, odd, negative
as a signed integer, a very unlikely floating-point value, very often
not a valid pointer value, and instantly recognizable in any dump of memory.
It was used to great success as the "core constant" (the value of uninitialized
RAM) by the Michigan Terminal System for IBM 360/67 and successors,
from the early 1970s (50 years ago!) until the demise of MTS around 2000. |
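The integer properties claimed for the 0x8181...81 fill value can be checked with a small sketch like the one below. The helper names `fill_core_constant` and `first_word` are hypothetical, and the sketch only looks at a 32-bit view of the filled memory; it says nothing about how MTS or Valgrind implement filling.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill a buffer with the repeating 0x81 byte. */
static void fill_core_constant(void *buf, size_t len)
{
    memset(buf, 0x81, len);
}

/* Read the first 32 bits of a buffer as a signed integer.
   memcpy avoids alignment and aliasing problems; since every byte is
   the same after filling, the result does not depend on endianness. */
static int32_t first_word(const void *buf)
{
    int32_t v;
    memcpy(&v, buf, sizeof v);
    return v;
}
```

After `fill_core_constant(buf, sizeof buf)`, `first_word(buf)` is non-zero, has its low bit set (odd), and is negative because the top bit of 0x81818181 is set.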
|
From: Nicholas N. <n.n...@gm...> - 2023-03-28 22:04:03
|
Hi,
I recently rewrote `cg_annotate`, `cg_diff`, and `cg_merge` in Python. The
old versions were written in Perl, Perl, and C, respectively. The new
versions are much nicer and easier to modify, and I have various ideas for
improving `cg_annotate`. This email is about one of those ideas.
A typical way to invoke `cg_annotate` is like this:
> cg_annotate cachegrind.out.12345
This implies `--auto=yes`, which requests line-by-line "auto-annotation" of
source files. I.e. `cg_annotate` will automatically annotate all files in
the profile that meet the significance threshold.
It's also possible to do something like this:
> cg_annotate --auto=no cachegrind.out.12345 a.c b.c
Which instead requests "user annotation" of the files `a.c` and `b.c`.
My thesis is that auto-annotation suffices in practice for all reasonable
use cases, and that user annotation is unnecessary and can be removed.
When I first wrote `cg_annotate` in 2002, only user annotation was
implemented. Shortly after, I added the `--auto={yes,no}` option. Since
then I've never used user annotation, and I suspect nobody else has either.
User annotation is ok when dealing with tiny programs, but as soon as you
are profiling a program with more than a handful of source files it becomes
impractical.
The only possible use cases I can think of for user annotation are as
follows.
- If you want to see a particular file(s) annotated but you don't want
to see any others, then you can use user annotation in combination with
`--auto=no`. But it's trivial to search through the output for the
particular file, so this doesn't seem important.
- If the path to a file is somehow really messed up in the debug info,
it might be possible that auto-annotation would fail to find it, but user
annotation could find it, possibly in combination with `-I`. But this seems
unlikely. Some basic testing shows that gcc, clang and rustc all default to
using full paths in debug info. gcc supports `-fdebug-prefix-map` but that
seems to mostly be used for changing full paths to relative paths, which
will still work fine.
Removing user annotation would (a) simplify the code and docs, and (b)
enable the possibility of moving the merge functionality from `cg_merge`
into `cg_annotate`, by allowing the user to specify multiple cachegrind.out
files as input.
So: is anybody using user annotation? Does anybody see any problems with
this proposal?
Thanks.
Nick
|
|
From: Paul F. <pj...@wa...> - 2023-03-28 12:58:24
|
On 28-03-23 11:40, Julien Allali wrote:
> Hi,
>
> Sometimes, valgrind detects error like "Conditional jump or move
> depends" or "Use of uninitialized value" related to a variable in heap.
> When using with gdb (--vgdb-error=1), a newbie (i.e. my students) can
> have difficulties to understand as the value stored is 0 (because there
> was zeros in heap, not because we set 0 to the variable).
>
> Could it be possible to add an option like --heap-up-fill
> --heap-down-fill (like for stack with malloc), that fills heap memory
> with a specified values (when entering a function and leave a function)?
>
> Would it be complicated to implement?

Hi

tl;dr

I recommend using the vgdb monitor commands, they show what memcheck
considers initialized or not. You just have to type something like
"memcheck monitor xb &var sizeof(var)"

tl;dr 2

See
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2723r0.html

Having a "stack fill" in Valgrind would be very difficult. The problem is
identifying exactly what is a stack allocation. Identifying a call to
malloc is easy - we have the address of the function. Stack allocation is,
roughly, just modifying the stack pointer.

The simplest case is where the callee does the stack allocation, on amd64
something like

  201910: 55                      push   %rbp
  201911: 48 89 e5                mov    %rsp,%rbp
  201914: 48 83 ec 10             sub    $0x10,%rsp

where the constant in the last line depends on the amount being allocated.
The initial push and move won't be there for optimized builds or if
-fomit-frame-pointer is specified.

If the compiler is doing RVO then the stack memory will be allocated by the
caller. See here

https://godbolt.org/z/qasjWGhsK

Line 25 in the asm output is where main allocates the space on the stack.

Then there is tail-call optimization where there is no call, just a jump to
the callee.

And then all of the other things that manipulate the stack pointer (alloca,
C++ exceptions, signals, setjmp/longjmp). There are a few compiler options
that affect stack use.

On top of all that each compiler tends to do things differently.

I don't know what would be possible with DWARF, but that would only work
with debug builds.

Going back to my second tl;dr

If I compile the godbolt example with

  clang++-devel -ftrivial-auto-var-init=pattern jfb.cpp -o jfb

(that's LLVM 16) then I get

  2019a2: be aa 00 00 00          mov    $0xaa,%esi
  2019a7: ba 00 10 00 00          mov    $0x1000,%edx
  2019ac: e8 bf 00 00 00          call   201a70 <memset@plt>
  2019b1: 48 8d bd 00 f0 ff ff    lea    -0x1000(%rbp),%rdi
  2019b8: e8 73 ff ff ff          call   201930 <_Z1fv>

which is the caller filling the 4k with 0xaa.

A+
Paul |
|
From: Julien A. <jul...@en...> - 2023-03-28 10:00:04
|
Hi,

Sometimes, valgrind detects error like "Conditional jump or move depends"
or "Use of uninitialized value" related to a variable in heap. When using
with gdb (--vgdb-error=1), a newbie (i.e. my students) can have
difficulties to understand as the value stored is 0 (because there was
zeros in heap, not because we set 0 to the variable).

Could it be possible to add an option like --heap-up-fill --heap-down-fill
(like for stack with malloc), that fills heap memory with a specified
values (when entering a function and leave a function)?

Would it be complicated to implement?

thanks :)

Julien. |
|
From: Floyd, P. <pj...@wa...> - 2023-03-16 13:57:00
|
On 15/03/2023 02:22, 骤变成玄武 via Valgrind-users wrote:
> 2 Problem description:
> -- source code: mmap a mdev device
>

What I usually do is run pmap -x on the guest exe running standalone
(adding a "sleep" if necessary) and then run valgrind -d. That should
give you memory maps that you can compare.

A+
Paul |
|
From: <101...@qq...> - 2023-03-15 01:22:53
|
1 version:
valgrind-3.20.0
4.19.25-200.el7.bclinux.x86_64
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36)
2 Problem description:
-- source code: mmap a mdev device
static int test_mmap_bar(void)
{
    const char *path="/sys/class/mdev_bus/0000:06:00.0/resource4";
    ...
    flags = O_RDWR;
    prot = PROT_READ | PROT_WRITE;
    unsigned long len = 4096;
    unsigned long offset = 0x320000;
    fd = open(path, flags | O_SYNC);
    if (fd < 0) {
        printf("[ERROR] open device err\n");
        return -1;
    }
    addr = mmap(0, len, prot, MAP_SHARED, fd, offset);
    if (addr == MAP_FAILED) {
        printf("[ERROR] mmap failed\n");
    }
    ...
}
--run with valgrind :
valgrind --leak-check=full --show-leak-kinds=all -v --log-file=valgrind.log ./build/valgrind-test --base-virtaddr=0x4000000000
-- The valgrind log reported the following error with mmap:
valgrind: m_debuginfo/image.c:587 (set_CEnt): Assertion '! sr_isError(sr)' failed.
--strace valgrind:
open("/sys/class/mdev_bus/0000:06:00.0/resource4", O_RDWR|O_SYNC) = 94
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], NULL, 8) = 0
gettid() = 13758
read(1028, "Y", 1) = 1
mmap(0x4033000, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 94, 0x320000) = 0x4033000
syscall_332(0x5e, 0x58246543, 0x1000, 0xfff, 0x1003633aa0, 0) = 0
readlink("/proc/self/fd/94", "/sys/devices/pci0000:00/0000:00:"..., 4096) = 59
syscall_332(0xffffffffffffff9c, 0x597789da, 0, 0xfff, 0x10036346d0, 0) = 0
syscall_332(0x5e, 0x58246543, 0x1000, 0xfff, 0x1003634560, 0) = 0
pread64(94, 0x100299acd0, 8192, 0) = -1 EIO (Input/output error)
Does valgrind do a pread64 after mmap?
4 run directly:
-- There is no problem with mmap
-- using pread64(..) causes an I/O error
5 Questions:
How can I use valgrind in this case? |
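For reference, the open/mmap sequence from the report can be written as a self-contained helper with the return value checked against `MAP_FAILED`. This is only a sketch under stated assumptions: `map_region` is a hypothetical name, it works on any path that supports `mmap`, and it does not address the Valgrind assertion or the `pread64` EIO shown in the strace output.

```c
#define _POSIX_C_SOURCE 200809L /* for O_SYNC, off_t and friends */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of `path` starting at `offset`,
   shared and read/write, as in the report.
   Returns the mapping, or NULL if open or mmap fails. */
static void *map_region(const char *path, size_t len, off_t offset)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, offset);
    if (addr == MAP_FAILED) { /* mmap signals failure with MAP_FAILED, not NULL */
        perror("mmap");
        close(fd);
        return NULL;
    }

    close(fd); /* the mapping stays valid after the descriptor is closed */
    return addr;
}
```

A caller would release the mapping with `munmap(addr, len)` when done. For the device in the report the call would look like `map_region(path, 4096, 0x320000)`, with `path` being the resource file named there.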