From: Jeremy F. <je...@go...> - 2005-03-04 22:38:20
CVS commit by fitzhardinge:

Documentation update. This should bring the core of the documentation up to
date with reality. Please give this a proofread. I ran out of steam at
memcheck/docs/mc_techdocs.html, which is even more hopelessly out of date.

I will note that cacheprof.org is some kind of dental insurance company now...

  M +152 -96   coregrind/docs/coregrind_core.html   1.40
  M +12 -10    coregrind/docs/coregrind_intro.html   1.9
  M +5 -4      docs/manual.html   1.51
  M +3 -3      memcheck/docs/mc_main.html   1.16
  M +8 -9      memcheck/docs/mc_techdocs.html   1.12

--- valgrind/docs/manual.html #1.50:1.51
@@ -26,12 +26,13 @@
 <a name="title"> </a>
-<h1 align=center>Valgrind, version 2.2.0</h1>
-<center>This manual was last updated on 31 August 2004</center>
+<h1 align=center>Valgrind, version 2.4.0</h1>
+<center>This manual was last updated on 4 March 2005</center>
 <p>
 <center>
 <a href="mailto:js...@ac...">js...@ac...</a>,
- <a href="mailto:nj...@ca...">nj...@ca...</a><br>
-Copyright © 2000-2004 Julian Seward, Nick Nethercote
+ <a href="mailto:nj...@ca...">nj...@ca...</a>,
+<a href="mailto:je...@go...">je...@go...</a><br>
+Copyright © 2000-2005 Julian Seward, Nick Nethercote, Jeremy Fitzhardinge
 <p>

--- valgrind/coregrind/docs/coregrind_core.html #1.39:1.40
@@ -173,5 +173,6 @@
 tree of processes at once, since it means that each process writes to
 its own logfile, rather than the result being jumbled up in one
- big logfile.
+ big logfile. If <code>filename.pid12345</code> already exists, then
+ it will name new files <code>filename.pid12345.1</code> and so on.
 <p>
 <li>The least intrusive option is to send the commentary to a network
@@ -246,5 +247,5 @@
 ==25832== by 0x40371E5E: __libc_start_main (libc-start.c:129)
 ==25832== by 0x80485D1: (within /home/sewardj/newmat10/bogon)
- ==25832== Address 0xBFFFF74C is not stack'd, malloc'd or free'd
+ ==25832== Address 0xBFFFF74C is not stack'd, malloc'd or free'd
 </pre>
@@ -500,5 +501,5 @@
 <li><code>--help-debug</code><br>
 <p>Same as <code>--help</code>, but also lists debugging options which
- usually are only of use to developers.</li><br><p>
+ usually are only of use to Valgrind's developers.</li><br><p>
 <li><code>--version</code><br>
 <p>Show the version number of the
@@ -727,8 +728,8 @@
 <li><code>--alignment=<number></code> [default: 8]<br>
 <p>By default Valgrind's <code>malloc</code>, <code>realloc</code>,
- etc, return 4-byte aligned addresses. These are suitable for
+ etc, return 8-byte aligned addresses. These are suitable for
 any accesses on x86 processors. Some programs might however
 assume that <code>malloc</code> et
- al return 8- or more aligned memory. The supplied value must be
+ al return 16- or more aligned memory. The supplied value must be
 between 4 and 4096 inclusive, and must be a power of two.</li><br><p>
@@ -786,4 +787,25 @@
 </li><br><p>
+ <li><code>--pointercheck=yes</code> [default]<br>
+ <code>--pointercheck=no</code>
+ <p>
+ This option make Valgrind generate a check on every memory
+ reference to make sure it is within the client's part of the
+ address space. This prevents stray writes from damaging
+ Valgrind itself. On x86, this uses the CPU's segmentation
+ machinery, and has almost no performance cost; there's almost
+ never a reason to turn it off.
+
+ <li><code>--branchpred=no</code> [default]<br>
+ <code>--branchpred=yes</code>
+ <p>
+ This option enables the generation of static branch prediction
+ hints.
In theory this allows the real CPU to do a better job of
+ running the generated code, but in practice it makes almost no
+ measurable difference. It may have a large effect on some x86
+ implementations. Try it out; if it makes a difference for you,
+ put <code>--branchpred=yes</code> into your ~/.valgrindrc and
+ tell us about it.
+
 <li><code>--weird-hacks=hack1,hack2,...</code>
 Pass miscellaneous hints to Valgrind which slightly modify the
@@ -798,4 +820,12 @@
 device drivers with a large number of strange ioctl commands
 becomes very tiresome.
+ <li><code>ioctl-mmap</code> Some ioctl requests can mmap new memory into
+ your process address space. If Valgrind doesn't know about these
+ mappings, it could put new mappings over them, and/or complain bitterly
+ when your program uses them. This option makes Valgrind scan
+ the address space for new mappings after each unknown ioctl has
+ finished. You may also need to run with <code>--pointercheck=no</code>
+ if the ioctl decides to place the mapping out of the client's
+ usual address space.
 </ul>
 </li><br><p>
@@ -811,5 +841,8 @@
 <p>When enabled, each x86 insn is translated separately into
 instrumented code. When disabled, translation is done on a
- per-basic-block basis, giving much better translations.</li><br>
+ per-basic-block basis, giving much better translations.
+ This option is very useful if your program expects precise
+ exceptions (if it, for example, inspects or modifies register
+ state from within a signal handler).</li><br>
 <p>
@@ -870,7 +903,7 @@
 <p>
- <li><code>--dump-error=<number></code> [default: inactive]
+ <li><code>--dump-error=<number></code> [default: inactive]
 <p>After the program has exited, show gory details of the
- translation of the basic block containing the <number>'th
+ translation of the basic block containing the <number>'th
 error context. When used with <code>--single-step=yes</code>,
 can show the exact x86 instruction causing an error.
This is
@@ -925,15 +958,29 @@
 Clients need to include a header file to make this work. Which header file
 depends on which client requests you use. Some client requests are handled by
-the core, and are defined in the header file <code>valgrind.h</code>.
+the core, and are defined in the header file <code>valgrind/valgrind.h</code>.
 Tool-specific header files are named after the tool, e.g.
-<code>memcheck.h</code>. All header files can be found in the
-<code>include</code> directory of wherever Valgrind was installed.
+<code>valgrind/memcheck.h</code>. All header files can be found in the
+<code>include/valgrind</code> directory of wherever Valgrind was installed.
 <p>
-The macros in these header files have the magical property that
-they generate code in-line which Valgrind can spot. However, the code
-does nothing when not run on Valgrind, so you are not forced to run
-your program on Valgrind just because you use the macros in this file.
+The macros in these header files have the magical property that they
+generate code in-line which Valgrind can spot. However, the code does
+nothing when not run on Valgrind, so you are not forced to run your
+program on Valgrind just because you use the macros in this file.
 Also, you are not required to link your program with any extra
-supporting libraries.
+supporting libraries. The code left in your binary has minimal
+performance impact.
+<p>
+If you really wish to compile out the client requests, you can compile
+with <code>-DNVALGRIND</code> (analogous to <code>-DNDEBUG</code>'s
+effect on <code>assert()</code>).
+<p>
+You are encouraged to copy the <code>valgrind/*.h</code> headers into
+your project's include directory, so your program doesn't have a
+compile-time dependency on Valgrind being installed. The Valgrind
+headers, unlike the rest of the code, is under a BSD-style license so
+you may include them without worrying about license incompatibility.
+The macros in <code>valgrind/*.h</code> will be forwards and backwards
+compatible across all versions of Valgrind (from 2.0.0 onwards); at
+worst a macro will do nothing.
 <p>
 Here is a brief description of the macros available in
@@ -941,6 +988,8 @@
 tool-specific documentation for explanations of the tool-specific macros).
 <ul>
-<li><code>RUNNING_ON_VALGRIND</code>: returns 1 if running on
- Valgrind, 0 if running on the real CPU.
+<li><code>RUNNING_ON_VALGRIND</code>: returns non-zero if running on
+ Valgrind, 0 if running on the real CPU. If you are running
+ Valgrind under itself, it will return the number of layers of
+ Valgrind emulation we're running under.
 <p>
 <li><code>VALGRIND_DISCARD_TRANSLATIONS</code>: discard translations
@@ -1023,5 +1072,5 @@
 <a name="pthreads"></a>
-<h3>2.8 Support for POSIX Pthreads</h3>
+<h3>2.8 Support for Threads</h3>
 Valgrind supports programs which use POSIX pthreads. However, it runs
@@ -1031,4 +1080,19 @@
 apps only utilise one CPU, even if you have a multiprocessor machine.
 <p>
+Your program will use the native <code>libpthread</code>, but not all
+of its facilities will work. In particular, <strong>process-shared
+synchronization WILL NOT WORK</strong>. They rely on special atomic
+instruction sequences which Valgrind does not emulate in a way which
+works between processes. Unfortunately there's no way for Valgrind to
+warn when this is happening, and such calls will mostly work; it's
+only when there's a race will it fail.
+<p>
+Valgrind also supports direct use of the <code>clone()</code> system
+call, <code>futex()</code> and so on. <code>clone()</code> is
+supported where either everything is shared (a thread) or nothing is
+shared (fork-like); partial sharing will fail. Again, any use of
+atomic instruction sequences in shared memory between processes will
+not work.
+<p>
 Valgrind schedules your threads in a round-robin fashion, with all
 threads having equal priority.
It switches threads every 50000 basic
@@ -1043,17 +1107,17 @@
 <h3>2.9 Handling of signals</h3>
-Valgrind provides suitable handling of signals, so, provided you stick
-to POSIX stuff, you should be ok. Basic sigaction() and sigprocmask()
-are handled. Signal handlers may return in the normal way or do
-longjmp(); both should work ok. As specified by POSIX, a signal is
-blocked in its own handler. Default actions for signals should work
-as before. Etc, etc.
-
-<p>Under the hood, dealing with signals is a real pain, and Valgrind's
-simulation leaves much to be desired. If your program does
-way-strange stuff with signals, bad things may happen. If so, let me
-know. I don't promise to fix it, but I'd at least like to be aware of
-it.
+<p>Valgrind has a fairly complete signal implementation. It should be
+able to cope with any valid use of signals.
+<p>If you're using signals in clever ways (for example, catching
+SIGSEGV, modifying page state and restarting the instruction), you're
+probably relying on precise exceptions. In this case, you will need
+to use <code>--single-step=yes</code>.
+
+<p>If your program dies as a result of a fatal core-dumping signal,
+Valgrind will generate its own core file
+(<code>vgcore.pidNNNNN</code>) containing your program's state. You
+may use this core file for post-mortem debugging with gdb or similar.
+(Note: it will not generate a core if your core dump size limit is 0.)
@@ -1065,6 +1129,22 @@
 attempted to ensure that it works on machines with kernel 2.4 or 2.6
 and glibc 2.2.X or 2.3.X. I don't think there is much else to say.
-There are no options apart from the usual <code>--prefix</code> that
-you should give to <code>./configure</code>.
+<p>There are two options (in addition to the usual
+<code>--prefix=</code> which affect how Valgrind is built:
+<dl>
+ <dt><code>--enable-pie</code>
+ <dd>PIE stands for "position-independent executable". This is
+ enabled by default if your toolchain supports it.
PIE allows
+ Valgrind to place itself as high as possible in memory, giving
+ your program as much address space as possible. It also allows
+ Valgrind to run under itself. If PIE is disabled, Valgrind loads
+ at a default address which is suitable for most systems. This is
+ also useful for debugging Valgrind itself.
+ <dt><code>--enable-tls</code>
+ <dd>TLS (Thread Local Storage) is a relatively new mechanism which
+ requires compiler, linker and kernel support. Valgrind automatically
+ test if TLS is supported and enable this option. Sometimes it
+ cannot test for TLS, so this option allows you to override the
+ automatic test.
+</dl>
 <p>
@@ -1125,11 +1205,7 @@
 <p>
- <li>Pthreads support is improving, but there are still significant
- limitations in that department. See the section above on
- Pthreads. Note that your program must be dynamically linked
- against <code>libpthread.so</code>, so that Valgrind can
- substitute its own implementation at program startup time. If
- you're statically linked against it, things will fail
- badly.</li>
+ <li>Atomic instruction sequences are not supported, which will
+ affect any use of synchronization objects being shared between
+ processes. They will appear to work, but fail sporadically.</li>
 <p>
@@ -1152,13 +1228,4 @@
 <p>
- <li>Valgrind's signal simulation is not as robust as it could be.
- Basic POSIX-compliant sigaction and sigprocmask functionality is
- supplied, but it's conceivable that things could go badly awry
- if you do weird things with signals. Workaround: don't.
- Programs that do non-POSIX signal tricks are in any case
- inherently unportable, so should be avoided if
- possible.</li>
- <p>
-
 <li>Programs which switch stacks are not well handled. Valgrind
 does have support for this, but I don't have great faith in it.
@@ -1221,12 +1288,5 @@
 <ul>
- <li>On Red Hat 7.3, there have been reports of link errors (at
- program start time) for threaded programs using
- <code>__pthread_clock_gettime</code> and
- <code>__pthread_clock_settime</code>. This appears to be due to
- <code>/lib/librt-2.2.5.so</code> needing them. Unfortunately I
- do not understand enough about this problem to fix it properly,
- and I can't reproduce it on my test RedHat 7.3 system. Please
- mail me if you have more information / understanding. </li><br>
+ <li>(None)</li><br>
 <p>
 </ul>
@@ -1245,38 +1305,33 @@
 <h4>2.13.1 Getting started</h4>
-Valgrind is compiled into a shared object, valgrind.so. The shell
-script valgrind sets the LD_PRELOAD environment variable to point to
-valgrind.so. This causes the .so to be loaded as an extra library to
-any subsequently executed dynamically-linked ELF binary, viz, the
-program you want to debug.
-
-<p>The dynamic linker allows each .so in the process image to have an
-initialisation function which is run before main(). It also allows
-each .so to have a finalisation function run after main() exits.
-
-<p>When valgrind.so's initialisation function is called by the dynamic
-linker, the synthetic CPU to starts up. The real CPU remains locked
-in valgrind.so for the entire rest of the program, but the synthetic
-CPU returns from the initialisation function. Startup of the program
-now continues as usual -- the dynamic linker calls all the other .so's
-initialisation routines, and eventually runs main(). This all runs on
-the synthetic CPU, not the real one, but the client program cannot
-tell the difference.
-
-<p>Eventually main() exits, so the synthetic CPU calls valgrind.so's
-finalisation function. Valgrind detects this, and uses it as its cue
-to exit. It prints summaries of all errors detected, possibly checks
-for memory leaks, and then exits the finalisation routine, but now on
-the real CPU.
The synthetic CPU has now lost control -- permanently
--- so the program exits back to the OS on the real CPU, just as it
-would have done anyway.
-
-<p>On entry, Valgrind switches stacks, so it runs on its own stack.
-On exit, it switches back. This means that the client program
-continues to run on its own stack, so we can switch back and forth
-between running it on the simulated and real CPUs without difficulty.
-This was an important design decision, because it makes it easy (well,
-significantly less difficult) to debug the synthetic CPU.
-
+Valgrind is compiled into two executables: <code>valgrind</code>, and
+<code>stage2</code>. <code>Valgrind</code> is a statically-linked
+executable which loads at the normal address (0x8048000).
+<code>Stage2</code> is a normal dynamically-linked executable; it is
+either linked to load at a high address (0xb8000000) or is a Position
+Independent Executable.
+
+<p><code>Valgrind</code> (also known as <code>stage1</code>):
+<ol>
+<li>Decides where to load stage2
+<li>Pads the address space with <code>mmap</code>, leaving holes only
+ where stage2 should load.
+<li>Loads stage2 in the same manner as <code>execve()</code> would, but "manually".
+<li>Jumps to the start of stage2
+</ol>
+
+<p>Once stage2 is loaded, it uses <code>dlopen()</code> to load the
+Tool, unmaps all traces of stage1, initializes the client's state, and
+starts the synthetic CPU.
+
+<p>Each thread runs in its own kernel thread, and loops in
+<code>VG_(schedule)</code> as it runs. When the thread terminates,
+<code>VG_(schedule)</code> returns. Once all the threads have
+terminated, Valgrind as a whole exits.
+
+<p>Each thread also has two stacks. One is the client's stack, which
+is manipulated with the client's instructions. The other is
+Valgrind's internal stack, which is used by all Valgrind's code on
+behalf of that thread. It is important to not get them confused.
 <a name="engine"></a>
@@ -1357,7 +1412,8 @@
 <h4>2.13.5 Signals</h4>
-All system calls to sigaction() and sigprocmask() are intercepted. If
-the client program is trying to set a signal handler, Valgrind makes a
-note of the handler address and which signal it is for. Valgrind then
+
+All signal-related system calls are intercepted. If the client
+program is trying to set a signal handler, Valgrind makes a note of
+the handler address and which signal it is for. Valgrind then
 arranges for the same signal to be delivered to its own handler.
@@ -1480,5 +1536,5 @@
 Or
 <p>
-<li> <code>Warning: noted but unhandled ioctl <number></code>
+<li> <code>Warning: noted but unhandled ioctl <number></code>
 <br>
 Valgrind observed a call to one of the vast family of
@@ -1488,5 +1544,5 @@
 errors after this as a result of the non-update of the memory info.
 <p>
-<li> <code>Warning: set address range perms: large range <number></code>
+<li> <code>Warning: set address range perms: large range <number></code>
 <br>
 Diagnostic message, mostly for benefit of the valgrind

--- valgrind/coregrind/docs/coregrind_intro.html #1.8:1.9
@@ -102,4 +102,7 @@
 Fitzhardinge, and we have him to thank for getting it to a
 releasable state.
+ <p><em>Note:</em>Helgrind is not functioning in 2.4.0; we hope to
+ resurrect it for the next release.
+<p>
 </ul>
@@ -117,18 +120,17 @@
 x86s. Valgrind uses the standard Unix <code>./configure</code>,
 <code>make</code>, <code>make install</code> mechanism, and we have
-attempted to ensure that it works on machines with kernel 2.2 or 2.4
-and glibc 2.1.X, 2.2.X or 2.3.1. This should cover the vast majority
-of modern Linux installations. Note that glibc-2.3.2+, with the
-NPTL (Native Posix Threads Library) package won't work. We hope to
-be able to fix this, but it won't be easy.
+attempted to ensure that it works on machines with kernel 2.4 or 2.6
+and glibc 2.1.X to 2.3.X.
 <p>
 Valgrind is licensed under the GNU General Public License, version
-2.
Read the file LICENSE in the source distribution for details. Some
-of the PThreads test cases, <code>pth_*.c</code>, are taken from
-"Pthreads Programming" by Bradford Nichols, Dick Buttlar &
-Jacqueline Proulx Farrell, ISBN 1-56592-115-1, published by O'Reilly
-& Associates, Inc.
+2. Read the file LICENSE in the source distribution for details. The
+<code>valgrind/*.h</code> headers are distributed under a BSD-style
+license, so you may include them in your code without worrying about
+license conflicts. Some of the PThreads test cases,
+<code>pth_*.c</code>, are taken from "Pthreads Programming" by
+Bradford Nichols, Dick Buttlar & Jacqueline Proulx Farrell, ISBN
+1-56592-115-1, published by O'Reilly & Associates, Inc.

--- valgrind/memcheck/docs/mc_techdocs.html #1.11:1.12
@@ -69,13 +69,12 @@
 code generator for the Glasgow Haskell Compiler
 (http://www.haskell.org/ghc), gaining familiarity with x86 internals
-on the way. I then did Cacheprof (http://www.cacheprof.org), gaining
-further x86 experience. Some time around Feb 2000 I started
-experimenting with a user-space x86 interpreter for x86-Linux. This
-worked, but it was clear that a JIT-based scheme would be necessary to
-give reasonable performance for Valgrind. Design work for the JITter
-started in earnest in Oct 2000, and by early 2001 I had an x86-to-x86
-dynamic translator which could run quite large programs. This
-translator was in a sense pointless, since it did not do any
-instrumentation or checking.
+on the way. I then did Cacheprof, gaining further x86 experience.
+Some time around Feb 2000 I started experimenting with a user-space
+x86 interpreter for x86-Linux. This worked, but it was clear that a
+JIT-based scheme would be necessary to give reasonable performance for
+Valgrind. Design work for the JITter started in earnest in Oct 2000,
+and by early 2001 I had an x86-to-x86 dynamic translator which could
+run quite large programs.
This translator was in a sense pointless,
+since it did not do any instrumentation or checking.
 <p>

--- valgrind/memcheck/docs/mc_main.html #1.15:1.16
@@ -165,5 +165,5 @@
 by 0x40B07FF4: read_png_image__FP8QImageIO (kernel/qpngio.cpp:326)
 by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
- Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
+ Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
 </pre>
@@ -253,5 +253,5 @@
 by 0x402A6E5E: __libc_start_main (libc-start.c:129)
 by 0x80483B1: (within tests/doublefree)
- Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
+ Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
 at 0x4004FFDF: free (vg_clientmalloc.c:577)
 by 0x80484C7: main (tests/doublefree.c:10)
@@ -278,5 +278,5 @@
 by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
 by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
- Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
+ Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
 at 0x4004318C: __builtin_vec_new (vg_clientfuncs.c:152)
 by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
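
For anyone proofreading the client-request text in the coregrind_core.html hunks of this commit, a small C sketch of how a program might use RUNNING_ON_VALGRIND may help. It relies only on what the updated docs state: the macro lives in valgrind/valgrind.h, is 0 on the real CPU, and is non-zero (the number of Valgrind layers) otherwise. Everything else is an assumption made for illustration: the __has_include fallback, the function names valgrind_depth, scaled_iterations and report_iterations, and the divide-by-ten policy are hypothetical and not part of Valgrind or of this commit.

```c
#include <assert.h>
#include <stdio.h>

/* Use the real header when it is installed; otherwise fall back to a stub so
   this sketch still compiles. The fallback is an assumption of this example,
   not something the Valgrind headers provide. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef RUNNING_ON_VALGRIND
#  define RUNNING_ON_VALGRIND 0
#endif

/* Hypothetical helper: number of Valgrind layers we are running under
   (0 on the real CPU, non-zero under Valgrind, per the updated docs). */
unsigned valgrind_depth(void)
{
    return (unsigned)RUNNING_ON_VALGRIND;
}

/* Hypothetical policy: shrink a workload when running under Valgrind.
   The factor of 10 is arbitrary and only for illustration. */
int scaled_iterations(int base, unsigned depth)
{
    int n;
    assert(base > 0);
    if (depth == 0)
        return base;          /* real CPU: run the full workload */
    n = base / 10;
    return n < 1 ? 1 : n;     /* always run at least one iteration */
}

/* Hypothetical usage: print what the program decided to do. */
void report_iterations(int base)
{
    unsigned depth = valgrind_depth();
    printf("valgrind layers: %u, iterations: %d\n",
           depth, scaled_iterations(base, depth));
}
```

Calling report_iterations(1000) from main() would print the detected layer count and the adjusted iteration count. Per the documentation text in this commit, building with -DNVALGRIND should compile the client request out, in which case the sketch would always take the real-CPU path.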