|
From: John R.
|
Patches have been developed which enable UserModeLinux for i686 to
run under the memcheck tool of Valgrind on i686. Thus it is possible
to check dynamically the memory accesses made by a running Linux kernel
against memcheck's model of allowed behavior. This work was supported
by Google Inc.

The combined patches are at "alpha" quality. They have memchecked
an entire trivial session (boot UML, login, halt), and have identified
a couple of specific problems in kernel code. The steps necessary to reach
"beta" quality (a motivated kernel developer can get useful results)
have been outlined and are being pursued.

The goods:
 15KB http://bitwagon.com/valgrind+uml/valgrind-3.3.0-2007-12-27.patch.gz
 60KB http://bitwagon.com/valgrind+uml/uml-2.6.22.5-2007-12-27.patch.gz

Current status and updates will be maintained on the [coming] web page
http://bitwagon.com/valgrind+uml/index.html

As a convenience for when the official sites are not responding,
here are copies of the original unpatched software that is required:
  4MB http://bitwagon.com/valgrind+uml/valgrind-3.3.0.tar.bz2
 45MB http://bitwagon.com/valgrind+uml/linux-2.6.22.5.tar.bz2
103MB http://bitwagon.com/valgrind+uml/FedoraCore5-x86-root_fs.bz2
Approximately 2.5GB of disk space is required to play.

Motivated in part by the difficulty of tracking down the causes of
"Conditional jump or move depends on uninitialised data", the patches
include a new *optional* mode for memcheck: --complain-asap=yes.
In this mode, memcheck issues a complaint immediately for any load
from memory that contains uninitialized bits. This gives a very early
notice of potential trouble. It also squawks for uninitialized
holes in structures or bitfields, conditions which later become ignored
or "don't care", certain compiler optimizations for speed, etc.
The intent is to reduce the blizzard of "false positive" complaints
by using the glibc-audit patches to provide a "quiet" glibc,
by making the holes in kernel structures explicit (and filling them),
by writing suppressions for known cases, and by further enhancing
this new mode of memcheck.

On the UML side, there is a significant technical issue: the semantics
of kmalloc+kfree do not match the semantics of malloc+free. The kernel
slab allocator caches and re-issues identified objects, which accumulate
state and retain it throughout execution, including from kfree to kmalloc.
In contrast, a region that is passed to free() loses both its contents
and its identity. Also, size is an important parameter to malloc,
but is implicit to kmalloc. The initial patches finesse these issues
(for instance: by supplying the size as trailing parameter to kmalloc,
and by noticing that SLAB_POISON ==> free()), but there will be
significant discussion and work in resolving the differences.

--
John Reiser, jreiser@BitWagon.com
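[Editorial illustration, not part of the original message.] The remark about "making the holes in kernel structures explicit (and filling them)" can be sketched in C as below. The struct names, field names, and helper functions are hypothetical and are not taken from the kernel or from the patches. The explicit layout assumes that int is aligned to its own size, which is typical on i686 but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with an implicit hole: on typical ABIs the compiler
 * inserts padding after 'tag' so that 'value' is aligned.  Field-by-field
 * initialisation never writes those padding bytes. */
struct record_implicit {
    char tag;
    int  value;
};

/* Same fields with the hole made explicit as a named member, so that it
 * can be filled.  Assumes alignof(int) == sizeof(int). */
struct record_explicit {
    char tag;
    char pad[sizeof(int) - 1];
    int  value;
};

/* Number of padding bytes between 'tag' and 'value' in the implicit layout. */
size_t implicit_hole_bytes(void)
{
    return offsetof(struct record_implicit, value) - sizeof(char);
}

/* Initialise every byte that a later whole-structure load could touch. */
void record_explicit_init(struct record_explicit *r, char tag, int value)
{
    assert(r != NULL);
    r->tag = tag;
    memset(r->pad, 0, sizeof r->pad);   /* fill the former hole */
    r->value = value;
}
```

With the implicit layout, copying the whole structure (for example with memcpy or a structure assignment) loads the padding bytes, which is the kind of load the message says --complain-asap=yes reports immediately. With the explicit layout the same copy reads only bytes that were written.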
|
From: Michael A. <Mic...@fs...> - 2007-12-27 19:55:38
|
John Reiser wrote
Hi John,
> Patches have been developed which enable UserModeLinux for i686 to
> run under the memcheck tool of Valgrind on i686. Thus it is possible
> to check dynamically the memory accesses made by a running Linux kernel
> against memcheck's model of allowed behavior. This work was supported
> by Google Inc.
>
> The combined patches are at "alpha" quality.
I applied your patch against a vanilla 3.3.0 and with both gcc 4.2 and a
recent gcc 4.3 snapshot I get the following compilation failure on Linux
x86-64 (Centos 5)
if gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../coregrind -I..
-I../coregrind/amd64 -I../coregrind/linux -I../coregrind/amd64-linux
-I../include -I../VEX/pub -DVG_PLATFORM="\"amd64-linux\"" -DVGA_amd64=1
-DVGO_linux=1 -DVGP_amd64_linux=1
-DVG_LIBDIR="\"/usr/local/valgrind-3.3.0-asap/lib/valgrind"\" -m64
-fomit-frame-pointer -O2 -g -Wmissing-prototypes -Wall -Wshadow
-Wpointer-arith -Wstrict-prototypes -Wmissing-declarations
-fno-strict-aliasing -Wno-long-long -Wno-pointer-sign
-Wdeclaration-after-statement -fno-stack-protector -MT
libcoregrind_amd64_linux_a-syswrap-linux.o -MD -MP -MF
".deps/libcoregrind_amd64_linux_a-syswrap-linux.Tpo" -c -o
libcoregrind_amd64_linux_a-syswrap-linux.o `test -f
'm_syswrap/syswrap-linux.c' || echo './'`m_syswrap/syswrap-linux.c; \
then mv -f ".deps/libcoregrind_amd64_linux_a-syswrap-linux.Tpo"
".deps/libcoregrind_amd64_linux_a-syswrap-linux.Po"; else rm -f
".deps/libcoregrind_amd64_linux_a-syswrap-linux.Tpo"; exit 1; fi
m_syswrap/syswrap-linux.c: In function ‘vgModuleLocal_do_fork_clone’:
m_syswrap/syswrap-linux.c:338: error: ‘VexGuestArchState’ has no member
named ‘guest_EAX’
m_syswrap/syswrap-linux.c:340: error: ‘VexGuestArchState’ has no member
named ‘guest_ESP’
m_syswrap/syswrap-linux.c:349: warning: implicit declaration of function
‘letgo_vex_x86_linux’
make[3]: *** [libcoregrind_amd64_linux_a-syswrap-linux.o] Error 1
make[3]: Leaving directory `/tmp/Work/valgrind-3.3.0/coregrind'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/tmp/Work/valgrind-3.3.0/coregrind'
> They have memchecked
> an entire trivial session (boot UML, login, halt), and have identified
> a couple specific problems in kernel code. The steps necessary to reach
> "beta" quality (a motivated kernel developer can get useful results)
> have been outlined and are being pursued.
>
> The goods:
> 15KB http://bitwagon.com/valgrind+uml/valgrind-3.3.0-2007-12-27.patch.gz
> 60KB http://bitwagon.com/valgrind+uml/uml-2.6.22.5-2007-12-27.patch.gz
>
> Current status and updates will be maintained on the [coming] web page
> http://bitwagon.com/valgrind+uml/index.html
>
> As a convenience for when the official sites are not responding,
> here are copies of the original unpatched software that is required:
> 4MB http://bitwagon.com/valgrind+uml/valgrind-3.3.0.tar.bz2
> 45MB http://bitwagon.com/valgrind+uml/linux-2.6.22.5.tar.bz2
> 103MB http://bitwagon.com/valgrind+uml/FedoraCore5-x86-root_fs.bz2
> Approximately 2.5GB of disk space is required to play.
>
> Motivated in part by the difficulty of tracking down the causes of
> "Conditional jump or move depends on uninitialised data", the patches
> include a new *optional* mode for memcheck: --complain-asap=yes.
This is a very, very cool feature. Any way you can split that patch out
from the patchset?
> In this mode, memcheck issues a complaint immediately for any load
> from memory that contains uninitialized bits. This gives a very early
> notice of potential trouble. It also squawks for uninitialized
> holes in structures or bitfields, conditions which later become ignored
> or "don't care", certain compiler optimizations for speed, etc.
> The intent is to reduce the blizzard of "false positive" complaints
> by using the glibc-audit patches to provide a "quiet" glibc,
> by making the holes in kernel structures explicit (and filling them),
> by writing suppressions for known cases, by and further enhancing
> this new mode of memcheck.
>
> On the UML side, there is a significant technical issue: the semantics
> of kmalloc+kfree do not match the semantics of malloc+free. The kernel
> slab allocator caches and re-issues identified objects, which accumulate
> state and retain it throughout execution, including from kfree to kmalloc.
> In contrast, a region that is passed to free() loses both its contents
> and its identity. Also, size is an important parameter to malloc,
> but is implicit to kmalloc. The initial patches finesse these issues
> (for instance: by supplying the size as trailing parameter to kmalloc,
> and by noticing that SLAB_POISON ==> free()), but there will be
> significant discussion and work in resolving the differences.
>
> --
> John Reiser, jreiser@BitWagon.com
Cheers,
Michael
|
|
From: John R.
|
Hi, Michael Abshoff,

> I applied your patch against a vanilla 3.3.0 and with both gcc 4.2 and a
> recent gcc 4.3 snapshot I get the following compilation failure on Linux
> x86-64 (Centos 5)

> m_syswrap/syswrap-linux.c: In function ‘vgModuleLocal_do_fork_clone’:
> m_syswrap/syswrap-linux.c:338: error: ‘VexGuestArchState’ has no member
> named ‘guest_EAX’

So far, compiling has used gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)
on i686 only. I will try the gcc on Fedora 8, and work towards separating
platform-dependent code better. In the meantime, please use i686 only.
[Perhaps -m32 will work on x86_64 ?]

>>Motivated in part by the difficulty of tracking down the causes of
>>"Conditional jump or move depends on uninitialised data", the patches
>>include a new *optional* mode for memcheck: --complain-asap=yes.

> This is a very, very cool feature. Any way you can split that patch out
> from the patchset?

Yes, producing a separate patch for this feature is on the list of tasks
to do. In the meantime, nearly all of this feature is in just one file
memcheck/mc_main.c. Perhaps if you apply the patch for just that one file,
and make minimal adjustments as needed, then you may be able to use
the feature separately.

--
John Reiser, jreiser@BitWagon.com
|
From: Tom H. <to...@co...> - 2007-12-29 16:12:38
|
On 27/12/2007, John Reiser <jr...@bi...> wrote:

> So far, compiling has used gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27)
> on i686 only. I will try the gcc on Fedora 8, and work towards separating
> platform-dependent code better. In the meantime, please use i686 only.
> [Perhaps -m32 will work on x86_64 ?]

Configuring with --enable-only32bit should make valgrind only build
32 bit support.

Tom

--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
From: John R.
|
The web page http://bitwagon.com/valgrind+uml/index.html
now exists with news, history, commentary, scripts, links
on getting User Mode Linux for i686 to run under memcheck on i686.

A standalone patch http://bitwagon.com/valgrind+uml/mc_main-asap.patch
to valgrind-3.3.0 implements "--complain-asap=yes" independently
of the other changes.

See http://bitwagon.com/glibc-audit/glibc-audit.html about getting
a "quiet" glibc.

--
John Reiser, jreiser@BitWagon.com
|
From: John R.
|
Jeff Dike responded:

John Reiser wrote:
>>The web page http://bitwagon.com/valgrind+uml/index.html
>>now exists with news, history, commentary, scripts, links

> You refer to kmalloc and kfree as keeping object state intact between
> kfree and kmalloc, thus not being semantically the same as malloc and
> free. However, this is true of kmem_cache_alloc and kmem_cache_free,
> not kmalloc and kfree.
>
> Object contents are destroyed between kfree and kmalloc.

It looks more complicated to me.
When include/linux/slab.h is in view, then I trace (where indentation
indicates logical call [inline, or physical call]):

kmalloc                        include/linux/slab.h
  __kmalloc                    mm/slab.c
    __do_kmalloc               mm/slab.c
      __cache_alloc            mm/slab.c
        __do_cache_alloc       mm/slab.c
          ____cache_alloc      mm/slab.c
            cpu_cache_get      mm/slab.c

When include/linux/slab_def.h is in view, then I trace:
For a size that is constant at compile time [very often the case],
then I trace:

kmalloc                        include/linux/slab_def.h
  kmem_cache_alloc             mm/slab.c
    __cache_alloc              mm/slab.c
      <<and continues as in first case above>>

If the size is non-constant at compile time, then I trace:

kmalloc                        include/linux/slab_def.h
  __kmalloc                    mm/slab.c
    <<and continues as in first case above>>

My tracking of kfree is much simpler:

kfree                          mm/slab.c
  __cache_free                 mm/slab.c

The net effect is that the slab allocator always calls __cache_alloc,
thus re-using a kfree()d object and its accumulated contents.
The slab allocator is by far the most used case. slub and slob have not
yet been used during my explorations, as far as I can tell.

> As far as kmem_cache_alloc and kmem_cache_free are concerned, would it
> work to say that they are like malloc and free, except that if there's
> a constructor, it is always called before kmem_cache_alloc returns?

Under DEBUG, then cache_init_objs() in mm/slab.c does not call the ctor
if SLAB_POISON. Under no-DEBUG, then cache_init_objs() always calls
the ctor. So here is an exception to the proposed rule, and quite
different *semantics* for DEBUG versus no-DEBUG. Also, cache_init_objs
is not the allocator, but only the initial condition.

Under DEBUG, then cache_alloc_debugcheck_after() in mm/slab.c calls
the ctor if SLAB_POISON. Under no-DEBUG, then
cache_alloc_debugcheck_after() is a no-op. Again, a difference
in *semantics* between DEBUG and no-DEBUG.

Altogether: the slab ctor is called once per object; except if DEBUG
and SLAB_POISON, when kfree() is considered to destroy the old object
and kmalloc is considered to create a new object (yet the two objects
occupy identical address space.) In the case of no-DEBUG slab, then
the ctor is never called as a result of calling kmalloc. So
kmalloc+kfree is distinctly different in semantics from malloc+free,
at least as implemented by slab, the most common case.

Now, in my patches, I CHANGED the semantics of slab __cache_alloc
so that it calls the ctor always, ignoring both no-DEBUG and SLAB_POISON.
If there is a ctor, then this makes kmalloc+kfree closer to malloc+free.
But if there is no ctor, then things are murky again. Should the result
of kmalloc() be marked as if it were returned from malloc() or not?
[I.e., are the contents undefined or defined?] At least some clients
act that way (under DEBUG and SLAB_POISON), and in general it would be
simpler for memcheck if it were so. The patch acts as if it *is* so:

    VALGRIND_MALLOCLIKE_BLOCK(objp, size, 0, 0);

In particular, the contents are considered to be undefined.
[Then the ctor, if it exists, defines some or all.]

Obviously this is open for corrections, debate, and better understanding!
I had difficulty understanding what I observed, and still I search ...
The existing patches are the result of trial-and-error to find something
which could survive for a complete session of boot+login+halt while
respecting at least some intent of memcheck. But probably I missed
an important point or two. In particular, I'm confused by allocations
which return a "page": either "struct page *", or a "char *" with
an aligned group of char of size PAGE_SIZE in the virtual address space.
Help?

--
John Reiser, jreiser@BitWagon.com
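[Editorial illustration, not part of the original message.] The difference in semantics discussed above can be sketched with a toy user-space object cache. This is a simplified model under stated assumptions, not the kernel's slab code: every name here (toy_cache, toy_cache_alloc, toy_zero_int_ctor, and so on) is hypothetical. A freed object is kept and re-issued with its old contents, and a flag selects between "ctor once per object" and "ctor on every allocation", the latter modelling the change the message describes for the patched __cache_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CACHE_MAX 16

typedef void (*toy_ctor_fn)(void *obj);

struct toy_cache {
    size_t      obj_size;             /* size belongs to the cache, not to each call */
    toy_ctor_fn ctor;                 /* may be NULL */
    int         ctor_on_every_alloc;  /* 0: once per object, 1: on every allocation */
    void       *free_objs[TOY_CACHE_MAX];  /* freed objects kept for re-issue */
    size_t      nfree;
};

void toy_cache_init(struct toy_cache *c, size_t obj_size,
                    toy_ctor_fn ctor, int ctor_on_every_alloc)
{
    assert(c != NULL && obj_size > 0);
    memset(c, 0, sizeof *c);
    c->obj_size = obj_size;
    c->ctor = ctor;
    c->ctor_on_every_alloc = ctor_on_every_alloc;
}

void *toy_cache_alloc(struct toy_cache *c)
{
    void *obj;
    int fresh = 0;

    if (c->nfree > 0) {
        obj = c->free_objs[--c->nfree];  /* re-issue: old contents are intact */
    } else {
        obj = malloc(c->obj_size);
        if (obj == NULL)
            return NULL;
        fresh = 1;
    }
    /* A memcheck-aware allocator would mark the object here, as in the
     * message: VALGRIND_MALLOCLIKE_BLOCK(objp, size, 0, 0); */
    if (c->ctor != NULL && (fresh || c->ctor_on_every_alloc))
        c->ctor(obj);
    return obj;
}

void toy_cache_free(struct toy_cache *c, void *obj)
{
    if (c->nfree < TOY_CACHE_MAX)
        c->free_objs[c->nfree++] = obj;  /* keep the object and its contents */
    else
        free(obj);
}

void toy_cache_destroy(struct toy_cache *c)
{
    while (c->nfree > 0)
        free(c->free_objs[--c->nfree]);
}

/* Example ctor for objects that are a single int. */
void toy_zero_int_ctor(void *obj)
{
    *(int *)obj = 0;
}
```

In this model, with ctor_on_every_alloc set to 0, a value stored before toy_cache_free is still visible after the next toy_cache_alloc, which is what malloc+free never promises. With the flag set to 1 and a ctor present, each allocation starts from the constructed state, which is closer to malloc+free. With no ctor the stale contents come back either way, which matches the "murky" case in the message.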