|
From: Alfonso A. <alf...@gm...> - 2012-12-12 19:10:09
|
Hi,

I am trying to run Valgrind on a Linux program which sets its own hooks for certain libc functions (malloc, calloc and free among them).

To that effect, instead of using LD_PRELOAD, and for reasons I believe are not really relevant to the discussion, the first bytes of the code of these functions are overwritten with a jump to a hijacking function.

It seems that Valgrind doesn't like this kind of run-time modification, since it crashes on the first call to the hijacked calloc, whereas the program runs just fine without Valgrind. My guess is that Valgrind associates these symbols with its own hooks at load time and that the subsequent function overwriting breaks Valgrind in some way. Could you confirm that? Any pointers on how to investigate this further?

This hooking mechanism has been tested thoroughly on different architectures. Additionally, the program runs just fine under Valgrind when the hijacking is disabled. Here is the output when running it under Valgrind (i686/Linux) with hijacking enabled:

valgrind ./main
==29484== Memcheck, a memory error detector
==29484== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==29484== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==29484== Command: ./main
==29484==
==29484== Invalid read of size 4
==29484==    at 0x4974D1C: calloc (malloc.c:3260)
==29484==    by 0x6BAD6B1: __vf_init_thread (in /home1a/alfonso/vfroot/vfmerge/install/x86_64-linux/vfexec/i686-linux/lib/libvfrun.so.1.0.1)
==29484==    by 0x6BADA81: __vf_run_vfs (in /home1a/alfonso/vfroot/vfmerge/install/x86_64-linux/vfexec/i686-linux/lib/libvfrun.so.1.0.1)
==29484==    by 0x8048DD1: ??? (in /home1a/alfonso/vfmerge/service-tst/convolution/src/main)
==29484==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==29484==
==29484==
==29484== Process terminating with default action of signal 11 (SIGSEGV)
==29484==  Access not within mapped region at address 0x4
==29484==    at 0x4974D1C: calloc (malloc.c:3260)
==29484==    by 0x6BAD6B1: __vf_init_thread (in /home1a/alfonso/vfroot/vfmerge/install/x86_64-linux/vfexec/i686-linux/lib/libvfrun.so.1.0.1)
==29484==    by 0x6BADA81: __vf_run_vfs (in /home1a/alfonso/vfroot/vfmerge/install/x86_64-linux/vfexec/i686-linux/lib/libvfrun.so.1.0.1)
==29484==    by 0x8048DD1: ??? (in /home1a/alfonso/vfmerge/service-tst/convolution/src/main)
==29484==  If you believe this happened as a result of a stack
==29484==  overflow in your program's main thread (unlikely but
==29484==  possible), you can try to increase the size of the
==29484==  main thread stack using the --main-stacksize= flag.
==29484==  The main thread stack size used in this run was 8388608.
==29484==
==29484== HEAP SUMMARY:
==29484==     in use at exit: 40,801 bytes in 403 blocks
==29484==   total heap usage: 419 allocs, 16 frees, 43,121 bytes allocated
==29484==
==29484== LEAK SUMMARY:
==29484==    definitely lost: 362 bytes in 5 blocks
==29484==    indirectly lost: 4,876 bytes in 2 blocks
==29484==      possibly lost: 0 bytes in 0 blocks
==29484==    still reachable: 35,563 bytes in 396 blocks
==29484==         suppressed: 0 bytes in 0 blocks
==29484== Rerun with --leak-check=full to see details of leaked memory
==29484==
==29484== For counts of detected and suppressed errors, rerun with: -v
==29484== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

Thanks,

Alfonso Acosta |
|
From: Tom H. <to...@co...> - 2012-12-12 19:45:13
|
On 12/12/12 19:09, Alfonso Acosta wrote:

> It seems that Valgrind doesn't like this kind of run-time modifications
> since it crashes on the first call to the hijacked calloc whereas the
> program runs just fine when not involving Valgrind.

You need to use --smc-check=all if your code is self modifying.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
|
From: Alfonso A. <alf...@gm...> - 2012-12-12 20:13:11
|
On Wed, Dec 12, 2012 at 8:44 PM, Tom Hughes <to...@co...> wrote:
>
> You need to use --smc-check=all if your code is self modifying.

Thanks for the tip. Unfortunately, I still get the same error. |
|
From: John R. <jr...@bi...> - 2012-12-12 20:02:42
|
> I am trying to run Valgrind on a Linux program which sets its own hooks for certain libc functions (malloc, calloc and free among them).
>
> To that effect, instead of using LD_PRELOAD, and for reasons I believe are not really relevant for the discussion, the first bytes of the code from these function are overwritten with a jump to a hijacking function.

[snip]

> This hooking mechanism has been tested thoroughly in different architectures. Additionally, the program runs just fine under Valgrind when disabling it. Here is the output when running it under Valgrind (i686/Linux) with hijacking enabled:

Read and understand valgrind's code for re-direction. Apply some low-level debugger such as gdb (or perhaps valgrind's internal vgdb) to see what actually happens.

Or, use LD_PRELOAD, which is the "blessed" mechanism for hooking. It works! and it takes only a few hours to try. See the thread in [valgrind-users] Subject: __malloc_hook by Amir Szekely on 2012-10-19, my response on Oct.22, and Amir's confirmation of success on Oct.23 (which includes his actual code.)

-- |
|
From: Alfonso A. <alf...@gm...> - 2012-12-12 21:48:14
|
On Wed, Dec 12, 2012 at 9:03 PM, John Reiser <jr...@bi...> wrote:
>
> Read and understand valgrind's code for re-direction. Apply some low-level debugger
> such as gdb (or perhaps valgrind's internal vgdb) to see what actually happens.

Thanks for the vgdb tip! By using vgdb together with --trace-redir I managed to further diagnose the problem. I think I already know what's happening but I am not sure how to solve it (I still need to dive into valgrind's sources though).

I will illustrate the situation with a simplified example. Let this be a function to hijack, calloc for instance:

calloc:
    instruction1
    instruction2
    instruction3
    ...

In order to hijack it, we patch the first instruction so that it jumps to our own hijacker:

calloc:
    jmp calloc_hijacker
    instruction2
    instruction3
    ...

calloc_hijacker:
    ....

But, since we want to still call the original version of calloc, we allocate a buffer to save the first instruction (orig_calloc):

orig_calloc:
    instruction1
    jmp calloc+1

I believe that the problem is that valgrind has a redirection for calloc, that is:

calloc -> _vgr10070ZU_libcZdsoZa_calloc

But after "moving" the start of calloc to orig_calloc, we want:

orig_calloc -> _vgr10070ZU_libcZdsoZa_calloc

Is there a way to reassign redirections?

>
> Or, use LD_PRELOAD, which is the "blessed" mechanism for hooking. It works!
> and it takes only a few hours to try.
> See the thread in [valgrind-users] Subject: __malloc_hook by Amir Szekely on 2012-10-19,
> my response on Oct.22, and Amir's confirmation of success on Oct.23 (which includes
> his actual code.)

We initially implemented hijacking using LD_PRELOAD, exactly as described in the thread you are pointing to. Later I resorted to this approach for a few reasons that, as I mentioned, I believe not to be worth discussing. |
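For readers following along, the "patch the first instruction with a jump" step can be sketched as below. The thread does not say how the jump is actually encoded, so this assumes the common x86 5-byte near jump (opcode 0xE9 followed by a 32-bit displacement relative to the end of the instruction). The helper name `encode_jmp_rel32` is hypothetical, and the function only builds the patch bytes in a buffer; it does not write them over any code.

```c
#include <assert.h>
#include <stdint.h>

enum { JMP_REL32_LEN = 5 };  /* 1 opcode byte + 4 displacement bytes */

/* Hypothetical helper: build the bytes of `jmp to` as they would be
 * encoded if the instruction were placed at address `from`. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uintptr_t from, uintptr_t to)
{
    /* The displacement is measured from the end of the jmp instruction. */
    uintptr_t next = from + JMP_REL32_LEN;
    uint32_t rel = (uint32_t)(to - next);

    /* The target must be reachable with a sign-extended 32-bit displacement. */
    assert(next + (uintptr_t)(intptr_t)(int32_t)rel == to);

    out[0] = 0xE9;  /* x86 near jump, rel32 form */
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(rel >> (8 * i));  /* little-endian displacement */
}
```

Because the patch overwrites 5 bytes, a real trampoline such as orig_calloc would have to save every instruction that overlaps those bytes, not just the first one, before jumping back into the original function.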
|
From: John R. <jr...@bi...> - 2012-12-12 23:02:36
|
> But, since we want to still call the original version of calloc, we
> allocate a buffer to save the first instruction (orig_calloc):
>
> orig_calloc:
> instruction1
> jmp calloc+1
>
> I believe that the problem is that valgrind has a redirection for
> calloc, that is:
>
> calloc -> _vgr10070ZU_libcZdsoZa_calloc
>
> But after "moving" the start of calloc to orig_calloc, we want:
>
> orig_calloc -> _vgr10070ZU_libcZdsoZa_calloc

You control orig_calloc, so what is stopping you?

>
> Is there a way to reassign redirections?

The routine calloc_hijacker could check whether the instruction layout remains the same as the first time, deduce that valgrind is active, and re-arrange the code further. (Remember to sync the Icache; on x86 any backwards branch suffices, but on other architectures a system call is necessary.)

-- |
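The Icache remark above can be illustrated with a small sketch. It assumes a GCC or Clang toolchain, whose `__builtin___clear_cache` builtin performs whatever instruction-cache synchronisation the target needs (and does nothing where none is required). The helper name `patch_code` is hypothetical, and making the destination writable (for example with mprotect) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `n` patch bytes over `dst`, then make sure the
 * instruction cache sees the new bytes. The caller is assumed to have made
 * `dst` writable beforehand. */
void patch_code(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    /* GCC/Clang builtin: flush the instruction cache for [dst, dst + n). */
    __builtin___clear_cache((char *)dst, (char *)dst + n);
}
```

This only covers the hardware side; how Valgrind notices modified code is governed separately by its --smc-check option mentioned earlier in the thread.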
|
From: Alfonso A. <alf...@gm...> - 2012-12-12 23:44:48
|
On Thu, Dec 13, 2012 at 12:03 AM, John Reiser <jr...@bi...> wrote:
>> calloc -> _vgr10070ZU_libcZdsoZa_calloc
>>
>> But after "moving" the start of calloc to orig_calloc, we want:
>>
>> orig_calloc -> _vgr10070ZU_libcZdsoZa_calloc
>
> You control orig_calloc, so what is stopping you?

By reassigning redirections I really meant replacing them with another one. True, leaving the resolution of _vgr10070ZU_libcZdsoZa_calloc aside, I could indeed add that redirection myself. However, I don't think that cuts it:

As I understand it, even with that extra redirection, the initial "calloc -> _vgr10070ZU_libcZdsoZa_calloc" redirection would still be active, breaking my hijacking mechanism, which would render my code unusable. Remember that I expect calloc_hijacker to be called instead of calloc. With the initial redirection still in place, _vgr10070ZU_libcZdsoZa_calloc would be called instead.

> The routine calloc_hijacker could check whether the instruction layout
> remains the same as the first time, deduce that valgrind is active,
> and re-arrange the code further. (Remember to sync the Icache;
> on x86 any backwards branch suffices, but on other architectures
> a system call is necessary.)

I might be missing something, but I don't see how calloc_hijacker would be called at all if "calloc -> _vgr10070ZU_libcZdsoZa_calloc" is still in place.

After skimming the Valgrind sources I see that some functions are replaced rather than wrapped (with no possibility to access the original function), and calloc seems to be one of them. Even if calloc were wrapped, it would be up to the wrapper to call the original function. |
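As a reading aid for the redirection name: in Valgrind's naming scheme the soname part of `_vgr10070ZU_libcZdsoZa_calloc` is "Z-encoded", so `libcZdsoZa` stands for the pattern `libc.so*`. The sketch below decodes only a subset of the escapes (Za, Zd, Zu, Zh, Zp, ZZ). The function name `z_decode` is hypothetical and this is not Valgrind's own decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode a subset of Valgrind's Z-encoding into `out`.
 * Returns 0 on success, -1 on an unknown escape, truncated input,
 * or if `out` is too small. */
int z_decode(const char *in, char *out, size_t out_size)
{
    size_t o = 0;

    assert(in != NULL && out != NULL && out_size > 0);
    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        if (c == 'Z') {
            switch (in[++i]) {
            case 'a': c = '*'; break;
            case 'd': c = '.'; break;
            case 'u': c = '_'; break;
            case 'h': c = '-'; break;
            case 'p': c = '+'; break;
            case 'Z': c = 'Z'; break;
            default:  return -1;  /* unknown escape, or 'Z' at end of input */
            }
        }
        if (o + 1 >= out_size)
            return -1;            /* keep room for the terminating NUL */
        out[o++] = c;
    }
    out[o] = '\0';
    return 0;
}
```

Decoding `libcZdsoZa` this way gives `libc.so*`, which is why the redirection applies to calloc in any shared object whose soname matches that pattern.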
|
From: Alfonso A. <alf...@gm...> - 2012-12-13 17:06:02
|
OK, I found a solution.

If executing under valgrind, simply hijack _vgr10070ZU_libcZdsoZa_calloc directly instead of calloc. It works like a charm.

Thanks a lot for your help, really :) |
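One way the final approach might look in code is sketched below: pick the symbol to hijack depending on whether the process appears to run under Memcheck. The detection here is only a heuristic and an assumption of this sketch: it relies on Valgrind injecting its `vgpreload_memcheck` shared object through LD_PRELOAD. The documented way to detect Valgrind is the `RUNNING_ON_VALGRIND` client request from `valgrind/valgrind.h`. The function name is hypothetical, and the replacement symbol name is the one reported in this thread for Valgrind 3.7.0, so it may differ in other versions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given the value of LD_PRELOAD (may be NULL), return
 * the name of the symbol whose first bytes should be patched. */
const char *choose_calloc_hook_symbol(const char *ld_preload)
{
    /* Heuristic (assumption): Memcheck's preload library is listed in
     * LD_PRELOAD when the program runs under Valgrind. */
    if (ld_preload != NULL && strstr(ld_preload, "vgpreload_memcheck") != NULL)
        return "_vgr10070ZU_libcZdsoZa_calloc";  /* name seen in this thread */
    return "calloc";
}
```

A caller would typically pass `getenv("LD_PRELOAD")` and then resolve the returned name to an address before patching it.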