|
From: Giovanni F. <gaf...@gm...> - 2008-02-25 10:43:04
|
Hello all valgrinders, Suppose a C/C++ program like this: char *a = malloc(size_of_a); char *b = malloc(size_of_b); receive_data(a, size_of_a); /* perform operations */ transmit_data(b, size_of_b); I would like to know, for each call to transmit_data, whether its parameter "b" depends on the data "a" of previous receive_data or not. "a" and "b" are either statically or dinamically allocated and datapath between "a" and "b" can be quite complicated. It seems to me that valgrind does something of the genre, any ideas? Thanks, Giovanni |
|
From: Julian S. <js...@ac...> - 2008-02-25 12:37:36
|
On Monday 25 February 2008 11:43, Giovanni Funchal wrote: > Hello all valgrinders, > > Suppose a C/C++ program like this: > > char *a = malloc(size_of_a); > char *b = malloc(size_of_b); > receive_data(a, size_of_a); > /* perform operations */ > transmit_data(b, size_of_b); > > I would like to know, for each call to transmit_data, whether its > parameter "b" depends on the data "a" of previous receive_data or not. > "a" and "b" are either statically or dinamically allocated and > datapath between "a" and "b" can be quite complicated. > > It seems to me that valgrind does something of the genre, any ideas? It certainly provides the infrastructure on which you can build such tools, but I am not aware of any specifically. Others will know more than me on this topic though. For example, Nick Nethercote's experimental "Redux" tool is able to track exactly where every piece of data came from. I believe there may be also experimental tainting tools somewhere. J |
|
From: Nicholas N. <nj...@cs...> - 2008-02-25 21:28:20
|
On Mon, 25 Feb 2008, Julian Seward wrote: >> char *a = malloc(size_of_a); >> char *b = malloc(size_of_b); >> receive_data(a, size_of_a); >> /* perform operations */ >> transmit_data(b, size_of_b); >> >> I would like to know, for each call to transmit_data, whether its >> parameter "b" depends on the data "a" of previous receive_data or not. >> "a" and "b" are either statically or dinamically allocated and >> datapath between "a" and "b" can be quite complicated. >> >> It seems to me that valgrind does something of the genre, any ideas? > > It certainly provides the infrastructure on which you can build > such tools, but I am not aware of any specifically. Others will know > more than me on this topic though. For example, Nick Nethercote's > experimental "Redux" tool is able to track exactly where every piece > of data came from. I believe there may be also experimental tainting > tools somewhere. Yes. Valgrind can definitely be used for this task, but you'd have to write your own tool to do it. (Redux is old and doesn't work any more.) Such tools can be complicated, depending on how accurate you want to be, what granularity you want to track dependences (bit-level, byte-level, word-level?) and whether you want to know only that there is a dependence, or some details of what the dependent data path is (the latter is harder). Nick |
|
From: Giovanni F. <gaf...@gm...> - 2008-02-26 06:17:45
|
Hi, I need only to know whether there is a dependence. Lets say that each call to receive_data is numbered in real time, then, for each call of transmit_data with argument X, what I need is a list of IDs of previous calls to receive_data whose data X depends on. Would this be very complicated? Where to start? Thanks a lot, Giovanni On Mon, Feb 25, 2008 at 10:28 PM, Nicholas Nethercote <nj...@cs...> wrote: > On Mon, 25 Feb 2008, Julian Seward wrote: > > >> char *a = malloc(size_of_a); > >> char *b = malloc(size_of_b); > >> receive_data(a, size_of_a); > >> /* perform operations */ > >> transmit_data(b, size_of_b); > >> > >> I would like to know, for each call to transmit_data, whether its > >> parameter "b" depends on the data "a" of previous receive_data or not. > >> "a" and "b" are either statically or dinamically allocated and > >> datapath between "a" and "b" can be quite complicated. > >> > >> It seems to me that valgrind does something of the genre, any ideas? > > > > It certainly provides the infrastructure on which you can build > > such tools, but I am not aware of any specifically. Others will know > > more than me on this topic though. For example, Nick Nethercote's > > experimental "Redux" tool is able to track exactly where every piece > > of data came from. I believe there may be also experimental tainting > > tools somewhere. > > Yes. Valgrind can definitely be used for this task, but you'd have to write > your own tool to do it. (Redux is old and doesn't work any more.) > > Such tools can be complicated, depending on how accurate you want to be, > what granularity you want to track dependences (bit-level, byte-level, > word-level?) and whether you want to know only that there is a dependence, > or some details of what the dependent data path is (the latter is harder). > > Nick > |
|
From: Nicholas N. <nj...@cs...> - 2008-02-26 21:19:07
|
On Tue, 26 Feb 2008, Giovanni Funchal wrote: > Hi, I need only to know whether there is a dependence. Lets say that > each call to receive_data is numbered in real time, then, for each > call of transmit_data with argument X, what I need is a list of IDs of > previous calls to receive_data whose data X depends on. So you need to know both (a) whether there is a dependence, and (b) what the origin of that dependence is. > Would this be very complicated? Where to start? If you just needed (a), it would be relatively straightforward to modify Memcheck, I think -- it has all the machinery in place for tracking one bit of information per value bit in the program. Tracking (b) as well makes things substantially harder, particular if you want non-terrible performance. If you really want to do this (ie. you're prepared to spend days to weeks on it, not just hours) then I suggest reading these two papers first to get an idea about value tracking: http://www.valgrind.org/docs/memcheck2005.pdf http://www.valgrind.org/docs/origin-tracking2007.pdf (particularly the Valgrind parts, less so the Java parts) These two also have some relevance: http://www.valgrind.org/docs/valgrind2007.pdf http://www.valgrind.org/docs/shadow-memory2007.pdf After that you should have a much better idea of what's involved, and whether you still want to do it. Nick |
|
From: David M. <dm...@gm...> - 2008-02-27 01:15:40
|
On Tue, Feb 26, 2008 at 1:18 PM, Nicholas Nethercote < nj...@cs...> wrote: > On Tue, 26 Feb 2008, Giovanni Funchal wrote: > > > Would this be very complicated? Where to start? > > If you just needed (a), it would be relatively straightforward to modify > Memcheck, I think -- it has all the machinery in place for tracking one > bit > of information per value bit in the program. Tracking (b) as well makes > things substantially harder, particular if you want non-terrible > performance. Will Drewry and Tavis Ormandy's Flayer tool adapts Memcheck's functionality for bit-precise taint tracking, and it's available under the GPL. You'll still want to read all the papers Nicholas mentioned, but this may help by showing how someone else dealt with the problem. The code is here: http://code.google.com/p/flayer/ There's a paper here describing the tool: http://www.usenix.org/event/woot07/tech/full_papers/drewry/drewry_html/ My catchconv tool also has a from-scratch implementation of taint flow, but it isn't optimized for speed. You can see the relevant code in the "isIR.c" and "isIR.h" files in the repository: http://sourceforge.net/cvs/?group_id=187658 and of course I'm happy to answer questions about the tool and this code. Catchconv also outputs symbolic constraints which could in principle be parsed to express tainted bytes as simplified formulas in terms of input variables. It does not currently have an option to map a specific tainted memory location or temporary directly to a simplified expression in terms of inputs - instead the formulas have lots of temporary variables and closely follow the structure of VEX IR. This is because the current use scenario just passes these constraints to a solver which does such simplification as needed, so just translating IR instructions as directly as possible makes sense. Still, it might be a better starting point than trying to do your own dependency tracking from scratch (or not). Performance is terrible, too, although I am working on it. -David Molnar |
|
From: Giovanni F. <gaf...@gm...> - 2008-02-27 07:59:00
|
Thanks a lot for the pointers, I'll be reading the papers right away. Don't hesitate if you have any additional comments. Regards, Giovanni On Wed, Feb 27, 2008 at 2:15 AM, David Molnar <dm...@gm...> wrote: > > > > On Tue, Feb 26, 2008 at 1:18 PM, Nicholas Nethercote > <nj...@cs...> wrote: > > > > > > On Tue, 26 Feb 2008, Giovanni Funchal wrote: > > > > > > > Would this be very complicated? Where to start? > > > > > > If you just needed (a), it would be relatively straightforward to modify > > Memcheck, I think -- it has all the machinery in place for tracking one > bit > > of information per value bit in the program. Tracking (b) as well makes > > things substantially harder, particular if you want non-terrible > > performance. > > Will Drewry and Tavis Ormandy's Flayer tool adapts Memcheck's functionality > for bit-precise taint tracking, and it's available under the GPL. You'll > still want to read all the papers Nicholas mentioned, but this may help by > showing how someone else dealt with the problem. The code is here: > http://code.google.com/p/flayer/ > > There's a paper here describing the tool: > http://www.usenix.org/event/woot07/tech/full_papers/drewry/drewry_html/ > > My catchconv tool also has a from-scratch implementation of taint flow, but > it isn't optimized for speed. You can see the relevant code in the "isIR.c" > and "isIR.h" files in the repository: > http://sourceforge.net/cvs/?group_id=187658 > and of course I'm happy to answer questions about the tool and this code. > > Catchconv also outputs symbolic constraints which could in principle be > parsed to express tainted bytes as simplified formulas in terms of input > variables. It does not currently have an option to map a specific tainted > memory location or temporary directly to a simplified expression in terms of > inputs - instead the formulas have lots of temporary variables and closely > follow the structure of VEX IR. This is because the current use scenario > just passes these constraints to a solver which does such simplification as > needed, so just translating IR instructions as directly as possible makes > sense. Still, it might be a better starting point than trying to do your own > dependency tracking from scratch (or not). Performance is terrible, too, > although I am working on it. > > -David Molnar > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Valgrind-users mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-users > > |