From: Grant E. <gra...@gm...> - 2011-02-17 17:29:42
Attachments:
pathtrace.patch
|
I've re-written the path display/trace features. Attached is a snapshot of the diffs against the current git HEAD. Changes since the first version I posted follow: * Removed the code that tracked fd-table state based on syscalls. It now does a readlink() on /proc/<pid>/fd/<fd> to find the path associated with a file descriptor. * Change the -P option so that instead of accepting a colon-separated list of paths, it accepts a single path. Multiple -P options can be specified to trace multiple paths. * Change the way that file descriptors are printed by the display functions. Instead of using "%s" and a function that returns a formatted string, they now use a printfd() function analogous to the printpath() function. * Added handling (for Linux) for system calls where we need to look at something other than arg[0] for a descriptor/path. Feedback is welcome... -- Grant |
From: Grant E. <gra...@gm...> - 2011-02-23 20:17:01
Attachments:
pathtrace.patch
|
Hi, Attached are my current diffs. I've merged recent changes from the main git repository and cleaned up loose ends. The general approach is: 1) Store -P arguments both as-is and canonicalized. 2) Compare stored paths against /proc/$pid/fd/$fd for fd arguments. 3) Compare stored paths against path arguments. Things on the possible (near-term) todo list: 1) Canonicalize relative path args based on /proc/$pid/cwd 2) Handle combination fd/path args such as those passed to openat() 3) Handle non-simple fd args like fd_set args passed to select() or the array of pollfd structs passed to poll(). Things on the possible (far-term) todo list: 1) Figure out a way at the "exiting" stage to handle matching descriptors that are return values from syscalls that have been filtered (not printed) at the "entering" stage. 2) Add argument type info to the sys_ent table so that generic code can be used to check for matching fd/path arguments. |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 21:00:26
|
Hi, On Wed, Feb 23, 2011 at 02:16:43PM -0600, Grant Edwards wrote: > Attached are my current diffs. I've merged recent changes from the > main git repository and cleaned up loose ends. > > The general approach is: > > 1) Store -P arguments both as-is and canonicalized. > > 2) Compare stored paths against /proc/$pid/fd/$fd for fd arguments. > > 3) Compare stored paths against path arguments. Thanks, it works. I think it could be merged after some cleanup and re-indentation. > Things on the possible (near-term) todo list: > > 1) Canonicalize relative path args based on /proc/$pid/cwd > > 2) Handle combination fd/path args such as those passed to openat() Let's try simple combination without path canonicalization in both cases. I'm still not quite comfortable with applying realpath(3) to user data, from security PoV. Maybe this kind of canonicalization should be enabled only when requested explicitly, e.g. via special option. > 3) Handle non-simple fd args like fd_set args passed to select() or > the array of pollfd structs passed to poll(). Agreed. Now a few comments on the patch itself. > diff --git a/.gitignore b/.gitignore > index 9f17271..7828bad 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -4,6 +4,8 @@ > .libs > .*.swp > > +.dir-locals.el > + > ChangeLog > > Makefile This has nothing to do with new path features. Why this change is needed? > @@ -449,6 +450,7 @@ struct tcb { > #define syserror(tcp) ((tcp)->u_error != 0) > #define verbose(tcp) (qual_flags[(tcp)->scno] & QUAL_VERBOSE) > #define abbrev(tcp) (qual_flags[(tcp)->scno] & QUAL_ABBREV) > +#define filtered(tcp) ((tcp)->flags & TCB_FILTERED) The indentation of the line being added differs from neighbouring lines. > @@ -702,3 +704,11 @@ extern long ia32; > #endif > > extern int not_failing_only; > +extern int show_fd_path; > +extern int tracing_paths; > + > +extern int pathtrace_select(char *path); > +extern int pathtrace_match(struct tcb *tcp); > +extern void printfd(struct tcb* tcp, int fd); > + > + No empty lines at the end, please. > diff --git a/desc.c b/desc.c > index 2b9f30a..ebbcf57 100644 > --- a/desc.c > +++ b/desc.c > @@ -309,7 +309,8 @@ int > sys_fcntl(struct tcb *tcp) > { > if (entering(tcp)) { > - tprintf("%ld, ", tcp->u_arg[0]); > + printfd(tcp,tcp->u_arg[0]); > + tprintf(", "); > printxval(fcntlcmds, tcp->u_arg[1], "F_???"); > switch (tcp->u_arg[1]) { > case F_SETFD: In this and other similar cases, a comma between function arguments is necessary. > @@ -586,14 +591,16 @@ sys_llseek(struct tcb *tcp) > * rather than one 64-bit argument for which LONG_LONG works > * appropriate for the native byte order. > */ > - if (tcp->u_arg[4] == SEEK_SET) > - tprintf("%ld, %llu, ", tcp->u_arg[0], > - (((long long int) tcp->u_arg[1]) << 32 > - | (unsigned long long) (unsigned) tcp->u_arg[2])); > - else > - tprintf("%ld, %lld, ", tcp->u_arg[0], > - (((long long int) tcp->u_arg[1]) << 32 > - | (unsigned long long) (unsigned) tcp->u_arg[2])); > + if (tcp->u_arg[4] == SEEK_SET) { > + printfd(tcp,tcp->u_arg[0]); > + tprintf(", %llu, ", (((long long int) tcp->u_arg[1]) << 32 > + | (unsigned long long) (unsigned) tcp->u_arg[2])); > + } > + else { > + printfd(tcp,tcp->u_arg[0]); > + tprintf(", %lld, ", (((long long int) tcp->u_arg[1]) << 32 > + | (unsigned long long) (unsigned) tcp->u_arg[2])); > + } > } > else { > long long int off; Indentation before the change was better. > @@ -229,9 +235,10 @@ sys_sendfile(tcp) > struct tcb *tcp; > { > if (entering(tcp)) { > - tprintf("%ld, %ld, %llu, %lu", tcp->u_arg[0], tcp->u_arg[1], > - LONG_LONG(tcp->u_arg[2], tcp->u_arg[3]), > - tcp->u_arg[4]); > + printfd(tcp,tcp->u_arg[0]); > + tprintf(", "); > + printfd(tcp,tcp->u_arg[1]); > + tprintf(", %llu, %lu", LONG_LONG(tcp->u_arg[2], tcp->u_arg[3]), tcp->u_arg[4]); > } else { > off_t offset; > Likewise. > diff --git a/pathtrace.c b/pathtrace.c > new file mode 100644 > index 0000000..70babff > --- /dev/null > +++ b/pathtrace.c > @@ -0,0 +1,304 @@ I suggest to apply code indentation of quota.c for new files. This file is indented very different from quota.c > +// get path associated with pid,fd tuple > +static char *getpath(pid_t pid, int fd) > +{ > + static char path[PATH_MAX+1]; > + char linkpath[64]; > + ssize_t n; > + > + if (fd<0) > + return NULL; > + > + snprintf(linkpath, sizeof linkpath, "/proc/%d/fd/%d", pid, fd); > + n = readlink(linkpath, path, (sizeof path) - 0); It has to be sizeof(path) - 1 instead. > +// Add a path to the set we're tracing. Also add the canonicalized > +// versio of the path. Secifying NULL will delete all paths. > +int pathtrace_select(char *path) > +{ > + char *rpath; > + > + if (path==NULL) > + return storepath(path); > + > + if (storepath(path)) > + return -1; > + > + rpath = realpath(path,NULL); > + > + if (rpath == NULL) > + { > + // not an error if user specifies non-existent file/path > + if (errno == ENOENT || errno == ENOTDIR) > + return 0; > + perror("Failed to resolve path"); > + return -1; > + } There are more potential error codes, e.g. EACCES, ELOOP and ENAMETOOLONG, that should not be treated as errors. Maybe strace should tolerate any realpath errors. > + > + // if realpath and specified path are same, we're done > + if (!strcmp(path,rpath)) > + { > + free(rpath); > + return 0; > + } > + > + tprintf("Requested path '%s' resolved into '%s'\n",path,rpath); All debug messages should go to stderr. > @@ -1256,7 +1274,7 @@ proc_open(struct tcb *tcp, int attaching) > #else /* FREEBSD */ > /* just unset the PF_LINGER flag for the Run-on-Last-Close. */ > if (ioctl(tcp->pfd, PIOCGFL, &arg) < 0) { > - perror("PIOCGFL"); > + perror("PIOCGFL"); > return -1; > } > arg &= ~PF_LINGER; In this and other cases, please avoid re-indentation of unrelated code. -- ldv |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 22:29:16
|
On Fri, Feb 25, 2011 at 12:00:18AM +0300, Dmitry V. Levin wrote: > On Wed, Feb 23, 2011 at 02:16:43PM -0600, Grant Edwards wrote: [...] > Now a few comments on the patch itself. [...] > + fprintf(stderr,"Max trace paths exceeded, only using first %d\n",NumElem(selected)); On x86-64, gcc complains: pathtrace.c: In function 'storepath': pathtrace.c:106:3: warning: format '%d' expects type 'int', but argument 3 has type 'long unsigned int' To fix this warning, you can safely cast NumElem(selected) to int. -- ldv |
From: Grant E. <gra...@gm...> - 2011-02-24 21:25:47
|
On 2011-02-24, Dmitry V. Levin <ld...@al...> wrote: >> Attached are my current diffs. I've merged recent changes from the >> main git repository and cleaned up loose ends. >> >> The general approach is: >> >> 1) Store -P arguments both as-is and canonicalized. >> >> 2) Compare stored paths against /proc/$pid/fd/$fd for fd arguments. >> >> 3) Compare stored paths against path arguments. > > Thanks, it works. I think it could be merged after some cleanup and > re-indentation. > >> Things on the possible (near-term) todo list: >> >> 1) Canonicalize relative path args based on /proc/$pid/cwd >> >> 2) Handle combination fd/path args such as those passed to openat() > > Let's try simple combination without path canonicalization in both cases. OK. > I'm still not quite comfortable with applying realpath(3) to user > data, from security PoV. Maybe this kind of canonicalization should > be enabled only when requested explicitly, e.g. via special option. That wouldn't be difficult. Is it something we want to do now? >> 3) Handle non-simple fd args like fd_set args passed to select() or >> the array of pollfd structs passed to poll(). > > Agreed. > > Now a few comments on the patch itself. > >> diff --git a/.gitignore b/.gitignore >> index 9f17271..7828bad 100644 >> --- a/.gitignore >> +++ b/.gitignore >> @@ -4,6 +4,8 @@ >> .libs >> .*.swp >> >> +.dir-locals.el >> + >> ChangeLog >> >> Makefile > > This has nothing to do with new path features. > Why this change is needed? Oops. I added that so that the file used to tell emacs how to indent the source files would be ignored by git. I didn't mean to include it in the patch. > >> @@ -449,6 +450,7 @@ struct tcb { >> #define syserror(tcp) ((tcp)->u_error != 0) >> #define verbose(tcp) (qual_flags[(tcp)->scno] & QUAL_VERBOSE) >> #define abbrev(tcp) (qual_flags[(tcp)->scno] & QUAL_ABBREV) >> +#define filtered(tcp) ((tcp)->flags & TCB_FILTERED) > > The indentation of the line being added differs from neighbouring lines. I'll fix that. >> @@ -702,3 +704,11 @@ extern long ia32; >> #endif >> >> extern int not_failing_only; >> +extern int show_fd_path; >> +extern int tracing_paths; >> + >> +extern int pathtrace_select(char *path); >> +extern int pathtrace_match(struct tcb *tcp); >> +extern void printfd(struct tcb* tcp, int fd); >> + >> + > > No empty lines at the end, please. Roger. >> diff --git a/desc.c b/desc.c >> index 2b9f30a..ebbcf57 100644 >> --- a/desc.c >> +++ b/desc.c >> @@ -309,7 +309,8 @@ int >> sys_fcntl(struct tcb *tcp) >> { >> if (entering(tcp)) { >> - tprintf("%ld, ", tcp->u_arg[0]); >> + printfd(tcp,tcp->u_arg[0]); >> + tprintf(", "); >> printxval(fcntlcmds, tcp->u_arg[1], "F_???"); >> switch (tcp->u_arg[1]) { >> case F_SETFD: > > In this and other similar cases, a comma between function arguments is > necessary. While I understand that sentence, I don't undestand how it is meant as a comment on the diff above it. Can you explain? >> @@ -586,14 +591,16 @@ sys_llseek(struct tcb *tcp) >> * rather than one 64-bit argument for which LONG_LONG works >> * appropriate for the native byte order. >> */ >> - if (tcp->u_arg[4] == SEEK_SET) >> - tprintf("%ld, %llu, ", tcp->u_arg[0], >> - (((long long int) tcp->u_arg[1]) << 32 >> - | (unsigned long long) (unsigned) tcp->u_arg[2])); >> - else >> - tprintf("%ld, %lld, ", tcp->u_arg[0], >> - (((long long int) tcp->u_arg[1]) << 32 >> - | (unsigned long long) (unsigned) tcp->u_arg[2])); >> + if (tcp->u_arg[4] == SEEK_SET) { >> + printfd(tcp,tcp->u_arg[0]); >> + tprintf(", %llu, ", (((long long int) tcp->u_arg[1]) << 32 >> + | (unsigned long long) (unsigned) tcp->u_arg[2])); >> + } >> + else { >> + printfd(tcp,tcp->u_arg[0]); >> + tprintf(", %lld, ", (((long long int) tcp->u_arg[1]) << 32 >> + | (unsigned long long) (unsigned) tcp->u_arg[2])); >> + } >> } >> else { >> long long int off; > > Indentation before the change was better. Sorry about that, I'll fix it. >> @@ -229,9 +235,10 @@ sys_sendfile(tcp) >> struct tcb *tcp; >> { >> if (entering(tcp)) { >> - tprintf("%ld, %ld, %llu, %lu", tcp->u_arg[0], tcp->u_arg[1], >> - LONG_LONG(tcp->u_arg[2], tcp->u_arg[3]), >> - tcp->u_arg[4]); >> + printfd(tcp,tcp->u_arg[0]); >> + tprintf(", "); >> + printfd(tcp,tcp->u_arg[1]); >> + tprintf(", %llu, %lu", LONG_LONG(tcp->u_arg[2], tcp->u_arg[3]), tcp->u_arg[4]); >> } else { >> off_t offset; > > Likewise. I'll fix that too. >> diff --git a/pathtrace.c b/pathtrace.c >> new file mode 100644 >> index 0000000..70babff >> --- /dev/null >> +++ b/pathtrace.c >> @@ -0,0 +1,304 @@ > > I suggest to apply code indentation of quota.c for new files. > This file is indented very different from quota.c I planned on re-indenting it after I was done working on it. Since I haven't found any tools/settings that produce the indentation you desire, it's pretty difficult for me to maintain your desired indentation while working on files. After we've decided on the actual code I'll fix the indentation. Is there a set of options for ident or astyle or some other tool that will produce the indentation you want? >> +// get path associated with pid,fd tuple >> +static char *getpath(pid_t pid, int fd) >> +{ >> + static char path[PATH_MAX+1]; >> + char linkpath[64]; >> + ssize_t n; >> + >> + if (fd<0) >> + return NULL; >> + >> + snprintf(linkpath, sizeof linkpath, "/proc/%d/fd/%d", pid, fd); >> + n = readlink(linkpath, path, (sizeof path) - 0); > > It has to be sizeof(path) - 1 instead. Of course. >> +// Add a path to the set we're tracing. Also add the canonicalized >> +// versio of the path. Secifying NULL will delete all paths. >> +int pathtrace_select(char *path) >> +{ >> + char *rpath; >> + >> + if (path==NULL) >> + return storepath(path); >> + >> + if (storepath(path)) >> + return -1; >> + >> + rpath = realpath(path,NULL); >> + >> + if (rpath == NULL) >> + { >> + // not an error if user specifies non-existent file/path >> + if (errno == ENOENT || errno == ENOTDIR) >> + return 0; >> + perror("Failed to resolve path"); >> + return -1; >> + } > > There are more potential error codes, e.g. EACCES, ELOOP and ENAMETOOLONG, > that should not be treated as errors. Maybe strace should tolerate any > realpath errors. OK. >> + >> + // if realpath and specified path are same, we're done >> + if (!strcmp(path,rpath)) >> + { >> + free(rpath); >> + return 0; >> + } >> + >> + tprintf("Requested path '%s' resolved into '%s'\n",path,rpath); > > All debug messages should go to stderr. > >> @@ -1256,7 +1274,7 @@ proc_open(struct tcb *tcp, int attaching) >> #else /* FREEBSD */ >> /* just unset the PF_LINGER flag for the Run-on-Last-Close. */ >> if (ioctl(tcp->pfd, PIOCGFL, &arg) < 0) { >> - perror("PIOCGFL"); >> + perror("PIOCGFL"); >> return -1; >> } >> arg &= ~PF_LINGER; > > In this and other cases, please avoid re-indentation of unrelated code. I've been trying to, but it's difficult when the indentation is so inconsistent (different styles in different files, spaces in some places within a file tabs in others). I don't know about others, but it I would find it a lot easier to work on the source files if they all had a consistent, standard indentation style that could be produced by a tool like indent or astyle (I don't particulary care what standard or what tool). -- Grant Edwards grant.b.edwards Yow! ! Up ahead! It's a at DONUT HUT!! gmail.com |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 22:25:15
|
On Thu, Feb 24, 2011 at 09:25:13PM +0000, Grant Edwards wrote: > >> 3) Compare stored paths against path arguments. > > > > Thanks, it works. I think it could be merged after some cleanup and > > re-indentation. > > > >> Things on the possible (near-term) todo list: > >> > >> 1) Canonicalize relative path args based on /proc/$pid/cwd > >> > >> 2) Handle combination fd/path args such as those passed to openat() > > > > Let's try simple combination without path canonicalization in both cases. > > OK. > > > I'm still not quite comfortable with applying realpath(3) to user > > data, from security PoV. Maybe this kind of canonicalization should > > be enabled only when requested explicitly, e.g. via special option. > > That wouldn't be difficult. Is it something we want to do now? Let's cleanup and merge already written code first. > >> --- a/desc.c > >> +++ b/desc.c > >> @@ -309,7 +309,8 @@ int > >> sys_fcntl(struct tcb *tcp) > >> { > >> if (entering(tcp)) { > >> - tprintf("%ld, ", tcp->u_arg[0]); > >> + printfd(tcp,tcp->u_arg[0]); > >> + tprintf(", "); > >> printxval(fcntlcmds, tcp->u_arg[1], "F_???"); > >> switch (tcp->u_arg[1]) { > >> case F_SETFD: > > > > In this and other similar cases, a comma between function arguments is > > necessary. > > While I understand that sentence, I don't undestand how it is meant as > a comment on the diff above it. Can you explain? - printfd(tcp,tcp->u_arg[0]); + printfd(tcp, tcp->u_arg[0]); > >> diff --git a/pathtrace.c b/pathtrace.c > >> new file mode 100644 > >> index 0000000..70babff > >> --- /dev/null > >> +++ b/pathtrace.c > >> @@ -0,0 +1,304 @@ > > > > I suggest to apply code indentation of quota.c for new files. > > This file is indented very different from quota.c > > I planned on re-indenting it after I was done working on it. Since I > haven't found any tools/settings that produce the indentation you > desire, it's pretty difficult for me to maintain your desired > indentation while working on files. After we've decided on the actual > code I'll fix the indentation. Is there a set of options for ident or > astyle or some other tool that will produce the indentation you want? For new files, I use GNU indent with attached config, the result it produces is very close to my preferred coding style. > >> @@ -1256,7 +1274,7 @@ proc_open(struct tcb *tcp, int attaching) > >> #else /* FREEBSD */ > >> /* just unset the PF_LINGER flag for the Run-on-Last-Close. */ > >> if (ioctl(tcp->pfd, PIOCGFL, &arg) < 0) { > >> - perror("PIOCGFL"); > >> + perror("PIOCGFL"); > >> return -1; > >> } > >> arg &= ~PF_LINGER; > > > > In this and other cases, please avoid re-indentation of unrelated code. > > I've been trying to, but it's difficult when the indentation is so > inconsistent (different styles in different files, spaces in some > places within a file tabs in others). I agree, this is inconvenient. > I don't know about others, but it I would find it a lot easier to work > on the source files if they all had a consistent, standard indentation > style that could be produced by a tool like indent or astyle (I don't > particulary care what standard or what tool). Yes, adding new code would be easier. Fortunately, git blame has -w option, so one can hope that history tracking won't suffer very much from total code re-indentation. -- ldv |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 23:02:57
Attachments:
.indent.pro
|
On Fri, Feb 25, 2011 at 01:25:07AM +0300, Dmitry V. Levin wrote: > For new files, I use GNU indent with attached config, the result it produces > is very close to my preferred coding style. Looks like I forgot to attach the config. Here it is. -- ldv |
From: Grant E. <gra...@gm...> - 2011-02-24 22:51:15
|
On 2011-02-24, Dmitry V. Levin <ld...@al...> wrote: >>> I'm still not quite comfortable with applying realpath(3) to user >>> data, from security PoV. Maybe this kind of canonicalization should >>> be enabled only when requested explicitly, e.g. via special option. >> >> That wouldn't be difficult. Is it something we want to do now? > > Let's cleanup and merge already written code first. Agreed. >>>> --- a/desc.c >>>> +++ b/desc.c >>>> @@ -309,7 +309,8 @@ int >>>> sys_fcntl(struct tcb *tcp) >>>> { >>>> if (entering(tcp)) { >>>> - tprintf("%ld, ", tcp->u_arg[0]); >>>> + printfd(tcp,tcp->u_arg[0]); >>>> + tprintf(", "); >>>> printxval(fcntlcmds, tcp->u_arg[1], "F_???"); >>>> switch (tcp->u_arg[1]) { >>>> case F_SETFD: >>> >>> In this and other similar cases, a comma between function arguments >>> is necessary. >> >> While I understand that sentence, I don't undestand how it is meant as >> a comment on the diff above it. Can you explain? > > - printfd(tcp,tcp->u_arg[0]); > + printfd(tcp, tcp->u_arg[0]); Ah! You want a _space_after_ the comma between args. I'll fix those. >>>> diff --git a/pathtrace.c b/pathtrace.c >>>> new file mode 100644 >>>> index 0000000..70babff >>>> --- /dev/null >>>> +++ b/pathtrace.c >>>> @@ -0,0 +1,304 @@ >>> >>> I suggest to apply code indentation of quota.c for new files. >>> This file is indented very different from quota.c >> >> I planned on re-indenting it after I was done working on it. Since I >> haven't found any tools/settings that produce the indentation you >> desire, it's pretty difficult for me to maintain your desired >> indentation while working on files. After we've decided on the >> actual code I'll fix the indentation. Is there a set of options for >> ident or astyle or some other tool that will produce the indentation >> you want? > > For new files, I use GNU indent with attached config, the result it > produces is very close to my preferred coding style. Excellent, thanks. >> I don't know about others, but it I would find it a lot easier to >> work on the source files if they all had a consistent, standard >> indentation style that could be produced by a tool like indent or >> astyle (I don't particulary care what standard or what tool). > > Yes, adding new code would be easier. Fortunately, git blame has -w > option, so one can hope that history tracking won't suffer very much > from total code re-indentation. It can be a little painful when you decide to make the jump and re-indent a whole project. -- Grant Edwards grant.b.edwards Yow! I guess it was all a at DREAM ... or an episode of gmail.com HAWAII FIVE-O ... |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 23:00:49
|
On Thu, Feb 24, 2011 at 10:50:48PM +0000, Grant Edwards wrote: > > - printfd(tcp,tcp->u_arg[0]); > > + printfd(tcp, tcp->u_arg[0]); > > Ah! You want a _space_after_ the comma between args. Oops, I definitely meant a _space_ after the comma. :) -- ldv |
From: Roland M. <ro...@re...> - 2011-02-24 22:51:46
|
> Let's cleanup and merge already written code first. Btw, IMHO this new feature should wait until after the 4.6 release. We can have a 4.6.1 when it's ready and well-tested. Thanks, Roland |
From: Dmitry V. L. <ld...@al...> - 2011-02-24 22:58:47
|
On Thu, Feb 24, 2011 at 02:51:38PM -0800, Roland McGrath wrote: > > Let's cleanup and merge already written code first. > > Btw, IMHO this new feature should wait until after the 4.6 release. > We can have a 4.6.1 when it's ready and well-tested. Yes, it is going to be merged after 4.6 release, which is going to happen as soon as we get debian/* and linux/[^b]*/ioctlent.h files sorted out. -- ldv |
From: Grant E. <gra...@gm...> - 2011-02-24 22:55:22
|
On 2011-02-24, Dmitry V. Levin <ld...@al...> wrote: > On Fri, Feb 25, 2011 at 12:00:18AM +0300, Dmitry V. Levin wrote: >> On Wed, Feb 23, 2011 at 02:16:43PM -0600, Grant Edwards wrote: > [...] >> Now a few comments on the patch itself. > [...] >> + fprintf(stderr,"Max trace paths exceeded, only using first %d\n",NumElem(selected)); > > On x86-64, gcc complains: > pathtrace.c: In function 'storepath': > pathtrace.c:106:3: warning: format '%d' expects type 'int', but argument 3 has type 'long unsigned int' > > To fix this warning, you can safely cast NumElem(selected) to int. OK. -- Grant Edwards grant.b.edwards Yow! Either CONFESS now or at we go to "PEOPLE'S COURT"!! gmail.com |
From: Grant E. <gra...@gm...> - 2011-03-12 21:48:19
Attachments:
pathtrace.patch
|
Attached are my current diffs. I've merged recent changes from the main git repository and addressed various comments. The notable changes are addition of handling for 'select' syscalls, 'poll' syscalls, and 'epoll_ctl'. There are a few more changes that have been discussed, but my understanding is that we've decided to delay them until after the existing features have been merged: 1) Canonicalization of arguments that are relative paths. 2) Canonicalization of arguments that are (fd, relpath) tuples. 3) Filtering based on retrun-values that are file dscriptors. Numbers 1 and 2 probably aren't difficult, but there is some question as to their reliability. Number 3 would take more thought because the decision to display a system call or not is made at the entry of the system call, and "un-doing" that decision when the system call returns would be messy. -- Grant Edwards |
From: Grant E. <gra...@gm...> - 2011-03-13 18:16:18
|
On 2011-03-12, Grant Edwards <gra...@gm...> wrote: > Attached are my current diffs. I've merged recent changes from the > main git repository and addressed various comments. I've decided that all the lines like this: (umovestr(tcp, tcp->u_arg[0], sizeof path, path), pathmatch(path)) are just too ugly. I'm going to refactor things a bit and add a upathmatch() function that accepts a pointer to a string in user-space. That will eliminate all of the comma-operators in those long boolean return expressions. I'll submit new diffs on Monday. -- Grant |
From: Grant E. <gra...@gm...> - 2011-03-14 15:47:53
Attachments:
pathtrace.patch
|
On Sat, Mar 12, 2011 at 03:48:08PM -0600, Grant Edwards wrote: > Attached are my current diffs. I've merged recent changes from the > main git repository and addressed various comments. > > The notable changes are addition of handling for 'select' syscalls, > 'poll' syscalls, and 'epoll_ctl'. This set of diffs has had a little more cleanup done an code that checks paths residing in user-space was re-factored into a separate function. |
From: Dmitry V. L. <ld...@al...> - 2011-03-14 23:20:39
|
On Mon, Mar 14, 2011 at 10:47:18AM -0500, Grant Edwards wrote: > On Sat, Mar 12, 2011 at 03:48:08PM -0600, Grant Edwards wrote: > > > Attached are my current diffs. I've merged recent changes from the > > main git repository and addressed various comments. > > > > The notable changes are addition of handling for 'select' syscalls, > > 'poll' syscalls, and 'epoll_ctl'. > > This set of diffs has had a little more cleanup done an code that > checks paths residing in user-space was re-factored into a separate > function. Thanks. I have two comments after skimming through the patch: 1. never use alloca(3) with arbitrary size coming from untrusted input: it's unsafe, because alloca(3) has no means to indicate a memory allocation failure; 2. getpath() sounds a bit generic, I suggest renaming it to something more specific, e.g. getfdpath(). -- ldv |
From: Grant E. <gra...@gm...> - 2011-03-15 04:39:17
|
On 2011-03-14, Dmitry V. Levin <ld...@al...> wrote: > On Mon, Mar 14, 2011 at 10:47:18AM -0500, Grant Edwards wrote: >> On Sat, Mar 12, 2011 at 03:48:08PM -0600, Grant Edwards wrote: >> >> > Attached are my current diffs. I've merged recent changes from the >> > main git repository and addressed various comments. >> > >> > The notable changes are addition of handling for 'select' syscalls, >> > 'poll' syscalls, and 'epoll_ctl'. >> >> This set of diffs has had a little more cleanup done an code that >> checks paths residing in user-space was re-factored into a separate >> function. > > Thanks. I have two comments after skimming through the patch: > > 1. never use alloca(3) with arbitrary size coming from untrusted input: > it's unsafe, because alloca(3) has no means to indicate a memory > allocation failure; I'll change it to a malloc. > 2. getpath() sounds a bit generic, I suggest renaming it to something > more specific, e.g. getfdpath(). That sounds good. -- Grant |
From: Grant E. <gra...@gm...> - 2011-03-16 12:43:48
Attachments:
pathtrace.patch
|
On 2011-03-14, Dmitry V. Levin <ldv-u2l5PoMzF/Vg9...@pu...> wrote: > On Mon, Mar 14, 2011 at 10:47:18AM -0500, Grant Edwards wrote: >> On Sat, Mar 12, 2011 at 03:48:08PM -0600, Grant Edwards wrote: >> >> > Attached are my current diffs. I've merged recent changes from the >> > main git repository and addressed various comments. >> > >> > The notable changes are addition of handling for 'select' syscalls, >> > 'poll' syscalls, and 'epoll_ctl'. >> >> This set of diffs has had a little more cleanup done an code that >> checks paths residing in user-space was re-factored into a separate >> function. > > Thanks. I have two comments after skimming through the patch: > > 1. never use alloca(3) with arbitrary size coming from untrusted input: > it's unsafe, because alloca(3) has no means to indicate a memory > allocation failure; > > 2. getpath() sounds a bit generic, I suggest renaming it to something > more specific, e.g. getfdpath(). Attached are updated diffs against git HEAD as of a few minutes ago... -- Grant |
From: Dmitry V. L. <ld...@al...> - 2011-03-22 14:02:57
|
On Wed, Mar 16, 2011 at 07:43:36AM -0500, Grant Edwards wrote: > Attached are updated diffs against git HEAD as of a few minutes ago... I've made a few minor corrections, most essential of them are early include of "defs.h" (due to config.h), and check for tcp->scno before pathtrace_match() call (because the latter uses tcp->scno). Would you like to prepare a commit message? diff --git a/defs.h b/defs.h index ecfd7c0..9e1f5f4 100644 --- a/defs.h +++ b/defs.h @@ -453,7 +453,7 @@ struct tcb { #define syserror(tcp) ((tcp)->u_error != 0) #define verbose(tcp) (qual_flags[(tcp)->scno] & QUAL_VERBOSE) #define abbrev(tcp) (qual_flags[(tcp)->scno] & QUAL_ABBREV) -#define filtered(tcp) ((tcp)->flags & TCB_FILTERED) +#define filtered(tcp) ((tcp)->flags & TCB_FILTERED) struct xlat { int val; diff --git a/pathtrace.c b/pathtrace.c index 495122d..2980a12 100644 --- a/pathtrace.c +++ b/pathtrace.c @@ -1,4 +1,3 @@ - /* * Copyright (c) 2011, Comtrol Corp. * All rights reserved. @@ -27,11 +26,11 @@ * */ +#include "defs.h" + #include <ctype.h> #include <limits.h> -#include "defs.h" - #ifdef HAVE_POLL_H #include <poll.h> #endif @@ -67,6 +66,7 @@ static int upathmatch(struct tcb *tcp, unsigned long upath) { char path[PATH_MAX + 1]; + return umovestr(tcp, upath, sizeof path, path) == 0 && pathmatch(path); } @@ -135,8 +135,8 @@ pathtrace_select(char *path) return 0; } - fprintf(stderr, "Requested path '%s' resolved into '%s'\n", path, - rpath); + fprintf(stderr, "Requested path '%s' resolved into '%s'\n", + path, rpath); return storepath(rpath); } @@ -249,7 +249,8 @@ pathtrace_match(struct tcb *tcp) if (s->sys_func == sys_oldselect) { - if (umoven(tcp, tcp->u_arg[0], sizeof oldargs, (char*)oldargs) < 0) + if (umoven(tcp, tcp->u_arg[0], sizeof oldargs, + (char*) oldargs) < 0) { fprintf(stderr, "umoven() failed\n"); return 0; @@ -259,9 +260,9 @@ pathtrace_match(struct tcb *tcp) args = tcp->u_arg; nfds = args[0]; - fdsize = ((((nfds + 7) / 8) + sizeof(long) - - 1) & -sizeof(long)); - fds = (fd_set *) malloc(fdsize); + fdsize = ((((nfds + 7) / 8) + sizeof(long) - 1) + & -sizeof(long)); + fds = malloc(fdsize); if (fds == NULL) { @@ -269,7 +270,7 @@ pathtrace_match(struct tcb *tcp) return 0; } - for (i = 1; i <= 3; i++) + for (i = 1; i <= 3; ++i) { if (args[i] == 0) continue; @@ -280,7 +281,7 @@ pathtrace_match(struct tcb *tcp) continue; } - for (j = 0; j < nfds; j++) + for (j = 0; j < nfds; ++j) if (FD_ISSET(j, fds) && fdmatch(tcp, j)) { free(fds); diff --git a/strace.c b/strace.c index f8ae0d0..7874d19 100644 --- a/strace.c +++ b/strace.c @@ -105,7 +105,7 @@ static bool daemonized_tracer = 0; int not_failing_only = 0; /* Show path associated with fd arguments */ -int show_fd_path; +int show_fd_path = 0; /* are we filtering traces based on paths? */ int tracing_paths = 0; diff --git a/syscall.c b/syscall.c index 213492a..6987011 100644 --- a/syscall.c +++ b/syscall.c @@ -2688,10 +2688,10 @@ trace_syscall_entering(struct tcb *tcp) internal_syscall(tcp); - if ((tracing_paths && !pathtrace_match(tcp)) || - (tcp->scno >= 0 && tcp->scno < nsyscalls && !(qual_flags[tcp->scno] & QUAL_TRACE))) { - tcp->flags |= TCB_FILTERED; - tcp->flags |= TCB_INSYSCALL; + if ((tcp->scno >= 0 && tcp->scno < nsyscalls && + !(qual_flags[tcp->scno] & QUAL_TRACE)) || + (tracing_paths && !pathtrace_match(tcp))) { + tcp->flags |= TCB_INSYSCALL | TCB_FILTERED; return 0; } diff --git a/util.c b/util.c index 57e11be..d6c4df7 100644 --- a/util.c +++ b/util.c @@ -419,7 +419,8 @@ void printfd(struct tcb *tcp, int fd) { char *p; - if (show_fd_path && (p=getfdpath(tcp, fd))) + + if (show_fd_path && (p = getfdpath(tcp, fd))) tprintf("%d<%s>", fd, p); else tprintf("%d", fd); @@ -1781,29 +1782,24 @@ err: #endif /* SUNOS4 */ -#ifdef LINUX -#include <unistd.h> - // get path associated with fd char *getfdpath(struct tcb *tcp, int fd) { +#ifdef LINUX static char path[PATH_MAX+1]; char linkpath[64]; ssize_t n; - + if (fd < 0) return NULL; - + snprintf(linkpath, sizeof linkpath, "/proc/%d/fd/%d", tcp->pid, fd); n = readlink(linkpath, path, (sizeof path) - 1); if (n <= 0) return NULL; path[n] = '\0'; return path; -} #else -char *getfdpath(struct tcb *tcp, int fd) -{ return NULL; -} #endif +} -- ldv |
From: Grant E. <gra...@gm...> - 2011-03-28 15:43:17
|
On 2011-03-22, Dmitry V. Levin <ld...@al...> wrote: > On Wed, Mar 16, 2011 at 07:43:36AM -0500, Grant Edwards wrote: >> Attached are updated diffs against git HEAD as of a few minutes ago... > > I've made a few minor corrections, most essential of them are early > include of "defs.h" (due to config.h), and check for tcp->scno > before pathtrace_match() call (because the latter uses tcp->scno). > > Would you like to prepare a commit message? How about this: ------------------------------------------------------------------------ Add ability to print file descriptor paths and filter by those paths. * pathrace.c (pathmatch, upathmatch, fdmatch, storepath, pathtrace_select, pathtrace_match): New functions to handle matching syscall arguments to user-specified file paths. * Makefile.am (strace_SOURCES) Add pathtrace.c. * defs.h (TCB_FILTERED, filtered): New defines. (getfdpath, pathtrace_select, pathtrace_match, show_fd_path, tracing_paths) New declarations. * strace.1 Add descriptions of -y and -P options. * strace.c (show_fd_path, tracing_paths): New global variables. (usage, main): Add handling of -y and -P options. * syscall.c (trace_syscall_entering): Add path matching logic to the "print/noprint" decision and set the TCB_FILTERED bit appropriately. (trace_syscall_exiting) Use filtered() macro to check the TCB_FILTERED bit to determine print/noprint status. ------------------------------------------------------------------------ |
From: Dmitry V. L. <ld...@al...> - 2011-04-07 20:37:50
|
On Mon, Mar 28, 2011 at 03:42:53PM +0000, Grant Edwards wrote: > On 2011-03-22, Dmitry V. Levin wrote: > > On Wed, Mar 16, 2011 at 07:43:36AM -0500, Grant Edwards wrote: > >> Attached are updated diffs against git HEAD as of a few minutes ago... > > > > I've made a few minor corrections, most essential of them are early > > include of "defs.h" (due to config.h), and check for tcp->scno > > before pathtrace_match() call (because the latter uses tcp->scno). > > > > Would you like to prepare a commit message? > > How about this: Thanks, and sorry for the delay on my side. I've just pushed it to ldv/pathtrace branch; if there are no obvious issues to be corrected, I'll move it to HEAD. -- ldv |
From: Grant E. <gra...@gm...> - 2011-04-07 21:05:13
|
On 2011-04-07, Dmitry V. Levin <ld...@al...> wrote: > On Mon, Mar 28, 2011 at 03:42:53PM +0000, Grant Edwards wrote: >> On 2011-03-22, Dmitry V. Levin wrote: >> > On Wed, Mar 16, 2011 at 07:43:36AM -0500, Grant Edwards wrote: >> >> Attached are updated diffs against git HEAD as of a few minutes ago... >> > >> > I've made a few minor corrections, most essential of them are early >> > include of "defs.h" (due to config.h), and check for tcp->scno >> > before pathtrace_match() call (because the latter uses tcp->scno). >> > >> > Would you like to prepare a commit message? >> >> How about this: > > Thanks, and sorry for the delay on my side. No problem. > I've just pushed it to ldv/pathtrace branch; if there are no obvious > issues to be corrected, I'll move it to HEAD. Thanks! -- Grant Edwards grant.b.edwards Yow! over in west at Philadelphia a puppy is gmail.com vomiting ... |
From: Dmitry V. L. <ld...@al...> - 2011-02-18 00:12:50
|
On Thu, Feb 17, 2011 at 11:29:27AM -0600, Grant Edwards wrote: > I've re-written the path display/trace features. Attached is a > snapshot of the diffs against the current git HEAD. Changes since the > first version I posted follow: Thanks a lot. Now we have a working PoC to discuss details. > * Removed the code that tracked fd-table state based on syscalls. It > now does a readlink() on /proc/<pid>/fd/<fd> to find the path > associated with a file descriptor. Linux kernel treats file descriptors as unsigned integers, but sysctl_nr_open is still limited to 0x100000, so I'd add an extra check for fd < 0 in getpath(). > * Change the -P option so that instead of accepting a colon-separated > list of paths, it accepts a single path. Multiple -P options can > be specified to trace multiple paths. Maybe an attempt to exceed MAXSELECTED in pathtrace_select() should be treated as a fatal error. > * Change the way that file descriptors are printed by the display > functions. Instead of using "%s" and a function that returns a > formatted string, they now use a printfd() function analogous to > the printpath() function. BTW, there is a long standing bug in decoding of file descriptors on 64bit architectures, and it's time to fix it in one place. For example, $ cat close.c int close(unsigned long fd); int main(void){return !!close(0xffffffff00000000UL);} $ gcc -Wall -O2 close.c -o close $ strace -eclose -o'|tail -1' ./close close(-4294967296) = 0 $ strace -y -eclose -o'|tail -1' ./close close(-4294967296</dev/pts/1>) = 0 The fix is to change "fd" type in printfd() from long to int, and to print it using %d format. > * Added handling (for Linux) for system calls where we need to look > at something other than arg[0] for a descriptor/path. There is a lot of work to do. Some non-arg[0] syscalls are not listed in pathtrace_match(); for example, sys_dup3 and sys_old_mmap are listed but sys_dup2 and sys_mmap are not. Some struct sysent records still have outdated sys_flags; for example, TRACE_DESC is not set for sys_mmap and sys_fadvise64*. I'm not sure that we can ignore all cases where syscalls return file descriptor as their return value. For example, $ strace -s4 -y -P /lib64/libc-2.11.3.so -P /lib64/libc.so.6 /bin/echo open("/lib64/libc.so.6", O_RDONLY) = 3 read(3</lib64/libc-2.11.3.so>, "\177ELF"..., 832) = 832 fstat(3</lib64/libc-2.11.3.so>, {st_mode=S_IFREG|0755, st_size=1465744, ...}) = 0 close(3</lib64/libc-2.11.3.so>) = 0 The pathname passed to open(2) is a symlink, and /proc/<pid>/fd/<fd> points to the canonicalized pathname, so -P /lib64/libc-2.11.3.so won't catch this open(2) call now. -- ldv |
From: Dmitry V. L. <ld...@al...> - 2011-02-20 21:16:54
|
On Fri, Feb 18, 2011 at 03:12:42AM +0300, Dmitry V. Levin wrote: > Some struct sysent records still have outdated sys_flags; for example, > TRACE_DESC is not set for sys_mmap and sys_fadvise64*. I've just pushed fixes for these and other similar bugs in sys_flags. Now epoll_create* and fanotify_init also have to be ignored in the path filtering code. -- ldv |
From: Grant E. <gra...@gm...> - 2011-02-18 16:20:52
|
On 2011-02-18, Dmitry V. Levin <ld...@al...> wrote: > On Thu, Feb 17, 2011 at 11:29:27AM -0600, Grant Edwards wrote: >> I've re-written the path display/trace features. Attached is a >> snapshot of the diffs against the current git HEAD. Changes since the >> first version I posted follow: > > Thanks a lot. Now we have a working PoC to discuss details. > >> * Removed the code that tracked fd-table state based on syscalls. It >> now does a readlink() on /proc/<pid>/fd/<fd> to find the path >> associated with a file descriptor. > > Linux kernel treats file descriptors as unsigned integers, but > sysctl_nr_open is still limited to 0x100000, so I'd add an extra check > for fd < 0 in getpath(). Done. >> * Change the -P option so that instead of accepting a colon-separated >> list of paths, it accepts a single path. Multiple -P options can >> be specified to trace multiple paths. > > Maybe an attempt to exceed MAXSELECTED in pathtrace_select() should be > treated as a fatal error. Done. >> * Change the way that file descriptors are printed by the display >> functions. Instead of using "%s" and a function that returns a >> formatted string, they now use a printfd() function analogous to >> the printpath() function. > > BTW, there is a long standing bug in decoding of file descriptors on > 64bit architectures, and it's time to fix it in one place. > > For example, > $ cat close.c > int close(unsigned long fd); > int main(void){return !!close(0xffffffff00000000UL);} > $ gcc -Wall -O2 close.c -o close > $ strace -eclose -o'|tail -1' ./close > close(-4294967296) = 0 > $ strace -y -eclose -o'|tail -1' ./close > close(-4294967296</dev/pts/1>) = 0 > > The fix is to change "fd" type in printfd() from long to int, and to print > it using %d format. Done. >> * Added handling (for Linux) for system calls where we need to look >> at something other than arg[0] for a descriptor/path. > > There is a lot of work to do. > Some non-arg[0] syscalls are not listed in pathtrace_match(); for > example, sys_dup3 and sys_old_mmap are listed but sys_dup2 and > sys_mmap are not. The missing dup2 appears to have been an oversight on my part. sys_mmap is missing becuase I only examined system calls that had the TRACE_DESC flag set in the sysent table. I've added both of them and added the TRACE_DESC flag to the sysent entry for sys_mmap. > Some struct sysent records still have outdated sys_flags; for example, > TRACE_DESC is not set for sys_mmap and sys_fadvise64*. Fixed. [I notice that the flags value was TF on m68k and s390x, but it was 0 for the rest. I've changed it to TD for all architectures.] > I'm not sure that we can ignore all cases where syscalls return file > descriptor as their return value. That wasn't my intent, and I don't think that's what the code does. I meant to ignore system calls who returned file descriptors (and therefore had the TRACE_DESC flag set), but didn't have arguments that were file paths or open file descriptors. I assume you're talking about are this bit of code: if (s->sys_func == sys_poll || s->sys_func == printargs || s->sys_func == sys_ppoll || s->sys_func == sys_select || s->sys_func == sys_oldselect || s->sys_func == sys_pselect6 || s->sys_func == sys_pipe || s->sys_func == sys_pipe2 || s->sys_func == sys_eventfd2 || s->sys_func == sys_eventfd || s->sys_func == sys_inotify_init1 || s->sys_func == sys_timerfd_create || s->sys_func == sys_timerfd_gettime || s->sys_func == sys_timerfd_settime) { // these either return fd's or they do other things we don't // yet handle return 0; } How's this? if (s->sys_func == printargs || s->sys_func == sys_pipe || s->sys_func == sys_pipe2 || s->sys_func == sys_eventfd2 || s->sys_func == sys_eventfd || s->sys_func == sys_inotify_init1 || s->sys_func == sys_timerfd_create || s->sys_func == sys_timerfd_settime || s->sys_func == sys_timerfd_gettime) { // these have TRACE_FILE or TRACE_DESCRIPTOR set, but they don't // have any file descriptor or path args to test return 0; } if (s->sys_func == sys_poll || s->sys_func == sys_ppoll || s->sys_func == sys_select || s->sys_func == sys_oldselect || s->sys_func == sys_pselect6) { // these have arguments that refer indirectly to file // descriptors, and we need to add code to handle them. return 0; } I'm going to work on checking the fd's referred in the args to the poll/select calls. Are there other calls in that list that have path/fd arguments that should be tested? > For example, > $ strace -s4 -y -P /lib64/libc-2.11.3.so -P /lib64/libc.so.6 /bin/echo > open("/lib64/libc.so.6", O_RDONLY) = 3 > read(3</lib64/libc-2.11.3.so>, "\177ELF"..., 832) = 832 > fstat(3</lib64/libc-2.11.3.so>, {st_mode=S_IFREG|0755, st_size=1465744, ...}) = 0 > close(3</lib64/libc-2.11.3.so>) = 0 > > The pathname passed to open(2) is a symlink, and /proc/<pid>/fd/<fd> > points to the canonicalized pathname, so -P /lib64/libc-2.11.3.so > won't catch this open(2) call now. I can think of a couple options: 1) Store both the canoical and "as-provided" versions of paths passed to -P. 2) Canonicalize the pathname passed to -P, and also canonicalize path arguments when checking for a match. Either would be OK with me. -- Grant Edwards grant.b.edwards Yow! Are we wet yet? at gmail.com |