From: Zubin M. <zub...@gm...> - 2014-02-25 16:57:46
|
Hey all,

I'm Zubin and I love low-level systems programming! :)

A little about myself: I program primarily in C and Python, have systems programming experience with Minix (filesystem development) and Linux, and am a hobbyist reverse engineer (I play CTF security exercises), which is when I use strace the most!

I had a look at the ideas list here [1] and found the idea on improved path decoding quite interesting, and was hoping we could discuss it further on the mailing list.

I had a quick look at the implementation of the -y flag and noticed the implementation of getfdpath() (where the magic seemed to be happening). It seemed to be trying to read the value of the symbolic link at /proc/<pid>/fd/<fd>.

Is my understanding of the following accurate? Modifications need to be made such that upon using the "yy" flag:
- Calls to functions that take a path as an argument are displayed with the absolute path regardless of the argument that is passed in.
- When calls to functions that return a file descriptor are made, the absolute path to the filename corresponding to the file descriptor needs to be printed.
- Same as above for functions that use path/descriptor combos.

I believe that the first step would be to document and note down the system calls that belong to one or more of the above categories and their system call numbers, and if the -yy flag is used, check the tcp->scno against these numbers and act accordingly.

Is there something I'm missing? I'd love any kind of feedback!

Cheers,
-- zm

[1] http://sourceforge.net/p/strace/wiki/GoogleSummerOfCode2014/ |
From: Dmitry V. L. <ld...@al...> - 2014-02-26 01:28:13
|
Hi, On Tue, Feb 25, 2014 at 10:27:37PM +0530, Zubin Mithra wrote: > Hey all, > > I'm Zubin and I love low level systems programming! :) Great! :) > A little about myself, I program primarily in C and Python, have systems > programming experience with Minix(filesystem development) and Linux and am > a hobbyist reverse engineer(I play CTF security exercises) -- and thats > when I use strace the most ! > > I had a look at the ideas list here[1] and found the idea on improved path > decoding quite interesting and was hoping we could discuss it further on > the mailing list. > > I had a quick look at the implementation of the -y flag and noticed the > implementation of getfdpath(where the magic seemed to be happening). It > seemed to be trying to read the value of the symbolic link at > /proc/<pid>/fd/<fd>. > > Is my understanding of the following accurate? Yes, getfdpath() is the single point where strace fetches the path corresponding to the given descriptor. Besides that, printfd() is the single point where this fetched path is currently printed. The path fetched by getfdpath() is also used by fdmatch() to implement -P option. > Modifications need to be made such that upon using the "yy" flag:- > - Calls to functions that take a path as an argument are displayed with the > absolute path regardless of the argument that is passed in. I suppose path arguments should be printed the same way as they are printed now, and, in addition, if -yy mode is specified, canonicalized paths should be printed in <> form. > - When calls to functions that return a file descriptor are made, the > absolute path to the filename corresponding to the file descriptor needs to > be printed I think yes, probably by means of printfd(). > - Same as above for functions that use path/descriptor combos. Right. In addition, there are at* functions: to canonicalize their path arguments, dirfd argument also has to be taken into account. 
> I believe that the first step would be to document and note down the system > calls that belong to one or more of the above categories and their system > call numbers, and if the -yy flag is used, check the tcp->scno against > these numbers and act accordingly. > > Is there something I'm missing? I'd love any kind of feedback! You probably don't need to care about tcp->scno to implement -yy mode. First, since all descriptors are (or should be) printed using printfd(), you can just use it to decode returned descriptors. Second, since all paths are printed using printpath() or printpathn(), it should be enough to extend this scheme for decoding path arguments of regular and at* functions in -yy mode. Third, -yy mode should extend -P functionality, which means that path argument matching should work with canonicalized pathnames. Fourth, I think -yy should also "canonicalize" socket descriptors, i.e. print their addresses in <> form, resembling lsof(8) output. This would be a really nice feature. -- ldv |
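To make the two lookups discussed above concrete, here is a minimal Python sketch (not strace code, which is C). fd_path() mirrors what the thread says getfdpath() does, namely reading the symlink at /proc/<pid>/fd/<fd>. resolve_at() illustrates why dirfd matters for at* calls. Both function names are hypothetical, and the canonicalization is purely lexical: it does not resolve symlinks the way the kernel would.

```python
import os


def fd_path(pid, fd):
    """Return the target of /proc/<pid>/fd/<fd>, or None if it cannot be read."""
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        # No such process or descriptor, no permission, or no /proc at all.
        return None


def resolve_at(dir_path, cwd, path):
    """Lexically canonicalize `path` as an at* call would interpret it.

    dir_path: already-decoded path of the dirfd argument, or None when the
              call used AT_FDCWD (assumption: the caller made that check).
    cwd:      the traced process's current working directory.
    """
    base = cwd if dir_path is None else dir_path
    # os.path.join discards `base` when `path` is absolute, which matches
    # the at* rule that dirfd is ignored for absolute paths.
    return os.path.normpath(os.path.join(base, path))


if __name__ == "__main__":
    print(resolve_at("/etc", "/home/user", "../usr/lib"))
    print(fd_path(os.getpid(), 0))
```

A real implementation would also have to cope with symlinks inside the path and with a dirfd whose own path cannot be fetched, which this sketch ignores.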
From: eQuiNoX <equ...@gm...> - 2014-02-26 17:35:33
|
Thank you for your reply, Dmitry! On Wed, Feb 26, 2014 at 6:58 AM, Dmitry V. Levin <ld...@al...> wrote: > Hi, > > On Tue, Feb 25, 2014 at 10:27:37PM +0530, Zubin Mithra wrote: > > Hey all, > > > > I'm Zubin and I love low level systems programming! :) > > Great! :) > > > A little about myself, I program primarily in C and Python, have systems > > programming experience with Minix(filesystem development) and Linux and > am > > a hobbyist reverse engineer(I play CTF security exercises) -- and thats > > when I use strace the most ! > > > > I had a look at the ideas list here[1] and found the idea on improved > path > > decoding quite interesting and was hoping we could discuss it further on > > the mailing list. > > > > I had a quick look at the implementation of the -y flag and noticed the > > implementation of getfdpath(where the magic seemed to be happening). It > > seemed to be trying to read the value of the symbolic link at > > /proc/<pid>/fd/<fd>. > > > > Is my understanding of the following accurate? > > Yes, getfdpath() is the single point where strace fetches the path > corresponding to the given descriptor. > > Besides that, printfd() is the single point where this fetched path is > currently printed. The path fetched by getfdpath() is also used by > fdmatch() to implement -P option. > > > Modifications need to be made such that upon using the "yy" flag:- > > - Calls to functions that take a path as an argument are displayed with > the > > absolute path regardless of the argument that is passed in. > > I suppose path arguments should be printed the same way as they are > printed now, and, in addition, if -yy mode is specified, canonicalized > paths should be printed in <> form. > I see, cool stuff. > > > - When calls to functions that return a file descriptor are made, the > > absolute path to the filename corresponding to the file descriptor needs > to > > be printed > > I think yes, probably by means of printfd(). 
> > > - Same as above for functions that use path/descriptor combos. > > Right. > > In addition, there are at* functions: to canonicalize their path arguments, > dirfd argument also has to be taken into account. > > > I believe that the first step would be to document and note down the > system > > calls that belong to one or more of the above categories and their system > > call numbers, and if the -yy flag is used, check the tcp->scno against > > these numbers and act accordingly. > > > > Is there something I'm missing? I'd love any kind of feedback! > > You probably don't need to care about tcp->scno to implement -yy mode. > I should probably go read the source instead -- is there any other way to get information about an intercepted system call other than from the struct tcb? > > First, since all descriptors (should be) printed using printfd(), you just > use it to decode returned descriptors. > > Second, since all paths are printed using printpath() or printpathn(), it > should be enough to extend this scheme for decoding path arguments of > regular and at* functions in -yy mode. > > Third, -yy mode should extend -P functionality, this means that path > arguments matching should work with canonicalized pathnames. > > Forth, I think -yy should also "canonicalize" socket descriptors, i.e. > print their addresses in <> form, resembling lsof(8) output. This would > be a really nice feature. > > Great stuff, thank you very much ! Also, is there an IRC channel for strace dev? Cheers, zm |
From: Philippe O. <pom...@ne...> - 2014-03-02 11:07:19
|
On Wed, Feb 26, 2014 at 6:35 PM, eQuiNoX <equ...@gm...> wrote: >> On Tue, Feb 25, 2014 at 10:27:37PM +0530, Zubin Mithra wrote: >> > Hey all, >> > I'm Zubin and I love low level systems programming! :) > Also, is there an IRC channel for strace dev? Not for now, but it could make sense to have one, at least for the duration of GSoC. For now the best approach is to go through the mailing list, which is the single place where everything happens. It may feel a bit archaic, but it is in fact quite efficient, as everything is in one single place. -- Philippe Ombredanne |
From: Philippe O. <pom...@ne...> - 2014-03-02 10:55:44
|
On Wed, Feb 26, 2014 at 2:28 AM, Dmitry V. Levin <ld...@al...> wrote: > Fourth, I think -yy should also "canonicalize" socket descriptors, i.e. > print their addresses in <> form, resembling lsof(8) output. This would > be a really nice feature. Indeed, that would be awesome and would help a lot with tracing network-related calls. I think you meant this lsof output, for instance from a wget invocation: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME wget 3695865 pombredanne 3u IPv4 125707086 0t0 TCP myhost.local:56120->ch3.sourceforge.net:http (ESTABLISHED) How would you report this? Would this work for the example above?: 3<socket:[125707086] IPv4, TCP, "myhost.local:56120", "ch3.sourceforge.net:http"> where: - IPv4 would be the type as reported by lsof(8) - TCP, "myhost.local:56120" and "ch3.sourceforge.net:http" would be the parts of the node name as reported by lsof(8) -- Philippe Ombredanne |
From: Philippe O. <pom...@ne...> - 2014-03-02 11:01:11
|
On Tue, Feb 25, 2014 at 5:57 PM, Zubin Mithra <zub...@gm...> wrote: > Hey all, > I'm Zubin and I love low level systems programming! :) [...] > I had a look at the ideas list here[1] and found the idea on improved path > decoding quite interesting and was hoping we could discuss it further on the > mailing list. Hi Zubin: thanks for your interest in strace and your detailed message and initial investigations! I wonder if the advanced path decoding itself would be large enough to fill a whole 3-month GSoC project. What do you think? While looking at path decoding, are there other areas or ideas you could consider too, such as structured JSON output? -- Philippe Ombredanne |
From: eQuiNoX <equ...@gm...> - 2014-03-03 05:32:07
|
On Sun, Mar 2, 2014 at 4:30 PM, Philippe Ombredanne <pom...@ne...> wrote: > On Tue, Feb 25, 2014 at 5:57 PM, Zubin Mithra <zub...@gm...> wrote: >> Hey all, >> I'm Zubin and I love low level systems programming! :) > [...] >> I had a look at the ideas list here[1] and found the idea on improved path >> decoding quite interesting and was hoping we could discuss it further on the >> mailing list. > > Hi Zubin: > thanks for your interest in strace and your detailed message and > initial investigations! > > I wonder if the advanced path decoding itself would be large enough to > fill a whole 3 month GSOC project > What do you think? I find "Reliable multiarchitecture support" fascinating too but I'm not sure about its scope just yet. I read a thread on the same and need to grok the codebase to get a clearer idea. > > While looking at path decoding is there other areas or ideas you could > consider too such as structured json output? > > -- > Philippe Ombredanne |
From: Dmitry V. L. <ld...@al...> - 2014-03-02 11:44:19
|
On Sun, Mar 02, 2014 at 11:54:57AM +0100, Philippe Ombredanne wrote: > On Wed, Feb 26, 2014 at 2:28 AM, Dmitry V. Levin <ld...@al...> wrote: > > > Fourth, I think -yy should also "canonicalize" socket descriptors, i.e. > > print their addresses in <> form, resembling lsof(8) output. This would > > be a really nice feature. > > Indeed that would be awesome and help a lot with tracing network related calls. > > I think you meant this lsof output, for instance from a wget invocation: > > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > wget 3695865 pombredanne 3u IPv4 125707086 0t0 TCP > myhost.local:56120->ch3.sourceforge.net:http (ESTABLISHED) > > How would you report this? > > Would this work for the example above?: > 3<socket:[125707086] IPv4, TCP, "myhost.local:56120", > "ch3.sourceforge.net:http"> > where: > - IPv4 would be the type as reported by lsof(8) > - TCP, "myhost.local:56120" and "ch3.sourceforge.net:http" would the > parts of the node name as reported by lsof(8) Employing network address resolution in strace is risky because of the potentially huge delays it may cause. By mentioning "lsof" I rather meant "lsof -n". The exact output format may vary, but the general idea of strace decoding is to mimic C syntax. From this PoV, path names should always be enclosed in double quotes. Unfortunately, in the case of -y output this rule hasn't been enforced, so you cannot distinguish a socket with inode 1234567 from a file named "socket:[1234567]". -- ldv |
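The "lsof -n" point can be illustrated with a small Python sketch that formats socket endpoints strictly numerically, so no DNS or service-name lookup can stall the tracer. The helper names and the `fd<PROTO:[local->remote]>` layout are assumptions made up for this example; the thread explicitly leaves the exact output format open.

```python
import socket


def numeric_endpoint(host, port):
    """Format an address as 'host:port' without DNS or service lookups."""
    # NI_NUMERICHOST / NI_NUMERICSERV ask getnameinfo() to return the
    # numeric forms instead of resolving names.
    flags = socket.NI_NUMERICHOST | socket.NI_NUMERICSERV
    name, service = socket.getnameinfo((host, port), flags)
    return f"{name}:{service}"


def describe_socket(fd, proto, local, remote):
    """Build a hypothetical -yy style annotation for a connected socket."""
    return (f"{fd}<{proto}:[{numeric_endpoint(*local)}"
            f"->{numeric_endpoint(*remote)}]>")


if __name__ == "__main__":
    # Addresses from the documentation range 192.0.2.0/24, not real hosts.
    print(describe_socket(3, "TCP", ("192.0.2.1", 56120), ("192.0.2.2", 80)))
```

The same endpoints printed by plain lsof would show host and service names, which is exactly the resolution step being avoided here.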
From: Philippe O. <pom...@ne...> - 2014-03-02 12:19:45
|
On Sun, Mar 2, 2014 at 12:44 PM, Dmitry V. Levin <ld...@al...> wrote: > On Sun, Mar 02, 2014 at 11:54:57AM +0100, Philippe Ombredanne wrote: >> On Wed, Feb 26, 2014 at 2:28 AM, Dmitry V. Levin <ld...@al...> wrote: >> >> > Fourth, I think -yy should also "canonicalize" socket descriptors, i.e. >> > print their addresses in <> form, resembling lsof(8) output. This would >> > be a really nice feature. >> >> Indeed that would be awesome and help a lot with tracing network related calls. >> >> I think you meant this lsof output, for instance from a wget invocation: >> >> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME >> wget 3695865 pombredanne 3u IPv4 125707086 0t0 TCP >> myhost.local:56120->ch3.sourceforge.net:http (ESTABLISHED) >> >> How would you report this? >> >> Would this work for the example above?: >> 3<socket:[125707086] IPv4, TCP, "myhost.local:56120", >> "ch3.sourceforge.net:http"> >> where: >> - IPv4 would be the type as reported by lsof(8) >> - TCP, "myhost.local:56120" and "ch3.sourceforge.net:http" would the >> parts of the node name as reported by lsof(8) > > Employing network address resolving in strace is risky because of > potentially huge delays it may cause. > By mentioning "lsof" I rather meant "lsof -n". Good point, an lsof-like reverse DNS lookup would be a performance killer and fragile too. I always wondered why this is the default for lsof, btw. > The exact output format may vary, but the general idea of strace decoding > is to mimic C syntax. From this PoV, path names should always be enclosed > in double quotes. Unfortunately, in case of -y output this rule hasn't > been enforced, so you cannot distinguish a socket with inode 1234567 with > a file named "socket:[1234567]". Someone naming a file "socket:[1234567]" ought to get at least a good slap on the wrist :D On the other hand, this could be a mistake, and one that strace could help debug and find. Now, could this be something that could be fixed as part of advanced path decoding? 
or is it really worth adding quotes around all decoded "real" paths to care for such a rare case? -- Philippe Ombredanne |
From: Dmitry V. L. <ld...@al...> - 2014-03-02 16:07:56
|
On Sun, Mar 02, 2014 at 01:18:57PM +0100, Philippe Ombredanne wrote: > On Sun, Mar 2, 2014 at 12:44 PM, Dmitry V. Levin <ld...@al...> wrote: [...] > > The exact output format may vary, but the general idea of strace decoding > > is to mimic C syntax. From this PoV, path names should always be enclosed > > in double quotes. Unfortunately, in case of -y output this rule hasn't > > been enforced, so you cannot distinguish a socket with inode 1234567 with > > a file named "socket:[1234567]". > > The someone naming a file "socket:[1234567]" ought to get at least a > good slap on the wrist :D Well, strace is a tool for tracing, not for slapping. :) > On the other hand, this could be a mistake and one that strace could > help debug and find > Now could this be something that could be fixed as part of advanced > path decoding? > or is it really worth adding quotes around all decoded "real" paths to > care for such a rare case? Given that names readlinked from /proc/<pid>/fd/<fd> are "real" iff they start with a slash, they actually can be distinguished in the current -y output format. It's just not as obvious as other strace output is. -- ldv |
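The rule stated above, that a name read from /proc/<pid>/fd/<fd> is a "real" path iff it starts with a slash, can be sketched in Python as follows. classify_fd_target() is a hypothetical helper for illustration, and the `kind:[inode]` pattern is an assumption based on the "socket:[1234567]" example in this thread.

```python
import re

# Matches pseudo-names such as "socket:[1234567]" or "pipe:[4242]".
_PSEUDO = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def classify_fd_target(target):
    """Classify a readlink() result from /proc/<pid>/fd/<fd>."""
    if target.startswith("/"):
        # Real files always come back as absolute paths.
        return ("path", target)
    match = _PSEUDO.match(target)
    if match:
        return (match.group("kind"), int(match.group("inode")))
    # Anything else (other kernel-generated names) is left undecoded.
    return ("other", target)


if __name__ == "__main__":
    print(classify_fd_target("socket:[1234567]"))
    print(classify_fd_target("/tmp/socket:[1234567]"))
```

A file that really is named "socket:[1234567]" shows up with its directory prefix, so it lands in the "path" branch and cannot be confused with an actual socket.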
From: Zubin M. <zub...@gm...> - 2014-03-03 05:18:47
|
Hey Philippe and Dmitry, On Sun, Mar 2, 2014 at 4:30 PM, Philippe Ombredanne <pom...@ne...> wrote: > On Tue, Feb 25, 2014 at 5:57 PM, Zubin Mithra <zub...@gm...> wrote: >> Hey all, >> I'm Zubin and I love low level systems programming! :) > [...] >> I had a look at the ideas list here[1] and found the idea on improved path >> decoding quite interesting and was hoping we could discuss it further on the >> mailing list. > > Hi Zubin: > thanks for your interest in strace and your detailed message and > initial investigations! > > I wonder if the advanced path decoding itself would be large enough to > fill a whole 3 month GSOC project > What do you think? Yes, I do agree -- path decoding alone would not be large enough to fill a 3-month GSoC project. Reading the discussion below, the improvements that could be made are the "-yy" feature and the quotes around paths when using the -y flag. > > While looking at path decoding is there other areas or ideas you could > consider too such as structured json output? I just had a second look at the ideas list and the discussions on the mailing list so far. It's quite interesting, and I believe it is something that can fit in with the existing idea. Perhaps the following format makes sense? A call to :- open("/usr/lib/locale/UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT could be represented in JSON as :- { "call_one" : { "fnname" = "open", "arg1" = "\"/usr/lib/locale/UTF-8/LC_CTYPE\"", "arg2" = "O_RDONLY|O_CLOEXEC", "ret" = "-1" } } Of course, the above example is oversimplified (and I'm not sure that's the best way to manage quoting, to be honest; I'll put in some more thought). 
In cases where a struct is passed as an argument, as in the case of a bind call we have, bind(3, {sa_family=AF_INET, sin_port=htons(7171), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 We could have "arg2" set to "{sa_family=AF_INET, sin_port=htons(7171), sin_addr=inet_addr("0.0.0.0")}" but I feel it defeats the purpose as parsing output itself would be more painful. Perhaps something like the following would be nice. { "call_thirteen" : { "fnname" = "bind", "arg1" = "3", "arg2" : { "sa_family" : "AF_INET", "sin_port" : "htons(7171)", "sin_addr" : "inet_addr("0.0.0.0")" }, "ret" = "-1" } } Is that what you had in mind? Thanks zm |
From: Zubin M. <zub...@gm...> - 2014-03-03 05:22:55
|
>> I believe that the first step would be to document and note down the system >> calls that belong to one or more of the above categories and their system >> call numbers, and if the -yy flag is used, check the tcp->scno against >> these numbers and act accordingly. >> >> Is there something I'm missing? I'd love any kind of feedback! > > You probably don't need to care about tcp->scno to implement -yy mode. Just a little something I'd like to clarify -- did you mean I should use tcp->s_ent->sys_name instead? Just to make sure I'm not terribly misunderstanding something. :) Thanks, zm |
From: Philippe O. <pom...@ne...> - 2014-03-03 09:27:10
|
On Mon, Mar 3, 2014 at 6:18 AM, Zubin Mithra <zub...@gm...> wrote: > On Sun, Mar 2, 2014 at 4:30 PM, Philippe Ombredanne > <pom...@ne...> wrote: >> On Tue, Feb 25, 2014 at 5:57 PM, Zubin Mithra <zub...@gm...> wrote: >>> Hey all, >>> I'm Zubin and I love low level systems programming! :) >> [...] >>> I had a look at the ideas list here[1] and found the idea on improved path >>> decoding quite interesting and was hoping we could discuss it further on the >>> mailing list. >> While looking at path decoding is there other areas or ideas you could >> consider too such as structured json output? > > I just had a second look at the ideas list and the discussions on the > mailing list so far. Its quite interesting and I believe something > that can fit in with the existing idea. > > Perhaps the following format makes sense? A call to :- > open("/usr/lib/locale/UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = -1 ENOENT > > could be represented in JSON as :- > { > "call_one" : { > "fnname" = "open", > "arg1" = "\"/usr/lib/locale/UTF-8/LC_CTYPE\"", > "arg2" = "O_RDONLY|O_CLOEXEC", > "ret" = "-1" > } > } Just curious, why would you use call_one, and arg1, arg2 vs. using lists? FWIW the above would not be valid JSON. What about something like: [ {"fnname": "open", "args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"], "ret": "-1" } ] or possibly (not sure which form I like best) a more compact, entirely positional list of lists: [ "open", "-1", [ "/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC" ] ] > Of course, this above example is oversimplified(And I'm not sure thats > the best way to manage quoting to be honest, I'll put in some more > thought). 
I think that in the case of JSON output, double quoting paths would not be desirable, and paths should be returned as a simple JSON string. > In cases where a struct is passed as an argument, as in the case of a > bind call we have, > bind(3, {sa_family=AF_INET, sin_port=htons(7171), > sin_addr=inet_addr("0.0.0.0")}, 16) = 0 > > We could have "arg2" set to "{sa_family=AF_INET, sin_port=htons(7171), > sin_addr=inet_addr("0.0.0.0")}" but I feel it defeats the purpose as > parsing output itself would be more painful. Perhaps something like > the following would be nice. > > { > "call_thirteen" : { > "fnname" = "bind", > "arg1" = "3", > "arg2" : { > "sa_family" : "AF_INET", > "sin_port" : "htons(7171)", > "sin_addr" : "inet_addr("0.0.0.0")" > }, > "ret" = "-1" > } > } This makes sense, but same comment as above: why not use a combination of objects and arrays (in JSON speak)? And why not structure everything that can be structured? I.e. something along these lines? [ { "fnname": "bind", "args": [ "3", { "sa_family": "AF_INET", "sin_port": { "htons": "7171" }, "sin_addr": { "inet_addr": "0.0.0.0" } } ], "ret": "-1" } ] -- Philippe Ombredanne |
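The validity remark is easy to check mechanically. A short Python sketch using the standard json module: the first string is a shortened copy of the `"key" = "value"` form proposed earlier in the thread, and the second is the array-of-objects form suggested in reply. is_valid_json() is a helper name made up for this example.

```python
import json

# Shortened version of the first proposal: '=' is not a JSON separator.
proposed = '{ "call_one" : { "fnname" = "open", "ret" = "-1" } }'

# The array-of-objects form suggested in reply.
revised = (
    '[ {"fnname": "open", '
    '"args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"], '
    '"ret": "-1"} ]'
)


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_json(proposed), is_valid_json(revised))
```

Note that in the revised form the path is a plain JSON string with no extra escaped quotes around it, which matches the point about not double quoting paths.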
From: Dmitry V. L. <ld...@al...> - 2014-03-03 12:56:20
|
On Mon, Mar 03, 2014 at 10:52:48AM +0530, Zubin Mithra wrote: > >> I believe that the first step would be to document and note down the system > >> calls that belong to one or more of the above categories and their system > >> call numbers, and if the -yy flag is used, check the tcp->scno against > >> these numbers and act accordingly. > >> > >> Is there something I'm missing? I'd love any kind of feedback! > > > > You probably don't need to care about tcp->scno to implement -yy mode. > > Just a little something I'd like to clarify -- did you mean I should > use tcp->s_ent->sys_name instead? Just to make sure I'm not terribly > misunderstanding something. :) The way strace decodes each syscall is, in short, this: 1. take a syscall number (tcp->scno); 2. filter out those syscalls that should not be decoded; 3. call the handler assigned to the syscall (tcp->s_ent->sys_func). At the point of syscall decoding where absolute path decoding should be implemented, the syscall handler has already been called, so neither tcp->scno nor tcp->s_ent->sys_func is required for decoding. Only when you are changing the syscall filtering algorithms might you need to know scno/sys_func in advance. Just have a look at the code. :) -- ldv |
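The three steps above can be sketched as a toy dispatcher in Python. This is only an illustration of the control flow, not strace's actual C code: SysEnt, SYSENT, trace_syscall() and the two decoders are hypothetical, and the table numbers are arbitrary illustrative keys. Only the field names sys_name and sys_func are borrowed from the thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class SysEnt:
    sys_name: str
    sys_func: Callable[[List[str]], str]  # per-syscall decoding handler


def decode_open(args):
    # A path-printing helper would be called from inside a handler like this.
    return f'open("{args[0]}", {args[1]})'


def decode_close(args):
    return f"close({args[0]})"


# Illustrative syscall table keyed by syscall number.
SYSENT = {2: SysEnt("open", decode_open), 3: SysEnt("close", decode_close)}


def trace_syscall(scno: int, args: List[str], traced: Set[str]) -> Optional[str]:
    entry = SYSENT.get(scno)                        # 1. look up by number
    if entry is None or entry.sys_name not in traced:
        return None                                 # 2. filtered out
    return entry.sys_func(args)                     # 3. call the handler


if __name__ == "__main__":
    print(trace_syscall(2, ["/etc/passwd", "O_RDONLY"], {"open"}))
```

The design point is that path printing happens inside the handlers, so by the time a path is printed the syscall number has already done its job.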
From: eQuiNoX <equ...@gm...> - 2014-03-04 13:16:23
|
On Mon, Mar 3, 2014 at 6:26 PM, Dmitry V. Levin <ld...@al...> wrote: > On Mon, Mar 03, 2014 at 10:52:48AM +0530, Zubin Mithra wrote: >> >> I believe that the first step would be to document and note down the system >> >> calls that belong to one or more of the above categories and their system >> >> call numbers, and if the -yy flag is used, check the tcp->scno against >> >> these numbers and act accordingly. >> >> >> >> Is there something I'm missing? I'd love any kind of feedback! >> > >> > You probably don't need to care about tcp->scno to implement -yy mode. >> >> Just a little something I'd like to clarify -- did you mean I should >> use tcp->s_ent->sys_name instead? Just to make sure I'm not terribly >> misunderstanding something. :) > > The way how strace decodes each syscall is, shortly speaking, this: > 1. take a syscall number (tcp->scno); > 2. filter out those syscalls that should not be decoded; > 3. call the handler assigned for the syscall (tcp->s_ent->sys_func). > > At the point of syscall decoding where absolute paths decoding should > be implemented, the syscall handler is already called, so neither > tcp->scno nor tcp->s_ent->sys_func is required for decoding. > > Only when you are changing syscall filtering algorithms you may need > to know scno/sys_func in advance. > > Just have a look at the code. :) Already on it, thanks heaps Dmitry! > > > -- > ldv |
From: Zubin M. <zub...@gm...> - 2014-03-04 12:59:33
|
Hey Philippe, > Just curious, why would you use call_one? and arg1,arg2 v.s using lists? I was just wondering if information related to the call sequence might be useful. In quite a few languages, JSON data maps directly to dictionary representations (e.g. Python), but in doing that we'd lose information about the sequence in which the calls occurred once we create a dictionary from the JSON. In such cases having explicit information about the order might be useful. > FWIW the above would not be valid JSON. > What about something like: > [ > {"fnname": "open", > "args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"], > "ret": "-1" > } > ] Cool stuff, thank you! > > or possibly (not sure which form I like best) using a more compact > entirely and positional list of lists: > [ > "open", > "-1", > [ > "/usr/lib/locale/UTF-8/LC_CTYPE", > "O_RDONLY|O_CLOEXEC" > ] > ] > >> Of course, this above example is oversimplified(And I'm not sure thats >> the best way to manage quoting to be honest, I'll put in some more >> thought). I think I like the first one better; it looks cleaner to me (open to suggestions, of course!). > > I think that in the case of a JSON output, double quoting paths would > not be desirable and paths should be returned a simple JSON string Cool stuff, makes sense. > >> In cases where a struct is passed as an argument, as in the case of a >> bind call we have, >> bind(3, {sa_family=AF_INET, sin_port=htons(7171), >> sin_addr=inet_addr("0.0.0.0")}, 16) = 0 >> >> We could have "arg2" set to "{sa_family=AF_INET, sin_port=htons(7171), >> sin_addr=inet_addr("0.0.0.0")}" but I feel it defeats the purpose as >> parsing output itself would be more painful. Perhaps something like >> the following would be nice. 
>> >> { >> "call_thirteen" : { >> "fnname" = "bind", >> "arg1" = "3", >> "arg2" : { >> "sa_family" : "AF_INET", >> "sin_port" : "htons(7171)", >> "sin_addr" : "inet_addr("0.0.0.0")" >> }, >> "ret" = "-1" >> } >> } > > This makes sense but same comment as above: why not using a combo of > objects and arrays (using JSON speak)? > and why not structuring everything that can be? > ie something along these lines? > [ > { > "fnname": "bind", > "args": [ > "3", > { > "sa_family": "AF_INET", > "sin_port": { > "htons": "7171" > }, > "sin_addr": { > "inet_addr": "0.0.0.0" > } > } > ], > "ret": "-1" > } > ] If the call information being added makes sense, it would look something as follows, I believe :- '{"call_one": [{"fnname": "bind", "ret": "-1", "args": ["3", {"sin_port": {"htons": "7171"}, "sin_addr": {"inet_addr": "0.0.0.0"}, "sa_family": "AF_INET"}]}]}' Cheers Zubin |
From: Philippe O. <pom...@ne...> - 2014-03-06 14:32:12
|
On Tue, Mar 4, 2014 at 1:59 PM, Zubin Mithra <zub...@gm...> wrote: > Hey Philippe, > >> Just curious, why would you use call_one? and arg1,arg2 v.s using lists? > > I was just wondering if information related to the call sequence might > be useful. In quite a few languages, JSON data directly maps to > dictionary representations(eg:- Python) -- but upon doing that we'd > lose information about the sequence in which the calls occurred once > we create a dictionary from the JSON. In such cases having explicit > information about the order might be useful. JSON is a spec, so you should not care about how it is interpreted in a given language. JSON has arrays, which are ordered lists, and objects, which are unordered name/value mappings. So when you need an ordered sequence, use an array; please do not make up a map name to track ordering. If you have a name/value mapping and need ordering, wrap it in an array. >> FWIW the above would not be valid JSON. >> What about something like: >> [ >> {"fnname": "open", >> "args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"], >> "ret": "-1" >> } >> ] > > Cool stuff, thank you! > >> >> or possibly (not sure which form I like best) using a more compact >> entirely and positional list of lists: >> [ >> "open", >> "-1", >> [ >> "/usr/lib/locale/UTF-8/LC_CTYPE", >> "O_RDONLY|O_CLOEXEC" >> ] >> ] >> >>> Of course, this above example is oversimplified(And I'm not sure thats >>> the best way to manage quoting to be honest, I'll put in some more >>> thought). > > I think I like the first one better, I think it looks cleaner(open to > suggestions of course!). I do not know which one is better yet, but a simpler array without made-up names where none exist feels much cleaner and less verbose to me, and affects neither the structure nor the readability. Being somewhat compact is not just nice to have but a real feature when tracing, IMHO, in terms of both time and space. 
>> I think that in the case of a JSON output, double quoting paths would >> not be desirable and paths should be returned a simple JSON string > > Cool stuff, makes sense. > >> >>> In cases where a struct is passed as an argument, as in the case of a >>> bind call we have, >>> bind(3, {sa_family=AF_INET, sin_port=htons(7171), >>> sin_addr=inet_addr("0.0.0.0")}, 16) = 0 >>> >>> We could have "arg2" set to "{sa_family=AF_INET, sin_port=htons(7171), >>> sin_addr=inet_addr("0.0.0.0")}" but I feel it defeats the purpose as >>> parsing output itself would be more painful. Perhaps something like >>> the following would be nice. >>> >>> { >>> "call_thirteen" : { >>> "fnname" = "bind", >>> "arg1" = "3", >>> "arg2" : { >>> "sa_family" : "AF_INET", >>> "sin_port" : "htons(7171)", >>> "sin_addr" : "inet_addr("0.0.0.0")" >>> }, >>> "ret" = "-1" >>> } >>> } >> >> This makes sense but same comment as above: why not using a combo of >> objects and arrays (using JSON speak)? >> and why not structuring everything that can be? >> ie something along these lines? >> [ >> { >> "fnname": "bind", >> "args": [ >> "3", >> { >> "sa_family": "AF_INET", >> "sin_port": { >> "htons": "7171" >> }, >> "sin_addr": { >> "inet_addr": "0.0.0.0" >> } >> } >> ], >> "ret": "-1" >> } >> ] > > If the call information being added makes sense, it would look > something as follows, I believe :- > > '{"call_one": [{"fnname": "bind", "ret": "-1", "args": ["3", > {"sin_port": {"htons": "7171"}, "sin_addr": {"inet_addr": "0.0.0.0"}, > "sa_family": "AF_INET"}]}]}' As I said above adding call_one does not make sense and is rather ugly. Use arrays/list not maps/objects for sequences. The top level construct should therefore be a list [] not a map. Where you need a sequence, use a list. And use a map only when needed. Do not make up names. -- Philippe Ombredanne |
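The point about arrays versus made-up map names can be illustrated with a minimal Python sketch. It assumes the layout proposed in this message (a top-level array of call objects with "fnname", "args" and "ret"); that layout is a proposal under discussion in this thread, not an existing strace output format.

```python
import json

# Hypothetical trace in the shape proposed above: a top-level JSON array
# of call objects. Array order is the call order, so no "call_one" keys.
trace_json = """
[
  {"fnname": "open",
   "args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"],
   "ret": "-1"},
  {"fnname": "bind",
   "args": ["3", {"sa_family": "AF_INET",
                  "sin_port": {"htons": "7171"},
                  "sin_addr": {"inet_addr": "0.0.0.0"}}],
   "ret": "-1"}
]
"""

calls = json.loads(trace_json)

# A JSON array is decoded as a Python list, which keeps element order.
call_order = [call["fnname"] for call in calls]

# Nested structures stay addressable without any string parsing.
bind_port = calls[1]["args"][1]["sin_port"]["htons"]
```

Because the sequence lives in the array itself, a consumer never has to sort or interpret key names to recover the order of calls.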
From: eQuiNoX <equ...@gm...> - 2014-03-07 02:39:00
|
On Thu, Mar 6, 2014 at 8:01 PM, Philippe Ombredanne <pom...@ne...> wrote: > On Tue, Mar 4, 2014 at 1:59 PM, Zubin Mithra <zub...@gm...> wrote: >> Hey Philippe, >> >>> Just curious, why would you use call_one? and arg1,arg2 v.s using lists? >> >> I was just wondering if information related to the call sequence might >> be useful. In quite a few languages, JSON data directly maps to >> dictionary representations(eg:- Python) -- but upon doing that we'd >> lose information about the sequence in which the calls occurred once >> we create a dictionary from the JSON. In such cases having explicit >> information about the order might be useful. > > JSON is a spec so you should not care about how it is interpreted in a > given language. > JSON has arrays that are ordered lists and objects that are unordered > name/value mappings. > So when you need an ordered sequence, use an array, please do not make > up a map name to track ordering. > If you have name/values mapping and need ordering, wrap that in an array. Good point, yes, I agree. > > >>> FWIW the above would not be valid JSON. >>> What about something like: >>> [ >>> {"fnname": "open", >>> "args": ["/usr/lib/locale/UTF-8/LC_CTYPE", "O_RDONLY|O_CLOEXEC"], >>> "ret": "-1" >>> } >>> ] >> >> Cool stuff, thank you! >> >>> >>> or possibly (not sure which form I like best) using a more compact >>> entirely and positional list of lists: >>> [ >>> "open", >>> "-1", >>> [ >>> "/usr/lib/locale/UTF-8/LC_CTYPE", >>> "O_RDONLY|O_CLOEXEC" >>> ] >>> ] >>> >>>> Of course, this above example is oversimplified(And I'm not sure thats >>>> the best way to manage quoting to be honest, I'll put in some more >>>> thought). >> >> I think I like the first one better, I think it looks cleaner(open to >> suggestions of course!). > > I do not know which one is better yet but a simpler array without > made-up names when they do not exist feels much cleaner and less > verbose to me, and does not affect the structure nor the readability. 
> Being somewhat compact is not something nice to have but a feature > when tracing IMHO both in terms of time and space. Indeed, I hadn't considered that. > >>> I think that in the case of a JSON output, double quoting paths would >>> not be desirable and paths should be returned a simple JSON string >> >> Cool stuff, makes sense. >> >>> >>>> In cases where a struct is passed as an argument, as in the case of a >>>> bind call we have, >>>> bind(3, {sa_family=AF_INET, sin_port=htons(7171), >>>> sin_addr=inet_addr("0.0.0.0")}, 16) = 0 >>>> >>>> We could have "arg2" set to "{sa_family=AF_INET, sin_port=htons(7171), >>>> sin_addr=inet_addr("0.0.0.0")}" but I feel it defeats the purpose as >>>> parsing output itself would be more painful. Perhaps something like >>>> the following would be nice. >>>> >>>> { >>>> "call_thirteen" : { >>>> "fnname" = "bind", >>>> "arg1" = "3", >>>> "arg2" : { >>>> "sa_family" : "AF_INET", >>>> "sin_port" : "htons(7171)", >>>> "sin_addr" : "inet_addr("0.0.0.0")" >>>> }, >>>> "ret" = "-1" >>>> } >>>> } >>> >>> This makes sense but same comment as above: why not using a combo of >>> objects and arrays (using JSON speak)? >>> and why not structuring everything that can be? >>> ie something along these lines? >>> [ >>> { >>> "fnname": "bind", >>> "args": [ >>> "3", >>> { >>> "sa_family": "AF_INET", >>> "sin_port": { >>> "htons": "7171" >>> }, >>> "sin_addr": { >>> "inet_addr": "0.0.0.0" >>> } >>> } >>> ], >>> "ret": "-1" >>> } >>> ] >> >> If the call information being added makes sense, it would look >> something as follows, I believe :- >> >> '{"call_one": [{"fnname": "bind", "ret": "-1", "args": ["3", >> {"sin_port": {"htons": "7171"}, "sin_addr": {"inet_addr": "0.0.0.0"}, >> "sa_family": "AF_INET"}]}]}' > > As I said above adding call_one does not make sense and is rather > ugly. Use arrays/list not maps/objects for sequences. > The top level construct should therefore be a list [] not a map. > Where you need a sequence, use a list. 
And use a map only when needed. > Do not make up names. Perfect, sounds good to me! I'll modify my GSoC proposal to reflect these changes, thank you! Thanks! zm |
From: Philippe O. <pom...@ne...> - 2014-03-07 08:45:53
|
On Fri, Mar 7, 2014 at 3:38 AM, eQuiNoX <equ...@gm...> wrote:
>> On Tue, Mar 4, 2014 at 1:59 PM, Zubin Mithra <zub...@gm...> wrote:
[...]
> Perfect, sounds good to me! I'll modify my GSoC proposal to reflect
> these changes, thank you!
> Thanks!
> zm

Just to make sure I understand: are Zubin and eQuiNoX the same human ;) ?
I am a bit confused here... If yes, you might want to stick to one
email persona to avoid confusion; if not, ignore this message.
--
Philippe Ombredanne |
From: Zubin M. <zub...@gm...> - 2014-03-07 08:53:16
|
On Fri, Mar 7, 2014 at 2:15 PM, Philippe Ombredanne <pom...@ne...> wrote:
> On Fri, Mar 7, 2014 at 3:38 AM, eQuiNoX <equ...@gm...> wrote:
>>> On Tue, Mar 4, 2014 at 1:59 PM, Zubin Mithra <zub...@gm...> wrote:
> [...]
>> Perfect, sounds good to me! I'll modify my GSoC proposal to reflect
>> these changes, thank you!
>> Thanks!
>> zm
>
> Just to make sure I understand: are Zubin and eQuiNoX the same human ;) ?

Oh yikes! :) Yes, it's me! :)

> I am a bit confused here... If yes, you might want to stick to one
> email persona to avoid confusion; if not, ignore this message.

Sure, sorry about any confusion haha! :)

Cheers,
zm |
From: Mike F. <va...@ge...> - 2014-03-11 05:36:21
Attachments:
signature.asc
|
On Mon 03 Mar 2014 10:26:22 Philippe Ombredanne wrote:
> or possibly (not sure which form I like best) using a more compact
> entirely and positional list of lists:
> [
>   "open",
>   "-1",
>   [
>     "/usr/lib/locale/UTF-8/LC_CTYPE",
>     "O_RDONLY|O_CLOEXEC"
>   ]
> ]

that's a good way to not be future-proof. considering you're already
serializing to json, adding a few extra fields to keep things sane isn't
really going to hurt.

if performance is an issue, that's where a binary output format would come in
rather than making the json so terse as to be a pita to maintain. especially
considering the point is to make an interface that other tools can build on
top of sanely, and breaking the json output straight up isn't useful.

although quibbling over the exact output format doesn't really matter to the
internal design aspects (which is where the majority of the work is going to
be anyways). the strace code base would have a framework to call an output
module and that would take care of the exact output details.
-mike |
From: Philippe O. <pom...@ne...> - 2014-03-11 11:32:28
|
On Tue, Mar 11, 2014 at 6:36 AM, Mike Frysinger <va...@ge...> wrote:
> On Mon 03 Mar 2014 10:26:22 Philippe Ombredanne wrote:
>> or possibly (not sure which form I like best) using a more compact
>> entirely and positional list of lists:
>> [
>>   "open",
>>   "-1",
>>   [
>>     "/usr/lib/locale/UTF-8/LC_CTYPE",
>>     "O_RDONLY|O_CLOEXEC"
>>   ]
>> ]
>
> that's a good way to not be future-proof. considering you're already
> serializing to json, adding a few extra fields to keep things sane isn't
> really going to hurt.

Good point. So let's go with explicit field names, except for things
like "call_one" and "call_two" that really do not make sense.
I.e. the top level structure should likely be a list of calls.

> if performance is an issue, that's where a binary output format would come in
> rather than making the json so terse as to be a pita to maintain. especially
> considering the point is to make an interface that other tools can build on
> top of sanely, and breaking the json output straight up isn't useful.
>
> although quibbling over the exact output format doesn't really matter to the
> internal design aspects (which is where the majority of the work is going to
> be anyways). the strace code base would have a framework to call an output
> module and that would take care of the exact output details.

That is the goal indeed!
--
Philippe Ombredanne |
From: Mike F. <va...@ge...> - 2014-03-11 11:44:26
Attachments:
signature.asc
|
On Tue 11 Mar 2014 12:31:39 Philippe Ombredanne wrote:
> On Tue, Mar 11, 2014 at 6:36 AM, Mike Frysinger <va...@ge...> wrote:
> > On Mon 03 Mar 2014 10:26:22 Philippe Ombredanne wrote:
> >> or possibly (not sure which form I like best) using a more compact
> >> entirely and positional list of lists:
> >> [
> >>   "open",
> >>   "-1",
> >>   [
> >>     "/usr/lib/locale/UTF-8/LC_CTYPE",
> >>     "O_RDONLY|O_CLOEXEC"
> >>   ]
> >> ]
> >
> > that's a good way to not be future-proof. considering you're already
> > serializing to json, adding a few extra fields to keep things sane isn't
> > really going to hurt.
>
> Good point. So let's go with explicit field names, except for things
> like "call_one" and "call_two" that really do not make sense.
> I.e. the top level structure should likely be a list of calls.

true, trying to fabricate indexes with things like "call_one" and "call_two"
doesn't make much sense. i read it as "one call has been made" rather than
"this is the first call" which is why i didn't object right away ;).

the format will need some way of marking suspended/resumed calls and making
sure the other side is able to sanely re-assemble them. and do so across
processes/threads/signals/etc... :).
-mike |
From: Philippe O. <pom...@ne...> - 2014-03-11 14:11:00
|
On Tue, Mar 11, 2014 at 12:44 PM, Mike Frysinger <va...@ge...> wrote:
> the format will need some way of marking suspended/resumed calls

This would apply only when you follow processes with -f, right? -ff has
no unfinished/resumed business.
So we could limit support for a structured output to a -ff output only?
Or would this be too much of a limitation?

> and making
> sure the other side is able to sanely re-assemble them. and do so across
> processes/threads/signals/etc... :).

Note that with -f, reconciliation of unfinished to resumed calls can
be based on the PID and a stack of unfinished calls, and afaik there
should be no ambiguous case of a resumed call that cannot be traced to
its unfinished call correctly this way.
So IMHO even if we support a -f structured output there is nothing
super special needed to reassemble these sanely, as long as the
structured output has the pid, as it should.
We should just make sure that we provide some extra field that states
whether a call is unfinished or resumed.

Sidebar: how to handle structured PIDs with -ff vs. -f:
- in -ff today, the trace does not contain the PID, only the trace
  filename holds the PID.
- with -f the trace contains the PID on each line.

I think we should err on the side of dealing with PIDs in a uniform way
with -f or -ff and always have a pid field in these cases, even if there
might also be a pid in the output filename.

Other suspension cases could be things that happen after some signal
interruptions (even in a -ff trace) such as:

nanosleep({15, 0}, NULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=3948067, si_uid=1000} ---
restart_syscall(<... resuming interrupted call ...>) = 0

I am not sure this (-ff) output above would warrant any special
treatment, since this is a different animal and not a proper
unfinished/resumed case IMHO.

In the case of a -f output, you could end up with the restart_syscall
itself unfinished; this would be regular unfinished/resumed processing
IMHO:

3948469 restart_syscall(<... resuming interrupted call ...> <unfinished ...>
3948468 wait4(-1, <unfinished ...>
3948466 --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=3948067, si_uid=1000} ---
3948466 wait4(-1, <unfinished ...>
3948469 <... restart_syscall resumed> ) = 0

So to recap I suggest:
- always include a pid field in the structured output of -f and -ff
- add a field to support unfinished/resumed business with -f. This
  could be something like a "state" field with possible values of empty,
  unfinished or resumed

A stylistic question would be:
should fields always be included even if empty, in the case of pid or
state? Or should they be present only if asked for?
Aka sparse vs. dense output? I would tend to prefer sparse output without
empty values over dense output with empty values. A dense output with
empty values would still require a parser to test for empties, and sparse
values would require it to test for presence, so in both cases the parser
would have to test something.

--
Philippe Ombredanne |
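The reconciliation described above (match each resumed call to the most recent unfinished call of the same PID) could look roughly like the following Python sketch. The event dictionaries and the "pid", "state", "fnname", "args" and "ret" field names are assumptions taken from the proposals in this thread, not an existing strace format, and `reassemble` is a hypothetical helper.

```python
from collections import defaultdict


def reassemble(events):
    """Pair each "resumed" event with the latest "unfinished" one of its pid."""
    pending = defaultdict(list)  # pid -> stack of unfinished call events
    complete = []
    for ev in events:
        state = ev.get("state")  # sparse output: no "state" means a whole call
        if state == "unfinished":
            pending[ev["pid"]].append(ev)
        elif state == "resumed":
            # Assumes the trace is well formed, i.e. the stack is not empty.
            start = pending[ev["pid"]].pop()
            complete.append({
                "pid": ev["pid"],
                "fnname": start["fnname"],
                "args": start.get("args", []) + ev.get("args", []),
                "ret": ev.get("ret"),
            })
        else:
            complete.append(ev)
    return complete


# Hypothetical events modelled on the -f trace shown above.
events = [
    {"pid": 3948469, "fnname": "restart_syscall", "state": "unfinished"},
    {"pid": 3948468, "fnname": "wait4", "args": ["-1"], "state": "unfinished"},
    {"pid": 3948469, "fnname": "restart_syscall", "state": "resumed", "ret": "0"},
]
calls = reassemble(events)
```

Only the call of PID 3948469 is completed here; the wait4 of PID 3948468 stays on its stack until a matching resumed event arrives.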
From: Mike F. <va...@ge...> - 2014-03-13 09:02:25
Attachments:
signature.asc
|
On Tue 11 Mar 2014 15:10:12 Philippe Ombredanne wrote:
> On Tue, Mar 11, 2014 at 12:44 PM, Mike Frysinger wrote:
> > the format will need some way of marking suspended/resumed calls
>
> This would apply only when you follow processes with -f, right? -ff has
> no unfinished/resumed business.
> So we could limit support for a structured output to a -ff output only?
> Or would this be too much of a limitation?

fundamentally, strace sees two events -- entering and exiting. the fact that
it appears as one call is merely because nothing happened in between and
because it was fast.

it would probably be a bad limitation if we only delivered "whole" results.
especially if someone wanted to write a GUI that used strace as the backend
... when the app hit a sleep() or other long lived call, they wouldn't see the
entry at all until it finished.

also consider that the inputs/outputs could be spread across args. to see
what i mean, run:
  strace sleep 2
see how the syscall & first arg are decoded & displayed, then the system
pauses. after 2 seconds, the 2nd arg and the return value are decoded &
displayed.

so we probably need to also support delivering info about inputs/outputs and
intermingling them.
{
  "syscall": "nanosleep",
  "state": "start",
  "arg0": {
    "dir": "in",
    "raw": "0x12345",
    "cook": "{1, 0}"
  }
}
...
{
  "syscall": "nanosleep",
  "state": "finish",
  "arg1": {
    "dir": "out",
    "raw": "0x6789",
    "cook": "{0, 0}"
  },
  "ret": "0"
}

that'd probably be fine for a first cut. longer term, we'd probably want to
consider delivering the results broken out even more:
  "bake": {
    "tv_sec": "1",
    "tv_nsec": "0"
  }
and instead of packing everything as strings, actually deliver them as
numbers:
  "bake": {
    "tv_sec": 1,
    "tv_nsec": 0
  }

we'd probably have a number of extended options for controlling the output,
like dumping the syscall table so people can ingest it easily for GUI
creation.

> > and making
> > sure the other side is able to sanely re-assemble them. and do so across
> > processes/threads/signals/etc... :).
>
> Note that with -f, reconciliation of unfinished to resumed calls can
> be based on the PID and a stack of unfinished calls, and afaik there
> should be no ambiguous case of a resumed call that cannot be traced to
> its unfinished call correctly this way.
> So IMHO even if we support a -f structured output there is nothing
> super special needed to reassemble these sanely, as long as the
> structured output has the pid, as it should.
> We should just make sure that we provide some extra field that states
> whether a call is unfinished or resumed.

that probably should be sufficient since i think that's how strace is
operating internally already.

> Sidebar: how to handle structured PIDs with -ff vs. -f:
> - in -ff today, the trace does not contain the PID, only the trace
>   filename holds the PID.
> - with -f the trace contains the PID on each line.
>
> I think we should err on the side of dealing with PIDs in a uniform way
> with -f or -ff and always have a pid field in these cases, even if there
> might be

sounds fine

> Other suspension cases could be things that happen after some signal
> interruptions (even in a -ff trace) such as:
>
> nanosleep({15, 0}, NULL) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=3948067, si_uid=1000} ---
> restart_syscall(<... resuming interrupted call ...>) = 0

this isn't specific to -f or -ff. you'll always see this behavior.
  strace sleep 1000
now in a diff term, do:
  kill -STOP `pidof sleep`
  kill -CONT `pidof sleep`

further consider what happens if you have a custom signal handler that makes
syscalls. when it returns, the original syscall will be resumed. or if you
have nested signals.

this is another reason why a flat array won't really work ... strace also
needs to pass along event information that isn't syscall related, like
signals.

> - always include a pid field in the structured output of -f and -ff
> - add a field to support unfinished/resumed business with -f. This
>   could be something like a "state" field with possible values of empty,
>   unfinished or resumed

i think it'd go even deeper. every syscall (independent of -f/-ff) would have
a state field.
  state: {start, resume, finish}

> A stylistic question would be:
> should fields always be included even if empty, in the case of pid or
> state? Or should they be present only if asked for?
> Aka sparse vs. dense output? I would tend to prefer sparse output without
> empty values over dense output with empty values. A dense output with
> empty values would still require a parser to test for empties, and sparse
> values would require it to test for presence, so in both cases the parser
> would have to test something.

sparse should be fine. any sane JSON parser can deal with this easily.
-mike |
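As a rough illustration of how a consumer might handle the split start/finish records and sparse fields discussed in this message, here is a small Python sketch. The record layout ("syscall", "state", "argN" with "dir"/"raw"/"cook", "ret") follows the sketch in the message above and is an assumption, not an implemented strace format; `merge_records` is a hypothetical helper.

```python
import json

# Hypothetical start/finish records for one nanosleep call.
start = json.loads("""
{"syscall": "nanosleep", "state": "start",
 "arg0": {"dir": "in", "raw": "0x12345", "cook": "{1, 0}"}}
""")
finish = json.loads("""
{"syscall": "nanosleep", "state": "finish",
 "arg1": {"dir": "out", "raw": "0x6789", "cook": "{0, 0}"},
 "ret": "0"}
""")


def merge_records(records):
    """Fold the records of one call into a single dict, args keyed by index."""
    call = {"args": {}}
    for rec in records:
        for key, value in rec.items():
            if key.startswith("arg"):
                call["args"][int(key[3:])] = value  # "arg1" -> index 1
            elif key != "state":
                call[key] = value
    return call


merged = merge_records([start, finish])

# Sparse output: an absent field is tested for presence, not for emptiness.
pid = merged.get("pid")
```

With sparse output, `dict.get` returns `None` for the missing "pid" field, so the presence test costs the consumer a single lookup.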