From: Jiri J. <jja...@re...> - 2014-07-28 13:33:32
Hello Linda & others,

I've been doing some syscall work recently and while doing it,
I decided to "clean up" the do_ipc and do_socketcall wrappers, which
seemed like a duplicated functionality, since they call exactly the same
library functions as normal do_* wrappers. Also, I really wanted to get
rid of the ipc headerhack. :)

This was somewhat amplified by the fact that the code in ipc_common.c
was over-shared, meaning that i.e. semget was using flags for semctl
or semop. So I made a series of ~6 commits, carefully moving all the
functionality from ipc_common.c into separate wrappers and removing the
do_ipc wrapper. I did the same for do_socketcall, which just calls bind,
using a library function, like do_bind. All this with removing
respective sections from syscalls/*.conf, of course.

I was quite happy with the series, since - functionality wise - it was
transparent. However when I tested it on MODE=32, the syscalls bucket
started throwing ERRORs. Some investigation uncovered that the syscalls
bucket was actually using these "duplicated" wrappers for proper
auditing - because auditctl works with real syscalls, not libc
functions. The extra wrappers were therefore nothing more than a name,
simplifying logic in the syscalls bucket.

This goes against some other approaches used in the suite - in the
network bucket, for example, which - based on the architecture - selects
proper syscall name for auditctl, while still calling the original
syscall wrapper (which uses library functions).

-----------------------------------------------------------------------

This led me into a certain design question I'd like to ask here; how to
design syscall wrappers and the execution and auditing infrastructure
around them? What would be the best approach?

I've identified 3 most obvious ways to write a syscall wrapper:

A) use syscall(__NR_syscallname, ...) directly, bypassing libc
B) use libc functions
C) use (A), but simulate libc using #ifdefs manually

These approaches solve the "which syscalls to run where" problem
somewhat differently and therefore have different benefits and
drawbacks in various situations:

1) compile time

  (A) is a bit problematic, since we would need to come up with a full
      logic of what syscalls to compile on what architectures. We already
      do that on some level, but this option calls for a separate mapping
      file (or so) instead of simple Makefile-based conditions,
      integrating it somehow into the make system.

  (B) and (C) are really easy - the libc (or custom #ifdefs) take care
      of which syscall should be called on which architecture. (C) has
      the disadvantage of actually doing the mapping from (A), just in
      a less visible way.

2) run (execution) time

  (A) again needs to re-use the mapping file (or logic) in most cases,
      since - if we want to call i.e. open-like syscall, we need to call
      explicitly either do_open (non-arm) or do_openat (arm)

  (B) and (C) simplify this case a lot, but may fall short if we
      *really* want to call openat instead of open on non-arm

3) auditing time

  (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_*
      wrappers. It still needs to re-use the mapping file (and therefore
      needs to manually specify which syscalls to audit on which archs),
      but it's very straightforward and clear.

  (B) and (C) have a significant problem here - we don't know which
      syscalls are being called "under the hood" of the do_* wrappers, so
      we need to try them out and then create per-arch hacks in the code
      similar to what we've seen in the arm patchset recently, i.e.
      "set up auditing for fchmodat and call do_chmod", which can be
      somewhat confusing. The other option is to duplicate wrappers like
      do_socketcall, but the per-arch hacks still persist. We could create
      a mapping file for them, but then we might as well use (A).


So what would be the best approach for new (and existing, over time)
syscall wrappers?

I personally really like (A) due to its clean design - there are no
"Note: There is no glibc wrapper for this system call" exceptions
and it's clear what syscalls run on which architectures and 32/64bit
variations. The mapping file, with its helper bash functions doing i.e.
"is this syscall relevant for current arch/mode?" or "list all relevant
syscalls for current arch/mode", along with some documentation, should
be mostly easy to implement.
A practical example in the syscalls bucket would be checking for arch
relevancy in the `+' function with no per-arch or per-mode conditions
in the .conf files. The rollup log (or run.bash --list) would then show
which syscalls were actually run.

This also means that I would have to throw away most of my series on
do_ipc and do_socketcall, possibly re-implementing them in the future,
but I'm fine with that.

What's your opinion?

Thanks,
Jiri


PS: I'm asking because we currently have ~70 new syscall wrappers staged
    for review and there's not much consistency in terms of (A) vs (B)
    vs (C).
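To make the difference between options (A) and (B) concrete, here is a minimal sketch of an open-like wrapper written both ways. The function names `open_direct` and `open_libc` are hypothetical and are not taken from the suite; `openat` is used for the direct variant only because, to my knowledge, `__NR_openat` is defined on the Linux ABIs discussed in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Option (A): name the kernel entry point explicitly and bypass the libc
 * wrapper. An audit rule for this call would name "openat", nothing else.
 * open_direct is a hypothetical name used only for this illustration. */
int open_direct(const char *path, int flags)
{
    assert(path != NULL);
    return (int)syscall(SYS_openat, AT_FDCWD, path, flags, 0);
}

/* Option (B): call the libc function and let libc decide which syscall
 * backs open() on the current arch/mode. The audit rule then has to name
 * whatever libc picked, which this code does not reveal. */
int open_libc(const char *path, int flags)
{
    assert(path != NULL);
    return open(path, flags);
}
```

Both functions return a file descriptor on success and -1 on failure. The point of the sketch is only that (A) ties the wrapper name to one syscall, while (B) hides the choice inside libc.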
From: Linda K. <lin...@hp...> - 2014-07-28 16:39:19
Hi Jiri,

On 07/28/2014 09:33 AM, Jiri Jaburek wrote:
> Hello Linda & others,
>
> I've been doing some syscall work recently and while doing it,
> I decided to "clean up" the do_ipc and do_socketcall wrappers, which
> seemed like a duplicated functionality, since they call exactly the same
> library functions as normal do_* wrappers. Also, I really wanted to get
> rid of the ipc headerhack. :)

That's a good goal. :-)

> This was somewhat amplified by the fact that the code in ipc_common.c
> was over-shared, meaning that i.e. semget was using flags for semctl
> or semop. So I made a series of ~6 commits, carefully moving all the
> functionality from ipc_common.c into separate wrappers and removing the
> do_ipc wrapper. I did the same for do_socketcall, which just calls bind,
> using a library function, like do_bind. All this with removing
> respective sections from syscalls/*.conf, of course.
>
> I was quite happy with the series, since - functionality wise - it was
> transparent. However when I tested it on MODE=32, the syscalls bucket
> started throwing ERRORs.

Right. The 32-bit x86 syscalls add a lot of complexity.
Did you see any problems on non-x86 architectures?

> Some investigation uncovered that the syscalls
> bucket was actually using these "duplicated" wrappers for proper
> auditing - because auditctl works with real syscalls, not libc
> functions. The extra wrappers were therefore nothing more than a name,
> simplifying logic in the syscalls bucket.
>
> This goes against some other approaches used in the suite - in the
> network bucket, for example, which - based on the architecture - selects
> proper syscall name for auditctl, while still calling the original
> syscall wrapper (which uses library functions).
>
> -----------------------------------------------------------------------
>
> This led me into a certain design question I'd like to ask here; how to
> design syscall wrappers and the execution and auditing infrastructure
> around them? What would be the best approach?
>
> I've identified 3 most obvious ways to write a syscall wrapper:
>
> A) use syscall(__NR_syscallname, ...) directly, bypassing libc
> B) use libc functions
> C) use (A), but simulate libc using #ifdefs manually

Today we use both A) and B), depending on the syscall. B) is easiest
from a coding perspective. A) is sometimes necessary because libc
might not actually be using the syscall we want in the mode we want
or may be doing error checking of its own that prevents some of the cases
we want to test. I'm not sure I understand C).

> These approaches solve the "which syscalls to run where" problem
> somewhat differently and therefore have different benefits and
> drawbacks in various situations:
>
> 1) compile time
>
>   (A) is a bit problematic, since we would need to come up with a full
>       logic of what syscalls to compile on what architectures. We already
>       do that on some level, but this option calls for a separate mapping
>       file (or so) instead of simple Makefile-based conditions,
>       integrating it somehow into the make system.

Right - we seem to be reinventing libc.

>   (B) and (C) are really easy - the libc (or custom #ifdefs) take care
>       of which syscall should be called on which architecture. (C) has
>       the disadvantage of actually doing the mapping from (A), just in
>       a less visible way.
>
> 2) run (execution) time
>
>   (A) again needs to re-use the mapping file (or logic) in most cases,
>       since - if we want to call i.e. open-like syscall, we need to call
>       explicitly either do_open (non-arm) or do_openat (arm)

And in some cases, we need to do both when the arch supports both.

>   (B) and (C) simplify this case a lot, but may fall short if we
>       *really* want to call openat instead of open on non-arm

We do that today. We test both open and openat using the do_* programs.

> 3) auditing time
>
>   (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_*
>       wrappers. It still needs to re-use the mapping file (and therefore
>       needs to manually specify which syscalls to audit on which archs),
>       but it's very straightforward and clear.
>
>   (B) and (C) have a significant problem here - we don't know which
>       syscalls are being called "under the hood" of the do_* wrappers,

Well, we do for all the existing calls.

>       so
>       we need to try them out and then create per-arch hacks in the code
>       similar to what we've seen in the arm patchset recently, i.e.
>       "set up auditing for fchmodat and call do_chmod", which can be
>       somewhat confusing. The other option is to duplicate wrappers like
>       do_socketcall, but the per-arch hacks still persist. We could create
>       a mapping file for them, but then we might as well use (A).
>
>
> So what would be the best approach for new (and existing, over time)
> syscall wrappers?
>
> I personally really like (A) due to its clean design - there are no
> "Note: There is no glibc wrapper for this system call" exceptions
> and it's clear what syscalls run on which architectures and 32/64bit
> variations. The mapping file, with its helper bash functions doing i.e.
> "is this syscall relevant for current arch/mode?" or "list all relevant
> syscalls for current arch/mode", along with some documentation, should
> be mostly easy to implement.
> A practical example in the syscalls bucket would be checking for arch
> relevancy in the `+' function with no per-arch or per-mode conditions
> in the .conf files. The rollup log (or run.bash --list) would then show
> which syscalls were actually run.

I don't think there is one "best approach". Some of what we have today
is because the suite has evolved over time but some of it is because for
the most part, using the libc functions is simple and appropriate.
Where it's not simple or appropriate, we drop back to the syscall(__NR_...)
function. Sometimes due to legacy syscalls, like the multiplexed 32-bit
syscalls, it gets a bit complicated but hey, we've written that code already. :-)

> This also means that I would have to throw away most of my series on
> do_ipc and do_socketcall, possibly re-implementing them in the future,
> but I'm fine with that.
>
> What's your opinion?

If it ain't broke....?

> Thanks,
> Jiri
>
>
> PS: I'm asking because we currently have ~70 new syscall wrappers staged
>     for review and there's not much consistency in terms of (A) vs (B)
>     vs (C).

I'm interested in knowing more about your new syscalls and why you have
a mix of A), B) and C) (which I still don't understand but I'm sure a simple
example would clear things up). Is it because 3 different people wrote them
or because there really is no "best approach"?

I'm not trying to say that we have to keep doing things the way we've always done them.
This suite has clearly evolved and improved as you and others have worked on it so I'd
like to continue this discussion.

-- ljk

> _______________________________________________
> Audit-test-developer mailing list
> Aud...@li...
> https://lists.sourceforge.net/lists/listinfo/audit-test-developer
From: Jiri J. <jja...@re...> - 2014-07-29 11:22:27
On 07/28/2014 06:39 PM, Linda Knippers wrote:
> Hi Jiri,
>
> On 07/28/2014 09:33 AM, Jiri Jaburek wrote:
>> Hello Linda & others,
>>
>> I've been doing some syscall work recently and while doing it,
>> I decided to "clean up" the do_ipc and do_socketcall wrappers, which
>> seemed like a duplicated functionality, since they call exactly the same
>> library functions as normal do_* wrappers. Also, I really wanted to get
>> rid of the ipc headerhack. :)
>
> That's a good goal. :-)
>
>> This was somewhat amplified by the fact that the code in ipc_common.c
>> was over-shared, meaning that i.e. semget was using flags for semctl
>> or semop. So I made a series of ~6 commits, carefully moving all the
>> functionality from ipc_common.c into separate wrappers and removing the
>> do_ipc wrapper. I did the same for do_socketcall, which just calls bind,
>> using a library function, like do_bind. All this with removing
>> respective sections from syscalls/*.conf, of course.
>>
>> I was quite happy with the series, since - functionality wise - it was
>> transparent. However when I tested it on MODE=32, the syscalls bucket
>> started throwing ERRORs.
>
> Right. The 32-bit x86 syscalls add a lot of complexity.
> Did you see any problems on non-x86 architectures?

I didn't test other architectures, but the code suggests that all 32bit
variants are affected, for ipc(2) at least.

>
>> Some investigation uncovered that the syscalls
>> bucket was actually using these "duplicated" wrappers for proper
>> auditing - because auditctl works with real syscalls, not libc
>> functions. The extra wrappers were therefore nothing more than a name,
>> simplifying logic in the syscalls bucket.
>>
>> This goes against some other approaches used in the suite - in the
>> network bucket, for example, which - based on the architecture - selects
>> proper syscall name for auditctl, while still calling the original
>> syscall wrapper (which uses library functions).
>>
>> -----------------------------------------------------------------------
>>
>> This led me into a certain design question I'd like to ask here; how to
>> design syscall wrappers and the execution and auditing infrastructure
>> around them? What would be the best approach?
>>
>> I've identified 3 most obvious ways to write a syscall wrapper:
>>
>> A) use syscall(__NR_syscallname, ...) directly, bypassing libc
>> B) use libc functions
>> C) use (A), but simulate libc using #ifdefs manually
>
> Today we use both A) and B), depending on the syscall. B) is easiest
> from a coding perspective. A) is sometimes necessary because libc
> might not actually be using the syscall we want in the mode we want
> or may be doing error checking of its own that prevents some of the cases
> we want to test. I'm not sure I understand C).

(C) for do_chmod could look like

    #ifdef ARM
    exitval = syscall(__NR_fchmodat, ...);
    #else
    exitval = syscall(__NR_chmod, ...);
    #endif

essentially simulating glibc in a controlled manner.

>
>> These approaches solve the "which syscalls to run where" problem
>> somewhat differently and therefore have different benefits and
>> drawbacks in various situations:
>>
>> 1) compile time
>>
>>   (A) is a bit problematic, since we would need to come up with a full
>>       logic of what syscalls to compile on what architectures. We already
>>       do that on some level, but this option calls for a separate mapping
>>       file (or so) instead of simple Makefile-based conditions,
>>       integrating it somehow into the make system.
>
> Right - we seem to be reinventing libc.

Please note that (A) isn't about reinventing libc - it isn't about
providing abstractions for syscalls, since it's these abstractions that
later cause "hacky" auditing code in tests.

>
>>   (B) and (C) are really easy - the libc (or custom #ifdefs) take care
>>       of which syscall should be called on which architecture. (C) has
>>       the disadvantage of actually doing the mapping from (A), just in
>>       a less visible way.
>>
>> 2) run (execution) time
>>
>>   (A) again needs to re-use the mapping file (or logic) in most cases,
>>       since - if we want to call i.e. open-like syscall, we need to call
>>       explicitly either do_open (non-arm) or do_openat (arm)
>
> And in some cases, we need to do both when the arch supports both.
>
>>   (B) and (C) simplify this case a lot, but may fall short if we
>>       *really* want to call openat instead of open on non-arm
>
> We do that today. We test both open and openat using the do_* programs.
>
>> 3) auditing time
>>
>>   (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_*
>>       wrappers. It still needs to re-use the mapping file (and therefore
>>       needs to manually specify which syscalls to audit on which archs),
>>       but it's very straightforward and clear.
>>
>>   (B) and (C) have a significant problem here - we don't know which
>>       syscalls are being called "under the hood" of the do_* wrappers,
>
> Well, we do for all the existing calls.

*We* do, but the tests don't, which is why we have to create ad-hoc
conditions to tell them.

>
>>       so
>>       we need to try them out and then create per-arch hacks in the code
>>       similar to what we've seen in the arm patchset recently, i.e.
>>       "set up auditing for fchmodat and call do_chmod", which can be
>>       somewhat confusing. The other option is to duplicate wrappers like
>>       do_socketcall, but the per-arch hacks still persist. We could create
>>       a mapping file for them, but then we might as well use (A).
>>
>>
>> So what would be the best approach for new (and existing, over time)
>> syscall wrappers?
>>
>> I personally really like (A) due to its clean design - there are no
>> "Note: There is no glibc wrapper for this system call" exceptions
>> and it's clear what syscalls run on which architectures and 32/64bit
>> variations. The mapping file, with its helper bash functions doing i.e.
>> "is this syscall relevant for current arch/mode?" or "list all relevant
>> syscalls for current arch/mode", along with some documentation, should
>> be mostly easy to implement.
>> A practical example in the syscalls bucket would be checking for arch
>> relevancy in the `+' function with no per-arch or per-mode conditions
>> in the .conf files. The rollup log (or run.bash --list) would then show
>> which syscalls were actually run.
>
> I don't think there is one "best approach". Some of what we have today
> is because the suite has evolved over time but some of it is because for
> the most part, using the libc functions is simple and appropriate.
> Where it's not simple or appropriate, we drop back to the syscall(__NR_...)
> function. Sometimes due to legacy syscalls, like the multiplexed 32-bit
> syscalls, it gets a bit complicated but hey, we've written that code already. :-)

I agree that using libc is often better and it would be perhaps cleaner
if auditctl supported syscall translation (i.e. chmod->fchmodat on arm),
but it doesn't, it needs a direct syscall name for a given arch / mode.

My points for (A) go mostly towards the auditing code, see the current
state of augrok_default and auwatch_default in network/run.conf.
Or the per-arch conditions in syscalls/*-run.conf.

With a unified way of telling which syscalls are relevant for which
architectures, the conditions in syscalls/ could go away. The entire
case/esac structure in network/ could as well, since $syscall would
always be the tested syscall and no other.

The "unified way" doesn't have to be a static file, it can be generated
using gcc from unistd.h dynamically.

Also, "we've written that code already" doesn't mean there won't be more
of it in the future (well, near future).

>
>> This also means that I would have to throw away most of my series on
>> do_ipc and do_socketcall, possibly re-implementing them in the future,
>> but I'm fine with that.
>>
>> What's your opinion?
>
> If it ain't broke....?

Right, I was wondering whether a "clean" solution wouldn't be
overkill, since there aren't that many differences aside from do_ipc
and do_socketcall. For me, it's just another cleanup, like the netfilter
or networking-related code.

>
>> Thanks,
>> Jiri
>>
>>
>> PS: I'm asking because we currently have ~70 new syscall wrappers staged
>>     for review and there's not much consistency in terms of (A) vs (B)
>>     vs (C).
>
> I'm interested in knowing more about your new syscalls and why you have
> a mix of A), B) and C) (which I still don't understand but I'm sure a simple
> example would clear things up). Is it because 3 different people wrote them
> or because there really is no "best approach"?

Various reasons, really. They were written mostly by a single person:
(A) was used where using (B) would pull in another suite dependency (NUMA),
(B) was used where (A) would result in two independent wrappers (fstatat
vs newfstatat), and there are various mixes of both. So yes, it's because
there's no defined policy of "best approach".

>
> I'm not trying to say that we have to keep doing things the way we've always done them.
> This suite has clearly evolved and improved as you and others have worked on it so I'd
> like to continue this discussion.

Sure, I'd like to get as much feedback as possible before implementing
anything (or deciding to leave it be in the current state).

>
> -- ljk

Jiri
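The idea of deriving the "which syscalls are relevant here" mapping from the compiler's view of unistd.h, rather than from a hand-written file, could be sketched roughly as below. This is only an illustration under assumptions: the table contents, the `sc_entry` type and the `sc_lookup` helper are hypothetical and do not exist in the suite, and only a handful of the syscalls mentioned in this thread are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>

/* Hypothetical table row: syscall name as auditctl would expect it, plus
 * the number the headers assign for the ABI being compiled. */
struct sc_entry {
    const char *name;
    long nr;
};

/* Each row exists only if the headers define the syscall for this
 * arch/mode, so building with a different ABI (for example a 32-bit
 * build) produces a different table without any hand-written conditions. */
static const struct sc_entry sc_table[] = {
#ifdef __NR_open
    { "open", __NR_open },
#endif
#ifdef __NR_openat
    { "openat", __NR_openat },
#endif
#ifdef __NR_chmod
    { "chmod", __NR_chmod },
#endif
#ifdef __NR_fchmodat
    { "fchmodat", __NR_fchmodat },
#endif
#ifdef __NR_ipc
    { "ipc", __NR_ipc },
#endif
#ifdef __NR_socketcall
    { "socketcall", __NR_socketcall },
#endif
#ifdef __NR_bind
    { "bind", __NR_bind },
#endif
};

/* Answers "is this syscall relevant for the current arch/mode?":
 * returns the syscall number, or -1 if the headers do not define it. */
long sc_lookup(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(sc_table) / sizeof(sc_table[0]); i++) {
        if (strcmp(sc_table[i].name, name) == 0)
            return sc_table[i].nr;
    }
    return -1;
}
```

A small helper binary built from such a table could be queried from the bash side of the suite. Note that a syscall being defined in the headers only says the kernel ABI has it, not that libc uses it, which is exactly the gap discussed above for (B).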
From: Jiri J. <jja...@re...> - 2014-08-01 12:12:10
On 07/29/2014 01:22 PM, Jiri Jaburek wrote:
> Sure, I'd like to get as much feedback as possible before implementing
> anything (or deciding to leave it be in the current state).
>

I made some proof-of-concept attempts regarding the automatic build of
only relevant syscalls and I'm not sure anymore whether it's a good idea.

The thing is - it's too "smart", which would be beneficial for something
like LTP (which has TCONF), but not really for our suite, where we want
things to be as static as possible. IOW the way I understood it, we don't
have any advanced output-checking logic that would guarantee that all the
required/expected syscall tests were run - this is guaranteed by the
(static) logic itself.

Using a static list both as a replacement for utils/bin/Makefile and for
execution/auditing could work, but I'm not really sure of its format and
whether it wouldn't go against KISS.

Meaning that, sometimes, leaving things in apparent disorder might be the
best solution. The implication being that I'll stick with (B), i.e. using
library functions where possible, ie. fstatat() instead of newfstatat(),
etc.

This also means that I can finish that ipc/socketcall patchseries. I'm
still undecided whether to do it network-style (add ie. do_msgget, call
it, but audit syscall==ipc on related arch/modes) or whether to leave
do_ipc in place and just make it a special case (calling bodies of the
"normal" do_* ipc wrappers) instead of a feature.

Jiri
|
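The "network-style" option described above (call do_msgget, but audit ipc where the call is multiplexed) could be sketched roughly as follows. This is only an illustration: the function name audited_syscall is hypothetical, and the arch/mode table merely encodes what is said in this thread (32-bit modes, and 64-bit PPC/S390, go through ipc(2)); it would need to be checked against the real kernels.

```shell
# Hypothetical helper: print the syscall name to hand to auditctl for a
# given IPC wrapper, architecture and MODE.
# Assumption (taken from this thread, not verified): every 32-bit mode,
# and 64-bit ppc64/s390x, multiplex SysV IPC through ipc(2).
audited_syscall() {
    wrapper=$1 arch=$2 mode=$3
    case $wrapper in
        msgget|msgctl|msgsnd|msgrcv|semget|semctl|semop|shmget|shmctl|shmat|shmdt)
            if [ "$mode" = 32 ]; then
                echo ipc
            else
                case $arch in
                    ppc64|s390x) echo ipc ;;
                    *) echo "$wrapper" ;;
                esac
            fi
            ;;
        *)
            # Not a SysV IPC wrapper: audit the syscall of the same name.
            echo "$wrapper"
            ;;
    esac
}

audited_syscall msgget x86_64 64
audited_syscall msgget x86_64 32
```

The test would still run do_msgget in every case; only the name passed to auditctl changes, which keeps the per-arch knowledge in one place instead of in each .conf file.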
From: Linda K. <lin...@hp...> - 2014-08-04 19:27:00
|
On 7/29/2014 7:22 AM, Jiri Jaburek wrote:
> On 07/28/2014 06:39 PM, Linda Knippers wrote:
>> Hi Jiri,
>>
>> On 07/28/2014 09:33 AM, Jiri Jaburek wrote:
>>> Hello Linda & others,
>>>
>>> I've been doing some syscall work recently and while doing it,
>>> I decided to "clean up" the do_ipc and do_socketcall wrappers, which
>>> seemed like a duplicated functionality, since they call exactly the same
>>> library functions as normal do_* wrappers. Also, I really wanted to get
>>> rid of the ipc headerhack. :)
>>
>> That's a good goal.:-)
>>
>>> This was somewhat amplified by the fact that the code in ipc_common.c
>>> was over-shared, meaning that ie. semget was using flags for semctl
>>> or semop. So I made a series of ~6 commits, carefully moving all the
>>> functionality from ipc_common.c into separate wrappers and removing the
>>> do_ipc wrapper. I did the same for do_socketcall, which just calls bind,
>>> using a library function, like do_bind. All this with removing
>>> respective sections from syscalls/*.conf, of course.
>>>
>>> I was quite happy with the series, since - functionality wise - it was
>>> transparent. However when I tested it on MODE=32, the syscalls bucket
>>> started throwing ERRORs.
>>
>> Right. The 32-bit x86 syscalls add a lot of complexity.
>> Did you see any problems on non-x86 architectures?
>
> I didn't test other architectures, but the code suggests that all 32bit
> variants are affected, for ipc(2) at least.
>
>>
>>> Some investigation uncovered that the syscalls
>>> bucket was actually using these "duplicated" wrappers for proper
>>> auditing - because auditctl works with real syscalls, not libc
>>> functions. The extra wrappers were therefore nothing more than a name,
>>> simplifying logic in the syscalls bucket.
>>>
>>> This goes against some other approaches used in the suite - in the
>>> network bucket, for example, which - based on the architecture - selects
>>> proper syscall name for auditctl, while still calling the original
>>> syscall wrapper (which uses library functions).
>>>
>>> -----------------------------------------------------------------------
>>>
>>> This led me into a certain design question I'd like to ask here; how to
>>> design syscall wrappers and the execution and auditing infrastructure
>>> around them? What would be the best approach?
>>>
>>> I've identified 3 most obvious ways to write a syscall wrapper:
>>>
>>> A) use syscall(__NR_syscallname, ...) directly, bypassing libc
>>> B) use libc functions
>>> C) use (A), but simulate libc using #ifdefs manually
>>
>> Today we use both A) and B), depending on the syscall. B) is easiest
>> from a coding perspective. A) is sometimes necessary because libc
>> might not actually be using the syscall we want in the mode we want
>> or may be doing error checking of it's own that prevent some of the case
>> we want to test. I'm not sure I understand C).
>
> (C) for do_chmod could look like
>
> #ifdef ARM
> exitval = syscall(__NR_fchmodat, ...);
> #else
> exitval = syscall(__NR_chmod, ...);
> #endif
>
> essentially simulating glibc in a controlled manner.

I thought that was A).

In this example, I'm not sure you'd have an ifdef in do_chmod to map
it to fchmodat on ARM because do_chmod is to test chmod, and there isn't
one for arm. You'd just use do_fchmodat, which exists for other arches too.

>>
>>> These approaches solve the "which syscalls to run where" problem
>>> somewhat differently and therefore have different benefits and
>>> drawbacks in various situations:
>>>
>>> 1) compile time
>>>
>>> (A) is a bit problematic, since we would need to come up with a full
>>> logic of what syscalls to compile on what architectures.
>>> We already do that on some level, but this option calls for a separate
>>> mapping file (or so) instead of simple Makefile-based conditions,
>>> integrating it somehow into the make system.
>>
>> Right - we seem to be reinventing libc.
>
> Please note that (A) isn't about reinventing libc - it isn't about
> providing abstractions for syscalls, since it's these abstractions that
> later cause "hacky" auditing code in tests.

Ok, but using the example from above, what would the audit test test for?
You can't test a chmod syscall on ARM if there isn't one. The auditctl
to select that syscall for auditing would fail too, right?

>
>>
>>> (B) and (C) are really easy - the libc (or custom #ifdefs) take care
>>> of which syscall should be called on which architecture. (C) has
>>> the disadvantage of actually doing the mapping from (A), just in
>>> a less visible way.
>>>
>>> 2) run (execution) time
>>>
>>> (A) again needs to re-use the mapping file (or logic) in most cases,
>>> since - if we want to call ie. open-like syscall, we need to call
>>> explicitly either do_open (non-arm) or do_openat (arm)
>>
>> And in some cases, we need to do both when the arch supports both.
>>
>>> (B) and (C) simplify this case a lot, but may fall short if we
>>> *really* want to call openat instead of open on non-arm
>>
>> We do that today. We test both open and openat using the do_* programs.
>>
>>> 3) auditing time
>>>
>>> (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_*
>>> wrappers. It still needs to re-use the mapping file (and therefore
>>> needs to manually specify which syscalls to audit on which archs),
>>> but it's very straightforward and clear.
>>>
>>> (B) and (C) have a significant problem here - we don't know which
>>> syscalls are being called "under the hood" of the do_* wrappers,
>>
>> Well, we do for all the existing calls.
>
> *We* do, but the tests don't, which is why we have to create ad-hoc
> conditions to tell them.
>
>>
>>> so
>>> we need to try them out and then create per-arch hacks in the code
>>> similar to what we've seen in the arm patchset recently, ie.
>>> "set up auditing for fchmodat and call do_chmod", which can be
>>> somewhat confusing. The other option is to duplicate wrappers like
>>> do_socketcall, but the per-arch hacks still persist. We could create
>>> a mapping file for them, but then we might as well use (A).
>>>
>>>
>>> So what would be the best approach for new (and existing, over time)
>>> syscall wrappers?
>>>
>>> I personally really like (A) due to its clean design - there are no
>>> "Note: There is no glibc wrapper for this system call" exceptions
>>> and it's clear what syscalls run on which architectures and 32/64bit
>>> variations. The mapping file, with its helper bash functions doing ie.
>>> "is this syscall relevant for current arch/mode?" or "list all relevant
>>> syscalls for current arch/mode", along with some documentation, should
>>> be mostly easy to implement.
>>> A practical example in the syscalls bucket would be checking for arch
>>> relevancy in the `+' function with no per-arch or per-mode conditions
>>> in the .conf files. The rollup log (or run.bash --list) would then show
>>> which syscalls were actually run.
>>
>> I don't think there is one "best approach". Some of what we have today
>> is because the suite has evolved over time but some if it is because for
>> the most part, using the libc functions is simple and appropriate.
>> Where's it's not simple or appropriate, we drop back to the the syscall(_NR...)
>> function. Sometimes due to legacy syscalls, like the multiplexed 32-bit
>> syscalls, it gets a bit complicated but hey, we've written that code already. :-)
>
> I agree that using libc is often better and it would be perhaps cleaner
> if auditctl supported syscall translation (ie. chmod->fchmodat on arm),
> but it doesn't, it needs a direct syscall name for a given arch / mode.

Maybe fixing auditctl is the better solution. If we don't know what
the syscalls are, how are audit users supposed to know?

> My points for (A) go mostly towards the auditing code, see the current
> state of augrok_default and auwatch_default in network/run.conf.
> Or the per-arch conditions in syscalls/*-run.conf.

The multiplexed system calls are a pain, but it's mostly legacy pain.
Are there new syscalls that are implemented that way?

>
> With a unified way of telling which syscalls are relevant for which
> architectures, the conditions in syscalls/ could go away. The entire
> case/esac structure in network/ could as well, since $syscall would
> always be the tested syscall and no other.

I'm not sure how unified it can really be though. In the case of
these multiplexed system calls, we still have to test the various
options to the syscalls because they go down different security
relevant paths. I don't think we could have just one test for
socketcall(), for example.

> The "unified way" doesn't have to be a static file, it can be generated
> using gcc from unistd.h dynamically.
>
> Also, "we've written that code already" doesn't mean there won't be more
> of it in the future (well, near future).
>
>>
>>> This also means that I would have to throw away most of my series on
>>> do_ipc and do_socketcall, possibly re-implementing them in the future,
>>> but I'm fine with that.
>>>
>>> What's your opinion?
>>
>> If it ain't broke....?
>
> Right, I was wondering whether a "clean" solution wouldn't be an
> overkill, since there aren't that much differences aside from do_ipc
> and do_socketcall. For me, it's just another cleanup, like the netfilter
> or networking-related code.
>
>>
>>> Thanks,
>>> Jiri
>>>
>>>
>>> PS: I'm asking because we currently have ~70 new syscall wrappers staged
>>> for review and there's not much consistency in terms of (A) vs (B)
>>> vs (C).
>>
>> I'm interesting in knowing more about your new syscalls and why you have
>> a mix of A), B) and C) (which I still don't understand but I'm sure a simple
>> example would clear things up). Is it because 3 different people wrote them
>> or because there really is no "best approach"?
>
> Various reasons, really. Written mostly by a single person, (A) was used
> where using (B) would pull in another suite dependency (NUMA),
> (B) was used where (A) would result in two independent wrappers (fstatat
> vs newfstatat), and various mix of both. Yes, because there's no defined
> policy of "best approach".

If fstatat and newfstatat are both syscalls that are both callable by an
architecture, then you might need two since you'd have to test both.

>> I'm not trying to say that we have to keep doing things the way we've always done it.
>> This suite has clearly evolved and improved as you and others have worked on it so I'd
>> like to continue this discussion.
>
> Sure, I'd like to get as much feedback as possible before implementing
> anything (or deciding to leave it be in the current state).

I see you've sent another message now that I haven't read yet but
I'll read next.

-- ljk

>
>>
>> -- ljk
>
> Jiri
>
|
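The idea quoted above, that the "unified way" could be generated from unistd.h with gcc rather than kept as a static file, might look something like the sketch below. The filter name nr_names is hypothetical; it only assumes that the preprocessor dump contains lines of the form "#define __NR_<name> <value>". A few non-syscall helper macros may slip through on some architectures, so the output would need review before being trusted.

```shell
# Hypothetical filter: read preprocessor macro definitions on stdin and
# print the names found in "#define __NR_<name> <value>" lines, sorted.
nr_names() {
    sed -n 's/^#define __NR_\([a-z0-9_]*\) .*/\1/p' | sort
}

# Typical use (assumes gcc and the kernel headers are installed; adding
# -m32 should give the 32-bit list where 32-bit headers are available):
#   printf '#include <sys/syscall.h>\n' | gcc -E -dM - | nr_names
```

Run once per arch/mode, the resulting list could answer "does this syscall exist here?" without anyone maintaining the mapping by hand, though it says nothing about which multiplexed sub-operations need their own tests.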
From: Jiri J. <jja...@re...> - 2014-08-05 15:15:05
|
On 08/04/2014 09:26 PM, Linda Knippers wrote:
> On 7/29/2014 7:22 AM, Jiri Jaburek wrote:
>> On 07/28/2014 06:39 PM, Linda Knippers wrote:
>>> Hi Jiri,
>>>
>>> On 07/28/2014 09:33 AM, Jiri Jaburek wrote:
>>>> Hello Linda & others,
>>>>
>>>> I've been doing some syscall work recently and while doing it,
>>>> I decided to "clean up" the do_ipc and do_socketcall wrappers, which
>>>> seemed like a duplicated functionality, since they call exactly the same
>>>> library functions as normal do_* wrappers. Also, I really wanted to get
>>>> rid of the ipc headerhack. :)
>>>
>>> That's a good goal.:-)
>>>
>>>> This was somewhat amplified by the fact that the code in ipc_common.c
>>>> was over-shared, meaning that ie. semget was using flags for semctl
>>>> or semop. So I made a series of ~6 commits, carefully moving all the
>>>> functionality from ipc_common.c into separate wrappers and removing the
>>>> do_ipc wrapper. I did the same for do_socketcall, which just calls bind,
>>>> using a library function, like do_bind. All this with removing
>>>> respective sections from syscalls/*.conf, of course.
>>>>
>>>> I was quite happy with the series, since - functionality wise - it was
>>>> transparent. However when I tested it on MODE=32, the syscalls bucket
>>>> started throwing ERRORs.
>>>
>>> Right. The 32-bit x86 syscalls add a lot of complexity.
>>> Did you see any problems on non-x86 architectures?
>>
>> I didn't test other architectures, but the code suggests that all 32bit
>> variants are affected, for ipc(2) at least.
>>
>>>
>>>> Some investigation uncovered that the syscalls
>>>> bucket was actually using these "duplicated" wrappers for proper
>>>> auditing - because auditctl works with real syscalls, not libc
>>>> functions. The extra wrappers were therefore nothing more than a name,
>>>> simplifying logic in the syscalls bucket.
>>>>
>>>> This goes against some other approaches used in the suite - in the
>>>> network bucket, for example, which - based on the architecture - selects
>>>> proper syscall name for auditctl, while still calling the original
>>>> syscall wrapper (which uses library functions).
>>>>
>>>> -----------------------------------------------------------------------
>>>>
>>>> This led me into a certain design question I'd like to ask here; how to
>>>> design syscall wrappers and the execution and auditing infrastructure
>>>> around them? What would be the best approach?
>>>>
>>>> I've identified 3 most obvious ways to write a syscall wrapper:
>>>>
>>>> A) use syscall(__NR_syscallname, ...) directly, bypassing libc
>>>> B) use libc functions
>>>> C) use (A), but simulate libc using #ifdefs manually
>>>
>>> Today we use both A) and B), depending on the syscall. B) is easiest
>>> from a coding perspective. A) is sometimes necessary because libc
>>> might not actually be using the syscall we want in the mode we want
>>> or may be doing error checking of it's own that prevent some of the case
>>> we want to test. I'm not sure I understand C).
>>
>> (C) for do_chmod could look like
>>
>> #ifdef ARM
>> exitval = syscall(__NR_fchmodat, ...);
>> #else
>> exitval = syscall(__NR_chmod, ...);
>> #endif
>>
>> essentially simulating glibc in a controlled manner.
>
> I thought that was A).

(A) was using just one in each wrapper and doing conditional build based
on an external syscall-to-arch mapping file (or autodetection).

>
> In this example, I'm not sure you'd have an ifdef in do_chmod to map
> it to fchmodat on ARM because do_chmod is to test chmod, and there isn't
> one for arm. You'd just use do_fchmodat, which exists for other arches too.

Probably a bad example, fstatat / newfstatat would be better here, ie.
something already done by glibc.

>
>>>
>>>> These approaches solve the "which syscalls to run where" problem
>>>> somewhat differently and therefore have different benefits and
>>>> drawbacks in various situations:
>>>>
>>>> 1) compile time
>>>>
>>>> (A) is a bit problematic, since we would need to come up with a full
>>>> logic of what syscalls to compile on what architectures. We already
>>>> do that on some level, but this option calls for a separate mapping
>>>> file (or so) instead of simple Makefile-based conditions,
>>>> integrating it somehow into the make system.
>>>
>>> Right - we seem to be reinventing libc.
>>
>> Please note that (A) isn't about reinventing libc - it isn't about
>> providing abstractions for syscalls, since it's these abstractions that
>> later cause "hacky" auditing code in tests.
>
> Ok, but using the example from above, what would the audit test test for?
> You can't test a chmod syscall on ARM if there isn't one. The auditctl
> to select that syscall for auditing would fail too, right?

The (A) "way" would be to test both fstatat and newfstatat by calling
a bash function, which checks whether a given syscall is "relevant" to
the current architecture and compile/execute/audit it if it is.

The (B) "way" would be to always compile and execute whatever glibc hides
under fstatat() and selectively audit (guess) the syscall behind it.

The (C) "way" would be the same as (B), except that we wouldn't have to
guess, we would explicitly say which syscall to execute/audit where.

Of all three options, only (A) allows us to test multiple syscalls
potentially hidden under a single glibc function. However we might as
well simply use __NR_name in these special cases, like we do with
fork/clone, for example. If these syscalls are arch-specific, this might
become more ugly on the (bash) test side and the Makefile side.

>>
>>>
>>>> (B) and (C) are really easy - the libc (or custom #ifdefs) take care
>>>> of which syscall should be called on which architecture.
>>>> (C) has the disadvantage of actually doing the mapping from (A), just in
>>>> a less visible way.
>>>>
>>>> 2) run (execution) time
>>>>
>>>> (A) again needs to re-use the mapping file (or logic) in most cases,
>>>> since - if we want to call ie. open-like syscall, we need to call
>>>> explicitly either do_open (non-arm) or do_openat (arm)
>>>
>>> And in some cases, we need to do both when the arch supports both.
>>>
>>>> (B) and (C) simplify this case a lot, but may fall short if we
>>>> *really* want to call openat instead of open on non-arm
>>>
>>> We do that today. We test both open and openat using the do_* programs.
>>>
>>>> 3) auditing time
>>>>
>>>> (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_*
>>>> wrappers. It still needs to re-use the mapping file (and therefore
>>>> needs to manually specify which syscalls to audit on which archs),
>>>> but it's very straightforward and clear.
>>>>
>>>> (B) and (C) have a significant problem here - we don't know which
>>>> syscalls are being called "under the hood" of the do_* wrappers,
>>>
>>> Well, we do for all the existing calls.
>>
>> *We* do, but the tests don't, which is why we have to create ad-hoc
>> conditions to tell them.
>>
>>>
>>>> so
>>>> we need to try them out and then create per-arch hacks in the code
>>>> similar to what we've seen in the arm patchset recently, ie.
>>>> "set up auditing for fchmodat and call do_chmod", which can be
>>>> somewhat confusing. The other option is to duplicate wrappers like
>>>> do_socketcall, but the per-arch hacks still persist. We could create
>>>> a mapping file for them, but then we might as well use (A).
>>>>
>>>>
>>>> So what would be the best approach for new (and existing, over time)
>>>> syscall wrappers?
>>>>
>>>> I personally really like (A) due to its clean design - there are no
>>>> "Note: There is no glibc wrapper for this system call" exceptions
>>>> and it's clear what syscalls run on which architectures and 32/64bit
>>>> variations. The mapping file, with its helper bash functions doing ie.
>>>> "is this syscall relevant for current arch/mode?" or "list all relevant
>>>> syscalls for current arch/mode", along with some documentation, should
>>>> be mostly easy to implement.
>>>> A practical example in the syscalls bucket would be checking for arch
>>>> relevancy in the `+' function with no per-arch or per-mode conditions
>>>> in the .conf files. The rollup log (or run.bash --list) would then show
>>>> which syscalls were actually run.
>>>
>>> I don't think there is one "best approach". Some of what we have today
>>> is because the suite has evolved over time but some if it is because for
>>> the most part, using the libc functions is simple and appropriate.
>>> Where's it's not simple or appropriate, we drop back to the the syscall(_NR...)
>>> function. Sometimes due to legacy syscalls, like the multiplexed 32-bit
>>> syscalls, it gets a bit complicated but hey, we've written that code already. :-)
>>
>> I agree that using libc is often better and it would be perhaps cleaner
>> if auditctl supported syscall translation (ie. chmod->fchmodat on arm),
>> but it doesn't, it needs a direct syscall name for a given arch / mode.
>
> Maybe fixing auditctl is the better solution. If we don't know what
> the syscalls are, how are audit users supposed to know?

Even if we were able to make auditctl use glibc wrapper names instead of
syscalls, the kernel-based audit log won't care for them and will always
use the syscalls themselves (numbers). Though I guess the translation
layer could be used for ausearch as well.

>
>> My points for (A) go mostly towards the auditing code, see the current
>> state of augrok_default and auwatch_default in network/run.conf.
>> Or the per-arch conditions in syscalls/*-run.conf.
>
> The multiplexed system calls are a pain, but it's mostly legacy pain.
> Are there new syscalls that are implemented that way?

Not that I'm aware of, but the legacy pain didn't end, the PPC arch now
(since 2010, 86250b9d12caa1a3dee12a7cf638b7dd70eaadb6, in RHEL7) supports
the "normal" network syscalls *in addition* to the socketcall syscall.
On both 64 *and* 32bits. Not sure which of them is used by glibc, but our
suite expects socketcall.

Also, there's no technical reason why 64bit versions of PPC/S390 couldn't
support the "normal" ipc syscalls instead of just ipc(2), meaning somebody
will probably add those in the future.

>>
>> With a unified way of telling which syscalls are relevant for which
>> architectures, the conditions in syscalls/ could go away. The entire
>> case/esac structure in network/ could as well, since $syscall would
>> always be the tested syscall and no other.
>
> I'm not sure how unified it can really be though. In the case of
> these multiplexed system calls, we still have to test the various
> options to the syscalls because they go down different security
> relevant paths. I don't think we could have just one test for
> socketcall(), for example.

It's true that different variants of the same syscall could cause problems
since we're checking for arguments (a0, ..) of the syscalls when searching
the audit log. If some arch decided to move the ipc operation type in
ipc(2) from a0 to a1, it would be a big problem for generic testing as
done by (A). I'm not currently sure how to deal with this in a reasonably
simple manner.

>
>> The "unified way" doesn't have to be a static file, it can be generated
>> using gcc from unistd.h dynamically.
>>
>> Also, "we've written that code already" doesn't mean there won't be more
>> of it in the future (well, near future).
>>
>>>
>>>> This also means that I would have to throw away most of my series on
>>>> do_ipc and do_socketcall, possibly re-implementing them in the future,
>>>> but I'm fine with that.
>>>>
>>>> What's your opinion?
>>>
>>> If it ain't broke....?
>>
>> Right, I was wondering whether a "clean" solution wouldn't be an
>> overkill, since there aren't that much differences aside from do_ipc
>> and do_socketcall. For me, it's just another cleanup, like the netfilter
>> or networking-related code.
>>
>>>
>>>> Thanks,
>>>> Jiri
>>>>
>>>>
>>>> PS: I'm asking because we currently have ~70 new syscall wrappers staged
>>>> for review and there's not much consistency in terms of (A) vs (B)
>>>> vs (C).
>>>
>>> I'm interesting in knowing more about your new syscalls and why you have
>>> a mix of A), B) and C) (which I still don't understand but I'm sure a simple
>>> example would clear things up). Is it because 3 different people wrote them
>>> or because there really is no "best approach"?
>>
>> Various reasons, really. Written mostly by a single person, (A) was used
>> where using (B) would pull in another suite dependency (NUMA),
>> (B) was used where (A) would result in two independent wrappers (fstatat
>> vs newfstatat), and various mix of both. Yes, because there's no defined
>> policy of "best approach".
>
> If fstatat and newfstatat are both syscalls that are both callable by an
> architecture, then you might need two since you'd have to test both.

It seems that 64bit variants of all supported architectures use
newfstatat and all 32bit variants use fstatat64, there's no overlap.
There is "sys_fstatat" defined in the kernel, but it doesn't seem to be
in unistd.h, so it's probably not exported as a syscall.

>
>>> I'm not trying to say that we have to keep doing things the way we've always done it.
>>> This suite has clearly evolved and improved as you and others have worked on it so I'd
>>> like to continue this discussion.
>>
>> Sure, I'd like to get as much feedback as possible before implementing
>> anything (or deciding to leave it be in the current state).
>
> I see you've sent another message now that I haven't read yet but
> I'll read next.
>
> -- ljk
>>
>>>
>>> -- ljk
>>
>> Jiri
>
|
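The bash helpers described for approach (A), "is this syscall relevant for current arch/mode?" and "list all relevant syscalls for current arch/mode", could be sketched over a static list along these lines. Everything here is hypothetical: the function names, the "<syscall> <arch>:<mode> ..." format, and the handful of entries, which only mirror the examples used in this thread (no chmod on arm, newfstatat on 64-bit, fstatat64 on 32-bit) and are not a complete or verified table.

```shell
# Hypothetical static mapping, one "<syscall> <arch>:<mode> ..." per line.
# The entries are illustrative only.
syscall_map() {
    cat <<'EOF'
chmod x86_64:64 x86_64:32
fchmodat x86_64:64 x86_64:32 arm:64
newfstatat x86_64:64 arm:64
fstatat64 x86_64:32
EOF
}

# List every syscall marked relevant for arch $1 in mode $2.
list_relevant() {
    syscall_map | while read -r name targets; do
        case " $targets " in
            *" $1:$2 "*) echo "$name" ;;
        esac
    done
}

# Succeed if syscall $1 is relevant for arch $2 in mode $3.
syscall_relevant() {
    list_relevant "$2" "$3" | grep -qxF -- "$1"
}
```

A test driver could then guard each test with something like `syscall_relevant fstatat64 "$arch" "$MODE" || continue`, so the .conf files carry no per-arch conditions. As noted in the thread, this does not help with multiplexed calls whose argument layout differs between architectures.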
From: Linda K. <lin...@hp...> - 2014-08-04 20:09:14
|
On 8/1/2014 8:11 AM, Jiri Jaburek wrote: > On 07/29/2014 01:22 PM, Jiri Jaburek wrote: >> On 07/28/2014 06:39 PM, Linda Knippers wrote: >>> Hi Jiri, >>> >>> On 07/28/2014 09:33 AM, Jiri Jaburek wrote: >>>> Hello Linda & others, >>>> >>>> I've been doing some syscall work recently and while doing it, >>>> I decided to "clean up" the do_ipc and do_socketcall wrappers, which >>>> seemed like a duplicated functionality, since they call exactly the same >>>> library functions as normal do_* wrappers. Also, I really wanted to get >>>> rid of the ipc headerhack. :) >>> >>> That's a good goal.:-) >>> >>>> This was somewhat amplified by the fact that the code in ipc_common.c >>>> was over-shared, meaning that ie. semget was using flags for semctl >>>> or semop. So I made a series of ~6 commits, carefully moving all the >>>> functionality from ipc_common.c into separate wrappers and removing the >>>> do_ipc wrapper. I did the same for do_socketcall, which just calls bind, >>>> using a library function, like do_bind. All this with removing >>>> respective sections from syscalls/*.conf, of course. >>>> >>>> I was quite happy with the series, since - functionality wise - it was >>>> transparent. However when I tested it on MODE=32, the syscalls bucket >>>> started throwing ERRORs. >>> >>> Right. The 32-bit x86 syscalls add a lot of complexity. >>> Did you see any problems on non-x86 architectures? >> >> I didn't test other architectures, but the code suggests that all 32bit >> variants are affected, for ipc(2) at least. >> >>> >>>> Some investigation uncovered that the syscalls >>>> bucket was actually using these "duplicated" wrappers for proper >>>> auditing - because auditctl works with real syscalls, not libc >>>> functions. The extra wrappers were therefore nothing more than a name, >>>> simplifying logic in the syscalls bucket. 
>>>> >>>> This goes against some other approaches used in the suite - in the >>>> network bucket, for example, which - based on the architecture - selects >>>> proper syscall name for auditctl, while still calling the original >>>> syscall wrapper (which uses library functions). >>>> >>>> ----------------------------------------------------------------------- >>>> >>>> This led me into a certain design question I'd like to ask here; how to >>>> design syscall wrappers and the execution and auditing infrastructure >>>> around them? What would be the best approach? >>>> >>>> I've identified 3 most obvious ways to write a syscall wrapper: >>>> >>>> A) use syscall(__NR_syscallname, ...) directly, bypassing libc >>>> B) use libc functions >>>> C) use (A), but simulate libc using #ifdefs manually >>> >>> Today we use both A) and B), depending on the syscall. B) is easiest >>> from a coding perspective. A) is sometimes necessary because libc >>> might not actually be using the syscall we want in the mode we want >>> or may be doing error checking of it's own that prevent some of the case >>> we want to test. I'm not sure I understand C). >> >> (C) for do_chmod could look like >> >> #ifdef ARM >> exitval = syscall(__NR_fchmodat, ...); >> #else >> exitval = syscall(__NR_chmod, ...); >> #endif >> >> essentially simulating glibc in a controlled manner. >> >>> >>>> These approaches solve the "which syscalls to run where" problem >>>> somewhat differently and therefore have different benefits and >>>> drawbacks in various situations: >>>> >>>> 1) compile time >>>> >>>> (A) is a bit problematic, since we would need to come up with a full >>>> logic of what syscalls to compile on what architectures. We already >>>> do that on some level, but this option calls for a separate mapping >>>> file (or so) instead of simple Makefile-based conditions, >>>> integrating it somehow into the make system. >>> >>> Right - we seem to be reinventing libc. 
>> >> Please note that (A) isn't about reinventing libc - it isn't about >> providing abstractions for syscalls, since it's these abstractions that >> later cause "hacky" auditing code in tests. >> >>> >>>> (B) and (C) are really easy - the libc (or custom #ifdefs) take care >>>> of which syscall should be called on which architecture. (C) has >>>> the disadvantage of actually doing the mapping from (A), just in >>>> a less visible way. >>>> >>>> 2) run (execution) time >>>> >>>> (A) again needs to re-use the mapping file (or logic) in most cases, >>>> since - if we want to call ie. open-like syscall, we need to call >>>> explicitly either do_open (non-arm) or do_openat (arm) >>> >>> And in some cases, we need to do both when the arch supports both. >>> >>>> (B) and (C) simplify this case a lot, but may fall short if we >>>> *really* want to call openat instead of open on non-arm >>> >>> We do that today. We test both open and openat using the do_* programs. >>> >>>> 3) auditing time >>>> >>>> (A) shines here, IMO, since the auditctl mapping is 1:1 to the do_* >>>> wrappers. It still needs to re-use the mapping file (and therefore >>>> needs to manually specify which syscalls to audit on which archs), >>>> but it's very straightforward and clear. >>>> >>>> (B) and (C) have a significant problem here - we don't know which >>>> syscalls are being called "under the hood" of the do_* wrappers, >>> >>> Well, we do for all the existing calls. >> >> *We* do, but the tests don't, which is why we have to create ad-hoc >> conditions to tell them. >> >>> >>>> so >>>> we need to try them out and then create per-arch hacks in the code >>>> similar to what we've seen in the arm patchset recently, ie. >>>> "set up auditing for fchmodat and call do_chmod", which can be >>>> somewhat confusing. The other option is to duplicate wrappers like >>>> do_socketcall, but the per-arch hacks still persist. We could create >>>> a mapping file for them, but then we might as well use (A). 
>>>> >>>> >>>> So what would be the best approach for new (and existing, over time) >>>> syscall wrappers? >>>> >>>> I personally really like (A) due to its clean design - there are no >>>> "Note: There is no glibc wrapper for this system call" exceptions >>>> and it's clear what syscalls run on which architectures and 32/64bit >>>> variations. The mapping file, with its helper bash functions doing ie. >>>> "is this syscall relevant for current arch/mode?" or "list all relevant >>>> syscalls for current arch/mode", along with some documentation, should >>>> be mostly easy to implement. >>>> A practical example in the syscalls bucket would be checking for arch >>>> relevancy in the `+' function with no per-arch or per-mode conditions >>>> in the .conf files. The rollup log (or run.bash --list) would then show >>>> which syscalls were actually run. >>> >>> I don't think there is one "best approach". Some of what we have today >>> is because the suite has evolved over time but some if it is because for >>> the most part, using the libc functions is simple and appropriate. >>> Where's it's not simple or appropriate, we drop back to the the syscall(_NR...) >>> function. Sometimes due to legacy syscalls, like the multiplexed 32-bit >>> syscalls, it gets a bit complicated but hey, we've written that code already. :-) >> >> I agree that using libc is often better and it would be perhaps cleaner >> if auditctl supported syscall translation (ie. chmod->fchmodat on arm), >> but it doesn't, it needs a direct syscall name for a given arch / mode. >> >> My points for (A) go mostly towards the auditing code, see the current >> state of augrok_default and auwatch_default in network/run.conf. >> Or the per-arch conditions in syscalls/*-run.conf. >> >> With a unified way of telling which syscalls are relevant for which >> architectures, the conditions in syscalls/ could go away. 
The entire >> case/esac structure in network/ could as well, since $syscall would >> always be the tested syscall and no other. >> >> The "unified way" doesn't have to be a static file, it can be generated >> using gcc from unistd.h dynamically. >> >> Also, "we've written that code already" doesn't mean there won't be more >> of it in the future (well, near future). >> >>> >>>> This also means that I would have to throw away most of my series on >>>> do_ipc and do_socketcall, possibly re-implementing them in the future, >>>> but I'm fine with that. >>>> >>>> What's your opinion? >>> >>> If it ain't broke....? >> >> Right, I was wondering whether a "clean" solution wouldn't be an >> overkill, since there aren't that much differences aside from do_ipc >> and do_socketcall. For me, it's just another cleanup, like the netfilter >> or networking-related code. >> >>> >>>> Thanks, >>>> Jiri >>>> >>>> >>>> PS: I'm asking because we currently have ~70 new syscall wrappers staged >>>> for review and there's not much consistency in terms of (A) vs (B) >>>> vs (C). >>> >>> I'm interesting in knowing more about your new syscalls and why you have >>> a mix of A), B) and C) (which I still don't understand but I'm sure a simple >>> example would clear things up). Is it because 3 different people wrote them >>> or because there really is no "best approach"? >> >> Various reasons, really. Written mostly by a single person, (A) was used >> where using (B) would pull in another suite dependency (NUMA), >> (B) was used where (A) would result in two independent wrappers (fstatat >> vs newfstatat), and various mix of both. Yes, because there's no defined >> policy of "best approach". >> >>> >>> I'm not trying to say that we have to keep doing things the way we've always done it. >>> This suite has clearly evolved and improved as you and others have worked on it so I'd >>> like to continue this discussion. 
>> >> Sure, I'd like to get as much feedback as possible before implementing >> anything (or deciding to leave it be in the current state). >> > > I made some proof-of-concept attempts regarding the automatic build of > only relevant syscalls and I'm not sure anymore whether it's a good > idea. The thing is - it's too "smart", which would be beneficial for > something like LTP (which has TCONF), but not really for our suite, > where we want things to be as static as possible. > > IOW the way I understood it, we don't have any advanced output-checking > logic that would guarantee that all the required/expected syscall tests > were run - this is guaranteed by the (static) logic itself. Right. I think you'd have to start with a static list but you could automate the build of the tests for the calls on that list, but the static list would still have to know about the weirdness for different arches. Our static logic presumes that someone has looked at all the system calls and determined that all that are necessary are actually covered. If we were generating something automatically, we'd probably want to check against a whitelist of system calls we know we need to include, a blacklist of system calls we know we don't need to include, and if there are any that aren't on either list - that should generate a warning. > Using a static list both as a replacement for utils/bin/Makefile and for > execution/auditing could work, but I'm not really sure of its format > and whether it wouldn't go against KISS. Meaning that, sometimes, > leaving things in apparent disorder might be the best solution. At least we understand our mess. :-) > The implication being that I'll stick with (B), eg. using library > functions where possible, ie. fstatat() instead of newfstatat(), etc. > > This also means that I can finish that ipc/socketcall patchseries, > I'm still undecided whether to do it network-style (add ie. 
do_msgget, > call it, but audit syscall==ipc on related arch/modes) or whether leave > do_ipc in place and just make it a special case (calling bodies of the > "normal" do_* ipc wrappers) instead of a feature. The special case is more like it is now? -- ljk > > Jiri > |
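The whitelist/blacklist check suggested above could look roughly like the following. This is only a minimal sketch in Python for illustration (the suite's own helpers would presumably be bash); the function name `classify` and the syscall names in the usage note are hypothetical and not taken from the suite.

```python
def classify(detected, whitelist, blacklist):
    """Sort automatically detected syscalls into three groups.

    to_test: on the whitelist (we know we need to include them)
    skipped: on the blacklist (we know we don't need to include them)
    unknown: on neither list, so each of these should produce a warning
    """
    detected = set(detected)
    to_test = sorted(detected & set(whitelist))
    skipped = sorted(detected & set(blacklist))
    unknown = sorted(detected - set(whitelist) - set(blacklist))
    return to_test, skipped, unknown
```

For example, `classify(["open", "openat", "uselib", "newsyscall"], ["open", "openat"], ["uselib"])` leaves only the made-up `newsyscall` in the third group, which is the case that would trigger a warning.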
From: Jiri J. <jja...@re...> - 2014-08-05 09:10:25
|
On 08/04/2014 10:08 PM, Linda Knippers wrote: > On 8/1/2014 8:11 AM, Jiri Jaburek wrote: >> I made some proof-of-concept attempts regarding the automatic build of >> only relevant syscalls and I'm not sure anymore whether it's a good >> idea. The thing is - it's too "smart", which would be beneficial for >> something like LTP (which has TCONF), but not really for our suite, >> where we want things to be as static as possible. >> >> IOW the way I understood it, we don't have any advanced output-checking >> logic that would guarantee that all the required/expected syscall tests >> were run - this is guaranteed by the (static) logic itself. > > Right. I think you'd have to start with a static list but you could automate > the build of the tests for the calls on that list, but the static list would > still have to know about the weirdness for different arches. > > Our static logic presumes that someone has looked at all the system calls > and determined that all that are necessary are actually covered. If we > were generating something automatically, we'd probably want to check against > a whitelist of system calls we know we need to include, a blacklist of system > calls we know we don't need to include, and if there are any that aren't on > either list - that should generate a warning. > >> Using a static list both as a replacement for utils/bin/Makefile and for >> execution/auditing could work, but I'm not really sure of its format >> and whether it wouldn't go against KISS. Meaning that, sometimes, >> leaving things in apparent disorder might be the best solution. > > At least we understand our mess. :-) > >> The implication being that I'll stick with (B), eg. using library >> functions where possible, ie. fstatat() instead of newfstatat(), etc. >> >> This also means that I can finish that ipc/socketcall patchseries, >> I'm still undecided whether to do it network-style (add ie. 
do_msgget, >> call it, but audit syscall==ipc on related arch/modes) or whether leave >> do_ipc in place and just make it a special case (calling bodies of the >> "normal" do_* ipc wrappers) instead of a feature. > > The special case is more like it is now? I wanted to make all the ipc-related wrappers standalone, ie. more like all the other wrappers (do_socket, do_bind, etc.), with do_ipc being a "special case" that can use various hacks and tricks to call ipc(2), but these hacks would be confined to do_ipc.c. Hacks like doing execve() on the standalone wrappers (ie. do_semget) or like (more cleanly) making the standalone wrappers module-friendly (python like) and omitting the main() when included into do_ipc.c. Though I'm not sure whether any of this would be worth reviewing/including as it doesn't improve the situation by much. My current plan is to make a new augrok_ipc function in syscalls/, which would differentiate between various architectures/modes, and use this function in run.conf of syscalls/, eliminating the in-place conditions along with do_ipc and do_socketcall. This makes more sense to me, since we need these conditions only for auditing purposes. This is essentially what network/, trustedprograms/, etc., do. For that to work, I need to sort out other problems first, though. > > -- ljk >> >> Jiri >> > |
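The idea of calling a standalone wrapper while auditing a different syscall name can be sketched as below. This is a rough illustration in Python, not the planned augrok_ipc function itself. The name `audited_syscall` is hypothetical, and the rule "every 32-bit mode goes through ipc(2)" is an assumption based only on the remark earlier in the thread that all 32bit variants seem to be affected; the real per-arch table would have to be verified.

```python
# IPC calls that libc may route through the multiplexed ipc(2) syscall.
IPC_FAMILY = {
    "msgget", "msgsnd", "msgrcv", "msgctl",
    "semget", "semop", "semctl",
    "shmget", "shmat", "shmdt", "shmctl",
}


def audited_syscall(name, bits, multiplexed_modes=(32,)):
    """Return the syscall name to audit when running do_<name>.

    ASSUMPTION: modes listed in multiplexed_modes multiplex the IPC
    family through ipc(2); everything else is audited under its own name.
    """
    if name in IPC_FAMILY and bits in multiplexed_modes:
        return "ipc"
    return name
```

With this, a test would still run do_msgget everywhere, and only the audit rule would change between modes.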
From: Jiri J. <jja...@re...> - 2014-08-05 16:35:52
|
On 07/28/2014 03:33 PM, Jiri Jaburek wrote:
> (A) is a bit problematic, since we would need to come up with a full
> logic of what syscalls to compile on what architectures. We already
> do that on some level, but this option calls for a separate mapping
> file (or so) instead of simple Makefile-based conditions,
> integrating it somehow into the make system.
>

Mapping file syntax concept (quick draft):

  <line>: <syscall> | <syscall><separator><archlist>
  <syscall>: (syscall name, any non-whitespace string)
  <separator>: (any non-LF whitespace)
  <archlist>: <neg><archspec> | <neg><archspec>,<archlist>
  <neg>: ! | (empty)
  <archspec>: <arch> | <arch>:<bits>
  <arch>: (uname -m output, any non-whitespace string) | all
  <bits>: 32 | 64

The logic would go through the <archlist> from left to right,
ending the search on first matching expression.
Empty <archlist> would imply "relevant everywhere".

In the following example, "use" means "compile/run/audit":
(comments not included in syntax definition here)

  # use everywhere
  accept
  # same as above
  accept4 all
  # never use
  access !all
  # use everywhere (leftmost precedence)
  acct all,!all
  # never use (leftmost precedence)
  add_key !all,all
  # use everywhere, but only on 64bits
  adjtimex all:64
  # don't use on 32bit anywhere
  afs_syscall !all:32
  # use only on s390x and ppc64
  alarm s390x,ppc64
  # same as above
  arch_prctl s390x,ppc64,!all

Some practical examples:

  # not on s390x, everywhere on ppc64, others only non-32bit
  bind !s390x,ppc64,!all:32
  # alternative specification of the above, more explicit
  socket !s390x,ppc64,x86_64
  # everywhere but 64bit x86_64
  socketcall !x86_64:64

Existing rules from utils/bin/Makefile (some of the special cases):

  clone2 ia64
  # in Makefile as "ONLY86", but in fact this
  fork !s390x,!ppc64,!ia64,!aarch64
  vfork !s390x,!ppc64,!ia64,!aarch64
  # ugh, duplications in Makefile
  mmap2 ppc64:32,s390x:32,all:32
  uselib ppc64:32,s390x:32,all:32,all

(most of the arch-specific syscalls are missing here, probably due to
most wrappers using glibc)

I've tried to make it as simple / readable as possible.
Please note that this is just a draft in case we go with (A), showing
more realistically how it would look.

As for the "useless" lines having just the syscall name - we definitely
want these to remain, since they assert that the syscall is being
tested. Autodetecting anything not on the list wouldn't ensure that.

Thanks,
Jiri
|
From: Linda K. <lin...@hp...> - 2014-08-06 19:41:03
|
Hi Jiri,

On 8/5/2014 12:35 PM, Jiri Jaburek wrote:
> On 07/28/2014 03:33 PM, Jiri Jaburek wrote:
>> (A) is a bit problematic, since we would need to come up with a full
>> logic of what syscalls to compile on what architectures. We already
>> do that on some level, but this option calls for a separate mapping
>> file (or so) instead of simple Makefile-based conditions,
>> integrating it somehow into the make system.
>>
>
> Mapping file syntax concept (quick draft):
>
> <line>: <syscall> | <syscall><separator><archlist>
> <syscall>: (syscall name, any non-whitespace string)
> <separator>: (any non-LF whitespace)
> <archlist>: <neg><archspec> | <neg><archspec>,<archlist>
> <neg>: ! | (empty)
> <archspec>: <arch> | <arch>:<bits>
> <arch>: (uname -m output, any non-whitespace string) | all
> <bits>: 32 | 64
>
> The logic would go through the <archlist> from left to right,
> ending the search on first matching expression.
> Empty <archlist> would imply "relevant everywhere".
>
> In the following example, "use" means "compile/run/audit":
> (comments not included in syntax definition here)
>
> # use everywhere
> accept
> # same as above
> accept4 all
> # never use
> access !all
> # use everywhere (leftmost precedence)
> acct all,!all

I get this one, but would hope to never see it.

> # never use (leftmost precedence)
> add_key !all,all

This one is confusing me, so I really hope to never see it.
In fact, I'm confused by "all" and "!all" in places. In
some cases, it seems like "all" is implied but in other cases,
it's not.

> # use everywhere, but only on 64bits
> adjtimex all:64
> # don't use on 32bit anywhere
> afs_syscall !all:32

Does that mean that "all" or "all:64" are implied?

> # use only on s390x and ppc64
> alarm s390x,ppc64

But here, nothing about "all" is implied.

> # same as above
> arch_prctl s390x,ppc64,!all

Oh, but "!all" is implied?

> Some practical examples:
>
> # not on s390x, everywhere on ppc64, others only non-32bit
> bind !s390x,ppc64,!all:32

So all:64 is implied.

> # alternative specification of the above, more explicit
> socket !s390x,ppc64,x86_64

x86_64 means not 32? x86_64 supports 32-bit syscalls on a 64-bit
kernel, which is different than just x86, which is just a 32-bit
arch?

> # everywhere but 64bit x86_64
> socketcall !x86_64:64
>
> Existing rules from utils/bin/Makefile (some of the special cases):
>
> clone2 ia64
> # in Makefile as "ONLY86", but in fact this
> fork !s390x,!ppc64,!ia64,!aarch64
> vfork !s390x,!ppc64,!ia64,!aarch64
> # ugh, duplications in Makefile
> mmap2 ppc64:32,s390x:32,all:32
> uselib ppc64:32,s390x:32,all:32,all

Does that one really mean all?

>
> (most of the arch-specific syscalls are missing here, probably due to
> most wrappers using glibc)
>
> I've tried to make it as simple / readable as possible.
> Please note that this is just a draft in case we go with (A), showing
> more realistically how it would look.

Yes, thanks. It's really helpful to have concrete examples to
consider.

>
> As for the "useless" lines having just the syscall name - we definitely
> want these to remain, since they assert that the syscall is being
> tested. Autodetecting anything not on the list wouldn't ensure that.

After reading all this, I would think that a line with just the syscall name
would mean we know there is a syscall but we're not testing it, unless "all"
is implied, but I think that's confusing.

Do my comments make sense?

-- ljk

>
> Thanks,
> Jiri
>
|
From: Jiri J. <jja...@re...> - 2014-08-07 08:28:46
|
On 08/06/2014 09:40 PM, Linda Knippers wrote: > Hi Jiri, > > On 8/5/2014 12:35 PM, Jiri Jaburek wrote: >> On 07/28/2014 03:33 PM, Jiri Jaburek wrote: >>> (A) is a bit problematic, since we would need to come up with a full >>> logic of what syscalls to compile on what architectures. We already >>> do that on some level, but this option calls for a separate mapping >>> file (or so) instead of simple Makefile-based conditions, >>> integrating it somehow into the make system. >>> >> >> Mapping file syntax concept (quick draft): >> >> <line>: <syscall> | <syscall><separator><archlist> >> <syscall>: (syscall name, any non-whitespace string) >> <separator>: (any non-LF whitespace) >> <archlist>: <neg><archspec> | <neg><archspec>,<archlist> >> <neg>: ! | (empty) >> <archspec>: <arch> | <arch>:<bits> >> <arch>: (uname -m output, any non-whitespace string) | all >> <bits>: 32 | 64 >> >> The logic would go through the <archlist> from left to right, >> ending the search on first matching expression. >> Empty <archlist> would imply "relevant everywhere". >> >> In the following example, "use" means "compile/run/audit": >> (comments not included in syntax definition here) >> >> # use everywhere >> accept >> # same as above >> accept4 all >> # never use >> access !all >> # use everywhere (leftmost precedence) >> acct all,!all > > I get this one, but would hope to never see it. Example, used just to demonstrate the logic, never to be realistically used as it doesn't make sense. > >> # never use (leftmost precedence) >> add_key !all,all > > This one confusing me, so I really hope to never see it. > In fact, I'm confused by "all" and "!all" in places. In > some cases, it seems like "all" is implied but in other cases, > it's not. Example, see above. > >> # use everywhere, but only on 64bits >> adjtimex all:64 >> # don't use on 32bit anywhere >> afs_syscall !all:32 > > Does that mean that "all" or "all:64" are implied? 
It means that anything except 32bit architectures is implied, which is - in the current concept - all:64, but might as well be all:128. :) > >> # use only on s390x and ppc64 >> alarm s390x,ppc64 > > But here, nothing about "all" is implied. > >> # same as above >> arch_prctl s390x,ppc64,!all > > Oh, but "!all" is implied? Example, see above. > >> Some practical examples: >> >> # not on s390x, everywhere on ppc64, others only non-32bit >> bind !s390x,ppc64,!all:32 > > So all:64 is implied. > >> # alternative specification of the above, more explicit >> socket !s390x,ppc64,x86_64 > > x86_64 means not 32? x86_64 supports 32-bit syscalls on a 64-bit > kernel, which is different than just x86, which is just a 32-bit > arch? > >> # everywhere but 64bit x86_64 >> socketcall !x86_64:64 >> >> Existing rules from utils/bin/Makefile (some of the special cases): >> >> clone2 ia64 >> # in Makefile as "ONLY86", but in fact this >> fork !s390x,!ppc64,!ia64,!aarch64 >> vfork !s390x,!ppc64,!ia64,!aarch64 >> # ugh, duplications in Makefile >> mmap2 ppc64:32,s390x:32,all:32 >> uselib ppc64:32,s390x:32,all:32,all > > Does that one really mean all? >> >> (most of the arch-specific syscalls are missing here, probably due to >> most wrappers using glibc) >> >> I've tried to make it as simple / readable as possible. >> Please note that this is just a draft in case we go with (A), showing >> more realistically how it would look. > > Yes, thanks. It's really helpful to have concrete examples to > consider. >> >> As for the "useless" lines having just the syscall name - we definitely >> want these to remain, since they assert that the syscall is being >> tested. Autodetecting anything not on the list wouldn't ensure that. > > After reading all this, I would think that a line with just the syscall name > would mean we know there is a syscall but we're not testing it, unless "all" > is implied, but I think that's confusing. 
>

I may have incorrectly explained the logic, basically:

- if archlist is missing, the syscall is relevant everywhere
- if archlist is specified, the syscall is relevant only to
  architectures specified in archlist
- the archlist is parsed from left to right and the first "matching"
  archspec is used (think: iptables rules)

Eg. the difference for an Intel system reading

  syscall1 "ppc64"
  syscall2 "ppc64,!s390x"

is that syscall1 never matches (x86_64 == ppc64 is false), but syscall2
does, (x86_64 == ppc64 || x86_64 == !s390x is true).

Some more examples from the point of view of x86_64 using an alternate
pseudocode:

>> bind !s390x,ppc64,!all:32

(x86_64 not in (s390x) || x86_64 in (ppc64) || x86_64 not in (all:32))
would match on the first comparison

>> mmap2 ppc64:32,s390x:32,all:32

(x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32))
would match on the third comparison if MODE is 32

>> uselib ppc64:32,s390x:32,all:32,all

(x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32) ||
x86_64 in (all))
would match on the third comparison if MODE is 32, match on the fourth
comparison otherwise

I'm open to modifications of the logic, I guess we could make the
inclusion more explicit, so that "!s390x" wouldn't mean anything without
"all" after it, adding explicit "all" on all lines, my original design
was to be more inclusive by default (and make use of !<arch> to exclude
architectures that are known to not have the syscall).

Jiri
|
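The OR-style pseudocode in this message can be written out as a small runnable sketch. It is in Python purely for illustration, and the function names are hypothetical. It implements only the reading spelled out here (each archspec is a comparison, "!" inverts it, the first true comparison wins) and is not claimed to reproduce every comment in the earlier draft.

```python
def covers(spec, arch, bits):
    """True if an un-negated archspec names this arch (or "all") and mode."""
    name, _, width = spec.partition(":")
    # A spec without ":<bits>" applies regardless of MODE.
    return name in ("all", arch) and (width == "" or int(width) == bits)


def first_true_comparison(archlist, arch, bits):
    """1-based position of the first comparison that holds, else None."""
    for position, item in enumerate(archlist.split(","), start=1):
        negated = item.startswith("!")
        # "!spec" is true exactly when "spec" does not cover arch/bits.
        if covers(item.lstrip("!"), arch, bits) != negated:
            return position
    return None


def is_relevant_or(archlist, arch, bits):
    """An empty archlist means "relevant everywhere"."""
    return archlist == "" or first_true_comparison(archlist, arch, bits) is not None
```

Calling `first_true_comparison("!s390x,ppc64,!all:32", "x86_64", 64)` reports the first comparison, as in the bind example, and the same holds for any other architecture that is not s390x.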
From: Miroslav V. <mva...@re...> - 2014-08-11 05:17:34
|
Hi, ----- Original Message ----- > On 08/06/2014 09:40 PM, Linda Knippers wrote: > > Hi Jiri, > > > > On 8/5/2014 12:35 PM, Jiri Jaburek wrote: > >> On 07/28/2014 03:33 PM, Jiri Jaburek wrote: > >>> (A) is a bit problematic, since we would need to come up with a full > >>> logic of what syscalls to compile on what architectures. We already > >>> do that on some level, but this option calls for a separate mapping > >>> file (or so) instead of simple Makefile-based conditions, > >>> integrating it somehow into the make system. > >>> > >> > >> Mapping file syntax concept (quick draft): > >> > >> <line>: <syscall> | <syscall><separator><archlist> > >> <syscall>: (syscall name, any non-whitespace string) > >> <separator>: (any non-LF whitespace) > >> <archlist>: <neg><archspec> | <neg><archspec>,<archlist> > >> <neg>: ! | (empty) > >> <archspec>: <arch> | <arch>:<bits> > >> <arch>: (uname -m output, any non-whitespace string) | all > >> <bits>: 32 | 64 > >> > >> The logic would go through the <archlist> from left to right, > >> ending the search on first matching expression. > >> Empty <archlist> would imply "relevant everywhere". > >> > >> In the following example, "use" means "compile/run/audit": > >> (comments not included in syntax definition here) > >> > >> # use everywhere > >> accept > >> # same as above > >> accept4 all > >> # never use > >> access !all > >> # use everywhere (leftmost precedence) > >> acct all,!all > > > > I get this one, but would hope to never see it. > > Example, used just to demonstrate the logic, never to be realistically > used as it doesn't make sense. > > > > >> # never use (leftmost precedence) > >> add_key !all,all > > > > This one confusing me, so I really hope to never see it. > > In fact, I'm confused by "all" and "!all" in places. In > > some cases, it seems like "all" is implied but in other cases, > > it's not. > > Example, see above. 
> > > > >> # use everywhere, but only on 64bits > >> adjtimex all:64 > >> # don't use on 32bit anywhere > >> afs_syscall !all:32 > > > > Does that mean that "all" or "all:64" are implied? > > It means that anything except 32bit architectures is implied, > which is - in the current concept - all:64, but might as well > be all:128. :) > > > > >> # use only on s390x and ppc64 > >> alarm s390x,ppc64 > > > > But here, nothing about "all" is implied. > > > >> # same as above > >> arch_prctl s390x,ppc64,!all > > > > Oh, but "!all" is implied? > > Example, see above. > > > > >> Some practical examples: > >> > >> # not on s390x, everywhere on ppc64, others only non-32bit > >> bind !s390x,ppc64,!all:32 > > > > So all:64 is implied. > > > >> # alternative specification of the above, more explicit > >> socket !s390x,ppc64,x86_64 > > > > x86_64 means not 32? x86_64 supports 32-bit syscalls on a 64-bit > > kernel, which is different than just x86, which is just a 32-bit > > arch? > > > >> # everywhere but 64bit x86_64 > >> socketcall !x86_64:64 > >> > >> Existing rules from utils/bin/Makefile (some of the special cases): > >> > >> clone2 ia64 > >> # in Makefile as "ONLY86", but in fact this > >> fork !s390x,!ppc64,!ia64,!aarch64 > >> vfork !s390x,!ppc64,!ia64,!aarch64 > >> # ugh, duplications in Makefile > >> mmap2 ppc64:32,s390x:32,all:32 > >> uselib ppc64:32,s390x:32,all:32,all > > > > Does that one really mean all? > >> > >> (most of the arch-specific syscalls are missing here, probably due to > >> most wrappers using glibc) > >> > >> I've tried to make it as simple / readable as possible. > >> Please note that this is just a draft in case we go with (A), showing > >> more realistically how it would look. > > > > Yes, thanks. It's really helpful to have concrete examples to > > consider. > >> > >> As for the "useless" lines having just the syscall name - we definitely > >> want these to remain, since they assert that the syscall is being > >> tested. 
Autodetecting anything not on the list wouldn't ensure that. > > > > After reading all this, I would think that a line with just the syscall > > name > > would mean we know there is a syscall but we're not testing it, unless > > "all" > > is implied, but I think that's confusing. > > > > I may have incorrectly explained the logic, basically: > > - if archlist is missing, the syscall is relevant everywhere > - if archlist is specified, the syscall is relevant only to > architectures specified in archlist > - the archlist is parsed from left to right and the first "matching" > archspec is used (think: iptables rules) > > Eg. the difference for an Intel system reading > > syscall1 "ppc64" > syscall2 "ppc64,!s390x" > > is that syscall1 never matches (x86_64 == ppc64 is false), but syscall2 > does, (x86_64 == ppc64 || x86_64 == !s390x is true). That syscall2 example is confusing. Effectively it is the same as !s390x isn't it? > > Some more examples from the point of view of x86_64 using an alternate > pseudocode: > > >> bind !s390x,ppc64,!all:32 > > (x86_64 not in (s390x) || x86_64 in (ppc64) || x86_64 not in (all:32)) > would match on the first comparison Again, isn't this effectively !s390x? AFAICS i386 for example gets in because of the first expression. Or does using all imply this logic? (arch not in s390x or arch in ppc64) and arch not 32bit? BTW does s390x mean both s390x:64, s390x:32? 
>
> >> mmap2 ppc64:32,s390x:32,all:32
> (x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32))
> would match on the third comparison if MODE is 32
>
> >> uselib ppc64:32,s390x:32,all:32,all
> (x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32) ||
> x86_64 in (all))
> would match on the third comparison if MODE is 32, match on the fourth
> comparison otherwise
>
>
> I'm open to modifications of the logic, I guess we could make the
> inclusion more explicit, so that "!s390x" wouldn't mean anything without
> "all" after it, adding explicit "all" on all lines, my original design
> was to be more inclusive by default (and make use of !<arch> to exclude
> architectures that are known to not have the syscall).

Why would it not be good to add "and" between the expressions and accept
"all" only for inclusion?

So for example

all:32,!i386 could mean everything 32bit except i386

The current proposal seems quite confusing to me. I reread the previous
email as well, but it didn't help.

Best regards,
/M

>
> Jiri
>

--
Miroslav Vadkerti :: Senior Quality Assurance Engineer / RHCE :: BaseOS QE - Security
Phone +420 532 294 129 :: CR cell +420 776 864 252 :: SR cell +421 904 135 440
IRC mvadkert at #qe #urt #brno #rpmdiff :: GnuPG ID 0x25881087 at pgp.mit.edu
Red Hat s.r.o, Purkyňova 99/71, 612 45, Brno, Czech Republic
|
From: Jiri J. <jja...@re...> - 2014-08-11 09:05:18
|
On 08/11/2014 07:17 AM, Miroslav Vadkerti wrote: > Hi, > >> >> - if archlist is missing, the syscall is relevant everywhere >> - if archlist is specified, the syscall is relevant only to >> architectures specified in archlist >> - the archlist is parsed from left to right and the first "matching" >> archspec is used (think: iptables rules) >> >> Eg. the difference for an Intel system reading >> >> syscall1 "ppc64" >> syscall2 "ppc64,!s390x" >> >> is that syscall1 never matches (x86_64 == ppc64 is false), but syscall2 >> does, (x86_64 == ppc64 || x86_64 == !s390x is true). > > That syscall2 example is confusing. Effectively it is the same as !s390x > isn't it? Yes, I should have chosen more "valid" rules, but my point was to illustrate the first-match logic on expressions. > >> >> Some more examples from the point of view of x86_64 using an alternate >> pseudocode: >> >>>> bind !s390x,ppc64,!all:32 >> >> (x86_64 not in (s390x) || x86_64 in (ppc64) || x86_64 not in (all:32)) >> would match on the first comparison > > Again, isn't this effectively !s390x? AFAICS i386 for example gets in because > of the first expression. Or does using all imply this logic? > > (arch not in s390x or arch in ppc64) and arch not 32bit? Indeed, i386 (like anything 32bit/64bit, which is not s390x) would match the first rule. > > BTW does s390x mean both s390x:64, s390x:32? It means "disregard MODE", meaning 64, 32 or any other value. 
>
>>
>>>> mmap2 ppc64:32,s390x:32,all:32
>> (x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32))
>> would match on the third comparison if MODE is 32
>>
>>>> uselib ppc64:32,s390x:32,all:32,all
>> (x86_64 in (ppc64:32) || x86_64 in (s390x:32) || x86_64 in (all:32) ||
>> x86_64 in (all))
>> would match on the third comparison if MODE is 32, match on the fourth
>> comparison otherwise
>>
>>
>> I'm open to modifications of the logic, I guess we could make the
>> inclusion more explicit, so that "!s390x" wouldn't mean anything without
>> "all" after it, adding explicit "all" on all lines, my original design
>> was to be more inclusive by default (and make use of !<arch> to exclude
>> architectures that are known to not have the syscall).
>
> Why wouldn't it be good to put AND between the expressions and accept
> "all" only for inclusion?
>
> So for example
>
> all:32,!i386 could mean everything 32-bit except i386

How would you specify, for example, "only s390x and ppc64"?
(arch == s390x && arch == ppc64) never matches.

>
> The current proposal seems quite confusing to me. I reread the previous email
> as well, but it didn't help.

The idea was to create a simple and easy-to-understand syntax. If it's
confusing, then it's not what I was looking for.

What about using the explicit approach that Linda pointed out?
Ie.

# run only on s390x and ppc64
syscall1 s390x,ppc64
# run on everything except s390x and ppc64
syscall2 !s390x,!ppc64,all
# don't run on anything
syscall3
# run on everything
syscall4 all
# run on everything 32-bit
syscall5 all:32
# run on everything non-64bit (basically same as above)
syscall6 !all:64,all

# not on s390x, everywhere on ppc64, others only 64bit
bind !s390x,ppc64,all:64
# not on s390x, everywhere on ppc64, others only non-32bit
bind !s390x,ppc64,!all:32,all
# everywhere but 64bit x86_64
socketcall !x86_64:64,all

The difference against my original proposal is that negation doesn't
automatically include the remainder from "all". IOW that !s390x doesn't
add anything, only stops the expression processing on s390x.
This "kind of" works as logical AND, but not really. :)

>
> Best regards,
> /M

Jiri
|
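The explicit rules in the example list above can be read as a plain first-match loop, much like the iptables analogy used earlier in the thread. Below is a minimal Python sketch of that reading. The function names `spec_applies` and `syscall_relevant`, and the idea of passing the current architecture and MODE as arguments, are assumptions made for illustration; they are not taken from the audit-test suite.

```python
def spec_applies(spec, arch, mode):
    """Return True if a bare archspec such as 's390x' or 'all:32' covers (arch, mode)."""
    name, _, wanted_mode = spec.partition(":")
    if name != "all" and name != arch:
        return False
    # An archspec without ":MODE" disregards MODE entirely.
    return wanted_mode == "" or wanted_mode == str(mode)


def syscall_relevant(archlist, arch, mode):
    """Evaluate a comma-separated archlist left to right; the first spec that applies decides.

    A plain spec that applies includes the syscall, a negated one ("!spec")
    excludes it and stops processing. Nothing matching means "not relevant".
    """
    for spec in (s.strip() for s in archlist.split(",")):
        if not spec:
            continue  # tolerate an empty archlist or stray commas
        negated = spec.startswith("!")
        bare = spec[1:] if negated else spec
        if spec_applies(bare, arch, mode):
            return not negated
    return False
```

With this sketch, `syscall_relevant("!s390x,ppc64,all:64", arch, mode)` is False on s390x, True on ppc64 in either mode, and True elsewhere only when MODE is 64, which is what the comment above the first `bind` example says. A negated spec never includes anything by itself, so `"!s390x"` alone evaluates to False everywhere.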
From: Miroslav V. <mva...@re...> - 2014-08-11 09:32:01
|
Hi,

On 08/11/2014 11:05 AM, Jiri Jaburek wrote:
> On 08/11/2014 07:17 AM, Miroslav Vadkerti wrote:
>> Why wouldn't it be good to put AND between the expressions and accept "all" only for inclusion?
>>
>> So for example
>>
>> all:32,!i386 could mean everything 32-bit except i386
>
> How would you specify, for example, "only s390x and ppc64"? (arch == s390x && arch == ppc64)
> never matches.

Yep, that would be obscure via negation. So maybe switch to AND logic only if 'all' is
specified. But I have given this less thought than you have, so I will let these thoughts be :)

> What about using the explicit approach that Linda pointed out? Ie.
>
> # run only on s390x and ppc64
> syscall1 s390x,ppc64
> # run on everything except s390x and ppc64
> syscall2 !s390x,!ppc64,all
> # don't run on anything
> syscall3
> # run on everything
> syscall4 all
> # run on everything 32-bit
> syscall5 all:32
> # run on everything non-64bit (basically same as above)
> syscall6 !all:64,all
>
> # not on s390x, everywhere on ppc64, others only 64bit
> bind !s390x,ppc64,all:64
> # not on s390x, everywhere on ppc64, others only non-32bit
> bind !s390x,ppc64,!all:32,all
> # everywhere but 64bit x86_64
> socketcall !x86_64:64,all
>
> The difference against my original proposal is that negation doesn't automatically include the
> remainder from "all". IOW that !s390x doesn't add anything, only stops the expression
> processing on s390x. This "kind of" works as logical AND, but not really. :)

This looks more understandable, and I like it more than the previous implicit 'all'.

Thanks for the explanation.

/M
|
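For contrast, the earlier "implicit all" behaviour can be sketched by reading the quoted pseudocode literally, as an OR over the terms in which a negated term is true whenever the architecture does not match it. This is only an illustration: the name `relevant_original` and the (arch, mode) arguments are assumptions, and the thread does not spell out how the original proposal treated the excluded architecture itself when a later negated term such as `!all:32` follows, so the sketch should not be relied on for that case.

```python
def term_true(term, arch, mode):
    """Truth value of one archlist term, e.g. 'ppc64', 'all:32' or '!s390x'."""
    negated = term.startswith("!")
    bare = term[1:] if negated else term
    name, _, wanted_mode = bare.partition(":")
    hit = name in ("all", arch) and wanted_mode in ("", str(mode))
    # A negated term is true exactly when the bare term does not cover us.
    return hit != negated


def relevant_original(archlist, arch, mode):
    """Literal OR reading of the quoted pseudocode (implicit inclusion on negation)."""
    terms = [t.strip() for t in archlist.split(",") if t.strip()]
    if not terms:
        return True  # a missing archlist meant "relevant everywhere"
    return any(term_true(t, arch, mode) for t in terms)
```

Under this reading `relevant_original("ppc64,!s390x", "x86_64", 64)` is True purely because of `!s390x`, and `"!s390x,ppc64,!all:32"` lets i386 in through its first term, which are the two cases questioned in the quoted exchange.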
From: Linda K. <lin...@hp...> - 2014-08-13 16:45:43
|
On 8/11/2014 5:05 AM, Jiri Jaburek wrote:
> What about using the explicit approach that Linda pointed out?
> Ie.
>
> # run only on s390x and ppc64
> syscall1 s390x,ppc64
> # run on everything except s390x and ppc64
> syscall2 !s390x,!ppc64,all
> # don't run on anything
> syscall3
> # run on everything
> syscall4 all
> # run on everything 32-bit
> syscall5 all:32
> # run on everything non-64bit (basically same as above)
> syscall6 !all:64,all
>
> # not on s390x, everywhere on ppc64, others only 64bit
> bind !s390x,ppc64,all:64
> # not on s390x, everywhere on ppc64, others only non-32bit
> bind !s390x,ppc64,!all:32,all
> # everywhere but 64bit x86_64
> socketcall !x86_64:64,all
>
> The difference against my original proposal is that negation doesn't
> automatically include the remainder from "all". IOW that !s390x doesn't
> add anything, only stops the expression processing on s390x.
> This "kind of" works as logical AND, but not really. :)

This works better for me. Thanks for all the examples.

-- ljk
|