From: Jay L. <jl...@sg...> - 2005-02-19 00:51:45
Attachments:
acct_common
It started with CSA's need to handle end-of-process (eop) at do_exit() in exit.c. The hook in exit.c was specific to BSD accounting.

Since the need of Linux system accounting has gone beyond what BSD accounting provides, I think it is a good idea to create a thin layer of common code for various accounting packages, such as BSD accounting, CSA, ELSA, etc. The hook to do_exit() at exit.c was changed to invoke a routine in the common code which would then invoke those accounting packages that register to the acct_common to handle do_exit situation.

Here is the description of this acct_common patch:

1) Two new files at include/linux/acct_common.h and kernel/acct_common.c.

2) A new config flag CONFIG_ACCT_COMMON is created, and CONFIG_BSD_PROCESS_ACCT and a future CSA config flag depend on it. I think it is a good idea to always have acct_common in the kernel; in that case, the new config flag may not really be necessary. I can go either way.

3) Accounting packages can register themselves with acct_common for callbacks. Only do_exit handling is defined now. BSD acct.c has been modified to register/unregister with acct_common.

4) The 'enhanced acct data collection' routines have been moved from acct.c to acct_common.c. Files that used to #include <linux/acct.h> were modified to #include <linux/acct_common.h>.

This patch was generated against 2.6.11-rc3-mm2.

Signed-off-by: Jay Lan <jl...@sg...>
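The registration idea in item 3 can be sketched as a userspace model in Python. This is not the patch's kernel code: the names `AcctCommon`, `register`, `unregister` and `process_exit` are hypothetical, and a task is modeled as a plain dictionary.

```python
class AcctCommon:
    """Toy model of a thin common layer that fans one exit hook out to packages."""

    def __init__(self):
        # package name -> exit callback (hypothetical structure)
        self._exit_handlers = {}

    def register(self, name, on_exit):
        """An accounting package asks to be called when a task exits."""
        if name in self._exit_handlers:
            raise ValueError(f"{name} is already registered")
        self._exit_handlers[name] = on_exit

    def unregister(self, name):
        """Remove a package; unknown names are ignored."""
        self._exit_handlers.pop(name, None)

    def process_exit(self, task):
        """Stand-in for the single hook in do_exit(): call every registered package."""
        for on_exit in list(self._exit_handlers.values()):
            on_exit(task)


seen = []
common = AcctCommon()
common.register("bsd", lambda task: seen.append(("bsd", task["pid"])))
common.register("csa", lambda task: seen.append(("csa", task["pid"])))
common.process_exit({"pid": 42})   # both packages are called
common.unregister("bsd")
common.process_exit({"pid": 43})   # only "csa" is called now
```

The point of the sketch is only the shape of the indirection: one hook, a table of registered handlers, and a loop over that table on every exit.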
From: Andrew M. <ak...@os...> - 2005-02-19 01:11:41
Jay Lan <jl...@sg...> wrote:
>
> Since the need of Linux system accounting has gone beyond what BSD
> accounting provides, I think it is a good idea to create a thin layer
> of common code for various accounting packages, such as BSD accounting,
> CSA, ELSA, etc. The hook to do_exit() at exit.c was changed to invoke
> a routine in the common code which would then invoke those accounting
> packages that register to the acct_common to handle do_exit situation.

This all seems to be heading in the wrong direction. Do we really want to have lots of different system accounting packages all hooking into a generic we-cant-decide-what-to-do-so-we-added-some-pointless-overhead framework?

Can't we get _one_ accounting system in there, get it right, avoid the framework?
From: Guillaume T. <gui...@bu...> - 2005-02-21 06:53:55
On Fri, 2005-02-18 at 17:16 -0800, Andrew Morton wrote:
> Jay Lan <jl...@sg...> wrote:
> >
> > Since the need of Linux system accounting has gone beyond what BSD
> > accounting provides, I think it is a good idea to create a thin layer
> > of common code for various accounting packages, such as BSD accounting,
> > CSA, ELSA, etc. The hook to do_exit() at exit.c was changed to invoke
> > a routine in the common code which would then invoke those accounting
> > packages that register to the acct_common to handle do_exit situation.
>
> This all seems to be heading in the wrong direction. Do we really want to
> have lots of different system accounting packages all hooking into a
> generic we-cant-decide-what-to-do-so-we-added-some-pointless-overhead
> framework?
>
> Can't we get _one_ accounting system in there, get it right, avoid the
> framework?

Is it possible to just merge BSD accounting and CSA accounting by adding to the current BSD per-process accounting structure some missing fields, like the mm integral provided by the CSA patch?

ELSA is just a user of the accounting data. We need a hook in the do_fork() routine to manage groups of processes, not to do accounting.

Guillaume
From: Kaigai K. <ka...@ak...> - 2005-02-21 07:53:35
Hello, everyone.

Andrew Morton wrote:
> Jay Lan <jl...@sg...> wrote:
>
>>Since the need of Linux system accounting has gone beyond what BSD
>>accounting provides, I think it is a good idea to create a thin layer
>>of common code for various accounting packages, such as BSD accounting,
>>CSA, ELSA, etc. The hook to do_exit() at exit.c was changed to invoke
>>a routine in the common code which would then invoke those accounting
>>packages that register to the acct_common to handle do_exit situation.
>
>
> This all seems to be heading in the wrong direction. Do we really want to
> have lots of different system accounting packages all hooking into a
> generic we-cant-decide-what-to-do-so-we-added-some-pointless-overhead
> framework?
>
> Can't we get _one_ accounting system in there, get it right, avoid the
> framework?

I think there are two issues about the system accounting framework.

Issue: 1) How to define the appropriate unit for accounting?
  Current BSD-accounting makes a collection of per-process accounting information.
  CSA additionally makes a collection of per-process-aggregation accounting information.

  It is appropriate to make the fork-exit event handling framework for the definition of the process-aggregation, such as PAGG.

  This system-accounting per process-aggregation is quite useful, I thought, when I tried SGI's implementation named 'job' in past days.

Issue: 2) What items should be collected for accounting information?
  BSD-accounting collects PID/UID/GID, User/Sys/Elapsed-Time, and # of minor/major page faults. SGI's CSA additionally collects VM/RSS size at exit time, Integral-VM/RSS, and the amount of block-I/O.

  I think it's hard to implement the accounting-engine as a kernel loadable module using any kind of framework, because we must put callback functions all around the kernel for this purpose.

Thus, I make a proposal as follows:
  We should separate the process-aggregation functionality from the collection of accounting information.
  Some framework to implement process-aggregation is necessary.
  And the collection of accounting information should be merged into BSD-accounting and implemented as a part of the monolithic kernel, as Guillaume said.

Thanks.
--
Linux Promotion Center, NEC
KaiGai Kohei <ka...@ak...>
From: Jay L. <jl...@sg...> - 2005-02-22 20:25:49
Kaigai Kohei wrote:
> Hello, everyone.
>
> Andrew Morton wrote:
> > Jay Lan <jl...@sg...> wrote:
> >
> >>Since the need of Linux system accounting has gone beyond what BSD
> >>accounting provides, I think it is a good idea to create a thin layer
> >>of common code for various accounting packages, such as BSD accounting,
> >>CSA, ELSA, etc. The hook to do_exit() at exit.c was changed to invoke
> >>a routine in the common code which would then invoke those accounting
> >>packages that register to the acct_common to handle do_exit situation.
> >
> >
> > This all seems to be heading in the wrong direction. Do we really want to
> > have lots of different system accounting packages all hooking into a
> > generic we-cant-decide-what-to-do-so-we-added-some-pointless-overhead
> > framework?
> >
> > Can't we get _one_ accounting system in there, get it right, avoid the
> > framework?
>
> I think there are two issues about system accounting framework.
>
> Issue: 1) How to define the appropriate unit for accounting ?
> Current BSD-accounting make a collection per process accounting
> information.
> CSA make additionally a collection per process-aggregation accounting.

The 'enhanced acct data collection' patches that were added to 2.6.11-rc* tree still do collection of per process data. CSA added those per-process data to per-aggregation ("job") data structure at do_exit() time when a process terminates.

>
> It is appropriate to make the fork-exit event handling framework for
> definition of the process-aggregation, such as PAGG.
>
> This system-accounting per process-aggregation is quite useful,
> thought I tried the SGI's implementation named 'job' in past days.
>
>
> Issue: 2) What items should be collected for accounting information ?
> BSD-accounting collects PID/UID/GID, User/Sys/Elapsed-Time, and # of
> minor/major page faults. SGI's CSA collects VM/RSS size on exit time,
> Integral-VM/RSS, and amount of block-I/O additionally.

These data are now collected in 2.6.11-rc* code. Note that these data are still per-process.

>
> I think it's hard to implement the accounting-engine as a kernel loadable
> module using any kinds of framework. Because, we must put callback
> functions into all around the kernel for this purpose.
>
> Thus, I make a proposal as follows:
> We should separate the process-aggregation functionality and collecting
> accounting information.

I totally agree with this! Actually that was what we have done. The data collection part of code has been unified.

> Something of framework to implement process-aggregation is necessary.
> And, making a collection of accounting information should be merged
> into BSD-accounting and implemented as a part of monolithic kernel
> as Guillaume said.

This sounds good. I am interested in learning how ELSA saves off the per-process accounting data before the data got disposed. If that scheme works for CSA, we would be very happy to adopt the scheme. The current BSD scheme is very insufficient. The code is very BSD centric and it provides no way to handle process-aggregation.

Thanks,
 - jay

>
> Thanks.
From: Kaigai K. <ka...@ak...> - 2005-02-23 07:06:30
Hi, Thanks for your comments.

>> I think there are two issues about system accounting framework.
>>
>> Issue: 1) How to define the appropriate unit for accounting ?
>> Current BSD-accounting make a collection per process accounting
>> information.
>> CSA make additionally a collection per process-aggregation accounting.
>
>
> The 'enhanced acct data collection' patches that were added to
> 2.6.11-rc* tree still do collection of per process data.

Hmm, I had not noticed this extension, but I have now confirmed it. The following two patches of yours implement the enhanced data collection, don't they?

 - ChangeLog for 2.6.11-rc1
   [PATCH] enhanced I/O accounting data patch
   [PATCH] enhanced Memory accounting data collection

Since the collection of per-process accounting is unified into the stock kernel, I want to have a discussion about the remaining half, "How to define the appropriate unit for accounting?" We can agree that only per-process accounting is too rigid, I think. Then, process-aggregation should be provided in one way or another.

[1] Is a 'fork/exec/exit' event handling framework necessary?

A common agreement on the method of dealing with process aggregation has not been reached yet, as I understand it. And we will not be able to integrate each process aggregation model because of their diversity.

For example, a process which belongs to JOB-A must not belong to any other 'JOB-X' in the CSA model. But in the ELSA model, a process in BANK-B can concurrently belong to BANK-B1, which is a child of BANK-B.

And there are other differences:
  Whether a process not belonging to any process-aggregation is permitted or not?
  Whether a process-aggregation should be inherited by child processes or not?
  (There is a possibility that it is not inherited in a rule-based process aggregation like CKRM.)

Each process-aggregation model has its own philosophy and implementation, so it's hard to integrate them. Thus, I think that a common 'fork/exec/exit' event handling framework is needed to implement any kind of process-aggregation.

[2] What implementation should be adopted?

I think registerable hooks on fork/execve/exit are necessary, not only an exit() hook, because a rule- or policy-based process-aggregation model requires catching the transitions of a process's status. It might be enough to hook only the exit() event for process accounting, but that is not kind to other customers.

Thus, I recommend SGI's PAGG.

In my understanding, the reason for not including such a framework is the worry about an increase of unidentifiable (proprietary) modules. But SI can divert LSM to implement process-aggregation if they ignore the LSM's original purpose, for example.
# I'm strongly opposed to such a movement as a SELinux user :-)

So, I think such fork/execve/exit hooks are harmless now. Is this the time to unify it?

Thanks.

> CSA added those per-process data to per-aggregation ("job") data
> structure at do_exit() time when a process terminates.
>
>>
>> It is appropriate to make the fork-exit event handling framework for
>> definition of the process-aggregation, such as PAGG.
>>
>> This system-accounting per process-aggregation is quite useful,
>> thought I tried the SGI's implementation named 'job' in past days.
>>
>>
>> Issue: 2) What items should be collected for accounting information ?
>> BSD-accounting collects PID/UID/GID, User/Sys/Elapsed-Time, and # of
>> minor/major page faults. SGI's CSA collects VM/RSS size on exit time,
>> Integral-VM/RSS, and amount of block-I/O additionally.
>
>
> These data are now collected in 2.6.11-rc* code. Note that these data
> are still per-process.
>
>>
>> I think it's hard to implement the accounting-engine as a kernel loadable
>> module using any kinds of framework. Because, we must put callback
>> functions into all around the kernel for this purpose.
>>
>> Thus, I make a proposal as follows:
>> We should separate the process-aggregation functionality and collecting
>> accounting information.
>
>
> I totally agree with this! Actually that was what we have done. The data
> collection part of code has been unified.
>
>> Something of framework to implement process-aggregation is necessary.
>> And, making a collection of accounting information should be merged
>> into BSD-accounting and implemented as a part of monolithic kernel
>> as Guillaume said.
>
>
> This sounds good. I am interested in learning how ELSA saves off
> the per-process accounting data before the data got disposed. If
> that scheme works for CSA, we would be very happy to adopt the
> scheme. The current BSD scheme is very insufficient. The code is
> very BSD centric and it provides no way to handle process-aggregation.
>
> Thanks,
> - jay
>
>>
>> Thanks.

--
Linux Promotion Center, NEC
KaiGai Kohei <ka...@ak...>
From: Andrew M. <ak...@os...> - 2005-02-23 07:21:02
Kaigai Kohei <ka...@ak...> wrote:
>
> The common agreement for the method of dealing with process aggregation
> has not been constructed yet, I understood. And, we will not able to
> integrate each process aggregation model because of its diverseness.
>
> For example, a process which belong to JOB-A must not belong any other
> 'JOB-X' in CSA-model. But, In ELSA-model, a process in BANK-B can concurrently
> belong to BANK-B1 which is a child of BANK-B.
>
> And, there are other differences:
> Whether a process not to belong to any process-aggregation is permitted or not ?
> Whether a process-aggregation should be inherited to child process or not ?
> (There is possibility not to be inherited in a rule-based process aggregation like CKRM)
>
> Some process-aggregation model have own philosophy and implementation,
> so it's hard to integrate. Thus, I think that common 'fork/exec/exit' event handling
> framework to implement any kinds of process-aggregation.

We really want to avoid doing such stuff in-kernel if at all possible, of course.

Is it not possible to implement the fork/exec/exit notifications to userspace so that a daemon can track the process relationships and perform aggregation based upon individual tasks' accounting? That's what one of the accounting systems is proposing doing, I believe.

(In fact, why do we even need the notifications? /bin/ps can work this stuff out).
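The membership difference quoted above (a CSA job is exclusive, while an ELSA bank and its child bank can both hold the same process) can be illustrated with a small userspace model in Python. The class `Aggregations`, its `exclusive` flag and the rule that an exclusive container blocks membership in both directions are assumptions made for the sketch; neither CSA nor ELSA is known to be implemented this way.

```python
class Aggregations:
    """Toy bookkeeping for process aggregations with two membership models."""

    def __init__(self):
        self.exclusive = {}   # container name -> True if members may be nowhere else
        self.members = {}     # container name -> set of pids

    def create(self, name, exclusive):
        self.exclusive[name] = exclusive
        self.members[name] = set()

    def containers_of(self, pid):
        return {n for n, pids in self.members.items() if pid in pids}

    def add(self, name, pid):
        """Return True if pid was added, False if an exclusive rule forbids it."""
        others = self.containers_of(pid) - {name}
        # Assumption: exclusivity applies whether the new or the old container is exclusive.
        if others and (self.exclusive[name] or any(self.exclusive[o] for o in others)):
            return False
        self.members[name].add(pid)
        return True


agg = Aggregations()
agg.create("JOB-A", exclusive=True)     # CSA-like job
agg.create("BANK-B", exclusive=False)   # ELSA-like bank
agg.create("BANK-B1", exclusive=False)  # child bank of BANK-B in the example

in_job = agg.add("JOB-A", 100)          # allowed
also_bank = agg.add("BANK-B", 100)      # refused: pid 100 is in an exclusive job
in_bank = agg.add("BANK-B", 200)        # allowed
in_child_bank = agg.add("BANK-B1", 200) # allowed: banks may overlap
```

Both models fit in the same few lines of userspace bookkeeping here, which is in line with the suggestion that the policy itself need not live in the kernel.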
From: Guillaume T. <gui...@bu...> - 2005-02-23 08:34:21
On Tue, 2005-02-22 at 23:20 -0800, Andrew Morton wrote:
> Kaigai Kohei <ka...@ak...> wrote:
> >
> > The common agreement for the method of dealing with process aggregation
> > has not been constructed yet, I understood. And, we will not able to
> > integrate each process aggregation model because of its diverseness.
> >
> > For example, a process which belong to JOB-A must not belong any other
> > 'JOB-X' in CSA-model. But, In ELSA-model, a process in BANK-B can concurrently
> > belong to BANK-B1 which is a child of BANK-B.
> >
> > And, there are other differences:
> > Whether a process not to belong to any process-aggregation is permitted or not ?
> > Whether a process-aggregation should be inherited to child process or not ?
> > (There is possibility not to be inherited in a rule-based process aggregation like CKRM)
> >
> > Some process-aggregation model have own philosophy and implementation,
> > so it's hard to integrate. Thus, I think that common 'fork/exec/exit' event handling
> > framework to implement any kinds of process-aggregation.

I can add "policies". With ELSA, a process belongs to one or several groups, and if a process is removed from one group, its children still belong to the group. Thus a good idea could be to associate a "philosophy" with a group. For example, when a group of processes is created it can be tagged as UNIQUE or SHARED. UNIQUE means that a process that belongs to it cannot be added to another group, as opposed to SHARED. It's not needed inside the kernel.

> We really want to avoid doing such stuff in-kernel if at all possible, of
> course.
>
> Is it not possible to implement the fork/exec/exit notifications to
> userspace so that a daemon can track the process relationships and perform
> aggregation based upon individual tasks' accounting? That's what one of
> the accounting systems is proposing doing, I believe.

It's what I'm proposing. The problem is to be alerted when a new process is created, in order to add it to the correct group of processes if the parent belongs to one (or several) groups. The notification can be done with the fork connector patch.

> (In fact, why do we even need the notifications? /bin/ps can work this
> stuff out).

Yes it can, but the risk is to lose some forks, no? I think that /bin/ps is using the /proc interface. If we're polling /proc to catch process creation we may lose some of them. With the fork connector we catch all forks, and we can check that by using the sequence number (incremented by each fork) of the message.

Guillaume
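A rough userspace sketch of the two ideas in this message: a child inherits its parent's groups when a fork notification arrives, and a gap in the sequence numbers reveals lost forks. The event shape `(seq, parent, child)` and the names `ForkTracker`, `add_to_group` and `on_fork` are hypothetical; the real fork connector message format is not reproduced here.

```python
class ForkTracker:
    """Tracks group membership from fork notifications and counts lost events."""

    def __init__(self):
        self.groups = {}      # pid -> set of group names
        self.last_seq = None  # sequence number of the last event seen
        self.lost = 0         # number of fork events known to be missing

    def add_to_group(self, pid, group):
        self.groups.setdefault(pid, set()).add(group)

    def on_fork(self, seq, parent, child):
        # A jump of more than one in the sequence number means events were missed.
        if self.last_seq is not None and seq > self.last_seq + 1:
            self.lost += seq - self.last_seq - 1
        self.last_seq = seq
        # The child joins every group its parent belongs to, if any.
        parent_groups = self.groups.get(parent)
        if parent_groups:
            self.groups[child] = set(parent_groups)


tracker = ForkTracker()
tracker.add_to_group(100, "bank-b")
tracker.on_fork(1, 100, 101)   # 101 inherits "bank-b"
tracker.on_fork(2, 101, 102)   # 102 inherits it from 101
tracker.on_fork(5, 200, 201)   # events 3 and 4 never arrived
```

A daemon that polls /proc has no equivalent of `lost`: a short-lived process that forks and exits between two polls simply never shows up.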
From: Tim S. <ti...@ph...> - 2005-02-23 09:52:53
On Tue, 22 Feb 2005, Andrew Morton wrote:

> We really want to avoid doing such stuff in-kernel if at all possible, of
> course.
>
> Is it not possible to implement the fork/exec/exit notifications to
> userspace so that a daemon can track the process relationships and perform
> aggregation based upon individual tasks' accounting? That's what one of
> the accounting systems is proposing doing, I believe.
>
> (In fact, why do we even need the notifications? /bin/ps can work this
> stuff out).

I had started a proof of concept implementation that could reconstruct the whole process tree from userspace just from the BSD accounting currently in the kernel (+ the conceptual bug-fix that I misnamed "[RFC] "biological parent" pid"). This could do the whole job ID thing from userspace. Unfortunately, I haven't had time to work on it recently.

Also, doing per-job accounting might actually be more lightweight than per-process accounting, so I'm not at all opposed to unifying CSA and BSD accounting into one mechanism that just writes different file formats.

A complete framework seems like overkill to me, too.

Tim
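The userspace reconstruction described here might look roughly like the following Python sketch. It assumes each accounting record carries a pid, a parent pid and some CPU time, and that a job is identified by the pid of its first process; the record layout and the names `job_totals` and `job_roots` are hypothetical and do not reflect the BSD accounting file format.

```python
from collections import defaultdict


def job_totals(records, job_roots):
    """Sum CPU time per job by walking each process up to a known job root.

    records:   list of dicts with "pid", "ppid" and "cpu" keys (assumed layout)
    job_roots: dict mapping the pid that started a job to a job id
    """
    parent = {r["pid"]: r["ppid"] for r in records}

    def job_of(pid):
        visited = set()
        # Follow parent links until a job root is found or the chain ends.
        while pid is not None and pid not in visited:
            if pid in job_roots:
                return job_roots[pid]
            visited.add(pid)
            pid = parent.get(pid)
        return None  # process is not part of any known job

    totals = defaultdict(float)
    for r in records:
        totals[job_of(r["pid"])] += r["cpu"]
    return dict(totals)


records = [
    {"pid": 10, "ppid": 1, "cpu": 1.0},
    {"pid": 11, "ppid": 10, "cpu": 2.0},
    {"pid": 12, "ppid": 11, "cpu": 0.5},
    {"pid": 20, "ppid": 1, "cpu": 4.0},
]
totals = job_totals(records, {10: "job-a"})
```

The sketch ignores pid reuse, which a real tool would have to handle, for example by also comparing process start times.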
From: Jay L. <jl...@sg...> - 2005-02-24 22:27:46
Tim Schmielau wrote:
> On Tue, 22 Feb 2005, Andrew Morton wrote:
>
>
>>We really want to avoid doing such stuff in-kernel if at all possible, of
>>course.
>>
>>Is it not possible to implement the fork/exec/exit notifications to
>>userspace so that a daemon can track the process relationships and perform
>>aggregation based upon individual tasks' accounting? That's what one of
>>the accounting systems is proposing doing, I believe.
>>
>>(In fact, why do we even need the notifications? /bin/ps can work this
>>stuff out).
>
>
>
> I had started a proof of concept implementation that could reconstruct the
> whole process tree from userspace just from the BSD accounting currently
> in the kernel (+ the conceptual bug-fix that I misnamed "[RFC] "biological
> parent" pid"). This could do the whole job ID thing from userspace.
> Unfortunately, I haven't had time to work on it recently.
>
> Also, doing per-job accounting might actually be more lightweight than
> per-process accounting, so I'm not at all opposed to unifying CSA and BSD
> accounting into one mechanism that just writes different file formats.

Thanks, Tim!

After spending some time studying how ELSA works, it appears to me that CSA still needs a hook for do_exit. Since people agreed that a complete framework was overkill, I would be glad to submit another patch later just to provide CSA exit handling inside acct_process().

Thanks,
 - jay

>
> A complete framework seems like overkill to me, too.
>
> Tim
>
>
> _______________________________________________
> Lse-tech mailing list
> Lse...@li...
> https://lists.sourceforge.net/lists/listinfo/lse-tech
From: Kaigai K. <ka...@ak...> - 2005-02-23 11:28:45
Hi, Thanks for your comments.

Andrew Morton wrote:
>> Some process-aggregation model have own philosophy and implementation,
>> so it's hard to integrate. Thus, I think that common 'fork/exec/exit' event handling
>> framework to implement any kinds of process-aggregation.
>
>
> We really want to avoid doing such stuff in-kernel if at all possible, of
> course.
>
> Is it not possible to implement the fork/exec/exit notifications to
> userspace so that a daemon can track the process relationships and perform
> aggregation based upon individual tasks' accounting? That's what one of
> the accounting systems is proposing doing, I believe.
>
> (In fact, why do we even need the notifications? /bin/ps can work this
> stuff out).

It's hard to prove that we can't implement process-aggregation in user space only, but there are some difficulties in the implementation, I think.

For example, each process must have a tag or another identifier to indicate which process-aggregation it belongs to, but the kernel does not currently support that kind of information. Thus, we can't guarantee associating one process-aggregation with one process.
# /proc/<pid>/loginuid might be a candidate, but that is outside its original purpose.

We might be able to make a similar system, but isn't it hard to implement strict process-aggregation without any kernel support?

I think that a well thought out kernel modification is better than an ad-hoc implementation in user space.

Thanks.
--
Linux Promotion Center, NEC
KaiGai Kohei <ka...@ak...>
From: Jay L. <jl...@sg...> - 2005-02-23 20:50:33
Kaigai Kohei wrote:
> Hi, Thanks for your comments.
>
> >> I think there are two issues about system accounting framework.
> >>
> >> Issue: 1) How to define the appropriate unit for accounting ?
> >> Current BSD-accounting make a collection per process accounting
> >> information.
> >> CSA make additionally a collection per process-aggregation accounting.
> >
> >
> > The 'enhanced acct data collection' patches that were added to
> > 2.6.11-rc* tree still do collection of per process data.
>
> Hmm, I have not noticed this extension. But I made sure about it.
> The following your two patches implements enhanced data collection,
> didn't it?

Yes!

>
> - ChangeLog for 2.6.11-rc1
> [PATCH] enhanced I/O accounting data patch
> [PATCH] enhanced Memory accounting data collection
>
> Since making a collection per process accounting is unified to the stock
> kernel, I want to have a discussion about remaining half, "How to define
> the appropriate unit for accounting ?"
> We can agree that only per process-accounting is so rigid, I think.
> Then, process-aggregation should be provided in one way or another.
>
> [1] Is it necessary 'fork/exec/exit' event handling framework ?
>
> The common agreement for the method of dealing with process aggregation
> has not been constructed yet, I understood. And, we will not able to
> integrate each process aggregation model because of its diverseness.
>
> For example, a process which belong to JOB-A must not belong any other
> 'JOB-X' in CSA-model. But, In ELSA-model, a process in BANK-B can
> concurrently belong to BANK-B1 which is a child of BANK-B.
>
> And, there are other differences:
> Whether a process not to belong to any process-aggregation is permitted
> or not ?
> Whether a process-aggregation should be inherited to child process or not ?
> (There is possibility not to be inherited in a rule-based process
> aggregation like CKRM)

Guillaume answered this question, and I think a policy would work.

>
> Some process-aggregation model have own philosophy and implementation,
> so it's hard to integrate. Thus, I think that common 'fork/exec/exit'
> event handling framework to implement any kinds of process-aggregation.

BSD needs an exit hook and ELSA needs a fork hook. I am still evaluating whether CSA can use the ELSA module. If CSA can use the ELSA module, CSA would maybe be fine with the fork hook.

>
>
> [2] What implementation should be adopted ?
>
> I think registerable hooks on fork/execve/exit is necessary, not only
> exit() hook.
> Because a rule or policy based process-aggregation model requires to catch
> the transition of a process status.
> It might be enough to hook the exit() event only in process-accounting,
> but it's not kind for another customer.
>
> Thus, I recommend SGI's PAGG.
>
> In my understanding, the reason for not to include such a framework is that
> increase of unidentifiable (proprietary) modules is worried.

If we code the hooks explicitly in the kernel, such as in acct.c, then the concern about unidentifiable modules should be taken care of.

A registerable framework was my preference. But if that causes concern, it would be fine with me to do it the explicit way. An example is for acct_process() to invoke do_acct_process(). That means whoever intends to use even an existing hook needs to present their case, I guess.

Thanks,
 - jay

> But, SI can divert LSM to implement process-aggregation if they ignore
> the LSM's original purpose, for example.
> # I'm strongly opposed to such a movement as a SELinux's user :-)
>
> So, I think such a fork/execve/exit hooks is harmless now.
> Is this the time to unify it?
>
> Thanks.
>
> > CSA added those per-process data to per-aggregation ("job") data
> > structure at do_exit() time when a process terminates.
> >
> >>
> >> It is appropriate to make the fork-exit event handling framework for
> >> definition of the process-aggregation, such as PAGG.
> >>
> >> This system-accounting per process-aggregation is quite useful,
> >> thought I tried the SGI's implementation named 'job' in past days.
> >>
> >>
> >> Issue: 2) What items should be collected for accounting information ?
> >> BSD-accounting collects PID/UID/GID, User/Sys/Elapsed-Time, and # of
> >> minor/major page faults. SGI's CSA collects VM/RSS size on exit time,
> >> Integral-VM/RSS, and amount of block-I/O additionally.
> >
> >
> > These data are now collected in 2.6.11-rc* code. Note that these data
> > are still per-process.
> >
> >>
> >> I think it's hard to implement the accounting-engine as a kernel loadable
> >> module using any kinds of framework. Because, we must put callback
> >> functions into all around the kernel for this purpose.
> >>
> >> Thus, I make a proposal as follows:
> >> We should separate the process-aggregation functionality and collecting
> >> accounting information.
> >
> >
> > I totally agree with this! Actually that was what we have done. The data
> > collection part of code has been unified.
> >
> >> Something of framework to implement process-aggregation is necessary.
> >> And, making a collection of accounting information should be merged
> >> into BSD-accounting and implemented as a part of monolithic kernel
> >> as Guillaume said.
> >
> >
> > This sounds good. I am interested in learning how ELSA saves off
> > the per-process accounting data before the data got disposed. If
> > that scheme works for CSA, we would be very happy to adopt the
> > scheme. The current BSD scheme is very insufficient. The code is
> > very BSD centric and it provides no way to handle process-aggregation.
> >
> > Thanks,
> > - jay
> >
> >>
> >> Thanks.
>
From: Kaigai K. <ka...@ak...> - 2005-02-25 05:07:28
Sorry for this late reply.

>> [1] Is it necessary 'fork/exec/exit' event handling framework ?
...<omitted>...
>> Some process-aggregation model have own philosophy and implementation,
>> so it's hard to integrate. Thus, I think that common 'fork/exec/exit'
>> event handling framework to implement any kinds of process-aggregation.
>
>
> BSD needs an exit hook and ELSA needs a fork hook. I am still
> evaluating whether CSA can use the ELSA module. If CSA can use the
> ELSA module, CSA maybe would be fine with the fork hook.

If CSA can use an ELSA module, then we must modify the kernel tree for ELSA's fork connector. This means it's hard to implement fork/exec/exit event notification to userspace (or to any kernel module) without kernel support.

How CSA should be implemented is interesting and important, but shouldn't the main subject of this discussion be whether such a kind of kernel hook is necessary to implement per-process-aggregation accounting reasonably?

In my understanding, what Andrew Morton said is "If target functionality can implement in user space only, then we should not modify the kernel-tree". But, from our discussion, some kind of kernel support is required to handle process lifecycle events for per-process-aggregation accounting and so on.

I'm also opposed to an ad-hoc approach, like CSA depending on ELSA. We should take the high road.

Thanks,
--
Linux Promotion Center, NEC
KaiGai Kohei <ka...@ak...>
From: Andrew M. <ak...@os...> - 2005-02-25 05:29:37
Kaigai Kohei <ka...@ak...> wrote:
>
> In my understanding, what Andrew Morton said is "If target functionality can
> implement in user space only, then we should not modify the kernel-tree".

fork, exec and exit upcalls sound pretty good to me. As long as

a) they use the same common machinery and

b) they are next-to-zero cost if something is listening on the netlink
   socket but no accounting daemon is running.

Question is: is this sufficient for CSA?
From: Jay L. <jl...@sg...> - 2005-02-25 17:32:08
Andrew Morton wrote:
> Kaigai Kohei <ka...@ak...> wrote:
>
>>In my understanding, what Andrew Morton said is "If target functionality can
>> implement in user space only, then we should not modify the kernel-tree".
>
>
> fork, exec and exit upcalls sound pretty good to me. As long as
>
> a) they use the same common machinery and
>
> b) they are next-to-zero cost if something is listening on the netlink
> socket but no accounting daemon is running.
>
> Question is: is this sufficient for CSA?

Yes, fork, exec, and exit upcalls are sufficient for CSA.

The framework I proposed earlier should satisfy your requirements a and b, and provides the upcalls needed by BSD, ELSA and CSA. Maybe I misunderstood your concern about the 'very light weight' framework I proposed, besides its being "overkill"?

 - jay
From: Chris W. <ch...@os...> - 2005-02-25 17:46:09
* Jay Lan (jl...@sg...) wrote:
> Andrew Morton wrote:
> >Kaigai Kohei <ka...@ak...> wrote:
> >
> >>In my understanding, what Andrew Morton said is "If target functionality
> >>can implement in user space only, then we should not modify the kernel-tree".
> >
> >
> >fork, exec and exit upcalls sound pretty good to me. As long as
> >
> >a) they use the same common machinery and
> >
> >b) they are next-to-zero cost if something is listening on the netlink
> > socket but no accounting daemon is running.
> >
> >Question is: is this sufficient for CSA?
>
> Yes, fork, exec, and exit upcalls are sufficient for CSA.

As soon as you want to throttle tasks at the Job level, this would be insufficient. But, IIRC, that's not one of PAGG/Job/CSA's requirements, right?

thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
From: Jay L. <jl...@sg...> - 2005-02-25 18:11:17
|
Chris Wright wrote: > * Jay Lan (jl...@sg...) wrote: > >>Andrew Morton wrote: >> >>>Kaigai Kohei <ka...@ak...> wrote: >>> >>> >>>>In my understanding, what Andrew Morton said is "If target functionality >>>>can >>>>implement in user space only, then we should not modify the kernel-tree". >>> >>> >>>fork, exec and exit upcalls sound pretty good to me. As long as >>> >>>a) they use the same common machinery and >>> >>>b) they are next-to-zero cost if something is listening on the netlink >>> socket but no accounting daemon is running. >>> >>>Question is: is this sufficient for CSA? >> >>Yes, fork, exec, and exit upcalls are sufficient for CSA. > > > As soon as you want to throttle tasks at the Job level, this would be > insufficient. But, IIRC, that's not one of PAGG/Job/CSA's requirements > right? PAGG serves more than JOB+CSA. I am looking into the possibility/feasibility of implementing JOB in userspace. However, even with JOB as a kernel module, the fork, exec and exit upcalls would be sufficient to support JOB+CSA. Thanks, - jay > > thanks, > -chris |
From: Andrew M. <ak...@os...> - 2005-02-25 21:31:38
|
Jay Lan <jl...@sg...> wrote: > > Andrew Morton wrote: > > Kaigai Kohei <ka...@ak...> wrote: > > > >>In my understanding, what Andrew Morton said is "If target functionality can > >> implement in user space only, then we should not modify the kernel-tree". > > > > > > fork, exec and exit upcalls sound pretty good to me. As long as > > > > a) they use the same common machinery and > > > > b) they are next-to-zero cost if something is listening on the netlink > > socket but no accounting daemon is running. > > > > Question is: is this sufficient for CSA? > > Yes, fork, exec, and exit upcalls are sufficient for CSA. > > The framework i proposed earlier should satisfy your requirement a > and b, and provides upcalls needed by BSD, ELSA and CSA. Maybe i > misunderstood your concern of the 'very light weight' framework > i proposed besides being "overkill"? "upcall" is poorly defined. What I meant was that ELSA can perform its function when the kernel merely sends asynchronous notifications of forks out to userspace via netlink. Further, I'm wondering if CSA can perform its function with the same level of kernel support, perhaps with the addition of netlink-based notification of exec and exit as well. The framework patch which you sent was designed to permit the addition of more kernel accounting code, which is heading in the opposite direction. In other words: given that ELSA can do its thing via existing accounting interfaces and a fork notifier, why does CSA need to add lots more kernel code? |
From: Jay L. <jl...@sg...> - 2005-02-25 22:18:12
|
Andrew Morton wrote: > Jay Lan <jl...@sg...> wrote: > >>Andrew Morton wrote: >> > Kaigai Kohei <ka...@ak...> wrote: >> > >> >>In my understanding, what Andrew Morton said is "If target functionality can >> >> implement in user space only, then we should not modify the kernel-tree". >> > >> > >> > fork, exec and exit upcalls sound pretty good to me. As long as >> > >> > a) they use the same common machinery and >> > >> > b) they are next-to-zero cost if something is listening on the netlink >> > socket but no accounting daemon is running. >> > >> > Question is: is this sufficient for CSA? >> >> Yes, fork, exec, and exit upcalls are sufficient for CSA. >> >> The framework i proposed earlier should satisfy your requirement a >> and b, and provides upcalls needed by BSD, ELSA and CSA. Maybe i >> misunderstood your concern of the 'very light weight' framework >> i proposed besides being "overkill"? > > > "upcall" is poorly defined. > > What I meant was that ELSA can perform its function when the kernel merely > sends asynchronous notifications of forks out to userspace via netlink. > > Further, I'm wondering if CSA can perform its function with the same level > of kernel support, perhaps with the addition of netlink-based notification > of exec and exit as well. > > The framework patch which you sent was designed to permit the addition of > more kernel accounting code, which is heading in the opposite direction. > > In other words: given that ELSA can do its thing via existing accounting > interfaces and a fork notifier, why does CSA need to add lots more kernel > code? Here is some code from do_exit(), starting at line 813 (based on 2.6.11-rc4-mm1):

    813     acct_update_integrals(tsk);
    814     update_mem_hiwater(tsk);
    815     group_dead = atomic_dec_and_test(&tsk->signal->live);
    816     if (group_dead) {
    817             del_timer_sync(&tsk->signal->real_timer);
    818             acct_process(code);
    819     }
    820     exit_mm(tsk);

acct_process() is called at line 818 to save off the BSD accounting data.
At the next statement, line 820, tsk->mm is disposed of and all data saved in tsk->mm is gone, including the memory high-water mark information saved at line 814. The whole tsk is disposed of before do_exit() returns. In a separate email thread among interested parties, I asked Guillaume to clarify this question. I suspect ELSA counts on BSD's acct_process() at line 818 to save most accounting data. If that is the case, and since ELSA wants extended accounting data collection, a way to save the extended acct data would be essential to ELSA as well. I can better answer your "why ELSA can do it but CSA can't" question after I learn more from Guillaume. Later, - jay |
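The ordering constraint Jay describes (copy out the accounting data before exit_mm() discards tsk->mm) can be illustrated with a minimal userspace sketch. Everything here is a hypothetical stand-in: `task_sim`, `mm_sim`, `acct_record`, `acct_snapshot()` and `exit_mm_sim()` are invented for illustration and are not the kernel's structures or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mm_sim {
    unsigned long hiwater_rss;      /* memory high-water mark */
};

struct task_sim {
    struct mm_sim *mm;              /* becomes NULL after teardown */
};

struct acct_record {
    unsigned long hiwater_rss;
    int valid;                      /* 1 if the snapshot saw a live mm */
};

/* Copy out what accounting needs while tsk->mm is still alive. */
static void acct_snapshot(const struct task_sim *tsk, struct acct_record *rec)
{
    assert(tsk != NULL && rec != NULL);
    rec->valid = (tsk->mm != NULL);
    rec->hiwater_rss = tsk->mm ? tsk->mm->hiwater_rss : 0;
}

/* Plays the role of exit_mm(): afterwards the mm data is unreachable. */
static void exit_mm_sim(struct task_sim *tsk)
{
    free(tsk->mm);
    tsk->mm = NULL;
}

/* Same ordering as the excerpt: save accounting data, then tear down. */
static struct acct_record exit_sim(struct task_sim *tsk)
{
    struct acct_record rec;

    acct_snapshot(tsk, &rec);       /* must happen before teardown */
    exit_mm_sim(tsk);
    return rec;
}
```

If the two calls in `exit_sim()` were swapped, the snapshot would see a NULL `mm` and record nothing, which is the situation an exit-time accounting hook has to avoid.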
From: Marcelo T. <mar...@cy...> - 2005-02-27 14:11:08
|
On Thu, Feb 24, 2005 at 09:28:39PM -0800, Andrew Morton wrote: > Kaigai Kohei <ka...@ak...> wrote: > > > > In my understanding, what Andrew Morton said is "If target functionality can > > implement in user space only, then we should not modify the kernel-tree". > > fork, exec and exit upcalls sound pretty good to me. As long as > > a) they use the same common machinery and > > b) they are next-to-zero cost if something is listening on the netlink > socket but no accounting daemon is running. b) would involve being able to avoid sending netlink messages when there are no listeners. AFAIK that isn't possible currently; netlink sends packets unconditionally. Am I wrong? |
From: KaiGai K. <ka...@ak...> - 2005-02-27 15:20:21
|
Hi, >>Kaigai Kohei <ka...@ak...> wrote: >> >>>In my understanding, what Andrew Morton said is "If target functionality can >>> implement in user space only, then we should not modify the kernel-tree". >> >>fork, exec and exit upcalls sound pretty good to me. As long as >> >>a) they use the same common machinery and >> >>b) they are next-to-zero cost if something is listening on the netlink >> socket but no accounting daemon is running. > > > b) would involved being able to avoid sending netlink messages in case there are > no listeners. AFAIK that isnt possible currently, netlink sends > packets unconditionally. > > Am I wrong? In the current implementation, you might be right. But we should make an effort to achieve requirement (b) from now on. And, why can't netlink packets send always? If there are fork/exec/exit hooks, and they call CSA or other process-grouping modules, then those modules will decide whether packets for interaction with the daemon should be sent or not. In the most likely case, CSA's loadable kernel module using such hooks will not be loaded when no accounting daemon is running. Conversely, this module must be loaded when the accounting daemon needs CSA's netlink packets. Thus, it is only necessary to check a flag variable and execute a conditional jump when no accounting daemon is running. In my estimation, the additional cost we must pay is an increment operation, a decrement operation, a comparison and a conditional jump. That is lightweight enough, I think. For example: if CSA's module isn't loaded, 'privates_for_grouping' will be empty.

    static inline int on_fork_hook(struct task_struct *parent,
                                   struct task_struct *newtask)
    {
            rcu_read_lock();
            /* Empty list: no grouping module is attached, nothing to do. */
            if (!list_empty(&parent->privates_for_grouping)) {
                    /* ..<call any process-grouping module>.. */
            }
            rcu_read_unlock();
            return 0;
    }

Thanks, -- Linux Promotion Center, NEC KaiGai Kohei <ka...@ak...> |
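The cost argument above (an empty per-task list means the hook does almost nothing) can be shown with a small userspace sketch. It is only an illustration under stated assumptions: `task_sim`, `hook_node`, `on_fork_hook_sim()` and `count_fork()` are hypothetical names, a plain singly linked list stands in for the kernel list, and the RCU locking from the message is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback entry a process-grouping module would attach. */
struct hook_node {
    void (*on_fork)(int parent_pid, int child_pid);
    struct hook_node *next;
};

/* Hypothetical task: 'hooks' plays the role of privates_for_grouping. */
struct task_sim {
    int pid;
    struct hook_node *hooks;        /* NULL when no module is attached */
};

static int fork_events;             /* incremented by the sample callback */

static void count_fork(int parent_pid, int child_pid)
{
    (void)parent_pid;
    (void)child_pid;
    fork_events++;
}

/* Returns how many callbacks ran; the empty case costs one comparison. */
static int on_fork_hook_sim(const struct task_sim *parent,
                            struct task_sim *child)
{
    int called = 0;

    assert(parent != NULL && child != NULL);
    child->hooks = NULL;
    if (parent->hooks == NULL)      /* fast path: nothing registered */
        return 0;

    for (const struct hook_node *n = parent->hooks; n != NULL; n = n->next) {
        n->on_fork(parent->pid, child->pid);
        called++;
    }
    child->hooks = parent->hooks;   /* child inherits the parent's grouping */
    return called;
}
```

Whether the child should inherit the parent's list is a policy choice made here only to keep the sketch complete; the message does not say how inheritance would work.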
From: Marcelo T. <mar...@cy...> - 2005-02-27 18:25:34
|
On Mon, Feb 28, 2005 at 12:20:40AM +0900, KaiGai Kohei wrote: > Hi, > > >>Kaigai Kohei <ka...@ak...> wrote: > >> > >>>In my understanding, what Andrew Morton said is "If target functionality > >>>can > >>>implement in user space only, then we should not modify the kernel-tree". > >> > >>fork, exec and exit upcalls sound pretty good to me. As long as > >> > >>a) they use the same common machinery and > >> > >>b) they are next-to-zero cost if something is listening on the netlink > >> socket but no accounting daemon is running. > > > > > >b) would involved being able to avoid sending netlink messages in case > >there are no listeners. AFAIK that isnt possible currently, netlink sends > >packets unconditionally. > > > >Am I wrong? > > In current implementaion, you might be right. > But we should make an effort to achieve the requirement-(b) from now. Yep, the netlink people should be able to help - they know what would be required for not sending messages in case there is no listener registered. Maybe it's already possible? I have never used netlink myself. > And, why can't netlink packets send always? > If there are fork/exec/exit hooks, and they call CSA or other > process-grouping modules, > then those modules will decide whether packets for interaction with the > daemon should be > sent or not. The netlink data will be sent to userspace at the fork/exec/exit hooks - one wants to avoid that if there are no listeners, so setups which don't want to run the accounting daemon don't pay the cost of building and sending the information through netlink. That's what Andrew asked for, if I understand correctly. > In most considerable case, CSA's kernel-loadable-module using such hooks > will not be loaded > when no accounting daemon is running. Adversely, this module must be loaded > when accounting > daemon needs CSA's netlink packets. > Thus, it is only necessary to refer flag valiable and to execute > conditional-jump > when no-accounting daemon is running. 
That would be one hack, although it is uglier than the pure netlink selection. > In my estimation, we must pay additional cost for an increment-operation, > an decrement-op, > an comparison-op and an conditional jump-op. It's enough lightweight, I > think. > > For example: > If CSA's module isn't loaded, 'privates_for_grouping' will be empty. > > inline int on_fork_hook(task_struct *parent, task_struct *newtask){ > rcu_read_lock(); > if( !list_empty(&parent->privates_for_grouping) ){ > ..<Calling to any process grouping module>..; > } > rcu_read_unlock(); > } Andrew has been talking about sending data over netlink to implement the accounting at userspace, so this piece of code is out of the game, no? |
From: David S. M. <da...@da...> - 2005-02-27 19:28:45
|
On Sun, 27 Feb 2005 11:03:55 -0300 Marcelo Tosatti <mar...@cy...> wrote: > Yep, the netlink people should be able to help - they known what would be > required for not sending messages in case there is no listener registered. Please CC: ne...@os... to get some netlink discussions going if wanted. |
From: Kaigai K. <ka...@ak...> - 2005-02-28 01:58:27
|
Hello, Marcelo Tosatti wrote: > Yep, the netlink people should be able to help - they known what would be > required for not sending messages in case there is no listener registered. > > Maybe its already possible? I have never used netlink myself. If we notify user space of the fork/exec/exit events directly, as you said, I don't think any hacking on netlink is necessary. For example, such packets could be sent only when /proc/sys/.../process_grouping is set; the user-side daemon sets this value and unsets it when the daemon exits. It's not something we need to take too seriously. >>And, why can't netlink packets send always? >>If there are fork/exec/exit hooks, and they call CSA or other >>process-grouping modules, >>then those modules will decide whether packets for interaction with the >>daemon should be >>sent or not. > > > The netlink data will be sent to userspace at fork/exec/exit hooks - one wants > to avoid that if there are no listeners, so setups which dont want to run the > accounting daemon dont pay the cost of building and sending the information > through netlink. > > Thats what Andrew asked for if I understand correctly. Does it mean "netlink packets should be sent to userspace unconditionally"? I have steadfastly advocated that fork/exec/exit hooks are necessary to support process grouping and per-group accounting. In my understanding, the intent is that the hooked functions decide whether packets should be sent or not. Isn't whether to use a netlink socket or not just one of the implementation choices? >>In most considerable case, CSA's kernel-loadable-module using such hooks >>will not be loaded >>when no accounting daemon is running. Adversely, this module must be loaded >>when accounting >>daemon needs CSA's netlink packets. >>Thus, it is only necessary to refer flag valiable and to execute >>conditional-jump >>when no-accounting daemon is running. > > > That would be one hack, although it is uglier than the pure netlink > selection. 
No, I can't agree with this opinion. It means netlink packets will be sent unconditionally whenever fork/exec/exit occurs. Nobody can decide which packets are sent to userspace, I think. In addition, the definition of a process group is lightweight in many cases. For example, CpuSet can define its own process group with one increment operation. I think it's not impossible to implement this in userspace, but it's not reasonable. An implementation as a loadable kernel module is reasonable and tiny enough. >>In my estimation, we must pay additional cost for an increment-operation, >>an decrement-op, >>an comparison-op and an conditional jump-op. It's enough lightweight, I >>think. >> >>For example: >>If CSA's module isn't loaded, 'privates_for_grouping' will be empty. >> >>inline int on_fork_hook(task_struct *parent, task_struct *newtask){ >> rcu_read_lock(); >> if( !list_empty(&parent->privates_for_grouping) ){ >> ..<Calling to any process grouping module>..; >> } >> rcu_read_unlock(); >>} > > > Andrew has been talking about sending data over netlink to implement the > accounting at userspace, so this piece of code is out of the game, no? Indeed, I'm not opposed to implementing the accounting in userspace and using a netlink socket for kernel-daemon communication. But the definition of process grouping, based on whatever grouping policy, should be done in kernel space, because that is the more reasonable place for it. Thanks. -- Linux Promotion Center, NEC KaiGai Kohei <ka...@ak...> |
From: Thomas G. <tg...@su...> - 2005-02-28 02:32:04
|
First of all, I'm not aware of the whole discussion, so ignore this if it has already been brought up. > > Yep, the netlink people should be able to help - they known what would be > > required for not sending messages in case there is no listener registered. > > > > Maybe its already possible? I have never used netlink myself. The easiest way is to use netlink_broadcast() and have userspace register to a netlink multicast group (set .nl_groups before connecting the socket). The netlink message will be sent only to those netlink sockets assigned to the group; no message will be sent if no userspace listener has registered. Did you have a look at the syscall enter/exit audit netlink hooks before trying to invent your own thing? I can also give you some code if you want; I use it to track the path of skbs in the net stack. It puts events into a preallocated ring buffer and a separate kernel thread broadcasts them over netlink. The events can be enqueued in any context, at the cost of a possible ring buffer overrun resulting in loss of events. It's just a debugging hack though.
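On the userspace side, registering to a netlink multicast group comes down to setting `nl_groups` in `struct sockaddr_nl` before the socket is bound. The following is a minimal sketch rather than code from this thread: the helper names `nl_group_mask()` and `open_group_listener()` are invented here, the protocol and group numbers an accounting notifier would use are not specified anywhere in the discussion, and binding to a multicast group may require privileges depending on the protocol.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* nl_groups is a 32-bit mask: group N (1-based, N <= 32) is bit N-1. */
static uint32_t nl_group_mask(unsigned int group)
{
    return (group >= 1 && group <= 32) ? (UINT32_C(1) << (group - 1)) : 0;
}

/*
 * Open a netlink socket subscribed to one multicast group.
 * 'protocol' and 'group' are placeholders chosen by the caller.
 * Returns the socket descriptor, or -1 on failure.
 */
static int open_group_listener(int protocol, unsigned int group)
{
    struct sockaddr_nl addr;
    int fd;

    assert(protocol >= 0);
    fd = socket(AF_NETLINK, SOCK_RAW, protocol);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;                        /* let the kernel pick a port id */
    addr.nl_groups = nl_group_mask(group);  /* subscribe before bind() */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A daemon would call something like `open_group_listener(NETLINK_USERSOCK, 1)` and then read events with `recv()`; `NETLINK_USERSOCK` is used here only because it is a protocol reserved for userspace experiments, not because the thread proposes it.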