From: John H. <jh...@sg...> - 2004-08-30 19:17:38
On Thu, Aug 26, 2004 at 08:53:49PM -0700, Andrew Morton wrote:
> Thanks, guys. So we now know that there are three potential
> implementations which do much the same thing, yes?

I believe CSA does more than the others.

> I didn't get a sense of a preferred direction, but at least nobody is
> flaming anybody else yet ;)
>
> It strikes me that CSA is the most actively developed and is the furthest
> along. But that enhancing BSD accounting might be the least intrusive and
> most back-compatible approach.
>
> Is that a fair summary? If not, what should I have said?

Does anyone know if CSA is a super-set of BSD accounting and ELSA?
What would be missing?

I'm unconvinced that enhancing BSD accounting to encompass the
capabilities of CSA is appropriate.

I think we can make the data collection additions common.
That should encompass the bulk of the invasive changes that
are required by at least CSA proper (ie there are still the PAGG
changes for job support that we can discuss separately).
Not sure about BSD accounting and ELSA.

With that cooperation, we can then either proceed with further
cooperation, or if the goals and users of the different accounting
approaches dictate different kernel modules and user support,
I'd propose that might be OK.

John
From: John H. <jh...@sg...> - 2004-08-26 18:45:34
On Thu, Aug 26, 2004 at 07:15:40PM +0200, Tim Schmielau wrote:
> ...
> IMHO CSA, ELSA and BSD accounting are too similar to have more than one of
> them in the kernel. We should either improve BSD accounting to do the job,
> or kill it in favor of a different implementation.
>
> Tim

We should at least have common data collection in the kernel.

I could more easily understand different accounting packages on top of
that that might meet different needs of different classes of users.

John
From: Tim S. <ti...@ph...> - 2004-08-27 08:27:40
On Thu, 26 Aug 2004, John Hesterberg wrote:

> On Thu, Aug 26, 2004 at 07:15:40PM +0200, Tim Schmielau wrote:
> > ...
> > IMHO CSA, ELSA and BSD accounting are too similar to have more than one of
> > them in the kernel. We should either improve BSD accounting to do the job,
> > or kill it in favor of a different implementation.
> >
> > Tim
>
> We should at least have common data collection in the kernel.
>
> I could more easily understand different accounting packages on top of
> that that might meet different needs of different classes of users.

Sorry, that is of course what I meant - I am only talking about kernel code.

Tim
From: John H. <jh...@sg...> - 2004-08-27 19:34:38
On Fri, Aug 27, 2004 at 07:42:18AM +0200, Guillaume Thouvenin wrote:
> On Thu, Aug 26, 2004 at 10:05:37PM +0200, Tim Schmielau wrote:
> >
> > It should be easy to combine the data collection enhancements from
> > CSA and ELSA to provide a common superset of information.
>
> ELSA uses current BSD accounting. The only difference with BSD is that
> accounting is done for a group of processes. I didn't use PAGG and
> rewrite something because I thought (I was wrong) that PAGG project
> wasn't maintained. I continue to maintain ELSA just because there is,
> until today, no solution for doing job accounting.
> So, the data collection enhancements from ELSA is not very useful.
>
> > With the new BSD acct v3 format, it should be possible to do per job
> > accounting entirely from userspace, using pid and ppid information to
> > reconstruct the process tree and some userland database for the
> > pid -> job mapping. It would, however, be greatly simplified if the
> > accounting records provided some kind of job id, and some indicator
> > whether or not this process was the last of a job (group).
>
> I like this solution.
> In fact what I proposed was to have PAGG and a modified BSD accounting
> that can be used with PAGG as both are already in the -mm tree. But
> manage group of processes from userspace is, IMHO, a better solution as
> modifications in the kernel will be minimal.

The kernel part of linux-job is a module that uses PAGG, and
isn't difficult. We've been running it in production for a
couple years.

I don't think a kernel-based job is a requirement, though,
so I'd like to hear more about how you'd do it otherwise.

The other comments about only one acct record per job vs one
per process might be important, and that might mean the kernel
has to know about the job.

> Therefore the solution could be to enhance BSD accounting with data
> collection from CSA and provide per job accounting with a userspace
> mechanism.

Sounds great to me...

>
> Best,
> Guillaume

How does the BSD accounting define jobs?
What determines the job that a process is part of?

An important aspect of linux-job (ie the job part of the pagg/job/csa
stack) is that it is inescapable. The user doesn't get to determine or
change their job (unlike process groups). For true accounting, that
determines the real $$$ chargebacks on shared machines, this is
necessary.

Another aspect of jobs that isn't directly related to accounting
is that it gives users and admins a way to query, and kill :-),
all the processes that are part of the job. The inescapable part
is again important...you can't fork off a process and detach it from
the job to hide it. In fact, I've heard that some sites use pagg/job
without CSA for this reason. It might have been an ISP or ASP, and
they liked the containment linux-job provided.

John
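
[Editorial sketch, not part of the original message.] The quoted idea of using pid and ppid from acct v3 records depends on being able to read those records in userspace. Here is a minimal sketch of unpacking one 64-byte record. The field layout is an assumption based on `struct acct_v3` as declared in `<linux/acct.h>` and should be checked against your own kernel headers. The names `ACCT_V3`, `AcctRecord`, `parse_record` and `read_records` are hypothetical, and the `comp_t` counters are left undecoded.

```python
import struct
from typing import Iterator, NamedTuple

# Assumed layout of struct acct_v3 (verify against <linux/acct.h>):
#   flag, version (1 byte each), tty (u16),
#   exitcode, uid, gid, pid, ppid, btime (u32 each),
#   etime (float), eight comp_t counters (raw u16), 16-byte command name.
ACCT_V3 = struct.Struct("=BBH6If8H16s")


class AcctRecord(NamedTuple):
    pid: int
    ppid: int
    uid: int
    comm: str


def parse_record(buf: bytes) -> AcctRecord:
    """Unpack one raw record and keep only the fields needed for a process tree."""
    fields = ACCT_V3.unpack(buf)
    # Field indices under the assumed layout: 4 = uid, 6 = pid, 7 = ppid, 18 = comm.
    comm = fields[18].split(b"\0", 1)[0].decode("ascii", "replace")
    return AcctRecord(pid=fields[6], ppid=fields[7], uid=fields[4], comm=comm)


def read_records(path: str) -> Iterator[AcctRecord]:
    """Yield records from an accounting file, stopping at a short trailing read."""
    with open(path, "rb") as f:
        while True:
            buf = f.read(ACCT_V3.size)
            if len(buf) < ACCT_V3.size:
                break
            yield parse_record(buf)
```

Under this layout the record size comes out to the 64 bytes discussed in the thread, which is also why there is no spare room for a job ID field.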
From: Tim S. <ti...@ph...> - 2004-08-30 08:30:10
On Fri, 27 Aug 2004, John Hesterberg wrote:

> On Fri, Aug 27, 2004 at 07:42:18AM +0200, Guillaume Thouvenin wrote:
> > On Thu, Aug 26, 2004 at 10:05:37PM +0200, Tim Schmielau wrote:
> >
> > > With the new BSD acct v3 format, it should be possible to do per job
> > > accounting entirely from userspace, using pid and ppid information to
> > > reconstruct the process tree and some userland database for the
> > > pid -> job mapping. It would, however, be greatly simplified if the
> > > accounting records provided some kind of job id, and some indicator
> > > whether or not this process was the last of a job (group).
> >
> > I like this solution.
> > In fact what I proposed was to have PAGG and a modified BSD accounting
> > that can be used with PAGG as both are already in the -mm tree. But
> > manage group of processes from userspace is, IMHO, a better solution as
> > modifications in the kernel will be minimal.
>
> The kernel part of linux-job is a module that uses PAGG, and
> isn't difficult. We've been running it in production for a
> couple years.

Well, I'm rethinking my opinion of not wanting two accounting methods in
the kernel. Make them share as much code as possible, with the only
remaining difference being the format of the record and whether it is
written per process or per job.
Then we just have to make sure that both mechanisms get exercised
regularly, to prevent bit-rot.

> I don't think a kernel-based job is a requirement, though,
> so I'd like to hear more about how you'd do it otherwise.
>
> The other comments about only one acct record per job vs one
> per process might be important, and that might mean the kernel
> has to know about the job.

Yes, it would probably be easier if the kernel knows about the job and
could stuff a job ID into the acct record.
If that means going from 64 byte records to 128 bytes, this would again
double the already larger overhead of BSD accounting, however.
This lightweightness of CSA is why I am not opposed to its inclusion.

On the other hand, there are a few uses (and users) of per-process
accounting records, e.g. for security auditing, so we should not back it
out of the kernel.

> How does the BSD accounting define jobs?
> What determines the job that a process is part of?

BSD accounting doesn't have the concept of a job at all. When we discussed
the v3 format, we considered adding a job ID field from PAGG, but a) nobody
answered and b) there wasn't any space left in the record anyways. So a
decision was postponed for a future 128 byte acct v4 structure.

> An important aspect of linux-job (ie the job part of the pagg/job/csa
> stack) is that it is inescapable. The user doesn't get to determine or
> change their job (unlike process groups). For true accounting, that
> determines the real $$$ chargebacks on shared machines, this is
> necessary.

My proposed solution for a userspace method would also be inescapable:
To start a new job, just write out its pid to a file that is only ever
appended to, and consider all children of it as belonging to the same job.
Inescapable, but probably some overhead in userspace.

Oh, wait - I think there is a problem in current BSD accounting, if the
parent process dies before the child, and the child gets reparented to
init. I should investigate that...

> Another aspect of jobs that isn't directly related to accounting
> is that it gives users and admins a way to query, and kill :-),
> all the processes that are part of the job. The inescapable part
> is again important...you can't fork off a process and detach it from
> the job to hide it. In fact, I've heard that some sites use pagg/job
> without CSA for this reason. It might have been an ISP or ASP, and
> they liked the containment linux-job provided.

Yes, it's probably a lot easier if you don't have to search accounting
files to do that.

So from my view, we might turn the discussion from whether we want CSA to
how we integrate it, i.e. do some code review.

Tim
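
[Editorial sketch, not part of the original message.] The userspace scheme described above, an append-only list of job root pids plus descent through ppid links, might look roughly like this. It is a sketch under stated assumptions: records are reduced to `(pid, ppid)` pairs, pids are not reused within one accounting file, and the name `assign_jobs` is hypothetical. It also makes the reparenting concern concrete: a child whose recorded ppid is init can no longer be tied to its job.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def assign_jobs(
    records: Iterable[Tuple[int, int]],
    job_roots: Iterable[int],
) -> Dict[int, Optional[int]]:
    """Map each recorded pid to the job-root pid it descends from, or None.

    records   -- (pid, ppid) pairs taken from per-process accounting records
    job_roots -- pids read from the append-only "job started" file
    """
    parent = {pid: ppid for pid, ppid in records}
    roots: Set[int] = set(job_roots)
    jobs: Dict[int, Optional[int]] = {}

    for pid in parent:
        seen: Set[int] = set()  # guards against cycles in malformed input
        cur: Optional[int] = pid
        job: Optional[int] = None
        while cur is not None and cur not in seen:
            if cur in roots:
                job = cur
                break
            seen.add(cur)
            # Walk one level up; None once we leave the recorded tree.
            cur = parent.get(cur)
        jobs[pid] = job
    return jobs


if __name__ == "__main__":
    # pid 100 starts a job; 101 and 102 descend from it.
    # pid 103 was reparented to init (ppid 1), so its job link is lost.
    recs = [(101, 100), (102, 101), (103, 1)]
    print(assign_jobs(recs, job_roots=[100]))
    # prints {101: 100, 102: 100, 103: None}
```

A job ID stamped into each record by the kernel would make the tree walk unnecessary and would survive reparenting, which is the trade-off against record size raised in the message.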