From: Till I. P. <ti...@in...> - 2002-12-19 00:46:25
|
Dear List(s), as part of my project I need to run a very high number of processes/threads on a Linux machine. Right now I have a Dual-PIII 1.4G w/ 8GB RAM -- I am running 4000 processes w/ 2-3 threads each, totaling a process count of 15000+ (since Linux doesn't really distinguish between threads and processes...). Once I pass the 10000 (+/-) processes, load increases drastically (on startup, although it returns to normal); however, the system time (on one processor) reaches 54% (12061 procs) while the only non-sleeping process is top -- the system is basically doing nothing (except scheduling the "nothing", which consumes significant system time). Is there anything I can do to reduce that system load/time? (I haven't been able to exactly define the "line", but it definitely gets worse the more processes need to be handled.) Do any of the patchsets address this particular problem? BTW: The processes are all alike... Thanks for your help! Immanuel |
From: Till I. P. <ti...@in...> - 2002-12-19 00:53:51
|
forgot the kernel version (2.4.20aa1)... |
From: William L. I. I. <wl...@ho...> - 2002-12-19 01:17:22
|
On Wed, Dec 18, 2002 at 04:53:45PM -0800, Till Immanuel Patzschke wrote: > forgot the kernel version (2.4.20aa1)... 2.4.20aa1 is missing some of the infrastructure to reduce the cpu consumption under high process count loads, but that's not going to help you anyway. 150K processes is not going to be feasible in the immediate future (months or longer away) so you'll have to figure out how to take that into account. Bill |
From: David L. <dl...@di...> - 2002-12-19 01:24:46
|
Also, top is very inefficient with large numbers of processes. Use vmstat or cat out the files in /proc to get the info more efficiently (it won't get you per-process info, but it won't cause the interference with your desired load that top gives you). David Lang |
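A minimal sketch of the "cat out the files in /proc" approach, assuming the usual `/proc/loadavg` layout in which the fourth field is `running/total` scheduling entities. The function name and the returned dictionary keys are illustrative choices, not part of any tool mentioned in the thread.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    Assumed layout: "load1 load5 load15 running/total last_pid".
    Returns the 1-minute load plus the running and total task counts.
    """
    fields = text.split()
    running, total = fields[3].split("/")
    return {"load1": float(fields[0]), "running": int(running), "total": int(total)}


def read_loadavg(path="/proc/loadavg"):
    # One open/read/close, regardless of how many tasks exist.
    with open(path) as f:
        return parse_loadavg(f.read())
```

Calling `read_loadavg()` on a Linux box gives the task count from a single small file, so the measurement itself does not scale with the number of processes.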
From: William L. I. I. <wl...@ho...> - 2002-12-19 01:27:24
|
On Wed, Dec 18, 2002 at 05:12:41PM -0800, David Lang wrote: > also top is very inefficient with large numbers of processes. use vmstat > or cat out the files in /proc to get the info more efficiently (it won't > get you per process info, but it won't cause the interference with your > desired load that top gives you.) It's mostly just the fact that top(1) doesn't scan /proc/ incrementally and that proc_pid_readdir() is quadratic in the number of tasks. Bill |
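To see why a directory read can become quadratic, here is a toy model, not the kernel code. It assumes tasks sit on a plain linked list and that every readdir-style call has to skip the entries already returned before it can emit its next batch. The batch size of 20 is an assumption for illustration.

```python
def walk_cost(n_tasks, entries_per_call):
    """Count list nodes visited when every call restarts from the list head.

    Hypothetical model: a call at position `offset` first walks past
    `offset` nodes, then returns up to `entries_per_call` entries.
    """
    visited = 0
    offset = 0
    while offset < n_tasks:
        batch = min(entries_per_call, n_tasks - offset)
        visited += offset + batch  # re-skip old entries, then read the batch
        offset += batch
    return visited


for n in (1000, 2000, 4000, 8000, 16000):
    print(n, walk_cost(n, entries_per_call=20))
```

In this model, doubling the task count roughly quadruples the nodes visited, which is the behaviour Bill describes for one full scan of /proc/.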
From: David L. <dl...@di...> - 2002-12-19 01:32:06
|
Ok, I wasn't sure of the cause, but I've seen this as far back as 2.2. I had a machine trying to run 2000 processes under 2.2 and 2.4.0 (after upping the 2.2 kernel limit) and top would cost me ~40% throughput on the machine (while claiming it was using ~5% of the CPU). David Lang |
From: William L. I. I. <wl...@ho...> - 2002-12-19 01:38:20
|
On Wed, Dec 18, 2002 at 05:20:02PM -0800, David Lang wrote: > Ok, I wasn't sure of the cause, but I've seen this as far back as 2.2. I > had a machine trying to run 2000 processes under 2.2 and 2.4.0 (after > upping the 2.2 kernel limit) and top would cost me ~40% throughput on the > machine (while claiming it was using ~5% of the CPU) > David Lang It wasn't really lying to you. The issue is that the kernel samples at regular intervals to avoid timer reprogramming overhead. Now top(1) is isochronous in nature as it's trying to periodically refresh, and so it runs in lockstep with the clock interrupt, and the kernel hands back bad numbers to top(1). Bill |
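The sampling effect can be illustrated with a small simulation. This is a sketch under stated assumptions: accounting charges a whole tick to whichever task is on the CPU at the instant the timer fires, and the task in question runs one fixed burst per tick period. The function name and the 10 ms tick are illustrative, not taken from the kernel source.

```python
def sampled_vs_actual(tick_us, start_offset_us, burst_us, n_ticks=1000):
    """Return (sampled_fraction, actual_fraction) of CPU time for a periodic task.

    The task starts a burst of `burst_us` microseconds `start_offset_us`
    after every tick. It is charged a full tick only if it happens to be
    running at the moment a tick fires.
    """
    charged_ticks = 0
    for i in range(n_ticks):
        tick_time = i * tick_us
        # Time elapsed since the task's most recent burst began.
        phase = (tick_time - start_offset_us) % tick_us
        if phase < burst_us:
            charged_ticks += 1
    return charged_ticks / n_ticks, burst_us / tick_us


# Burst begins just after each tick and ends before the next one.
print(sampled_vs_actual(10000, 100, 4000))
# Same burst length, but it straddles the tick.
print(sampled_vs_actual(10000, 7000, 4000))
```

In the first case the task uses 40% of the CPU yet is never on the CPU when a tick fires, so it is charged nothing. In the second it is charged every tick. A task synchronized with the clock can therefore be reported far from its real usage in either direction.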
From: Robert L. <rm...@te...> - 2002-12-19 01:43:08
|
On Wed, 2002-12-18 at 20:20, David Lang wrote: > Ok, I wasn't sure of the cause, but I've seen this as far back as 2.2. I > had a machine trying to run 2000 processes under 2.2 and 2.4.0 (after > upping the 2.2 kernel limit) and top would cost me ~40% throughput on the > machine (while claiming it was using ~5% of the CPU) Yah, a lot of it is like William is saying... you just do not want to read multiple files for each process in /proc when you have a kajillion processes, and that is what top does. Over and over. Work has gone into 2.5 to make this a lot better. If you use threads with NPTL in 2.5, a lot of this is resolved, since the sub-threads will not show up as /proc/#/ entries. Robert Love |
From: David L. <dav...@di...> - 2002-12-19 01:56:41
|
In my case I will still be running thousands of processes, so I have to just teach everyone not to use top instead. David Lang |
From: William L. I. I. <wl...@ho...> - 2002-12-19 02:07:27
|
On Wed, Dec 18, 2002 at 05:44:46PM -0800, David Lang wrote: > In my case I will still be running thousands of processes, so I have to > just teach everyone not to use top instead. > David Lang Well, a better solution would be a userspace free of /proc/ dependency. Or actually fixing the kernel. proc_pid_readdir() wants an efficiently indexable linear list, e.g. TAOCP's 6.2.3 "Linear List Representation". At that point its expense is proportional to the buffer size and "seeking" about the list as it is wont to do is O(lg(processes)). Bill |
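A minimal sketch of the "efficiently indexable" idea, with a sorted Python list and `bisect` standing in for whatever indexable structure the kernel would actually use. The function names and the batch size of 20 are hypothetical. Each call seeks to its resume point in O(lg n) and then does work proportional only to the batch it returns.

```python
import bisect


def read_batch(sorted_pids, last_pid, max_entries):
    """Return up to max_entries pids strictly greater than last_pid."""
    start = bisect.bisect_right(sorted_pids, last_pid)  # O(lg n) seek
    return sorted_pids[start:start + max_entries]       # O(batch) copy


def list_all(sorted_pids, max_entries=20):
    """Enumerate every pid through repeated bounded-size batches."""
    out, last = [], 0  # pids are positive, so 0 means "start of list"
    while True:
        batch = read_batch(sorted_pids, last, max_entries)
        if not batch:
            return out
        out.extend(batch)
        last = batch[-1]
```

Resuming by "last pid seen" rather than by position also stays well defined when tasks exit between calls.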
From: Denis V. <vd...@po...> - 2002-12-19 10:30:34
|
On 19 December 2002 00:05, William Lee Irwin III wrote: > On Wed, Dec 18, 2002 at 05:44:46PM -0800, David Lang wrote: > > In my case I will still be running thousands of processes, so I > > have to just teach everyone not to use top instead. > > David Lang > > Well, a better solution would be a userspace free of /proc/ > dependency. > > Or actually fixing the kernel. proc_pid_readdir() wants an > efficiently indexable linear list, e.g. TAOCP's 6.2.3 "Linear List > Representation". At that point its expense is proportional to the > buffer size and "seeking" about the list as it is wont to do is > O(lg(processes)). A short-time solution: run top d 30 to make it refresh only every 30 seconds. This will greatly reduce top's own load skew. -- vda |
From: William L. I. I. <wl...@ho...> - 2002-12-19 10:31:07
|
On 19 December 2002 00:05, William Lee Irwin III wrote: >> Well, a better solution would be a userspace free of /proc/ >> dependency. >> Or actually fixing the kernel. proc_pid_readdir() wants an >> efficiently indexable linear list, e.g. TAOCP's 6.2.3 "Linear List >> Representation". At that point its expense is proportional to the >> buffer size and "seeking" about the list as it is wont to do is >> O(lg(processes)). On Thu, Dec 19, 2002 at 01:05:03PM -0200, Denis Vlasenko wrote: > A short-time solution: run top d 30 to make it refresh only every 30 seconds. > This will greatly reduce top's own load skew. As userspace solutions go your suggestion is just as good. The kernel still needs to get its act together and with some urgency. Bill |
From: Denis V. <vd...@po...> - 2002-12-19 10:43:28
|
On 19 December 2002 08:27, William Lee Irwin III wrote: > On 19 December 2002 00:05, William Lee Irwin III wrote: > >> Well, a better solution would be a userspace free of /proc/ > >> dependency. > >> Or actually fixing the kernel. proc_pid_readdir() wants an > >> efficiently indexable linear list, e.g. TAOCP's 6.2.3 "Linear List > >> Representation". At that point its expense is proportional to the > >> buffer size and "seeking" about the list as it is wont to do is > >> O(lg(processes)). > > On Thu, Dec 19, 2002 at 01:05:03PM -0200, Denis Vlasenko wrote: > > A short-time solution: run top d 30 to make it refresh only every > > 30 seconds. This will greatly reduce top's own load skew. > > As userspace solutions go your suggestion is just as good. The > kernel still needs to get its act together and with some urgency. That was just a suggestion as to how to get a realistic picture of system load for Till Immanuel Patzschke <ti...@in...>. -- vda |
From: Alex T. <bz...@tm...> - 2002-12-19 10:46:38
|
>>>>> William Lee Irwin (WLI) writes: WLI> On 19 December 2002 00:05, William Lee Irwin III wrote: >>> Well, a better solution would be a userspace free of /proc/ >>> dependency. Or actually fixing the kernel. proc_pid_readdir() >>> wants an efficiently indexable linear list, e.g. TAOCP's 6.2.3 >>> "Linear List Representation". At that point its expense is >>> proportional to the buffer size and "seeking" about the list as >>> it is wont to do is O(lg(processes)). WLI> On Thu, Dec 19, 2002 at 01:05:03PM -0200, Denis Vlasenko wrote: >> A short-time solution: run top d 30 to make it refresh only every >> 30 seconds. This will greatly reduce top's own load skew. WLI> As userspace solutions go your suggestion is just as good. The WLI> kernel still needs to get its act together and with some WLI> urgency. What about retrieving info from /proc/kmem or something like that? Just to avoid binary -> text(proc) -> binary. |
From: William L. I. I. <wl...@ho...> - 2002-12-19 10:57:27
|
William Lee Irwin (WLI) writes: WLI> As userspace solutions go your suggestion is just as good. The WLI> kernel still needs to get its act together and with some WLI> urgency. On Thu, Dec 19, 2002 at 01:37:30PM +0300, Alex Tomas wrote: > what about retrieving info from /proc/kmem or something like? just to > avoid binary -> text(proc) -> binary That would also be an excellent userspace solution to this local DoS. Bill |
From: Martin J. B. <mb...@ar...> - 2002-12-19 15:24:50
|
> WLI> As userspace solutions go your suggestion is just as good. The > WLI> kernel still needs to get its act together and with some > WLI> urgency. > > what about retrieving info from /proc/kmem or something like? just to > avoid binary -> text(proc) -> binary The binary <-> text translation problem is less of an issue than all the syscall traffic, dcache hits, etc. Search the linux-kernel archives for a recent thread entitled "ps performance sucks" or something similar. M. |
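A rough back-of-the-envelope for the syscall traffic mentioned above. The three files per process and the open/read/close triple per file are assumed values for illustration, not measurements of any top(1) or ps(1) version.

```python
def syscalls_per_refresh(n_procs, files_per_proc=3, calls_per_file=3):
    """Estimate syscalls issued for one full scan of /proc.

    files_per_proc and calls_per_file (open, read, close) are assumptions
    chosen for illustration only.
    """
    return n_procs * files_per_proc * calls_per_file


print(syscalls_per_refresh(15000))
```

Under these assumptions a single refresh over 15000 tasks costs on the order of 135000 syscalls, before counting the directory reads themselves, so the text formatting is a small part of the bill.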
From: William L. I. I. <wl...@ho...> - 2002-12-19 01:26:27
|
On Wed, Dec 18, 2002 at 04:53:45PM -0800, Till Immanuel Patzschke wrote: >> forgot the kernel version (2.4.20aa1)... On Wed, Dec 18, 2002 at 05:15:41PM -0800, William Lee Irwin III wrote: > 2.4.20aa1 is missing some of the infrastructure to reduce the cpu > consumption under high process count loads, but that's not going to > help you anyway. 150K processes is not going to be feasible in the > immediate future (months or longer away) so you'll have to figure out > how to take that into account. Er, sorry, on a brief rereading my eyes deceived me and I thought an extra zero got in there. 15K is fine on 2.5 + patches. Bill |
From: Martin J. B. <mb...@ar...> - 2002-12-19 00:54:32
|
> as part of my project I need to run a very high number of processes/threads on a > linux machine. Right now I have a Dual-PIII 1.4G w/ 8GB RAM -- I am running > 4000 processes w/ 2-3 threads each totaling in a process count of 15000+ > processes (since Linux doesn't really distinguish between threads and > processes...). > Once I pass the 10000 (+/-) processes load increases drastically (on startup, > although it returns to normal), however the system time (on one processor) > reaches 54% (12061 procs) while the only non sleeping process is top -- the > system is basically doing nothing (except scheduling the "nothing" which > consumes significant system time). > Is there anything I can do to reduce that system load/time? (I haven't been > able to exactly define the "line" but it definitely gets worse the more processes > need to be handled.) You don't even specify what kernel you're using ... > Does any of the patchsets address this particular problem? Read the linux-kernel archives. M. |
From: Jeff G. <jg...@po...> - 2002-12-19 00:59:47
|
On Wed, Dec 18, 2002 at 04:46:15PM -0800, Till Immanuel Patzschke wrote: > Dear List(s), > > as part of my project I need to run a very high number of processes/threads on a > linux machine. Right now I have a Dual-PIII 1.4G w/ 8GB RAM -- I am running > 4000 processes w/ 2-3 threads each totaling in a process count of 15000+ > processes (since Linux doesn't really distinguish between threads and > processes...). > Once I pass the 10000 (+/-) processes load increases drastically (on startup, > although it returns to normal), however the system time (on one processor) > reaches 54% (12061 procs) while the only non sleeping process is top -- the > system is basically doing nothing (except scheduling the "nothing" which > consumes significant system time). > Is there anything I can do to reduce that system load/time? (I haven't been > able to exactly define the "line" but it definitely gets worse the more processes > need to be handled.) Redesign your program to not do silly things like this. Unless you have hardware with 5000 or more CPUs... Jeff |
From: William L. I. I. <wl...@ho...> - 2002-12-19 01:13:18
|
On Wed, Dec 18, 2002 at 04:46:15PM -0800, Till Immanuel Patzschke wrote: > as part of my project I need to run a very high number of > processes/threads on a linux machine. Right now I have a Dual-PIII > 1.4G w/ 8GB RAM -- I am running 4000 processes w/ 2-3 threads each > totaling in a process count of 15000+ processes (since Linux doesn't > really distinguish between threads and processes...). You're for the most part SOL unless you can either hack the support or can wait for it to be finished. More details below. On Wed, Dec 18, 2002 at 04:46:15PM -0800, Till Immanuel Patzschke wrote: > Once I pass the 10000 (+/-) processes load increases drastically (on > startup, although it returns to normal), however the system time (on > one processor) reaches 54% (12061 procs) while the only non > sleeping process is top -- the system is basically doing nothing > (except scheduling the "nothing" which consumes significant system > time). > Is there anything I can do to reduce that system load/time? (I > haven't been able to exactly define the "line" but it definitely gets > worse the more processes need to be handled.) > Does any of the patchsets address this particular problem? > BTW: The processes are all alike... > Thanks for your help! Try 2.5.52-mm1 + 2.5.52-wli-1. The -wli bits are orthogonal but they do a small bit to reduce the cpu inefficiencies of many-task loads. -wli is actually maintenance and follow-through on various early 2.5 promises. proc_pid_readdir() is the cpu culprit, which I have not yet addressed. You are also going to have severe memory management problems due to the number of L2 and L3 pagetables created as well as kernel stacks. 2.5.52-mm1 will have 2 of 3 possible things that can be done about L3 pagetables. L2 pagetables limit you to 64K processes with more practical limits around 16K. As 16K is feasible here, you are running the wrong kernel version(s). Bill |
From: Denis V. <vd...@po...> - 2002-12-19 10:16:31
|
On 18 December 2002 22:46, Till Immanuel Patzschke wrote: > Dear List(s), > > as part of my project I need to run a very high number of > processes/threads on a linux machine. Right now I have a Dual-PIII > 1.4G w/ 8GB RAM -- I am running 4000 processes w/ 2-3 threads each > totaling in a process count of 15000+ processes (since Linux doesn't > really distinguish between threads and processes...). BTW, can you say _what_ you are trying to do? > Once I pass the 10000 (+/-) processes load increases drastically (on > startup, although it returns to normal), however the system time (on > one processor) reaches 54% (12061 procs) while the only non > sleeping process is top -- the system is basically doing nothing > (except scheduling the "nothing" which consumes significant system > time). > Is there anything I can do to reduce that system load/time? (I > haven't been able to exactly define the "line" but it definitely gets > worse the more processes need to be handled.) > Does any of the patchsets address this particular problem? > BTW: The processes are all alike... You need to collect memory info (especially the lowmem and highmem situation) and maybe profile your kernel to find out where it spends that time doing "nothing". BTW, your .config? -- vda |
From: Rogier W. <R.E.Wolff@BitWizard.nl> - 2002-12-22 11:12:36
|
On Wed, Dec 18, 2002 at 04:46:15PM -0800, Till Immanuel Patzschke wrote: > Once I pass the 10000 (+/-) processes load increases drastically (on startup, > although it returns to normal), however the system time (on one processor) > reaches 54% (12061 procs) while the only non sleeping process is top -- the > system is basically doing nothing (except scheduling the "nothing" which > consumes significant system time). Top is a performance hog: It goes through all processes every 3 seconds. This involves reading /proc and calling the kernel: system time. Find out if your system also runs that much system time if you just grab the "system time" from a single proc file by hand. Roger. -- ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 ** *-- BitWizard writes Linux device drivers for any device you may have! --* * The Worlds Ecosystem is a stable system. Stable systems may experience * * excursions from the stable situation. We are currently in such an * * excursion: The stable situation does not include humans. *************** |
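One way to grab system time from a single proc file is sketched below. It assumes the aggregate `cpu` line of `/proc/stat` begins with user, nice, system and idle counters in that order. Newer kernels append more fields, which this sketch ignores, so the share it reports is relative to those four counters only. All function names are illustrative.

```python
import time


def parse_cpu_line(stat_text):
    """Return (user, nice, system, idle) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            user, nice, system, idle = (int(x) for x in fields[1:5])
            return user, nice, system, idle
    raise ValueError("no aggregate cpu line found")


def system_share(before, after):
    """Fraction of elapsed ticks spent in system mode between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    return deltas[2] / total if total else 0.0


def measure(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s apart and return the system-time share."""
    with open(path) as f:
        first = parse_cpu_line(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        second = parse_cpu_line(f.read())
    return system_share(first, second)
```

Running `measure()` with top stopped, and again with it running, would show how much of the reported system time belongs to the monitoring tool rather than to the workload.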