From: Hubertus F. <fr...@wa...> - 2002-03-05 14:50:31
|
Can somebody post why this patch shouldn't be picked up ? The attached program shows the problem in user space and the patch is almost trivial .. -- Hubertus ---------- Forwarded Message ---------- Subject: [Lse-tech] get_pid() performance fix Date: Tue, 26 Feb 2002 18:58:51 -0500 From: "Rajan Ravindran" <ra...@us...> To: lse...@li... Cc: lin...@vg..., da...@su... Paul Larson posted a patch which fixes the get_pid() hang, if we run out of available pids. Nevertheless, we have observed that it takes a long time to find the next available pid once the last pid reaches the PID_MAX. This is due to a constant rescan of the task list while progressing only one pid at time, yielding an O(n**2) problem. Here is a patch (together with Paul's fix) which eliminates the time taken to search for an available pid, by not constantly restarting the search through the entire task list. Attached is also a user level test program getpid.c), by which one can simulate creation and deletion of processes. This application will measure the time taken for the get_pid() routine to find the next available process. example: getpid -p 32770 -n 3 which will try to create 32770 process, eventually get_pid can't find any free pid after 32767, so it will delete 3 pids randomly from the existing list and start calculating the time taken to find the available pid (which we deleted). In getpid.c the new fixes are inside the #define NEW_METHOD. Try compiling the getpid.c with and without -DNEW_METHOD compile option to see the performance improvement. here is an example output for the old and the new algorithm and their respective execution time to find a new pid. (See attached file: output) This can/should be applied to 2.5 and 2.4 kernels. (See attached file: getpid.c) Thanks, Rajan diff -Naur linux-2.5.5/kernel/fork.c linux-2.5.5-new/kernel/fork.c --- linux-2.5.5/kernel/fork.c Tue Feb 19 21:10:55 2002 +++ linux-2.5.5-new/kernel/fork.c Tue Feb 26 15:46:36 2002 @@ -24,6 +24,7 @@ #include <linux/file.h> #include <linux/binfmts.h> #include <linux/fs.h> +#include <linux/compiler.h> #include <asm/pgtable.h> #include <asm/pgalloc.h> @@ -129,12 +130,13 @@ { static int next_safe = PID_MAX; struct task_struct *p; - int pid; + int pid, beginpid; if (flags & CLONE_PID) return current->pid; spin_lock(&lastpid_lock); + beginpid = last_pid; if((++last_pid) & 0xffff8000) { last_pid = 300; /* Skip daemons etc. */ goto inside; @@ -153,13 +155,18 @@ if(last_pid & 0xffff8000) last_pid = 300; next_safe = PID_MAX; + goto repeat; } - goto repeat; + if(unlikely(last_pid == beginpid)) + goto nomorepids; + continue; } if(p->pid > last_pid && next_safe > p->pid) next_safe = p->pid; if(p->pgrp > last_pid && next_safe > p->pgrp) next_safe = p->pgrp; + if(p->tgid > last_pid && next_safe > p->tgid) + next_safe = p->tgid; if(p->session > last_pid && next_safe > p->session) next_safe = p->session; } @@ -169,6 +176,11 @@ spin_unlock(&lastpid_lock); return pid; + +nomorepids: + read_unlock(&tasklist_lock); + spin_unlock(&lastpid_lock); + return 0; } static inline int dup_mmap(struct mm_struct * mm) @@ -667,6 +679,8 @@ copy_flags(clone_flags, p); p->pid = get_pid(clone_flags); + if (p->pid == 0 && current->pid != 0) + goto bad_fork_cleanup; INIT_LIST_HEAD(&p->run_list); ------------------------------------------------------- -- -- Hubertus Franke (fr...@wa...) |
From: OGAWA H. <hir...@ma...> - 2002-03-05 16:26:51
|
Hubertus Franke <fr...@wa...> writes: > @@ -153,13 +155,18 @@ > if(last_pid & 0xffff8000) > last_pid = 300; > next_safe = PID_MAX; > + goto repeat; > } > - goto repeat; > + if(unlikely(last_pid == beginpid)) > + goto nomorepids; > + continue; It isn't guaranteed that pid is unique. In the case: task->pid = 300, task->xxx = 301 pid 301 is free This get_pid() returns 301. Regards. -- OGAWA Hirofumi <hir...@ma...> |
From: Hubertus F. <fr...@wa...> - 2002-03-05 16:43:10
|
On Tuesday 05 March 2002 11:26 am, OGAWA Hirofumi wrote: > Hubertus Franke <fr...@wa...> writes: > > @@ -153,13 +155,18 @@ > > if(last_pid & 0xffff8000) > > last_pid = 300; > > next_safe = PID_MAX; > > + goto repeat; > > } > > - goto repeat; > > + if(unlikely(last_pid == beginpid)) > > + goto nomorepids; > > + continue; > > It isn't guaranteed that pid is unique. > > In the case: > task->pid = 300, task->xxx = 301 > pid 301 is free > > This get_pid() returns 301. > > Regards. No the point of this patch was to limit the search time for finding the next available pid. We are not mocking around with the logic that declares a pid available or not. That stays the same. However, one doesn't need to start every single time from the beginning to find the next available pid. -- -- Hubertus Franke (fr...@wa...) |
From: Rusty R. <ru...@ru...> - 2002-03-07 03:53:19
|
On Mon, 4 Mar 2002 20:57:49 -0500 Hubertus Franke <fr...@wa...> wrote: > > Can somebody post why this patch shouldn't be picked up ? > The attached program shows the problem in user space > and the patch is almost trivial .. At a cursory glance, this seems to be three patches: 1) Fix the get_pid() hang. 2) Speed up get_pid(). 3) And this piece I'm not sure about: > + if(p->tgid > last_pid && next_safe > p->tgid) > + next_safe = p->tgid; Please split, and send the fix get_pid() hang to trivial patch monkey, and push optimization to Linus. Cheers! Rusty. -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell. |
From: Hubertus F. <fr...@wa...> - 2002-03-07 14:34:26
|
On Wednesday 06 March 2002 10:56 pm, Rusty Russell wrote: > On Mon, 4 Mar 2002 20:57:49 -0500 > > Hubertus Franke <fr...@wa...> wrote: > > Can somebody post why this patch shouldn't be picked up ? > > The attached program shows the problem in user space > > and the patch is almost trivial .. > > At a cursory glance, this seems to be three patches: > 1) Fix the get_pid() hang. > 2) Speed up get_pid(). > > 3) And this piece I'm not sure about: > > + if(p->tgid > last_pid && next_safe > p->tgid) > > + next_safe = p->tgid; > > Please split, and send the fix get_pid() hang to trivial patch monkey, > and push optimization to Linus. > > Cheers! > Rusty. Thanks, patch was bad and not properly functioning as pointed out to us. We are rewriting right now (actually <ra...@us...> is doing the coding. I am just there for idea bouncing easy if the office is 2 doors away. 1) was done by Greg Larson and was already submitted 2) once properly done, we will circulate before bothering Linus again 3) this must have come in because of a wrong patch generation. -- -- Hubertus Franke (fr...@wa...) |
From: Guest s. DW <dw...@wi...> - 2002-03-07 14:55:03
|
On Thu, Mar 07, 2002 at 09:35:09AM -0500, Hubertus Franke wrote: ... Long ago I submitted a patch that changed the max pid from 15 bits to 24 or 30 bits or so. (And of course removed the inefficiency noticed by some people in the current thread.) Probably this is a good moment to try and see what Linus thinks about this today. [Of course Albert Cahalan will object that this is bad for the columns of ps output. Maybe Alan will mutter something about sysvipc. Roughly speaking there are only advantages, especially since I think we'll have to do this sooner or later, and in such cases sooner is better.] |
From: Dave M. <dm...@us...> - 2002-03-07 15:23:06
|
--On Thursday, March 07, 2002 09:35:09 AM -0500 Hubertus Franke <fr...@wa...> wrote: >> At a cursory glance, this seems to be three patches: >> 1) Fix the get_pid() hang. >> 2) Speed up get_pid(). >> >> 3) And this piece I'm not sure about: >> > + if(p->tgid > last_pid && next_safe > p->tgid) >> > + next_safe = p->tgid; > 1) was done by Greg Larson and was already submitted > 2) once properly done, we will circulate before bothering Linus again > 3) this must have come in because of a wrong patch generation. Actually that was Paul Larson, but close enough :) 3) is actually a separate bugfix. I had a patch accepted some time back that added the check for tgid, but missed adding it to the section below. Paul caught it when he did his patch and added it for me. Dave McCracken ====================================================================== Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059 dm...@us... T/L 678-3059 |
From: Hubertus F. <fr...@wa...> - 2002-03-07 19:07:13
|
On Thursday 07 March 2002 09:54 am, Guest section DW wrote: > On Thu, Mar 07, 2002 at 09:35:09AM -0500, Hubertus Franke wrote: > ... > > Long ago I submitted a patch that changed the max pid from 15 bits to > 24 or 30 bits or so. (And of course removed the inefficiency noticed > by some people in the current thread.) > Probably this is a good moment to try and see what Linus thinks > about this today. > > [Of course Albert Cahalan will object that this is bad for the columns > of ps output. Maybe Alan will mutter something about sysvipc. > Roughly speaking there are only advantages, especially since > I think we'll have to do this sooner or later, and in such cases > sooner is better.] I don't think that will solve the N^2 problem you still have the algorithm do the following: if (++last_pid > next_safe) { repeat: last_pid++; foralltasks p: deal with wraparound; if (p uses last_pid) goto repeat determine next_safe } [ last_pid ... next_safe ) is the range that can be saftely used By extending it to 24 or larger bits all you do is handle the wraparound at some later point and less frequent It also becomes expensive if a large number of threads is present. Well, the problem can be fixed. Even in the current 16 bit approach. What we are experimenting around is a <mark-and-sweep> algorithm that would be invoked if the nr_threads is above a threshold. The algorithm would do something like this. Rajan will code it up and see its efficacy. if (last_pid >= next_safe) { inside: if (nr_threads > threshold) { // constant last_pid = get_pid_map(last_pid,&next_safe); } else { .. <as now> } } static unsigned long pid_map[1<<12]; /* determine a range of pids that is available for sure * [ last_pid .. next_safe ) * pid_map has stale information. some pids might be marked * as used even if they had been freed in the meantime */ int get_pid_map(int last_pid, int *next_safe) { int again = 1; repeat: for_each_task(p) mark_pids_in_bitmap; last_pid = ffz(pid_map); /* we will start from last_pid with wraparound */ if ((last_pid == -1) && (again)) { again = 0; memset(pid_map,0); goto repeat } } next_safe = first_non_zero(pid_map,last_pid); /* starting from last_pid */ return last_pid; } Note, if the pid map is to large, it can be done in smaller sections or windows. Also, note keeping stale information is OK actually desirable, as it avoids the sweep in almost all cases. -- -- Hubertus Franke (fr...@wa...) |
From: Richard B. J. <ro...@ch...> - 2002-03-07 19:44:26
|
On Thu, 7 Mar 2002, Hubertus Franke wrote: > On Thursday 07 March 2002 09:54 am, Guest section DW wrote: > > On Thu, Mar 07, 2002 at 09:35:09AM -0500, Hubertus Franke wrote: > > ... > > > > Long ago I submitted a patch that changed the max pid from 15 bits to > > 24 or 30 bits or so. (And of course removed the inefficiency noticed > > by some people in the current thread.) > > Probably this is a good moment to try and see what Linus thinks > > about this today. > > > > [Of course Albert Cahalan will object that this is bad for the columns > > of ps output. Maybe Alan will mutter something about sysvipc. > > Roughly speaking there are only advantages, especially since > > I think we'll have to do this sooner or later, and in such cases > > sooner is better.] > > I don't think that will solve the N^2 problem you still have the algorithm > do the following: > > if (++last_pid > next_safe) { > repeat: > last_pid++; > foralltasks p: > deal with wraparound; > if (p uses last_pid) goto repeat > determine next_safe > } > [ last_pid ... next_safe ) is the range that can be saftely used > > By extending it to 24 or larger bits all you do is handle the wraparound > at some later point and less frequent It also becomes expensive if a large > number of threads is present. > Well, the problem can be fixed. Even in the current 16 bit approach. > What we are experimenting around > is a <mark-and-sweep> algorithm that would be invoked if the nr_threads is > above a threshold. > The algorithm would do something like this. Rajan will code it up and see > its efficacy. > > if (last_pid >= next_safe) { > inside: > if (nr_threads > threshold) { // constant > last_pid = get_pid_map(last_pid,&next_safe); > } else { > .. <as now> > } > } > > static unsigned long pid_map[1<<12]; > > /* determine a range of pids that is available for sure > * [ last_pid .. next_safe ) > * pid_map has stale information. some pids might be marked > * as used even if they had been freed in the meantime > */ > > int get_pid_map(int last_pid, int *next_safe) > { > int again = 1; > repeat: > for_each_task(p) > mark_pids_in_bitmap; > last_pid = ffz(pid_map); /* we will start from last_pid with wraparound */ > if ((last_pid == -1) && (again)) { > again = 0; > memset(pid_map,0); > goto repeat > } > } > next_safe = first_non_zero(pid_map,last_pid); /* starting from last_pid */ > return last_pid; > } > > > > Note, if the pid map is to large, it can be done in smaller sections or > windows. Also, note keeping stale information is OK actually desirable, as > it avoids the sweep in almost all cases. > > -- > -- Hubertus Franke (fr...@wa...) > - If security issues were not a concern, you save the last PID, freed at _exit() and use that immediately. If it's already used, it's zeroed. exit() just stuffs the exit() PID into the variable, overwriting any previous, including zero. It needs to be handled under a lock, but it gets a quick return on investment for the usual fork() exec() stuff that interactive users use. Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (799.53 BogoMips). Bill Gates? Who? |
From: Guest s. DW <dw...@wi...> - 2002-03-07 19:46:13
|
On Thu, Mar 07, 2002 at 02:07:38PM -0500, Hubertus Franke wrote: > On Thursday 07 March 2002 09:54 am, Guest section DW wrote: > > On Thu, Mar 07, 2002 at 09:35:09AM -0500, Hubertus Franke wrote: > > ... > > > > Long ago I submitted a patch that changed the max pid from 15 bits to > > 24 or 30 bits or so. (And of course removed the inefficiency noticed > > by some people in the current thread.) > I don't think that will solve the N^2 problem Do you understand "inefficiency"? And "remove"? |
From: Hubertus F. <fr...@wa...> - 2002-03-07 23:13:17
|
On Thursday 07 March 2002 02:46 pm, Guest section DW wrote: > On Thu, Mar 07, 2002 at 02:07:38PM -0500, Hubertus Franke wrote: > > On Thursday 07 March 2002 09:54 am, Guest section DW wrote: > > > On Thu, Mar 07, 2002 at 09:35:09AM -0500, Hubertus Franke wrote: > > > ... > > > > > > Long ago I submitted a patch that changed the max pid from 15 bits to > > > 24 or 30 bits or so. (And of course removed the inefficiency noticed > > > by some people in the current thread.) > > > > I don't think that will solve the N^2 problem > > Do you understand "inefficiency"? And "remove"? I do, what's your point ? I haven't seen your patch picked up.... I do understand that when you occasionally get into this scenario of the loop that particularly for large thread counts this stalls the system for seconds (say 30, with the tasklock held). -- -- Hubertus Franke (fr...@wa...) |