From: David E. <eng...@vn...> - 2002-02-20 00:03:41
Hi,

I have been looking at a problem which fork07 uncovers in the Linux kernel. This testcase forks processes until -1 is returned, indicating an error. On a typical machine with less than a certain amount of memory (probably 5 GB or so on ppc64; x86 I would guess might be a bit more), kernel memory is the first resource to become exhausted, at which point -1 is returned. However, if the system has more than this amount of memory, memory is not the first resource exhausted; rather, the total number of PIDs allocatable by Linux is exhausted.

Fundamentally, Linux does not respond well to this condition. Where things fall apart is in the platform-independent (i.e. this is a problem on all Linux systems, not just ppc64) implementation of kernel/fork.c:get_pid(). This code takes the tasklist_lock and then proceeds to loop over all tasks in the tasklist looking for a free pid to use for the new task being forked. As there are no free PIDs, the loop never completes, this processor stays stuck, and eventually all other processors get stuck on the tasklist_lock.

The problem is aggravated on the 8 GB system because the default ulimit -u value (max number of processes per user) is a function of total system memory. On a system with this much memory, every user can create 32K processes, the maximum available to the entire system. Even without this default, a collection of user-level tasks can create the same problem.

The quick fix is to set ulimit to something reasonable before running this test, for example ulimit -u 4096. Longer term, this is a general Linux design question. I would recommend that fork07 be changed so that more people do not start hitting this known issue.

Dave Engebretsen
From: Manoj I. <ma...@au...> - 2002-02-20 06:52:58
Dave,

A similar issue has already been raised to the kernel change team; the general opinion is that this is a sys-admin issue.

The workaround we normally do is to set the value of threads-max in /proc to some lower value (mine is set to 10000). This problem is also due to the fact that the tests are executed as root. As root you can even shut down the system! I guess the tests should be modified to handle such situations.

Manoj

*******************************************************************************
The greatest risk is not taking one.
*******************************************************************************

On Tue, 19 Feb 2002, David Engebretsen wrote:
> [snip: full quote of previous message]
From: David E. <eng...@vn...> - 2002-02-20 13:50:36
I think my basic point was missed - it would seem like a good idea to have the fork test set the process ulimit to something less than 32K to ensure this problem is not hit by others. As it stands, running fork07 out of the box on any 64-bit platform with sufficient memory will crash the system, given the current design of the get_pid algorithm in Linux and the default process limits.

The patch Paul referenced only affects systems with highmem. That is not the problem in the case of ppc64, as we do not need highmem, so that patch will have no effect. Also, running as root does not really aggravate this problem -- any user can easily lock up a system while running this fork test if the system has enough memory. In fact, even if a sys admin sets 'reasonable' user limits, all you need are n users running a test like this to lock up the system.

Dave.

Manoj Iyer wrote:
> [snip: full quote of previous message]
From: Paul L. <pl...@au...> - 2002-02-20 14:38:49
On Wed, 2002-02-20 at 07:50, David Engebretsen wrote:
> [snip: full quote of previous message]

Ahh, ok. I'm finally starting to wake up this morning. :)

Do you know who is working on fixing this known problem? I really hate to change a testcase that is working as designed and exposing a real problem with the kernel. If you don't want to see it happen anymore, you can comment it out of runtest/syscalls.

In any case, setting threads-max to a more sane number should work around this as well, since do_fork() checks it before get_pid() is ever called.

-Paul Larson
From: David E. <eng...@vn...> - 2002-02-20 15:47:03
Paul Larson wrote:
> Ahh, ok. I'm finally starting to wake up this morning. :)
>
> Do you know who is working on fixing this known problem? I really hate
> to change a testcase that is working as designed, and exposing a real
> problem with the kernel. If you don't want to see it happen anymore,
> you can comment it out of the runtest/syscalls.

I will start working on a fix when I get a chance.

> In any case, setting threads-max to a more sane number should work
> around this as well, since do_fork() does a check for that before
> get_pid is ever called.

Yes, what I have told the testers is to set ulimit on the number of processes. I just thought you may want to consider a change to the testcase, as it may take a while for a fix to work into the kernel. Your choice of course.

Dave.
From: Paul L. <pl...@au...> - 2002-02-20 15:57:22
Attachments:
pidmax.patch
On Wed, 2002-02-20 at 09:46, David Engebretsen wrote:
> Yes, what I have told the testers is to set ulimit on the number of processes.
> I just thought you may want to consider a change to the testcase as it may take
> a while for a fix to work into the kernel. Your choice of course.

I just hate to cover up a problem so that it can get ignored. It's easy to work around if you don't want to continue to hang there.

Try this patch and see if it helps any. I'm not sure if this is the right way to do it, but I suspect that max_threads should not be allowed to default to a higher number than PID_MAX. That seems wrong to me, especially since max_threads is getting checked in do_fork() to see if we have gone over the max. The patch is against 2.4.18-rc2, but it's small and should be easy to hand-patch if it doesn't work against your kernel version.

Thanks,
Paul Larson
From: Paul L. <pl...@au...> - 2002-02-20 21:30:45
Attachments:
getpid.patch
Actually, try this patch instead. It fixes it in a much better way, by only letting get_pid() search through all the processes for an available pid one time, rather than holding the locks and looping through it forever. If it doesn't find one after the first pass, fork returns -EAGAIN. This should be applied instead of the first one I sent, not in addition to it. Please let me know if you see anything wrong with it.

-Paul Larson
From: Randy.Dunlap <rdd...@os...> - 2002-02-20 22:47:26
Attachments:
forker.log.gz
forker.c
On 20 Feb 2002, Paul Larson wrote:
| Actually, try this patch instead. It fixes it in a much better way by
| only letting get_pid() search through all the processes for an available
| pid one time, rather than holding the locks and looping through it
| forever. If it doesn't find one after the first pass, fork returns
| -EAGAIN. This should be applied instead of the first one I sent, not in
| addition to. Please let me know if you see anything wrong with it.

Hi,  // he says from his hiding spot;

Very similar to a patch I've had on my disk for a few weeks now, but it was a low priority for me, or you would have already seen it. I don't recall the original problem (maybe I'll check the email archives), but I do have a couple of comments, since I've experimented with this a bit.

diff -Naur linux-2.4.18-rc2/kernel/fork.c linux-getpid/kernel/fork.c
--- linux-2.4.18-rc2/kernel/fork.c	Wed Feb 20 09:54:39 2002
+++ linux-getpid/kernel/fork.c	Wed Feb 20 15:32:33 2002
@@ -85,12 +85,13 @@
 {
 	static int next_safe = PID_MAX;
 	struct task_struct *p;
-	int pid;
+	int pid, beginpid;

 	if (flags & CLONE_PID)
 		return current->pid;

 	spin_lock(&lastpid_lock);
+	beginpid = last_pid;
 	if((++last_pid) & 0xffff8000) {
 		last_pid = 300;		/* Skip daemons etc. */
 		goto inside;
@@ -110,12 +111,19 @@
 				last_pid = 300;
 				next_safe = PID_MAX;
 			}
+			if(last_pid == beginpid) {
+				read_unlock(&tasklist_lock);
+				spin_unlock(&lastpid_lock);
+				return -1;
+			}

| Same code down to here, except that I used "pid_begin", and
| since the if-block above is "unlikely", I used a goto pid_error:
| to the end of the function. I'd suggest using unlikely() on it,
| or pushing it to the end of the function.
| Oh, I also returned 0, but I prefer the -1 here.
| It helps on checking the return value of get_pid() below.

 			goto repeat;
 		}
 		if(p->pid > last_pid && next_safe > p->pid)
 			next_safe = p->pid;
 		if(p->pgrp > last_pid && next_safe > p->pgrp)
 			next_safe = p->pgrp;
+		if(p->tgid > last_pid && next_safe > p->tgid)
+			next_safe = p->tgid;
 		if(p->session > last_pid && next_safe > p->session)
 			next_safe = p->session;
 	}
@@ -620,6 +628,8 @@
 	copy_flags(clone_flags, p);

 	p->pid = get_pid(clone_flags);
+	if (p->pid == -1)
+		goto bad_fork_cleanup;

 	p->run_list.next = NULL;
 	p->run_list.prev = NULL;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and then, once you have this in place, if you try to fork 65000 processes, the get_pid() function [local, proper] looping forever won't be the problem, as I expected it to be.

At least in my testing, there were VM and/or lock problems that caused my 'forker' program to never finish. The kernel just looped forever, trying to alloc_pages, shrink_caches, __get_free_pages, etc.

'forker' is a KISS program that tries to fork 65000 processes. That's its only mission in life. (forker.c attached; maybe it's similar to the LTP test)

(100 KB 'forker.log.gz' attached)

--
~Randy
From: Paul L. <pl...@au...> - 2002-02-20 23:22:16
> At least in my testing, there were VM and/or lock problems that
> caused my 'forker' program to never finish. The kernel just looped
> forever, trying to alloc_pages, shrink_caches, __get_free_pages, etc.

Yep, I'm assuming this is on i386? There is a problem with the fact that max_threads is calculated based on all the memory (including highmem). Dave McCracken has a fix for that issue; here's a link to the archive:

http://marc.theaimsgroup.com/?l=linux-kernel&m=100506843702466&w=2

We keep getting bit by that one too, but I brought it to Marcelo's attention after the last round of RC testing we did and he said he'd take a look at it for the 2.4.19-pre series.

Hope that helps, and thanks for the input.

-Paul Larson
From: David E. <eng...@vn...> - 2002-02-20 22:01:52
I don't currently have access to the system which hit this, but this patch is pretty much what I had in mind to try. I think it will fix the problem for this testcase.

Dave.

Paul Larson wrote:
> [snip: full quote of previous message]
From: Dave E. <eng...@vn...> - 2002-02-21 11:43:35
I will not have time to test this real soon. You should be able to test on any system by setting PID_MAX to a smaller value.

Dave.

Paul Larson wrote:
> On Wed, 2002-02-20 at 16:01, David Engebretsen wrote:
> > I don't currently have access to the system which hit this, but this patch is
> > pretty much what I had in mind to try. I think it will fix the problem for this
> > testcase.
>
> Do you think you might be able to try it anytime soon? I've tried it on
> my machines and it at least doesn't break anything, but it would be nice
> to know if it really fixed the originally reported problem.
>
> Thanks,
> Paul Larson
From: Paul L. <pl...@au...> - 2002-02-20 13:12:11
The problem is that on machines with highmem, the calculation for threads-max is wrong. It is taking all the memory into account when it calculates the value of max_threads, but the memory it can use for a task must come from kernel memory. Dave McCracken wrote a patch for this against 2.4.14 that can be found here:

http://marc.theaimsgroup.com/?l=linux-kernel&m=100506843702466&w=2

Give this a try and see if it works. It has been recently resubmitted, so hopefully we'll see it fixed soon. Until then, the workaround that Manoj told you about should work fine (manually setting /proc/sys/kernel/threads-max to something more reasonable).

Thanks,
Paul Larson