From: David E. <eng...@vn...> - 2002-02-20 00:03:41
Hi,

I have been looking at a problem which fork07 uncovers in the Linux kernel. This testcase forks processes until -1 is returned, indicating an error. On a typical machine with less than a certain amount of memory (probably 5 GB or so on ppc64; x86 I would guess might be a bit more), kernel memory is the first resource to become exhausted, at which point -1 is returned. However, if the system has more than this amount of memory, memory is not the first resource exhausted; rather, the total number of PIDs allocatable by Linux is exhausted.

Fundamentally, Linux does not respond well to this condition. Where things fall apart is in the platform-independent (i.e. this is a problem on all Linux systems, not just ppc64) implementation of kernel/fork.c:get_pid(). This code takes the tasklist_lock and then proceeds to loop over all tasks in the tasklist looking for a free pid to use for the new task being forked. As there are no free PIDs, the loop never completes, this processor stays stuck, and eventually all other processors get stuck on the tasklist_lock.

The problem is aggravated on the 8 GB system because the default ulimit -u value (max number of processes per user) is a function of total system memory. On a system with this much memory, every user can create 32K processes, the maximum available to the entire system. Even without this default, a collection of user-level tasks can create the same problem.

The quick fix is to set ulimit to something reasonable before running this test, for example ulimit -u 4096. Longer term, this is a general Linux design question. I would recommend that fork07 be changed so that more people do not start hitting this known issue.

Dave Engebretsen
From: Manoj I. <ma...@au...> - 2002-02-20 06:52:58
Dave,

A similar issue has already been raised to the kernel change team; the general opinion is that this is a sys-admin issue.

The workaround we normally do is to set the value of threads-max in /proc to some lower value (mine is set to 10000). This problem is also due to the fact that the tests are executed as root. As root you can even shut down the system! I guess the tests should be modified to handle such situations.

Manoj

*******************************************************************************
The greatest risk is not taking one.
*******************************************************************************

On Tue, 19 Feb 2002, David Engebretsen wrote:
> [snip: full quote of previous message]
From: David E. <eng...@vn...> - 2002-02-20 13:50:36
I think my basic point was missed - it would seem like a good idea to have the fork test set the process ulimit to something less than 32K to ensure this problem is not hit by others. As it stands, running fork07 out of the box on any 64-bit platform with sufficient memory will crash the system, given the current design of the get_pid algorithm in Linux and the default process limits.

The patch Paul referenced only affects systems with highmem. That is not the problem in the case of ppc64, as we do not need highmem, so that patch will have no effect. Also, running as root does not really aggravate this problem -- any user can easily lock up a system while running this fork test if the system has enough memory. In fact, even if a sys admin sets 'reasonable' user limits, all you need are n users running a test like this to lock up the system.

Dave.

Manoj Iyer wrote:
> [snip: full quote of previous message]
From: Paul L. <pl...@au...> - 2002-02-20 14:38:49
On Wed, 2002-02-20 at 07:50, David Engebretsen wrote:
> [snip: full quote of previous message]

Ahh, ok. I'm finally starting to wake up this morning. :)

Do you know who is working on fixing this known problem? I really hate to change a testcase that is working as designed and exposing a real problem with the kernel. If you don't want to see it happen anymore, you can comment it out of runtest/syscalls.

In any case, setting threads-max to a more sane number should work around this as well, since do_fork() checks it before get_pid() is ever called.

-Paul Larson
From: David E. <eng...@vn...> - 2002-02-20 15:47:03
Paul Larson wrote:
> Ahh, ok. I'm finally starting to wake up this morning. :)
>
> Do you know who is working on fixing this known problem? I really hate
> to change a testcase that is working as designed, and exposing a real
> problem with the kernel. If you don't want to see it happen anymore,
> you can comment it out of the runtest/syscalls.

I will start working on a fix when I get a chance.

> In any case, setting threads-max to a more sane number should work
> around this as well, since do_fork() does a check for that before
> get_pid is ever called.

Yes, what I have told the testers is to set ulimit on the number of processes. I just thought you may want to consider a change to the testcase, as it may take a while for a fix to work into the kernel. Your choice of course.

Dave.
From: Paul L. <pl...@au...> - 2002-02-20 15:57:22
Attachments:
pidmax.patch
On Wed, 2002-02-20 at 09:46, David Engebretsen wrote:
> Yes, what I have told the testers is to set ulimit on the number of processes.
> I just thought you may want to consider a change to the testcase as it may take
> a while for a fix to work into the kernel. Your choice of course.

I just hate to cover up a problem so that it can get ignored. It's easy to work around if you don't want to continue to hang there.

Try this patch and see if it helps any. I'm not sure if this is the right way to do it, but I suspect that max_threads should not be allowed to default to a higher number than PID_MAX. That seems wrong to me, especially since max_threads is getting checked in do_fork() to see if we have gone over the max. The patch is against 2.4.18-rc2, but it's small and should be easy to hand-patch if it doesn't work against your kernel version.

Thanks,
Paul Larson
From: Paul L. <pl...@au...> - 2002-02-20 21:30:45
Attachments:
getpid.patch
Actually, try this patch instead. It fixes it in a much better way, by only letting get_pid() search through all the processes for an available pid one time, rather than holding the locks and looping through it forever. If it doesn't find one after the first pass, fork returns -EAGAIN. This should be applied instead of the first one I sent, not in addition to it. Please let me know if you see anything wrong with it.

-Paul Larson
From: Randy.Dunlap <rdd...@os...> - 2002-02-20 22:47:26
Attachments:
forker.log.gz
forker.c
On 20 Feb 2002, Paul Larson wrote:
| Actually, try this patch instead. It fixes it in a much better way by
| only letting get_pid() search through all the processes for an available
| pid one time, rather than holding the locks and looping through it
| forever. If it doesn't find one after the first pass, fork returns
| -EAGAIN. This should be applied instead of the first one I sent, not in
| addition to. Please let me know if you see anything wrong with it.

Hi,  // he says from his hiding spot;

Very similar to a patch I've had on my disk for a few weeks now, but it was a low priority for me, or you would have already seen it. I don't recall the original problem (maybe I'll check the email archives), but I do have a couple of comments, since I've experimented with this a bit.

diff -Naur linux-2.4.18-rc2/kernel/fork.c linux-getpid/kernel/fork.c
--- linux-2.4.18-rc2/kernel/fork.c	Wed Feb 20 09:54:39 2002
+++ linux-getpid/kernel/fork.c	Wed Feb 20 15:32:33 2002
@@ -85,12 +85,13 @@
 {
 	static int next_safe = PID_MAX;
 	struct task_struct *p;
-	int pid;
+	int pid, beginpid;

 	if (flags & CLONE_PID)
 		return current->pid;

 	spin_lock(&lastpid_lock);
+	beginpid = last_pid;
 	if((++last_pid) & 0xffff8000) {
 		last_pid = 300;		/* Skip daemons etc. */
 		goto inside;
@@ -110,12 +111,19 @@
 				last_pid = 300;
 				next_safe = PID_MAX;
 			}
+			if(last_pid == beginpid) {
+				read_unlock(&tasklist_lock);
+				spin_unlock(&lastpid_lock);
+				return -1;
+			}

| Same code down to here, except that I used "pid_begin", and
| since the if-block above is "unlikely", I used a goto pid_error:
| to the end of the function. I'd suggest using unlikely() on it,
| or pushing it to the end of the function.
| Oh, I also returned 0, but I prefer the -1 here.
| It helps on checking the return value of get_pid() below.

 			goto repeat;
 		}
 		if(p->pid > last_pid && next_safe > p->pid)
 			next_safe = p->pid;
 		if(p->pgrp > last_pid && next_safe > p->pgrp)
 			next_safe = p->pgrp;
+		if(p->tgid > last_pid && next_safe > p->tgid)
+			next_safe = p->tgid;
 		if(p->session > last_pid && next_safe > p->session)
 			next_safe = p->session;
 	}
@@ -620,6 +628,8 @@
 	copy_flags(clone_flags, p);

 	p->pid = get_pid(clone_flags);
+	if (p->pid == -1)
+		goto bad_fork_cleanup;

 	p->run_list.next = NULL;
 	p->run_list.prev = NULL;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and then, once you have this in place, if you try to fork 65000 processes, the get_pid() function [local, proper] looping forever won't be the problem, as I expected it to be.

At least in my testing, there were VM and/or lock problems that caused my 'forker' program to never finish. The kernel just looped forever, trying to alloc_pages, shrink_caches, __get_free_pages, etc.

'forker' is a KISS program that tries to fork 65000 processes. That's its only mission in life. (forker.c attached; maybe it's similar to the LTP test)

(100 KB 'forker.log.gz' attached)

--
~Randy
From: Paul L. <pl...@au...> - 2002-02-20 23:22:16
> At least in my testing, there were VM and/or lock problems that
> caused my 'forker' program to never finish. The kernel just looped
> forever, trying to alloc_pages, shrink_caches, __get_free_pages, etc.

Yep, I'm assuming this is on i386? There is a problem with the fact that max_threads is calculated based on all the memory (including highmem). Dave McCracken has a fix for that issue; here's a link to the archive:

http://marc.theaimsgroup.com/?l=linux-kernel&m=100506843702466&w=2

We keep getting bit by that one too, but I brought it to Marcelo's attention after the last round of RC testing we did and he said he'd take a look at it for the 2.4.19-pre series.

Hope that helps, and thanks for the input.

-Paul Larson
From: David E. <eng...@vn...> - 2002-02-20 22:01:52
I don't currently have access to the system which hit this, but this patch is pretty much what I had in mind to try. I think it will fix the problem for this testcase.

Dave.

Paul Larson wrote:
> [snip: full quote of previous message]
From: Dave E. <eng...@vn...> - 2002-02-21 11:43:35
I will not have time to test this real soon. You should be able to test on any system by setting PID_MAX to a smaller value.

Dave.

Paul Larson wrote:
> On Wed, 2002-02-20 at 16:01, David Engebretsen wrote:
> > I don't currently have access to the system which hit this, but this patch is
> > pretty much what I had in mind to try. I think it will fix the problem for this
> > testcase.
>
> Do you think you might be able to try it anytime soon? I've tried it on
> my machines and it at least doesn't break anything, but it would be nice
> to know if it really fixed the originally reported problem.
>
> Thanks,
> Paul Larson
From: Paul L. <pl...@au...> - 2002-02-20 13:12:11
The problem is that on machines with highmem, the calculation for threads-max is wrong. It is taking all the memory into account when it calculates the value of max_threads, but the memory it can use for a task must come from kernel memory. Dave McCracken wrote a patch for this against 2.4.14 that can be found here:

http://marc.theaimsgroup.com/?l=linux-kernel&m=100506843702466&w=2

Give this a try and see if it works. It has been recently resubmitted, so hopefully we'll see it fixed soon. Until then, the workaround that Manoj told you about should work fine (manually setting /proc/sys/kernel/threads-max to something more reasonable).

Thanks,
Paul Larson