#157 process_stress infinite loop

Scheduler
open
Testcases (113)
5
2011-04-06
2011-04-01
No

./testcases/kernel/sched/process_stress/process -b10 -d2 can go into an infinite loop.

The test creates a tree of processes. Each process will send a message
(msgsnd, queue up) to all of its siblings. After sending messages, it
will then read (msgrcv) from its siblings.

The bug is that on large systems with many cores/processors (64 cores
in my case), many processes can be sending messages in parallel before
any (or many) of them have started receiving. As a result, the message
queue fills up before the processes start draining it with msgrcv. Once
the queue is full, every process retries its send forever, hoping that
someone will drain the queue, but everyone is still sending 'at once'.
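
The queue-full condition described above can be shown in isolation. The
following is a minimal sketch, not code from the testcase: demo_msg,
fill_queue and drain_one are hypothetical names, and it assumes the
sends are non-blocking (IPC_NOWAIT), which the retry behaviour described
above suggests but which should be checked against the testcase source.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout; the real testcase defines its own. */
struct demo_msg {
    long mtype;          /* message type, must be > 0 */
    char mtext[64];      /* payload */
};

/*
 * Send messages until the queue refuses more.
 * Returns the number of messages queued once msgsnd() fails with
 * EAGAIN (queue full), or -1 on any other error.
 */
int fill_queue(int qid)
{
    struct demo_msg m;
    int sent = 0;

    m.mtype = 1;
    memset(m.mtext, 'x', sizeof(m.mtext));

    for (;;) {
        if (msgsnd(qid, &m, sizeof(m.mtext), IPC_NOWAIT) == 0) {
            sent++;
            continue;
        }
        if (errno == EINTR)
            continue;                /* interrupted, try again */
        return (errno == EAGAIN) ? sent : -1;
    }
}

/*
 * Remove one message of any type, which frees space in the queue.
 * Returns 0 on success, -1 on error (for example, an empty queue).
 */
int drain_one(int qid)
{
    struct demo_msg m;

    return msgrcv(qid, &m, sizeof(m.mtext), 0, IPC_NOWAIT) < 0 ? -1 : 0;
}
```

On a queue created with msgget(IPC_PRIVATE, IPC_CREAT | 0600),
fill_queue() stops once the queue's byte limit is reached. From then on
every further non-blocking msgsnd() fails with EAGAIN until some process
calls msgrcv(), as drain_one() does. If every process is still in its
send phase, no one makes that call and the retries never succeed.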

This 'works' in environments with lower core/processor counts because
processes are able to queue up requests and drain some of them within
the same time slice.

To test/verify you can add a 'sleep(1)' to the testcase after it does
all the sends, and before doing any receives. I ran this on a small
(4 processor) system and it failed (ran forever) in the same manner as
it did on the large system.

Discussion

  • Cyril Hrubis

    Cyril Hrubis - 2011-04-06
    • assigned_to: nobody --> metan
    • status: open --> pending
     
  • Cyril Hrubis

    Cyril Hrubis - 2011-04-06

    Thanks for the detailed report. We've seen similar symptoms while testing SUSE Linux here at SUSE. I'll look at the code.

     
  • Cyril Hrubis

    Cyril Hrubis - 2011-04-06
    • status: pending --> open
     