From: NIIBE Y. <gn...@m1...> - 2001-07-12 07:22:40
Masahiro Abe wrote:
> No, it didn't work. It's got relatively backward, seldom see the message
> from INIT.

Then, what you see is another bug. In my environments (CqREEK &
SolutionEngine), it works fine now.

The email attached may be related to your issue.

------- start of forwarded message (RFC 934 encapsulation) -------
Content-Length: 1814
Message-ID: <20010711175809.F3496@athlon.random>
References: <200107110849.f6B8nlm00414@df1tlpc.local.here>
	<shs...@ch...>
	<3B4...@uo...>
	<151...@ch...>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <151...@ch...>; from tro...@fy... on Wed, Jul 11, 2001 at 04:22:04PM +0200
X-GnuPG-Key-URL: http://e-mind.com/~andrea/aa.gnupg.asc
X-PGP-Key-URL: http://e-mind.com/~andrea/aa.asc
Precedence: bulk
X-Mailing-List: lin...@vg...
From: Andrea Arcangeli <an...@su...>
Sender: lin...@vg...
To: Trond Myklebust <tro...@fy...>
Cc: Andrew Morton <an...@uo...>,
	Klaus Dittrich <kl...@t-...>,
	Linus Torvalds <tor...@tr...>,
	lin...@vg...
Subject: Re: 2.4.7p6 hang
Date: Wed, 11 Jul 2001 17:58:09 +0200

On Wed, Jul 11, 2001 at 04:22:04PM +0200, Trond Myklebust wrote:
> >>>>> " " == Andrew Morton <an...@uo...> writes:
>
> > Trond Myklebust wrote:
> >>
> >> ... I have the same problem on my setup. To me, it looks like
> >> the loop in spawn_ksoftirqd() is suffering from some sort of
> >> atomicity problem.
>
> > Does a `set_current_state(TASK_RUNNING);' in spawn_ksoftirqd()
> > fix it? If so we have a rogue initcall...
>
> Nope. The same thing happens as before.
>
> A couple of debugging statements show that ksoftirqd_CPU0 gets created
> fine, and that ksoftirqd_task(0) is indeed getting set correctly
> before we loop in spawn_ksoftirqd().
> After this the second call to kernel_thread() succeeds, but
> ksoftirqd() itself never gets called before the hang occurs.
ksoftirqd is quite scheduler intensive, and while its startup is
correct (no need of any change there), it tends to trigger scheduler
bugs (one of those bugs was just fixed in pre5). The reason I never
saw the deadlock is that I also fixed this other scheduler bug in my
tree:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.7pre5aa1/00_sched-yield-1

this one I forgot to submit but here it is now for easy merging:

- --- 2.4.4aa3/kernel/sched.c.~1~	Sun Apr 29 17:37:05 2001
+++ 2.4.4aa3/kernel/sched.c	Tue May  1 16:39:42 2001
@@ -674,8 +674,10 @@
 #endif
 	spin_unlock_irq(&runqueue_lock);
 
- -	if (prev == next)
+	if (prev == next) {
+		current->policy &= ~SCHED_YIELD;
 		goto same_process;
+	}
 
 #ifdef CONFIG_SMP
 	/*

Andrea
- -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to maj...@vg...
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
------- end -------
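
To see why clearing the flag in the prev == next branch can matter, here is a small user-space model in C. It is only a sketch under stated assumptions, not the kernel's code: the names model_task, MODEL_SCHED_YIELD, model_goodness, model_pick and model_schedule are hypothetical, and the rule "a yielding task gets the lowest weight" is an assumption of this model. The forwarded mail does not spell out the exact failure mechanism, so read this as one plausible illustration of a stale yield flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit for this model; not the kernel's SCHED_YIELD value. */
#define MODEL_SCHED_YIELD 0x10UL

struct model_task {
	const char *name;
	unsigned long policy;	/* flag bits, may contain MODEL_SCHED_YIELD */
	int counter;		/* remaining timeslice, used as the base weight */
};

/* Assumed weighting rule: a task that has yielded is considered last. */
static int model_goodness(const struct model_task *t)
{
	if (t->policy & MODEL_SCHED_YIELD)
		return -1;
	return t->counter;
}

/* Pick the runnable task with the highest weight among tasks[0..n-1]. */
static struct model_task *model_pick(struct model_task *tasks, int n)
{
	struct model_task *best = NULL;
	int best_weight = -1000;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		int w = model_goodness(&tasks[i]);
		if (w > best_weight) {
			best_weight = w;
			best = &tasks[i];
		}
	}
	return best;
}

/*
 * One scheduling pass. With clear_on_same == 0 the yield flag is only
 * cleared on the context-switch path (the unpatched shape); with
 * clear_on_same != 0 it is also cleared when the same task is picked
 * again (the shape of the patch above).
 */
static struct model_task *model_schedule(struct model_task *prev,
					 struct model_task *tasks, int n,
					 int clear_on_same)
{
	struct model_task *next = model_pick(tasks, n);

	if (prev == next) {
		if (clear_on_same)
			prev->policy &= ~MODEL_SCHED_YIELD;
		return next;	/* "same_process": no context switch */
	}

	/* Context-switch path: the outgoing task's yield flag is dropped. */
	prev->policy &= ~MODEL_SCHED_YIELD;
	return next;
}
```

In this model, a task that yields while it is the only runnable task is picked again. Without the extra clear it keeps the flag, so when a second task with a smaller counter becomes runnable later, the first task loses the next pass even though it did not yield again. With the extra clear, the first task keeps running.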