From: Bodo S. <bst...@fu...> - 2005-02-04 12:22:30
|
Nick Piggin wrote: > Bodo Stroesser wrote: > >> Working with the new UML skas0 mode on my Xeon HT host, sporadically I >> saw >> some processes on UML segfaulting. >> >> In all cases, I could track this down to be caused by a gs segment >> register, >> that had the wrong contents. >> >> This again is caused by a problem in the host linux: A ptraced child >> going to >> stop and having woken up its parent, will save some of its registers >> (on i386 >> they are fs, gs and the fp-registers) very late in switch_to. The >> parent is >> granted access to child's registers as soon, as the child is removed from >> the runqueue. Thus, in rare cases, the parent might access child's >> register >> savearea before the registers really are saved. >> >> This problem might also be the reason for problems with floatpoint on >> UML, >> that were reported some time ago. >> >> I've written a test program, that reproduces the problem on my 2.6.9 >> vanilla >> host quite quick. Using SuSE kernel 2.6.5-7.97-smp, I can't reproduce the >> problem, although the relevant parts seem to be unchanged. Maybe not >> related >> changes modify the timing? >> >> I also created a patch, that fixes the problem on my 2.6.9 host. This >> probably >> isn't a sane patch, but is enough to demonstrate, where I think, the >> bug is. >> Both files are attached. >> >> Bodo >> >> >> ------------------------------------------------------------------------ >> >> --- a/include/linux/sched.h 2005-02-02 22:15:51.000000000 +0100 >> +++ b/include/linux/sched.h 2005-02-02 22:22:54.000000000 +0100 >> @@ -584,6 +584,7 @@ struct task_struct { >> struct mempolicy *mempolicy; >> short il_next; /* could be shared with used_math */ >> #endif >> + volatile long saving; >> }; >> >> static inline pid_t process_group(struct task_struct *tsk) >> --- a/kernel/sched.c 2005-02-02 21:32:51.000000000 +0100 >> +++ b/kernel/sched.c 2005-02-02 22:12:14.000000000 +0100 >> @@ -2689,8 +2689,10 @@ need_resched: >> if (unlikely((prev->state & TASK_INTERRUPTIBLE) && >> unlikely(signal_pending(prev)))) >> prev->state = TASK_RUNNING; >> - else >> + else { >> + prev->saving = 1; >> deactivate_task(prev, rq); >> + } >> } >> >> cpu = smp_processor_id(); >> --- a/kernel/ptrace.c 2005-02-02 22:12:33.000000000 +0100 >> +++ b/kernel/ptrace.c 2005-02-02 22:20:46.000000000 +0100 >> @@ -96,6 +96,7 @@ int ptrace_check_attach(struct task_stru >> >> if (!ret && !kill) { >> wait_task_inactive(child); >> + while ( child->saving ) ; >> } >> >> /* All systems go.. */ >> --- a/arch/i386/kernel/process.c 2005-02-02 22:18:29.000000000 +0100 >> +++ b/arch/i386/kernel/process.c 2005-02-02 22:19:22.000000000 +0100 >> @@ -577,6 +577,9 @@ struct task_struct fastcall * __switch_t >> asm volatile("movl %%fs,%0":"=m" (*(int *)&prev->fs)); >> asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs)); >> >> + wmb(); >> + prev_p->saving=0; >> + >> /* >> * Restore %fs and %gs if needed. >> */ >> > > I don't see how this could help because AFAIKS, child->saving is only > set and cleared while the runqueue is locked. And the same runqueue lock > is taken by wait_task_inactive. > Sorry, that not right. There are some routines called by sched(), that release and reacquire the runqueue lock. Bodo |