On Fri, 25 Jan 2002, Jeff Dike wrote:
> This is the relevant part:
>
> > #12 0xa01720ef in sig_handler (sig=11, sc=
> > {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
> > __dsh = 0, edi = 0, esi = 2686303703, ebp = 2693791544, esp = 2693791488,
> > ebx = 2693716480, edx = 2686707352, ecx = 0, eax = 8, trapno = 14, err =
> > 4, eip = 2685823312, cs = 35, __csh = 0, eflags = 66182, esp_at_signal =
> > 2693791488, ss = 43, __ssh = 0, fpstate = 0xa08ffc80, oldmask = 0, cr2 =
> > 12}) at trap_user.c:458
> > #13 <signal handler called>
> > #14 rwsem_down_read_failed (sem=0x0)
> > at /usr/src/linux-2.4.17-8um/include/linux/list.h:40
> > #15 0xa01cf064 in _init () at hostfs_kern.c:784
> > #16 0xa0045783 in sync_old_buffers () at buffer.c:2681
> > #17 0xa0045c35 in kupdate (startup=0xa0220610) at buffer.c:2850
> > #18 0xa01743fe in new_thread_proc (t=0xa08fc000) at process_kern.c:130
>
> That NULL sem being passed into rwsem_down_read_failed is certainly
> suspicious.
When it gets into super.c:sync_supers at ..
    while (sb != sb_entry(&super_blocks))
        if (sb->s_dirt) {
            sb->s_count++;
            spin_unlock(&sb_lock);
            down_read(&sb->s_umount);
            write_super(sb);
            drop_super(sb);
            goto restart;
        } else
            sb = sb_entry(sb->s_list.next);
    spin_unlock(&sb_lock);
.. it loops 12 times. On the 12th pass it hands
    {count = -65535, wait_lock = {gcc_is_buggy = 0}, wait_list = {
    next = 0xa08eda3c, prev = 0xa08eda3c}}
to down_read, and then panics.
> There aren't any semaphores in sync_old_buffers, but there are in
> sync_supers, which is what's being called at buffer.c:2681. However,
> the only semaphore I see there is
> down_read(&sb->s_umount);
> which can't be NULL. So I would start sticking breakpoints in there to
> try to figure out what semaphore that is and why it's NULL (and make sure
> it is NULL, although the fault address of 12 seems to bear that out).
>
> Jeff