From: Henry N. <Henry.Ne@Arcor.de> - 2005-03-18 10:21:32
Nuno Lucas wrote:
> gboutwel, jumping for joy, wrote:
>
>> I'm afraid I'm not too good with kernel panics yet. Anyone want
>> to take a stab at the problem? I think it's possibly a general
>> kernel problem, but there are references to the coLinux timer interrupt.
>
> What the call trace tells us is that during the processing of the timer
> interrupt, which in coLinux gets called as a result of co_handle_jiffies,
> a page fault occurred.
>
> There can be no exceptions (like a page fault) during the processing of
> interrupts, so the panic message was issued.
>
> It occurred in account_user_time(), which is responsible for updating the
> time spent in user mode (the user/sys time values top shows), because
> the scheduler was running a user process at the time of the interrupt.
>
> Now a bit of speculation on my part, as I'm a kernel noob...
>
> A user mode program calls exit_group() (same as exit(), but for all
> threads in the process).
> The system changes to kernel mode and executes do_group_exit().
> While executing it an interrupt occurs, forcing the switch to the
> Windows host (I think we could deduce the interrupt number from the
> function name, but need to check).
> After letting the host OS process the interrupt (by co_callback),
> control returns to coLinux, which after seeing there are no messages
> calls co_handle_jiffies (our "timer emulator").
> co_handle_jiffies fakes a timer interrupt, which panics while trying to
> update elapsed user time for the current process.
>
> One thing seems strange to me. If the kernel was running a syscall
> before the interrupt, it should be updating elapsed system time,
> not user time.
>
> Now someone to tell me if my thoughts are correct...

I agree with your speculation. The only part I don't know about is
exit_group. If it is so, there would have to be a big bug in the way the
kernel detects sys/user in the timer handler. I think that detection
should be correct.
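The sys/user decision discussed above can be sketched in C. This is only a simplified illustration, not the actual kernel or coLinux code: `struct task`, `struct regs`, `was_user_mode` and `timer_tick` are hypothetical names, and the real kernel check involves more than the code segment. The idea is that the tick handler looks at the privilege level saved at interrupt entry to decide which counter to charge.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-process time counters. */
struct task {
    unsigned long utime;  /* ticks charged to user mode */
    unsigned long stime;  /* ticks charged to kernel (sys) mode */
};

/* Hypothetical, simplified register frame saved at interrupt entry. */
struct regs {
    unsigned int cs;  /* code segment selector of the interrupted code */
};

/* On x86 the low two bits of CS hold the privilege level; 3 is user mode. */
static int was_user_mode(const struct regs *r)
{
    return (r->cs & 3) == 3;
}

/* Charge one timer tick to the mode that was interrupted. */
static void timer_tick(struct task *current, const struct regs *r)
{
    assert(current != NULL && r != NULL);
    if (was_user_mode(r))
        current->utime++;
    else
        current->stime++;
}
```

With a frame whose CS has privilege level 3 (for example 0x73) the tick goes to `utime`; with privilege level 0 (for example 0x60) it goes to `stime`. In this sketch, a frame that does not describe the code that was really interrupted would get the tick charged to the wrong counter.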

In the interrupt handler I see a loop over a list that calls the time
update function for every process, for both user and sys time.

My question: what was the exception, "write-protected page"? The
disassembly seems to show that the read from p->user works and the write
fails. Perhaps exit_group is modifying the memory tables to free the
process context, and the timer interrupts it in the middle? Or perhaps we
make a proxy interrupt call at a point where interrupts would be disabled
in real Linux?

Please see the disassembled files.

I have compiled coLinux and the complete cross tools under coLinux.
It took 88 minutes with no crash! See the result and the user/sys times:
http://www.henrynestler.com/colinux/screenshoots/suse90-build-done.png

-- 
Henry Nestler