From: Dan A. <da...@co...> - 2004-09-09 19:51:43
Hello,

Thanks to Joe I was able to reproduce the problem locally and worked
on a fix which makes the problem disappear.

For the technical aspect: since until now none of the code in
arch/i386/kernel/head.S actually ran when coLinux started, I suspected
that there could be some initialization problem. If the kernel is not
precisely aware of the CPU's capabilities and decides not to save the
MMX registers between context switches, it can indeed cause a problem. In
a multi-process scenario (such as the one with gdb and Joe's program) the
problem was indeed triggered before.

I've already committed the patch into the monotone server. If you can
check that firefox / xdvi aren't crashing anymore I'd be glad.

diff -u b/arch/i386/kernel/cooperative.c b/arch/i386/kernel/cooperative.c
--- b/arch/i386/kernel/cooperative.c
+++ b/arch/i386/kernel/cooperative.c
@@ -85,6 +85,15 @@
 	__asm__ __volatile__("movl %%cr4, %0" : "=r" (mmu_cr4_features));
 }
 
+asm(
+	""
+	".section .text\n"
+	".globl co_arch_start_kernel\n"
+	"co_arch_start_kernel:\n"
+	" call co_startup_entry\n"
+	".previous\n"
+	"");
+
 void co_start_arch(void)
 {
 	co_early_cpu_init();
diff -u b/arch/i386/kernel/head.S b/arch/i386/kernel/head.S
--- b/arch/i386/kernel/head.S
+++ b/arch/i386/kernel/head.S
@@ -238,6 +238,7 @@
 	rep
 	movsl
 1:
+ENTRY(co_startup_entry)
 checkCPUtype:
 
 	movl $-1,X86_CPUID		# -1 for no CPUID initially
diff -u b/include/linux/cooperative_internal.h b/include/linux/cooperative_internal.h
--- b/include/linux/cooperative_internal.h
+++ b/include/linux/cooperative_internal.h
@@ -28,6 +28,7 @@
 extern void co_idle_processor(void);
 extern void co_terminate(co_termination_reason_t reason);
 extern void co_start_kernel(void);
+extern void co_arch_start_kernel(void);
 extern void co_handle_jiffies(long count);
 extern void co_send_message(co_module_t from,
diff -u b/kernel/cooperative.c b/kernel/cooperative.c
--- b/kernel/cooperative.c
+++ b/kernel/cooperative.c
@@ -48,8 +48,9 @@
 	memcpy(co_boot_parameters,
 	       &co_passage_page->params[10], sizeof(co_boot_parameters));
-	start_kernel();
+	co_arch_start_kernel();
+	/* should never be reached */
 	co_terminate(CO_TERMINATE_END);
 }

-- 
Dan Aloni
da...@co...
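
[Editorial illustration, not part of the original message or the patch.]
The failure mode described above can be sketched with a toy model in
plain C. This is only an illustration under stated assumptions: every
name below (toy_cpu, toy_task, toy_kernel, toy_context_switch,
toy_value_survives) is hypothetical and does not appear in coLinux or
Linux, and the real kernel's FPU/MMX handling is considerably more
involved than a single boolean. The sketch shows only why two tasks can
corrupt each other's register contents when the kernel never learned
that the registers exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical toy model, not coLinux code. */
#define TOY_NUM_MMX_REGS 8

struct toy_cpu {
	unsigned long long mmx[TOY_NUM_MMX_REGS];       /* live registers */
};

struct toy_task {
	unsigned long long saved_mmx[TOY_NUM_MMX_REGS]; /* per-task save area */
};

struct toy_kernel {
	/* Would be set by CPU detection at boot; stays false if the
	 * detection code never runs. */
	bool knows_cpu_has_mmx;
};

/* Switch from prev to next, saving and restoring the MMX registers
 * only if the kernel believes the CPU has them. */
static void toy_context_switch(const struct toy_kernel *k,
			       struct toy_cpu *cpu,
			       struct toy_task *prev,
			       struct toy_task *next)
{
	assert(k && cpu && prev && next);
	if (!k->knows_cpu_has_mmx)
		return; /* kernel thinks there is nothing to save */
	memcpy(prev->saved_mmx, cpu->mmx, sizeof(cpu->mmx));
	memcpy(cpu->mmx, next->saved_mmx, sizeof(cpu->mmx));
}

/* Returns true if the value task A wrote to mm0 is still there after
 * task B has run in between and written its own value. */
bool toy_value_survives(bool detection_ran)
{
	struct toy_kernel k = { .knows_cpu_has_mmx = detection_ran };
	struct toy_cpu cpu;
	struct toy_task a, b;

	memset(&cpu, 0, sizeof(cpu));
	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));

	cpu.mmx[0] = 0x1111;                  /* task A writes mm0 */
	toy_context_switch(&k, &cpu, &a, &b); /* A is preempted by B */
	cpu.mmx[0] = 0x2222;                  /* task B writes mm0 */
	toy_context_switch(&k, &cpu, &b, &a); /* back to A */
	return cpu.mmx[0] == 0x1111;          /* does A see its own value? */
}
```

In this model toy_value_survives(true) holds, while
toy_value_survives(false) does not: with the capability flag unset,
task A resumes with task B's value in its register. A single process
running alone would never notice, which is consistent with the report
that the problem showed up in a multi-process scenario.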
From: <sl...@bl...> - 2004-09-10 16:02:27
Dan Aloni <da...@co...> writes:

> Thanks to Joe I was able to reproduce the problem locally and worked
> on a fix which makes the problem disappear.

Yay! Thank you, thank you, thank you.

> For the technical aspect: since until now none of the code in
> arch/i386/kernel/head.S actually ran when coLinux started, I suspected
> that there could be some initialization problem. If the kernel is not
> precisely aware of the CPU's capabilities and decides not to save the
> MMX registers between context switches, it can indeed cause a problem. In
> a multi-process scenario (such as the one with gdb and Joe's program) the
> problem was indeed triggered before.

Here's my big question: How could things work at all? Was it sometimes
saving MMX registers between context switches and sometimes not? I saw a
lot of behavior under the debugger which left me puzzled about this.

> I've already committed the patch into the monotone server. If you can
> check that firefox / xdvi aren't crashing anymore I'd be glad.

Okay, I am downloading the 20040910 snapshot. (Am I correct in
assuming the fix is in this snapshot?) I hope I will have time to try
it out next week.

-- 
Joe
From: <sl...@bl...> - 2004-09-13 09:29:19
sl...@bl... (Joe Wells (reverse mailbox letters for non-public replies)) writes:

> > I've already committed the patch into the monotone server. If you can
> > check that firefox / xdvi aren't crashing anymore I'd be glad.
>
> Okay, I am downloading the 20040910 snapshot. (Am I correct in
> assuming the fix is in this snapshot?) I hope I will have time to try
> it out next week.

I can now confirm that the 2004-09-10 snapshot fixes all of the problems
I had with wrong floating point exceptions.

Thanks again for fixing this!

-- 
Joe