From: Denis D. <ddo...@po...> - 2001-01-16 07:48:14
Attachments:
kernel-signal-cache.patch

Hello Niibe-san,

I have been getting rare crashes in Linux 2.4.0 when handling SIGALRM
signals. I have managed to trace the problem down, and it seems that the
sigreturn trap code is sometimes not getting flushed to the signal frame.
The problem goes away completely if I do a flush_cache_all instead of
flush_icache_range. I have attached a patch that does this. I cannot see
anything wrong with the current flush_icache_range usage, but I suspect it
is broken. The flush_cache_all is complete overkill, so I would be very
keen to hear if you have a fix for flush_icache_range.

The problem is very difficult to reproduce in a test program. It only
happens when parts of the stack have already been mapped into the address
cache and the signal handler is very short. I have a compiled version of
lrz from the lrzsz package that consistently crashes 2.4.0 every time it
is run. I can send this to you if you want.

Regards,
Denis
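
A note for readers who want to exercise the same kernel path: the sketch below is a minimal user-space stress loop. It is not Denis's test case and not a guaranteed reproducer, since the report itself says the crash is hard to trigger outside lrz. It installs a deliberately tiny SIGALRM handler and delivers the signal repeatedly, so each delivery makes the kernel build a signal frame and return through the sigreturn trampoline. The names on_alarm and count_short_alarms are hypothetical, and raise() is used instead of a timer only to keep the count deterministic.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Number of times the handler has run since the last reset. */
static volatile sig_atomic_t alarms_seen;

/* Deliberately tiny handler: the report says short handlers matter. */
static void on_alarm(int signo)
{
    (void)signo;
    alarms_seen++;
}

/*
 * Hypothetical helper: deliver SIGALRM `rounds` times to this process
 * and return how many times the handler ran, or -1 if the handler
 * could not be installed.
 */
int count_short_alarms(int rounds)
{
    struct sigaction sa, old;

    assert(rounds >= 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return -1;

    alarms_seen = 0;
    for (int i = 0; i < rounds; i++) {
        /* In a single-threaded process raise() returns only after
         * the handler has finished. */
        if (raise(SIGALRM) != 0)
            break;
    }

    /* Restore whatever disposition was in place before. */
    sigaction(SIGALRM, &old, NULL);
    return (int)alarms_seen;
}
```

On a healthy kernel count_short_alarms(n) should simply return n. On an affected system the expected symptom would be a crash on return from the handler rather than a wrong count.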
From: SUGIOKA T. <su...@it...> - 2001-01-17 01:09:03

At 18:45 01/01/16 +1100, Denis Dowling wrote:
>I have been getting rare crashes in Linux 2.4.0 when handling SIGALRM
>signals. I have managed to trace the problem down and it seems that the
>sigreturn trap code is not getting flushed to the signal frame
>sometimes. The problem goes away completely if I do a flush_cache_all
>instead of flush_icache_range. I have attached a patch that does this. I
>cannot see anything wrong with the current flush_icache_range usage but
>suspect it is broken. The flush_cache_all is complete overkill so would
>be very keen to hear if you have a fix for flush_icache_range.

Currently flush_icache_range uses the A-bit, but the A-bit works only when
an ITLB entry exists. A UTLB entry alone does not work.
So every I-cache entry should be compared with the physical address by hand,
or the whole I-cache should be invalidated.

----
SUGIOKA Toshinobu
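
To make the "compare by hand" option concrete, here is a rough model of it. It is only a sketch: the real I-cache address array is a memory-mapped hardware structure, while this version walks an ordinary array so that the logic can be read and run anywhere. The entry count, tag mask, valid bit, and the function name icache_invalidate_phys are all assumptions chosen to resemble an 8 KB direct-mapped cache with 32-byte lines and 4 KB pages. They are not taken from the thread and should be checked against the hardware manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All of the following layout constants are assumptions for this model. */
#define IC_ENTRIES    256u         /* 8 KB cache / 32-byte lines          */
#define IC_LINE_SHIFT 5u           /* log2 of the line size               */
#define IC_PAGE_LINES 128u         /* lines per 4 KB page                 */
#define IC_TAG_MASK   0xfffffc00u  /* tag = physical address bits 31..10  */
#define IC_VALID      0x00000001u  /* V bit of an address-array entry     */

/*
 * Walk every entry of a (simulated) I-cache address array and clear the
 * valid bit of each entry that holds the line containing physical
 * address `phys`.  No TLB lookup is involved.  Returns the number of
 * entries invalidated.
 */
size_t icache_invalidate_phys(uint32_t *addr_array, uint32_t phys)
{
    /* Line position inside the page: same for virtual and physical. */
    uint32_t line_in_page = (phys >> IC_LINE_SHIFT) & (IC_PAGE_LINES - 1);
    size_t hits = 0;

    assert(addr_array != NULL);

    for (uint32_t i = 0; i < IC_ENTRIES; i++) {
        uint32_t entry = addr_array[i];

        if (!(entry & IC_VALID))
            continue;
        if ((i & (IC_PAGE_LINES - 1)) != line_in_page)
            continue;                       /* a different line        */
        if ((entry & IC_TAG_MASK) != (phys & IC_TAG_MASK))
            continue;                       /* a different physical page */

        addr_array[i] = entry & ~IC_VALID;  /* invalidate this line    */
        hits++;
    }
    return hits;
}
```

Because the cache is indexed by virtual address, one physical line may sit in more than one slot, which is why the sketch scans all entries instead of computing a single index.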
From: Denis D. <ddo...@po...> - 2001-01-19 01:51:16

SUGIOKA Toshinobu wrote:
>
> At 18:45 01/01/16 +1100, Denis Dowling wrote:
> >I have been getting rare crashes in Linux 2.4.0 when handling SIGALRM
> >signals. I have managed to trace the problem down and it seems that the
> >sigreturn trap code is not getting flushed to the signal frame
> >sometimes. The problem goes away completely if I do a flush_cache_all
> >instead of flush_icache_range. I have attached a patch that does this. I
> >cannot see anything wrong with the current flush_icache_range usage but
> >suspect it is broken. The flush_cache_all is complete overkill so would
> >be very keen to hear if you have a fix for flush_icache_range.
>
> Currently flush_icache_range uses A-bit. but A-bit works only when
> ITLB entry exists. UTLB entry should not work.
> So every I-cache entry should be compared with physical address by hand
> or all I-cache should be invalidated.

At the moment I just completely invalidate the I and U caches. Do you
have any ideas on how to do this more efficiently?

I still have problems when I try to restart processes in the background.

sh-2.03# xterm

(here I type ^Z)

[2]+ Stopped xterm
sh-2.03# bg
[2]+ xterm &
xterm: Error 50, errno 14: Bad address
sh-2.03#
[2]+ Done(50) xterm

I am not sure if this problem is related?

Regards,
Denis.
From: SUGIOKA T. <su...@it...> - 2001-01-19 03:50:54

At 12:48 01/01/19 +1100, Denis Dowling <ddo...@po...> wrote:
>
>sh-2.03# xterm
>
>(here I type ^Z)
>
>[2]+ Stopped xterm
>sh-2.03# bg
>[2]+ xterm &
>xterm: Error 50, errno 14: Bad address
>sh-2.03#
>[2]+ Done(50) xterm
>
>
>I am not sure if this problem is related?

I hear that xterm triggers a binutils bug.
Kaz Kojima's latest patch may fix this problem.

Please check
http://dodo.nurs.or.jp/~kkojima/index.html/temporary-README-new-toolchain

If you update binutils, you should rebuild gcc with the new binutils,
then glibc-2.2, and then all other shared libraries, in this order.

RPMs of the new tool chains are at http://www.sh-linux.org/rpm-index/index.html
If you are a RedHat user, please test them.

----
SUGIOKA Toshinobu
From: Denis D. <ddo...@po...> - 2001-01-19 04:36:17

I don't think this is a binutils bug. I had to apply all of Kaz Kojima's
patches at the start of this year to get shared libraries to work. I
have just rechecked the patches and I am running the latest copies of
each.

SUGIOKA Toshinobu wrote:
>
> At 12:48 01/01/19 +1100, Denis Dowling <ddo...@po...> wrote:
> >
> >sh-2.03# xterm
> >
> >(here I type ^Z)
> >
> >[2]+ Stopped xterm
> >sh-2.03# bg
> >[2]+ xterm &
> >xterm: Error 50, errno 14: Bad address
> >sh-2.03#
> >[2]+ Done(50) xterm
> >
> >
> >I am not sure if this problem is related?
>
> I hear that xterm triggers binutils's bug.
> Kaz Kojima's latest patch may fix this problem.
>
> Please check
> http://dodo.nurs.or.jp/~kkojima/index.html/temporary-README-new-toolchain
>
> If you update binutils, you should rebuild gcc with new binutils
> glibc-2.2, and then all other shared libraries in this order.
>
> Rpms of new tool chains are at http://www.sh-linux.org/rpm-index/index.html
> If you are RedHat user, please test them.
From: NIIBE Y. <gn...@m1...> - 2001-01-23 07:57:54

SUGIOKA Toshinobu wrote:
> Currently flush_icache_range uses A-bit. but A-bit works only when
> ITLB entry exists. UTLB entry should not work.
> So every I-cache entry should be compared with physical address by hand
> or all I-cache should be invalidated.

Umm... I didn't know that. I've checked the hardware manual, and
you're right. We need an ITLB entry to flush the I-cache with the A-bit.

While The Right Thing may be to fix flush_icache_range, I'm not sure
about its semantics right now. It seems to me that with the newer
scheme (Documentation/cachetlb.txt) we don't need to purge the I-cache,
only to write back the D-cache, but I'm not sure. Besides, the
arguments (START, END) currently passed to flush_icache_range cover
large ranges (i.e. mmap ranges) except in signal handling, so we need
to rewrite the function anyway.

Could you please try this out? This purges the entry regardless of
whether the page is mapped or not.

This also includes a bug fix for SH-3 shared pages. The X server worked
quite badly on the SH-3 implementation because of that.

2001-01-23  NIIBE Yutaka  <gn...@m1...>

	* arch/sh/kernel/signal.c (setup_frame, setup_rt_frame): Use
	flush_cache_sigtramp.
	* arch/sh/mm/cache.c (flush_cache_sigtramp): Implemented.

	* include/asm-sh/pgtable.h (_PAGE_SHARED): Always _PAGE_U0_SHARED.
	(was conditionally _PAGE_HW_SHARED on SH-3).  With
	_PAGE_HW_SHARED, all processes share the page, while proper
	semantics is "some processes share the page".
	(flush_cache_sigtramp): New function.

Index: arch/sh/kernel/signal.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/kernel/signal.c,v
retrieving revision 1.13
diff -u -r1.13 signal.c
--- arch/sh/kernel/signal.c	2000/09/04 06:41:45	1.13
+++ arch/sh/kernel/signal.c	2001/01/23 07:46:38
@@ -433,7 +433,7 @@
 		current->comm, current->pid, frame, regs->pc, regs->pr);
 #endif
 
-	flush_icache_range(regs->pr, regs->pr+4);
+	flush_cache_sigtramp(regs->pr);
 	return;
 
 give_sigsegv:
@@ -507,7 +507,7 @@
 		current->comm, current->pid, frame, regs->pc, regs->pr);
 #endif
 
-	flush_icache_range(regs->pr, regs->pr+4);
+	flush_cache_sigtramp(regs->pr);
 	return;
 
 give_sigsegv:
Index: arch/sh/mm/cache.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache.c,v
retrieving revision 1.21
diff -u -r1.21 cache.c
--- arch/sh/mm/cache.c	2000/09/18 05:15:35	1.21
+++ arch/sh/mm/cache.c	2001/01/23 07:46:38
@@ -244,6 +244,22 @@
 }
 
 /*
+ * Write back the D-cache and purge the I-cache for signal trampoline.
+ */
+void flush_cache_sigtramp(unsigned long addr)
+{
+	unsigned long v, index;
+
+	v = addr & ~(L1_CACHE_BYTES-1);
+	asm volatile("ocbwb %0"
+		     : /* no output */
+		     : "m" (__m(v)));
+
+	index = CACHE_IC_ADDRESS_ARRAY| (v&CACHE_IC_ENTRY_MASK);
+	ctrl_outl(0, index); /* Clear out Valid-bit */
+}
+
+/*
  * Invalidate the I-cache of the page (don't need to write back D-cache).
  *
  * Called from kernel/ptrace.c, mm/memory.c after flush_page_to_ram is called.
Index: include/asm-sh/pgtable.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v
retrieving revision 1.16
diff -u -r1.16 pgtable.h
--- include/asm-sh/pgtable.h	2000/11/22 07:05:18	1.16
+++ include/asm-sh/pgtable.h	2001/01/23 07:46:41
@@ -39,6 +39,7 @@
 #define flush_dcache_page(page)			do { } while (0)
 #define flush_icache_range(start, end)		do { } while (0)
 #define flush_icache_page(vma,pg)		do { } while (0)
+#define flush_cache_sigtramp(vaddr)		do { } while (0)
 #elif defined(__SH4__)
 /*
  * Caches are broken on SH-4, so we need them.
@@ -52,6 +53,7 @@
 extern void flush_dcache_page(struct page *pg);
 extern void flush_icache_range(unsigned long start, unsigned long end);
 extern void flush_icache_page(struct vm_area_struct *vma, struct page *pg);
+extern void flush_cache_sigtramp(unsigned long addr);
 #endif
 
 /*
@@ -125,11 +127,7 @@
 /* Hardware flags: SZ=1 (4k-byte) */
 #define _PAGE_FLAGS_HARD	0x00000010
 
-#if defined(__sh3__)
-#define _PAGE_SHARED	_PAGE_HW_SHARED
-#elif defined(__SH4__)
 #define _PAGE_SHARED	_PAGE_U0_SHARED
-#endif
 
 #define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY)
 #define _KERNPG_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_ACCESSED | _PAGE_DIRTY)
--
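
For readers following the address arithmetic in flush_cache_sigtramp, the two computations in the patch can be tried out in user space. This is a sketch of the arithmetic only: the ocbwb write-back and the ctrl_outl store need kernel mode on SH-4 hardware and are left out. The three constant values are assumptions (32-byte lines, an address array based at 0xf0000000, entry selection by address bits 12..5) and are not stated in the thread. The helper names sigtramp_line and sigtramp_ic_slot are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the patch uses these names but does not show them. */
#define L1_CACHE_BYTES         32u
#define CACHE_IC_ADDRESS_ARRAY 0xf0000000u
#define CACHE_IC_ENTRY_MASK    0x00001fe0u

/* Start of the cache line containing addr (the `v` of the patch). */
uint32_t sigtramp_line(uint32_t addr)
{
    /* The mask trick only works for power-of-two line sizes. */
    assert((L1_CACHE_BYTES & (L1_CACHE_BYTES - 1)) == 0);
    return addr & ~(L1_CACHE_BYTES - 1);
}

/*
 * Address-array slot that the patch writes 0 to in order to clear the
 * valid bit (the `index` of the patch).
 */
uint32_t sigtramp_ic_slot(uint32_t addr)
{
    return CACHE_IC_ADDRESS_ARRAY |
           (sigtramp_line(addr) & CACHE_IC_ENTRY_MASK);
}
```

Under these assumptions a trampoline at 0x7ffffe54 lies in the line starting at 0x7ffffe40, and the slot written is 0xf0001e40. The slot depends only on the virtual address, which fits Niibe's remark that the entry is purged whether or not the page is mapped.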