From: Manuel L. <ma...@ro...> - 2007-10-16 11:52:45
|
Greetings,

With 2.6.23 my board(s) are again experiencing random segfaults and pipe
failures (compile jobs randomly abort with bash segfaulting and/or gcc/make
stop with "too many commandline options" errors, ...).

I tracked it to these two commits; reverting both stabilizes the system
again:

sh: Reclaim beginning of P3 space for vmalloc area
http://git.kernel.org/?p=linux/kernel/git/lethal/sh-2.6.git;a=commit;h=f0b859e3d63a07995f0db294864c2f3c9228f1e4

sh: Add kmap_coherent()/kunmap_coherent() interface for SH-4.
http://git.kernel.org/?p=linux/kernel/git/lethal/sh-2.6.git;a=commit;h=8cf1a74305688c85fc8d23ab7432a0c447ee6413

Thanks,
Manuel Lauss |
From: Paul M. <le...@li...> - 2007-10-17 04:49:29
|
On Tue, Oct 16, 2007 at 01:52:31PM +0200, Manuel Lauss wrote:
> With 2.6.23 my board(s) are again experiencing random segfaults and
> pipe failures (compile jobs randomly abort with bash segfaulting
> and/or gcc/make stop with "too many commandline options" errors, ...)
> I tracked it to these two commits; reverting both stabilizes the system
> again:
>
What CPUs are you running on, and what sort of cache configuration are
you using? Presumably write-through caching also works fine?

I don't experience any breakage on a 4-way d-cache, but there have been
problem reports on other CPUs with 2-way. Specifically it seems to be a
cache colouring problem; bumping up the page size also corrects the
issue.

I'll see about reproducing and debugging it on a 2-way dcache.. |
From: Manuel L. <ma...@ro...> - 2007-10-17 06:26:28
|
On Wed, Oct 17, 2007 at 01:48:51PM +0900, Paul Mundt wrote:
> On Tue, Oct 16, 2007 at 01:52:31PM +0200, Manuel Lauss wrote:
> > With 2.6.23 my board(s) are again experiencing random segfaults and
> > pipe failures (compile jobs randomly abort with bash segfaulting
> > and/or gcc/make stop with "too many commandline options" errors, ...)
> > I tracked it to these two commits; reverting both stabilizes the system
> > again:
> >
> What CPUs are you running on, and what sort of cache configuration are
> you using? Presumably write-through caching also works fine?

7760 with "standard" config: write-back, direct-mapped disabled, no SQ,
no OCRAM. Haven't tried WT; kernels with WT used to never boot. I'll
give it a shot.

> I don't experience any breakage on a 4-way d-cache, but there have been
> problem reports on other CPUs with 2-way. Specifically it seems to be a
> cache colouring problem, bumping up the page size also corrects the
> issue.
>
> I'll see about reproducing and debugging it on a 2-way dcache..

Thank you very much!

Manuel Lauss |
From: Mike F. <va...@ge...> - 2007-10-17 16:35:14
|
On Wednesday 17 October 2007, Paul Mundt wrote:
> On Tue, Oct 16, 2007 at 01:52:31PM +0200, Manuel Lauss wrote:
> > With 2.6.23 my board(s) are again experiencing random segfaults and
> > pipe failures (compile jobs randomly abort with bash segfaulting
> > and/or gcc/make stop with "too many commandline options" errors, ...)
> > I tracked it to these two commits; reverting both stabilizes the system
> > again:
>
> What CPUs are you running on, and what sort of cache configuration are
> you using? Presumably write-through caching also works fine?
>
> I don't experience any breakage on a 4-way d-cache, but there have been
> problem reports on other CPUs with 2-way. Specifically it seems to be a
> cache colouring problem, bumping up the page size also corrects the
> issue.
>
> I'll see about reproducing and debugging it on a 2-way dcache..

i'm seeing random crashes on my lantank running 2.6.23 ... rebooting into
2.6.16.xx works fine.

that has a SH7751R with 2-way caches in write-back mode ... haven't had time to
try the suggestions above though :/
-mike |
From: Paul M. <le...@li...> - 2007-11-05 07:37:21
|
On Wed, Oct 17, 2007 at 12:32:45PM -0400, Mike Frysinger wrote:
> On Wednesday 17 October 2007, Paul Mundt wrote:
> i'm seeing random crashes on my lantank running 2.6.23 ... rebooting into
> 2.6.16.xx works fine.
>
> that has a SH7751R with 2-way caches in write-back mode ... haven't had time to

Well, no luck reproducing things on SH7751R or SH7760. If you have a
reproducible workload, that would really help.

On the other hand, there was at least one bug in the page colouring, so
I've ripped out the old code and made the kmap_coherent() interface more
consistently used. This implementation can still be optimized with
regards to the page's dcache state, but I'm more concerned about
correctness at the moment. See how the following patch works for you.

---

 arch/sh/mm/clear_page.S     |   45 --------------------------
 arch/sh/mm/copy_page.S      |   61 -----------------------------------
 arch/sh/mm/pg-sh4.c         |   75 ++++++++++++++++++++++++++++++--------------
 include/asm-sh/cacheflush.h |   18 ++++++++--
 include/asm-sh/page.h       |   11 ++++--
 5 files changed, 73 insertions(+), 137 deletions(-)

diff --git a/arch/sh/mm/clear_page.S b/arch/sh/mm/clear_page.S
index 8a70613..7a7c81e 100644
--- a/arch/sh/mm/clear_page.S
+++ b/arch/sh/mm/clear_page.S
@@ -150,48 +150,3 @@ ENTRY(__clear_user)
 	.long	8b, .Lbad_clear_user
 	.long	9b, .Lbad_clear_user
 .previous
-
-#if defined(CONFIG_CPU_SH4)
-/*
- * __clear_user_page
- * @to: P3 address (with same color)
- * @orig_to: P1 address
- *
- * void __clear_user_page(void *to, void *orig_to)
- */
-
-/*
- * r0 --- scratch
- * r4 --- to
- * r5 --- orig_to
- * r6 --- to + PAGE_SIZE
- */
-ENTRY(__clear_user_page)
-	mov.l	.Lpsz,r0
-	mov	r4,r6
-	add	r0,r6
-	mov	#0,r0
-	!
-1:	ocbi	@r5
-	add	#32,r5
-	movca.l	r0,@r4
-	mov	r4,r1
-	add	#32,r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	mov.l	r0,@-r4
-	add	#28,r4
-	cmp/eq	r6,r4
-	bf/s	1b
-	ocbwb	@r1
-	!
-	rts
-	nop
-.Lpsz:	.long	PAGE_SIZE
-
-#endif
-
diff --git a/arch/sh/mm/copy_page.S b/arch/sh/mm/copy_page.S
index 3d8409d..4068501 100644
--- a/arch/sh/mm/copy_page.S
+++ b/arch/sh/mm/copy_page.S
@@ -68,67 +68,6 @@ ENTRY(copy_page_slow)
 	rts
 	nop
 
-#if defined(CONFIG_CPU_SH4)
-/*
- * __copy_user_page
- * @to: P1 address (with same color)
- * @from: P1 address
- * @orig_to: P1 address
- *
- * void __copy_user_page(void *to, void *from, void *orig_to)
- */
-
-/*
- * r0, r1, r2, r3, r4, r5, r6, r7 --- scratch
- * r8 --- from + PAGE_SIZE
- * r9 --- orig_to
- * r10 --- to
- * r11 --- from
- */
-ENTRY(__copy_user_page)
-	mov.l	r8,@-r15
-	mov.l	r9,@-r15
-	mov.l	r10,@-r15
-	mov.l	r11,@-r15
-	mov	r4,r10
-	mov	r5,r11
-	mov	r6,r9
-	mov	r5,r8
-	mov.l	.Lpsz,r0
-	add	r0,r8
-	!
-1:	ocbi	@r9
-	add	#32,r9
-	mov.l	@r11+,r0
-	mov.l	@r11+,r1
-	mov.l	@r11+,r2
-	mov.l	@r11+,r3
-	mov.l	@r11+,r4
-	mov.l	@r11+,r5
-	mov.l	@r11+,r6
-	mov.l	@r11+,r7
-	movca.l	r0,@r10
-	mov	r10,r0
-	add	#32,r10
-	mov.l	r7,@-r10
-	mov.l	r6,@-r10
-	mov.l	r5,@-r10
-	mov.l	r4,@-r10
-	mov.l	r3,@-r10
-	mov.l	r2,@-r10
-	mov.l	r1,@-r10
-	ocbwb	@r0
-	cmp/eq	r11,r8
-	bf/s	1b
-	add	#28,r10
-	!
-	mov.l	@r15+,r11
-	mov.l	@r15+,r10
-	mov.l	@r15+,r9
-	mov.l	@r15+,r8
-	rts
-	nop
-#endif
 	.align 2
 .Lpsz:	.long	PAGE_SIZE
 /*
diff --git a/arch/sh/mm/pg-sh4.c b/arch/sh/mm/pg-sh4.c
index 25f5c6f..8c7a9ca 100644
--- a/arch/sh/mm/pg-sh4.c
+++ b/arch/sh/mm/pg-sh4.c
@@ -9,6 +9,8 @@
 #include <linux/mm.h>
 #include <linux/mutex.h>
 #include <linux/fs.h>
+#include <linux/highmem.h>
+#include <linux/module.h>
 #include <asm/mmu_context.h>
 #include <asm/cacheflush.h>
 
@@ -50,34 +52,61 @@ static inline void kunmap_coherent(struct page *page)
 void clear_user_page(void *to, unsigned long address, struct page *page)
 {
 	__set_bit(PG_mapped, &page->flags);
-	if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0)
-		clear_page(to);
-	else {
-		void *vto = kmap_coherent(page, address);
-		__clear_user_page(vto, to);
-		kunmap_coherent(vto);
-	}
+
+	clear_page(to);
+	if ((((address & PAGE_MASK) ^ (unsigned long)to) & CACHE_ALIAS))
+		__flush_wback_region(to, PAGE_SIZE);
 }
 
-/*
- * copy_user_page
- * @to: P1 address
- * @from: P1 address
- * @address: U0 address to be mapped
- * @page: page (virt_to_page(to))
- */
-void copy_user_page(void *to, void *from, unsigned long address,
-		    struct page *page)
+void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
+		       unsigned long vaddr, void *dst, const void *src,
+		       unsigned long len)
 {
+	void *vto;
+
 	__set_bit(PG_mapped, &page->flags);
-	if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0)
-		copy_page(to, from);
-	else {
-		void *vfrom = kmap_coherent(page, address);
-		__copy_user_page(vfrom, from, to);
-		kunmap_coherent(vfrom);
-	}
+
+	vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
+	memcpy(vto, src, len);
+	kunmap_coherent(vto);
+
+	if (vma->vm_flags & VM_EXEC)
+		flush_cache_page(vma, vaddr, page_to_pfn(page));
+}
+
+void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
+			 unsigned long vaddr, void *dst, const void *src,
+			 unsigned long len)
+{
+	void *vfrom;
+
+	__set_bit(PG_mapped, &page->flags);
+
+	vfrom = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
+	memcpy(dst, vfrom, len);
+	kunmap_coherent(vfrom);
+}
+
+void copy_user_highpage(struct page *to, struct page *from,
+			unsigned long vaddr, struct vm_area_struct *vma)
+{
+	void *vfrom, *vto;
+
+	__set_bit(PG_mapped, &to->flags);
+
+	vto = kmap_atomic(to, KM_USER1);
+	vfrom = kmap_coherent(from, vaddr);
+	copy_page(vto, vfrom);
+	kunmap_coherent(vfrom);
+
+	if (((vaddr ^ (unsigned long)vto) & CACHE_ALIAS))
+		__flush_wback_region(vto, PAGE_SIZE);
+
+	kunmap_atomic(vto, KM_USER1);
+	/* Make sure this page is cleared on other CPU's too before using it */
+	smp_wmb();
 }
+EXPORT_SYMBOL(copy_user_highpage);
 
 /*
  * For SH-4, we have our own implementation for ptep_get_and_clear
diff --git a/include/asm-sh/cacheflush.h b/include/asm-sh/cacheflush.h
index aa558da..b912461 100644
--- a/include/asm-sh/cacheflush.h
+++ b/include/asm-sh/cacheflush.h
@@ -43,21 +43,31 @@ extern void __flush_purge_region(void *start, int size);
 extern void __flush_invalidate_region(void *start, int size);
 #endif
 
-#define flush_cache_vmap(start, end)		flush_cache_all()
-#define flush_cache_vunmap(start, end)		flush_cache_all()
+#ifdef CONFIG_CPU_SH4
+extern void copy_to_user_page(struct vm_area_struct *vma,
+	struct page *page, unsigned long vaddr, void *dst, const void *src,
+	unsigned long len);
 
-#define copy_to_user_page(vma, page, vaddr, dst, src, len) \
+extern void copy_from_user_page(struct vm_area_struct *vma,
+	struct page *page, unsigned long vaddr, void *dst, const void *src,
+	unsigned long len);
+#else
+#define copy_to_user_page(vma, page, vaddr, dst, src, len)	\
 	do {							\
 		flush_cache_page(vma, vaddr, page_to_pfn(page));\
 		memcpy(dst, src, len);				\
 		flush_icache_user_range(vma, page, vaddr, len);	\
 	} while (0)
 
-#define copy_from_user_page(vma, page, vaddr, dst, src, len) \
+#define copy_from_user_page(vma, page, vaddr, dst, src, len)	\
 	do {							\
 		flush_cache_page(vma, vaddr, page_to_pfn(page));\
 		memcpy(dst, src, len);				\
 	} while (0)
+#endif
+
+#define flush_cache_vmap(start, end)		flush_cache_all()
+#define flush_cache_vunmap(start, end)		flush_cache_all()
 
 #define HAVE_ARCH_UNMAPPED_AREA
 
diff --git a/include/asm-sh/page.h b/include/asm-sh/page.h
index 3aa8b07..d00a8fd 100644
--- a/include/asm-sh/page.h
+++ b/include/asm-sh/page.h
@@ -73,10 +73,13 @@ extern void copy_page_nommu(void *to, void *from);
 #if !defined(CONFIG_CACHE_OFF) && defined(CONFIG_MMU) && \
 	(defined(CONFIG_CPU_SH4) || defined(CONFIG_SH7705_CACHE_32KB))
 struct page;
-extern void clear_user_page(void *to, unsigned long address, struct page *pg);
-extern void copy_user_page(void *to, void *from, unsigned long address, struct page *pg);
-extern void __clear_user_page(void *to, void *orig_to);
-extern void __copy_user_page(void *to, void *from, void *orig_to);
+struct vm_area_struct;
+extern void clear_user_page(void *to, unsigned long address, struct page *page);
+#ifdef CONFIG_CPU_SH4
+extern void copy_user_highpage(struct page *to, struct page *from,
+	unsigned long vaddr, struct vm_area_struct *vma);
+#define __HAVE_ARCH_COPY_USER_HIGHPAGE
+#endif
 #else
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
|
From: Manuel L. <ma...@ro...> - 2007-11-05 07:56:43
|
Hi Paul,

On Mon, Nov 05, 2007 at 04:37:07PM +0900, Paul Mundt wrote:
> On Wed, Oct 17, 2007 at 12:32:45PM -0400, Mike Frysinger wrote:
> > On Wednesday 17 October 2007, Paul Mundt wrote:
> > i'm seeing random crashes on my lantank running 2.6.23 ... rebooting into
> > 2.6.16.xx works fine.
> >
> > that has a SH7751R with 2-way caches in write-back mode ... haven't had time to
> Well, no luck reproducing things on SH7751R or SH7760. If you have a
> reproducible workload, that would really help.

Usually I let it compile GCC or other large codebases (qt-4, openssl). When
the gcc compile job survives (after ~48 hours) I assume the system is stable ;)

> On the other hand, there was at least one bug in the page colouring, so
> I've ripped out the old code and made the kmap_coherent() interface more
> consistently used. This implementation can still be optimized with
> regards to the page's dcache state, but I'm more concerned about
> correctness at the moment. See how the following patch works for you.

I'll let it run for a few days.

Thanks!
Manuel Lauss |
From: Paul M. <le...@li...> - 2007-11-05 08:01:45
|
On Mon, Nov 05, 2007 at 08:49:54AM +0100, Manuel Lauss wrote:
> On Mon, Nov 05, 2007 at 04:37:07PM +0900, Paul Mundt wrote:
> > On Wed, Oct 17, 2007 at 12:32:45PM -0400, Mike Frysinger wrote:
> > > On Wednesday 17 October 2007, Paul Mundt wrote:
> > > i'm seeing random crashes on my lantank running 2.6.23 ... rebooting into
> > > 2.6.16.xx works fine.
> > >
> > > that has a SH7751R with 2-way caches in write-back mode ... haven't had time to
> > Well, no luck reproducing things on SH7751R or SH7760. If you have a
> > reproducible workload, that would really help.
>
> Usually I let it compile GCC or other large codebases (qt-4, openssl). When
> the gcc compile job survives (after ~48 hours) I assume the system is stable ;)
>
On 4-way dcache I don't see any problems at least. My SH7785 has been
building various toolchains for about a week straight without any issues,
though that has been off of SATA; I have not tried alternate roots.

I also wrote an exerciser for breaking COW pages, which stresses the
kmap_coherent() path pretty well, but have likewise not hit issues there
either (on 2-way or 4-way).

Looks like the next step is large builds on 2-way or direct-mapped, then. |
From: Manuel L. <ma...@ro...> - 2007-11-08 07:26:33
|
Hello Paul,

On Mon, Nov 05, 2007 at 04:37:07PM +0900, Paul Mundt wrote:
> On Wed, Oct 17, 2007 at 12:32:45PM -0400, Mike Frysinger wrote:
> > On Wednesday 17 October 2007, Paul Mundt wrote:
> > i'm seeing random crashes on my lantank running 2.6.23 ... rebooting into
> > 2.6.16.xx works fine.
> >
> > that has a SH7751R with 2-way caches in write-back mode ... haven't had time to
> Well, no luck reproducing things on SH7751R or SH7760. If you have a
> reproducible workload, that would really help.
>
> On the other hand, there was at least one bug in the page colouring, so
> I've ripped out the old code and made the kmap_coherent() interface more
> consistently used. This implementation can still be optimized with
> regards to the page's dcache state, but I'm more concerned about
> correctness at the moment. See how the following patch works for you.

The patch seems to help. The GCC build has not finished yet but is in the
final stages; if there were problems it would've failed a _lot_ sooner.

Thank you very much!

Manuel Lauss |
From: Paul M. <le...@li...> - 2007-11-08 07:57:45
|
On Thu, Nov 08, 2007 at 08:26:20AM +0100, Manuel Lauss wrote:
> On Mon, Nov 05, 2007 at 04:37:07PM +0900, Paul Mundt wrote:
> > Well, no luck reproducing things on SH7751R or SH7760. If you have a
> > reproducible workload, that would really help.
> >
> > On the other hand, there was at least one bug in the page colouring, so
> > I've ripped out the old code and made the kmap_coherent() interface more
> > consistently used. This implementation can still be optimized with
> > regards to the page's dcache state, but I'm more concerned about
> > correctness at the moment. See how the following patch works for you.
>
> The patch seems to help. The GCC build has not finished yet but is in the
> final stages; if there were problems it would've failed a _lot_ sooner.
>
> Thank you very much!
>
Good to hear, I'll queue this for -rc3 then.

Magnus still reports a bug with 4k pages on SH7751R, which doesn't show
up when using 64k pages, so there may still be some issues to iron out on
certain workloads.

There are also some additional patches to switch the lazy D-cache
writeback method for even lazier writeback, but these are 2.6.25 material
at this point, especially since SH7751R may still have some issues with
the current interface.

I look forward to the day when the L1 dcache set associativity doubles
again, so we can avoid all of this colouring tedium in the future. On
4-way it's already a bit of a toss-up. Now if only people would stop
using antiquated direct-mapped and 2-way CPUs.. ;-) |
From: Manuel L. <ma...@ro...> - 2007-11-08 08:06:02
|
> I look forward to the day when the L1 dcache set associativity doubles
> again, so we can avoid all of this colouring tedium in the future. On
> 4-way it's already a bit of a toss-up. Now if only people would stop
> using antiquated direct-mapped and 2-way CPUs.. ;-)

That might start when Renesas actually ships those things outside of
Japan ;-) (Actually it depends on the business unit: the SHMobile chips
(7722/7723) are apparently _way_ easier to get than the 7780 series.)

Manuel Lauss |