From: Masahiro A. <m-...@aa...> - 2001-08-07 14:20:18
I may not be clearly understanding your point but I would like to note...

On Sun, 5 Aug 2001 17:26:12 +0900 (JST)
NIIBE Yutaka <gn...@m1...> wrote:

> Honestly speaking, I couldn't find a good way to extend current
> implementation to general kernel FPU support (more than one kernel
> task uses FPU). If we really need this, I think that it is good
> design general one rather than extending current one.
>
> Comments? Opinions? Does anyone have driver which uses FPU?

RTLinux for SH4 uses FPU. Or more precisely, real-time task running
under rtlinux can use fpu. There can be more than one task running
simultaneously.

I'm taking Summer break now and my brain stopped working ^^;), but I
would like to say that:

1) it would be good if linux-sh kernel can handle fpu issues correctly,
   even if it is compiled with -m4 (without -m4-nofpu option).
2) if it is not possible, rtlinux patch should take care of that.
3) or, I (or somebody) should contribute to what Niibe-san is concerned
   about now.

Here is my opinion. I'll revisit this issue when I get back to work.

--
Masahiro ABE, A&D Co., Ltd.
From: NIIBE Y. <gn...@m1...> - 2001-08-07 14:17:50
SUGIOKA Toshinobu writes:
 > Yes. This works good, nice performance.

Thanks for testing. I'll commit the following. It does more clean-up.

        * arch/sh/mm/cache-sh4.c (ptep_get_and_clear): Moved to ...
        (check_cache_page): Removed.
        (__flush_icache_page): Removed.
        * include/asm-sh/pgtable.h (__flush_icache_page): Removed.
        * include/asm-sh/pgalloc.h (ptep_get_and_clear): ... here.
        (ptep_test_and_clear_young, ptep_test_and_clear_dirty,
        ptep_set_wrprotect, ptep_mkdirty): Moved from pgtable.h.
        (ptep_get_and_clear needs definition of mm.h).

        * include/asm-sh/pgtable.h (PG_mapped): Renamed from
        PG_mapped_with_alias.
        (__flush_cache_page): Removed last argument, and add first arg.
        * arch/sh/mm/cache-sh4.c (__flush_cache_page): Take u0 address
        as first argument. Don't care about I-cache.
        (flush_dcache_page): Follow the change.

        * include/asm-sh/ide.h (ide_insw): Removed.
        * drivers/cdrom/gdrom.c (gdrom_intr): Remove __flush_wback_region.

        * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when first
        mapped, even if it has no alias. (We needed this for NFS).

Index: arch/sh/mm/cache-sh4.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v
retrieving revision 1.11
diff -u -r1.11 cache-sh4.c
--- arch/sh/mm/cache-sh4.c 2001/08/07 05:20:02 1.11
+++ arch/sh/mm/cache-sh4.c 2001/08/07 14:09:50
@@ -196,9 +196,8 @@
 /*
  * Writeback&Invalidate the D-cache of the page
- * Invalidate the I-cache of the page, if needed
  */
-void __flush_cache_page(unsigned long phys, int exec)
+void __flush_cache_page(unsigned long u0, unsigned long phys)
 {
         unsigned long addr, data;
         unsigned long flags;
@@ -207,74 +206,65 @@
         save_and_cli(flags);
         jump_to_P2();
-        /* Loop all the D-cache */
-        for (addr = CACHE_OC_ADDRESS_ARRAY;
-             addr < (CACHE_OC_ADDRESS_ARRAY
-                     +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT));
-             addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
-                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
-                if (data == phys)
-                        ctrl_outl(0, addr);
-        }
-
-        if (exec)
-                /* Loop all the I-cache */
-                for (addr = CACHE_IC_ADDRESS_ARRAY;
-                     addr < (CACHE_IC_ADDRESS_ARRAY
-                             +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT));
-                     addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
-                        data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
-                        if (data == phys)
-                                ctrl_outl(0, addr);
-                }
-        back_to_P1();
-        restore_flags(flags);
-}
-
-void __flush_icache_page(unsigned long u0, unsigned long phys)
-{
-        unsigned long addr, data;
-        unsigned long flags;
-        phys|=CACHE_VALID;
-
-        save_and_cli(flags);
         if (u0) {
-                jump_to_P2();
-                /* Loop half of the I-cache */
-                for (addr = CACHE_IC_ADDRESS_ARRAY|(u0&0x1000);
-                     addr < ((CACHE_IC_ADDRESS_ARRAY|(u0&0x1000))
-                             +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT));
-                     addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
+                if ((u0^phys) & CACHE_ALIAS) {
+                        /* Loop 4K of the D-cache */
+                        for (addr = CACHE_OC_ADDRESS_ARRAY | (u0 & CACHE_ALIAS);
+                             addr < (CACHE_OC_ADDRESS_ARRAY + (u0 & CACHE_ALIAS)
+                                     +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT));
+                             addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
+                                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
+                                if (data == phys)
+                                        ctrl_outl(0, addr);
+                        }
+                }
+                /* Loop another 4K of the D-cache */
+                for (addr = CACHE_OC_ADDRESS_ARRAY | (phys & CACHE_ALIAS);
+                     addr < (CACHE_OC_ADDRESS_ARRAY + (phys & CACHE_ALIAS)
+                             +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT));
+                     addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
                         data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
                         if (data == phys)
                                 ctrl_outl(0, addr);
                 }
-                back_to_P1();
         } else {
-                jump_to_P2();
-                /* Loop all the I-cache */
-                for (addr = CACHE_IC_ADDRESS_ARRAY;
-                     addr < (CACHE_IC_ADDRESS_ARRAY
-                             +(CACHE_IC_NUM_ENTRIES << CACHE_IC_ENTRY_SHIFT));
-                     addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
+                /* Loop all the D-cache */
+                for (addr = CACHE_OC_ADDRESS_ARRAY;
+                     addr < (CACHE_OC_ADDRESS_ARRAY
+                             +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT));
+                     addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
                         data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
                         if (data == phys)
                                 ctrl_outl(0, addr);
                 }
-                back_to_P1();
         }
+
+#if 0 /* DEBUG DEBUG */
+        /* Loop all the I-cache */
+        for (addr = CACHE_IC_ADDRESS_ARRAY;
+             addr < (CACHE_IC_ADDRESS_ARRAY
+                     +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT));
+             addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
+                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
+                if (data == phys) {
+                        printk(KERN_INFO "__flush_cache_page: I-cache entry found\n");
+                        ctrl_outl(0, addr);
+                }
+        }
+#endif
+        back_to_P1();
         restore_flags(flags);
 }
 /*
- * Write back & invalidate the I/D-cache of the page.
+ * Write back & invalidate the D-cache of the page.
  * (To avoid "alias" issues)
  */
 void flush_dcache_page(struct page *page)
 {
         if (test_bit(PG_mapped_with_alias, &page->flags))
-                __flush_cache_page(PHYSADDR(page_address(page)), 1);
+                __flush_cache_page(0, PHYSADDR(page_address(page)));
 }
 void flush_cache_all(void)
@@ -393,68 +383,6 @@
 }
 /*
- * Check entries of the I-cache & D-cache of the page.
- * (To see "alias" issues)
- */
-void check_cache_page(struct page *pg)
-{
-        unsigned long phys, addr, data, i;
-        unsigned long kaddr;
-        unsigned long cache_line_index;
-        int bingo = 0;
-        unsigned long flags;
-
-        /* Physical address of this page */
-        phys = PHYSADDR(page_address(pg));
-        kaddr = phys + PAGE_OFFSET;
-        cache_line_index = (kaddr&CACHE_OC_ENTRY_MASK)>>CACHE_OC_ENTRY_SHIFT;
-
-        save_and_cli(flags);
-        jump_to_P2();
-        /* Loop all the D-cache */
-        for (i=0; i<CACHE_OC_NUM_ENTRIES; i++) {
-                addr = CACHE_OC_ADDRESS_ARRAY| (i<<CACHE_OC_ENTRY_SHIFT);
-                data = ctrl_inl(addr);
-                if ((data & (CACHE_UPDATED|CACHE_VALID))
-                    == (CACHE_UPDATED|CACHE_VALID)
-                    && (data&PAGE_MASK) == phys) {
-                        data &= ~(CACHE_VALID|CACHE_UPDATED);
-                        ctrl_outl(data, addr);
-                        if ((i^cache_line_index)&0x180)
-                                bingo |= 1;
-                }
-        }
-
-        cache_line_index &= 0xff;
-        /* Loop all the I-cache */
-        for (i=0; i<CACHE_IC_NUM_ENTRIES; i++) {
-                addr = CACHE_IC_ADDRESS_ARRAY| (i<<CACHE_IC_ENTRY_SHIFT);
-                data = ctrl_inl(addr);
-                if ((data & CACHE_VALID) && (data&PAGE_MASK) == phys) {
-                        data &= ~CACHE_VALID;
-                        ctrl_outl(data, addr);
-                        if (((i^cache_line_index)&0x80))
-                                bingo |= 2;
-                }
-        }
-        back_to_P1();
-        restore_flags(flags);
-
-        if (bingo) {
-                extern void dump_stack(void);
-
-                if (bingo&1)
-                        printk("BINGO!\n");
-#if 0
-                if (bingo&2)
-                        printk("Bingo!\n");
-#endif
-                dump_stack();
-                printk("--------------------\n");
-        }
-}
-
-/*
  * clear_user_page
  * @to: P1 address
  * @address: U0 address to be mapped
@@ -523,18 +451,4 @@
                 pte_clear(pte);
                 up(&p3map_sem[(address & CACHE_ALIAS)>>12]);
         }
-}
-
-pte_t ptep_get_and_clear(pte_t *ptep)
-{
-        pte_t pte = *ptep;
-
-        if (!pte_not_present(pte)) {
-                struct page *page = pte_page(pte);
-                if (VALID_PAGE(page)&&
-                    (!page->mapping || !(page->mapping->i_mmap_shared)))
-                        __clear_bit(PG_mapped_with_alias, &page->flags);
-        }
-        pte_clear(ptep);
-        return pte;
 }
Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.45
diff -u -r1.45 fault.c
--- arch/sh/mm/fault.c 2001/08/07 05:20:02 1.45
+++ arch/sh/mm/fault.c 2001/08/07 14:09:50
@@ -290,14 +290,12 @@
                 return;
 #if defined(__SH4__)
-        if ((address ^ pte_val(pte)) & CACHE_ALIAS) {
-                page = pte_page(pte);
-                if (VALID_PAGE(page) &&
-                    !test_bit(PG_mapped_with_alias, &page->flags)) {
-                        unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
-                        __flush_cache_page(phys, 1);
-                        set_bit(PG_mapped_with_alias, &page->flags);
-                }
+        page = pte_page(pte);
+        if (VALID_PAGE(page) &&
+            !test_bit(PG_mapped_with_alias, &page->flags)) {
+                unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
+                __flush_cache_page(address, phys);
+                set_bit(PG_mapped_with_alias, &page->flags);
         }
 #endif
Index: drivers/cdrom/gdrom.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/drivers/cdrom/gdrom.c,v
retrieving revision 1.4
diff -u -r1.4 gdrom.c
--- drivers/cdrom/gdrom.c 2001/08/03 23:50:59 1.4
+++ drivers/cdrom/gdrom.c 2001/08/07 14:09:50
@@ -144,7 +144,6 @@
         }
         insw(GDROM_DATA, ctrl->buf, count/2);
-        __flush_wback_region(ctrl->buf, count);
         ctrl->buf += count;
         ctrl->size -= count;
 }
Index: include/asm-sh/ide.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/ide.h,v
retrieving revision 1.16
diff -u -r1.16 ide.h
--- include/asm-sh/ide.h 2001/08/03 11:22:06 1.16
+++ include/asm-sh/ide.h 2001/08/07 14:09:52
@@ -17,21 +17,6 @@
 #include <linux/config.h>
 #include <asm/machvec.h>
-#if defined(__SH4__)
-#undef insw
-#define insw(port, buf, nr) ide_insw((port), (buf), (nr))
-
-static __inline__ void ide_insw(unsigned long port,
-                                void *dst,
-                                unsigned long count)
-{
-        extern void _insw (unsigned long port, void *dst, unsigned long count);
-
-        _insw(port, dst, count);
-        __flush_wback_region(dst, (count << 1));
-}
-#endif
-
 #ifndef MAX_HWIFS
 /* Should never have less than 2, ide-pci.c(ide_match_hwif) requires it */
 #define MAX_HWIFS 2
Index: include/asm-sh/pgalloc.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgalloc.h,v
retrieving revision 1.9
diff -u -r1.9 pgalloc.h
--- include/asm-sh/pgalloc.h 2001/07/18 04:24:43 1.9
+++ include/asm-sh/pgalloc.h 2001/08/07 14:09:53
@@ -95,4 +95,62 @@
 {
         /* Nothing to do */
 }
+#if defined(__SH4__)
+/*
+ * For SH-4, we have our own implementation for ptep_get_and_clear
+ */
+static inline pte_t ptep_get_and_clear(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+
+        pte_clear(ptep);
+        if (!pte_not_present(pte)) {
+                struct page *page = pte_page(pte);
+                if (VALID_PAGE(page)&&
+                    (!page->mapping || !(page->mapping->i_mmap_shared)))
+                        __clear_bit(PG_mapped_with_alias, &page->flags);
+        }
+        return pte;
+}
+#else
+static inline pte_t ptep_get_and_clear(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+        pte_clear(ptep);
+        return pte;
+}
+#endif
+
+/*
+ * Following functions are same as generic ones.
+ */
+static inline int ptep_test_and_clear_young(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+        if (!pte_young(pte))
+                return 0;
+        set_pte(ptep, pte_mkold(pte));
+        return 1;
+}
+
+static inline int ptep_test_and_clear_dirty(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+        if (!pte_dirty(pte))
+                return 0;
+        set_pte(ptep, pte_mkclean(pte));
+        return 1;
+}
+
+static inline void ptep_set_wrprotect(pte_t *ptep)
+{
+        pte_t old_pte = *ptep;
+        set_pte(ptep, pte_wrprotect(old_pte));
+}
+
+static inline void ptep_mkdirty(pte_t *ptep)
+{
+        pte_t old_pte = *ptep;
+        set_pte(ptep, pte_mkdirty(old_pte));
+}
 #endif /* __ASM_SH_PGALLOC_H */
Index: include/asm-sh/pgtable.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v
retrieving revision 1.41
diff -u -r1.41 pgtable.h
--- include/asm-sh/pgtable.h 2001/08/07 05:20:03 1.41
+++ include/asm-sh/pgtable.h 2001/08/07 14:09:53
@@ -52,6 +52,10 @@
 /*
  * Caches are broken on SH-4, so we need them.
  */
+
+/* Page is 4K, OC size is 16K, there are four lines. */
+#define CACHE_ALIAS 0x00003000
+
 extern void flush_cache_all(void);
 extern void flush_cache_mm(struct mm_struct *mm);
 extern void flush_cache_range(struct mm_struct *mm, unsigned long start,
@@ -70,8 +74,7 @@
 extern void __flush_purge_region(void *start, int size);
 /* Flush a page */
-extern void __flush_cache_page(unsigned long phys, int exec);
-extern void __flush_icache_page(unsigned long u0, unsigned long phys);
+extern void __flush_cache_page(unsigned long u0, unsigned long phys);
 /* Initialization of P3 area for copy_user_page */
 extern void p3_cache_init(void);
@@ -297,55 +300,6 @@
  * (We needed atomic implementation for SMP)
  *
  */
-
-#if defined(__SH4__)
-/* Page is 4K, OC size is 16K, there are four lines. */
-#define CACHE_ALIAS 0x00003000
-/*
- * For SH-4, we have our own implementation for ptep_get_and_clear
- */
-extern pte_t ptep_get_and_clear(pte_t *ptep);
-#else
-static inline pte_t ptep_get_and_clear(pte_t *ptep)
-{
-        pte_t pte = *ptep;
-        pte_clear(ptep);
-        return pte;
-}
-#endif
-
-/*
- * Following functions are same as generic ones.
- */
-static inline int ptep_test_and_clear_young(pte_t *ptep)
-{
-        pte_t pte = *ptep;
-        if (!pte_young(pte))
-                return 0;
-        set_pte(ptep, pte_mkold(pte));
-        return 1;
-}
-
-static inline int ptep_test_and_clear_dirty(pte_t *ptep)
-{
-        pte_t pte = *ptep;
-        if (!pte_dirty(pte))
-                return 0;
-        set_pte(ptep, pte_mkclean(pte));
-        return 1;
-}
-
-static inline void ptep_set_wrprotect(pte_t *ptep)
-{
-        pte_t old_pte = *ptep;
-        set_pte(ptep, pte_wrprotect(old_pte));
-}
-
-static inline void ptep_mkdirty(pte_t *ptep)
-{
-        pte_t old_pte = *ptep;
-        set_pte(ptep, pte_mkdirty(old_pte));
-}
 #define pte_same(A,B) (pte_val(A) == pte_val(B))
--
From: SUGIOKA T. <su...@it...> - 2001-08-07 11:26:57
At 19:59 01/08/07 +0900, NIIBE Yutaka <gn...@m1...> wrote:
>SUGIOKA Toshinobu wrote:
> > Yes. it works with your patch. but performance seems went bad.
> > I think unneeded cache flush occur on simple TLB miss exception.
>
>I see. How about this one.
>
> * include/asm-sh/pgtable.h (PG_mapped): Renamed from
> PG_mapped_with_alias.
> (__flush_cache_page): Removed last argument, and add first arg.
> * arch/sh/mm/cache-sh4.c (__flush_cache_page): Take u0 address
> as first argument. Don't care about I-cache.
> (flush_dcache_page): Follow the change.
>
> * include/asm-sh/ide.h (ide_insw): Removed.
> * drivers/cdrom/gdrom.c (gdrom_intr): Remove __flush_wback_region.
>
> * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when first
> mapped, even if it has no alias. (We needed this to for NFS).

Yes. This works good, nice performance.

----
SUGIOKA Toshinobu
From: NIIBE Y. <gn...@m1...> - 2001-08-07 10:59:59
SUGIOKA Toshinobu wrote:
 > Yes. it works with your patch. but performance seems went bad.
 > I think unneeded cache flush occur on simple TLB miss exception.

I see. How about this one.

        * include/asm-sh/pgtable.h (PG_mapped): Renamed from
        PG_mapped_with_alias.
        (__flush_cache_page): Removed last argument, and add first arg.
        * arch/sh/mm/cache-sh4.c (__flush_cache_page): Take u0 address
        as first argument. Don't care about I-cache.
        (flush_dcache_page): Follow the change.

        * include/asm-sh/ide.h (ide_insw): Removed.
        * drivers/cdrom/gdrom.c (gdrom_intr): Remove __flush_wback_region.

        * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when first
        mapped, even if it has no alias. (We needed this to for NFS).

Index: arch/sh/mm/cache-sh4.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v
retrieving revision 1.11
diff -u -r1.11 cache-sh4.c
--- arch/sh/mm/cache-sh4.c 2001/08/07 05:20:02 1.11
+++ arch/sh/mm/cache-sh4.c 2001/08/07 10:56:42
@@ -196,9 +196,8 @@
 /*
  * Writeback&Invalidate the D-cache of the page
- * Invalidate the I-cache of the page, if needed
  */
-void __flush_cache_page(unsigned long phys, int exec)
+void __flush_cache_page(unsigned long u0, unsigned long phys)
 {
         unsigned long addr, data;
         unsigned long flags;
@@ -207,26 +206,53 @@
         save_and_cli(flags);
         jump_to_P2();
-        /* Loop all the D-cache */
-        for (addr = CACHE_OC_ADDRESS_ARRAY;
-             addr < (CACHE_OC_ADDRESS_ARRAY
-                     +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT));
-             addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
-                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
-                if (data == phys)
-                        ctrl_outl(0, addr);
-        }
-        if (exec)
-                /* Loop all the I-cache */
-                for (addr = CACHE_IC_ADDRESS_ARRAY;
-                     addr < (CACHE_IC_ADDRESS_ARRAY
-                             +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT));
-                     addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
+        if (u0) {
+                if ((u0^phys) & CACHE_ALIAS) {
+                        /* Loop 4K of the D-cache */
+                        for (addr = CACHE_OC_ADDRESS_ARRAY | (u0 & CACHE_ALIAS);
+                             addr < (CACHE_OC_ADDRESS_ARRAY + (u0 & CACHE_ALIAS)
+                                     +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT));
+                             addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
+                                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
+                                if (data == phys)
+                                        ctrl_outl(0, addr);
+                        }
+                }
+                /* Loop another 4K of the D-cache */
+                for (addr = CACHE_OC_ADDRESS_ARRAY | (phys & CACHE_ALIAS);
+                     addr < (CACHE_OC_ADDRESS_ARRAY + (phys & CACHE_ALIAS)
+                             +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT));
+                     addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
+                        data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
+                        if (data == phys)
+                                ctrl_outl(0, addr);
+                }
+        } else {
+                /* Loop all the D-cache */
+                for (addr = CACHE_OC_ADDRESS_ARRAY;
+                     addr < (CACHE_OC_ADDRESS_ARRAY
+                             +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT));
+                     addr += (1<<CACHE_OC_ENTRY_SHIFT)) {
                         data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
                         if (data == phys)
                                 ctrl_outl(0, addr);
                 }
+        }
+
+#if 0 /* DEBUG DEBUG */
+        /* Loop all the I-cache */
+        for (addr = CACHE_IC_ADDRESS_ARRAY;
+             addr < (CACHE_IC_ADDRESS_ARRAY
+                     +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT));
+             addr += (1<<CACHE_IC_ENTRY_SHIFT)) {
+                data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID);
+                if (data == phys) {
+                        printk(KERN_INFO "__flush_cache_page: I-cache entry found\n");
+                        ctrl_outl(0, addr);
+                }
+        }
+#endif
         back_to_P1();
         restore_flags(flags);
 }
@@ -268,13 +294,13 @@
 }
 /*
- * Write back & invalidate the I/D-cache of the page.
+ * Write back & invalidate the D-cache of the page.
  * (To avoid "alias" issues)
  */
 void flush_dcache_page(struct page *page)
 {
         if (test_bit(PG_mapped_with_alias, &page->flags))
-                __flush_cache_page(PHYSADDR(page_address(page)), 1);
+                __flush_cache_page(0, PHYSADDR(page_address(page)));
 }
 void flush_cache_all(void)
Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.45
diff -u -r1.45 fault.c
--- arch/sh/mm/fault.c 2001/08/07 05:20:02 1.45
+++ arch/sh/mm/fault.c 2001/08/07 10:56:42
@@ -290,14 +290,12 @@
                 return;
 #if defined(__SH4__)
-        if ((address ^ pte_val(pte)) & CACHE_ALIAS) {
-                page = pte_page(pte);
-                if (VALID_PAGE(page) &&
-                    !test_bit(PG_mapped_with_alias, &page->flags)) {
-                        unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
-                        __flush_cache_page(phys, 1);
-                        set_bit(PG_mapped_with_alias, &page->flags);
-                }
+        page = pte_page(pte);
+        if (VALID_PAGE(page) &&
+            !test_bit(PG_mapped_with_alias, &page->flags)) {
+                unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
+                __flush_cache_page(address, phys);
+                set_bit(PG_mapped_with_alias, &page->flags);
         }
 #endif
Index: drivers/cdrom/gdrom.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/drivers/cdrom/gdrom.c,v
retrieving revision 1.4
diff -u -r1.4 gdrom.c
--- drivers/cdrom/gdrom.c 2001/08/03 23:50:59 1.4
+++ drivers/cdrom/gdrom.c 2001/08/07 10:56:43
@@ -144,7 +144,6 @@
         }
         insw(GDROM_DATA, ctrl->buf, count/2);
-        __flush_wback_region(ctrl->buf, count);
         ctrl->buf += count;
         ctrl->size -= count;
 }
Index: include/asm-sh/ide.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/ide.h,v
retrieving revision 1.16
diff -u -r1.16 ide.h
--- include/asm-sh/ide.h 2001/08/03 11:22:06 1.16
+++ include/asm-sh/ide.h 2001/08/07 10:56:45
@@ -17,21 +17,6 @@
 #include <linux/config.h>
 #include <asm/machvec.h>
-#if defined(__SH4__)
-#undef insw
-#define insw(port, buf, nr) ide_insw((port), (buf), (nr))
-
-static __inline__ void ide_insw(unsigned long port,
-                                void *dst,
-                                unsigned long count)
-{
-        extern void _insw (unsigned long port, void *dst, unsigned long count);
-
-        _insw(port, dst, count);
-        __flush_wback_region(dst, (count << 1));
-}
-#endif
-
 #ifndef MAX_HWIFS
 /* Should never have less than 2, ide-pci.c(ide_match_hwif) requires it */
 #define MAX_HWIFS 2
Index: include/asm-sh/pgtable.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v
retrieving revision 1.41
diff -u -r1.41 pgtable.h
--- include/asm-sh/pgtable.h 2001/08/07 05:20:03 1.41
+++ include/asm-sh/pgtable.h 2001/08/07 10:56:45
@@ -70,7 +70,7 @@
 extern void __flush_purge_region(void *start, int size);
 /* Flush a page */
-extern void __flush_cache_page(unsigned long phys, int exec);
+extern void __flush_cache_page(unsigned long u0, unsigned long phys);
 extern void __flush_icache_page(unsigned long u0, unsigned long phys);
 /* Initialization of P3 area for copy_user_page */
--
From: NIIBE Y. <gn...@m1...> - 2001-08-07 10:46:27
This is another approach. But this is not so good because we touch
generic part in architecture specific way.

Index: fs/nfs/read.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/fs/nfs/read.c,v
retrieving revision 1.9
diff -u -r1.9 read.c
--- fs/nfs/read.c 2001/04/09 05:32:46 1.9
+++ fs/nfs/read.c 2001/08/07 10:43:04
@@ -448,7 +448,7 @@
                         count -= PAGE_CACHE_SIZE;
                 } else
                         SetPageError(page);
-                flush_dcache_page(page);
+                __flush_wback_region(page_address(page), PAGE_SIZE);
                 kunmap(page);
                 UnlockPage(page);
--
From: SUGIOKA T. <su...@it...> - 2001-08-07 10:25:54
At 19:04 01/08/07 +0900, SUGIOKA Toshinobu <su...@it...> wrote:
>Following patch improves 'exec' system call performance much on my NFS
>environment.
>
> * arch/sh/mm/fault.c (__do_page_fault): Don't call update_mmu_cache().

Sorry, This does not work.

----
SUGIOKA Toshinobu
From: SUGIOKA T. <su...@it...> - 2001-08-07 10:10:07
At 17:32 01/08/07 +0900, NIIBE Yutaka <gn...@m1...> wrote:
>Here's the patch (for me, this is work around). Please try. It works
>for my environment.
>
> * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when first
> mapped, even if it has no alias. We need this to use NFS.
>

Yes. it works with your patch. but performance seems went bad.
I think unneeded cache flush occur on simple TLB miss exception.

Following patch improves 'exec' system call performance much on my NFS
environment.

        * arch/sh/mm/fault.c (__do_page_fault): Don't call update_mmu_cache().

Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.45
diff -u -r1.45 fault.c
--- arch/sh/mm/fault.c 2001/08/07 05:20:02 1.45
+++ arch/sh/mm/fault.c 2001/08/07 09:42:03
@@ -237,6 +237,7 @@
         pmd_t *pmd;
         pte_t *pte;
         pte_t entry;
+        unsigned long pteval;
         if (address >= P3SEG && address < P4SEG)
                 dir = pgd_offset_k(address);
@@ -270,7 +271,24 @@
         __flush_tlb_page(get_asid(), address&PAGE_MASK);
 #endif
         set_pte(pte, entry);
-        update_mmu_cache(NULL, address, entry);
+
+        /* Set PTEH register */
+        ctrl_outl((address & MMU_VPN_MASK) | get_asid(), MMU_PTEH);
+
+        pteval = pte_val(entry);
+#if defined(__SH4__)
+        /* Set PTEA register */
+        /* TODO: make this look less hacky */
+        ctrl_outl(((pteval >> 28) & 0xe) | (pteval & 0x1), MMU_PTEA);
+#endif
+
+        /* Set PTEL register */
+        pteval &= _PAGE_FLAGS_HARDWARE_MASK; /* drop software flags */
+        /* conveniently, we want all the software flags to be 0 anyway */
+        ctrl_outl(pteval, MMU_PTEL);
+
+        /* Load the TLB */
+        asm volatile("ldtlb": /* no output */ : /* no input */ : "memory");
         return 0;
 }

----
SUGIOKA Toshinobu
From: NIIBE Y. <gn...@m1...> - 2001-08-07 08:32:49
SUGIOKA Toshinobu wrote:
 > It seems that current cvs kernel have some cache handling problem.
 > When it boots with NFS root file system, user process does not run at all.
 >
 > If I remove checking of PG_mapped_with_alias in flush_dcache_page(),
 > it works.

Thanks for your report. Yup, I've found an issue for NFS. It's not
that cache alias issue, but an I-cache/D-cache coherency issue.

We write back the data in the GD-ROM or IDE driver. However, NFS
doesn't do that. This causes a problem for an architecture which has
a write-back cache and the I-cache/D-cache coherency issue.

Here's the patch (for me, this is a workaround). Please try. It works
for my environment.

        * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when first
        mapped, even if it has no alias. We need this to use NFS.

Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.45
diff -u -r1.45 fault.c
--- arch/sh/mm/fault.c 2001/08/07 05:20:02 1.45
+++ arch/sh/mm/fault.c 2001/08/07 08:25:41
@@ -290,14 +290,14 @@
                 return;
 #if defined(__SH4__)
-        if ((address ^ pte_val(pte)) & CACHE_ALIAS) {
-                page = pte_page(pte);
-                if (VALID_PAGE(page) &&
-                    !test_bit(PG_mapped_with_alias, &page->flags)) {
-                        unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
-                        __flush_cache_page(phys, 1);
+        page = pte_page(pte);
+        if (VALID_PAGE(page) &&
+            !test_bit(PG_mapped_with_alias, &page->flags)) {
+                unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
+
+                __flush_cache_page(phys, 1);
+                if ((address ^ pte_val(pte)) & CACHE_ALIAS)
                         set_bit(PG_mapped_with_alias, &page->flags);
-                }
         }
 #endif
--
From: SUGIOKA T. <su...@it...> - 2001-08-07 07:54:15
Hi.

It seems that current cvs kernel have some cache handling problem.
When it boots with NFS root file system, user process does not run at all.

If I remove checking of PG_mapped_with_alias in flush_dcache_page(),
it works.

----
SUGIOKA Toshinobu
From: NIIBE Y. <gn...@m1...> - 2001-08-07 02:53:36
M. R. Brown wrote:
 > Hmm, if this fixes the problem then OK, but IIRC the other problem was that
 > the default 8139C DMA buffer was too large for the DC's 32K of BBA PCI DMA,
 > and it needed an #ifdef to select a smaller size. I'll try to dig up my
 > original patches and post them.

Thanks. The patch does fix _a_ problem, but it still does not work.
Yes, DC only has 32KB, and I think that we handle it correctly.
(Though, pci_free_consistent has not been implemented.)

 > I probably have the tables wrong for VGA using 640x480. The current pvr2
 > driver is in dire need of an overhaul, it's one of the things on my "clean-up
 > list". Does it at least start up in 640x480 in NTSC mode?

Well, it took time to get TV. :-) Yes, it works.

--------------------
Console: switching to colour frame buffer device 80x30
fb0: NEC PowerVR2 frame buffer device, using 600k/8192k of video memory
fb0: Mode 640x480-16 pitch = 1280 cable: COMPOSITE video output: NTSC
--------------------

When I connect VGA adaptor, it goes like this.

--------------------
Console: switching to colour frame buffer device 80x30
fb0: NEC PowerVR2 frame buffer device, using 300k/8192k of video memory
fb0: Mode 640x240-16 pitch = 1280 cable: VGA video output: VGA
--------------------
--
From: M. R. B. <mr...@0x...> - 2001-08-07 01:47:17
* NIIBE Yutaka <gn...@m1...> on Tue, Aug 07, 2001:

> Currently, Dreamcast BBA doesn't work. I'm sorry that I think it is
> broken when I've included the changes from mainline (in 2.4.6 and in
> 2.4.7). At least, it seems the patch attached is needed to detect
> 8138C.
>

Hmm, if this fixes the problem then OK, but IIRC the other problem was that
the default 8139C DMA buffer was too large for the DC's 32K of BBA PCI DMA,
and it needed an #ifdef to select a smaller size. I'll try to dig up my
original patches and post them.

> Also, PowerVR2 frame buffer starts 640x240 mode. I'm using VGA
> adaptor. Is there any reason why it doesn't start 640x480? What's
> wrong?
>

I probably have the tables wrong for VGA using 640x480. The current pvr2
driver is in dire need of an overhaul, it's one of the things on my "clean-up
list". Does it at least start up in 640x480 in NTSC mode?

M. R.

> Index: drivers/net/8139too.c
> ===================================================================
> RCS file: /cvsroot/linuxsh/kernel/drivers/net/8139too.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 8139too.c
> --- drivers/net/8139too.c 2001/07/23 00:39:39 1.21
> +++ drivers/net/8139too.c 2001/08/07 00:13:45
> @@ -1406,6 +1406,7 @@ static void rtl8139_hw_start (struct net
>
> tp->rx_config = rtl8139_rx_config | AcceptBroadcast | AcceptMyPhys;
> RTL_W32 (RxConfig, tp->rx_config);
> + tp->rx_config = 0;
>
> /* Check this value: the documentation for IFG contradicts ifself. */
> RTL_W32 (TxConfig, (TX_DMA_BURST << TxDMAShift));
> --
>
> _______________________________________________
> linuxsh-dev mailing list
> lin...@li...
> http://lists.sourceforge.net/lists/listinfo/linuxsh-dev
From: NIIBE Y. <gn...@m1...> - 2001-08-07 00:18:31
Currently, Dreamcast BBA doesn't work. I'm sorry that I think it is
broken when I've included the changes from mainline (in 2.4.6 and in
2.4.7). At least, it seems the patch attached is needed to detect
8138C.

Also, PowerVR2 frame buffer starts 640x240 mode. I'm using VGA
adaptor. Is there any reason why it doesn't start 640x480? What's
wrong?

Index: drivers/net/8139too.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/drivers/net/8139too.c,v
retrieving revision 1.21
diff -u -p -r1.21 8139too.c
--- drivers/net/8139too.c 2001/07/23 00:39:39 1.21
+++ drivers/net/8139too.c 2001/08/07 00:13:45
@@ -1406,6 +1406,7 @@ static void rtl8139_hw_start (struct net
         tp->rx_config = rtl8139_rx_config | AcceptBroadcast | AcceptMyPhys;
         RTL_W32 (RxConfig, tp->rx_config);
+        tp->rx_config = 0;
         /* Check this value: the documentation for IFG contradicts ifself. */
         RTL_W32 (TxConfig, (TX_DMA_BURST << TxDMAShift));
--
From: NIIBE Y. <gn...@m1...> - 2001-08-06 09:21:17
OK, with this patch, I've finally got a reliable implementation. Some
clean-up is still needed, but I think that all the bugs are gone.

I've changed the logic to defer the actual flushing. I've introduced
the PG_mapped_with_alias flag. This flag will be set at
update_mmu_cache, if it maps with the address of alias. We can do
this, because all the vma areas are aligned to 16KB. The flag will be
cleared when the PTE gets cleared.

        * include/asm-sh/pgtable.h (PG_mapped_with_alias): New macro.
        (PG_dcache_dirty): Deleted.
        (ptep_get_and_clear, ptep_test_and_clear_young,
        ptep_test_and_clear_dirty, ptep_set_wrprotect, ptep_mkdirty,
        pte_same): Define here (was: included by <asm-generic/pgtable.h>).
        * arch/sh/mm/cache-sh4.c (flush_dcache_page): New implementation.
        Check if it's mapped or not.
        (ptep_get_and_clear): New function (was: generic implementation).
        * arch/sh/mm/fault.c (update_mmu_cache): Flush the cache when it's
        mapped at first, and mark the page as it's mapped.
        Bug fix: check the page is VALID or not.

        * arch/sh/mm/cache-sh4.c (CACHE_ALIAS): Moved to ...
        * include/asm-sh/pgtable.h (CACHE_ALIAS): ... here.

Index: arch/sh/mm/cache-sh4.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v
retrieving revision 1.10
diff -u -p -r1.10 cache-sh4.c
--- arch/sh/mm/cache-sh4.c 2001/08/06 02:19:56 1.10
+++ arch/sh/mm/cache-sh4.c 2001/08/06 09:05:45
@@ -41,9 +41,6 @@
 #define CACHE_IC_NUM_ENTRIES 256
 #define CACHE_OC_NUM_ENTRIES 512
-/* Page is 4K, OC size is 16K, there are four lines. */
-#define CACHE_ALIAS 0x00003000
-
 static void __init
 detect_cpu_and_cache_system(void)
 {
@@ -206,7 +203,7 @@ void __flush_cache_page(unsigned long ph
         unsigned long addr, data;
         unsigned long flags;
-        phys|=CACHE_VALID;
+        phys |= CACHE_VALID;
         save_and_cli(flags);
         jump_to_P2();
@@ -274,19 +271,10 @@ void __flush_icache_page(unsigned long u
  * Write back & invalidate the I/D-cache of the page.
  * (To avoid "alias" issues)
  */
-void flush_dcache_page(struct page *pg)
+void flush_dcache_page(struct page *page)
 {
-        if (!pg->mapping
-            || pg->mapping->i_mmap || pg->mapping->i_mmap_shared) {
-                unsigned long phys;
-
-                phys = PHYSADDR(page_address(pg));
-                if (pg->mapping)
-                        __flush_cache_page(phys, 1);
-                else
-                        __flush_cache_page(phys, 0);
-        } else
-                set_bit(PG_dcache_dirty, &pg->flags);
+        if (test_bit(PG_mapped_with_alias, &page->flags))
+                __flush_cache_page(PHYSADDR(page_address(page)), 1);
 }
 void flush_cache_all(void)
@@ -535,4 +523,18 @@ void copy_user_page(void *to, void *from
                 pte_clear(pte);
                 up(&p3map_sem[(address & CACHE_ALIAS)>>12]);
         }
+}
+
+pte_t ptep_get_and_clear(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+
+        if (!pte_not_present(pte)) {
+                struct page *page = pte_page(pte);
+                if (VALID_PAGE(page)&&
+                    (!page->mapping || !(page->mapping->i_mmap_shared)))
+                        __clear_bit(PG_mapped_with_alias, &page->flags);
+        }
+        pte_clear(ptep);
+        return pte;
 }
Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.44
diff -u -p -r1.44 fault.c
--- arch/sh/mm/fault.c 2001/07/18 04:27:38 1.44
+++ arch/sh/mm/fault.c 2001/08/06 09:05:45
@@ -290,15 +290,15 @@ void update_mmu_cache(struct vm_area_str
                 return;
 #if defined(__SH4__)
-        page = pte_page(pte);
-
-        if (__test_and_clear_bit(PG_dcache_dirty, &page->flags))
-                if (page->mapping) {
-                        unsigned long phys;
-                        /* Physical address of this page */
-                        phys = PHYSADDR(pte_val(pte)&PAGE_MASK);
+        if ((address ^ pte_val(pte)) & CACHE_ALIAS) {
+                page = pte_page(pte);
+                if (VALID_PAGE(page) &&
+                    !test_bit(PG_mapped_with_alias, &page->flags)) {
+                        unsigned long phys = pte_val(pte) & PTE_PHYS_MASK;
                         __flush_cache_page(phys, 1);
+                        set_bit(PG_mapped_with_alias, &page->flags);
                 }
+        }
 #endif
         save_and_cli(flags);
Index: include/asm-sh/pgtable.h
===================================================================
RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v
retrieving revision 1.40
diff -u -p -r1.40 pgtable.h
--- include/asm-sh/pgtable.h 2001/08/05 01:54:18 1.40
+++ include/asm-sh/pgtable.h 2001/08/06 09:05:47
@@ -75,6 +75,11 @@ extern void __flush_icache_page(unsigned
 /* Initialization of P3 area for copy_user_page */
 extern void p3_cache_init(void);
+
+#define PG_mapped_with_alias PG_arch_1
+
+/* We provide our own get_unmapped_area to avoid cache alias issue */
+#define HAVE_ARCH_UNMAPPED_AREA
 #endif
 /*
@@ -290,9 +295,60 @@ extern void update_mmu_cache(struct vm_a
  *
  * We just can use generic implementation, as SuperH has no SMP feature.
  * (We needed atomic implementation for SMP)
+ *
  */
-#include <asm-generic/pgtable.h>
+#if defined(__SH4__)
+/* Page is 4K, OC size is 16K, there are four lines. */
+#define CACHE_ALIAS 0x00003000
+/*
+ * For SH-4, we have our own implementation for ptep_get_and_clear
+ */
+extern pte_t ptep_get_and_clear(pte_t *ptep);
+#else
+static inline pte_t ptep_get_and_clear(pte_t *ptep)
+{
+        pte_t pte = *ptep;
+        pte_clear(ptep);
+        return pte;
+}
+#endif
+
+/*
+ * Following functions are same as generic ones.
+ */ +static inline int ptep_test_and_clear_young(pte_t *ptep) +{ + pte_t pte = *ptep; + if (!pte_young(pte)) + return 0; + set_pte(ptep, pte_mkold(pte)); + return 1; +} + +static inline int ptep_test_and_clear_dirty(pte_t *ptep) +{ + pte_t pte = *ptep; + if (!pte_dirty(pte)) + return 0; + set_pte(ptep, pte_mkclean(pte)); + return 1; +} + +static inline void ptep_set_wrprotect(pte_t *ptep) +{ + pte_t old_pte = *ptep; + set_pte(ptep, pte_wrprotect(old_pte)); +} + +static inline void ptep_mkdirty(pte_t *ptep) +{ + pte_t old_pte = *ptep; + set_pte(ptep, pte_mkdirty(old_pte)); +} + +#define pte_same(A,B) (pte_val(A) == pte_val(B)) + #endif /* !__ASSEMBLY__ */ /* Needs to be defined here and not in linux/mm.h, as it is arch dependent */ @@ -300,12 +356,5 @@ extern void update_mmu_cache(struct vm_a #define kern_addr_valid(addr) (1) #define io_remap_page_range remap_page_range - -#if defined(__SH4__) -#define PG_dcache_dirty PG_arch_1 - -/* We provide our own get_unmapped_area to avoid cache alias issue */ -#define HAVE_ARCH_UNMAPPED_AREA -#endif #endif /* __ASM_SH_PAGE_H */ -- |
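The alias test used in update_mmu_cache above, `(address ^ pte_val(pte)) & CACHE_ALIAS`, can be tried outside the kernel. The following is only a minimal user-space sketch: `has_cache_alias` and `alias_self_check` are hypothetical names for illustration, and the addresses are made-up sample values, not taken from a real system. The mask value is the one from the patch (4K pages in a 16K operand cache, so bits 12-13 select one of four slices).

```c
#include <assert.h>

/* Same value as in the patch: bits 12-13 pick one of the four 4K
 * slices of the 16K operand cache. */
#define CACHE_ALIAS 0x00003000UL

/* Hypothetical helper: nonzero when the virtual address and the
 * physical address fall into different cache slices, i.e. when the
 * mapping creates an alias. */
int has_cache_alias(unsigned long vaddr, unsigned long phys)
{
	return ((vaddr ^ phys) & CACHE_ALIAS) != 0;
}

/* Made-up sample addresses, for illustration only. */
void alias_self_check(void)
{
	/* Bits 12-13 agree (both 01): same slice, no alias. */
	assert(!has_cache_alias(0x00401000UL, 0x0c001000UL));
	/* Bits 12-13 differ (10 vs 01): different slice, alias. */
	assert(has_cache_alias(0x00402000UL, 0x0c001000UL));
}
```

Only when this test is nonzero does the patch flush the page and set PG_mapped_with_alias, so pages mapped without an alias never pay for a flush.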
From: NIIBE Y. <gn...@m1...> - 2001-08-06 06:29:26
|
When using the serial console, we sometimes see garbage data at the beginning. The cause is the udelay call in the initialization routine of sh-sci.c: the initialization is called before loops_per_jiffy is calibrated. I put a big value there. * arch/sh/kernel/setup.c (boot_cpu_data): Give loops_per_jiffy an initial value. Index: arch/sh/kernel/setup.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/kernel/setup.c,v retrieving revision 1.29 diff -u -p -r1.29 setup.c --- arch/sh/kernel/setup.c 2001/07/24 01:24:28 1.29 +++ arch/sh/kernel/setup.c 2001/08/06 06:17:25 @@ -47,7 +47,12 @@ * Machine setup.. */ -struct sh_cpuinfo boot_cpu_data = { CPU_SH_NONE, 0, }; +/* + * Initialize loops_per_jiffy as 10000000 (1000MIPS). + * This value will be used at the very early stage of serial setup. + * The bigger value means no problem. + */ +struct sh_cpuinfo boot_cpu_data = { CPU_SH_NONE, 0, 10000000, }; struct screen_info screen_info; unsigned char aux_device_present = 0xaa; -- |
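To see why an uncalibrated loops_per_jiffy is a problem, here is a simplified model of a calibrated busy-wait. This is not the actual SuperH udelay implementation; `delay_loops`, `delay_self_check` and the HZ value of 100 are assumptions made for illustration. The point is only that the loop count scales with loops_per_jiffy, so a zero value gives no delay at all, while a value that is too big merely waits longer than requested.

```c
#include <assert.h>

#define HZ 100UL	/* assumed tick rate, for illustration only */

/* Hypothetical model: number of busy-loop iterations needed to wait
 * for `usecs` microseconds, given the calibration value. */
unsigned long long delay_loops(unsigned long usecs,
			       unsigned long loops_per_jiffy)
{
	return (unsigned long long)usecs * loops_per_jiffy * HZ / 1000000ULL;
}

void delay_self_check(void)
{
	/* Before calibration (old initializer): no delay at all. */
	assert(delay_loops(10, 0) == 0);
	/* With the initial value from the patch: a real, if generous, wait. */
	assert(delay_loops(10, 10000000UL) == 10000ULL);
}
```

Erring on the large side is the safe direction here: the early serial setup only needs the delay to be at least as long as requested.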
From: NIIBE Y. <gn...@m1...> - 2001-08-06 02:18:01
|
NIIBE Yutaka wrote: > (flush_cache_page): New implementation. (Assumes PTE is valid.) This was not correct. We flush the cache of a different MM, so we can't do it directly with the OCB functions. The following is the updated version. * arch/sh/mm/clear_page.S: Use aligned address for write back. * arch/sh/mm/__copy_user_page-sh4.S: Likewise. * arch/sh/mm/copy_page.S: Likewise. * arch/sh/mm/cache-sh4.c (CACHE_IC_NUM_WAYS, CACHE_OC_NUM_WAYS): Removed. (cache_wback_all): Removed and integrated into cache_init. * arch/sh/mm/copy_page.S: Write back TO, * arch/sh/mm/clear_page.S: Write back TO. * arch/sh/mm/cache-sh3.c (cache_init): Read CCR at P2. * arch/sh/mm/cache-sh4.c (cache_init): Likewise. (__flush_cache_page): Fix bug. Call restore_flags. (flush_cache_page): New implementation. (clear_user_page, copy_user_page): Do it in assembler routines. Index: arch/sh/mm/__copy_user_page-sh4.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/__copy_user_page-sh4.S,v retrieving revision 1.2 diff -u -p -r1.2 __copy_user_page-sh4.S --- arch/sh/mm/__copy_user_page-sh4.S 2001/07/28 14:22:06 1.2 +++ arch/sh/mm/__copy_user_page-sh4.S 2001/08/06 02:13:13 @@ -46,6 +46,7 @@ ENTRY(__copy_user_page) mov.l @r11+,r6 mov.l @r11+,r7 movca.l r0,@r10 + mov r10,r0 add #32,r10 mov.l r7,@-r10 mov.l r6,@-r10 @@ -54,11 +55,10 @@ ENTRY(__copy_user_page) mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 - mov r10,r0 - add #28,r10 + ocbwb @r0 cmp/eq r11,r8 bf/s 1b - ocbwb @r0 + add #28,r10 ! 
mov.l @r15+,r11 mov.l @r15+,r10 Index: arch/sh/mm/cache-sh3.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh3.c,v retrieving revision 1.1 diff -u -p -r1.1 cache-sh3.c --- arch/sh/mm/cache-sh3.c 2001/07/23 09:02:17 1.1 +++ arch/sh/mm/cache-sh3.c 2001/08/06 02:13:13 @@ -131,8 +131,8 @@ void __init cache_init(void) detect_cpu_and_cache_system(); - ccr = ctrl_inl(CCR); jump_to_P2(); + ccr = ctrl_inl(CCR); if (ccr & CCR_CACHE_ENABLE) /* * XXX: Should check RA here. Index: arch/sh/mm/cache-sh4.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v retrieving revision 1.9 diff -u -p -r1.9 cache-sh4.c --- arch/sh/mm/cache-sh4.c 2001/08/04 00:04:48 1.9 +++ arch/sh/mm/cache-sh4.c 2001/08/06 02:13:13 @@ -40,33 +40,9 @@ #define CACHE_IC_ENTRY_MASK 0x1fe0 #define CACHE_IC_NUM_ENTRIES 256 #define CACHE_OC_NUM_ENTRIES 512 -#define CACHE_OC_NUM_WAYS 1 -#define CACHE_IC_NUM_WAYS 1 -/* - * Write back all the cache. - * - * For SH-4, we only need to flush (write back) Operand Cache, - * as Instruction Cache doesn't have "updated" data. - * - * Assumes that this is called in interrupt disabled context, and P2. - * Shuld be INLINE function. - */ -static inline void cache_wback_all(void) -{ - unsigned long addr, data, i, j; - - for (i=0; i<CACHE_OC_NUM_ENTRIES; i++) { - for (j=0; j<CACHE_OC_NUM_WAYS; j++) { - addr = CACHE_OC_ADDRESS_ARRAY|(j<<CACHE_OC_WAY_SHIFT)| - (i<<CACHE_OC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & (CACHE_UPDATED|CACHE_VALID)) - == (CACHE_UPDATED|CACHE_VALID)) - ctrl_outl(data & ~CACHE_UPDATED, addr); - } - } -} +/* Page is 4K, OC size is 16K, there are four lines. 
*/ +#define CACHE_ALIAS 0x00003000 static void __init detect_cpu_and_cache_system(void) @@ -86,14 +62,25 @@ void __init cache_init(void) detect_cpu_and_cache_system(); - ccr = ctrl_inl(CCR); jump_to_P2(); - if (ccr & CCR_CACHE_ENABLE) + ccr = ctrl_inl(CCR); + if (ccr & CCR_CACHE_ENABLE) { /* * XXX: Should check RA here. * If RA was 1, we only need to flush the half of the caches. */ - cache_wback_all(); + unsigned long addr, data; + + for (addr = CACHE_OC_ADDRESS_ARRAY; + addr < (CACHE_OC_ADDRESS_ARRAY+ + (CACHE_OC_NUM_ENTRIES << CACHE_OC_ENTRY_SHIFT)); + addr += (1 << CACHE_OC_ENTRY_SHIFT)) { + data = ctrl_inl(addr); + if ((data & (CACHE_UPDATED|CACHE_VALID)) + == (CACHE_UPDATED|CACHE_VALID)) + ctrl_outl(data & ~CACHE_UPDATED, addr); + } + } ctrl_outl(CCR_CACHE_INIT, CCR); back_to_P1(); @@ -130,7 +117,7 @@ void __flush_wback_region(void *start, i unsigned long v; unsigned long begin, end; - begin = (unsigned long)start& ~(L1_CACHE_BYTES-1); + begin = (unsigned long)start & ~(L1_CACHE_BYTES-1); end = ((unsigned long)start + size + L1_CACHE_BYTES-1) & ~(L1_CACHE_BYTES-1); for (v = begin; v < end; v+=L1_CACHE_BYTES) { @@ -150,7 +137,7 @@ void __flush_purge_region(void *start, i unsigned long v; unsigned long begin, end; - begin = (unsigned long)start& ~(L1_CACHE_BYTES-1); + begin = (unsigned long)start & ~(L1_CACHE_BYTES-1); end = ((unsigned long)start + size + L1_CACHE_BYTES-1) & ~(L1_CACHE_BYTES-1); for (v = begin; v < end; v+=L1_CACHE_BYTES) { @@ -169,7 +156,7 @@ void __flush_invalidate_region(void *sta unsigned long v; unsigned long begin, end; - begin = (unsigned long)start& ~(L1_CACHE_BYTES-1); + begin = (unsigned long)start & ~(L1_CACHE_BYTES-1); end = ((unsigned long)start + size + L1_CACHE_BYTES-1) & ~(L1_CACHE_BYTES-1); for (v = begin; v < end; v+=L1_CACHE_BYTES) { @@ -244,7 +231,7 @@ void __flush_cache_page(unsigned long ph ctrl_outl(0, addr); } back_to_P1(); - save_and_cli(flags); + restore_flags(flags); } void __flush_icache_page(unsigned long u0, 
unsigned long phys) @@ -357,26 +344,64 @@ void flush_cache_range(struct mm_struct * * ADDR: Virtual Address (U0 address) */ -void flush_cache_page(struct vm_area_struct *vma, unsigned long addr) +void flush_cache_page(struct vm_area_struct *vma, unsigned long address) { pgd_t *dir; pmd_t *pmd; pte_t *pte; pte_t entry; - unsigned long phys; + unsigned long phys, addr, addr_end, data; + unsigned long flags; - dir = pgd_offset(vma->vm_mm, addr); - pmd = pmd_offset(dir, addr); - if (pmd_none(*pmd)) - return; - if (pmd_bad(*pmd)) + dir = pgd_offset(vma->vm_mm, address); + pmd = pmd_offset(dir, address); + if (pmd_none(*pmd) || pmd_bad(*pmd)) return; - pte = pte_offset(pmd, addr); + pte = pte_offset(pmd, address); entry = *pte; if (pte_none(entry) || !pte_present(entry)) return; - phys = PHYSADDR(pte_val(entry)&PAGE_MASK); - __flush_cache_page(phys, (vma->vm_flags & VM_EXEC)); + + phys = pte_val(entry)&PTE_PHYS_MASK; + + phys |= CACHE_VALID; + save_and_cli(flags); + jump_to_P2(); + + /* We only need to flush D-cache when we have alias */ + if ((address^phys) & CACHE_ALIAS) { + /* Loop 4K of the D-cache */ + for (addr = CACHE_OC_ADDRESS_ARRAY | (address & CACHE_ALIAS); + addr < (CACHE_OC_ADDRESS_ARRAY + (address & CACHE_ALIAS) + +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT)); + addr += (1<<CACHE_OC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); + } + /* Loop another 4K of the D-cache */ + for (addr = CACHE_OC_ADDRESS_ARRAY | (phys & CACHE_ALIAS); + addr < (CACHE_OC_ADDRESS_ARRAY + (phys & CACHE_ALIAS) + +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT)); + addr += (1<<CACHE_OC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); + } + } + + if (vma->vm_flags & VM_EXEC) + /* Loop 4K of the I-cache */ + for (addr = CACHE_IC_ADDRESS_ARRAY|(address&0x1000); + addr < ((CACHE_IC_ADDRESS_ARRAY|(address&0x1000)) + +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); + 
addr += (1<<CACHE_IC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); + } + back_to_P1(); + restore_flags(flags); } /* @@ -441,9 +466,6 @@ void check_cache_page(struct page *pg) } } -/* Page is 4K, OC size is 16K, there are four lines. */ -#define CACHE_ALIAS 0x00003000 - /* * clear_user_page * @to: P1 address @@ -451,10 +473,9 @@ void check_cache_page(struct page *pg) */ void clear_user_page(void *to, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) clear_page(to); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | @@ -488,10 +509,9 @@ void clear_user_page(void *to, unsigned */ void copy_user_page(void *to, void *from, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) copy_page(to, from); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | Index: arch/sh/mm/clear_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/clear_page.S,v retrieving revision 1.2 diff -u -p -r1.2 clear_page.S --- arch/sh/mm/clear_page.S 2001/07/23 10:27:24 1.2 +++ arch/sh/mm/clear_page.S 2001/08/06 02:13:13 @@ -30,6 +30,7 @@ ENTRY(clear_page) mov.l r0,@r4 #elif defined(__SH4__) movca.l r0,@r4 + mov r4,r1 #endif add #32,r4 mov.l r0,@-r4 @@ -39,6 +40,9 @@ ENTRY(clear_page) mov.l r0,@-r4 mov.l r0,@-r4 mov.l r0,@-r4 +#if defined(__SH4__) + ocbwb @r1 +#endif cmp/eq r5,r4 bf/s 1b add #28,r4 Index: arch/sh/mm/copy_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/copy_page.S,v retrieving revision 1.3 diff 
-u -p -r1.3 copy_page.S --- arch/sh/mm/copy_page.S 2001/07/28 15:23:38 1.3 +++ arch/sh/mm/copy_page.S 2001/08/06 02:13:13 @@ -44,6 +44,7 @@ ENTRY(copy_page) mov.l r0,@r10 #elif defined(__SH4__) movca.l r0,@r10 + mov r10,r0 #endif add #32,r10 mov.l r7,@-r10 @@ -53,6 +54,9 @@ ENTRY(copy_page) mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 +#if defined(__SH4__) + ocbwb @r0 +#endif cmp/eq r11,r8 bf/s 1b add #28,r10 -- |
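The "use aligned address for write back" change saves the line-aligned destination right after movca.l and hands that to ocbwb. The same rounding appears in C in __flush_wback_region (`begin = start & ~(L1_CACHE_BYTES-1)`). Below is a small stand-alone sketch of that arithmetic. The helper names are hypothetical, and the 32-byte line size is an assumption inferred from the `add #32` stride in the assembler loops.

```c
#include <assert.h>

#define L1_CACHE_BYTES 32UL	/* assumed line size, see note above */

/* Round an address down to the start of its cache line. */
unsigned long line_begin(unsigned long start)
{
	return start & ~(L1_CACHE_BYTES - 1);
}

/* Round the end of [start, start+size) up to a line boundary. */
unsigned long line_end(unsigned long start, unsigned long size)
{
	return (start + size + L1_CACHE_BYTES - 1) & ~(L1_CACHE_BYTES - 1);
}

/* Number of cache lines a write-back loop has to visit. */
unsigned long lines_touched(unsigned long start, unsigned long size)
{
	return (line_end(start, size) - line_begin(start)) / L1_CACHE_BYTES;
}

void line_self_check(void)
{
	assert(line_begin(0x1004UL) == 0x1000UL);
	assert(lines_touched(0x1004UL, 8) == 1);	/* stays in one line */
	assert(lines_touched(0x101cUL, 8) == 2);	/* straddles two lines */
}
```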
From: NIIBE Y. <gn...@m1...> - 2001-08-05 08:26:18
|
Last year, we implemented FPU use in the kernel. That is, instead of assuming that the kernel doesn't use the FPU, we enabled drivers to use it. This was done because GCC at that time used the FPU for division. In the code, we use init_task as a special task, and when the kernel uses the FPU, we pretend that the "user process init_task" uses the FPU. It's a kind of hack. The kernel has changed since then, and this doesn't work any more because of special kernel threads. So far this has been fine, as kernel threads did not do general things. However, now we have the ksoftirqd kernel threads, which handle software interrupts (or bottom halves). This means ksoftirqd calls driver routines, and if we need to use the FPU in the kernel, we need another approach. I think that we should remove FPU support from the kernel to clean things up. I think that we can assume GCC provides some way of not using the FPU for division. Honestly speaking, I couldn't find a good way to extend the current implementation to general kernel FPU support (more than one kernel task using the FPU). If we really need this, I think it is better to design a general one than to extend the current one. Comments? Opinions? Does anyone have a driver which uses the FPU? -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-05 08:11:00
|
* arch/sh/mm/cache-sh4.c (CACHE_IC_NUM_WAYS, CACHE_OC_NUM_WAYS): Removed. (cache_wback_all): Removed and integrate to cache_init. * arch/sh/mm/copy_page.S: Write back TO, * arch/sh/mm/clear_page.S: Write back TO. * arch/sh/mm/cache-sh3.c (cache_init): Read CCR at P2. * arch/sh/mm/cache-sh4.c (cache_init): Likewise. (__flush_cache_page): Fix bug. Call restore_flags. (flush_cache_page): New implementation. (Assumes PTE is valid.) (clear_user_page, copy_user_page): Do it in assembler routines. Index: arch/sh/mm/cache-sh3.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh3.c,v retrieving revision 1.1 diff -u -p -r1.1 cache-sh3.c --- arch/sh/mm/cache-sh3.c 2001/07/23 09:02:17 1.1 +++ arch/sh/mm/cache-sh3.c 2001/08/05 08:06:53 @@ -131,8 +131,8 @@ void __init cache_init(void) detect_cpu_and_cache_system(); - ccr = ctrl_inl(CCR); jump_to_P2(); + ccr = ctrl_inl(CCR); if (ccr & CCR_CACHE_ENABLE) /* * XXX: Should check RA here. Index: arch/sh/mm/cache-sh4.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v retrieving revision 1.9 diff -u -p -r1.9 cache-sh4.c --- arch/sh/mm/cache-sh4.c 2001/08/04 00:04:48 1.9 +++ arch/sh/mm/cache-sh4.c 2001/08/05 08:06:53 @@ -40,33 +40,9 @@ #define CACHE_IC_ENTRY_MASK 0x1fe0 #define CACHE_IC_NUM_ENTRIES 256 #define CACHE_OC_NUM_ENTRIES 512 -#define CACHE_OC_NUM_WAYS 1 -#define CACHE_IC_NUM_WAYS 1 -/* - * Write back all the cache. - * - * For SH-4, we only need to flush (write back) Operand Cache, - * as Instruction Cache doesn't have "updated" data. - * - * Assumes that this is called in interrupt disabled context, and P2. - * Shuld be INLINE function. 
- */ -static inline void cache_wback_all(void) -{ - unsigned long addr, data, i, j; - - for (i=0; i<CACHE_OC_NUM_ENTRIES; i++) { - for (j=0; j<CACHE_OC_NUM_WAYS; j++) { - addr = CACHE_OC_ADDRESS_ARRAY|(j<<CACHE_OC_WAY_SHIFT)| - (i<<CACHE_OC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & (CACHE_UPDATED|CACHE_VALID)) - == (CACHE_UPDATED|CACHE_VALID)) - ctrl_outl(data & ~CACHE_UPDATED, addr); - } - } -} +/* Page is 4K, OC size is 16K, there are four lines. */ +#define CACHE_ALIAS 0x00003000 static void __init detect_cpu_and_cache_system(void) @@ -86,15 +62,26 @@ void __init cache_init(void) detect_cpu_and_cache_system(); - ccr = ctrl_inl(CCR); jump_to_P2(); - if (ccr & CCR_CACHE_ENABLE) + ccr = ctrl_inl(CCR); + if (ccr & CCR_CACHE_ENABLE) { /* * XXX: Should check RA here. * If RA was 1, we only need to flush the half of the caches. */ - cache_wback_all(); + unsigned long addr, data; + for (addr = CACHE_OC_ADDRESS_ARRAY; + addr < (CACHE_OC_ADDRESS_ARRAY+ + (CACHE_OC_NUM_ENTRIES << CACHE_OC_ENTRY_SHIFT)); + addr += (1 << CACHE_OC_ENTRY_SHIFT)) { + data = ctrl_inl(addr); + if ((data & (CACHE_UPDATED|CACHE_VALID)) + == (CACHE_UPDATED|CACHE_VALID)) + ctrl_outl(data & ~CACHE_UPDATED, addr); + } + } + ctrl_outl(CCR_CACHE_INIT, CCR); back_to_P1(); } @@ -244,7 +231,7 @@ void __flush_cache_page(unsigned long ph ctrl_outl(0, addr); } back_to_P1(); - save_and_cli(flags); + restore_flags(flags); } void __flush_icache_page(unsigned long u0, unsigned long phys) @@ -357,26 +344,48 @@ void flush_cache_range(struct mm_struct * * ADDR: Virtual Address (U0 address) */ -void flush_cache_page(struct vm_area_struct *vma, unsigned long addr) +void flush_cache_page(struct vm_area_struct *vma, unsigned long address) { pgd_t *dir; pmd_t *pmd; pte_t *pte; pte_t entry; - unsigned long phys; + unsigned long phys, addr, data; + unsigned long flags; - dir = pgd_offset(vma->vm_mm, addr); - pmd = pmd_offset(dir, addr); - if (pmd_none(*pmd)) - return; - if (pmd_bad(*pmd)) + 
__flush_purge_region((void *)(address&PAGE_MASK), PAGE_SIZE); + + dir = pgd_offset(vma->vm_mm, address); + pmd = pmd_offset(dir, address); + if (pmd_none(*pmd) || pmd_bad(*pmd)) { + printk(KERN_ERR "flush_cache_page: No valid PMD"); return; - pte = pte_offset(pmd, addr); + } + pte = pte_offset(pmd, address); entry = *pte; - if (pte_none(entry) || !pte_present(entry)) + if (pte_none(entry) || !pte_present(entry)) { + printk(KERN_ERR "flush_cache_page: No valid PTE"); return; - phys = PHYSADDR(pte_val(entry)&PAGE_MASK); - __flush_cache_page(phys, (vma->vm_flags & VM_EXEC)); + } + + phys = pte_val(entry)&PTE_PHYS_MASK; + if (((address^phys) & CACHE_ALIAS) != 0) + __flush_invalidate_region((void *)P1SEGADDR(phys), PAGE_SIZE); + + phys |= CACHE_VALID; + save_and_cli(flags); + jump_to_P2(); + /* Loop 4K of the I-cache */ + for (addr = CACHE_IC_ADDRESS_ARRAY|(address&0x1000); + addr < ((CACHE_IC_ADDRESS_ARRAY|(address&0x1000)) + +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); + addr += (1<<CACHE_IC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); + } + back_to_P1(); + restore_flags(flags); } /* @@ -441,9 +450,6 @@ void check_cache_page(struct page *pg) } } -/* Page is 4K, OC size is 16K, there are four lines. 
*/ -#define CACHE_ALIAS 0x00003000 - /* * clear_user_page * @to: P1 address @@ -451,10 +457,9 @@ void check_cache_page(struct page *pg) */ void clear_user_page(void *to, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) clear_page(to); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | @@ -488,10 +493,9 @@ void clear_user_page(void *to, unsigned */ void copy_user_page(void *to, void *from, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) copy_page(to, from); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | Index: arch/sh/mm/clear_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/clear_page.S,v retrieving revision 1.2 diff -u -p -r1.2 clear_page.S --- arch/sh/mm/clear_page.S 2001/07/23 10:27:24 1.2 +++ arch/sh/mm/clear_page.S 2001/08/05 08:06:53 @@ -39,6 +39,9 @@ ENTRY(clear_page) mov.l r0,@-r4 mov.l r0,@-r4 mov.l r0,@-r4 +#if defined(__SH4__) + ocbwb @r4 +#endif cmp/eq r5,r4 bf/s 1b add #28,r4 Index: arch/sh/mm/copy_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/copy_page.S,v retrieving revision 1.3 diff -u -p -r1.3 copy_page.S --- arch/sh/mm/copy_page.S 2001/07/28 15:23:38 1.3 +++ arch/sh/mm/copy_page.S 2001/08/05 08:06:53 @@ -53,6 +53,9 @@ ENTRY(copy_page) mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 +#if defined(__SH4__) + ocbwb @r10 +#endif cmp/eq r11,r8 bf/s 1b add #28,r10 -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-05 01:56:26
|
NIIBE Yutaka wrote: > I'll commit changes from mainline. There's no SuperH related changes. Done. -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-05 01:56:08
|
NIIBE Yutaka wrote: > Here's updated version: > > * mm/vmscan.c (try_to_swap_out): Bug fix. Flush the page before > we clear the PTE. [...] > * include/asm/pgtable.h (PTE_PHYS_MASK): Defined. > (pte_page): Bug fix. Use PTE_PHYS_MASK. I only committed those parts. The others are questionable. -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-04 10:34:59
|
NIIBE Yutaka wrote: > Bug fix is for flush_cache_page. It's very long standing bug. > When it is called from vmscan.c:try_to_swapout, the PTE is cleared > before the call, so it does nothing but just return (without flushing). > To flush, we need physical address, so I added PAGE argument. I'll > ask this change to lkml. I think that for physically tagged cache, > we need this. It turned out that the bug is actually in vmscan.c:try_to_swap_out. Here's the updated version: * mm/vmscan.c (try_to_swap_out): Bug fix. Flush the page before we clear the PTE. * arch/sh/mm/cache-sh4.c (copy_user_page, clear_user_page): Don't need to write back (it's done in copy_page). (check_cache_page, __flush_cache_page, __flush_icache_page): Removed. (flush_dcache_page): Flush P1 D-cache only (we should not have a valid U0 cache entry here). (flush_cache_page): Flush only U0 (we should not have a valid P1 cache entry here). * arch/sh/mm/fault.c (update_mmu_cache): Flush P1 D-cache only (we should not have a valid U0 cache entry here). * arch/sh/mm/copy_page.S: Write back TO, purge FROM. * arch/sh/mm/clear_page.S: Write back TO. * arch/sh/mm/__copy_user_page-sh4.S: Purge FROM. * include/asm/pgtable.h (PTE_PHYS_MASK): Defined. (pte_page): Bug fix. Use PTE_PHYS_MASK. (__flush_cache_page, __flush_icache_page): Removed. * drivers/cdrom/gdrom.c (gdrom_intr): Purge the cache (was: write-back). * include/asm-sh/ide.h (ide_insw): Purge the cache (was: write-back). (ide_outsw, outsw): New functions. Use these for the IDE driver so that no data is left in the cache. 
Index: arch/sh/mm/__copy_user_page-sh4.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/__copy_user_page-sh4.S,v retrieving revision 1.2 diff -u -p -r1.2 __copy_user_page-sh4.S --- arch/sh/mm/__copy_user_page-sh4.S 2001/07/28 14:22:06 1.2 +++ arch/sh/mm/__copy_user_page-sh4.S 2001/08/04 10:28:16 @@ -44,7 +44,9 @@ ENTRY(__copy_user_page) mov.l @r11+,r4 mov.l @r11+,r5 mov.l @r11+,r6 - mov.l @r11+,r7 + mov.l @r11,r7 + ocbp @r11 + add #4,r11 movca.l r0,@r10 add #32,r10 mov.l r7,@-r10 @@ -54,11 +56,10 @@ ENTRY(__copy_user_page) mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 - mov r10,r0 - add #28,r10 + ocbwb @r10 cmp/eq r11,r8 bf/s 1b - ocbwb @r0 + add #28,r10 ! mov.l @r15+,r11 mov.l @r15+,r10 Index: arch/sh/mm/cache-sh4.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v retrieving revision 1.9 diff -u -p -r1.9 cache-sh4.c --- arch/sh/mm/cache-sh4.c 2001/08/04 00:04:48 1.9 +++ arch/sh/mm/cache-sh4.c 2001/08/04 10:28:16 @@ -211,95 +211,16 @@ void flush_cache_sigtramp(unsigned long } /* - * Writeback&Invalidate the D-cache of the page - * Invalidate the I-cache of the page, if needed - */ -void __flush_cache_page(unsigned long phys, int exec) -{ - unsigned long addr, data; - unsigned long flags; - - phys|=CACHE_VALID; - - save_and_cli(flags); - jump_to_P2(); - /* Loop all the D-cache */ - for (addr = CACHE_OC_ADDRESS_ARRAY; - addr < (CACHE_OC_ADDRESS_ARRAY - +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT)); - addr += (1<<CACHE_OC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } - - if (exec) - /* Loop all the I-cache */ - for (addr = CACHE_IC_ADDRESS_ARRAY; - addr < (CACHE_IC_ADDRESS_ARRAY - +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT)); - addr += (1<<CACHE_IC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } 
- back_to_P1(); - save_and_cli(flags); -} - -void __flush_icache_page(unsigned long u0, unsigned long phys) -{ - unsigned long addr, data; - unsigned long flags; - - phys|=CACHE_VALID; - - save_and_cli(flags); - if (u0) { - jump_to_P2(); - /* Loop half of the I-cache */ - for (addr = CACHE_IC_ADDRESS_ARRAY|(u0&0x1000); - addr < ((CACHE_IC_ADDRESS_ARRAY|(u0&0x1000)) - +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); - addr += (1<<CACHE_IC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } - back_to_P1(); - } else { - jump_to_P2(); - /* Loop all the I-cache */ - for (addr = CACHE_IC_ADDRESS_ARRAY; - addr < (CACHE_IC_ADDRESS_ARRAY - +(CACHE_IC_NUM_ENTRIES << CACHE_IC_ENTRY_SHIFT)); - addr += (1<<CACHE_IC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } - back_to_P1(); - } - restore_flags(flags); -} - -/* * Write back & invalidate the I/D-cache of the page. * (To avoid "alias" issues) */ -void flush_dcache_page(struct page *pg) +void flush_dcache_page(struct page *page) { - if (!pg->mapping - || pg->mapping->i_mmap || pg->mapping->i_mmap_shared) { - unsigned long phys; - - phys = PHYSADDR(page_address(pg)); - if (pg->mapping) - __flush_cache_page(phys, 1); - else - __flush_cache_page(phys, 0); - } else - set_bit(PG_dcache_dirty, &pg->flags); + if (!page->mapping + || page->mapping->i_mmap || page->mapping->i_mmap_shared) + __flush_purge_region(page_address(page), PAGE_SIZE); + else + set_bit(PG_dcache_dirty, &page->flags); } void flush_cache_all(void) @@ -357,88 +278,46 @@ void flush_cache_range(struct mm_struct * * ADDR: Virtual Address (U0 address) */ -void flush_cache_page(struct vm_area_struct *vma, unsigned long addr) +void flush_cache_page(struct vm_area_struct *vma, unsigned long address) { pgd_t *dir; pmd_t *pmd; pte_t *pte; pte_t entry; unsigned long phys; + unsigned long addr, data; + unsigned long flags; + + __flush_purge_region((void 
*)(address&PAGE_MASK), PAGE_SIZE); + if ((vma->vm_flags & VM_EXEC) == 0) + return; - dir = pgd_offset(vma->vm_mm, addr); - pmd = pmd_offset(dir, addr); + dir = pgd_offset(vma->vm_mm, address); + pmd = pmd_offset(dir, address); if (pmd_none(*pmd)) return; if (pmd_bad(*pmd)) return; - pte = pte_offset(pmd, addr); + pte = pte_offset(pmd, address); entry = *pte; if (pte_none(entry) || !pte_present(entry)) return; - phys = PHYSADDR(pte_val(entry)&PAGE_MASK); - __flush_cache_page(phys, (vma->vm_flags & VM_EXEC)); -} -/* - * Check entries of the I-cache & D-cache of the page. - * (To see "alias" issues) - */ -void check_cache_page(struct page *pg) -{ - unsigned long phys, addr, data, i; - unsigned long kaddr; - unsigned long cache_line_index; - int bingo = 0; - unsigned long flags; - - /* Physical address of this page */ - phys = PHYSADDR(page_address(pg)); - kaddr = phys + PAGE_OFFSET; - cache_line_index = (kaddr&CACHE_OC_ENTRY_MASK)>>CACHE_OC_ENTRY_SHIFT; + phys = (pte_val(entry) & PTE_PHYS_MASK) | CACHE_VALID; save_and_cli(flags); jump_to_P2(); - /* Loop all the D-cache */ - for (i=0; i<CACHE_OC_NUM_ENTRIES; i++) { - addr = CACHE_OC_ADDRESS_ARRAY| (i<<CACHE_OC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & (CACHE_UPDATED|CACHE_VALID)) - == (CACHE_UPDATED|CACHE_VALID) - && (data&PAGE_MASK) == phys) { - data &= ~(CACHE_VALID|CACHE_UPDATED); - ctrl_outl(data, addr); - if ((i^cache_line_index)&0x180) - bingo |= 1; - } - } - - cache_line_index &= 0xff; - /* Loop all the I-cache */ - for (i=0; i<CACHE_IC_NUM_ENTRIES; i++) { - addr = CACHE_IC_ADDRESS_ARRAY| (i<<CACHE_IC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & CACHE_VALID) && (data&PAGE_MASK) == phys) { - data &= ~CACHE_VALID; - ctrl_outl(data, addr); - if (((i^cache_line_index)&0x80)) - bingo |= 2; - } + /* Loop I-cache 4K */ + for (addr = CACHE_IC_ADDRESS_ARRAY|(address&0x1000); + addr < ((CACHE_IC_ADDRESS_ARRAY|(address&0x1000)) + +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); + addr += 
(1<<CACHE_IC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); } back_to_P1(); restore_flags(flags); - - if (bingo) { - extern void dump_stack(void); - - if (bingo&1) - printk("BINGO!\n"); -#if 0 - if (bingo&2) - printk("Bingo!\n"); -#endif - dump_stack(); - printk("--------------------\n"); - } } /* Page is 4K, OC size is 16K, there are four lines. */ @@ -451,10 +330,9 @@ void check_cache_page(struct page *pg) */ void clear_user_page(void *to, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) clear_page(to); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | @@ -488,10 +366,9 @@ void clear_user_page(void *to, unsigned */ void copy_user_page(void *to, void *from, unsigned long address) { - if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { + if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) copy_page(to, from); - __flush_wback_region(to, PAGE_SIZE); - } else { + else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | _PAGE_DIRTY | _PAGE_ACCESSED | Index: arch/sh/mm/clear_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/clear_page.S,v retrieving revision 1.2 diff -u -p -r1.2 clear_page.S --- arch/sh/mm/clear_page.S 2001/07/23 10:27:24 1.2 +++ arch/sh/mm/clear_page.S 2001/08/04 10:28:16 @@ -39,6 +39,9 @@ ENTRY(clear_page) mov.l r0,@-r4 mov.l r0,@-r4 mov.l r0,@-r4 +#if defined(__SH4__) + ocbwb @r4 +#endif cmp/eq r5,r4 bf/s 1b add #28,r4 Index: arch/sh/mm/copy_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/copy_page.S,v retrieving revision 1.3 diff -u -p -r1.3 copy_page.S --- arch/sh/mm/copy_page.S 2001/07/28 15:23:38 1.3 +++ 
arch/sh/mm/copy_page.S 2001/08/04 10:28:16 @@ -39,10 +39,13 @@ ENTRY(copy_page) mov.l @r11+,r4 mov.l @r11+,r5 mov.l @r11+,r6 - mov.l @r11+,r7 #if defined(__sh3__) + mov.l @r11+,r7 mov.l r0,@r10 #elif defined(__SH4__) + mov.l @r11,r7 + ocbp @r11 + add #4,r11 movca.l r0,@r10 #endif add #32,r10 @@ -53,6 +56,9 @@ ENTRY(copy_page) mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 +#if defined(__SH4__) + ocbwb @r10 +#endif cmp/eq r11,r8 bf/s 1b add #28,r10 Index: arch/sh/mm/fault.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v retrieving revision 1.44 diff -u -p -r1.44 fault.c --- arch/sh/mm/fault.c 2001/07/18 04:27:38 1.44 +++ arch/sh/mm/fault.c 2001/08/04 10:28:16 @@ -294,10 +294,8 @@ void update_mmu_cache(struct vm_area_str if (__test_and_clear_bit(PG_dcache_dirty, &page->flags)) if (page->mapping) { - unsigned long phys; - /* Physical address of this page */ - phys = PHYSADDR(pte_val(pte)&PAGE_MASK); - __flush_cache_page(phys, 1); + unsigned long phys = pte_val(pte)&PTE_PHYS_MASK; + __flush_purge_region((void *)P1SEGADDR(phys), PAGE_SIZE); } #endif Index: drivers/block/rd.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/drivers/block/rd.c,v retrieving revision 1.8 diff -u -p -r1.8 rd.c --- drivers/block/rd.c 2001/07/31 00:05:21 1.8 +++ drivers/block/rd.c 2001/08/04 10:28:16 @@ -226,6 +226,7 @@ static int rd_make_request(request_queue * sbh might be - NeilBrown */ bdata = bh_kmap(sbh); + flush_cache_all(); if (rw == READ) { if (sbh != rbh) memcpy(bdata, rbh->b_data, rbh->b_size); Index: drivers/cdrom/gdrom.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/drivers/cdrom/gdrom.c,v retrieving revision 1.4 diff -u -p -r1.4 gdrom.c --- drivers/cdrom/gdrom.c 2001/08/03 23:50:59 1.4 +++ drivers/cdrom/gdrom.c 2001/08/04 10:28:16 @@ -144,7 +144,7 @@ gdrom_intr(int irq, void *dev_id, struct } 
insw(GDROM_DATA, ctrl->buf, count/2); - __flush_wback_region(ctrl->buf, count); + __flush_purge_region(ctrl->buf, count); ctrl->buf += count; ctrl->size -= count; } Index: include/asm-sh/ide.h =================================================================== RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/ide.h,v retrieving revision 1.16 diff -u -p -r1.16 ide.h --- include/asm-sh/ide.h 2001/08/03 11:22:06 1.16 +++ include/asm-sh/ide.h 2001/08/04 10:28:18 @@ -25,10 +25,23 @@ static __inline__ void ide_insw(unsigned void *dst, unsigned long count) { - extern void _insw (unsigned long port, void *dst, unsigned long count); + extern void _insw(unsigned long port, void *dst, unsigned long count); _insw(port, dst, count); - __flush_wback_region(dst, (count << 1)); + __flush_purge_region(dst, (count << 1)); +} + +#undef outsw +#define outsw(port, buf, nr) ide_outsw((port), (buf), (nr)) + +static __inline__ void ide_outsw(unsigned long port, + void *src, + unsigned long count) +{ + extern void _outsw(unsigned long port, void *src, unsigned long count); + + _outsw(port, src, count); + __flush_purge_region(src, (count << 1)); } #endif Index: include/asm-sh/pgtable.h =================================================================== RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v retrieving revision 1.39 diff -u -p -r1.39 pgtable.h --- include/asm-sh/pgtable.h 2001/08/03 11:22:06 1.39 +++ include/asm-sh/pgtable.h 2001/08/04 10:28:18 @@ -69,10 +69,6 @@ extern void __flush_wback_region(void *s /* Flush (write-back & invalidate) a region (smaller than a page) */ extern void __flush_purge_region(void *start, int size); -/* Flush a page */ -extern void __flush_cache_page(unsigned long phys, int exec); -extern void __flush_icache_page(unsigned long u0, unsigned long phys); - /* Initialization of P3 area for copy_user_page */ extern void p3_cache_init(void); #endif @@ -101,6 +97,8 @@ extern unsigned long empty_zero_page[102 #define USER_PTRS_PER_PGD 
(TASK_SIZE/PGDIR_SIZE) #define FIRST_USER_PGD_NR 0 +#define PTE_PHYS_MASK 0x1ffff000 + #ifndef __ASSEMBLY__ /* * First 1MB map is used by fixed purpose. @@ -205,7 +203,7 @@ extern unsigned long empty_zero_page[102 */ #define page_address(page) ((page)->virtual) /* P1 address of the page */ #define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT)) -#define pte_page(x) phys_to_page(pte_val(x)) +#define pte_page(x) phys_to_page(pte_val(x)&PTE_PHYS_MASK) /* * The following only work if pte_present() is true. Index: mm/vmscan.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/mm/vmscan.c,v retrieving revision 1.41 diff -u -p -r1.41 vmscan.c --- mm/vmscan.c 2001/07/31 00:05:21 1.41 +++ mm/vmscan.c 2001/08/04 10:28:24 @@ -65,6 +65,7 @@ static void try_to_swap_out(struct mm_st * is needed on CPUs which update the accessed and dirty * bits in hardware. */ + flush_cache_page(vma, address); pte = ptep_get_and_clear(page_table); flush_tlb_page(vma, address); @@ -102,7 +103,6 @@ drop_pte: * Basically, this just makes it possible for us to do * some real work in the future in "refill_inactive()". */ - flush_cache_page(vma, address); if (!pte_dirty(pte)) goto drop_pte; -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-04 09:18:01
|
I'll commit changes from mainline. There are no SuperH-related changes. -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-04 05:15:19
|
The following patch eliminates the cache alias issue.

The bug fix is for flush_cache_page. It's a very long-standing bug. When it is called from vmscan.c:try_to_swap_out, the PTE has already been cleared before the call, so it does nothing but return (without flushing). To flush, we need the physical address, so I added a PAGE argument. I'll propose this change on lkml. I think we need this for physically tagged caches.

The other changes alter the cache handling strategy. The current implementation allows cached data on P1, and writes back U0 and purges P1. I think it is better not to allow cached data on P1.

* kernel/ptrace.c, mm/vmscan.c: Use of flush_cache_page.

* arch/sh/mm/cache-sh4.c (copy_user_page): Don't need to write back (it's done in copy_page).
(clear_user_page): Likewise.
(check_cache_page): Removed.
(flush_cache_page): Bug fix. Added PAGE argument. With this, flush U0 cache only (we should not have P1 data here). Note that PTE comes with 0 from try_to_swap_out. (was: nothing done but return because PTE==0)
(__flush_cache_page): Removed.
(flush_dcache_page): Flush P1 cache only (we should not have U0 cache here).

* arch/sh/mm/fault.c (update_mmu_cache): Flush P1 cache only (we should not have U0 cache here).

* arch/sh/mm/copy_page.S: Write back TO, purge FROM.
* arch/sh/mm/clear_page.S: Write back TO.
* arch/sh/mm/__copy_user_page-sh4.S: Purge FROM.

* include/asm/pgtable.h (PTE_PHYS_MASK): Defined.
(pte_page): Bug fix. Use PTE_PHYS_MASK.
(__flush_cache_page): Removed.
(flush_cache_page): Added PAGE argument.

* drivers/cdrom/gdrom.c (gdrom_intr): Purge the cache (was: write-back).
* include/asm-sh/ide.h (ide_insw): Purge the cache (was: write-back).
(ide_outsw, outsw): New functions. Use these for the IDE driver, so as not to leave data in the cache.
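As a rough illustration of the alias condition that copy_user_page and clear_user_page test: with 4K pages and a 16K operand cache, address bits 12-13 select which quarter of the cache a page lands in, and two mappings of the same physical page can share cache lines only when those bits agree. The sketch below is standalone user-space C, not kernel code. The CACHE_ALIAS value of 0x3000 is an assumption inferred from the `address&0x3000` indexing used in the patch, and the helper names `cache_color` and `same_color` are hypothetical.

```c
/* Illustrative sketch only: models the alias test, does not touch any cache. */

#define PAGE_SHIFT   12                      /* 4K pages */
#define PAGE_SIZE    (1UL << PAGE_SHIFT)
#define OC_SIZE      (16UL * 1024)           /* 16K operand cache */
#define CACHE_ALIAS  (OC_SIZE - PAGE_SIZE)   /* 0x3000: bits 12-13 (assumed) */

/* Which of the four 4K quarters of the cache a virtual address indexes. */
static unsigned int cache_color(unsigned long vaddr)
{
	return (unsigned int)((vaddr & CACHE_ALIAS) >> PAGE_SHIFT);
}

/*
 * Non-zero when two virtual addresses index the same quarter, i.e. the
 * same condition as ((address ^ (unsigned long)to) & CACHE_ALIAS) == 0.
 */
static int same_color(unsigned long a, unsigned long b)
{
	return ((a ^ b) & CACHE_ALIAS) == 0;
}
```

For instance, a user address of 0x00401000 and a kernel address of 0x8c005000 both have color 1, so the plain clear_page/copy_page branch would apply; 0x8c002000 has color 2 and would take the else branch instead.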
Index: arch/sh/mm/__copy_user_page-sh4.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/__copy_user_page-sh4.S,v retrieving revision 1.2 diff -u -r1.2 __copy_user_page-sh4.S --- arch/sh/mm/__copy_user_page-sh4.S 2001/07/28 14:22:06 1.2 +++ arch/sh/mm/__copy_user_page-sh4.S 2001/08/04 04:02:55 @@ -44,7 +44,9 @@ mov.l @r11+,r4 mov.l @r11+,r5 mov.l @r11+,r6 - mov.l @r11+,r7 + mov.l @r11,r7 + ocbp @r11 + add #4,r11 movca.l r0,@r10 add #32,r10 mov.l r7,@-r10 Index: arch/sh/mm/cache-sh4.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v retrieving revision 1.9 diff -u -r1.9 cache-sh4.c --- arch/sh/mm/cache-sh4.c 2001/08/04 00:04:48 1.9 +++ arch/sh/mm/cache-sh4.c 2001/08/04 04:02:55 @@ -211,42 +211,8 @@ } /* - * Writeback&Invalidate the D-cache of the page - * Invalidate the I-cache of the page, if needed + * Invalidate the I-cache of the page */ -void __flush_cache_page(unsigned long phys, int exec) -{ - unsigned long addr, data; - unsigned long flags; - - phys|=CACHE_VALID; - - save_and_cli(flags); - jump_to_P2(); - /* Loop all the D-cache */ - for (addr = CACHE_OC_ADDRESS_ARRAY; - addr < (CACHE_OC_ADDRESS_ARRAY - +(CACHE_OC_NUM_ENTRIES<< CACHE_OC_ENTRY_SHIFT)); - addr += (1<<CACHE_OC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } - - if (exec) - /* Loop all the I-cache */ - for (addr = CACHE_IC_ADDRESS_ARRAY; - addr < (CACHE_IC_ADDRESS_ARRAY - +(CACHE_IC_NUM_ENTRIES<< CACHE_IC_ENTRY_SHIFT)); - addr += (1<<CACHE_IC_ENTRY_SHIFT)) { - data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); - if (data == phys) - ctrl_outl(0, addr); - } - back_to_P1(); - save_and_cli(flags); -} - void __flush_icache_page(unsigned long u0, unsigned long phys) { unsigned long addr, data; @@ -257,7 +223,7 @@ save_and_cli(flags); if (u0) { jump_to_P2(); - /* Loop half of the I-cache */ + /* Loop 
4K of the I-cache */ for (addr = CACHE_IC_ADDRESS_ARRAY|(u0&0x1000); addr < ((CACHE_IC_ADDRESS_ARRAY|(u0&0x1000)) +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); @@ -287,19 +253,17 @@ * Write back & invalidate the I/D-cache of the page. * (To avoid "alias" issues) */ -void flush_dcache_page(struct page *pg) +void flush_dcache_page(struct page *page) { - if (!pg->mapping - || pg->mapping->i_mmap || pg->mapping->i_mmap_shared) { - unsigned long phys; - - phys = PHYSADDR(page_address(pg)); - if (pg->mapping) - __flush_cache_page(phys, 1); - else - __flush_cache_page(phys, 0); + if (!page->mapping + || page->mapping->i_mmap || page->mapping->i_mmap_shared) { + __flush_purge_region(page_address(page), PAGE_SIZE); + if (page->mapping) { + unsigned long phys = PHYSADDR(page_address(page)); + __flush_icache_page(0, phys); + } } else - set_bit(PG_dcache_dirty, &pg->flags); + set_bit(PG_dcache_dirty, &page->flags); } void flush_cache_all(void) @@ -357,88 +321,42 @@ * * ADDR: Virtual Address (U0 address) */ -void flush_cache_page(struct vm_area_struct *vma, unsigned long addr) +void flush_cache_page(struct vm_area_struct *vma, unsigned long address, + struct page *page) { - pgd_t *dir; - pmd_t *pmd; - pte_t *pte; - pte_t entry; unsigned long phys; - - dir = pgd_offset(vma->vm_mm, addr); - pmd = pmd_offset(dir, addr); - if (pmd_none(*pmd)) - return; - if (pmd_bad(*pmd)) - return; - pte = pte_offset(pmd, addr); - entry = *pte; - if (pte_none(entry) || !pte_present(entry)) - return; - phys = PHYSADDR(pte_val(entry)&PAGE_MASK); - __flush_cache_page(phys, (vma->vm_flags & VM_EXEC)); -} - -/* - * Check entries of the I-cache & D-cache of the page. 
- * (To see "alias" issues) - */ -void check_cache_page(struct page *pg) -{ - unsigned long phys, addr, data, i; - unsigned long kaddr; - unsigned long cache_line_index; - int bingo = 0; + unsigned long addr, data; unsigned long flags; - /* Physical address of this page */ - phys = PHYSADDR(page_address(pg)); - kaddr = phys + PAGE_OFFSET; - cache_line_index = (kaddr&CACHE_OC_ENTRY_MASK)>>CACHE_OC_ENTRY_SHIFT; - + /* + * We may not have valid PTE here, so can't do OCBP with U0 address. + * (See try_to_swap_out.) + */ + phys = PHYSADDR(page_address(page)) | CACHE_VALID; save_and_cli(flags); jump_to_P2(); - /* Loop all the D-cache */ - for (i=0; i<CACHE_OC_NUM_ENTRIES; i++) { - addr = CACHE_OC_ADDRESS_ARRAY| (i<<CACHE_OC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & (CACHE_UPDATED|CACHE_VALID)) - == (CACHE_UPDATED|CACHE_VALID) - && (data&PAGE_MASK) == phys) { - data &= ~(CACHE_VALID|CACHE_UPDATED); - ctrl_outl(data, addr); - if ((i^cache_line_index)&0x180) - bingo |= 1; - } + /* Loop D-cache 4K */ + for (addr = CACHE_OC_ADDRESS_ARRAY|(address&0x3000); + addr < ((CACHE_OC_ADDRESS_ARRAY|(address&0x3000)) + +(CACHE_OC_NUM_ENTRIES/4<<CACHE_OC_ENTRY_SHIFT)); + addr += (1<<CACHE_OC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); } - cache_line_index &= 0xff; - /* Loop all the I-cache */ - for (i=0; i<CACHE_IC_NUM_ENTRIES; i++) { - addr = CACHE_IC_ADDRESS_ARRAY| (i<<CACHE_IC_ENTRY_SHIFT); - data = ctrl_inl(addr); - if ((data & CACHE_VALID) && (data&PAGE_MASK) == phys) { - data &= ~CACHE_VALID; - ctrl_outl(data, addr); - if (((i^cache_line_index)&0x80)) - bingo |= 2; + if (vma->vm_flags & VM_EXEC) + /* Loop I-cache 4K */ + for (addr = CACHE_IC_ADDRESS_ARRAY|(address&0x1000); + addr < ((CACHE_IC_ADDRESS_ARRAY|(address&0x1000)) + +(CACHE_IC_NUM_ENTRIES/2<<CACHE_IC_ENTRY_SHIFT)); + addr += (1<<CACHE_IC_ENTRY_SHIFT)) { + data = ctrl_inl(addr)&(0x1ffff000|CACHE_VALID); + if (data == phys) + ctrl_outl(0, addr); } - 
} back_to_P1(); - restore_flags(flags); - - if (bingo) { - extern void dump_stack(void); - - if (bingo&1) - printk("BINGO!\n"); -#if 0 - if (bingo&2) - printk("Bingo!\n"); -#endif - dump_stack(); - printk("--------------------\n"); - } + save_and_cli(flags); } /* Page is 4K, OC size is 16K, there are four lines. */ @@ -453,7 +371,6 @@ { if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { clear_page(to); - __flush_wback_region(to, PAGE_SIZE); } else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | @@ -490,7 +407,6 @@ { if (((address ^ (unsigned long)to) & CACHE_ALIAS) == 0) { copy_page(to, from); - __flush_wback_region(to, PAGE_SIZE); } else { pgprot_t pgprot = __pgprot(_PAGE_PRESENT | _PAGE_RW | _PAGE_CACHABLE | Index: arch/sh/mm/clear_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/clear_page.S,v retrieving revision 1.2 diff -u -r1.2 clear_page.S --- arch/sh/mm/clear_page.S 2001/07/23 10:27:24 1.2 +++ arch/sh/mm/clear_page.S 2001/08/04 04:02:55 @@ -39,6 +39,9 @@ mov.l r0,@-r4 mov.l r0,@-r4 mov.l r0,@-r4 +#if defined(__SH4__) + ocbwb @r4 +#endif cmp/eq r5,r4 bf/s 1b add #28,r4 Index: arch/sh/mm/copy_page.S =================================================================== RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/copy_page.S,v retrieving revision 1.3 diff -u -r1.3 copy_page.S --- arch/sh/mm/copy_page.S 2001/07/28 15:23:38 1.3 +++ arch/sh/mm/copy_page.S 2001/08/04 04:02:55 @@ -39,10 +39,13 @@ mov.l @r11+,r4 mov.l @r11+,r5 mov.l @r11+,r6 - mov.l @r11+,r7 #if defined(__sh3__) + mov.l @r11+,r7 mov.l r0,@r10 #elif defined(__SH4__) + mov.l @r11,r7 + ocbp @r11 + add #4,r11 movca.l r0,@r10 #endif add #32,r10 @@ -53,6 +56,9 @@ mov.l r3,@-r10 mov.l r2,@-r10 mov.l r1,@-r10 +#if defined(__SH4__) + ocbwb @r10 +#endif cmp/eq r11,r8 bf/s 1b add #28,r10 Index: arch/sh/mm/fault.c =================================================================== RCS file: 
/cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v retrieving revision 1.44 diff -u -r1.44 fault.c --- arch/sh/mm/fault.c 2001/07/18 04:27:38 1.44 +++ arch/sh/mm/fault.c 2001/08/04 04:02:55 @@ -294,10 +294,9 @@ if (__test_and_clear_bit(PG_dcache_dirty, &page->flags)) if (page->mapping) { - unsigned long phys; - /* Physical address of this page */ - phys = PHYSADDR(pte_val(pte)&PAGE_MASK); - __flush_cache_page(phys, 1); + unsigned long phys = pte_val(pte)&PTE_PHYS_MASK; + __flush_purge_region((void *)P1SEGADDR(phys), PAGE_SIZE); + __flush_icache_page(address, phys); } #endif Index: drivers/cdrom/gdrom.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/drivers/cdrom/gdrom.c,v retrieving revision 1.4 diff -u -r1.4 gdrom.c --- drivers/cdrom/gdrom.c 2001/08/03 23:50:59 1.4 +++ drivers/cdrom/gdrom.c 2001/08/04 04:02:56 @@ -144,7 +144,7 @@ } insw(GDROM_DATA, ctrl->buf, count/2); - __flush_wback_region(ctrl->buf, count); + __flush_purge_region(ctrl->buf, count); ctrl->buf += count; ctrl->size -= count; } Index: include/asm-sh/ide.h =================================================================== RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/ide.h,v retrieving revision 1.16 diff -u -r1.16 ide.h --- include/asm-sh/ide.h 2001/08/03 11:22:06 1.16 +++ include/asm-sh/ide.h 2001/08/04 04:02:59 @@ -25,10 +25,23 @@ void *dst, unsigned long count) { - extern void _insw (unsigned long port, void *dst, unsigned long count); + extern void _insw(unsigned long port, void *dst, unsigned long count); _insw(port, dst, count); - __flush_wback_region(dst, (count << 1)); + __flush_purge_region(dst, (count << 1)); +} + +#undef outsw +#define outsw(port, buf, nr) ide_outsw((port), (buf), (nr)) + +static __inline__ void ide_outsw(unsigned long port, + void *src, + unsigned long count) +{ + extern void _outsw(unsigned long port, void *src, unsigned long count); + + _outsw(port, src, count); + __flush_purge_region(src, (count << 1)); } #endif 
Index: include/asm-sh/pgtable.h =================================================================== RCS file: /cvsroot/linuxsh/kernel/include/asm-sh/pgtable.h,v retrieving revision 1.39 diff -u -r1.39 pgtable.h --- include/asm-sh/pgtable.h 2001/08/03 11:22:06 1.39 +++ include/asm-sh/pgtable.h 2001/08/04 04:02:59 @@ -36,7 +36,7 @@ #define flush_cache_all() do { } while (0) #define flush_cache_mm(mm) do { } while (0) #define flush_cache_range(mm, start, end) do { } while (0) -#define flush_cache_page(vma, vmaddr) do { } while (0) +#define flush_cache_page(vma, vmaddr, page) do { } while (0) #define flush_page_to_ram(page) do { } while (0) #define flush_dcache_page(page) do { } while (0) #define flush_icache_range(start, end) do { } while (0) @@ -56,7 +56,8 @@ extern void flush_cache_mm(struct mm_struct *mm); extern void flush_cache_range(struct mm_struct *mm, unsigned long start, unsigned long end); -extern void flush_cache_page(struct vm_area_struct *vma, unsigned long addr); +extern void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, + struct page *page); extern void flush_dcache_page(struct page *pg); extern void flush_icache_range(unsigned long start, unsigned long end); extern void flush_cache_sigtramp(unsigned long addr); @@ -70,7 +71,7 @@ extern void __flush_purge_region(void *start, int size); /* Flush a page */ -extern void __flush_cache_page(unsigned long phys, int exec); +extern void __flush_cache_page(unsigned long phys, unsigned long u0, int exec); extern void __flush_icache_page(unsigned long u0, unsigned long phys); /* Initialization of P3 area for copy_user_page */ @@ -101,6 +102,8 @@ #define USER_PTRS_PER_PGD (TASK_SIZE/PGDIR_SIZE) #define FIRST_USER_PGD_NR 0 +#define PTE_PHYS_MASK 0x1ffff000 + #ifndef __ASSEMBLY__ /* * First 1MB map is used by fixed purpose. 
@@ -205,7 +208,7 @@ */ #define page_address(page) ((page)->virtual) /* P1 address of the page */ #define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT)) -#define pte_page(x) phys_to_page(pte_val(x)) +#define pte_page(x) phys_to_page(pte_val(x)&PTE_PHYS_MASK) /* * The following only work if pte_present() is true. Index: kernel/ptrace.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/kernel/ptrace.c,v retrieving revision 1.8 diff -u -r1.8 ptrace.c --- kernel/ptrace.c 2001/07/23 00:00:57 1.8 +++ kernel/ptrace.c 2001/08/04 04:03:03 @@ -100,7 +100,7 @@ } get_page(page); spin_unlock(&mm->page_table_lock); - flush_cache_page(vma, addr); + flush_cache_page(vma, addr, page); if (write) { maddr = kmap(page); Index: mm/vmscan.c =================================================================== RCS file: /cvsroot/linuxsh/kernel/mm/vmscan.c,v retrieving revision 1.41 diff -u -r1.41 vmscan.c --- mm/vmscan.c 2001/07/31 00:05:21 1.41 +++ mm/vmscan.c 2001/08/04 04:03:04 @@ -102,7 +102,7 @@ * Basically, this just makes it possible for us to do * some real work in the future in "refill_inactive()". */ - flush_cache_page(vma, address); + flush_cache_page(vma, address, page); if (!pte_dirty(pte)) goto drop_pte; -- |
From: NIIBE Y. <gn...@m1...> - 2001-08-04 00:04:02
|
This is not an issue actually, because all the calls are aligned. But a bug is a bug, is a bug.

* arch/sh/mm/cache-sh4.c (__flush_wback_region): Fix bug of expression of END.
(__flush_purge_region): Likewise.
(__flush_invalidate_region): Likewise.

Index: arch/sh/mm/cache-sh4.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/cache-sh4.c,v
retrieving revision 1.8
diff -u -r1.8 cache-sh4.c
--- arch/sh/mm/cache-sh4.c 2001/08/03 11:22:06 1.8
+++ arch/sh/mm/cache-sh4.c 2001/08/03 23:53:51
@@ -131,7 +131,8 @@
 	unsigned long begin, end;
 
 	begin = (unsigned long)start& ~(L1_CACHE_BYTES-1);
-	end = begin + size;
+	end = ((unsigned long)start + size + L1_CACHE_BYTES-1)
+		& ~(L1_CACHE_BYTES-1);
 	for (v = begin; v < end; v+=L1_CACHE_BYTES) {
 		asm volatile("ocbwb %0"
 			     : /* no output */
@@ -150,7 +151,8 @@
 	unsigned long begin, end;
 
 	begin = (unsigned long)start& ~(L1_CACHE_BYTES-1);
-	end = begin + size;
+	end = ((unsigned long)start + size + L1_CACHE_BYTES-1)
+		& ~(L1_CACHE_BYTES-1);
 	for (v = begin; v < end; v+=L1_CACHE_BYTES) {
 		asm volatile("ocbp %0"
 			     : /* no output */
@@ -168,7 +170,8 @@
 	unsigned long begin, end;
 
 	begin = (unsigned long)start& ~(L1_CACHE_BYTES-1);
-	end = begin + size;
+	end = ((unsigned long)start + size + L1_CACHE_BYTES-1)
+		& ~(L1_CACHE_BYTES-1);
 	for (v = begin; v < end; v+=L1_CACHE_BYTES) {
 		asm volatile("ocbi %0"
 			     : /* no output */
-- |
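To make the effect of the END fix concrete, here is a small standalone sketch (user-space C, not the kernel routines) that counts how many cache lines each version of the loop would visit. It assumes a 32-byte L1_CACHE_BYTES; the function names `lines_old` and `lines_new` are hypothetical and exist only for this comparison.

```c
#include <stddef.h>

#define L1_CACHE_BYTES 32UL   /* assumed line size for this sketch */

/* Old expression: END is derived from the rounded-down BEGIN. */
static unsigned long lines_old(unsigned long start, size_t size)
{
	unsigned long begin = start & ~(L1_CACHE_BYTES - 1);
	unsigned long end = begin + size;
	unsigned long v, n = 0;

	for (v = begin; v < end; v += L1_CACHE_BYTES)
		n++;
	return n;
}

/* New expression: END is START + SIZE rounded up to a line boundary. */
static unsigned long lines_new(unsigned long start, size_t size)
{
	unsigned long begin = start & ~(L1_CACHE_BYTES - 1);
	unsigned long end = (start + size + L1_CACHE_BYTES - 1)
		& ~(L1_CACHE_BYTES - 1);
	unsigned long v, n = 0;

	for (v = begin; v < end; v += L1_CACHE_BYTES)
		n++;
	return n;
}
```

With an unaligned region such as start 0x1010 and size 32, the bytes span two cache lines; the old expression visits only the first, while the new one visits both. For line-aligned calls the two agree, which matches the remark that existing callers were not affected.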
From: Greg B. <gb...@po...> - 2001-08-03 15:39:09
|
"M. R. Brown" wrote: > > * Greg Banks <gb...@po...> on Fri, Aug 03, 2001: > > > > > I think you've just illustrated my point about CVS version numbers becoming > > non-intuitive once you start using branches. > > With tags, keep track of version numbers isn't that big of a deal. The point is that CVS branches are complex and difficult for many people to consistently handle properly. When you accidentally check a change into the wrong branch, having a tag is not going to help you recover from that situation. > I still can't fathom the general sentiment of this group that "just because > we don't know how to do it right, we'll brute force the hell out of it." You seem to be under the impression that "right" = "the way that uses the maximum CVS features". Me, I figure "right" = "the way that will work, will continue to work, and will gain acceptance easily". > I > mean come on, if you're doing project management, you use the right tools > for the right job. Unless they're from Microsoft, yes. And presuming I can afford to buy it. And presuming it runs on a platform I have access to. > What else are you going to use besides CVS? No-one's suggesting anything else now. BitKeeper *was* suggested 18 months ago. > And since > it _is_ CVS, why not use it efficiently and correctly? That's precisely what I'm trying to do. Using CVS in a way which results in more errors and more time fixing them, is *not* efficient. Remember, we're trying to maximise productivity, not code coverage of the CVS executable. > I you want me to write a quick tutorial on CVS, for the benefit of those > who wish to use it but know little about it, then I'll do it in tandem with > the Developers Guidelines. Sure. Do you want to look at the one I wrote and presented in Mar 2000, that convinced people to start using CVS? Look, it's not like I'm averse to imposing a learning curve on people, *if* it leads to clear benefits. But the way I see it is, we can go CVS branches or separate directories. 
The storage requirements are basically the same, the bandwidth requirements are basically the same, and the practical results (when everything works right) are basically the same. The only difference is that CVS branches require all the developers to learn something new (and they get to stuff it up more frequently) in exchange for which a handful of developers get some warm fuzzy feelings about "correctness". So where's the clear benefit?

If someone asked you to learn to drive your car with a joystick instead of a steering wheel, because it was "correct", what would you say? You'd say, "so what's in it for me?" Then you'd answer yourself, "well, nothing."

> > > If people don't like the idea of dealing
> > > with CVS branches, so be it. That still shouldn't have any bearing on the
> > > drop-in tree or CVS tag usage.
> >
> > Yes, separate issues.
> >
>
> Like Paul and you've said, we won't know unless we try, right?

Yeah. Maybe people won't mind learning complex stuff for no good reason. Weirder things have happened.

I suggest we drop the argument about CVS branches, which is moot until 2.5 is released anyway, and concentrate on stuff we can agree on. Otherwise we'll go around in these pointless little circles.

Greg.
--
If it's a choice between being a paranoid, hyper-suspicious global village idiot, or a gullible, mega-trusting sheep, I don't look good in mint sauce. - jd, slashdot, 11Feb2000. |
From: M. R. B. <mr...@0x...> - 2001-08-03 14:46:47
|
* Greg Banks <gb...@po...> on Fri, Aug 03, 2001:

>
> I think you've just illustrated my point about CVS version numbers becoming
> non-intuitive once you start using branches.

With tags, keeping track of version numbers isn't that big of a deal.

> >
> > A drop-in tree can be done without CVS branches just fine, though CVS branches
> > would be a lot cleaner in the long run.
>
> So basically, we spend a large learning curve and an indeterminate
> number of mistakes to gain some cleanliness? I'm not convinced
> this tradeoff is a good idea.
>

I still can't fathom the general sentiment of this group that "just because we don't know how to do it right, we'll brute force the hell out of it." I mean come on, if you're doing project management, you use the right tools for the right job. What else are you going to use besides CVS? And since it _is_ CVS, why not use it efficiently and correctly? There are so many helpful resources out there that this "large learning curve" becomes a moot point very quickly.

If you want me to write a quick tutorial on CVS, for the benefit of those who wish to use it but know little about it, then I'll do it in tandem with the Developers Guidelines.

> > If people don't like the idea of dealing
> > with CVS branches, so be it. That still shouldn't have any bearing on the
> > drop-in tree or CVS tag usage.
>
> Yes, separate issues.
>

Like Paul and you've said, we won't know unless we try, right?

M. R. |