From: Stuart M. <Stu...@st...> - 2000-06-23 18:26:48
Niibe-san,

I got both your mails about the cache problem. Thanks for confirming my
thoughts, and I'll post something on the kernel mailing list about this.
However, your first mail raised some interesting questions.

> > Note this is on an SH4, so to some extent we have to regard the cache
> > as virtual.
>
> More specifically, SH-4 has virtually tagged and physically indexed
> cache system.  I think that MIPS R4000 has same system.

This was what I thought, until I started looking at it yesterday. You're
right, it has the same synonym problem, due to cache size vs. page size
issues. However, there is one important difference: the hardware detects
the synonym, and raises an exception when it would be created. This has
advantages and disadvantages:

- the advantage is that, although there is a performance penalty, there
  should be no risk of errors caused by synonyms
- however, it does prevent legitimate synonyms, for example when all the
  mappings are read-only.

SH4 doesn't have this detection capability, so we have to make sure we
catch any problems early.

The other point at which synonyms can occur is when a page is explicitly
mapped at multiple addresses in user space. So I had a go at writing a
simple test case for this, and then fixing the problem. Attached are my
test code and a potential fix. This is taken almost unchanged from the
Sparc sun4c code, which has a completely virtual cache. It's horrible,
but I don't see any alternative, unless we can restrict the addresses at
which shared pages can be mapped.

This also got me thinking about another potential problem. In
update_mmu_cache, the new TLB entry is copied into the TLB using the
LDTLB instruction. This uses the URC bits of the MMUCR register to
select the TLB entry to replace, so in effect the TLB entry replaced is
selected at random. If, however, there is already a TLB entry for this
page, this could leave two TLB entries referring to the same page, and
cause a 'Multiple Hit Exception' when the page is next accessed. I guess
the fact that we are not seeing these means that it is not a problem,
but I've not convinced myself that it can never occur. If we have to, it
would be pretty easy to fix: effectively perform a flush_tlb_page before
loading the new entry, but this is messy (a rough sketch of what I mean
is below). Any thoughts?

> Generally speaking, I don't think we need flush at the time of EXIT.
> We can postpone the flush at the time of reuse of the physical memory
> (or if the memory is overwritten, we don't need to flush them).

Agreed. However, I came across an email from Linus (admittedly written
way back in 1995) where he said that he preferred flushing when a page
was freed, as this is likely to be less time critical than when it is
allocated. This appears to be the way the kernel implements things.
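Going back to the update_mmu_cache question, this is roughly what I have
in mind (just a sketch, not tested; it assumes flush_tlb_page() will
invalidate any existing UTLB entry for the address, and it leaves out
the existing PTEH/PTEL setup and the LDTLB itself):

void update_mmu_cache(struct vm_area_struct *vma,
		      unsigned long address, pte_t pte)
{
	unsigned long flags;

	save_and_cli(flags);

	/* Make sure there is no stale UTLB entry for this address before
	 * LDTLB fills the entry selected by MMUCR.URC; otherwise two
	 * entries could match the same page and we'd take a Multiple Hit
	 * Exception on the next access. */
	if (vma)
		flush_tlb_page(vma, address & PAGE_MASK);

	/* ... existing code: set up PTEH/PTEL and issue LDTLB ... */

	restore_flags(flags);
}

The cost is an extra TLB flush on every update_mmu_cache call, so it is
probably only worth doing if we actually see multiple hit exceptions.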
Stuart

------------------------------------------------------------------------------

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE (1024)

int main(void)
{
	volatile int *p1, *p2;
	int fd;
	int i;
	int count;

	printf("Test running\n");

	fd = open("/tmp/xx", O_RDWR | O_CREAT, 0666);
	if (fd < 0) {
		perror("Failed to open file");
		exit(1);
	}

	for (i = 0; i < SIZE; i++) {
		int dummy = 0;
		write(fd, &dummy, sizeof(int));
	}

	p1 = mmap(NULL, SIZE * sizeof(int), PROT_READ | PROT_WRITE,
		  MAP_SHARED, fd, 0);
	if (p1 == MAP_FAILED) {
		perror("First mmap failed");
		exit(1);
	}

	p2 = mmap((void *)(p1 + SIZE), SIZE * sizeof(int),
		  PROT_READ | PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
	if (p2 == MAP_FAILED) {
		perror("Second mmap failed");
		exit(1);
	}

	/* access the second mapping here, so that the TLB is set up, this
	 * way we don't take a fault on the first access, which might cause
	 * a flush. */
	p2[0] = 1;

	printf("Running test\n");

	for (i = 0; i < SIZE; i++) {
		p1[i] = i;
	}

	count = 0;
	for (i = 0; i < SIZE; i++) {
		int j;

		j = p2[i];
		if (j != i) {
			printf("Diff at 0x%x (read 0x%x)\n", i, j);
			count++;
		}
	}

	printf("Differences: %d\n", count);
	printf("Finished\n");

	return 0;
}

------------------------------------------------------------------------------

Index: arch/sh/mm/fault.c
===================================================================
RCS file: /cvsroot/linuxsh/kernel/arch/sh/mm/fault.c,v
retrieving revision 1.11
diff -u -r1.11 fault.c
--- arch/sh/mm/fault.c	2000/05/18 09:34:18	1.11
+++ arch/sh/mm/fault.c	2000/06/23 17:58:48
@@ -268,6 +268,82 @@
 	goto no_context;
 }
 
+#ifdef __SH4__
+/* There are really two cases of aliases to watch out for, and these
+ * are:
+ *
+ * 1) A user's page which can be aliased with the kernel's virtual
+ *    mapping of the physical page.
+ *
+ * 2) Multiple user mappings of the same inode/anonymous object
+ *    such that two copies of the same data for the same phys page
+ *    can live (writable) in the cache at the same time.
+ *
+ * We handle number 1 by flushing the kernel copy of the page always
+ * after COW page operations.
+ */
+static void synonym_fixup(struct vm_area_struct *vma, unsigned long address, pte_t pte)
+{
+	pgd_t *pgdp;
+	pte_t *ptep;
+
+	if (vma->vm_file) {
+		struct address_space *mapping;
+		unsigned long offset = (address & PAGE_MASK) - vma->vm_start;
+		struct vm_area_struct *vmaring;
+		int alias_found = 0;
+
+		mapping = vma->vm_file->f_dentry->d_inode->i_mapping;
+		spin_lock(&mapping->i_shared_lock);
+		vmaring = mapping->i_mmap;
+		do {
+			unsigned long vaddr = vmaring->vm_start + offset;
+			unsigned long start;
+
+			/* Do not mistake ourselves as another mapping. */
+			if (vmaring == vma)
+				continue;
+
+			printk("Found a shared mapping\n");
+			if ((vaddr ^ address) & 0x3000) {
+				printk("Found a synonym problem\n");
+				alias_found++;
+				start = vmaring->vm_start;
+
+				/* If this is already in the cache flush it,
+				 * and remove any existing TLB entries.
+				 */
+				while (start < vmaring->vm_end) {
+					pgdp = pgd_offset(vmaring->vm_mm, start);
+					if (!pgdp)
+						goto next;
+					ptep = pte_offset((pmd_t *) pgdp, start);
+					if (!ptep)
+						goto next;
+
+					if (pte_val(*ptep) & _PAGE_PRESENT) {
+						printk("..and the page is present\n");
+						flush_cache_page(vmaring, start);
+						*ptep = __pte(pte_val(*ptep) &
+							      (~ _PAGE_CACHABLE));
+						flush_tlb_page(vmaring, start);
+					}
+				next:
+					start += PAGE_SIZE;
+				}
+			}
+		} while ((vmaring = vmaring->vm_next_share) != NULL);
+		spin_unlock(&mapping->i_shared_lock);
+
+		if (alias_found && (pte_val(pte) & _PAGE_CACHABLE)) {
+			printk("updating PTE\n");
+			pgdp = pgd_offset(vma->vm_mm, address);
+			ptep = pte_offset((pmd_t *) pgdp, address);
+			*ptep = __pte(pte_val(*ptep) & (~ _PAGE_CACHABLE));
+			pte = *ptep;
+		}
+	}
+}
+#endif
+
 void update_mmu_cache(struct vm_area_struct * vma,
 		      unsigned long address, pte_t pte)
 {
@@ -276,6 +352,11 @@
 	unsigned long pteaddr;
 
 	save_and_cli(flags);
+
+#ifdef __SH4__
+	if ((vma->vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED))
+		synonym_fixup(vma, address, pte);
+#endif
 
 	/* Set PTEH register */
 	if (vma) {
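As an aside on the 0x3000 mask in synonym_fixup() above: assuming the
SH7750's 16KB direct-mapped operand cache and 4KB pages, the cache index
bits that lie above the page offset are bits 13:12, so two mappings of
the same page can only end up in different cache lines when they differ
in those bits. A minimal illustration, with a made-up helper name:

/* Two virtual mappings of the same physical page land on different
 * cache lines only if they differ in the index bits above the page
 * offset.  With a 16KB direct-mapped cache and 4KB pages those are
 * bits 13:12, i.e. the 0x3000 used in synonym_fixup(). */
static inline int cache_colours_differ(unsigned long va1, unsigned long va2)
{
	return ((va1 ^ va2) & 0x3000) != 0;
}

If the colours match, the two mappings index the same cache line, so a
write through one is seen through the other and no fixup is needed.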