From: NIIBE Y. <gn...@ch...> - 2000-06-23 02:29:38
Stuart Menefy wrote:
> I've been battling against a problem for a couple of days now, which looks
> like a 'generic' kernel problem, but I'd like to run it past you first.

I know the battle is quite annoying...

> Note this is on an SH4, so to some extent we have to regard the cache
> as virtual.

More specifically, the SH-4 has a virtually indexed, physically tagged
cache system. I think the MIPS R4000 has the same system.

Generally speaking, I don't think we need to flush at the time of exit.
We can postpone the flush until the physical memory is reused (or, if the
memory is overwritten, we don't need to flush it at all).

> As far as I can tell, the exact problem I have is this
> - a page is allocated and mapped into a process's address space
>   (I think it's part of the stack) and is written to. The written
>   data ends up in the cache as normal.
> - the process then exits, and all the pages are freed.
> - later another process executes a fork, and so the kernel needs to
>   allocate a page for the child process's pgd, and picks
>   the page which was previously part of the first process's stack
> - the pgd for the child process is copied from the parent to the child
>   into the newly allocated page
> - as part of duplicating the memory map for the child a
>   flush_cache_mm is called, which results in the entire cache being
>   flushed, including the data which was cached right at the beginning
>   of this description. This overwrites the pgd with garbage, and later
>   causes errors to be reported.

I don't think this corruption of the pgd can happen in this sequence.
The page allocated for the PGD is initialized (overwritten) by
get_pgd_slow.

If memory corruption does occur, the sequence would be like this:

(1) There's a cache entry, say, for the virtual address VIRT, for
    physical address PHYS.
(2) For some reason (a bug), the flush doesn't occur where it should.
(3) The page at PHYS is allocated for a different purpose.
(4) Some memory access occurs which invalidates VIRT.
    The entry is flushed and corrupts the data at PHYS. Totally
    unrelated data is written back to physical memory.

Your patch may change the behavior, as it flushes the cache at the time
of the exit system call. However, I suspect there's a real bug somewhere
else.
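
To make steps (1) to (4) concrete, here is a toy user-space model in C. It is only a sketch, not kernel code. Everything in it is made up for illustration: the single cache line covering a whole four-word "page", the names simulate_stale_writeback and flush_line, and the two data patterns. It also assumes that the new owner's data reaches memory by a path that does not hit the stale line (for example through a different virtual alias), which is modeled here as a direct store to the array.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 4            /* toy page size, in words (assumption) */
#define OLD_DATA   0xDEADBEEFu  /* left in the cache by the exited process */
#define NEW_DATA   0x00C0FFEEu  /* stored by the page's new owner */

/* One write-back cache line that covers the whole toy page. */
struct cache_line {
    int      valid;
    int      dirty;
    uint32_t data[PAGE_WORDS];
};

static uint32_t phys_page[PAGE_WORDS];  /* memory at PHYS */
static struct cache_line line;          /* the entry cached for VIRT */

/* Write the line back if it is dirty, then invalidate it. */
static void flush_line(void)
{
    if (line.valid && line.dirty)
        memcpy(phys_page, line.data, sizeof(phys_page));
    line.valid = 0;
    line.dirty = 0;
}

/*
 * Runs steps (1)-(4) and returns word 0 of the page as seen in memory.
 * flush_before_reuse != 0 models the flush in step (2) happening correctly.
 */
uint32_t simulate_stale_writeback(int flush_before_reuse)
{
    int i;

    memset(phys_page, 0, sizeof(phys_page));
    memset(&line, 0, sizeof(line));

    /* (1) The first owner writes through VIRT; the data stays dirty
     *     in the cache and has not reached memory yet. */
    line.valid = 1;
    line.dirty = 1;
    for (i = 0; i < PAGE_WORDS; i++)
        line.data[i] = OLD_DATA;

    /* (2) The page is freed. The flush that should happen is optional. */
    if (flush_before_reuse)
        flush_line();

    /* (3) The page is reused; the new data goes to memory by a path
     *     that misses the stale line (modeled as a direct store). */
    for (i = 0; i < PAGE_WORDS; i++)
        phys_page[i] = NEW_DATA;

    /* (4) A later access invalidates VIRT; a dirty line is written back. */
    flush_line();

    return phys_page[0];
}
```

Calling simulate_stale_writeback(0) returns OLD_DATA: the stale line is written back over the new contents in step (4). Calling simulate_stale_writeback(1) returns NEW_DATA, because the line was already written back and invalidated before the page was reused.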