From: Erik A. H. <er...@he...> - 2002-06-12 20:33:46
On Thu, Jun 06, 2002 at 04:44:47PM -0400, Grant Taylor wrote:
> >>>>> Erik Arjan Hendriks <er...@he...> writes:
>
> >> - Removing seemingly unneeded icache flushes for non-executable pages
> >>   in load_map makes no difference, time-wise, on my platform. This
>
> > That's good. I don't find it surprising though since most of the
> > addresses involved in the flush aren't going to be in the icache
>
> OK, I lied. Evidently that experiment was done with the wrong kernel
> or something. In fact it's like 5X faster, as cache flushes are fairly
> expensive on my (SMP) platform. Profile below.
>
> So you definitely want to stick an if around the flush_icache_range
> call for the benefit of those of us with expensive ipi cache
> nastiness:
>
>   if (mmap_prot & PROT_EXEC) {
>       flush_icache_range(page.start, page.start + PAGE_SIZE);
>   }

Cool. Patch added.

> I also observe in my profile that the read's memcpy ("both_aligned" in
> my profile) and the fault's memzero ("sb1_clear_page") each take
> nearly half the time. I'm going to bodge together some way to skip
> the page clear for this case; we already know we're overwriting the
> whole page, so this should be OK as long as it gets carefully zeroed
> on error. It'll be pretty ugly, though, so I suspect you won't
> actually want it.

[snip]

I find it surprising that the cache stuff is so expensive. Is the
hardware somewhat lacking in maintaining coherency between CPUs?

- Erik
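
For anyone reading the thread later, the guarded flush quoted above can be modelled in userspace. This is only a rough sketch, not the actual load_map code: sketch_load_page(), sketch_flush_icache_range(), sketch_flush_calls and SKETCH_PAGE_SIZE are hypothetical names made up for illustration, and the flush is a stub that just counts calls, where the kernel's flush_icache_range() would do the real (and on some SMP platforms expensive) work.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>   /* PROT_READ, PROT_WRITE, PROT_EXEC */

#define SKETCH_PAGE_SIZE 4096   /* assumed page size, for illustration only */

/* Hypothetical stand-in for the kernel's flush_icache_range():
 * it only counts how often a flush would have been issued. */
static unsigned long sketch_flush_calls;

static void sketch_flush_icache_range(uintptr_t start, uintptr_t end)
{
    (void)start;
    (void)end;
    sketch_flush_calls++;
}

/* Copy one page into place and flush the icache only when the mapping
 * is executable. Returns 1 if a flush was issued, 0 otherwise. */
static int sketch_load_page(void *dst, const void *src, int mmap_prot)
{
    memcpy(dst, src, SKETCH_PAGE_SIZE);

    if (mmap_prot & PROT_EXEC) {
        uintptr_t start = (uintptr_t)dst;
        sketch_flush_icache_range(start, start + SKETCH_PAGE_SIZE);
        return 1;
    }
    return 0;   /* data-only page: nothing can be stale in the icache */
}
```

Calling sketch_load_page() with PROT_READ | PROT_WRITE leaves the counter untouched, while PROT_READ | PROT_EXEC bumps it once per page. That is the whole point of the guard: data pages never pay for the flush.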