From: Paul M. <pau...@re...> - 2007-02-09 10:23:47
On Fri, Feb 09, 2007 at 10:32:49AM +0100, Manuel Lauss wrote:
> On Fri, Feb 09, 2007 at 05:34:24PM +0900, Paul Mundt wrote:
> > On Thu, Feb 08, 2007 at 12:02:47PM +0100, Manuel Lauss wrote:
> > > On Thu, Feb 08, 2007 at 05:26:23PM +0900, Paul Mundt wrote:
> > > > Assuming no luck and you hit the same problem again, but at a different
> > > > offset, and it's repeatable, can you walk the page tables and print out
> > > > the pgprot encoding as well as the corresponding pfn << PAGE_SHIFT for
> > > > each remapped PTE?
> > >
> > > That may take a (long) while, I don't know yet where to start :)
> > >
> > Try this..
>
> Thank you.
>
> Curiously, with the dumping enabled the oops did not occur
> the first 3 times, strange, but after another reset it is
> as oops-happy as ever.
>
> This is the log of the first few *working* cases:
> http://mlau.at/sh/sh4-ptedumper.txt
>
> And here from the oopsing ones:
> http://mlau.at/sh/sh4-ptedumper-2.txt
> http://mlau.at/sh/sh4-ptedumper-3.txt
>
> I'll try to find something with the JTAG probe next.
>
The page tables look perfectly sane, so that's quite interesting. At
least we know it's not a problem of a bogus page table mapping, or
pgprot oddities.

The next thing to try would be to pre-fault the translation and make
sure it's in the TLB, so we don't take a TLB miss for the dying page.
That will show whether the issue persists for that page, or whether it's
bumped to the next one. You're clearly on a system that has the
old-style PTEA, so that might be something else to look at (and that's
also set in the update_mmu_cache() path).

The fact that this fault happens at a fixed location suggests that it's
not actually having a problem faulting in the translation, so I would
imagine you're not going to find much in the TLB miss case. Once you've
pre-faulted the page, please try to dump the whole page and see if it
manages to blow up at that same location.

You can set up the translation with:

	pte_t entry = pfn_pte(0x18000000 >> PAGE_SHIFT, pgprot);
	update_mmu_cache(NULL, 0xc0600000, entry);

or you can of course just hack something stupid into the dump code to
pre-fault and bail out early (i.e., a dummy read for every PTE).

The next question would be what register you have at 0x18000ff0, whether
you can use 16 or 32-bit reads, and if so, whether it's the same
location that blows up. The fact it worked the first few times and it's
an uncached mapping almost suggests a timing problem, oddly.
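
For reference, the address arithmetic behind the snippet can be sketched in plain userspace C. This is only an illustration, not kernel code: the `ex_` helpers and `EX_` macros are hypothetical names, and it assumes 4 KiB pages (PAGE_SHIFT of 12), which may not match your configuration. It also assumes the page at physical 0x18000000 is the one mapped at virtual 0xc0600000, as in the update_mmu_cache() call.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. PAGE_SHIFT == 12. Adjust if yours differs. */
#define EX_PAGE_SHIFT 12u
#define EX_PAGE_SIZE  (1u << EX_PAGE_SHIFT)

/* Page frame number of a physical address (the value handed to pfn_pte()). */
static inline uint32_t ex_phys_to_pfn(uint32_t phys)
{
	return phys >> EX_PAGE_SHIFT;
}

/* Byte offset of an address inside its page. */
static inline uint32_t ex_page_offset(uint32_t addr)
{
	return addr & (EX_PAGE_SIZE - 1u);
}

/*
 * Virtual address corresponding to phys, given that phys_base is mapped
 * at virt_base. Hypothetical helper; only valid inside the mapped range.
 */
static inline uint32_t ex_phys_to_virt(uint32_t phys, uint32_t phys_base,
				       uint32_t virt_base)
{
	return virt_base + (phys - phys_base);
}
```

Under those assumptions, 0x18000000 gives pfn 0x18000, and the register at 0x18000ff0 sits at offset 0xff0 in that page, which would be reached through 0xc0600ff0 once the translation is in place.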