From: Grant T. <gt...@sw...> - 2002-03-03 22:59:55
I've made vmadump go on the sb1 embedded mips cpu, at least mostly. Currently it only works in non-SMP mode...

The main thing for the port, and something which I'm not sure how you'll want to deal with, is the fact that on MIPS, kernel entry points save only some registers (as there are plenty of registers, and only a few are really used for syscalls, this saves a little time). At key times later (context switch, etc) there is code to save the rest of them. Unfortunately it isn't at all easy to get at the unsaved registers, so I ended up saving them all always. The slightly better choice is to have kernel entry magically recognize the vmadump syscall number and save all just for it. Anyway, this somewhat precludes the current vmadump-as-a-module arrangement. There might be other platforms with this property that vmadump will need to deal with someday, so it's worth pondering a bit.

Another key thing, and an apparent general bug, is the (somewhat new) flush_icache_range() call in load_map, which explodes every time. This appears to be passing in a userspace address, while flush_icache functions seem to expect a real address. I replaced this with a get_user_pages(), flush_icache_page() incantation:

    // bproc flushed a user address: flush_icache_range(page.start, page.start + PAGE_SIZE);
    {
        struct page *pages[1];
        struct vm_area_struct *vmas[1];
        int i;

        pages[0] = NULL;
        vmas[0] = NULL;

        /* It really seems like there should be a lock held over the
           get_user_pages() to flush_icache_page()? */
        i = get_user_pages(current, current->mm, page.start,
                           1, 0, 1, pages, vmas);
        if (i == 1 && vmas[0] && pages[0]) {
            flush_icache_page(vmas[0], pages[0]);
        } else {
            printk("vmadump: trouble finding user page at 0x%x for icache flush!\n",
                   page.start);
        }
    }

Curiously, the bad flush_icache thing didn't fail in uniprocessor mode. I don't know why this would be so; it seems to me that flushing effectively random addresses would be poor either way.
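The hazard that the flush is meant to prevent can be sketched in userspace. This is only an analogy, not vmadump or kernel code: it writes a few instruction bytes into a fresh page (as a thaw does when it loads text pages), then synchronizes the instruction cache before jumping to them. The function name run_freshly_written_code() and the x86 byte sequence are assumptions made for the sketch; __builtin___clear_cache() is the GCC/Clang builtin that stands in for the kernel-side flush. On x86 the hardware keeps the icache coherent, so the builtin does nothing visible there, whereas on MIPS parts like the sb1 the explicit step is what keeps the core from fetching stale instructions.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef int (*int_fn)(void);

/*
 * Write a tiny function into a fresh page, flush, then call it.
 * Returns the value produced by the new code, or -1 if the sketch
 * cannot run here (unsupported CPU, or mmap/mprotect refused).
 */
int run_freshly_written_code(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Assumed x86 encoding of: mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    long page_size = sysconf(_SC_PAGESIZE);
    char *buf;
    int_fn fn;
    int result;

    if (page_size <= 0)
        return -1;

    buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* These stores go through the data cache. */
    memcpy(buf, code, sizeof code);

    if (mprotect(buf, (size_t)page_size, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, (size_t)page_size);
        return -1;
    }

    /* Make the new bytes visible to instruction fetch before executing. */
    __builtin___clear_cache(buf, buf + sizeof code);

    /* Copy the address into a function pointer without a direct cast. */
    memcpy(&fn, &buf, sizeof fn);
    result = fn();

    munmap(buf, (size_t)page_size);
    return result;
#else
    return -1;  /* no instruction bytes provided for this CPU */
#endif
}
```

Calling run_freshly_written_code() on an x86 Linux box should give back the constant loaded by the generated code. On a CPU without coherent instruction fetch, leaving out the flush is the kind of omission that would show up as timing-dependent failures.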
Evidently it's a nonfatal failure on my platform in uniprocessor mode.

Regardless, now thawing under SMP never panics the kernel. Sometimes, thawing even works completely. However, it usually still fails, and in a funky way. The thawed process merely segfaults immediately or shortly after resurrection. Occasionally it (vmadtest) will even print some of its .+'s as it does things and then segv.

What's really interesting is that often *my shell* will exit for no reason shortly after a segv'd thawed process dies. The shell is of course the parent process of the "vmadtest -u", but what in the thaw process makes the parent exit later I just don't understand. It seems like any shared-with-parent memory mappings would have been lost long ago when the "vmadtest -u" was exec'd; likewise for any inherited signal state information*. And I'm pretty sure my registers are coming back at me.

If I run the thaw under GDB, then GDB tends to exit bizarrely, so I can't really poke around easily to verify registers and memory contents ;(

All in all, I'm inclined to think that the icache stuff is still wrong; whether or not the child works seems to be time-dependent, which I could see if there were dcache/icache interactions giving my core old code to run in the newly undumped process. Interestingly, using a flush_icache_all() at the end of the page loading behaves exactly the same.

Can anyone offer any suggestions? I'm a bit puzzled...

* Signal state can apparently be shared with another process, so vmadump may have a bug here if something clones keeping signals and then calls the vmadump thaw syscall before exec. There's a static function which breaks the sharing of task->sig in fs/exec.c.

-- 
Grant Taylor - gtaylor<at>picante.com - http://www.picante.com/~gtaylor/
Linux Printing Website and HOWTO: http://www.linuxprinting.org/
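The signal-sharing case in the footnote can be illustrated from userspace. The following is a minimal sketch, assuming Linux with glibc's clone() wrapper; the names sighand_is_shared() and child_ignores_sigusr1() are made up for the example, and nothing here touches vmadump itself. A child created with CLONE_SIGHAND changes the disposition of SIGUSR1, and the parent then checks whether its own disposition changed too. A thaw that rewrites signal handlers without first unsharing that state could affect another process in the same way.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)

/* Child: set SIGUSR1 to "ignore". With CLONE_SIGHAND the handler
 * table is shared, so this also changes the parent's disposition. */
static int child_ignores_sigusr1(void *unused)
{
    struct sigaction sa;

    (void)unused;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL) == 0 ? 0 : 1;
}

/* Returns 1 if the child's change is visible in the parent,
 * 0 if it is not, and -1 on any setup error. */
int sighand_is_shared(void)
{
    struct sigaction before, after;
    char *stack;
    pid_t pid;
    int status;

    /* Start from a known state: default action for SIGUSR1. */
    memset(&before, 0, sizeof before);
    before.sa_handler = SIG_DFL;
    sigemptyset(&before.sa_mask);
    if (sigaction(SIGUSR1, &before, NULL) != 0)
        return -1;

    stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* CLONE_SIGHAND requires CLONE_VM on current kernels. The stack
     * is assumed to grow down, so pass the top of the allocation. */
    pid = clone(child_ignores_sigusr1, stack + CHILD_STACK_SIZE,
                CLONE_VM | CLONE_SIGHAND | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, &status, 0) != pid) {
        free(stack);
        return -1;
    }
    free(stack);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* Query (without changing) the parent's current disposition. */
    if (sigaction(SIGUSR1, NULL, &after) != 0)
        return -1;

    return after.sa_handler == SIG_IGN;
}
```

If sighand_is_shared() reports that the change is visible, the parent never called sigaction() to ignore the signal itself; the child did it on the shared table. Dropping CLONE_SIGHAND (and CLONE_VM) from the flags gives the child a private copy instead.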