From: Nikodemus S. <nik...@ra...> - 2008-08-15 09:17:31
On Fri, Aug 15, 2008 at 12:15 AM, Elliott Slaughter <ell...@gm...> wrote:

(Sorry for the slightly disjointed reply -- you probably want to read it in reverse order...)

> Exception Code: 0xc0000005.
> Faulting IP: 0x412cb0.
> page status: 0x10000.
> Was writing: 0, where: 0x3fff8010.
> fatal error encountered in SBCL pid 2304(tid 3948):
> Exception too early in cold init, cannot continue.
> Welcome to LDB, a low-level debugger for the Lisp runtime environment.
> ldb> unknown command: ``;;;''

After you get to this point using Brian's suggestion of --core cold-sbcl.core, things to do:

See if you can get an LDB backtrace. (Command is "ba".) Then, if you have a working GDB on Win32, attach it, and see if the C backtrace leads you anywhere. If it doesn't, look in sbcl.nm to see if you can figure out which C side function in SBCL the IP could be in (if any). Disassembling the area around the faulting IP might give a clue. Constructing a backtrace by hand is also an option -- the sbcl-internals wiki has a short guide.

The instruction pointer seems to be outside the Lisp heap, so the fault occurs either in C code of SBCL, or in some library. The address where the write (or read -- I don't think so?) faulted, on the other hand, is smack in the middle of dynamic space. (See the address space layout in src/compiler/x86/parms.lisp -- search for the string win32 to find the relevant bits.)

So, what occurs is that during a C side call an attempt was made to write to a protected page. However (see handle_exception in win32-os.c), either is_valid_lisp_address didn't return true for the address for some reason, or gencgc_handle_wp_violation() declined to handle it. The first option is just weird. The second can occur, e.g., if gc_init() has not completed setting up the page tables yet.

So, tasks:

* Verify that the faulting address is in the Lisp heap. (I believe so.)

* Verify that the IP is not in Lisp space. (I believe so.)

* Find out what causes the write -- by bisection via printf & /show if necessary, though a backtrace or examining sbcl.nm is likely to be faster.

* When you know what is being written and where, you should be able to figure out which of the following three options is right:

  1. Writing in the wrong address.

  2. Writing in the right address but too early.

  3. Writing in the right address, and everything should be set up -- and it still goes wrong. Then something (e.g. the heap_base pointer) has been corrupted earlier on.

  (There may be other possibilities as well, but these are the ones that spring to mind.)

Cheers,

 -- Nikodemus
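
For the sbcl.nm step above, the lookup can be done mechanically. What follows is only a rough sketch, assuming sbcl.nm holds ordinary nm output with one "<hex address> <type> <name>" entry per line. The symbol names and addresses in SAMPLE are made up for illustration and are not taken from a real SBCL build; real input would come from reading sbcl.nm itself.

```python
import bisect


def load_symbols(lines):
    """Parse nm-style lines ("<hex address> <type> <name>") into a sorted list."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols and blank lines carry no address
        addr, _kind, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue  # not a hex address, ignore the line
    symbols.sort()
    return symbols


def nearest_symbol(symbols, ip):
    """Return (name, offset) for the last symbol at or below ip, else None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, ip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, ip - addr


# Hypothetical sample data, standing in for the contents of sbcl.nm.
SAMPLE = [
    "00412c00 T example_func_a",
    "00412d40 T example_func_b",
    "         U undefined_thing",
]

if __name__ == "__main__":
    syms = load_symbols(SAMPLE)
    # 0x412cb0 is the faulting IP quoted in the error report.
    print(nearest_symbol(syms, 0x412CB0))
```

A match only means the IP lies at or above that symbol's start. If the address is past the end of the runtime image (say, inside a DLL), the nearest symbol is misleading, so a large offset should be treated with suspicion.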