#189 Crash on coLinux start

v0.7.x (release)
open
nobody
5
2012-03-28
2012-03-28
Lorne Sturtevant
No

coLinux is crashing on startup. My machine runs Windows XP SP3. The crash only occurs when using a PAE kernel; if PAE is disabled, it does not crash. I am using the 0.7.9 release and the Debian-6.0.1-squeeze.7z image.

This crash is not consistent. I've created a little script that starts coLinux, waits, then kills it, and repeats this in a loop until the machine crashes. It will eventually crash, but it can take anywhere from as few as 10 tries to as many as 174. I've attached the script as test.sh; I run it under Cygwin. I've also tried this on two separate boxes, both WinXP SP3 without PAE. One box seems to crash sooner than the other, but both eventually do.
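For readers without the attachment, the start/wait/kill loop described above might look roughly like the sketch below. It is not the attached test.sh: the function name `stress_loop` and the variables `DAEMON_CMD`, `WAIT_SECS` and `MAX_RUNS` are hypothetical, and the 30-second wait is an assumed value. The `-d` option and `kill -9` are taken from the later discussion of the script.

```shell
#!/bin/bash
# Sketch of a start/wait/kill stress loop. NOT the attached test.sh.
# DAEMON_CMD, WAIT_SECS and MAX_RUNS are hypothetical knobs for illustration.
DAEMON_CMD=${DAEMON_CMD:-"./colinux-daemon.exe -d @squeeze.conf"}
WAIT_SECS=${WAIT_SECS:-30}   # assumed wait before killing the daemon
MAX_RUNS=${MAX_RUNS:-0}      # 0 means: loop until the host crashes

stress_loop() {
    local count=0 pid
    while [ "$MAX_RUNS" -eq 0 ] || [ "$count" -lt "$MAX_RUNS" ]; do
        count=$((count + 1))
        echo "Executing $count"
        $DAEMON_CMD &                         # word splitting is intended here
        pid=$!
        sleep "$WAIT_SECS"
        kill -9 "$pid" 2>/dev/null || true    # hard kill, as in the report
        wait "$pid" 2>/dev/null || true       # reap the child before the next run
    done
    return 0
}
```

Calling `stress_loop` with the defaults runs until the machine goes down; setting `MAX_RUNS` to a positive number bounds the loop, which is convenient for a dry run with a harmless command in `DAEMON_CMD`.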

I've attached one of the minidump files. From examining the dump files in WinDbg, they all appear the same. The crash is in a thread that's not running linux.sys. It always crashes at nt!MiFreeInPageSupportBlock+0x2a.

Let me know if there's any more testing/information I can gather to track this down.

Discussion

  • test script

     
    Attachments
  • Crash mini dump file

     
    Attachments
  • Henry N.
    2012-03-28

    Hello Lorne,
    The dump file does not help. I do not have the same binary files as you here.
    Please use WinDbg, load the dump and post the text output from "!analyze -v" here.

    The debuggers help page has moved, but the old link is still available:
    http://msdl.microsoft.com/download/symbols/debuggers/dbg_x86_6.7.05.1.exe

     
  • Henry N.
    2012-03-28

    Please post your file "squeeze.conf" here.
    How much memory is installed in your machine, and how much have you configured for coLinux?
    Do you also use the Cygwin shell to start colinux-daemon in normal usage?

     
  • Output of !analyze -v

     
    Attachments
  • coLinux configuration file.

     
    Attachments
  • I've added the files you requested.

    I had previously downloaded all of the symbols from the large symbol pack. When I did that, the stack trace showed the crash at nt!MiFreeInPageSupportBlock+0x2a. This time, I set up the symbol path to download the symbols as needed. Now it shows the crash at nt!KeBugCheck2+0x16. I think that is more accurate. I probably had the wrong symbols before.

    I've tried to use just the Debian package and the 0.7.9 release with nothing modified. That should hopefully make my setup exactly the same as yours.

    As you can see in the config, 128MB of RAM is allocated to coLinux, and Windows itself has 3.5GB of RAM.

    I've run coLinux from Cygwin and from cmd.exe. I get crashes in both.
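The on-demand symbol download mentioned in this post is usually configured through a symbol-server path. A minimal sketch, assuming the debugger is launched from the same shell and that `C:\symbols` is an acceptable local cache folder (both are assumptions, and `symbol_path` is a hypothetical helper name):

```shell
# Hypothetical helper: build a symbol-server path for the Windows debuggers.
# The cache directory is an assumption; any local folder should do.
symbol_path() {
    local cache=${1:-'C:\symbols'}
    # srv*<local cache>*<symbol server URL>
    printf 'srv*%s*https://msdl.microsoft.com/download/symbols\n' "$cache"
}

# The debuggers read this environment variable when no path is set explicitly.
export _NT_SYMBOL_PATH="$(symbol_path)"
```

With a path of this form the debugger fetches only the symbols matching the modules in the dump, which avoids the mismatch that a pre-downloaded symbol pack can cause.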

     
  • Henry N.
    2012-04-08

    The analysis does not give enough output to pinpoint the location in the code.

    I have seen that you stop colinux-daemon with "kill -9" (SIGKILL).
    Does it also crash with normal termination "kill -15" (SIGTERM)?
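One way to try the SIGTERM variant in a test loop is to send "kill -15" first and fall back to "kill -9" only if the process is still alive after a timeout. The helper below is a sketch under assumptions: `stop_process` and its 10-second default are hypothetical, and how Cygwin delivers these signals to a native Windows program such as colinux-daemon.exe is not covered here.

```shell
# Hypothetical helper: ask a child process to exit with SIGTERM and
# escalate to SIGKILL if it is still running after a timeout (in seconds).
stop_process() {
    local pid=$1 timeout=${2:-10} waited=0
    kill -15 "$pid" 2>/dev/null || return 0      # process is already gone
    while kill -0 "$pid" 2>/dev/null; do         # still alive?
        if [ "$waited" -ge "$timeout" ]; then
            echo "SIGTERM ignored after ${timeout}s, sending SIGKILL"
            kill -9 "$pid" 2>/dev/null || true
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid" 2>/dev/null || true              # reap the child if it is ours
    return 0
}
```

In a loop like test.sh this would replace the plain "kill -9" line, for example `stop_process "$pid" 10` with `pid` taken from `$!` after starting the daemon in the background.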

     
  • Henry N.
    2012-04-08

    Please try to run without "kill". Add "init=/sbin/halt" to colinux config, and remove the "kill" from test.sh.

    Additionally, add "export COLINUX_CONSOLE_EXIT_ON_DETACH=1" to test.sh and remove the "-d". This lets you see the boot process and closes the console after every shutdown.

    I also stored the count in a file and installed sync.exe to save it safely.
    (http://technet.microsoft.com/en-en/sysinternals/bb897438)

    It has been running without problems for 100 loops now.

    === test.sh ===
    #!/bin/bash

    export COLINUX_CONSOLE_EXIT_ON_DETACH=1

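    count=0
    while true; do
        ((count++))
        # Record the iteration number and flush it to disk before each start,
        # so the count survives a host crash.
        echo "$count" > /count.txt
        ./sync.exe
        sleep 2
        echo "Executing $count"
        ./colinux-daemon.exe @squeeze.conf
    done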
    === end ===

    === squeeze.conf ===
    kernel=vmlinux
    cobd0="E:\images\rootfs_2gb.img"
    cobd1="E:\images\swap_128mb.img"
    root=/dev/cobd0
    ro
    initrd=initrd.gz
    mem=128
    eth0=slirp
    init=/sbin/halt
    === end ===

     
  • Henry N.
    2012-04-08

    Tested 650 loops with Windows XP SP2, 2048MB PAE (forced with "/PAE" in boot.ini),
    and 200 loops with 5128 PAE, also Windows XP SP2. All without a crash.

     
  • I ran your revised scripts and got the same crash. Running /sbin/halt is a nice way to avoid having to kill the program.

    None of the crashes I've had came from shutting down coLinux. It only crashes when starting it.

    Also, it crashes as soon as colinux-daemon.exe is started. It doesn't even have to boot Linux. For instance, if I change the test.sh script to this:
    ===test.sh===
    #!/bin/bash

    export COLINUX_CONSOLE_EXIT_ON_DETACH=1

    count=0
    while true; do
        ((count++))
        echo "Executing $count"
        echo "$count" > count.txt
        ./sync.exe
        # ./colinux-daemon.exe @squeeze.conf
        ./colinux-daemon.exe kernel=vmlinux
    done
    ===end===

    This still crashes Windows. Here, it's not even properly booting the squeeze image; it's just starting the kernel. I was able to reproduce this crash after 2, 18 and 44 runs through the script.

    One thing I noticed: you mentioned you tested this on Windows XP SP2. I'm running SP3 on this machine. That could make a difference. I think all of the machines I've tested on are SP3 machines. I'll try to get an SP2 machine to test on as well.