#127 Bugcheck in coLinux when starting VM from Virtual PC 2007

v0.7.x (release)
closed-accepted
Henry N.
5
2014-08-15
2008-05-07
No

I sent the message below to Dan Aloni but never got a reply.

I would have filed a bug back then, but there's no obvious link to the SF bug tracker on the colinux.org website.

---------- Forwarded message ----------
From: George V. Reilly <george@reilly.org>
Date: 2008/4/10
Subject: Bugcheck in coLinux when starting VM from VPC2007
To: Dan Aloni <da-x@colinux.org>

I installed andLinux beta 1 rc6 yesterday, on a 4GB quad-core box running x86 Vista SP1. As soon as I launch a virtual machine in Virtual PC 2007, I get a bugcheck in colinux-daemon. This is 100% repeatable.

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

UNEXPECTED_KERNEL_MODE_TRAP (7f)
This means a trap occurred in kernel mode, and it's a trap of a kind
that the kernel isn't allowed to have/catch (bound trap) or that
is always instant death (double fault). The first number in the
bugcheck params is the number of the trap (8 = double fault, etc)
Consult an Intel x86 family manual to learn more about what these
traps are. Here is a *portion* of those codes:
If kv shows a taskGate
use .tss on the part before the colon, then kv.
Else if kv shows a trapframe
use .trap on that value
Else
.trap on the appropriate frame will show where the trap was taken
(on x86, this will be the ebp that goes with the procedure KiTrap)
Endif
kb will then show the corrected stack.
Arguments:
Arg1: 00000008, EXCEPTION_DOUBLE_FAULT
Arg2: 80154000
Arg3: 00000000
Arg4: 00000000

Debugging Details:
------------------

PEB is paged out (Peb.Ldr = 7ffdb00c). Type ".hh dbgerr001" for details

PEB is paged out (Peb.Ldr = 7ffdb00c). Type ".hh dbgerr001" for details

BUGCHECK_STR: 0x7f_8

TSS: 00000028 -- (.tss 0x28)
eax=0000000d ebx=73571250 ecx=883a8ac0 edx=73571280 esi=00000000 edi=73571132
eip=81d0576a esp=73570e6c ebp=73571220 iopl=0 nv up di pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010002
nt!KeBugCheck2+0x1f:
81d0576a 89442424 mov dword ptr [esp+24h],eax ss:0010:73570e90=????????
Resetting default scope

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

PROCESS_NAME: colinux-daemon.

CURRENT_IRQL: 0

LAST_CONTROL_TRANSFER: from 00000000 to 81d0576a

STACK_TEXT:
00000000 81d0576a 00000000 00000000 00000000 nt!KiTrap08+0x75
73571220 00000000 00000000 00000000 00000000 nt!KeBugCheck2+0x1f

STACK_COMMAND: kb

FOLLOWUP_IP:
nt!KiTrap08+75
81c91b9e ebee jmp nt!KiTrap08+0x65 (81c91b8e)

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: nt!KiTrap08+75

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntkrpamp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 47918b12

FAILURE_BUCKET_ID: 0x7f_8_nt!KiTrap08+75

BUCKET_ID: 0x7f_8_nt!KiTrap08+75

Followup: MachineOwner
---------

0: kd> kv
ChildEBP RetAddr Args to Child
00000000 81d0576a 00000000 00000000 00000000 nt!KiTrap08+0x75 (FPO: TSS 28:0)
73571220 00000000 00000000 00000000 00000000 nt!KeBugCheck2+0x1f
0: kd> .tss 0x28
eax=0000000d ebx=73571250 ecx=883a8ac0 edx=73571280 esi=00000000 edi=73571132
eip=81d0576a esp=73570e6c ebp=73571220 iopl=0 nv up di pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010002
nt!KeBugCheck2+0x1f:
81d0576a 89442424 mov dword ptr [esp+24h],eax ss:0010:73570e90=????????
0: kd> kv
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr Args to Child
73571220 00000000 00000000 00000000 00000000 nt!KeBugCheck2+0x1f
0: kd> .ecxr
Unable to get exception context, HRESULT 0x8000FFFF
0: kd> !thread
THREAD 883a8ac0 Cid 0978.09c0 Teb: 7ffde000 Win32Thread: 00000000 RUNNING on processor 0
IRP List:
87e239f8: (0006,0094) Flags: 00060070 Mdl: 00000000
87e69f68: (0006,0094) Flags: 00060900 Mdl: 88dff0a0
Not impersonating
DeviceMap 8ae08808
Owning Process 883855b8 Image: colinux-daemon.exe
Wait Start TickCount 53407 Ticks: 0
Context Switch Count 85610
UserTime 00:00:00.031
KernelTime 00:00:05.397
Win32 Start Address 0x766cd1b9
Stack Init 8b243000 Current 8b2429b8 Base 8b243000 Limit 8b240000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
ChildEBP RetAddr Args to Child
00000000 81d0576a 00000000 00000000 00000000 nt!KiTrap08+0x75 (FPO: TSS 28:0)
73571220 00000000 00000000 00000000 00000000 nt!KeBugCheck2+0x1f

0: kd> !pcr
KPCR for Processor 0 at 81d30800:
Major 1 Minor 1
NtTib.ExceptionList: 81d2abe8
NtTib.StackBase: 00000000
NtTib.StackLimit: 00000000
NtTib.SubSystemTib: 80154000
NtTib.Version: 003cb6e5
NtTib.UserPointer: 00000001
NtTib.SelfTib: 7ffde000

SelfPcr: 81d30800
Prcb: 81d30920
Irql: 0000001f
IRR: 00000000
IDR: ffffffff
InterruptMode: 00000000
IDT: 81afd400
GDT: 81afd000
TSS: 81d2e000

CurrentThread: 883a8ac0
NextThread: 00000000
IdleThread: 81d34640

DpcQueue:

I have a crashdump, compressed down to 57MB at http://www.georgevreilly.com/temp/colinux-daemon.dmp.bz2
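
Two values in the debugger output above can be compared directly: the esp saved in the double-fault TSS (.tss 0x28) and the stack bounds that !thread reports for the running thread. The following is a minimal Python sketch of that comparison, not a diagnosis. The helper names are invented for illustration, and the two input lines are copied from the output above.

```python
import re

# Lines copied from the ".tss 0x28" and "!thread" output in this report.
TSS_LINE = "eip=81d0576a esp=73570e6c ebp=73571220 iopl=0 nv up di pl nz na po nc"
THREAD_LINE = "Stack Init 8b243000 Current 8b2429b8 Base 8b243000 Limit 8b240000 Call 0"


def parse_register(line, name):
    """Return the value of 'name=<hex>' from a WinDbg register line."""
    match = re.search(rf"\b{name}=([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError(f"register {name} not found")
    return int(match.group(1), 16)


def parse_stack_bounds(line):
    """Return (limit, base) from the 'Stack Init ...' line of !thread."""
    match = re.search(r"Base ([0-9a-fA-F]+) Limit ([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError("stack bounds not found")
    base = int(match.group(1), 16)
    limit = int(match.group(2), 16)
    return limit, base


def esp_on_thread_stack(esp, limit, base):
    """The kernel stack grows down from base towards limit."""
    return limit <= esp <= base


if __name__ == "__main__":
    esp = parse_register(TSS_LINE, "esp")
    limit, base = parse_stack_bounds(THREAD_LINE)
    print(f"esp={esp:08x} limit={limit:08x} base={base:08x}")
    print("esp on thread stack:", esp_on_thread_stack(esp, limit, base))
```

With these values the saved esp lies outside the kernel stack range reported for the thread. That is at least consistent with the unreadable memory shown at ss:0010:73570e90, but it does not by itself identify the cause.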

Discussion

  • Henry N.

    Henry N. - 2008-05-09

    Logged In: YES
    user_id=579204
    Originator: NO

    Hello George,

    Dan is not active anymore.

    In the bugcheck I cannot see any information about why it crashes. Sure, I see that the colinux-daemon process is named. If I read it right, this userland task kills the NT kernel? That I cannot understand. In typical bugs, colinux-daemon calls linux.sys, and that is what crashes. Why not here?

    The next question is what role VPC2007 plays here. What is the real host for coLinux, and what is the host for VPC2007?
    Are you running VPC2007 as another task on the same machine and then starting coLinux in parallel?
    Or is coLinux already running when you start VPC2007 in parallel?
    Or are you running one VM inside the other VM?

    Is the coLinux driver installed correctly and running? Please check it with "colinux-daemon --status-driver".

    Can you run colinux-daemon from within the Windows debugger? (I don't know how; please ask Google.)

     
  • George V. Reilly

    Logged In: YES
    user_id=737437
    Originator: YES

    coLinux was installed by andLinux, not directly by me.

    C:\Program Files\andLinux>colinux-daemon --status-driver
    Cooperative Linux Daemon, 0.7.1
    Compiled on Sat Jul 14 12:15:18 2007

    checking if the driver is installed
    current state: 4 (fully initialized)
    current number of monitors: 1
    current linux api version: 10
    current periphery api version: 20

    I'm running both andLinux and VPC 2007 on the same instance of Vista x86 SP1. They're peers. Vista is the host for both VPC and andLinux. andLinux is *not* running inside Virtual PC.

    Did you try downloading http://www.georgevreilly.com/temp/colinux-daemon.dmp.bz2 and examining it inside WinDbg? That contains the kernel's state at the moment it bugchecked. The !analyze output above isn't enough to tell you why it crashed.

    I'm not in a position today to attach a kernel debugger to this machine. It's my main dev machine and I don't want to do things that might bluescreen it. I'll install andLinux on another machine and see if I can repro the issue there.
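
For anyone collecting the same information from several machines, the `--status-driver` output quoted above is regular enough to parse. Below is a small Python sketch that turns the "current ...: N" lines into a dictionary. The function name is hypothetical, the sample text is copied from the comment above, and the only state code whose meaning appears in this thread is 4 (fully initialized).

```python
import re

# Output copied from the comment above.
STATUS_OUTPUT = """\
Cooperative Linux Daemon, 0.7.1
Compiled on Sat Jul 14 12:15:18 2007

checking if the driver is installed
current state: 4 (fully initialized)
current number of monitors: 1
current linux api version: 10
current periphery api version: 20
"""


def parse_status(text):
    """Collect 'current <name>: <number>' lines into a dict keyed by <name>."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"current (.+?): (\d+)", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


if __name__ == "__main__":
    for name, value in parse_status(STATUS_OUTPUT).items():
        print(f"{name} = {value}")
```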

     
  • John Stroy

    John Stroy - 2009-02-04

    This happens in 0.7.3 as well.

    2.6.22.18-co-0.7.3 #1 PREEMPT Sat May 24 22:27:30 UTC 2008 i686 06/17

    I got this to happen on Windows Server 2003 Enterprise Edition with MSVPC2007.

     
  • Technologov

    Technologov - 2009-04-29

    This report looks similar to this:
    https://sourceforge.net/tracker/?func=detail&atid=622063&aid=2760666&group_id=98788

     
  • Nobody/Anonymous

    I am having the same issue. Anyone looking into this? Any chance of this getting fixed? I can help with testing if needed.

     
  • Technologov

    Technologov - 2009-05-17

    This report looks similar to this:
    http://www.virtualbox.org/ticket/3724

    -Technologov

     
  • Henry N.

    Henry N. - 2009-05-20
    • labels: --> Crash / BSOD
    • assigned_to: nobody --> henryn
    • status: open --> open-accepted
     
  • Henry N.

    Henry N. - 2009-05-20

    Based on an idea by Sander Vanleeuwen (http://www.virtualbox.org/ticket/3724#comment:8), I have added a check for running VMX. If the CPU is running in VMX mode, coLinux now aborts the guest system instead of crashing the host. This is in the current development snapshot from SVN revision 1248.

    Sorry that we currently don't have support for this mode. I'm looking into the Intel manuals to find a solution for running in a cooperative way.

    Henry
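
The comment above does not show how the VMX check is implemented. One plausible shape, assuming it inspects the VMXE flag (bit 13) of control register CR4, which the Intel manuals define as the bit that enables VMX operation, is sketched below in Python. Reading CR4 requires ring 0, so the sketch takes the register value as an argument. The function names and the message text are hypothetical and are not taken from the coLinux source.

```python
# CR4.VMXE ("VMX-Enable Bit") is bit 13 of CR4 according to the Intel SDM.
CR4_VMXE = 1 << 13


def vmx_enabled(cr4):
    """Return True if the VMXE bit is set in the given CR4 value."""
    return bool(cr4 & CR4_VMXE)


def may_start_guest(cr4):
    """Hypothetical policy: refuse to start the guest while VMX is enabled.

    Returns (allowed, message). Failing early here stands in for
    "abort the guest instead of crashing the host".
    """
    if vmx_enabled(cr4):
        return False, "VMX is enabled on this CPU (CR4.VMXE set); not starting the guest"
    return True, "ok"


if __name__ == "__main__":
    # Example CR4 values, chosen only for illustration.
    for cr4 in (0x000006F8, 0x000026F8):
        allowed, message = may_start_guest(cr4)
        print(f"cr4={cr4:08x} allowed={allowed} ({message})")
```

Note that a set VMXE bit only says VMX has been enabled by some software on that CPU. Whether another hypervisor is actively using it at that moment is a separate question that this sketch does not answer.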

     
  • Henry N.

    Henry N. - 2009-10-31

    Version 0.7.5 will catch this condition and report an error in coLinux.

     
  • Henry N.

    Henry N. - 2009-10-31
    • status: open-accepted --> closed-accepted