#193 migration crashes destination node with current CVS

Status: closed-fixed
Owner: Roger Tsang
Priority: 5
Updated: 2010-03-13
Created: 2009-12-31
Creator: John Hughes
Private: No

Running the regression tests I get node2 crashing on the first test:

Unable to handle kernel NULL pointer dereference at virtual address 0000000c
printing eip:
c01ae2a2
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: ext3 jbd parport_pc parport floppy uhci_hcd ohci_hcd ehci_hcd i2c_piix4 i2c_core ide_scsi scsi_mod e100 mii ne2k_pci 8390
CPU: 0
EIP: 0060:[<c01ae2a2>] Not tainted VLI
EFLAGS: 00000246 (2.6.11-ssi-686-smp)
EIP is at proc_pid_lookup+0x262/0x330
eax: 00000000 ebx: 00000000 ecx: cf820000 edx: 00010b31
esi: 00010b31 edi: cf9cf360 ebp: ceff9c80 esp: ceff9c5c
ds: 007b es: 007b ss: 0068
Process 1764324-badness (pid: 68401, threadinfo=ceff8000 task=cf866000)
Stack: 00010b31 c04c924c 00000000 00000000 00000000 00000000 c13abe2c ceff9d88
cefe22c0 ceff9ca0 c01ab604 c13abe2c cefe22c0 ceff9d88 cefe22c0 fffffff4
c13abea0 ceff9cc4 c017f7cb c13abe2c cefe22c0 ceff9d88 c13abe2c 00000000
Call Trace:
[<c0106a5f>] show_stack+0x7f/0xa0
[<c0106c04>] show_registers+0x164/0x230
[<c0106fb4>] die+0xf4/0x1c0
[<c011f3ed>] do_page_fault+0x46d/0x669
[<c01066c3>] error_code+0x2b/0x30
[<c01ab604>] proc_root_lookup+0x44/0x70
[<c017f7cb>] real_lookup+0xfb/0x140
[<c017fbac>] do_lookup+0x8c/0xa0
[<c017fd50>] link_path_walk+0x190/0xde0
[<c01809bb>] path_walk+0x1b/0x20
[<c026247c>] reop_import_path+0x36c/0x470
[<c0262902>] reop_make_file+0x72/0x140
[<c0289b11>] rmtfb_getcli_id+0x1f1/0x320
[<c0262a90>] reop_import_file+0xc0/0x130
[<c025ff5a>] reopen_unload_msg+0x15a/0x280
[<c026053d>] common_data_unload_msg+0x34d/0x4a0
[<c0260f45>] migrate_pproc_unload_msg+0x85/0x440
[<c025ed8e>] migrate_server+0xee/0x961
[<c025dc32>] migrate_server_setup+0x62/0x120
[<c01023a5>] kernel_thread_helper+0x5/0x10
Code: 89 04 24 e8 71 e1 fd ff 8b 55 0c 89 14 24 e8 86 e8 fd ff 31 c0 e9 44 ff ff ff b8 00 e0 ff ff 21 e0 8b 00 31 db 8b 80 94 00 00 00 <8b> 40 0c 66 83 b8 1c 01 00 00 01 74 1a 89 f0 c1 e8 10 85 c0 0f

Entering kdb (current=0xcf866000, pid 68401) on processor 0 Oops: Oops
due to oops @ 0xc01ae2a2
eax = 0x00000000 ebx = 0x00000000 ecx = 0xcf820000 edx = 0x00010b31
esi = 0x00010b31 edi = 0xcf9cf360 esp = 0xceff9c5c eip = 0xc01ae2a2
ebp = 0xceff9c80 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00000246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xceff9c28
[0]kdb>
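
As a sanity check on the oops above, the faulting instruction can be decoded by hand from the `Code:` line, where the byte in angle brackets marks the EIP. The sketch below is only illustrative: it assumes ordinary 32-bit x86 encoding, handles just the three-byte `8b /r disp8` form of `mov`, and the helper names `faulting_bytes` and `decode_mov_disp8` are hypothetical, not part of any kernel tool. It shows that the instruction reads from offset 0xc of `%eax`, which, with `eax` equal to 0 in the register dump, agrees with the reported fault address 0000000c. It does not by itself say which pointer in `proc_pid_lookup` was NULL.

```python
# Values copied from the oops report; nothing here is measured or new.
FAULT_ADDR = 0x0000000C
EAX = 0x00000000
CODE = (
    "89 04 24 e8 71 e1 fd ff 8b 55 0c 89 14 24 e8 86 e8 fd ff 31 c0 e9 44 "
    "ff ff ff b8 00 e0 ff ff 21 e0 8b 00 31 db 8b 80 94 00 00 00 <8b> 40 0c "
    "66 83 b8 1c 01 00 00 01 74 1a 89 f0 c1 e8 10 85 c0 0f"
)

REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line, length=3):
    """Return `length` bytes starting at the <xx> token (the faulting EIP)."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:start + length]]


def decode_mov_disp8(insn):
    """Decode only `8b /r disp8`, i.e. mov disp8(%base), %dest."""
    opcode, modrm, disp = insn
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111
    # mod == 01 means an 8-bit displacement; rm == 100 would need a SIB byte.
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("not a simple mov with an 8-bit displacement")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS[reg], REGS[rm], disp


dest, base, disp = decode_mov_disp8(faulting_bytes(CODE))
print(f"mov {disp:#x}(%{base}), %{dest}")
print("matches fault address:", EAX + disp == FAULT_ADDR)
```

Run on the bytes above, this prints `mov 0xc(%eax), %eax` and confirms that `eax + 0xc` equals the faulting address, so the crash is a read through a NULL structure pointer rather than a wild address.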

Related

Bugs: #1

Discussion

  • John Hughes
    2009-12-31

    • summary: test "badness in sk_del_node_init" fails with current CVS --> migration crashes destination node with current CVS
     
  • John Hughes
    2009-12-31

    In fact it seems like more or less any migration attempt is killing the system:

     
  • John Hughes
    2009-12-31

    I meant to add:

    sh -c 'echo 2 >/proc/self/goto; echo hi! >/dev/console'

     
  • Roger Tsang
    2009-12-31

    I was able to reproduce this bug.
    Until this is fixed, the workaround is `migrate 2 $$`.

     
  • Roger Tsang
    2009-12-31

    • labels: --> Process Management
     
  • Roger Tsang
    2009-12-31

    bug fix

     
    Attachments
  • Roger Tsang
    2009-12-31

    • assigned_to: nobody --> rogertsang
    • status: open --> open-accepted
     
  • Roger Tsang
    2009-12-31

    Try attached patch.

     
  • John Hughes
    2010-01-01

    Ok, that patch fixes it for me

     
  • Roger Tsang
    2010-02-02

    • status: open-accepted --> open-fixed
     