Bugs item #2719752, was opened at 2009-03-28 18:12
Message generated for change (Settings changed) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=2719752&group_id=32541
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Process Management
Group: v2.0.0pre2
>Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Roger Tsang (rogertsang)
Assigned to: Roger Tsang (rogertsang)
Summary: stuck while tracing ssi-ksync
Initial Comment:
From John Hughes (March 16):
# strace -f -o zz /sbin/ssi-ksync
...
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
...
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2009-10-26 23:48
Message:
checked-in
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2009-07-04 16:54
Message:
This pvpop_rmv_child_from_parent() call was made remotely from the
__ptrace_unlink() path and is stuck waiting for a lock held by the strace
process, which is itself waiting for the remote child to get reaped.
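
The circular wait described above can be sketched as a small wait-for graph.
This is only an illustration, not the OpenSSI code: the node names are
descriptive labels, find_cycle() is a hypothetical helper, and the last edge
(the child cannot be reaped until the unlink path finishes) is an assumption
read into the comment rather than something it states outright.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle reached, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # the chain ended, so nothing waits on itself
    return path[path.index(node):]


# Each key is blocked until its value makes progress.
waits_for = {
    "remote pvpop_rmv_child_from_parent": "parent lock",
    "parent lock": "strace process",  # the lock is held by strace
    "strace process": "remote child reaped",
    # Assumption: reaping needs the __ptrace_unlink() path to complete first.
    "remote child reaped": "remote pvpop_rmv_child_from_parent",
}

cycle = find_cycle(waits_for, "strace process")
print(" -> ".join(cycle + [cycle[0]]))
```

Under that assumption every participant waits on another one in the loop,
which would match the repeated "waiting for parent lock" messages.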
----------------------------------------------------------------------
Comment By: Roger Tsang (rogertsang)
Date: 2009-03-28 18:16
Message:
From John Hughes (March 16):
Further investigation shows that the problem is happening in
mkdhcpd.conf:
# strace -f -o zz /sbin/mkdhcpd.conf
pvpop_rmv_child_from_parent: waiting for parent lock xxxxx
If I power off the node that's waiting for the parent lock, the other
node crashes:
Taking over master from node 2.
Node 2 has gone down!!!
passed the first scan in ipcname_pull_data
num_objects[MSG] = 0
num_objects[SEM] = 0
num_objects[SHM] = 0
ipcnameserver ready completed
write handler down off 2400000 len 4096
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
fsck 1.37 (21-Mar-2005)
EXT3 FS on hdb1, internal journal
/etc/init.d/rc.sysrecover running
ptrace_unlink: vpop_reclaim failed
Unable to handle kernel NULL pointer dereference at virtual address 00000008
printing eip:
c0241911
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: parport_pc parport floppy uhci_hcd ohci_hcd
ehci_hcd i2c_piix4 i2c_core ide_scsi scsi_mod ext3 jbd ne2k_pci 8390
CPU: 0
EIP: 0060:[<c0241911>] Not tainted VLI
EFLAGS: 00000282 (2.6.11-ssi-686-smp)
EIP is at pvpop_reassign_child+0x281/0x920
eax: 00000000 ebx: 0002074b ecx: 00000000 edx: cea561f0
esi: 0002074c edi: cf6ae000 ebp: cf6afebc esp: cf6afe1c
ds: 007b es: 007b ss: 0068
Process VPROC Slave Dae (pid: 67247, threadinfo=cf6ae000 task=cf980b10)
Stack: 0002074b 00000001 c04cffc1 00000001 cf6afe4c c01540a2 00000014
00000000 00000000 cf532040 cf532000 00000000 cea561f0 c0155036
cf6afe78 c025c57f 00000001 000008d0 cf9ef700 00000010 cea56cc0
00000001 00000001 cf6afe94
Call Trace:
[<c010679f>] show_stack+0x7f/0xa0
[<c0106944>] show_registers+0x164/0x230
[<c0106cf4>] die+0xf4/0x1c0
[<c011f39d>] do_page_fault+0x46d/0x669
[<c0106403>] error_code+0x2b/0x30
[<c025965a>] vproc_child_lost_parent+0x7a/0x160
[<c0259ead>] vproc_lost_relations+0xad/0x218
[<c024d554>] pvpsop_lost_relations+0x94/0xa0
[<c025956e>] vproc_carelist_cleanup+0x7e/0xf0
[<c0256933>] vproc_origin_node_cleanup+0x63/0x2b0
[<c0256dde>] vproc_origin_cleanup+0xbe/0x190
[<c0257fc1>] vproc_slave_daemon+0x1a1/0x260
[<c01023a5>] kernel_thread_helper+0x5/0x10
Code: 8b 4d 8c 89 0c 24 e8 6f a9 01 00 ba 01 00 00 00 89 54 24 04 8b 55 90
8b 42 2c 89 04 24 e8 88 a6 01 00 89 45 8c 8b 4d 8c 8b 55 90 <8b> 41 08 39
42 28 0f 84 12 03 00 00 8b 85 7c ff ff ff 85 c0 75
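
As a cross-check of the fault address, the instruction marked <8b> in the
Code: line can be decoded by hand. The sketch below is not a general
disassembler: decode_faulting_mov() is a hypothetical helper that only
handles the one encoding seen here, a 32-bit "mov reg, [reg+disp8]".

```python
CODE = (
    "8b 4d 8c 89 0c 24 e8 6f a9 01 00 ba 01 00 00 00 89 54 24 04 8b 55 90 "
    "8b 42 2c 89 04 24 e8 88 a6 01 00 89 45 8c 8b 4d 8c 8b 55 90 <8b> 41 08 39 "
    "42 28 0f 84 12 03 00 00 8b 85 7c ff ff ff 85 c0 75"
)

# 32-bit register numbering used by the x86 ModRM byte.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_faulting_mov(code):
    """Return (dest_reg, base_reg, disp) for the instruction marked <..>."""
    tokens = code.split()
    i = next(k for k, t in enumerate(tokens) if t.startswith("<"))
    opcode = int(tokens[i].strip("<>"), 16)
    modrm = int(tokens[i + 1], 16)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # Only opcode 0x8b with mod=01 (8-bit displacement) and no SIB byte.
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("only 'mov r32, [r32+disp8]' is handled")
    disp = int(tokens[i + 2], 16)
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS32[reg], REGS32[rm], disp


dest, base, disp = decode_faulting_mov(CODE)
print(f"mov {dest}, [{base}{disp:+#x}]")
```

The register dump shows ecx: 00000000, so a load from ecx+8 reads address
0x00000008, which agrees with the reported NULL pointer dereference: a
field at offset 8 is being read through a NULL pointer.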
----------------------------------------------------------------------