Thread: [SSI-devel] [ ssic-linux-Bugs-2719752 ] stuck while tracing ssi-ksync
Brought to you by:
brucewalker,
rogertsang
From: SourceForge.net <no...@so...> - 2009-03-28 22:12:50
Bugs item #2719752, was opened at 2009-03-28 18:12
Message generated for change (Tracker Item Submitted) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=2719752&group_id=32541

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Process Management
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Roger Tsang (rogertsang)
Assigned to: Nobody/Anonymous (nobody)
Summary: stuck while tracing ssi-ksync

Initial Comment:
From John Hughes (March 16):

# strace -f -o zz /sbin/ssi-ksync
...
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
pvpop_rmv_child_from_parent: waiting for parent lock 68272
...

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2009-03-28 22:16:05
Message generated for change (Comment added) made by rogertsang

>Comment By: Roger Tsang (rogertsang)
Date: 2009-03-28 18:16

Message:
From John Hughes (March 16):

Further investigation shows that the problem is happening in mkdhcpd.conf:

# strace -f -o zz /sbin/mkdhcpd.conf
pvpop_rmv_child_from_parent: waiting for parent lock xxxxx

If I power-off the node that's waiting for the parent lock then the other node crashes:

Taking over master from node 2.
Node 2 has gone down!!!
passed the first scan in ipcname_pull_data
num_objects[MSG] = 0
num_objects[SEM] = 0
num_objects[SHM] = 0
ipcnameserver ready
completed write handler down off 2400000 len 4096
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
fsck 1.37 (21-Mar-2005)
EXT3 FS on hdb1, internal journal
/etc/init.d/rc.sysrecover running
ptrace_unlink: vpop_reclaim failed
Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
c0241911
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: parport_pc parport floppy uhci_hcd ohci_hcd ehci_hcd i2c_piix4 i2c_core ide_scsi scsi_mod ext3 jbd ne2k_pci 8390
CPU: 0
EIP: 0060:[<c0241911>] Not tainted VLI
EFLAGS: 00000282 (2.6.11-ssi-686-smp)
EIP is at pvpop_reassign_child+0x281/0x920
eax: 00000000 ebx: 0002074b ecx: 00000000 edx: cea561f0
esi: 0002074c edi: cf6ae000 ebp: cf6afebc esp: cf6afe1c
ds: 007b es: 007b ss: 0068
Process VPROC Slave Dae (pid: 67247, threadinfo=cf6ae000 task=cf980b10)
Stack: 0002074b 00000001 c04cffc1 00000001 cf6afe4c c01540a2 00000014 00000000
       00000000 cf532040 cf532000 00000000 cea561f0 c0155036 cf6afe78 c025c57f
       00000001 000008d0 cf9ef700 00000010 cea56cc0 00000001 00000001 cf6afe94
Call Trace:
 [<c010679f>] show_stack+0x7f/0xa0
 [<c0106944>] show_registers+0x164/0x230
 [<c0106cf4>] die+0xf4/0x1c0
 [<c011f39d>] do_page_fault+0x46d/0x669
 [<c0106403>] error_code+0x2b/0x30
 [<c025965a>] vproc_child_lost_parent+0x7a/0x160
 [<c0259ead>] vproc_lost_relations+0xad/0x218
 [<c024d554>] pvpsop_lost_relations+0x94/0xa0
 [<c025956e>] vproc_carelist_cleanup+0x7e/0xf0
 [<c0256933>] vproc_origin_node_cleanup+0x63/0x2b0
 [<c0256dde>] vproc_origin_cleanup+0xbe/0x190
 [<c0257fc1>] vproc_slave_daemon+0x1a1/0x260
 [<c01023a5>] kernel_thread_helper+0x5/0x10
Code: 8b 4d 8c 89 0c 24 e8 6f a9 01 00 ba 01 00 00 00 89 54 24 04 8b 55 90 8b 42 2c 89 04 24 e8 88 a6 01 00 89 45 8c 8b 4d 8c 8b 55 90 <8b> 41 08 39 42 28 0f 84 12 03 00 00 8b 85 7c ff ff ff 85 c0 75

----------------------------------------------------------------------
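
Editor's note on reading the oops: the bytes marked `<8b> 41 08` in the Code: line are the instruction that faulted. The sketch below decodes just that one x86 encoding (`mov r32, [r32 + disp8]`) and combines it with the `ecx` value from the register dump, to show why the reported fault address is 00000008. The helper name `decode_mov_r32_rm32_disp8` is made up for this illustration, and which structure field sits at offset 8 is not something the thread states.

```python
# Sketch: decode the faulting instruction bytes "<8b> 41 08" from the oops.
# Only the single encoding needed here is handled; this is not a disassembler.

REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_r32_rm32_disp8(code):
    """Decode a 3-byte `8b /r disp8` instruction: mov r32, [r32 + disp8]."""
    opcode, modrm, disp = code
    if opcode != 0x8B:
        raise ValueError("only opcode 0x8b (mov r32, r/m32) is handled")
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # destination register
    rm = modrm & 7            # base register
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the [reg + disp8] form without SIB is handled")
    if disp >= 0x80:          # disp8 is sign-extended
        disp -= 0x100
    return REGS32[reg], REGS32[rm], disp


# Register values copied from the oops above.
registers = {
    "eax": 0x00000000, "ebx": 0x0002074B,
    "ecx": 0x00000000, "edx": 0xCEA561F0,
}

dest, base, disp = decode_mov_r32_rm32_disp8(bytes.fromhex("8b4108"))
fault_address = (registers[base] + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dest} -> fault at {fault_address:08x}")
# prints: mov 0x8(%ecx),%eax -> fault at 00000008
```

In other words, `ecx` held a NULL pointer and the code read a member at offset 8 from it, which matches "NULL pointer dereference at virtual address 00000008" in pvpop_reassign_child.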
From: SourceForge.net <no...@so...> - 2009-07-04 20:54:54
Message generated for change (Comment added) made by rogertsang

>Group: v2.0.0pre2
Status: Open
>Resolution: Accepted
>Assigned to: Roger Tsang (rogertsang)

>Comment By: Roger Tsang (rogertsang)
Date: 2009-07-04 16:54

Message:
This pvpop_rmv_child_from_parent() call was made remotely from the
__ptrace_unlink() path and is stuck waiting for a lock held by the strace
process, which is itself waiting for the remote child to be reaped.

----------------------------------------------------------------------
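
Editor's note: the hang described in the 2009-07-04 comment can be pictured as a circular wait. The sketch below is only a toy wait-for graph, not OpenSSI code. The node labels are invented for illustration, and the third edge (that the child cannot be reaped until the unlink completes) is an assumption added to close the loop; the comment itself only states the first two waits.

```python
# Toy wait-for graph for the hang in this bug. Labels are illustrative only.

def find_cycle(waits_for):
    """Return the nodes of a cycle in a wait-for graph, or None.

    `waits_for` maps each waiter to the single thing it is blocked on.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path          # walked back to where we began
            if node in seen:
                break                # cycle elsewhere; try another start
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None


waits_for = {
    # Stated in the comment: the remote unlink waits for the parent lock,
    # which the strace process holds.
    "pvpop_rmv_child_from_parent (via __ptrace_unlink)": "strace process",
    # Stated in the comment: strace waits for the remote child to be reaped.
    "strace process": "remote child reaped",
    # ASSUMPTION (not in the thread): reaping needs the unlink to finish.
    "remote child reaped": "pvpop_rmv_child_from_parent (via __ptrace_unlink)",
}

cycle = find_cycle(waits_for)
print(" -> ".join(cycle + [cycle[0]]) if cycle else "no circular wait")
```

Under that assumption nothing in the loop can make progress, which is consistent with the repeated "waiting for parent lock" messages; removing any one edge makes `find_cycle` return None.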
From: SourceForge.net <no...@so...> - 2009-10-27 03:48:08
Message generated for change (Settings changed) made by rogertsang

Group: v2.0.0pre2
Status: Open
>Resolution: Fixed
Assigned to: Roger Tsang (rogertsang)

>Comment By: Roger Tsang (rogertsang)
Date: 2009-10-26 23:48

Message:
checked-in

----------------------------------------------------------------------
From: SourceForge.net <no...@so...> - 2010-03-13 19:58:12
Message generated for change (Settings changed) made by rogertsang

Group: v2.0.0pre2
>Status: Closed
Resolution: Fixed
Assigned to: Roger Tsang (rogertsang)