Thread: [SSI-devel] [ ssic-linux-Bugs-1938520 ] onnode 2 ls /proc/$$/task/1 causes oops
From: SourceForge.net <no...@so...> - 2008-04-09 11:22:32
Bugs item #1938520, was opened at 2008-04-09 13:21
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1938520&group_id=32541

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: Process Management
Group: v1.9.3
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: onnode 2 ls /proc/$$/task/1 causes oops

Initial Comment:
On node 1 run:

	onnode 2 ls /proc/$$/task/1

And node 2 oopses in proc_task_lookup.

If any task other than "1" is used it works ok.

Here's what the oops looks like:

Oops: 0000 [#1]
SMP
Modules linked in: ext3 jbd parport_pc parport floppy uhci_hcd ohci_hcd ehci_hcd ide_scsi scsi_mod i2c_piix4 i2c_core ne2k_pci 8390
CPU:    0
EIP:    0060:[<c01a8714>]    Not tainted VLI
EFLAGS: 00000246   (2.6.11-ssi-686-smp)
EIP is at proc_task_lookup+0xf4/0x210
eax: 00000000   ebx: 00000000   ecx: cf06d500   edx: ce55ae10
esi: ce55ae84   edi: ce55ae00   ebp: cdcdee18   esp: cdcdede0
ds: 007b   es: 007b   ss: 0068
Process ls (pid: 68605, threadinfo=cdcde000 task=cf12d390)
Stack: ce55ae00 00000000 00000000 00000000 00000000 00000000 cdcdee08 cdcde000
       cf02ba00 00000001 00000001 fffffff4 cdc849f4 cdc84a6c cdcdee3c c017c0fc
       cdc849f4 cdc83680 cdcdef10 cdc83680 00000000 cdcdef10 c13b4f80 cdcdee5c
Call Trace:
 [<c010694f>] show_stack+0x7f/0xa0
 [<c0106b04>] show_registers+0x164/0x220
 [<c0106e94>] die+0xf4/0x1c0
 [<c011f1b5>] do_page_fault+0x375/0x695
 [<c01065a3>] error_code+0x2b/0x30
 [<c017c0fc>] real_lookup+0xec/0x120
 [<c017c4d6>] do_lookup+0x86/0xa0
 [<c017cba8>] link_path_walk+0x6b8/0xd60
 [<c017d51d>] path_lookup+0x9d/0x1b0
 [<c017d7df>] __user_walk+0x3f/0x80
 [<c01773eb>] vfs_lstat+0x1b/0x60
 [<c0177b5b>] sys_lstat64+0x1b/0x40
 [<c0105a3b>] syscall_call+0x7/0xb
Code: c0 89 44 24 08 31 c0 89 54 24 14 89 44 24 04 e8 f3 6c 08 00 85 c0 75 0b 8b 55 e8 8b 45 f0 39 42 04 74 34 89 3c 24 e8 cc 53 09 00 <f0> ff 4b 08 0f 94 c0 84 c0 75 11 b8 fe ff ff ff 83 c4 2c 5b 5e

Entering kdb (current=0xcf12d390, pid 68605) on processor 0 Oops: Oops
due to oops @ 0xc01a8714
eax = 0x00000000 ebx = 0x00000000 ecx = 0xcf06d500 edx = 0xce55ae10
esi = 0xce55ae84 edi = 0xce55ae00 esp = 0xcdcdede0 eip = 0xc01a8714
ebp = 0xcdcdee18 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00000246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcdcdedac
[0]kdb> bt
Stack traceback for pid 68605
0xcf12d390    68605    68485  1    0   R  0xcf12d570 *ls
EBP        EIP        Function (args)
0xcdcdee18 0xc01a8714 proc_task_lookup+0xf4 (0x0, 0xcf06d500, 0xce55ae10, 0xce55ae84, 0xce55ae00)
           0xc01065a3 error_code+0x2b
Interrupt registers:
eax = 0x00000000 ebx = 0x00000000 ecx = 0xcf06d500 edx = 0xce55ae10
esi = 0xce55ae84 edi = 0xce55ae00 esp = 0xcdcdede0 eip = 0xc01a8714
ebp = 0xcdcdee18 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00000246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcdcdedac
           0x00000246 <unknown>+0x246
           0xce55ae00 <unknown>
[0]kdb>
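
A note on reading the "Code:" line in the oops above: the byte in angle brackets, <f0>, is the one EIP points at. Decoded by hand, the bytes f0 ff 4b 08 look like a LOCK prefix, opcode ff with ModRM reg field 1 (DEC r/m32), and a one-byte displacement of 8 off ebx, i.e. "lock decl 0x8(%ebx)". Since ebx is 00000000 in the register dump, that would be a decrement at address 0x8, which is what one would expect from decrementing a counter inside a structure reached through a NULL pointer. The sketch below only illustrates the ModRM arithmetic behind that reading; the struct and function names are invented for this note and are not part of the kernel or kdb.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

/* Hypothetical helper: split a ModRM byte into its three fields. */
struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* 32-bit register names in encoding order (index = rm field when no SIB byte is used). */
const char *const reg32_names[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/*
 * Effective address for the mod == 1 form, [base + disp8], where the
 * 8-bit displacement is sign-extended before being added to the base.
 */
uint32_t effective_address_disp8(uint32_t base, int8_t disp8)
{
    return base + (uint32_t)(int32_t)disp8;
}
```

For the byte 0x4b this gives mod = 1 (8-bit displacement follows), reg = 1 (the /1 opcode extension, DEC for opcode ff) and rm = 3 (ebx), so with ebx = 0 and displacement 0x08 the access lands at address 8.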
From: SourceForge.net <no...@so...> - 2008-04-09 13:07:24
Bugs item #1938520, was opened at 2008-04-09 13:21
Message generated for change (Comment added) made by hughesj
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1938520&group_id=32541

Category: Process Management
Group: v1.9.3
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: onnode 2 ls /proc/$$/task/1 causes oops

----------------------------------------------------------------------

>Comment By: John Hughes (hughesj)
Date: 2008-04-09 15:07

Message:
Logged In: YES
user_id=166336
Originator: YES

Ok, here's what happens:

Any attempt to do stat ("/proc/pid1/task/pid2") where pid2 is not pid1
and pid2 is not on the node doing the stat causes the oops.

In proc_task_lookup we have:

	vp = LOCATE_VPROC_PID(tid, "temp");
	if (!vp)
		goto out;
	VPROC_LOCK_EXCL(vp, "proc_task_lookup");
	task = PVP(vp)->pvp_pproc;
	if (task)
		get_task_struct(task);
	VPROC_UNLOCK_EXCL(vp, "proc_task_lookup");

	error = PVPOP_PROCFS_GETATTR(vp, 0, 0, NULL, NULL, NULL, &tgid);
	/* leader's pid should be this vproc tgid */
	if ( error || leader->vp_pid != tgid ) {
		VPROC_RELE(vp, "proc_task_lookup");
		goto out_drop_task;
	}

This jumps to out_drop_task if the pid is not a member of the thread
group. But if the pid is not on this node, "task" will be NULL, so we
end up calling put_task_struct(NULL).

The fix is:

Index: fs/proc/base.c
===================================================================
RCS file: /usr/local/lib/cvs-repo/openssi-kernel/fs/proc/base.c,v
retrieving revision 1.1.1.1.2.1.2.2.4.2
diff -u -r1.1.1.1.2.1.2.2.4.2 base.c
--- fs/proc/base.c	9 Apr 2008 10:10:55 -0000	1.1.1.1.2.1.2.2.4.2
+++ fs/proc/base.c	9 Apr 2008 13:02:45 -0000
@@ -2485,7 +2485,10 @@
 	/* leader's pid should be this vproc tgid */
 	if ( error || leader->vp_pid != tgid ) {
 		VPROC_RELE(vp, "proc_task_lookup");
-		goto out_drop_task;
+		if (task)
+			goto out_drop_task;
+		else
+			goto out;
 	}
 
 	if( task == NULL ) { /* remote process */
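
To make the failure mode in the comment above concrete, here is a minimal user-space sketch of the same goto-cleanup pattern. It is an illustration only, not the OpenSSI code: struct task, get_task(), put_task() and lookup_model() are hypothetical stand-ins for task_struct, get_task_struct(), put_task_struct() and proc_task_lookup(), and the tgid_matches flag stands in for the "leader->vp_pid == tgid" test. The point it shows is that a cleanup label which drops a reference may only be reached on paths where that reference was actually taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: just a reference count. */
struct task {
    int usage;
};

static void get_task(struct task *t)
{
    t->usage++;
}

/*
 * Like the put operation described in the thread, this dereferences its
 * argument unconditionally, so calling it with NULL is a bug.
 */
static void put_task(struct task *t)
{
    assert(t != NULL);
    t->usage--;
}

/*
 * Model of the lookup's control flow. `task` is NULL when the thread is
 * not on the local node. Returns 0 on success and -1 on failure.
 */
int lookup_model(struct task *task, int tgid_matches)
{
    int result = -1;

    if (task)
        get_task(task);         /* reference taken only for a local task */

    if (!tgid_matches) {
        if (task)
            goto out_drop_task; /* local task: release what we took */
        goto out;               /* remote task: nothing to release */
    }

    result = 0;
    if (!task)
        goto out;               /* remote task on the success path */

out_drop_task:
    put_task(task);
out:
    return result;
}
```

Without the "if (task)" guard in the mismatch branch, the call lookup_model(NULL, 0) would reach put_task(NULL), which is the user-space analogue of the oops reported here.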
From: SourceForge.net <no...@so...> - 2008-04-09 23:20:21
Bugs item #1938520, was opened at 2008-04-09 07:21
Message generated for change (Comment added) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1938520&group_id=32541

Category: Process Management
Group: v1.9.3
Status: Open
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: onnode 2 ls /proc/$$/task/1 causes oops

----------------------------------------------------------------------

>Comment By: Roger Tsang (rogertsang)
Date: 2008-04-09 19:20

Message:
Logged In: YES
user_id=1246761
Originator: NO

Looks good. The following will be included in 2.0.0pre3.

--- linux.orig/fs/proc/base.c
+++ linux/fs/proc/base.c
@@ -2377,6 +2377,15 @@ static struct dentry *proc_task_lookup(s
 	/* leader's pid should be this vproc tgid */
 	if ( error || leader->vp_pid != tgid ) {
 		VPROC_RELE(vp, "proc_task_lookup");
+#ifdef PROC_TASK_LOOKUP_FIX
+		/* [ ssic-linux-Bugs-1938520 ]
+		 * Any attempt to do stat ("/proc/pid1/task/pid2") where pid2
+		 * is not pid1 and pid2 is not on the node doing the stat
+		 * causes the oops.  -hughesj
+		 */
+		if (!task)
+			goto out;
+#endif
 		goto out_drop_task;
 	}
From: SourceForge.net <no...@so...> - 2008-05-23 23:47:34
Bugs item #1938520, was opened at 2008-04-09 07:21
Message generated for change (Settings changed) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1938520&group_id=32541

Category: Process Management
Group: v1.9.3
>Status: Closed
Resolution: Accepted
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: onnode 2 ls /proc/$$/task/1 causes oops
From: SourceForge.net <no...@so...> - 2008-05-24 00:21:43
Bugs item #1938520, was opened at 2008-04-09 07:21
Message generated for change (Settings changed) made by rogertsang
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=405834&aid=1938520&group_id=32541

Category: Process Management
Group: v1.9.3
Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: John Hughes (hughesj)
Assigned to: Nobody/Anonymous (nobody)
Summary: onnode 2 ls /proc/$$/task/1 causes oops