#153 onnode 2 ls /proc/$$/task/1 causes oops

Milestone: v1.9.3
Status: closed-fixed
Owner: nobody
Priority: 5
Updated: 2008-05-24
Created: 2008-04-09
Creator: John Hughes
Private: No

On node 1 run:

onnode 2 ls /proc/$$/task/1

And node 2 oopses in proc_task_lookup.

If any task other than "1" is used it works ok.

Here's what the oops looks like:

Oops: 0000 [#1]
SMP
Modules linked in: ext3 jbd parport_pc parport floppy uhci_hcd ohci_hcd ehci_hcd ide_scsi scsi_mod i2c_piix4 i2c_core ne2k_pci 8390
CPU: 0
EIP: 0060:[<c01a8714>] Not tainted VLI
EFLAGS: 00000246 (2.6.11-ssi-686-smp)
EIP is at proc_task_lookup+0xf4/0x210
eax: 00000000 ebx: 00000000 ecx: cf06d500 edx: ce55ae10
esi: ce55ae84 edi: ce55ae00 ebp: cdcdee18 esp: cdcdede0
ds: 007b es: 007b ss: 0068
Process ls (pid: 68605, threadinfo=cdcde000 task=cf12d390)
Stack: ce55ae00 00000000 00000000 00000000 00000000 00000000 cdcdee08 cdcde000
cf02ba00 00000001 00000001 fffffff4 cdc849f4 cdc84a6c cdcdee3c c017c0fc
cdc849f4 cdc83680 cdcdef10 cdc83680 00000000 cdcdef10 c13b4f80 cdcdee5c
Call Trace:
[<c010694f>] show_stack+0x7f/0xa0
[<c0106b04>] show_registers+0x164/0x220
[<c0106e94>] die+0xf4/0x1c0
[<c011f1b5>] do_page_fault+0x375/0x695
[<c01065a3>] error_code+0x2b/0x30
[<c017c0fc>] real_lookup+0xec/0x120
[<c017c4d6>] do_lookup+0x86/0xa0
[<c017cba8>] link_path_walk+0x6b8/0xd60
[<c017d51d>] path_lookup+0x9d/0x1b0
[<c017d7df>] __user_walk+0x3f/0x80
[<c01773eb>] vfs_lstat+0x1b/0x60
[<c0177b5b>] sys_lstat64+0x1b/0x40
[<c0105a3b>] syscall_call+0x7/0xb
Code: c0 89 44 24 08 31 c0 89 54 24 14 89 44 24 04 e8 f3 6c 08 00 85 c0 75 0b 8b 55 e8 8b 45 f0 39 42 04 74 34 89 3c 24 e8 cc 53 09 00 <f0> ff 4b 08 0f 94 c0 84 c0 75 11 b8 fe ff ff ff 83 c4 2c 5b 5e

Entering kdb (current=0xcf12d390, pid 68605) on processor 0 Oops: Oops
due to oops @ 0xc01a8714
eax = 0x00000000 ebx = 0x00000000 ecx = 0xcf06d500 edx = 0xce55ae10
esi = 0xce55ae84 edi = 0xce55ae00 esp = 0xcdcdede0 eip = 0xc01a8714
ebp = 0xcdcdee18 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00000246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcdcdedac
[0]kdb> bt
Stack traceback for pid 68605
0xcf12d390 68605 68485 1 0 R 0xcf12d570 *ls
EBP EIP Function (args)
0xcdcdee18 0xc01a8714 proc_task_lookup+0xf4 (0x0, 0xcf06d500, 0xce55ae10, 0xce55ae84, 0xce55ae00)
0xc01065a3 error_code+0x2b
Interrupt registers:
eax = 0x00000000 ebx = 0x00000000 ecx = 0xcf06d500 edx = 0xce55ae10
esi = 0xce55ae84 edi = 0xce55ae00 esp = 0xcdcdede0 eip = 0xc01a8714
ebp = 0xcdcdee18 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00000246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcdcdedac
0x00000246 <unknown>+0x246
0xce55ae00 <unknown>
[0]kdb>

Related

Bugs: #1

Discussion

  • John Hughes
    2008-04-09

    Logged In: YES
    user_id=166336
    Originator: YES

    Ok, here's what happens:

    Any attempt to do stat ("/proc/pid1/task/pid2") where pid2 is not pid1 and pid2 is not on the node doing the stat causes the oops.

    In proc_task_lookup we have:

    vp = LOCATE_VPROC_PID(tid, "temp");

    if (!vp)
        goto out;

    VPROC_LOCK_EXCL(vp, "proc_task_lookup");
    task = PVP(vp)->pvp_pproc;
    if (task)
        get_task_struct(task);
    VPROC_UNLOCK_EXCL(vp, "proc_task_lookup");

    error = PVPOP_PROCFS_GETATTR(vp, 0, 0, NULL, NULL, NULL, &tgid);

    /* leader's pid should be this vproc tgid */
    if ( error || leader->vp_pid != tgid ) {
        VPROC_RELE(vp, "proc_task_lookup");
        goto out_drop_task;
    }

    This jumps to out_drop_task if the pid is not a member of the thread group. But if the pid is not on this node, "task" will be NULL, so we end up calling put_task_struct(NULL).

    The fix is:
    Index: fs/proc/base.c
    ===================================================================
    RCS file: /usr/local/lib/cvs-repo/openssi-kernel/fs/proc/base.c,v
    retrieving revision 1.1.1.1.2.1.2.2.4.2
    diff -u -r1.1.1.1.2.1.2.2.4.2 base.c
    --- fs/proc/base.c 9 Apr 2008 10:10:55 -0000 1.1.1.1.2.1.2.2.4.2
    +++ fs/proc/base.c 9 Apr 2008 13:02:45 -0000
    @@ -2485,7 +2485,10 @@
         /* leader's pid should be this vproc tgid */
         if ( error || leader->vp_pid != tgid ) {
             VPROC_RELE(vp, "proc_task_lookup");
    -        goto out_drop_task;
    +        if (task)
    +            goto out_drop_task;
    +        else
    +            goto out;
         }

         if( task == NULL ) { /* remote process */

     
  • Roger Tsang
    2008-04-09

    Logged In: YES
    user_id=1246761
    Originator: NO

    Looks good. The following will be included in 2.0.0pre3.

    --- linux.orig/fs/proc/base.c
    +++ linux/fs/proc/base.c
    @@ -2377,6 +2377,15 @@ static struct dentry *proc_task_lookup(s
         /* leader's pid should be this vproc tgid */
         if ( error || leader->vp_pid != tgid ) {
             VPROC_RELE(vp, "proc_task_lookup");
    +#ifdef PROC_TASK_LOOKUP_FIX
    +        /* [ ssic-linux-Bugs-1938520 ]
    +         * Any attempt to do stat ("/proc/pid1/task/pid2") where pid2
    +         * is not pid1 and pid2 is not on the node doing the stat
    +         * causes the oops. -hughesj
    +         */
    +        if (!task)
    +            goto out;
    +#endif
             goto out_drop_task;
         }

     
  • Roger Tsang
    2008-04-09

    • status: open --> open-accepted
     
  • Roger Tsang
    2008-05-23

    • status: open-accepted --> closed-accepted
     
  • Roger Tsang
    2008-05-24

    • status: closed-accepted --> closed-fixed
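
For readers who want to see the failure mode from the first comment in isolation, here is a minimal userspace sketch of the goto-based cleanup pattern with the NULL guard applied. It is not the OpenSSI code: `struct task`, `get_task`, `put_task` and `model_task_lookup` are hypothetical stand-ins for `task_struct`, `get_task_struct`, `put_task_struct` and `proc_task_lookup`, the `tgid_matches` flag stands in for the `error || leader->vp_pid != tgid` check, and the handling of a remote process on the success path is an assumption reduced to a single early exit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted task_struct. */
struct task {
    int usage;
};

void get_task(struct task *t)
{
    t->usage++;
}

/* Dropping a reference on NULL is the bug being modelled, so assert here. */
void put_task(struct task *t)
{
    assert(t != NULL);
    t->usage--;
}

/*
 * Model of the lookup path.
 *   task         - NULL when the process lives on another node
 *   tgid_matches - 0 when the tid is not in the leader's thread group
 * Returns 0 on success, -ENOENT otherwise.
 */
int model_task_lookup(struct task *task, int tgid_matches)
{
    int result = -ENOENT;

    if (task)
        get_task(task);      /* a reference is only taken for a local task */

    if (!tgid_matches) {
        if (!task)
            goto out;        /* the guard: no reference taken, nothing to drop */
        goto out_drop_task;
    }

    result = 0;
    if (!task)
        goto out;            /* remote process: assumed to need no local put */

out_drop_task:
    put_task(task);
out:
    return result;
}
```

Without the `if (!task) goto out;` guard in the mismatch branch, calling `model_task_lookup(NULL, 0)` reaches `put_task(NULL)`, which corresponds to the `put_task_struct (null)` call described in the report.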