#270 writev() can cause OOPS

bug
closed-fixed
kernel (207)
5
2009-02-07
2009-02-05
Bill Pemberton
No

A writev() call can cause an oops due to a NULL pointer dereference in attach_nobh_buffers(). This can be reproduced by running the writev01 testcase that comes with the Linux Test Project (http://ltp.sourceforge.net/) when /tmp resides on a JFS filesystem. I've verified that the crash occurs only with JFS and not with ext3.

The bug is that nobh_write_end() is getting called with fsdata == NULL, which results in attach_nobh_buffers() being called with head == NULL -- this will cause a NULL pointer dereference. Here is a trace:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810f7eda>] attach_nobh_buffers+0x3a/0x80
PGD 79d61067 PUD 7659f067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:01:04.0/local_cpus
CPU 0
Modules linked in: netconsole configfs jfs [last unloaded: scsi_wait_scan]
Pid: 8215, comm: writev01 Not tainted 2.6.29-rc3 #1
RIP: 0010:[<ffffffff810f7eda>] [<ffffffff810f7eda>] attach_nobh_buffers+0x3a/0x80
RSP: 0018:ffff88007651dab8 EFLAGS: 00010202
RAX: 0000000000000101 RBX: ffffe20001963db0 RCX: 00000000000000c0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880077ceece8
RBP: ffff88007651dac8 R08: 0000000000000040 R09: ffffe20001963db0
R10: 00000000000000c0 R11: ffff880077ceec78 R12: 0000000000000000
R13: 0000000000000040 R14: ffff880077ceeb68 R15: 0000000000000000
FS: 00007f701aeda6f0(0000) GS:ffffffff81851080(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000076c56000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process writev01 (pid: 8215, threadinfo ffff88007651c000, task ffff8800768223a0)
Stack:
ffffe20001963db0 0000000000000000 ffff88007651db28 ffffffff810fb4e0
0000000000000018 ffff880077ceec78 ffff8800000000c0 0000000000000180
ffff88007e5b8900 ffffffffa0026440 00000000000000c0 0000000000000180
Call Trace:
[<ffffffff810fb4e0>] nobh_write_end+0x140/0x150
[<ffffffff8109a007>] generic_file_buffered_write+0x187/0x310
[<ffffffff8109b1b1>] __generic_file_aio_write_nolock+0x261/0x470
[<ffffffff8103689e>] ? __wake_up+0x4e/0x70
[<ffffffff8109b4c7>] generic_file_aio_write+0x67/0xd0
[<ffffffff8109b460>] ? generic_file_aio_write+0x0/0xd0
[<ffffffff810d44db>] do_sync_readv_writev+0xeb/0x130
[<ffffffff8105ae50>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8125b3bb>] ? tty_put_char+0x3b/0x50
[<ffffffff8105b096>] ? remove_wait_queue+0x46/0x60
[<ffffffff811b50e1>] ? security_file_permission+0x11/0x20
[<ffffffff810d543f>] do_readv_writev+0xcf/0x1e0
[<ffffffff8125d5ca>] ? tty_write+0x22a/0x280
[<ffffffff814e6ca1>] ? mutex_lock+0x11/0x30
[<ffffffff810d5590>] vfs_writev+0x40/0x60
[<ffffffff810d5600>] sys_writev+0x50/0xb0
[<ffffffff8100c6db>] system_call_fastpath+0x16/0x1b
Code: fb 74 69 48 8b 7f 18 48 83 c7 70 e8 31 02 3f 00 4c 89 e2 eb 0c 0f 1f 40 00 49$
RIP [<ffffffff810f7eda>] attach_nobh_buffers+0x3a/0x80
RSP <ffff88007651dab8>
CR2: 0000000000000000

The attached patch will prevent the problem. However, I don't think this is really the correct fix since ext3 doesn't have this problem.
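The failure mode described above can be illustrated with a small user-space model. This is only a sketch, not the kernel source and not the attached patch: `model_bh`, `model_page`, `model_attach_buffers()` and `model_write_end()` are hypothetical stand-ins, the buffer list is simplified to a NULL-terminated chain, and `need_attach` stands for whatever condition makes the kernel take the attach branch. The model turns the NULL dereference into an asserted precondition and shows a guard in the spirit of the one discussed in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's buffer head and page. */
struct model_bh {
    struct model_bh *next;    /* simplified NULL-terminated chain */
    bool attached;
};

struct model_page {
    struct model_bh *buffers; /* NULL models !page_has_buffers() */
};

/* Models attach_nobh_buffers(): it walks the list starting at head,
 * so head must not be NULL. The model asserts this precondition;
 * the kernel function in the trace dereferences the pointer instead. */
static void model_attach_buffers(struct model_page *page,
                                 struct model_bh *head)
{
    assert(head != NULL);
    for (struct model_bh *bh = head; bh != NULL; bh = bh->next)
        bh->attached = true;
    page->buffers = head;
}

/* Models the attach branch of nobh_write_end() with a NULL guard.
 * fsdata carries the buffer list, or NULL if none was prepared.
 * Returns true if buffers were attached by this call. */
static bool model_write_end(struct model_page *page, void *fsdata,
                            bool need_attach)
{
    struct model_bh *head = fsdata;

    if (need_attach && head != NULL && page->buffers == NULL) {
        model_attach_buffers(page, head);
        return true;
    }
    return false; /* nothing to attach: fsdata == NULL is tolerated */
}
```

In this model, `model_write_end(&page, NULL, true)` returns false and leaves the page untouched, whereas dropping the `head != NULL` test would trip the assertion, which corresponds to the oops in the trace.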

Related

Patches: #1

Discussion

  • Bill Pemberton
    2009-02-05

     
    Attachments
  • David Kleikamp
    2009-02-05

    I'm not sure if that is the right fix. I'm going to have to take a closer look at this. It looks like either the page should have buffers or fsdata should be non-null, but I'll have to look at what's happening in this case.

     
  • David Kleikamp
    2009-02-05

    • assigned_to: nobody --> shaggyk
     
  • David Kleikamp
    2009-02-06

    Upon further review, I think your patch is correct. Commit 5b41e74a is responsible for this.

    http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=5b41e74a

    The problem with that patch is that if the page was already marked PAGE_MAPPED_TO_DISK upon entering nobh_write_begin(), then fsdata will not be initialized with a list of buffers. However, in that case, the page should already be up to date, so this page shouldn't have the uninitialized data problem.

    Actually, your patch can be simplified. A non-null head implies that page_has_buffers() is false, so there is no need to test both.

    I will submit a modified patch to mainline.

    Thanks,
    Shaggy

     
  • David Kleikamp
    2009-02-07

    Linus added this patch to the mainline kernel. Thank you!

     
  • David Kleikamp
    2009-02-07

    • status: open --> closed-fixed
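
The reasoning in the 2009-02-06 comment can be checked with a small truth-table model. This is a sketch under stated assumptions, not kernel code: `model_state` and the functions below are hypothetical, and the rule in `model_head_set()` (a buffer list is handed over in fsdata only when the page has no buffers and was not already mapped to disk) is an assumption taken from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical page state on entry to the write path. */
struct model_state {
    bool has_buffers;    /* models page_has_buffers() */
    bool mapped_to_disk; /* models the mapped-to-disk page flag */
};

/* Assumption from the discussion: fsdata holds a buffer list only if
 * the page has no buffers and was not already mapped to disk. */
static bool model_head_set(struct model_state s)
{
    return !s.has_buffers && !s.mapped_to_disk;
}

/* Guard that tests both the head and page_has_buffers(). */
static bool guard_both(struct model_state s)
{
    return model_head_set(s) && !s.has_buffers;
}

/* Simplified guard that tests the head alone. */
static bool guard_head_only(struct model_state s)
{
    return model_head_set(s);
}

/* Counts the states in which the two guards disagree. */
static int count_guard_mismatches(void)
{
    int mismatches = 0;
    for (int i = 0; i < 4; i++) {
        struct model_state s = { (i & 1) != 0, (i & 2) != 0 };
        if (guard_both(s) != guard_head_only(s))
            mismatches++;
    }
    return mismatches;
}

/* Counts the states in which testing page_has_buffers() alone would
 * proceed to attach with a NULL head. */
static int count_null_head_calls(void)
{
    int calls = 0;
    for (int i = 0; i < 4; i++) {
        struct model_state s = { (i & 1) != 0, (i & 2) != 0 };
        if (!s.has_buffers && !model_head_set(s))
            calls++;
    }
    return calls;
}
```

Under these assumptions the two guards agree in every state, so testing the head alone is sufficient, while a check on page_has_buffers() alone lets one state through with a NULL head: a page without buffers that is already mapped to disk.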