#138 CFS oops on XFS failover

Milestone: v1.9.2
Status: closed-fixed
Owner: Roger Tsang
Labels: Filesystem (49)
Priority: 5
Updated: 2014-06-27
Created: 2007-04-15
Creator: Roger Tsang
Private: No

/abc is a hard mount.

[/sbin/fsck.xfs (1) -- /abc] fsck.xfs /dev/drbd3
XFS mounting filesystem drbd3
Starting XFS recovery on filesystem: drbd3 (dev: drbd3)
Ending XFS recovery on filesystem: drbd3 (dev: drbd3)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
00000000
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: cpufreq_ondemand tun ipt_REJECT ipt_state ipt_multiport iptable_filter ipt_MASQUERADE iptable_nat ip_conntrack ip_tables softdog xfs exportfs dm_mod uhci_hcd ehci_hcd usbcore drbd via_rhine sk98lin r8169 forcedeth
CPU: 0
EIP: 0060:[<00000000>] Not tainted VLI
EFLAGS: 00010202 (2.6.11-ssi5.12)
EIP is at 0x0
eax: fab2c3a0 ebx: d357c1f4 ecx: f6026a8c edx: c24dd7ec
esi: f6026a00 edi: cccf5ce0 ebp: cccf5ca4 esp: cccf5c68
ds: 007b es: 007b ss: 0068
Process mount (pid: 71515, threadinfo=cccf4000 task=f6734330)
Stack: c0286757 d357c1f4 7837c809 00000000 00000010 c04e8900 00000000 000000d0
00000000 00000000 f6f9c000 cccf4000 f4c34300 00000001 00000001 cccf5d08
c028d832 cccf5ce0 cccf5ccc f4903654 f490365c c277b840 00000010 cccf5d14
Call Trace:
[<c010623f>] show_stack+0x7f/0xa0
[<c01063e6>] show_registers+0x166/0x230
[<c0106786>] die+0xf6/0x1c0
[<c011b34d>] do_page_fault+0x45d/0x652
[<c0105e9f>] error_code+0x2b/0x30
[<c028d832>] cfstok_start_svrcfstok+0xf2/0x100
[<c028d8a8>] cfstok_rebuild_tokens+0x68/0x140
[<c0293dc4>] cfs_rebuild_tokens+0x64/0x140
[<c029566f>] cfs_sb_start_haves+0xdf/0x110
[<c029632d>] cfsd_rb_phase_haves_0+0x4d/0xa0
[<c02903d4>] cfs_send_rb_phase_haves+0x104/0x120
[<c0296aa9>] cfs_rb_phase_haves+0x29/0x30
[<c02967e1>] cfs_rb_do_phase+0x231/0x2d0
[<c029601c>] cfsd_rb_0+0x3c/0xf0
[<c028ff42>] cfs_send_rb+0xf2/0x100
[<c02949f9>] cfs_do_kern_remount+0xd9/0x14e
[<c0294912>] cfs_do_remount+0x52/0x60
[<c0294841>] cfs_remount+0xc1/0x140
[<c020e3ae>] ssisys_cfs_remount+0x7e/0xb0
[<c020bc0b>] do_ssisys+0x9b/0x1f0
[<c020bdae>] sys_ssisys+0x4e/0x70
[<c0105305>] sysenter_past_esp+0x52/0x75
Code: Bad EIP value.
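
For readers following the call trace above, here is a minimal sketch of how the `[<address>] symbol+0xoffset/0xsize` lines can be split into fields. The helper names (`parse_frames`, `first_caller`) and the set of fault-reporting symbols are assumptions made for illustration and are not part of any kernel tool. Traces from kernels of this era can also contain stale stack entries, so the "first caller" is a hint, not proof.

```python
import re

# Matches i386 trace lines such as
# "[<c028d832>] cfstok_start_svrcfstok+0xf2/0x100"
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

# Assumption: these frames belong to fault reporting, not to the code
# path that triggered the fault.
FAULT_PATH = {"show_stack", "show_registers", "die",
              "do_page_fault", "error_code"}


def parse_frames(text):
    """Return one dict per trace line found in `text`."""
    frames = []
    for match in FRAME_RE.finditer(text):
        addr, symbol, offset, size = match.groups()
        addr, offset, size = int(addr, 16), int(offset, 16), int(size, 16)
        frames.append({
            "addr": addr,            # address recorded in the trace
            "symbol": symbol,
            "offset": offset,        # offset of addr inside the function
            "size": size,            # total size of the function
            "start": addr - offset,  # derived function start address
        })
    return frames


def first_caller(frames):
    """Return the first frame that is not part of fault reporting."""
    for frame in frames:
        if frame["symbol"] not in FAULT_PATH:
            return frame
    return None


SAMPLE = """\
[<c0106786>] die+0xf6/0x1c0
[<c011b34d>] do_page_fault+0x45d/0x652
[<c0105e9f>] error_code+0x2b/0x30
[<c028d832>] cfstok_start_svrcfstok+0xf2/0x100
[<c028d8a8>] cfstok_rebuild_tokens+0x68/0x140
"""
```

Applied to the sample lines, `first_caller(parse_frames(SAMPLE))` picks out `cfstok_start_svrcfstok`, the frame directly below the page-fault handling in this report.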

Entering kdb (current=0xf6734330, pid 71515) on processor 0 Oops: Oops
due to oops @ 0x0
eax = 0xfab2c3a0 ebx = 0xd357c1f4 ecx = 0xf6026a8c edx = 0xc24dd7ec
esi = 0xf6026a00 edi = 0xcccf5ce0 esp = 0xcccf5c68 eip = 0x00000000
ebp = 0xcccf5ca4 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00010202
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xcccf5c34
[0]kdb> bt
Stack traceback for pid 71515
0xf6734330 71515 71505 1 0 R 0xf6734500 *mount
EBP EIP Function (args)
0xcccf5ca4 0x00000000 <unknown>
[0]kdb>
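
The `Oops: 0000 [#1]` line in the dump carries the x86 page-fault error code. The sketch below decodes its three low bits as documented for i386 kernels of this era (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function names are hypothetical. Reading the result together with `EIP is at 0x0` as a call through a NULL function pointer is an interpretation, not something the log states.

```python
import re

# Matches the header line, e.g. "Oops: 0000 [#1]".
OOPS_RE = re.compile(r"Oops:\s+([0-9a-fA-F]{4})\b")


def decode_page_fault_code(code):
    """Decode the three low bits of an i386 page-fault error code."""
    return {
        "cause": "protection fault" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


def decode_oops_line(line):
    """Extract and decode the error code from an 'Oops: NNNN' line."""
    match = OOPS_RE.search(line)
    if match is None:
        raise ValueError("no 'Oops: NNNN' error code found in line")
    return decode_page_fault_code(int(match.group(1), 16))
```

For this report, `decode_oops_line("Oops: 0000 [#1]")` describes a kernel-mode read of a page that is not present.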

Discussion

  • Roger Tsang
    2007-04-15

    Logged In: YES
    user_id=1246761
    Originator: YES

    The title of this bug report may be misleading. This might not be XFS-specific, since rc.nodedown calls mount -a -d {down_node} after fsck.

     
  • Roger Tsang
    2007-04-16

    This oops occurs only when there is a CFS-stacked XFS hard mount.

     
  • Roger Tsang
    2007-04-19

    The oops occurred inside the inlined iget() call to read_inode(), which is deprecated in XFS.

    CFS needs to stop using iget() and use exportfs instead.

     
  • Roger Tsang
    2007-04-21

    Testing XFS failover without iget() by rebuilding the exportfs file handle.

     
  • Roger Tsang
    2007-04-21

    • assigned_to: nobody --> rogertsang
    • status: open --> open-accepted
     
  • Roger Tsang
    2007-05-08

    • status: open-accepted --> open-fixed
     
  • Roger Tsang
    2007-05-26

    This oops is fixed, but XFS does not guarantee data consistency in failover scenarios, because XFS journals metadata only and file data is written back later (writeback).

     
  • Roger Tsang
    2007-06-17

    ...but CFS uses O_SYNC for chard mounts. Testing the stability of this fix.

     
  • Roger Tsang
    2007-08-05

    • status: open-fixed --> closed-fixed
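
The last two comments contrast writeback behaviour with O_SYNC. As a userspace illustration only (CFS applies O_SYNC inside the kernel, and this is not its implementation), the sketch below opens a file with `O_SYNC` so that each `write()` returns only after the data has been handed to stable storage. The function name and file name are hypothetical, and `os.O_SYNC` is assumed to be available, as it is on Linux.

```python
import os
import tempfile


def write_synchronously(path, payload):
    """Write `payload` to `path` with O_SYNC; return the byte count written."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        return written
    finally:
        os.close(fd)


def demo():
    """Write a record synchronously into a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.dat")
        write_synchronously(path, b"committed\n")
        with open(path, "rb") as handle:
            return handle.read()
```

The trade-off is latency: every write waits for the storage layer, which is the price paid for not depending on later writeback to get data onto disk before a failover.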