From: Zhen Xu <zhe...@gm...> - 2015-07-12 12:49:53
All,

We have had a lot of system panics and kernel crashes with the latest QLogic driver.

QLA: 8.07.00.24.Trunk-SCST.16-k
CentOS 7 stock kernel 3.10.0-229.7.2.el7.x86_64
SCST version 3.1.0-pre1

The symptom is very similar to what Marc reported back on Jun 12. After reverting to 8.0.0-pre1-TRUNK-15, everything seems to be stable, without any issue.

[ 603.657025] qla2x00t(4): session for port 21:00:00:24:ff:33:6b:7f (loop ID 0) scheduled for deletion in 35 secs
[ 603.662800] qla2x00t(5): session for port 21:00:00:24:ff:33:6b:7e (loop ID 0) scheduled for deletion in 35 secs
[ 603.791197] qla2x00t(4): LIP reset occurred
[ 603.791229] qla2x00t(4): LIP reset (loop 0xffff), subcode 2
[ 603.796647] qla2x00t(5): LIP reset occurred
[ 603.796713] qla2x00t(5): LIP reset (loop 0xffff), subcode 2
[ 603.800163] qla2x00t(4): Async event 0x8030 occurred: ignoring (m[1]=0, m[2]=2, m[3]=0, m[4]=3e80)
[ 603.801287] qla2x00t(4): LIP reset (loop 0xffff), subcode 3
[ 603.805138] qla2x00t(5): Async event 0x8030 occurred: ignoring (m[1]=0, m[2]=2, m[3]=0, m[4]=d642)
[ 603.806252] qla2x00t(5): LIP reset (loop 0xffff), subcode 3
[ 603.891228] qla2x00t(4): Port config changed (2a)
[ 603.893397] qla2x00t(5): Port config changed (2a)
[ 604.409799] qla2x00t(4): local session for port 21:00:00:24:ff:33:6b:7f (loop ID 0) login_state ff reappeared
[ 604.409813] qla2x00t(4): local session for port 21:00:00:24:ff:33:6b:7f (loop ID 0) became global
[ 604.734027] qla2x00t(5): local session for port 21:00:00:24:ff:33:6b:7e (loop ID 0) login_state ff reappeared
[ 604.734039] qla2x00t(5): local session for port 21:00:00:24:ff:33:6b:7e (loop ID 0) became global
[ 614.914651] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 614.914670] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:24:ff:33:6b:7f, target 21:00:00:24:ff:33:6b:84)
[ 614.914703] qla2x00t(5): Doing NEXUS_LOSS_SESS
[ 614.914713] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789dbd0, initiator 21:00:00:24:ff:33:6b:7e, target 21:00:00:24:ff:33:6b:85)
[ 614.914752] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 614.914766] scst: TM fn 6 (mcmd ffff88070789dbd0) finished, status 0
[ 627.304713] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 627.304729] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789dbd0, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 627.304780] scst: TM fn 6 (mcmd ffff88070789dbd0) finished, status 0
[ 648.321975] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 648.321993] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789dbd0, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 648.322047] scst: TM fn 6 (mcmd ffff88070789dbd0) finished, status 0
[ 669.339281] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 669.339294] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 669.339325] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 690.360300] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 690.360321] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 690.360381] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 711.373768] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 711.373791] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 711.373847] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 732.391053] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 732.391075] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 732.391144] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 753.408299] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 753.408322] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 753.408378] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 774.425552] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 774.425574] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 774.425645] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 795.442860] qla2x00t(4): Doing NEXUS_LOSS_SESS
[ 795.442882] scst: TM fn NEXUS_LOSS_SESS/6 (mcmd ffff88070789de70, initiator 21:00:00:e0:8b:9a:bb:cc, target 21:00:00:24:ff:33:6b:84)
[ 795.442934] scst: TM fn 6 (mcmd ffff88070789de70) finished, status 0
[ 840.607272] arcmsr0: abort device command of scsi id = 0 lun = 0
[ 840.607292] arcmsr0: scsi id = 0 lun = 0 ccb = '0xffff8800cf796300' poll command abort successfully
[ 843.235411] arcmsr0: abort device command of scsi id = 0 lun = 0
[ 843.235480] arcmsr: executing bus reset eh.....num_resets = 0, num_aborts = 2
[ 843.235616] arcmsr0: executing hw bus reset .....
[ 869.094476] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 1
[ 869.094619] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF O-------------- 3.10.0-229.7.2.el7.x86_64 #1
[ 869.094775] Hardware name: Dell Inc. PowerEdge R805/0F705T, BIOS 4.2.1 04/14/2010
[ 869.094895] ffffffff8182b528 ea8a0a378d071c4b ffff880e2fc05c60 ffffffff81604386
[ 869.095037] ffff880e2fc05ce0 ffffffff815fdc2a 0000000000000010 ffff880e2fc05cf0
[ 869.095175] ffff880e2fc05c90 ea8a0a378d071c4b 0000000000000000 0000000000000001
[ 869.095314] Call Trace:
[ 869.095356] <NMI> [<ffffffff81604386>] dump_stack+0x19/0x1b
[ 869.095478] [<ffffffff815fdc2a>] panic+0xd8/0x1e7
[ 869.095563] [<ffffffff8110a7f0>] ? watchdog_enable_all_cpus.part.2+0x40/0x40
[ 869.095679] [<ffffffff8110a8b2>] watchdog_overflow_callback+0xc2/0xd0
[ 869.095788] [<ffffffff8114c991>] __perf_event_overflow+0xa1/0x250
[ 869.095890] [<ffffffff8114b669>] ? perf_event_update_userpage+0x19/0x100
[ 869.096001] [<ffffffff8114d494>] perf_event_overflow+0x14/0x20
[ 869.096101] [<ffffffff810297d1>] x86_pmu_handle_irq+0x151/0x1b0
[ 869.096203] [<ffffffff81190961>] ? unmap_kernel_range_noflush+0x11/0x20
[ 869.096315] [<ffffffff81373854>] ? ghes_copy_tofrom_phys+0x124/0x210
[ 869.096421] [<ffffffff8160d44b>] perf_event_nmi_handler+0x2b/0x50
[ 869.096521] [<ffffffff8160cb99>] nmi_handle.isra.0+0x69/0xb0
[ 869.096614] [<ffffffff8160ccb0>] do_nmi+0xd0/0x340
[ 869.096694] [<ffffffff8160bff1>] end_repeat_nmi+0x1e/0x2e
[ 869.096788] [<ffffffff8160b647>] ? _raw_spin_lock_irqsave+0x47/0x60
[ 869.096892] [<ffffffff8160b647>] ? _raw_spin_lock_irqsave+0x47/0x60
[ 869.096996] [<ffffffff8160b647>] ? _raw_spin_lock_irqsave+0x47/0x60
[ 869.097095] <<EOE>> <IRQ> [<ffffffffa061f4e6>] qla24xx_msix_rsp_q+0x36/0x100 [qla2xxx_scst]
[ 869.097296] [<ffffffff8110b36e>] handle_irq_event_percpu+0x3e/0x1e0
[ 869.097400] [<ffffffff8110b54d>] handle_irq_event+0x3d/0x60
[ 869.097493] [<ffffffff8110e1e7>] handle_edge_irq+0x77/0x130
[ 869.097589] [<ffffffff81015c9f>] handle_irq+0xbf/0x150
[ 869.097679] [<ffffffff81077d27>] ? irq_enter+0x17/0xa0
[ 869.097767] [<ffffffff816166af>] do_IRQ+0x4f/0xf0
[ 869.097848] [<ffffffff8160b96d>] common_interrupt+0x6d/0x6d
[ 869.097946] [<ffffffffa05b7c98>] ? q24_atio_pkt+0xd8/0x250 [qla2x00tgt]
[ 869.098057] [<ffffffff8160b41a>] ? _raw_spin_unlock_irqrestore+0xa/0x40
[ 869.098191] [<ffffffffa061f68c>] qla24xx_msix_default+0xdc/0x310 [qla2xxx_scst]
[ 869.098316] [<ffffffff813b8120>] ? add_interrupt_randomness+0x50/0x1b0
[ 869.098424] [<ffffffff8110b36e>] handle_irq_event_percpu+0x3e/0x1e0
[ 869.098527] [<ffffffff8110b54d>] handle_irq_event+0x3d/0x60
[ 869.098619] [<ffffffff8110e1e7>] handle_edge_irq+0x77/0x130
[ 869.098712] [<ffffffff81015c9f>] handle_irq+0xbf/0x150
[ 869.102091] [<ffffffff8160fc4a>] ? atomic_notifier_call_chain+0x1a/0x20
[ 869.105528] [<ffffffff816166af>] do_IRQ+0x4f/0xf0
[ 869.108942] [<ffffffff8160b96d>] common_interrupt+0x6d/0x6d
[ 869.112355] <EOI> [<ffffffff81052de6>] ? native_safe_halt+0x6/0x10
[ 869.115717] [<ffffffff8111265d>] ? rcu_eqs_enter_common.isra.31+0x3d/0xf0
[ 869.118995] [<ffffffff8101c85f>] default_idle+0x1f/0xc0
[ 869.122202] [<ffffffff8101c94e>] amd_e400_idle+0x4e/0x110
[ 869.125338] [<ffffffff8101d166>] arch_cpu_idle+0x26/0x30
[ 869.128405] [<ffffffff810c6801>] cpu_startup_entry+0xf1/0x290
[ 869.131405] [<ffffffff8104228a>] start_secondary+0x1ba/0x230