|
From: Tony B. <to...@cy...> - 2025-09-08 18:46:11
|
This patch series improves the qla2xxx FC driver in target mode. I developed these patches using the out-of-tree SCST target-mode subsystem (https://scst.sourceforge.net/), although most of the improvements will also apply to the other target-mode subsystems such as the in-tree LIO. Unfortunately qla2xxx+LIO does not pass all of my tests, but my patches do not make it any worse (results below). These patches have been well-tested at my employer with qla2xxx+SCST in both initiator mode and target mode and with a variety of FC HBAs and initiators. Since SCST is out-of-tree, some of the patches have parts that apply in-tree and other parts that apply out-of-tree to SCST. I am going to include the out-of-tree SCST patches to provide additional context; feel free to ignore them if you are not interested. All patches apply to linux 6.17-rc5 and SCST 3.10 master branch. Summary of patches: - bugfixes - cleanups - improve handling of aborts and task management requests - improve log message - add back SLER / SRR support (removed in 2017) Some of these patches improve handling of aborts and task management requests. This is some of the testing that I did: Test 1: Use /dev/sg to queue random disk I/O with short timeouts; make sure cmds are aborted successfully. Test 2: Queue lots of disk I/O, then use "sg_reset -N -d /dev/sg" on initiator to reset logical unit. Test 3: Queue lots of disk I/O, then use "sg_reset -N -t /dev/sg" on initiator to reset target. Test 4: Queue lots of disk I/O, then use "sg_reset -N -b /dev/sg" on initiator to reset bus. Test 5: Queue lots of disk I/O, then use "sg_reset -N -H /dev/sg" on initiator to reset host. Test 6: Use fiber channel attenuator to trigger SRR during write/read/compare test; check data integrity. With my patches, SCST passes all of these tests. Results with in-tree LIO target-mode subsystem: Test 1: Seems to abort the same cmd multiple times (both qlt_24xx_retry_term_exchange() and __qlt_send_term_exchange()). But cmds get aborted, so give it a pass? Test 2: Seems to work; cmds are aborted. Test 3: Target reset doesn't seem to abort cmds, instead, a few seconds later: qla2xxx [0000:04:00.0]-f058:9: qla_target(0): tag 1314312, op 2a: CTIO with TIMEOUT status 0xb received (state 1, port 51:40:2e:c0:18:1d:9f:cc, LUN 0) Tests 4 and 5: The initiator is unable to log back in to the target; the following messages are repeated over and over on the target: qla2xxx [0000:04:00.0]-e01c:9: Sending TERM ELS CTIO (ha=00000000f8811390) qla2xxx [0000:04:00.0]-f097:9: Linking sess 000000008df5aba8 [0] wwn 51:40:2e:c0:18:1d:9f:cc with PLOGI ACK to wwn 51:40:2e:c0:18:1d:9f:cc s_id 00:00:01, ref=2 pla 00000000835a9271 link 0 Test 6: passes with my patches; SRR not supported previously. So qla2xxx+LIO seems a bit flaky when handling exceptions, but my patches do not make it any worse. Perhaps someone who is more familiar with LIO can look at the difference between LIO and SCST and figure out how to improve it. Tony Battersby https://www.cybernetics.com/ |