From: Pascal B. <pas...@fr...> - 2012-07-14 12:27:04
Hi all! I'm currently facing weird problems with SCST, and after days of various experiments and observations, trying to isolate the problem as precisely as possible, I conclude that I now need a hand. Could somebody help me a bit with this?

Basically, we're running a two-node, single-primary DRBD/Pacemaker cluster (kernel version 2.6.32-71.7.1, based on the Openfiler 2.99 distro) hosting 4 DRBD resources, each presented to 4 VMware hosts (ESXi 4.1) using two SCST (vdisk_fileio) and iSCSI-SCST targets (version 2.0.0.1 at first, now 2.2.0, but the problem persists) per resource. Resources are spread over the 2 nodes, 3 active TB per node overall. The DRBD replication link is a dual 10GbE link bonded in LACP (mode 4). Volumes are hardware RAID5, made up of nine 15krpm 146 or 300 GB SAS drives (I mean, disk I/O performance doesn't seem to be the cause).

Basically, the issue is: the cluster starts the resources, the 4 DRBD primaries go up, then the 4 pairs of virtual IPs, then the SCST services, and things run fine. Until you try to migrate resources back and forth. When you do that, it works once, twice, sometimes even 3 times, but then you can see DRBD promoted correctly, then the IPs wake up, but the SCST resource remains stuck down, running into timeout after the configured 60 s. At that moment, everything fails back to its former place, as it should. If you try again, same story. In the end, you obtain a cluster where the resource is stuck on a node, unable to fail over either manually or, more embarrassing, following a node crash (which we inevitably faced recently, thanks Mr. Murphy).

After digging through the various logs, what I see is:
- DRBD does its job 100% correctly.
- Pacemaker seems to do its job, with the resources it has, in the state they are in.
  (I mean, the errors it mentions look like normal errors in the global failing context.)
- SCST starts its job, but hangs in the device handling section. (BTW, my RA uses the sysfs interface and is based on Patrick Zwahlen's implementation, which I customized a bit, mostly to add friendlier tracing, and also to invert the order of activation: iSCSI target first, then the backend device, instead of the reverse, although I now doubt it has a real impact.) Basically, all the iSCSI target setup stuff runs fine, but then:
  o Either it hangs on backend device creation,
  o Or it hangs on LUN 0 assignment.

From that point on, it hangs until the configured start timeout, and then everybody goes back home. However, the backend device that refused to get created correctly has nevertheless been created and remains, possibly along with the target directory, and sometimes even the LUN 0 directory inside it. From that point on it turns into a good mess; in fact, the real problems start here. After that, any migration attempt is doomed to failure! If I reboot the node, it will accept a couple of migrations again, and then fail again in the same manner.
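For reference, the sysfs sequence my customized RA performs boils down to something like the sketch below. This is a simplified illustration, not the actual agent: the IQN and device name are taken from one of our resources, the filename path is what I'd use for a DRBD backing device, and error handling/tracing are stripped out.

```shell
#!/bin/sh
# Simplified sketch of the RA "start" sequence (SCST 2.x sysfs interface).
# Order is inverted vs. Zwahlen's original: iSCSI target first, then device.
SCST=/sys/kernel/scst_tgt
IQN=iqn.2003-08.fr.eurialys:rdmBackupExec   # example target from our config
DEV=drbd2                                   # example backend device name

# 1. Create and enable the iSCSI target (this part always runs fine)
echo "add_target $IQN" > $SCST/targets/iscsi/mgmt
echo 1 > $SCST/targets/iscsi/$IQN/enabled

# 2. Register the backend vdisk_fileio device -- first possible hang point
#    (the path below is illustrative; NV_CACHE matches what the logs show)
echo "add_device $DEV filename=/dev/$DEV; nv_cache=1" \
    > $SCST/handlers/vdisk_fileio/mgmt

# 3. Assign the device to LUN 0 of the target -- second possible hang point
echo "add $DEV 0" > $SCST/targets/iscsi/$IQN/luns/mgmt
```

Both hang points correspond to writes into SCST's mgmt files, which is consistent with the scst_suspend_activity() messages below, since SCST suspends command processing while it reconfigures.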
Now the facts: when it hangs, I can see a whole bunch of error messages like this in the logs:

-------------------------------------------------8<-------------------------------------------------
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:41 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
...
-------------------------------------------------8<-------------------------------------------------

For instance, in the above precise case, it occurred endlessly until 05:10:54 (guess how many log entries that represents), then it seems to finish something, but immediately starts its "FLAG SUSPENDED set" stuff again, until another "more active" bunch of log entries, etc., until the timeout. As follows:

-------------------------------------------------8<-------------------------------------------------
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: [2060]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: iscsi-scst: iSCSI TM fn 1
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst: scst_rx_mgmt_fn:6045:TM fn 0
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst_rx_mgmt_fn:6057:sess=ffff88030cf70b68, tag_set 1, tag 18217820, lun_set 1, lun=0, cmd_sn_set 1, cmd_sn 86922, priv ffff8802899f2320
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst_post_rx_mgmt_cmd:5976:Adding mgmt cmd ffff88032fc4a470 to active mgmt cmd list
Jul 14 05:10:54 ipstore21 kernel: [1084]: scst_tm_thread:5862:Deleting mgmt cmd ffff88032fc4a470 from active cmd list
Jul 14 05:10:54 ipstore21 kernel: [1084]: scst_mgmt_cmd_init:5164:ABORT TASK: command for tag 18217820 not found
Jul 14 05:10:54 ipstore21 kernel: [1084]: scst_call_task_mgmt_affected_cmds_done:5662:Target's iscsi task_mgmt_affected_cmds_done() returned
Jul 14 05:10:54 ipstore21 kernel: [1084]: scst: scst_mgmt_cmd_send_done:5709:TM fn 0 finished, status ffffffff
Jul 14 05:10:54 ipstore21 kernel: iscsi-scst: iSCSI TM fn 1 finished, status 0
Jul 14 05:10:54 ipstore21 kernel: [1084]: scst_mgmt_cmd_send_done:5724:Target's iscsi task_mgmt_fn_done() returned
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: [2060]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: [2062]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
Jul 14 05:10:54 ipstore21 kernel: [2061]: scst_translate_lun:3891:FLAG SUSPENDED set, skipping
...
-------------------------------------------------8<-------------------------------------------------

When this occurs, I can also see a sudden increase in dropped received packets on the network, just as if the kernel was damn busy doing something.

Just to be as precise as possible, here is a sample of what happens in the logs from the very beginning of a migration request that will end up in a backend device setup hang. This is not the only possible scenario, but it is the most frequent one, especially at the very first SCST activation failure after a node reboot (i.e. when we start from a "clean" state):

-------------------------------------------------8<-------------------------------------------------
Jul 14 04:44:15 ipstore11 kernel: [2048]: scst: scst_register_target:563:Target iqn.2003-08.fr.eurialys:rdmBackupExec for template iscsi registered successfully
Jul 14 04:44:15 ipstore11 kernel: [1074]: dev_vdisk: vdisk_report_registering:3330:Registering virtual vdisk_fileio device drbd2 (NV_CACHE)
Jul 14 04:44:15 ipstore11 kernel: [1074]: scst_suspend_activity:745:suspend_count 0
Jul 14 04:44:15 ipstore11 kernel: [1074]: scst: scst_suspend_activity:777:Waiting for 21 active commands to complete... This might take few minutes for disks or few hours for tapes, if you use long executed commands, like REWIND or FORMAT. In case, if you have a hung user space device (i.e. made using scst_user module) not responding to any commands, if might take virtually forever until the corresponding user space program recovers and starts responding or gets killed.
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2041]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2041]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2039]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2041]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2041]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2039]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2041]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
Jul 14 04:44:15 ipstore11 kernel: [2040]: scst_translate_lun:3857:FLAG SUSPENDED set, skipping
-------------------------------------------------8<-------------------------------------------------

It sounds like the backend device is woken up but refuses to start working before it unwinds some tasks it has in the pipe. But it was "sleeping"; it's the other node that was working, so why would it have anything to undo?
I also see things like this:

-------------------------------------------------8<-------------------------------------------------
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_free_cmd:4225:Freeing aborted cmd ffff8803071353c8 (scst_cmd_count 241)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd:3694:Aborted cmd ffff8803071351d0 finished (cmd_ref 1, scst_cmd_count 240)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd_mgmt:4630:cmd ffff8803071351d0 finished (tag 44725616)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd_mgmt:4639:mcmd ffff88032fc4a0c8, mcmd->cmd_finish_wait_count 237
Jul 14 05:10:36 ipstore21 kernel: [29483]: __scst_dec_finish_wait_count:4535:cmd_finish_wait_count(236) not 0, skipping
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_free_cmd:4225:Freeing aborted cmd ffff8803071351d0 (scst_cmd_count 240)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd:3694:Aborted cmd ffff880307359c20 finished (cmd_ref 1, scst_cmd_count 239)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd_mgmt:4630:cmd ffff880307359c20 finished (tag 44725617)
Jul 14 05:10:36 ipstore21 kernel: [29483]: scst_finish_cmd_mgmt:4639:mcmd ffff88032fc4a0c8, mcmd->cmd_finish_wait_count 236
-------------------------------------------------8<-------------------------------------------------

Or also:

-------------------------------------------------8<-------------------------------------------------
Jul 14 04:29:51 ipstore21 kernel: [30900]: scst: scst_cmd_thread:4313:Processing thread drbd03_6 (PID 30900) started
Jul 14 04:29:51 ipstore21 kernel: [30900]: scst_cmd_thread:4345:Linked IO context ffff8802d90dcf50 (p_cmd_threads ffff88017093e8e8)
Jul 14 04:29:51 ipstore21 kernel: iscsi-scst: Negotiated parameters: InitialR2T No, ImmediateData Yes, MaxConnections 1, MaxRecvDataSegmentLength 1048576, MaxXmitDataSegmentLength 131072,
Jul 14 04:29:51 ipstore21 kernel: iscsi-scst: MaxBurstLength 262144, FirstBurstLength 262144, DefaultTime2Wait 2, DefaultTime2Retain 0,
Jul 14 04:29:51 ipstore21 kernel: iscsi-scst: MaxOutstandingR2T 1, DataPDUInOrder Yes, DataSequenceInOrder Yes, ErrorRecoveryLevel 0,
Jul 14 04:29:51 ipstore21 kernel: iscsi-scst: HeaderDigest None, DataDigest None, OFMarker No, IFMarker No, OFMarkInt 2048, IFMarkInt 2048
Jul 14 04:29:51 ipstore21 kernel: iscsi-scst: Target parameters set for session 1380100003d0200: QueuedCommands 32, Response timeout 30, Nop-In interval 30
Jul 14 04:29:51 ipstore21 kernel: [30901]: scst: scst_cmd_thread:4313:Processing thread drbd03_7 (PID 30901) started
Jul 14 04:29:51 ipstore21 kernel: [30901]: scst_cmd_thread:4345:Linked IO context ffff8802d90dcf50 (p_cmd_threads ffff88017093e8e8)
Jul 14 04:29:52 ipstore21 kernel: [30893]: scst_set_pending_UA:6191:Setting pending UA cmd ffff8801c322c3c8
Jul 14 04:29:52 ipstore21 kernel: [30901]: scst_set_pending_UA:6191:Setting pending UA cmd ffff8801da266638
Jul 14 04:30:22 ipstore21 kernel: iscsi-scst: ***ERROR***: Timeout sending data/waiting for reply to/from initiator iqn.1998-01.com.vmware:vmware12-0176b5c2 (SID 1350100003d0200), closing connection
Jul 14 04:30:22 ipstore21 kernel: [31658]: scst: scst_rx_mgmt_fn:6045:TM fn 6
Jul 14 04:30:22 ipstore21 kernel: [31658]: scst_rx_mgmt_fn:6057:sess=ffff8802e4c017f0, tag_set 0, tag 0, lun_set 1, lun=0, cmd_sn_set 0, cmd_sn 0, priv ffff880289c30c00
Jul 14 04:30:22 ipstore21 kernel: [31658]: scst_post_rx_mgmt_cmd:5976:Adding mgmt cmd ffff88032ddc1338 to active mgmt cmd list
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst_tm_thread:5862:Deleting mgmt cmd ffff88032ddc1338 from active cmd list
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst_abort_all_nexus_loss_sess:5416:Nexus loss for sess ffff8802e4c017f0 (mcmd ffff88032ddc1338)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst: scst_abort_cmd:4707:Aborting cmd ffff880131d0c970 (tag 1458026, op 12)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst_abort_cmd:4813:cmd ffff880131d0c970 (tag 1458026, sn 0) being executed/xmitted (state 120, op 12, proc time 31 sec., timeout 3 sec.), deferring ABORT (cmd_done_wait_count 0, cmd_finish_wait_count 1)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst: scst_abort_cmd:4707:Aborting cmd ffff880131d0c778 (tag 1458027, op 12)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst_abort_cmd:4813:cmd ffff880131d0c778 (tag 1458027, sn 0) being executed/xmitted (state 120, op 12, proc time 30 sec., timeout 3 sec.), deferring ABORT (cmd_done_wait_count 0, cmd_finish_wait_count 2)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst: scst_abort_cmd:4707:Aborting cmd ffff880131d0c580 (tag 1458028, op 12)
Jul 14 04:30:22 ipstore21 kernel: [1095]: scst_abort_cmd:4813:cmd ffff880131d0c580 (tag 1458028, sn 0) being executed/xmitted (state 120, op 12, proc time 30 sec., timeout 3 sec.), deferring ABORT (cmd_done_wait_count 0, cmd_finish_wait_count 3)
-------------------------------------------------8<-------------------------------------------------

This makes me think of the section "Work if target's backstorage or link is too slow" in the SCST README, but my RAID card is an HP SmartArray P812 with 1 GB of onboard cache, and the RAID volumes consist of nine 15krpm SAS disks; I mean, this doesn't sound like slow backstorage, does it?

Well, I'm stuck and don't know where to look now. Could someone give me some advice or enlightenment on this? I don't understand why a device that was idle until now tries to abort stacked commands, when it was precisely idle. Could it be the SCST workload of the other running resources filling the SCST stack? Reading the various READMEs and howtos, I feel like each resource is independent of the others, so what? And I'm fairly sure I have seen this fail even when the destination node was empty, with all resources running on the other one.

Any help would be highly appreciated. I have tons of logs and tens of config files; just ask me for whatever you need. :-)

Thanks!

Best regards,
Pascal.