From: Marc S. <mar...@mc...> - 2012-02-29 21:50:49
Hi,

We are currently using two SCST disk arrays with three volumes each, giving us six VMFS volumes in total. We have approximately 700 virtual machines spread across these six VMFS datastores and three ESXi 5 hosts. The volumes are backed by SATA SSDs and LSI MegaRAID SAS RAID controllers. We are not experiencing any performance issues that are noticeable to our users -- everything is extremely fast. However, we are seeing the following errors in the vmkernel log on each host:

--snip--
2012-02-29T21:22:28.394Z cpu58:636519)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41258056cf40) to dev "eui.3533633631313666" on path "vmhba1:C0:T5:L105" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:NONE
2012-02-29T21:22:28.394Z cpu58:636519)ScsiDeviceIO: 2305: Cmd(0x41258056cf40) 0x2a, CmdSN 0x8000001e to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.394Z cpu58:636519)ScsiDeviceIO: 2305: Cmd(0x412580540340) 0x2a, CmdSN 0x8000005f to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.415Z cpu60:8252)ScsiDeviceIO: 2305: Cmd(0x41258129c200) 0x2a, CmdSN 0x8000006c to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.415Z cpu60:8252)ScsiDeviceIO: 2305: Cmd(0x4125803f1d00) 0x2a, CmdSN 0x8000004e to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.416Z cpu60:8252)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4125801b8e00) to dev "eui.3533633631313666" on path "vmhba1:C0:T5:L105" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:NONE
2012-02-29T21:22:28.441Z cpu50:98752)ScsiDeviceIO: 2305: Cmd(0x4125801d3a40) 0x2a, CmdSN 0x80000058 to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.441Z cpu50:98752)ScsiDeviceIO: 2305: Cmd(0x4125801d1a40) 0x2a, CmdSN 0x8000005f to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.496Z cpu62:636519)ScsiDeviceIO: 2305: Cmd(0x41258058a980) 0x2a, CmdSN 0x8000003c to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.496Z cpu62:636519)ScsiDeviceIO: 2305: Cmd(0x4125801d9fc0) 0x2a, CmdSN 0x8000003c to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.496Z cpu62:636519)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4125812eaf40) to dev "eui.3533633631313666" on path "vmhba1:C0:T5:L105" Failed: H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:NONE
2012-02-29T21:22:28.497Z cpu62:636519)ScsiDeviceIO: 2305: Cmd(0x4125812b0440) 0x2a, CmdSN 0x8000003e to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-02-29T21:22:28.497Z cpu62:636519)ScsiDeviceIO: 2305: Cmd(0x412580a218c0) 0x2a, CmdSN 0x8000004e to dev "eui.3533633631313666" failed H:0x0 D:0x28 P:0x0 Possible sense data: 0x0 0x0 0x0.
--snip--

After reading some VMware knowledge base articles, it appears that the command above (0x2a) is the SCSI WRITE(10) command, and that the "D:0x28" device status is "VMK_SCSI_DEVICE_QUEUE_FULL (TASK SET FULL)":

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1030381

So that article is saying that on the SCST (array) side, the target has stopped accepting commands because its queue is full.
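For what it's worth, a quick way to gauge how often this is happening, and against which devices, is to scrape the log directly on each host (a rough sketch; assuming the BusyBox versions of grep/sort/uniq on ESXi accept these flags):

  # Count TASK SET FULL (D:0x28) completions per device in the current log:
  grep 'D:0x28' /var/log/vmkernel.log | grep -o 'dev "[^"]*"' | sort | uniq -c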
KB 1030381, in turn, recommends controlling the queue depth via throttling (the adaptive queue depth algorithm) on the initiator side until the initiator stops seeing TASK SET FULL from the device:

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1008113

I guess I have a few questions, then. First, I totally understand we are exceeding the recommended number of VMs per VMFS datastore. We are working on deploying additional SCST disk arrays, but we are not there yet.

- I've read the queue depth material in the SCST README, and I see that controlling the queue depth from the initiator side is one solution. I feel our back-storage is quite fast on the SCST side, but we truly are just overwhelming the volumes with the 700 virtual machines. Is turning on the adaptive queue depth algorithm in VMware ESXi the recommended way to stop seeing these messages (or at least not so many of them)? Any downside to this? (A sketch of what I think that involves is in the P.S. below.)

- Is the queue depth limit we are hitting SCST_MAX_TGT_DEV_COMMANDS (in scst_priv.h)? It's 48 in the version of SCST we're using. Any advantages/disadvantages to increasing this? Recommended? Yay? Nay?

- Is there any way to actually monitor what the queue depth is on the SCST side? We built SCST for performance, as these are production machines, so none of the SCST debug options are enabled. (My best guess is in the P.S. as well.)

- Anything else we're missing? Suggestions?

Again, I think everything is working correctly, but we truly are just overloaded with the 700 virtual machines across the six volumes.

Thanks for your time.

--Marc
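P.S. To make the adaptive queue depth question concrete: as far as I can tell from KB 1008113, the throttling is driven by the Disk.QFullSampleSize and Disk.QFullThreshold advanced settings on each host. I had something like the following in mind (the values are only illustrative, not recommendations -- please correct me if this is the wrong knob):

  # Enable adaptive queue depth throttling (ESXi 5.x advanced settings):
  esxcli system settings advanced set -o /Disk/QFullSampleSize -i 32
  esxcli system settings advanced set -o /Disk/QFullThreshold -i 4

  # Verify the current values:
  esxcli system settings advanced list -o /Disk/QFullSampleSize
  esxcli system settings advanced list -o /Disk/QFullThreshold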
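P.P.S. On monitoring the SCST side: from my reading of the SCST 2.x README, each session exposes command counters in sysfs, so my best guess was to watch those. The paths and attribute names below are from the README, not something I've verified on our build, so corrections welcome:

  # Outstanding commands per initiator session:
  for s in /sys/kernel/scst_tgt/targets/*/*/sessions/*; do
      printf '%s: %s\n' "$s" "$(cat "$s/active_commands" 2>/dev/null)"
  done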