The case of stale iSCSI LUNs

From iscsitarget

This page covers stale LUNs and long startup times of ESX hosts. When using iscsi-target (IET), it is essential to follow a few best practices when configuring it. Otherwise your ESX hosts may take a very long time to start up.

1) Always assign SCSI serial numbers (ScsiSN)

This is important if you ever have to reconfigure the ESX iSCSI software initiator. If you assigned ScsiSNs, your ESX host will most likely recognize your iSCSI LUNs correctly and reassign them the correct datastore names.


!!! Attention !!!

ScsiSNs should be assigned before a LUN is first presented to an ESX host, because VMware stores them in its internal storage database. Assigning them later may break the LUN-to-datastore relationship, which is complicated to fix.


So best practice here: Assign ScsiSNs when creating LUNs and before attaching them to the ESX host.

Example configuration in /etc/ietd.conf:

Target iqn.2007-09.com.company:ESX.Volume01
  IncomingUser ESX 123456789987654321
  Lun 0 Path=/dev/VGStorage00/LViscsi01,Type=fileio,ScsiSN=VMWARE-0001
  Alias ESX-Volume01
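
To catch LUNs that were defined without a serial number before they are ever presented to an ESX host, a small check along the following lines could be run on the IET server. This is only a sketch: the function name luns_missing_scsisn is hypothetical, and it assumes the file uses the same "Target" / "Lun" keyword spelling and comma-separated parameter layout as the example above.

```python
import re

def luns_missing_scsisn(conf_text):
    """Return (target, lun) pairs whose Lun line carries no ScsiSN parameter.

    Assumes the ietd.conf layout shown in the example: a 'Target <name>'
    line followed by 'Lun <n> key=value,key=value,...' lines.
    """
    missing = []
    target = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 2)
        if fields[0] == "Target" and len(fields) >= 2:
            target = fields[1]
        elif fields[0] == "Lun" and len(fields) >= 2:
            params = fields[2] if len(fields) > 2 else ""
            # Look for a ScsiSN=<value> item in the comma-separated list.
            if not re.search(r"(?:^|,)\s*ScsiSN=[^,\s]+", params):
                missing.append((target, fields[1]))
    return missing
```

Calling luns_missing_scsisn(open("/etc/ietd.conf").read()) should return an empty list when every LUN has a ScsiSN assigned.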

2) Preventing IET from announcing unnecessary LUNs to ESX hosts

Make sure you announce to your ESX hosts only the LUNs they actually need. If your IET serves multiple targets/LUNs for different systems (Windows, Linux, ESX, ...) holding different file systems (NTFS, EXT3, VMFS, ...), implement access lists that prevent IET from announcing unnecessary LUNs to your ESX hosts. If you don't, your ESX host may take a long time to start up as it tries to connect to all of the different targets/LUNs.


Normally it takes the ESX swiSCSI initiator around 15-30 seconds to reconnect the LUNs, but with a misconfigured setup it can take around 5-10 minutes (!) for the host to start up.


IET can enforce access control lists using two files:

  • /etc/initiators.allow
  • /etc/initiators.deny


This offers many ways to hide certain targets from selected initiators. One of the easiest is to set the following two rules.

Deny ALL targets to ALL possible systems/initiators by appending the following statement at the end of


/etc/initiators.deny:

ALL ALL

Make sure that ALL ALL is the ONLY un-commented statement in the file. Only use SPACES between the words (no TABs).
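
The two conditions above (a single active "ALL ALL" rule, and no TABs) are easy to get wrong by hand, so they can be checked mechanically. The following is a minimal sketch; check_deny_file is a hypothetical helper name, and it assumes that lines starting with # are comments.

```python
def check_deny_file(text):
    """Return a list of problems found in an initiators.deny file body."""
    problems = []
    # Active lines are everything that is neither blank nor a comment.
    active = [ln for ln in text.splitlines()
              if ln.strip() and not ln.lstrip().startswith("#")]
    for ln in active:
        if "\t" in ln:
            problems.append("TAB character in rule: %r" % ln)
    # The only active rule should consist of the two words ALL ALL.
    if [ln.split() for ln in active] != [["ALL", "ALL"]]:
        problems.append("expected exactly one active rule: ALL ALL")
    return problems
```

An empty result from check_deny_file(open("/etc/initiators.deny").read()) means the file matches the rule described above.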

Allow specific targets to specific systems/initiators by using IP addresses or subnets. Subnets can be written in CIDR (Classless Inter-Domain Routing) notation. The example below shows a possible configuration.


/etc/initiators.allow:

iqn.2007-09.com.company:ESX.Volume01 192.168.1.11, 192.168.1.10
iqn.2007-10.com.company:ESX.Volume02 172.16.1.0/24

If you use single IP addresses, make sure you include both of the ESX host's IP addresses:

  • VMkernel
  • Service Console


This is because iSCSI data traffic uses the VMkernel network interface, while the initial negotiation/connection is done by the Service Console itself. If you work with subnets, this is not necessary as long as both IPs reside in the same subnet.
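
Whether both addresses of a host are covered by the allow rules for a target can be verified with a short script. This sketch uses Python's standard ipaddress module; the helper names parse_allow and is_allowed are hypothetical, and it only understands the plain IP and CIDR forms shown in the example above (any other entry would raise a ValueError).

```python
import ipaddress

def parse_allow(text):
    """Map each target name to the list of networks allowed to see it."""
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        target = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        # A single address becomes a /32 network, CIDR entries stay as given.
        nets = [ipaddress.ip_network(item.strip(), strict=False)
                for item in rest.split(",") if item.strip()]
        rules.setdefault(target, []).extend(nets)
    return rules

def is_allowed(rules, target, ip):
    """True if the given IP matches any entry listed for the target."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in rules.get(target, []))
```

For an ESX host, is_allowed should return True for the target with both the VMkernel and the Service Console address; with the example file, that holds for 192.168.1.10 and 192.168.1.11 on ESX.Volume01.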

At first glance this may look complicated, especially if IET has many target definitions serving many initiators. But proper access list definitions will improve security and stability in your iSCSI SAN. It is YOU controlling your environment and not your environment controlling you. This becomes especially important in production environments.

3) Cleaning a messed up ESX configuration

Maybe it is already too late when you read this and your configuration (iscsi-target / IETD) is already messed up. Your ESX host takes a long time to start up, and the iSCSI reconnect phase in particular takes up to 10 minutes.

The following steps will help you clean up your ESX host's configuration files and get rid of irrelevant or no longer existing entries, which are the root cause of the long delays during startup.

[1]

  • Configure access rules as shown in section 2
  • After completion your ESX host should only see the Targets assigned to it.

[2]

  • Log in to the Service Console (SSH) and check the following files:
/var/lib/iscsi/vmkbindings
/var/lib/iscsi/vmkdiscovery

A 'vmkbindings' could look as shown below:

# Format:
# bus   target  iSCSI
# id    id      TargetName
#
0       2       iqn.2007-09.com.company:ESX.Volume01     0

A 'vmkdiscovery' could look as shown below:

0       2       iqn.2007-09.com.company:ESX.Volume01

If 'vmkbindings' or 'vmkdiscovery' contains more entries than there are LUNs attached to your ESX host, especially targets/LUNs belonging to other systems (Windows, Linux, ...), this is most likely the cause of the long delays during startup.
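
With many entries, comparing the files against the list of targets the host should see is easier to do by script, for example on a copy of the files. The sketch below assumes the layout from the examples above (bus id, target id, then the target name as the third whitespace-separated field); stale_targets is a hypothetical helper name.

```python
def stale_targets(file_text, expected):
    """Return target names found in the file that are not in 'expected'."""
    stale = []
    for raw in file_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the format comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not an entry in the assumed layout
        name = fields[2]  # third column holds the iSCSI target name
        if name not in expected and name not in stale:
            stale.append(name)
    return stale
```

Any name returned by stale_targets(text, expected) for 'vmkbindings' or 'vmkdiscovery' is a candidate for removal in the cleanup steps.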


To complete the following configuration steps you will need the VMware VI Client as well as direct SSH root access.


[3] Put your ESX host into maintenance mode.

[4] Disable the iSCSI software initiator


ESX Host: Configuration->Storage Adapters

The iSCSI adapter is normally called vmhba32 but could be vmhba40 as well. In the details pane choose Properties, then Configure, and uncheck Enabled. Close all dialogs using OK. You will later use the reverse procedure to enable the iSCSI software initiator again.

[5] Clean 'vmkbindings' and 'vmkdiscovery'

  • Change directory to /var/lib/iscsi
  • Make a backup copy of the files (e.g. copy them with WinSCP to your computer)
  • Use vi-editor to clean the two files so they only contain targets which are really used by this ESX host. Delete all other lines except comments starting with # symbols
  • Delete all *.bak files in this directory
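
Instead of editing by hand in vi, the filtering rule from the list above (keep comment lines, keep entries for targets this host really uses, drop everything else) could be applied programmatically to a copy of each file. This is a sketch only: clean_bindings is a hypothetical name, it assumes the target name is the third whitespace-separated field as in the examples, and it also drops any non-comment line that does not fit that layout.

```python
def clean_bindings(file_text, expected):
    """Return the file body with only comments and expected targets kept."""
    kept = []
    for raw in file_text.splitlines():
        stripped = raw.strip()
        if stripped.startswith("#"):
            kept.append(raw)  # comments are preserved unchanged
            continue
        fields = stripped.split()
        if len(fields) >= 3 and fields[2] in expected:
            kept.append(raw)  # entry for a target this host really uses
    return "\n".join(kept) + "\n"
```

Run this only after the backup copy has been made, and review the result before writing it back to /var/lib/iscsi.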

[6] Reboot your ESX host

[7] Re-Enable the iSCSI software initiator

  • Go to the same dialog as in step 4 and mark the 'Enabled' check box.
  • Close all dialogs using OK.
  • Reboot your ESX host.

[8] Verify ESX iSCSI reconnection behavior

If you completed all of the above steps correctly, including section 2 (access lists), your ESX host should now start up much faster. Log in to the Service Console and check 'vmkbindings' and 'vmkdiscovery'. You should not see any new iSCSI entries that do not belong to this ESX host.

[9] Verify attached iSCSI datastores

  • Verify that you can still browse all the iSCSI datastores, just as you could before implementing the access lists.