Introduction to iSCSI & Getting Started Guide

This article offers an introduction to iSCSI through its terms and best practices. It shares tips, traps and pitfalls that new users may initially struggle with. The document is not intended as a guide on how to set up and connect an Initiator to any particular brand of NAS, SAN or software Target. Instead, it focuses on introducing terms and offering answers to questions that you may (or may not yet) have thought of.

 

What is iSCSI?

iSCSI is a low-level protocol that transmits storage commands (SCSI) to a remote disk across a standard IP network. It has no responsibility for disks, hard drives, SSDs, file systems or file permissions. Instead, it merely allows remote storage to appear as if it were local.

iSCSI allows storage from a remote ‘Target’ device (e.g. disk array) to appear as if it were directly connected to the local ‘Initiator’ device (e.g. Windows Server). Targets are usually disk arrays, while Initiators are usually servers; but not always.

To Windows, Linux or Unix, connected iSCSI storage appears no differently than the storage found on a directly connected SATA hard drive or a USB drive plugged into the computer's motherboard.

When the Initiator (the computer that is using the storage) wants to send a read or write request to the disk, a standardised, simplified version of the request that would ordinarily be handled by the disk controller is sent over the network. The Target (the disk array) then translates this into an actual command for the real disk controller. This process adds convenience and is a powerful tool for businesses. That convenience comes at the expense of a small reduction in performance due to the use of network equipment, TCP/IP and the translation of normal disk command instructions.

 

Targets

An iSCSI Target is the iSCSI server/service that runs on the disk array that you will connect to. This is akin to the controller chip on a USB drive which interprets requests between USB and the hard drive.

The target provides:

  • A “directory” of available Logical Units (LUNs) on the disk array.
  • The name of any available LUNs on the disk array. These are ‘Target IQNs’: alphanumeric, human-readable names, each of which identifies exactly one block of storage.
  • Access control services in the form of:
    • Password access controls (CHAP).
    • IQN based access controls – ensuring that only certain initiators can access certain targets.
    • Target IQN masking, ensuring that if an initiator does not have permission to access an IQN, that it cannot see it listed in the directory.
    • Permitting or denying more than one Initiator access to an IQN at the same time.
  • Data integrity services: for example, whether the initiator is allowed to request the use of write caching, or whether CRC/checksums must be generated for read/write operations.
  • Intermediation between Initiator, drivers and mounted file systems on the underlying disk array (not inside the LUN) so that the details of the connection to the LUN are transparent to the Initiator.
  • Statistics, diagnostics and logging services.

It is tempting to think of a Target as exposing access to a group of shares, akin to Windows File Sharing e.g. \\server\share. LUNs, however, are not ‘shares’, and the two terms should not be confused as they represent very different technologies.

 

Initiators

The initiator is the device that will make use of the disk space found on the target. In the USB drive analogy, it is the computer that you plug the USB disk into.

Initiators also have a named identifier called an ‘Initiator IQN’. This allows the Target to identify ‘who’ is attempting to connect. The connecting Initiator must obey all of the policies, rules and requirements of the Target if it is to be allowed to connect.

 

IQN | iSCSI Qualified Name

An IQN is a little like an extended hostname. IQNs are alphanumeric namespaces that provide a Uniform Resource Identifier (URI) for iSCSI resources. LUNs, Targets and Initiators all have IQNs.

Some examples include:

iqn.1991-05.com.microsoft:myserver.mydomain.org
iqn.2004-04.qnap.ts-453:iscsi.mylun.0c2324
iqn.2005-10.org.freenas.ctl:mylun

The difference between an Initiator IQN and a Target IQN is that the Target namespace is usually only seen in its extended form, which includes the name of a LUN.

IQN’s are used as part of the security and access control mechanism for iSCSI. Consequently they must always be unique.

IQN strings must be lower case and can usually only contain letters (a-z), numbers (0-9) and a small set of separators (hyphen, dot and colon). Spaces and other characters should not be used. It is important that you think carefully before naming your Target IQNs, as badly named IQNs can create confusion and risk data integrity in the future.
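As an illustration of these rules, the basic shape of an IQN can be sanity-checked with a simple pattern match. This is a sketch only, not a full validation of the specification, and the example IQN below is hypothetical:

```shell
# Rough shape check for an IQN: "iqn.", a year-month date code, a reversed
# domain name, then an optional colon-separated storage name. Lower case,
# letters, digits and separators only.
iqn="iqn.2004-04.com.example:nas001:financedb"   # hypothetical IQN

if echo "$iqn" | grep -Eq '^iqn\.[0-9]{4}-[0-9]{2}\.[a-z0-9.-]+(:[a-z0-9.:-]+)?$'; then
    echo "looks like a valid IQN"
else
    echo "does not look like a valid IQN"
fi
```

A string such as "My LUN 01" fails this check immediately, which is exactly the point of the convention.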

 

LUN | Logical Unit

A LUN is a logical allocation of disk space. LUNs are not ‘disks’* and they are not ‘shares’. They can comprise individual or groups of hard drives and SSDs, but they can also be other devices such as tape drives.

It is up to the disk array, NAS or SAN exposing the LUN to decide where and how the data is stored. LUNs can have a multitude of terminology applied to them such as ‘fixed size’, ‘dynamically expanding’ and ‘differencing’. These are all expressions of features and services that are exposed from the underlying operating system used on the disk array. They have nothing to do with iSCSI. iSCSI is the protocol used to access the data in the LUN via a compatible iSCSI Target Service. It does not manage the features of that storage.

* Very early iSCSI LUN implementations could physically map disks. In 2019 this is extremely uncommon, as most iSCSI LUNs constitute virtualised storage, not physically redirected devices.

 

MPIO | Multi-Path Input/Output

In iSCSI terms, MPIO is an optional extension to the iSCSI Service. It allows access to the same LUN between the Initiator and Target via more than one cable. MPIO can either be used to provide extra bandwidth for a connection, or to provide additional safety in the event of a fault in one of the cable paths.

Drivers

The MPIO service uses special device drivers called “Device Specific Modules” (DSMs). DSMs manage ‘how’ the Initiator should communicate with the Target. Most small business MPIO deployments will make use of the built-in Windows/Linux/Unix DSM. For enterprise deployments, most SAN vendors offer proprietary DSMs with very specific instructions on how they are to be used. These instructions may include requirements for cabling and network design that differ significantly from small business/test lab deployments.

If a specific DSM is required, it should be installed on all Initiators before attempting to connect to the Target.

Design

Connections using MPIO usually use Active/Active connections. This means that both cables are live at the same time, increasing bandwidth. This contrasts with an Active/Passive (fault tolerance) design, in which one link remains idle until the Active member enters a failure state.

You must not mix different Ethernet speeds or cabling (and, ideally, NICs) in an Active/Active design. It is technically possible to mix different types in an Active/Passive design; for example, a 10 Gigabit Ethernet Active connection backed by a 1 Gigabit Passive connection. In practice, as the Windows DSM is not designed to account for this, there is no way to ensure which link is in use at any given time. You will often find that the 1GbE NIC is being used while the 10GbE NIC sits idle. As a rule, therefore, you should not mix different NIC speeds at all when using MPIO.

If your MPIO system is properly designed and stable, it is possible to perform live disconnects on one of the MPIO links. Do remember that you are in effect disconnecting a SATA cable, and that there are inherent risks associated with this action. Once you reconnect the cable, the MPIO driver may wait several minutes to ensure that the link is stable before it begins using it again. Keep this in mind if you ever plan to disconnect both paths in short succession.

You can mix NICs in an MPIO design. If you want to use an Intel 1GbE NIC and a Broadcom 1GbE NIC, this design will work. Where possible it should be avoided, so as to simplify your design and ensure that both NICs are using the same settings. A mixed adapter design can be used to safeguard against driver crashes: should the Intel driver crash, the Broadcom one should continue working. This is however a very rare occurrence. Similarly, if you need to perform live NIC driver updates without bringing down the service, a mixed adapter design can also be beneficial.

Switches

If you are creating a fault tolerant design, it is entirely appropriate to run each of your Active iSCSI connections to a different physical switch. Using two switches protects against a switch failure and may permit on-line servicing of switch firmware. You should ensure that the switches are the same model/firmware and that the cables are the same length, to normalise the design. Most importantly, all Initiators and Targets should have an equal number of connections on each switch.

Most DSMs will require separate, wholly isolated VLANs and non-inter-routable IP subnets on each switch*. This ensures that there is no possible cross-talk between paths. Ensure, however, that you design to the requirements of your storage vendor and not to the “I know best” posters on Internet forums.

* Dell EqualLogic SAN solutions are a notable exception here

Number of Connections

Multi-Path drivers are capable of using more than 2 NICs in a design. In practice, you should never attempt to use more than 2 Active connections concurrently. 3rd, 4th and subsequent connections should remain Passive (i.e. offline until a failure).

DSM drivers are not designed to efficiently fragment and reconstruct data from more than 2 connections. The extra CPU and NIC processing, and delivery clock-skew (where the system has to wait for the expected data to arrive), cause significant latency (delay) on the iSCSI device. This can result in read and write speeds that are slower by up to 96%!

If you are interested in this topic, please see my in-depth exploration on it at the link below.

View: iSCSI MPIO Recommendations & Best Practice on Windows Server

Use in Teams/Link Aggregation and ‘Converged Fabric’ designs

This is very simple: Never, ever connect to an iSCSI Target using a teamed/aggregated connection. If you want 4x1GbE ports available to your Initiator, then 4, non-teamed 1GbE connections must be made available at both the Initiator and Target.

Using iSCSI as part of a team can reduce I/O performance by up to 60%. If you are interested in this topic, please see my in-depth exploration at the link below.

View: iSCSI MPIO Recommendations & Best Practice on Windows Server

 

Cabling

When you start using iSCSI, it is very important that you psychologically differentiate between ‘network connectivity’ and ‘iSCSI connectivity’. While they may share the same fabric (i.e. cabling and switches), they must no longer serve the same purpose.

  1. Use a dedicated, unambiguous colour for your iSCSI cabling.
  2. Stop thinking of it as a network cable and start thinking of it as being equivalent to a SATA cable.
    Would you ever pull a SATA cable out of an active hard drive? No? Would you ever attempt to use a SATA cable to browse YouTube? No?

    iSCSI should be treated no differently.

  3. Do not share iSCSI cabling with other services. There should be no Internet access via that cable. No file transfers should be allowed over it, and SMB, Samba, NFS, FTP etc. should all be disabled from the interface service binding.
    The best way to ensure this is to use dedicated NICs and cabling for your iSCSI connections. Ideally you should also use dedicated switches, but where this is not possible, dedicated VLANs must be used.
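On a Linux Initiator, one way to enforce this separation is to bind iSCSI sessions to a dedicated interface using open-iscsi, so that iSCSI traffic cannot drift onto general-purpose NICs. The interface names below are hypothetical, and the dry-run wrapper only prints the commands rather than executing them:

```shell
# Sketch: create an open-iscsi interface record bound to a dedicated NIC.
# DRYRUN=1 (the default here) prints each command instead of running it,
# so the sequence can be reviewed safely.
DRYRUN=${DRYRUN:-1}
run() { if [ "$DRYRUN" = "1" ]; then echo "would run: $*"; else "$@"; fi; }

# "iscsi-eth1" and "eth1" are hypothetical names for a dedicated iSCSI NIC.
run iscsiadm -m iface -I iscsi-eth1 --op=new
run iscsiadm -m iface -I iscsi-eth1 --op=update -n iface.net_ifacename -v eth1
```

Logins performed against that interface record will then only ever use the dedicated NIC.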

 

Removing iSCSI Drives

In the section above, I asked if you would ever pull a SATA cable out of an active hard drive. iSCSI drives are in essence internal drives. Windows, Linux and Unix will treat them as such.

When removing iSCSI storage, you must not:

  1. Disconnect the iSCSI cable(s)
  2. Open the iSCSI Initiator and force it to disconnect from the Target (Hint: this is exactly the same as disconnecting the cables)

Instead, you must think of it like a USB drive… except that the operating system sees it as an internal drive and will not present a USB-style “Safely Eject a Device” mechanism.

What you must do instead is completely stop all client traffic from flowing on the iSCSI cabling before you instruct the Initiator to disconnect. You do this by dismounting the iSCSI volume.

  • In Linux/Unix, use the umount command.
  • In Windows, you can use Disk Management, PowerShell or DiskPart to offline the Volume and Disk:
diskpart
rem identify the disk number of the iSCSI disk
list disk
rem replace # with the disk number identified above
select disk #
offline disk
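For comparison, a Linux Initiator using open-iscsi follows the same order of operations: dismount first, then log out of the Target. The mount point and Target IQN below are hypothetical, and the dry-run guard prints the commands so the sequence can be reviewed before anything is actually executed:

```shell
# Dismount-then-disconnect sequence for a Linux Initiator (open-iscsi).
# DRYRUN=1 (the default here) prints each command instead of running it.
DRYRUN=${DRYRUN:-1}
run() {
    if [ "$DRYRUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

MOUNTPOINT="/mnt/financedb"                        # hypothetical mount point
TARGET="iqn.2004-04.com.example:nas001:financedb"  # hypothetical Target IQN

run umount "$MOUNTPOINT"                     # 1. stop all client traffic first
run iscsiadm -m node -T "$TARGET" --logout   # 2. then disconnect the Initiator
```

Only once the logout completes cleanly is it safe to remove the cabling.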

Wait a few moments and ensure that the drive letter/mount point has been removed from the operating system. The only traffic on your iSCSI links will now be the iSCSI protocol itself. It is now safe to disconnect the iSCSI Initiator from the Target. Once that has been cleanly disconnected, you can safely remove the cabling.

The good news is that you only need to do this when you need to manually disconnect the Target. Your operating system will perform these tasks for you when you shut down or reboot the Initiator computer.

Important Note: Never reboot a Target unless all of the Initiators have been properly offlined and disconnected. Significant unrecoverable data loss can result if you do.

 

Naming Conventions

The naming of your iSCSI resources is easily overlooked. Time and time again, I have encountered poorly considered and badly designed iSCSI solutions. Data loss has occurred and often the root cause is confusion over naming.

It is vitally important that you are able to understand how your iSCSI project is configured when you return to it at a later time. It is even more important that the person coming in behind you can do the same!

To this end, ensure that you consider your naming conventions very carefully. You must consider and properly design naming standards for:

  1. Initiator hostnames
    Minimise character counts to improve IQN visibility e.g.
    FinSrv001
  2. Target hostnames
    If you are using consumer NAS devices, do they all appear with the same host IQN? If so, change them to be unique. Minimise character counts to improve IQN visibility e.g.
    NAS001
  3. LUN names on the storage array
    If you are planning on exposing the LUN with the Target name “iqn….financedb”, call the LUN on the SAN/NAS something similar. Avoid calling it something meaningless like “LUN0001”. Calling it “financedb” would be more sensible! A clear relationship is thus established between the Target and the LUN.
  4. Target naming for LUNs exposed via the iSCSI Target Service
    As above, ensure that the LUN and the Target IQN are related. Keep character length short and clear. I suggest mapping initiator hostname to task to help preserve your sanity e.g.
    finsrv001financedb (i.e. FinSrv001 + FinanceDb)
  5. Volume labels on iSCSI mounted storage
    Labelling your volumes is important for when you need to find them in DiskPart or Disk Management e.g.
    FinanceDb
    Keep names short, keep them the same as the Target IQN, and you will never accidentally offline the wrong volume!

If you follow this advice correctly, you will have a mounted disk labelled “FinanceDb” and a Target IQN of:

iqn.1994-11.com.netgear:nas001:b7203c2c:finsrv001financedb

i.e. it is:

  • It is a NetGear NAS
  • The NAS hostname is NAS001
  • The Target is intended only for use by FinSrv001
  • The Target is used for the Finance Database
  • The volume label is FinanceDb (not part of the IQN)
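The convention reduces to a simple string-building exercise. The sketch below uses a hypothetical vendor/host prefix rather than any real vendor's namespace:

```shell
# Build a Target name from the initiator hostname and its workload,
# following the initiator-to-task mapping suggested above.
initiator="finsrv001"                 # hypothetical Initiator hostname
task="financedb"                      # hypothetical workload name
base="iqn.2004-04.com.example:nas001" # hypothetical vendor/host prefix

target_iqn="${base}:${initiator}${task}"
echo "$target_iqn"    # prints: iqn.2004-04.com.example:nas001:finsrv001financedb
```

Anyone reading that IQN later can recover the host, the intended Initiator and the workload at a glance.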

 

iSNS | Internet Storage Name Service

The iSNS protocol was ported to iSCSI from the enterprise world of Fibre Channel (FC) storage. FC is a high cost, high speed, highly fault tolerant data centre technology that provides access to storage systems via specialist connections. iSCSI took elements of FC and applied them to low-cost, standardised and common Ethernet connectivity.

iSNS provides a management layer on top of iSCSI. The protocol and its supporting software allow iSCSI Initiators to be automatically reconfigured on the fly, using a centralised database. iSNS allows iSCSI deployments to become more fault tolerant, more efficient and easier to scale (grow).

Access control services offered by iSNS allow security to be applied centrally, preventing errors from occurring in per-target permissioning while allowing central validation and access logging.

iSNS is not a technology that first-time iSCSI users should look into.

 

Conclusion

This introduction to iSCSI has covered several key areas of terminology encountered by new iSCSI users, along with answers to questions that many new users will meet as they gain experience with remote storage.

iSCSI is not intended as an end user tool. As a result, if used incorrectly it can present considerable risk to data. It is very easy to cause catastrophic data loss through misuse of the technology. Approaching iSCSI correctly from the beginning is important for anyone pursuing a career in server or data centre technologies. Despite this, it is a highly robust and flexible tool at an administrator's disposal. I hope this guide has given you a few key pointers on your journey.