stacd.conf — stacd(8) configuration file
/etc/stas/stacd.conf
stacd.conf is a plain text file divided into sections, with configuration entries in the style key=value. A space immediately before or after the "=" is ignored. Empty lines and lines starting with "#" are ignored, which may be used for commenting.
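As an illustration of the syntax rules above, a minimal sketch of a stacd.conf file might look as follows. The option values are arbitrary and chosen only to show the layout.

```ini
# Lines starting with "#" are comments and are ignored.

[Global]
# A space on either side of "=" is ignored, so both
# entries below use equivalent syntax.
tron=false
ip-family = ipv4+ipv6
```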
The following options are available in the "[Global]" section:
tron=
Trace ON. Takes a boolean argument. If "true", enables full code tracing. The trace is written to the system log, such as systemd's journal. Defaults to "false".
hdr-digest=
Enable Protocol Data Unit (PDU) Header Digest. Takes a boolean argument. NVMe/TCP supports an optional PDU Header Digest. Digests are calculated using the CRC32C algorithm. If "true", Header Digests are inserted in PDUs and checked for errors. Defaults to "false".
data-digest=
Enable Protocol Data Unit (PDU) Data Digest. Takes a boolean argument. NVMe/TCP supports an optional PDU Data Digest. Digests are calculated using the CRC32C algorithm. If "true", Data Digests are inserted in PDUs and checked for errors. Defaults to "false".
kato=
Keep Alive Timeout (KATO) in seconds. Takes an unsigned integer. Defaults to 30 seconds for Discovery Controller connections and 120 seconds for I/O Controller connections.
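For example, a "[Global]" section that enables both digests and overrides the Keep Alive Timeout could be sketched as below. The values are illustrative assumptions, not recommendations.

```ini
[Global]
tron=false
hdr-digest=true
data-digest=true
# Illustrative value. When "kato=" is omitted, the defaults
# described above apply.
kato=60
```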
ip-family=
Takes a string argument. Specifies whether IPv4, IPv6, or both are supported when connecting to a Controller. Connections will not be attempted to IP addresses (whether discovered or manually configured with the "controller=" option) if those IP addresses are disabled by this option. If an invalid value is entered, "ipv4+ipv6" is used.
Choices are "ipv4", "ipv6", or "ipv4+ipv6". Defaults to "ipv4+ipv6".
ignore-iface=
Takes a boolean argument. This option controls how connections with I/O Controllers (IOC) are made.
There is no guarantee that the routing tables contain a route to reach an IOC. However, the socket option SO_BINDTODEVICE can be used to force the connection to be made on a specific interface instead of letting the routing tables decide where to make the connection.
This option determines whether stacd will use SO_BINDTODEVICE to force connections on an interface or just rely on the routing tables. The default is to use SO_BINDTODEVICE, in other words, stacd does not ignore the interface.
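As a sketch, a host whose routing tables are known to provide the correct path to the IOCs could opt out of SO_BINDTODEVICE as follows. Whether this suits a given network is an assumption to verify locally.

```ini
[Global]
# Rely on the routing tables instead of forcing connections
# onto a specific interface with SO_BINDTODEVICE.
ignore-iface=true
```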
BACKGROUND:
By default, stacd will connect to IOCs on the same interface that was used to retrieve the discovery log pages. If stafd discovers a Discovery Controller (DC) on an interface using mDNS, and stafd connects to that DC and retrieves the log pages, it is expected that the storage subsystems listed in the log pages are reachable on the same interface where the DC was discovered.
For example, let's say a DC is discovered on interface ens102. Then all the subsystems listed in the log pages retrieved from that DC must be reachable on interface ens102. If they are not (for example, "ping -I ens102 [storage-ip]" fails), the most likely explanation is that ARP proxy is not enabled on the switch that the host is connected to on interface ens102. Resist the temptation to manually set up the routing tables or to add alternate routes going over a different interface than the one where the DC is located. That will not work. Make sure ARP proxy is enabled on the switch first.
Setting routes won't work because, by default, stacd uses the SO_BINDTODEVICE socket option when it connects to IOCs. This option is used to force a socket connection to be made on a specific interface instead of letting the routing tables decide where to connect the socket. Even if you were to manually configure an alternate route on a different interface, the connections (i.e. host to IOC) will still be made on the interface where the DC was discovered by stafd.
Defaults to "false".
udev-rule=
Takes a string argument "enabled" or "disabled". This option determines whether nvme-cli's udev rule will be executed or ignored.
A udev rule gets installed with nvme-cli that tells the udev daemon (udevd) to look for Asynchronous Event Notifications (AEN) indicating a change of Discovery Log Page Entries (DLPE). The udev rule is installed as: /usr/lib/udev/rules.d/70-nvmf-autoconnect.rules
When an AEN is detected, udevd simply instructs systemd to start a one-shot service that will retrieve the changed DLPEs and connect to all the I/O Controllers (IOC) listed in the DLPEs. This is basically the same as performing nvme-cli's "connect-all" command.
Unfortunately, stafd and stacd also perform the same operations when an AEN is received. This results in a race condition between udevd and stafd/stacd.
This is not really a problem. stafd and stacd are designed to handle this type of race condition and will conclude, eventually, that the connections succeeded. The only downside is that there may be error messages printed to the syslog when the race condition happens. These messages are printed by the kernel because two processes are trying to connect to the same IOC at the same time. One of them will be rejected by the kernel, but the other will succeed.
The udev-rule option allows a user to disable nvme-cli's udev rule so that udevd will not act on received AENs. Instead, only stafd/stacd will be allowed to react to AENs and set up IOC connections.
Defaults to "enabled", which means that both udevd and stafd/stacd will react to AENs. It also means that the race condition will happen by default and error messages will be printed to the syslog.
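To avoid the race condition and the resulting kernel messages, the udev rule can be disabled so that only stafd/stacd react to AENs. A minimal sketch:

```ini
[Global]
# Let stafd/stacd alone set up IOC connections when an AEN
# is received. udevd will not act on the AEN.
udev-rule=disabled
```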
sticky-connections=
Keep existing connections to I/O controllers (IOC). Takes a string argument "enabled" or "disabled".
The parameter sticky-connections determines how stacd reacts to the removal of an IOC Discovery Log Page Entry (DLPE) or the removal of a "controller=" entry in /etc/stas/stacd.conf. In other words, whether it should immediately disconnect from an IOC when the DLPE or "controller=" entry is removed, or whether it should maintain the connection.
Table 1. List of terms used in the following text:
Term | Description |
---|---|
Manual Config | Refers to manually adding entries to stacd.conf |
Automatic Config | Refers to receiving configuration from a Discovery Controller (DC) as DLPEs |
External Config | Refers to configuration done outside of the nvme-stas framework, for example using nvme-cli commands |
IOC connection creation. There are 3 ways to configure IOC connections on a host:
1. Manual Config by adding "controller=" entries to the "[Controllers]" section (see below).
2. Automatic Config received in the form of DLPEs from a remote DC.
3. External Config using nvme-cli (e.g. "nvme connect").
Zoning and DLPEs. Zoning configuration is performed at Discovery Controllers (DC). A zone contains a list of hosts and the IOCs that these hosts are allowed to access. Users can add or remove IOCs and/or hosts from zones.
DCs notify hosts of zoning configuration changes by sending Asynchronous Event Notifications (AEN) indicating a "Change of Discovery Log Page (DLP)". The host uses these AENs as a trigger to retrieve the new list of DLPEs by issuing a Get DLP command. This happens in real time, which means that a host that was previously connected to an IOC may suddenly be told that it is no longer allowed to connect to that IOC and should disconnect from it.
IOC connection removal. There are 3 ways to remove controller connections to an IOC:
1. Manual Config:
   - by adding "blacklist=" entries to the "[Controllers]" section (see below).
   - by removing "controller=" entries from the "[Controllers]" section.
2. Automatic Config. As explained above, changing zoning at a DC will result in the host getting a new list of DLPEs. On DLPE removal, the host should remove the connection to the IOC matching that DLPE.
3. External Config using nvme-cli (e.g. "nvme disconnect" or "nvme disconnect-all").
Some users may prefer IOC connections to be "sticky" and only be removed manually (nvme-cli or "blacklist=") or by a system reboot. They do not want IOC connections to be removed unexpectedly on DLPE removal. This is where sticky-connections= comes into play.
sticky-connections= tells stacd whether to keep connections to IOCs even if their DLPEs have been removed or the "controller=" entries in stacd.conf have been removed.
With sticky-connections=disabled.
stacd immediately disconnects from a previously connected IOC if the response to a Get DLP command no longer contains a DLPE matching that IOC or a "controller=" entry in stacd.conf is removed.
Ongoing I/O transactions will be terminated immediately as well. There is no way to tell what happens to the data being exchanged when such an abrupt termination happens. If a host was in the middle of writing to a storage subsystem, there is a good chance that incomplete and potentially corrupt data will be left on the remote storage.
NOTE. This mode implies that nvme-stas will only allow Manually Configured or Automatically Configured IOC connections to exist. Externally Configured connections using nvme-cli that do not match any Manual Config (stacd.conf) or Automatic Config (DLPEs) will get deleted immediately by stacd.
With sticky-connections=enabled (default).
stacd does not disconnect from IOCs when a DLPE is removed or a "controller=" entry is removed from stacd.conf. Instead, users can issue the nvme-cli command "nvme disconnect", add a "blacklist=" entry to stacd.conf, or wait until the next system reboot, at which time all connections will be removed.
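As an illustration, a host that should follow zoning changes strictly, accepting the risk of abrupt disconnects described above, might use the following sketch:

```ini
[Global]
# Disconnect from an IOC as soon as its DLPE or "controller="
# entry is removed. Connections created with nvme-cli that
# match neither are also removed.
sticky-connections=disabled
```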
The following options are available in the "[Controllers]" section:
controller=
Controllers are specified with the "controller" option. This option may be specified more than once to specify more than one controller. The format is one line per Controller composed of a series of fields separated by semi-colons as follows:
controller=transport=[trtype];traddr=[traddr];trsvcid=[trsvcid];host-traddr=[traddr];host-iface=[iface];nqn=[nqn]
transport=
This is a mandatory field that specifies the network fabric being used for an NVMe-over-Fabrics network. Current "trtype" values understood are:
Table 2. Transport type
trtype | Definition |
---|---|
rdma | The network fabric is an RDMA network (RoCE, iWARP, InfiniBand, basic RDMA, etc.) |
fc | The network fabric is a Fibre Channel network. |
tcp | The network fabric is a TCP/IP network. |
loop | Connect to an NVMe over Fabrics target on the local host |
traddr=
This is a mandatory field that specifies the network address of the Controller. For transports using IP addressing (e.g. rdma, tcp) this should be an IP-based address (e.g. IPv4, IPv6). It could also be a resolvable host name (e.g. localhost).
trsvcid=
This is an optional field that specifies the transport service id. For transports using IP addressing (e.g. rdma, tcp) this field is the port number.
Depending on the transport type, this field will default to either 8009 or 4420 as follows.
UDP port 4420 and TCP port 4420 have been assigned by IANA for use by NVMe over Fabrics. NVMe/RoCEv2 controllers use UDP port 4420 by default. NVMe/iWARP controllers use TCP port 4420 by default.
TCP port 4420 has been assigned for use by NVMe over Fabrics and TCP port 8009 has been assigned by IANA for use by NVMe over Fabrics discovery. TCP port 8009 is the default TCP port for NVMe/TCP discovery controllers. There is no default TCP port for NVMe/TCP I/O controllers; the Transport Service Identifier (TRSVCID) field in the Discovery Log Entry indicates the TCP port to use.
The TCP ports that may be used for NVMe/TCP I/O controllers include TCP port 4420, and the Dynamic and/or Private TCP ports (i.e., ports in the TCP port number range from 49152 to 65535). NVMe/TCP I/O controllers should not use TCP port 8009. TCP port 4420 shall not be used for both NVMe/iWARP and NVMe/TCP at the same IP address on the same network.
nqn=
This is an optional field that specifies the Discovery Controller's NVMe Qualified Name. If not specified, this will default to the well-known DC NQN: "nqn.2014-08.org.nvmexpress.discovery".
host-traddr=
This is an optional field that specifies the network address used on the host to connect to the Controller. For TCP, this sets the source address on the socket.
host-iface=
This is an optional field that specifies the network interface used on the host to connect to the Controller (e.g. eth1, enp2s0, enx78e7d1ea46da). This forces the connection to be made on a specific interface instead of letting the system decide.
Examples:
controller = transport=tcp;traddr=localhost;trsvcid=8009
controller = transport=tcp;traddr=[2001:db8::370:7334];host-iface=enp0s8
controller = transport=fc;traddr=nn-0x204600a098cbcac6:pn-0x204700a098cbcac6
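A further sketch combines every field described above in a single entry. The addresses (taken from the documentation range 192.0.2.0/24), the interface name, and the NQN are hypothetical placeholders.

```ini
[Controllers]
# All values below are placeholders chosen for illustration.
controller = transport=tcp;traddr=192.0.2.10;trsvcid=4420;host-traddr=192.0.2.20;host-iface=enp0s8;nqn=nqn.2014-08.com.example:subsys1
```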
blacklist=
Blacklisted controllers can be specified with the "blacklist" option. Using mDNS to automatically discover and connect to controllers can result in unintentional connections being made. This keyword allows configuring the controllers that should not be connected to (whatever the reason may be).
The syntax is the same as for "controller", except that the key "host-traddr" does not apply. Multiple "blacklist" keywords may appear in the config file to specify more than one blacklisted controller.
Note 1: A minimal match approach is used to eliminate unwanted controllers. That is, you do not need to specify all the parameters to identify a controller. Just specifying the "host-iface", for example, can be used to blacklist all controllers on an interface.
Note 2: "blacklist" takes precedence over "controller". A controller specified by the "controller" keyword can be eliminated by the "blacklist" keyword.
Examples:
blacklist = transport=tcp;traddr=fe80::2c6e:dee7:857:26bb # Eliminate a specific address
blacklist = host-iface=enp0s8 # Eliminate everything on this interface
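Note 2 can be illustrated with a sketch in which a manually configured controller is overridden by a blacklist entry. The address is a hypothetical placeholder from the documentation range 192.0.2.0/24.

```ini
[Controllers]
controller = transport=tcp;traddr=192.0.2.10;trsvcid=4420
# Minimal match on transport and address. This eliminates the
# controller above, because "blacklist" takes precedence.
blacklist = transport=tcp;traddr=192.0.2.10
```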