
Improve documentation on how to create, associate and use runbooks with nodes via traits for cleaning and servicing operations. Change-Id: If99866ef8bfc200b430f17cff784cf96e916536d
21 KiB
Node cleaning
Overview
Ironic provides two modes for node cleaning: automated
and manual
.
Automated cleaning
is automatically performed before the
first workload has been assigned to a node and when hardware is recycled
from one workload to another.
Manual cleaning
must be invoked by the operator.
Automated cleaning
When hardware is recycled from one workload to another, ironic performs automated cleaning on the node to ensure it's ready for another workload. This ensures the tenant will get a consistent bare metal node deployed every time.
Ironic implements automated cleaning by collecting a list of cleaning
steps to perform on a node from the Power, Deploy, Management, BIOS, and
RAID interfaces of the driver assigned to the node. These steps are then
ordered by priority and executed on the node when the node is moved to
cleaning
state, if automated cleaning is enabled.
With automated cleaning, nodes move to cleaning
state
when moving from active
-> available
state
(when the hardware is recycled from one workload to another). Nodes also
traverse cleaning when going from manageable
->
available
state (before the first workload is assigned to
the nodes). For a full understanding of all state transitions into
cleaning, please see states
.
Ironic added support for automated cleaning in the Kilo release.
Enabling automated cleaning
To enable automated cleaning, ensure that your ironic.conf is set as follows:
[conductor]
automated_clean=true
This will enable the default set of cleaning steps, based on your hardware and ironic hardware types used for nodes. This includes, by default, erasing all of the previous tenant's data.
You may also need to configure a Cleaning Network.
Cleaning steps
Cleaning steps used for automated cleaning are ordered from higher to lower priority, where a larger integer is a higher priority. In case of a conflict between priorities across interfaces, the following resolution order is used: Power, Management, Deploy, BIOS, and RAID interfaces.
You can skip a cleaning step by setting the priority for that cleaning step to zero or 'None'.
You can reorder the cleaning steps by modifying the integer priorities of the cleaning steps.
See How do I change the priority of a cleaning step? for more information.
Storage cleaning options
Warning
Ironic's storage cleaning options by default will remove data from the disk permanently during automated cleaning.
Clean steps specific to storage are erase_devices
,
erase_devices_metadata
and (added in Yoga)
erase_devices_express
.
erase_devices
aims to ensure that the data is removed in
the most secure way available. On devices that support hardware-assisted
secure erasure (many NVMe and some ATA drives), this is the preferred
option. If hardware-assisted secure erasure is not available and if
:oslo.configdeploy.continue_if_disk_secure_erase_fails
is set
to True
, cleaning will fall back to using
shred
to overwrite the contents of the device. By default,
if erase_devices
is enabled and Ironic is unable to erase
the device, cleaning will fail to ensure data security.
Note
erase_devices
may take a very long time (hours or even
days) to complete, unless fast, hardware-assisted data erasure is
supported by all the devices in a system.
erase_devices_metadata
clean step doesn't provide as
strong assurance of irreversible destruction of data as
erase_devices
. However, it has the advantage of a
reasonably quick runtime (seconds to minutes). It operates by destroying
the metadata of the storage device without erasing every bit of the data
itself. Attempts to restore data after running
erase_devices_metadata
may be successful but would
certainly require relevant expertise and specialized tools.
Lastly, erase_devices_express
combines some of the perks
of both erase_devices
and
erase_devices_metadata
. It attempts to utilize
hardware-assisted data erasure features if available (currently only
NVMe devices are supported). In case hardware-assisted data erasure is
not available, it falls back to metadata erasure for the device (which
is identical to erase_devices_metadata
). It can be
considered a time-optimized mode of storage cleaning, aiming to perform
as thorough data erasure as it is possible within a short period of
time. This clean step is particularly well suited for environments with
hybrid NVMe-HDD storage configuration as it allows fast and secure
erasure of data stored on NVMes combined with equally fast but more
basic metadata-based erasure of data on commodity HDDs.
By default, Ironic will use erase_devices_metadata
early
in cleaning for reliability (ensuring a node cannot reboot into its old
workload) and erase_devices
later in cleaning to securely
erase the drive; erase_devices_express
is disabled.
Operators can use :oslo.configdeploy.erase_devices_priority
and :oslo.configdeploy.erase_devices_metadata_priority
to change
the priorities of the default device erase methods or disable them
entirely by setting 0
. Other cleaning steps can have their
priority modified via the :oslo.configconductor.clean_step_priority_override
option. For
example, the configuration snippet below disables
erase_devices_metadata
and erase_devices
and
instead performs an erase_devices_express
erase step.
[deploy]
erase_devices_priority=0
erase_devices_metadata_priority=0
[conductor]
clean_step_priority_override=deploy.erase_devices_express:95
This ensures that erase_devices
and
erase_devices_metadata
are disabled so that storage is not
cleaned twice and then assigns a non-zero priority to
erase_devices_express
, hence enabling it. Any non-zero
priority specified in the priority override will work; larger values
will cause the disk erasure to run earlier in the cleaning process if
multiple steps are enabled.
Other configurations that can modify how Ironic erases disks are below. This list may not be comprehensive. Please review ironic.conf.sample (linked) for more details:
- :oslo.config
deploy.enable_ata_secure_erase
, defaultTrue
- :oslo.config
deploy.enable_nvme_secure_erase
, defaultTrue
- :oslo.config
deploy.shred_random_overwrite_iterations
, default1
- :oslo.config
deploy.shred_final_overwrite_with_zeros
, defaultTrue
- :oslo.config
deploy.disk_erasure_concurrency
, default4
Warning
Ironic automated cleaning is defaulted to a secure configuration. You should not modify settings related to it unless you have special hardware needs or a unique use case. Misconfigurations can lead to data exposure vulnerabilities.
Manual cleaning
Manual cleaning
is typically used to handle
long-running, manual, or destructive tasks that an operator wishes to
perform either before the first workload has been assigned to a node or
between workloads. When initiating a manual clean, the operator
specifies the cleaning steps to be performed. Manual cleaning can only
be performed when a node is in the manageable
state. Once
the manual cleaning is finished, the node will be put in the
manageable
state again.
Ironic added support for manual cleaning in the 4.4 (Mitaka series) release.
Setup
In order for manual cleaning to work, you may need to configure a Cleaning Network.
Starting manual cleaning via API
Manual cleaning can only be performed when a node is in the
manageable
state. The REST API request to initiate it is
available in API version 1.15 and higher:
PUT /v1/nodes/<node_ident>/states/provision
(Additional information is available here.)
This API will allow operators to put a node directly into
cleaning
provision state from manageable
state
via 'target': 'clean'. The PUT will also require the argument
'clean_steps' to be specified. This is an ordered list of cleaning
steps. A cleaning step is represented by a dictionary (JSON), in the
form:
{
"interface": "<interface>",
"step": "<name of cleaning step>",
"args": {"<arg1>": "<value1>", ..., "<argn>": <valuen>}
}
The 'interface' and 'step' keys are required for all steps. If a cleaning step method takes keyword arguments, the 'args' key may be specified. It is a dictionary of keyword variable arguments, with each keyword-argument entry being <name>: <value>.
If any step is missing a required keyword argument, manual cleaning
will not be performed and the node will be put in
clean failed
provision state with an appropriate error
message.
If, during the cleaning process, a cleaning step determines that it
has incorrect keyword arguments, all earlier steps will be performed and
then the node will be put in clean failed
provision state
with an appropriate error message.
An example of the request body for this API:
{
"target":"clean",
"clean_steps": [{
"interface": "raid",
"step": "create_configuration",
"args": {"create_nonroot_volumes": false}
},
{
"interface": "deploy",
"step": "erase_devices"
}]
}
In the above example, the node's RAID interface would configure hardware RAID without non-root volumes, and then all devices would be erased (in that order).
Alternatively, you can specify a runbook instead of clean_steps:
{
"target":"clean",
"runbook": "<runbook_name_or_uuid>"
}
The specified runbook must match one of the node's traits to be used.
Starting manual cleaning via "openstack baremetal" CLI
Manual cleaning is available via the
baremetal node clean
command, starting with Bare Metal API
version 1.15.
The argument --clean-steps
must be specified. Its value
is one of:
- a JSON string
- path to a JSON file whose contents are passed to the API
- '-', to read from stdin. This allows piping in the clean steps. Using '-' to signify stdin is common in Unix utilities.
The following examples assume that the Bare Metal API version was set
via the OS_BAREMETAL_API_VERSION
environment variable. (The
alternative is to add --os-baremetal-api-version 1.15
to
the command.):
export OS_BAREMETAL_API_VERSION=1.15
Examples of doing this with a JSON string:
baremetal node clean <node> \
--clean-steps '[{"interface": "deploy", "step": "erase_devices_metadata"}]'
baremetal node clean <node> \
--clean-steps '[{"interface": "deploy", "step": "erase_devices"}]'
Or with a file:
baremetal node clean <node> \
--clean-steps my-clean-steps.txt
Or with stdin:
cat my-clean-steps.txt | baremetal node clean <node> \
--clean-steps -
Runbooks for Manual Cleaning
Instead of passing a list of clean steps, operators can now use runbooks. Runbooks are curated lists of steps that can be associated with nodes via traits which simplifies the process of performing consistent cleaning operations across similar nodes.
To use a runbook for manual cleaning:
baremetal node clean <node> --runbook <runbook_name_or_uuid>
Runbooks must be created and associated with nodes beforehand. Only
runbooks that match the node's traits can be used for cleaning that
node. For more information on the runbook API usage, see runbooks
.
Cleaning Network
If you are using the Neutron DHCP provider (the default) you will
also need to ensure you have configured a cleaning network. This network
will be used to boot the ramdisk for in-band cleaning. You can use the
same network as your tenant network. For steps to set up the cleaning
network, please see configure-cleaning
.
In-band vs out-of-band
Ironic uses two main methods to perform actions on a node: in-band and out-of-band. Ironic supports using both methods to clean a node.
In-band
In-band steps are performed by ironic making API calls to a ramdisk running on the node using a deploy interface. Currently, all the deploy interfaces support in-band cleaning. By default, ironic-python-agent ships with a minimal cleaning configuration, only erasing disks. However, you can add your own cleaning steps and/or override default cleaning steps with a custom Hardware Manager.
Out-of-band
Out-of-band are actions performed by your management controller, such as IPMI, iLO, or DRAC. Out-of-band steps will be performed by ironic using a power or management interface. Which steps are performed depends on the hardware type and hardware itself.
For Out-of-Band cleaning operations supported by iLO hardware types,
refer to ilo_node_cleaning
.
FAQ
How are cleaning steps ordered?
For automated cleaning, cleaning steps are ordered by integer priority, where a larger integer is a higher priority. In case of a conflict between priorities across hardware interfaces, the following resolution order is used:
- Power interface
- Management interface
- Deploy interface
- BIOS interface
- RAID interface
For manual cleaning, the cleaning steps should be specified in the desired order.
How do I skip a cleaning step?
For automated cleaning, cleaning steps with a priority of 0 or None are skipped.
How do I change the priority of a cleaning step?
For manual cleaning, specify the cleaning steps in the desired order.
For automated cleaning, it depends on whether the cleaning steps are out-of-band or in-band.
Most out-of-band cleaning steps have an explicit configuration option for priority.
Changing the priority of an in-band (ironic-python-agent) cleaning
step requires use of :oslo.configconductor.clean_step_priority_override
, a
configuration option that allows specifying the priority of each step
using multiple configuration values:
[conductor]
clean_step_priority_override=deploy.erase_devices_metadata:123
clean_step_priority_override=management.reset_bios_to_default:234
clean_step_priority_override=management.clean_priority_reset_ilo:345
This parameter can be specified as many times as required to define priorities for several cleaning steps - the values will be combined.
What cleaning step is running?
To check what cleaning step the node is performing or attempted to
perform and failed, run the following command; it will return the value
in the node's driver_internal_info
field:
baremetal node show $node_ident -f value -c driver_internal_info
The clean_steps
field will contain a list of all
remaining steps with their priorities, and the first one listed is the
step currently in progress or that the node failed before going into
clean failed
state.
Should I disable automated cleaning?
Automated cleaning is recommended for ironic deployments, however,
there are some tradeoffs to having it enabled. For instance, ironic
cannot deploy a new instance to a node that is currently cleaning, and
cleaning can be a time consuming process. To mitigate this, we suggest
using NVMe drives with support for NVMe Secure Erase (based on
nvme-cli
format command) or ATA drives with support for
cryptographic ATA Security Erase, as typically the erase_devices step in
the deploy interface takes the longest time to complete of all cleaning
steps.
Why can't I power on/off a node while it's cleaning?
During cleaning, nodes may be performing actions that shouldn't be interrupted, such as BIOS or Firmware updates. As a result, operators are forbidden from changing the power state via the ironic API while a node is cleaning.
Advanced topics
Parent Nodes
The concept of a parent_node
is where a node is
configured to have a "parent", and allows for actions upon the parent,
to in some cases take into account child nodes. Mainly, the concept of
executing clean steps in relation to child nodes.
In this context, a child node is primarily intended to be an embedded device with its own management controller. For example "SmartNIC's" or Data Processing Units (DPUs) which may have their own management controller and power control.
The relationship between a parent node and a child node is established on the child node. Example:
baremetal node set --parent-node <parent_node_uuid> <child_node_uuid>
Child Node Clean Step Execution
You can execute steps that perform actions on child nodes. For
example, turn them on (via step power_on
), off (via step
power_off
), or to signal a BMC-controlled reboot (via step
reboot
).
For example, if you need to explicitly power off child node power, before performing another step, you can articulate it with a step such as:
[{
"interface": "power",
"step": "power_off",
"execute_on_child_nodes": True,
"limit_child_node_execution": ['f96c8601-0a62-4e99-97d6-1e0d8daf6dce']
},
{
"interface": "deploy",
"step": "erase_devices"
}]
As one would imagine, this step will power off a singular child node,
as a limit has been expressed to a singular known node, and that child
node's power will be turned off via the management interface.
Afterwards, the erase_devices
step will be executed on the
parent node.
Note
While the deployment step framework also supports the
execute_on_child_nodes
and
limit_child_node_execution
parameters, all of the step
frameworks have a fundamental limitation in that child node step
execution is intended for synchronous actions which do not rely upon the
ironic-python-agent
running on any child nodes. This
constraint may be changed in the future.
Power Management with Child Nodes
The mix of child nodes and parent nodes has special power
considerations, and these devices are evolving in the industry. That
being said, the Ironic project has taken an approach of explicitly
attempting to "power on" any parent node when a request comes in to
"power on" a child node. This can be bypassed by setting a
driver_info
parameter
has_dedicated_power_supply
set to True
, in
recognition that some hardware vendors are working on supplying
independent power to these classes of devices to meet their customer use
cases.
Similarly to the case of a "power on" request for a child node, when
power is requested to be turned off for a "parent node", Ironic will
issue "power off" commands for all child nodes unless the child node has
the has_dedicated_power_supply
option set in the node's
driver_info
field.
Troubleshooting
If cleaning fails on a node, the node will be put into
clean failed
state. If the failure happens while running a
clean step, the node is also placed in maintenance mode to prevent
ironic from taking actions on the node. The operator should validate
that no permanent damage has been done to the node and that no processes
are still running on it before removing the maintenance mode.
Note
Older versions of ironic may put the node to maintenance even when no clean step has been running.
Nodes in clean failed
will not be powered off, as the
node might be in a state such that powering it off could damage the node
or remove useful information about the nature of the cleaning
failure.
A clean failed
node can be moved to
manageable
state, where it cannot be scheduled by nova and
you can safely attempt to fix the node. To move a node from
clean failed
to manageable
:
baremetal node manage $node_ident
You can now take actions on the node, such as replacing a bad disk drive.
Strategies for determining why a cleaning step failed include checking the ironic conductor logs, viewing logs on the still-running ironic-python-agent (if an in-band step failed), or performing general hardware troubleshooting on the node.
When the node is repaired, you can move the node back to
available
state, to allow it to be scheduled by nova.
# First, move it out of maintenance mode
baremetal node maintenance unset $node_ident
# Now, make the node available for scheduling by nova
baremetal node provide $node_ident
The node will begin automated cleaning from the start, and move to
available
state when complete.