From 995b538e6ef98b06d2d1e7eb652afb8b5cc64977 Mon Sep 17 00:00:00 2001
From: Mark Goddard <mark@stackhpc.com>
Date: Mon, 23 Jul 2018 09:21:28 +0100
Subject: [PATCH] Deploy steps documentation

Change-Id: Ia6d5336ee074ac5f44226c09c3b3c239f0e50162
Story: #1753128
Task: #22592
---
 doc/source/admin/cleaning.rst                 | 11 ++--
 doc/source/admin/deploy-steps.rst             | 60 +++++++++++++++++++
 doc/source/admin/index.rst                    |  1 +
 .../contributor/code-contribution-guide.rst   |  2 +
 doc/source/contributor/drivers.rst            |  9 +++
 5 files changed, 78 insertions(+), 5 deletions(-)
 create mode 100644 doc/source/admin/deploy-steps.rst

diff --git a/doc/source/admin/cleaning.rst b/doc/source/admin/cleaning.rst
index 85d5fde3e9..8ef672b562 100644
--- a/doc/source/admin/cleaning.rst
+++ b/doc/source/admin/cleaning.rst
@@ -25,10 +25,10 @@ automated cleaning on the node to ensure it's ready for another workload. This
 ensures the tenant will get a consistent bare metal node deployed every time.
 
 Ironic implements automated cleaning by collecting a list of cleaning steps
-to perform on a node from the Power, Deploy, Management, and RAID interfaces
-of the driver assigned to the node. These steps are then ordered by priority
-and executed on the node when the node is moved
-to ``cleaning`` state, if automated cleaning is enabled.
+to perform on a node from the Power, Deploy, Management, BIOS, and RAID
+interfaces of the driver assigned to the node. These steps are then ordered by
+priority and executed on the node when the node is moved to ``cleaning`` state,
+if automated cleaning is enabled.
 
 With automated cleaning, nodes move to ``cleaning`` state when moving from
 ``active`` -> ``available`` state (when the hardware is recycled from one
@@ -63,7 +63,7 @@ Cleaning steps
 Cleaning steps used for automated cleaning are ordered from higher to lower
 priority, where a larger integer is a higher priority. In case of a conflict
 between priorities across interfaces, the following resolution order is used:
-Power, Management, Deploy, and RAID interfaces.
+Power, Management, Deploy, BIOS, and RAID interfaces.
 
 You can skip a cleaning step by setting the priority for that cleaning step
 to zero or 'None'.
@@ -236,6 +236,7 @@ across hardware interfaces, the following resolution order is used:
 #. Power interface
 #. Management interface
 #. Deploy interface
+#. BIOS interface
 #. RAID interface
 
 For manual cleaning, the cleaning steps should be specified in the desired
diff --git a/doc/source/admin/deploy-steps.rst b/doc/source/admin/deploy-steps.rst
new file mode 100644
index 0000000000..22bd87b98d
--- /dev/null
+++ b/doc/source/admin/deploy-steps.rst
@@ -0,0 +1,60 @@
+============
+Deploy steps
+============
+
+Overview
+========
+
+Node deployment is performed by the Bare Metal service to prepare a node for
+use by a workload.  The exact workflow used depends on a number of factors,
+including the hardware type and interfaces assigned to a node.
+
+Customizing deployment
+======================
+
+The Bare Metal service implements deployment by collecting a list of deploy
+steps to perform on a node from the Power, Deploy, Management, BIOS, and RAID
+interfaces of the driver assigned to the node. These steps are then ordered by
+priority and executed on the node when the node is moved to the ``deploying``
+state.
+
+Nodes move to the ``deploying`` state when attempting to move to the ``active``
+state (when the hardware is prepared for use by a workload).  For a full
+understanding of all state transitions into deployment, please see
+:ref:`states`.
+
+The Bare Metal service added support for deploy steps in the Rocky release.
+
+Deploy steps
+------------
+
+Deploy steps are ordered from higher to lower priority, where a larger integer
+is a higher priority. If the same priority is used by deploy steps on different
+interfaces, the following resolution order is used: Power, Management, Deploy,
+BIOS, and RAID interfaces.
+
+FAQ
+===
+
+What deploy step is running?
+----------------------------
+To check which deploy step a node is performing, or which step it attempted
+before failing, run the following command; it returns the contents of the
+node's ``driver_internal_info`` field::
+
+    openstack baremetal node show $node_ident -f value -c driver_internal_info
+
+The ``deploy_steps`` field will contain a list of all remaining steps with
+their priorities.  The first step listed is the one currently in progress, or
+the one that failed before the node entered the ``deploy failed`` state.
+
+Troubleshooting
+===============
+If deployment fails on a node, the node will be put into the ``deploy failed``
+state until the node is deprovisioned.  A deprovisioned node is moved to the
+``available`` state after the cleaning process has been performed successfully.
+
+Strategies for determining why a deploy step failed include checking the ironic
+conductor logs, checking logs from the ironic-python-agent that have been
+stored on the ironic conductor, or performing general hardware troubleshooting
+on the node.
diff --git a/doc/source/admin/index.rst b/doc/source/admin/index.rst
index 42ccfd7ba9..b3ac27151d 100644
--- a/doc/source/admin/index.rst
+++ b/doc/source/admin/index.rst
@@ -11,6 +11,7 @@ the services.
    Drivers, Hardware Types and Hardware Interfaces <drivers>
    Ironic Python Agent <drivers/ipa>
    Node Hardware Inspection <inspection>
+   Deploy steps <deploy-steps>
    Node Cleaning <cleaning>
    Node Adoption <adoption>
    RAID Configuration <raid>
diff --git a/doc/source/contributor/code-contribution-guide.rst b/doc/source/contributor/code-contribution-guide.rst
index 4026a0830d..1a7c50914b 100644
--- a/doc/source/contributor/code-contribution-guide.rst
+++ b/doc/source/contributor/code-contribution-guide.rst
@@ -232,6 +232,8 @@ Here is the list of existing common and agent driver attributes:
 
   * ``is_whole_disk_image``: A Boolean value to indicate whether the user image contains ramdisk/kernel.
   * ``clean_steps``: An ordered list of clean steps that will be performed on the node.
+  * ``deploy_steps``: An ordered list of deploy steps that will be performed on the node. Support for
+    deploy steps was added in the ``11.1.0`` release.
   * ``instance``: A list of dictionaries containing the disk layout values.
   * ``root_uuid_or_disk_id``: A String value of the bare metal node's root partition uuid or disk id.
   * ``persistent_boot_device``: A String value of device from ``ironic.common.boot_devices``.
diff --git a/doc/source/contributor/drivers.rst b/doc/source/contributor/drivers.rst
index 519a58a7b2..7c9caa1b41 100644
--- a/doc/source/contributor/drivers.rst
+++ b/doc/source/contributor/drivers.rst
@@ -49,6 +49,15 @@ The minimum required interfaces are:
   A few common implementations are provided by the ``GenericHardware`` base
   class.
 
+  As of the Rocky release, a deploy interface should decorate its deploy method
+  to indicate that it is a deploy step. Conventionally, the deploy method uses
+  a priority of 100.
+
+  .. code-block:: python
+
+     @ironic.drivers.base.deploy_step(priority=100)
+     def deploy(self, task):
+
   .. note::
     Most of the hardware types should not override this interface.