Remote Redfish Subcloud Restore
Fixed Merge conflicts
Fixed review comments for patchset 8
Fixed review comments for patchset 7
Fixed review comments for Patchset 4
Moved restoring-subclouds-from-backupdata-using-dcmanager to the Distributed Cloud Guide

Story: 2008573
Task: 42332

Change-Id: Ife0319125df38c54fb0baa79ac32070446a0d605
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
parent 7230189e63
commit e2e42814e6

doc/source/backup/.vscode/settings.json (new file, vendored, +3 lines)

@@ -0,0 +1,3 @@
{
    "restructuredtext.confPath": ""
}
@@ -28,25 +28,35 @@ specific applications must be re-applied once a storage cluster is configured.

 To restore the data, use the same version of the boot image \(ISO\) that
 was used at the time of the original installation.

-The |prod| restore supports two modes:
+The |prod| restore supports the following optional modes:

 .. _restoring-starlingx-system-data-and-storage-ol-tw4-kvc-4jb:

-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following syntax, when passing the extra arguments to the Ansible Restore
+-  To keep the Ceph cluster data intact \(false - default option\), use the
+   following parameter, when passing the extra arguments to the Ansible Restore
    playbook command:

    .. code-block:: none

       wipe_ceph_osds=false

-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following syntax:
+-  To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+   need to be recreated, use the following parameter:

    .. code-block:: none

       wipe_ceph_osds=true

+-  To indicate that the backup data file is under /opt/platform-backup
+   directory on the local machine, use the following parameter:
+
+   .. code-block:: none
+
+      on_box_data=true
+
+   If this parameter is set to **false**, the Ansible Restore playbook expects
+   both the **initial_backup_dir** and **backup_filename** to be specified.
+
 Restoring a |prod| cluster from a backup file is done by re-installing the
 ISO on controller-0, running the Ansible Restore Playbook, applying updates
 \(patches\), unlocking controller-0, and then powering on, and unlocking the
@@ -18,22 +18,20 @@ following command to run the Ansible Restore playbook:

    ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"

-The |prod| restore supports two optional modes, keeping the Ceph cluster data
-intact or wiping the Ceph cluster.
-
-.. rubric:: |proc|
+The |prod| restore supports the following optional modes, keeping the Ceph
+cluster data intact or wiping the Ceph cluster.

 .. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:

-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following command:
+-  To keep the Ceph cluster data intact \(false - default option\), use the
+   following parameter:

    .. code-block:: none

       wipe_ceph_osds=false

-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following command:
+-  To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+   need to be recreated, use the following parameter:

    .. code-block:: none
@@ -50,12 +48,23 @@ intact or wiping the Ceph cluster.

    the patches and prompt you to reboot the system. Then you will need to
    re-run Ansible Restore playbook.

+-  To indicate that the backup data file is under /opt/platform-backup
+   directory on the local machine, use the following parameter:
+
+   .. code-block:: none
+
+      on_box_data=true
+
+   If this parameter is set to **false**, the Ansible Restore playbook expects
+   both the **initial_backup_dir** and **backup_filename** to be specified.
+
 .. rubric:: |postreq|

 After running restore\_platform.yml playbook, you can restore the local
 registry images.

 .. note::

    The backup file of the local registry images may be large. Restore the
    backed up file on the controller, where there is sufficient space.
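As a worked illustration of the parameters above, a local restore run that keeps
the Ceph data intact and reads a backup already present under /opt/platform-backup
could be invoked as follows; the password and backup file name values are
placeholders:

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=false on_box_data=true"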
@@ -51,18 +51,27 @@ In this method you can run Ansible Restore playbook and point to controller-0.

 where optional-extra-vars can be:

--  **Optional**: You can select one of the two restore modes:
+-  **Optional**: You can select one of the following restore modes:

    -  To keep Ceph data intact \(false - default option\), use the
-      following syntax:
+      following parameter:

       :command:`wipe_ceph_osds=false`

-   -  Start with an empty Ceph cluster \(true\), to recreate a new
-      Ceph cluster, use the following syntax:
+   -  To start with an empty Ceph cluster \(true\), where the Ceph
+      cluster will need to be recreated, use the following parameter:

       :command:`wipe_ceph_osds=true`

+-  To indicate that the backup data file is under /opt/platform-backup
+   directory on the local machine, use the following parameter:
+
+   :command:`on_box_data=true`
+
+   If this parameter is set to **false**, the Ansible Restore playbook
+   expects both the **initial_backup_dir** and **backup_filename**
+   to be specified.
+
 -  The backup\_filename is the platform backup tar file. It must be
    provided using the ``-e`` option on the command line, for example:
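A sketch of such an ``-e`` argument string, with placeholder values for the
backup location and file name, could be:

.. code-block:: none

   -e "initial_backup_dir=<location_of_tarball> backup_filename=<backup_filename> on_box_data=false wipe_ceph_osds=false"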
@@ -49,6 +49,7 @@ Operation

    changing-the-admin-password-on-distributed-cloud
    updating-docker-registry-credentials-on-a-subcloud
    migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
+   restoring-subclouds-from-backupdata-using-dcmanager

 ----------------------------------------------------------
 Kubernetes Version Upgrade Distributed Cloud Orchestration
@@ -0,0 +1,113 @@

.. _restoring-subclouds-from-backupdata-using-dcmanager:

=========================================================
Restoring a Subcloud From Backup Data Using DCManager CLI
=========================================================

For subclouds with servers that support Redfish Virtual Media Service
(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
subcloud from data that was backed up previously.

.. rubric:: |context|

The CLI command :command:`dcmanager subcloud restore` can be used to restore a
subcloud from available system data and bring it back to the operational state
it was in when the backup procedure took place. The subcloud restore has three
phases:

-  Re-install controller-0 of the subcloud with the current active load
   running in the SystemController. For subcloud servers that support
   Redfish Virtual Media Service, this phase can be carried out remotely
   as part of the CLI.

-  Run the Ansible Platform Restore to restore |prod| from a previous backup on
   controller-0 of the subcloud. This phase is also carried out as part
   of the CLI.

-  Unlock controller-0 of the subcloud and continue with the steps to
   restore the remaining nodes of the subcloud where applicable. This phase
   is carried out by the system administrator; see :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`.

.. rubric:: |prereq|

-  The SystemController is healthy and ready to accept **dcmanager** related
   commands.

-  The subcloud is unmanaged, and not in the process of installation,
   bootstrap, or deployment.

-  The platform backup tar file is already in the /opt/platform-backup
   directory on the subcloud, or has been transferred to the
   SystemController.

-  The subcloud install values have been saved in the **dcmanager** database,
   i.e., the subcloud has been installed remotely as part of
   :command:`dcmanager subcloud add`.

.. rubric:: |proc|

#. Create the restore_values.yaml file that will be passed to the
   :command:`dcmanager subcloud restore` command using the ``--restore-values``
   option. This file contains parameters that will be used during the platform
   restore phase. At a minimum, the **backup_filename** parameter, indicating
   the file containing a previous backup of the subcloud, must be specified in
   the yaml file. See :ref:`Run Ansible Restore Playbook Remotely <system-backup-running-ansible-restore-playbook-remotely>`
   and :ref:`Run Restore Playbook Locally on the Controller <running-restore-playbook-locally-on-the-controller>`
   for the supported restore parameters.
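   As a minimal illustrative sketch, a restore values file could look like the
   following; the file name, the backup file name, and the optional parameters
   shown here are placeholders based on the restore topics referenced above:

   .. code-block:: yaml

      # Example content of /home/sysadmin/subcloud1-restore.yaml
      backup_filename: <subcloud1_platform_backup>.tgz
      # Optional parameters described in the referenced restore topics:
      on_box_data: true
      wipe_ceph_osds: false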
#. Restore the subcloud using the :command:`dcmanager subcloud restore` command,
   specifying the restore values, the ``--with-install`` option, and the
   subcloud's sysadmin password.

   .. code-block:: none

      ~(keystone_admin)]$ dcmanager subcloud restore --restore-values /home/sysadmin/subcloud1-restore.yaml --with-install --sysadmin-password <sysadmin_password> <subcloud-name-or-id>

   Where:

   -  ``--restore-values`` must reference the restore values yaml file
      mentioned in Step 1 of this procedure.

   -  ``--with-install`` indicates that a re-install of controller-0 of the
      subcloud should be done remotely using Redfish Virtual Media Service.

   If the ``--sysadmin-password`` option is not specified, the system
   administrator is prompted for the password, which is masked as it is
   entered. Enter the sysadmin password for the subcloud.
   The :command:`dcmanager subcloud restore` command can take up to 30 minutes
   to reinstall and restore the platform on controller-0 of the subcloud.

#. On the Central Cloud (SystemController), monitor the progress of the
   subcloud reinstall and restore via the deploy status field of the
   :command:`dcmanager subcloud list` command.

   .. code-block:: none

      ~(keystone_admin)]$ dcmanager subcloud list

      +----+-----------+------------+--------------+---------------+---------+
      | id | name      | management | availability | deploy status | sync    |
      +----+-----------+------------+--------------+---------------+---------+
      |  1 | subcloud1 | unmanaged  | online       | installing    | unknown |
      +----+-----------+------------+--------------+---------------+---------+
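   To refresh this view periodically while waiting for the deploy status to
   reach "complete", assuming the ``watch`` utility is available in the
   authenticated shell, you could run:

   .. code-block:: none

      ~(keystone_admin)]$ watch -n 60 dcmanager subcloud list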
#. In case of a failure, check the Ansible log for the corresponding subcloud
   under the /var/log/dcmanager/ansible directory.
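   For example, to list the most recent Ansible logs on the SystemController
   (an illustrative check; exact log file names vary by subcloud and timestamp):

   .. code-block:: none

      ~(keystone_admin)]$ ls -lrt /var/log/dcmanager/ansible/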
#. When the subcloud deploy status changes to "complete", controller-0 is
   ready to be unlocked. Log in to controller-0 of the subcloud using its
   bootstrap IP and unlock the host using the following command:

   .. code-block:: none

      ~(keystone_admin)]$ system host-unlock controller-0

#. For |AIO|-DX and Standard subclouds, follow the procedure
   :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`
   to restore the rest of the subcloud nodes.

#. To resume the subcloud audit, use the following command:

   .. code-block:: none

      ~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name-or-id>