Update Backup and Restore

Some minor changes added


Signed-off-by: Rafael Jardim <rafaeljordao.jardim@windriver.com>
Change-Id: I6eabc540b7c1ec7f73a9665401f107721808aa64
Rafael Jardim 2021-03-08 15:17:39 -03:00
parent b60579b001
commit 946f7d1f4c
6 changed files with 83 additions and 86 deletions

View File

@ -101,7 +101,7 @@ The backup contains details as listed below:
- item=/opt/extension
- dc-vault filesystem for Distributed Cloud system-controller:
- dc-vault filesystem for |prod-dc| system-controller:
- item=/opt/dc-vault

View File

@ -83,14 +83,14 @@ conditions are in place:
network when powered on. If this is not the case, you must configure each
host manually for network boot immediately after powering it on.
- If you are restoring a Distributed Cloud subcloud first, ensure it is in
- If you are restoring a |prod-dc| subcloud first, ensure it is in
an **unmanaged** state on the Central Cloud \(SystemController\) by using
the following commands:
.. code-block:: none
$ source /etc/platform/openrc
~(keystone_admin)$ dcmanager subcloud unmanage <subcloud-name>
~(keystone_admin)]$ dcmanager subcloud unmanage <subcloud-name>
where <subcloud-name> is the name of the subcloud to be unmanaged.
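To confirm that the subcloud is unmanaged before starting the restore, its
status can be checked; for example:
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud show <subcloud-name>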
@ -117,17 +117,20 @@ conditions are in place:
#. Install network connectivity required for the subcloud.
#. Ensure the backup file is available on the controller. Run the Ansible
Restore playbook. For more information on restoring the back up file, see
:ref:`Run Restore Playbook Locally on the Controller
#. Ensure that the backup files are available on the controller. Run both
Ansible Restore playbooks, restore\_platform.yml and restore\_user\_images.yml.
For more information on restoring the backup files, see :ref:`Run Restore
Playbook Locally on the Controller
<running-restore-playbook-locally-on-the-controller>`, and :ref:`Run
Ansible Restore Playbook Remotely
<system-backup-running-ansible-restore-playbook-remotely>`.
.. note::
The backup file contains the system data and updates.
The backup files contain the system data and updates.
#. Update the controller's software to the previous updating level.
#. If the backup file contains patches, the restore\_platform.yml Ansible
Restore playbook will apply the patches and prompt you to reboot the
system; you will then need to re-run the Ansible Restore playbook.
The current software version on the controller is compared against the
version available in the backup file. If the backed-up version includes
@ -146,13 +149,16 @@ conditions are in place:
LIBCUNIT_CONTROLLER_ONLY Applied 20.06 n/a
STORAGECONFIG Applied 20.06 n/a
Rerun the Ansible Restore Playbook.
Rerun the Ansible Restore playbook if patches were applied and you were
prompted to reboot the system.
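If patches were applied and the system rebooted, the patch state can be
confirmed before continuing; a query along the following lines can be used
\(the patching CLI name may vary by release\):
.. code-block:: none
$ sudo sw-patch query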
#. Unlock Controller-0.
#. Restore the local registry using the file restore\_user\_images.yml.
This must be done before unlocking controller-0.
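An invocation of the user images restore playbook can look like the
following \(the backup file name and password are illustrative; the full
syntax is described in the restore topics referenced above\):
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_user_images.yml -e "initial_backup_dir=/home/sysadmin backup_filename=localhost_docker_local_registry_backup_2020_07_15_21_24_22.tgz ansible_become_pass=St0rlingX*"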
.. code-block:: none
~(keystone_admin)$ system host-unlock controller-0
~(keystone_admin)]$ system host-unlock controller-0
After you unlock controller-0, storage nodes become available and Ceph
becomes operational.
@ -165,37 +171,22 @@ conditions are in place:
$ source /etc/platform/openrc
#. For Simplex systems only, if :command:`wipe_ceph_osds` is set to false,
wait for the apps to transition from 'restore-requested' to the 'applied'
state.
#. Apps transition from the 'restore-requested' state to the 'applying'
state, and then from 'applying' to 'applied'.
If the apps are in 'apply-failed' state, ensure access to the docker
registry, and execute the following command for all custom applications
that need to be restored:
If apps transition from 'applying' back to the 'restore-requested' state,
ensure that there is network access and access to the docker registry.
.. code-block:: none
The process is repeated once per minute until all apps have transitioned to
'applied'.
~(keystone_admin)$ system application-apply <application>
For example, execute the following to restore stx-openstack.
.. code-block:: none
~(keystone_admin)$ system application-apply stx-openstack
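The overall application status can be monitored while the apps are being
re-applied; for example:
.. code-block:: none
~(keystone_admin)]$ system application-list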
.. note::
If you have a Simplex system, this is the last step in the process.
Wait for controller-0 to be in the unlocked, enabled, and available
state.
#. If you have a Duplex system, restore the controller-1 host.
#. If you have a Duplex system, restore the **controller-1** host.
#. List the current state of the hosts.
.. code-block:: none
~(keystone_admin)$ system host-list
~(keystone_admin)]$ system host-list
+----+-------------+------------+---------------+-----------+------------+
| id | hostname | personality| administrative|operational|availability|
+----+-------------+------------+---------------+-----------+------------+
@ -220,7 +211,7 @@ conditions are in place:
.. code-block:: none
~(keystone_admin)$ system host-unlock controller-1
~(keystone_admin)]$ system host-unlock controller-1
+-----------------+--------------------------------------+
| Property | Value |
+-----------------+--------------------------------------+
@ -235,7 +226,7 @@ conditions are in place:
.. code-block:: none
~(keystone_admin)$ system host-list
~(keystone_admin)]$ system host-list
+----+-------------+------------+---------------+-----------+------------+
| id | hostname | personality| administrative|operational|availability|
+----+-------------+------------+---------------+-----------+------------+
@ -247,9 +238,9 @@ conditions are in place:
| 6 | compute-1 | worker | locked |disabled |offline |
+----+-------------+------------+---------------+-----------+------------+
#. Restore storage configuration. If :command:`wipe_ceph_osds` is set to
**True**, follow the same procedure used to restore controller-1,
beginning with host storage-0 and proceeding in sequence.
#. Restore storage configuration. If :command:`wipe\_ceph\_osds` is set to
**True**, follow the same procedure used to restore **controller-1**,
beginning with host **storage-0** and proceeding in sequence.
.. note::
This step should be performed ONLY if you are restoring storage hosts.
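As with **controller-1**, each storage host is unlocked once it has been
reinstalled or recovered; an illustrative unlock of the first storage host:
.. code-block:: none
~(keystone_admin)]$ system host-unlock storage-0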
@ -261,12 +252,12 @@ conditions are in place:
the restore procedure without interruption.
Standard with Controller Storage install or reinstall depends on the
:command:`wipe_ceph_osds` configuration:
:command:`wipe\_ceph\_osds` configuration:
#. If :command:`wipe_ceph_osds` is set to **true**, reinstall the
#. If :command:`wipe\_ceph\_osds` is set to **true**, reinstall the
storage hosts.
#. If :command:`wipe_ceph_osds` is set to **false** \(default
#. If :command:`wipe\_ceph\_osds` is set to **false** \(default
option\), do not reinstall the storage hosts.
.. caution::
@ -280,7 +271,7 @@ conditions are in place:
.. code-block:: none
~(keystone_admin)$ ceph -s
~(keystone_admin)]$ ceph -s
cluster:
id: 3361e4ef-b0b3-4f94-97c6-b384f416768d
health: HEALTH_OK
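Once the cluster reports HEALTH_OK, the recovered OSDs can also be listed
individually if desired; for example:
.. code-block:: none
~(keystone_admin)]$ ceph osd tree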
@ -316,41 +307,29 @@ conditions are in place:
Restore the compute \(worker\) hosts following the same procedure used to
restore controller-1.
#. Unlock the compute hosts. The restore is complete.
#. Allow the Calico and CoreDNS pods to be recovered by Kubernetes. They
should all be in the 'N/N Running' state.
The state of the hosts when the restore operation is complete is as
follows:
.. code-block:: none
~(keystone_admin)$ system host-list
+----+-------------+------------+---------------+-----------+------------+
| id | hostname | personality| administrative|operational|availability|
+----+-------------+------------+---------------+-----------+------------+
| 1 | controller-0| controller | unlocked |enabled |available |
| 2 | controller-1| controller | unlocked |enabled |available |
| 3 | storage-0 | storage | unlocked |enabled |available |
| 4 | storage-1 | storage | unlocked |enabled |available |
| 5 | compute-0 | worker | unlocked |enabled |available |
| 6 | compute-1 | worker | unlocked |enabled |available |
+----+-------------+------------+---------------+-----------+------------+
~(keystone_admin)]$ kubectl get pods -n kube-system | grep -e calico -e coredns
calico-kube-controllers-5cd4695574-d7zwt 1/1 Running
calico-node-6km72 1/1 Running
calico-node-c7xnd 1/1 Running
coredns-6d64d47ff4-99nhq 1/1 Running
coredns-6d64d47ff4-nhh95 1/1 Running
#. For Duplex systems only, if :command:`wipe_ceph_osds` is set to false, wait
for the apps to transition from 'restore-requested' to the 'applied' state.
If the apps are in 'apply-failed' state, ensure access to the docker
registry, and execute the following command for all custom applications
that need to be restored:
#. Run the :command:`system restore-complete` command.
.. code-block:: none
~(keystone_admin)$ system application-apply <application>
~(keystone_admin)]$ system restore-complete
For example, execute the following to restore stx-openstack.
.. code-block:: none
~(keystone_admin)$ system application-apply stx-openstack
#. The 750.006 alarms disappear one at a time as the apps are automatically
applied.
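The remaining alarms can be monitored from the command line while they
clear; for example:
.. code-block:: none
~(keystone_admin)]$ fm alarm-list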
.. rubric:: |postreq|
@ -359,14 +338,14 @@ conditions are in place:
- Passwords for local user accounts must be restored manually since they
are not included as part of the backup and restore procedures.
- After restoring a Distributed Cloud subcloud, you need to bring it back
- After restoring a |prod-dc| subcloud, you need to bring it back
to the **managed** state on the Central Cloud \(SystemController\), by
using the following commands:
.. code-block:: none
$ source /etc/platform/openrc
~(keystone_admin)$ dcmanager subcloud manage <subcloud-name>
~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name>
where <subcloud-name> is the name of the subcloud to be managed.
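To confirm that the subcloud has returned to the **managed** state, the
subcloud list can be checked; for example:
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud list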

View File

@ -9,16 +9,15 @@ Run Ansible Backup Playbook Locally on the Controller
In this method the Ansible Backup playbook is run on the active controller.
Use the following command to run the Ansible Backup playbook and back up the
|prod| configuration, data, and optionally the user container images in
registry.local data:
|prod| configuration, data, and user container images in registry.local data:
.. code-block:: none
~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<sysadmin password>" [ -e "backup_user_local_registry=true" ]
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<sysadmin password>" -e "backup_user_local_registry=true"
The <admin\_password\> and <ansible\_become\_pass\> need to be set correctly
using the ``-e`` option on the command line, or an override file, or in the Ansible
secret file.
The <admin\_password> and <ansible\_become\_pass> need to be set correctly
using the ``-e`` option on the command line, in an override file, or in the
Ansible secret file.
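For example, a local backup that also captures the user images in
registry.local might be invoked as follows \(the password is illustrative\):
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=St0rlingX* admin_password=St0rlingX*" -e "backup_user_local_registry=true"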
The output files will be named:

View File

@ -42,16 +42,35 @@ and target it at controller-0.
|prefix|\_Cluster:
ansible_host: 128.224.141.74
#. Create an Ansible secrets file.
.. code-block:: none
~(keystone_admin)]$ cat <<EOF > secrets.yml
vault_password_change_responses:
yes/no: 'yes'
sysadmin*: 'sysadmin'
(current) UNIX password: 'sysadmin'
New password: 'Li69nux*'
Retype new password: 'Li69nux*'
admin_password: Li69nux*
ansible_become_pass: Li69nux*
ansible_ssh_pass: Li69nux*
EOF
#. Run Ansible Backup playbook:
.. code-block:: none
~(keystone_admin)$ ansible-playbook <path-to-backup-playbook-entry-file> --limit host-name -i <inventory-file> -e <optional-extra-vars>
~(keystone_admin)]$ ansible-playbook <path-to-backup-playbook-entry-file> --limit host-name -i <inventory-file> -e "backup_user_local_registry=true"
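If the secrets file created above is used, it can be passed to the playbook
with Ansible's standard ``@file`` extra-vars syntax; for example:
.. code-block:: none
~(keystone_admin)]$ ansible-playbook <path-to-backup-playbook-entry-file> --limit host-name -i <inventory-file> -e @secrets.yml -e "backup_user_local_registry=true"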
The generated backup tar file can be found in <host\_backup\_dir>, that
is, /home/sysadmin, by default. You can overwrite it using the ``-e``
is, /home/sysadmin, by default. You can override it using the **-e**
option on the command line or in an override file.
.. warning::
If a backup of the **local registry images** file is created, the
file is not copied from the remote machine to the local machine.
If a backup of the **local registry images** file is created, the file
is not copied from the remote machine to the local machine. The
inventory\_hostname\_docker\_local\_registry\_backup\_timestamp.tgz
file needs to be copied off the host machine to be used if a restore is
needed.
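One way to retrieve the file is a plain :command:`scp` from the machine
where the backup will be stored \(address, path, and file name are
placeholders\):
.. code-block:: none
$ scp sysadmin@<controller-oam-ip>:<host_backup_dir>/<inventory_hostname>_docker_local_registry_backup_<timestamp>.tgz <local-backup-dir>/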

View File

@ -16,7 +16,7 @@ following command to run the Ansible Restore playbook:
.. code-block:: none
~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"
The |prod| restore supports two optional modes, keeping the Ceph cluster data
intact or wiping the Ceph cluster.
@ -43,7 +43,7 @@ intact or wiping the Ceph cluster.
.. code-block:: none
~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=St0rlingX* admin_password=St0rlingX* backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz wipe_ceph_osds=true"
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=St0rlingX* admin_password=St0rlingX* backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz wipe_ceph_osds=true"
.. note::
If the backup contains patches, Ansible Restore playbook will apply
@ -63,4 +63,4 @@ For example:
.. code-block:: none
~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_user_images.yml -e "initial_backup_dir=/home/sysadmin backup_filename=localhost_docker_local_registry_backup_2020_07_15_21_24_22.tgz ansible_become_pass=St0rlingX*"
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_user_images.yml -e "initial_backup_dir=/home/sysadmin backup_filename=localhost_docker_local_registry_backup_2020_07_15_21_24_22.tgz ansible_become_pass=St0rlingX*"

View File

@ -47,7 +47,7 @@ In this method you can run Ansible Restore playbook and point to controller-0.
.. code-block:: none
~(keystone_admin)$ ansible-playbook path-to-restore-platform-playbook-entry-file --limit host-name -i inventory-file -e optional-extra-vars
~(keystone_admin)]$ ansible-playbook path-to-restore-platform-playbook-entry-file --limit host-name -i inventory-file -e optional-extra-vars
where optional-extra-vars can be:
@ -89,7 +89,7 @@ In this method you can run Ansible Restore playbook and point to controller-0.
.. parsed-literal::
~(keystone_admin)$ ansible-playbook /localdisk/designer/jenkins/tis-stx-dev/cgcs-root/stx/ansible-playbooks/playbookconfig/src/playbooks/restore_platform.yml --limit |prefix|\_Cluster -i $HOME/br_test/hosts -e "ansible_become_pass=St0rlingX* admin_password=St0rlingX* ansible_ssh_pass=St0rlingX* initial_backup_dir=$HOME/br_test backup_filename= |prefix|\_Cluster_system_backup_2019_08_08_15_25_36.tgz ansible_remote_tmp=/home/sysadmin/ansible-restore"
~(keystone_admin)]$ ansible-playbook /localdisk/designer/jenkins/tis-stx-dev/cgcs-root/stx/ansible-playbooks/playbookconfig/src/playbooks/restore_platform.yml --limit |prefix|\_Cluster -i $HOME/br_test/hosts -e "ansible_become_pass=St0rlingX* admin_password=St0rlingX* ansible_ssh_pass=St0rlingX* initial_backup_dir=$HOME/br_test backup_filename= |prefix|\_Cluster_system_backup_2019_08_08_15_25_36.tgz ansible_remote_tmp=/home/sysadmin/ansible-restore"
.. note::
If the backup contains patches, Ansible Restore playbook will apply
@ -105,7 +105,7 @@ In this method you can run Ansible Restore playbook and point to controller-0.
.. code-block:: none
~(keystone_admin)$ ansible-playbook path-to-restore-user-images-playbook-entry-file --limit host-name -i inventory-file -e optional-extra-vars
~(keystone_admin)]$ ansible-playbook path-to-restore-user-images-playbook-entry-file --limit host-name -i inventory-file -e optional-extra-vars
where optional-extra-vars can be:
@ -144,4 +144,4 @@ In this method you can run Ansible Restore playbook and point to controller-0.
.. parsed-literal::
~(keystone_admin)$ ansible-playbook /localdisk/designer/jenkins/tis-stx-dev/cgcs-root/stx/ansible-playbooks/playbookconfig/src/playbooks/restore_user_images.yml --limit |prefix|\_Cluster -i $HOME/br_test/hosts -e "ansible_become_pass=St0rlingX* ansible_ssh_pass=St0rlingX* initial_backup_dir=$HOME/br_test backup_filename= |prefix|\_Cluster_docker_local_registry_backup_2020_07_15_21_24_22.tgz ansible_remote_tmp=/sufficient/space backup_dir=/sufficient/space"
~(keystone_admin)]$ ansible-playbook /localdisk/designer/jenkins/tis-stx-dev/cgcs-root/stx/ansible-playbooks/playbookconfig/src/playbooks/restore_user_images.yml --limit |prefix|\_Cluster -i $HOME/br_test/hosts -e "ansible_become_pass=St0rlingX* ansible_ssh_pass=St0rlingX* initial_backup_dir=$HOME/br_test backup_filename= |prefix|\_Cluster_docker_local_registry_backup_2020_07_15_21_24_22.tgz ansible_remote_tmp=/sufficient/space backup_dir=/sufficient/space"