
Install a Subcloud in Phases
Subclouds can be deployed using individual phases.
When a subcloud is deployed with the single dcmanager subcloud add operation,
which runs all of the deployment phases in sequence, a failure in any phase
means that you must wait for the entire operation to time out.
Thus, instead of using a single operation, a subcloud can be deployed by executing each phase individually. This gives a user more control of the deployment process, allowing the execution of individual phases one at a time, with the possibility of aborting and resuming the deployment.
After physically installing the hardware and network connectivity of a subcloud, the subcloud deployment process executes the following phases in the central cloud:
The dcmanager subcloud deploy create command. Sets up the subcloud configuration in the central cloud.
The dcmanager subcloud deploy install command. Uses Redfish Virtual Media Service to remotely install the ISO on controller-0 in the subcloud.
The dcmanager subcloud deploy bootstrap command. Uses Ansible to bootstrap on controller-0 in the subcloud.
The dcmanager subcloud deploy config command. Uses Ansible to run deploy config on controller-0 in the subcloud.
The dcmanager subcloud deploy complete command. If the subcloud is manually configured post bootstrap, it concludes the subcloud deployment.
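The ordering of the re-runnable phases listed above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the generated command strings omit the per-phase options (such as --sysadmin-password) described later in this procedure.

```python
# Phases that run against an existing subcloud, in deployment order.
# The phase names come from the list above; the helper is hypothetical.
RERUNNABLE_PHASES = ("install", "bootstrap", "config")


def remaining_phase_commands(subcloud_name, failed_phase):
    """Return the dcmanager commands from the failed phase onward."""
    if failed_phase not in RERUNNABLE_PHASES:
        raise ValueError(f"unknown phase: {failed_phase}")
    start = RERUNNABLE_PHASES.index(failed_phase)
    return [
        f"dcmanager subcloud deploy {phase} {subcloud_name}"
        for phase in RERUNNABLE_PHASES[start:]
    ]


if __name__ == "__main__":
    for command in remaining_phase_commands("subcloud1", "bootstrap"):
        print(command)
```

Running one phase at a time in this order is what makes it possible to stop after a failure and retry only the phase that failed.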
Note
Remove all the removable USB storage devices from the subcloud servers before installing a Redfish remote subcloud.
The software upload CLI command, executed in the system controller region, uploads and stores the ISO to be used to install subclouds in a single location, i.e. internally on the system controller in /opt/dc-vault/software. You can upload a separate ISO for each major release.
Note
This is required only once and does not have to be done for every subcloud install.
The dcmanager command recognizes boot image files ending in .iso and signature files ending in .sig. For example:
~(keystone_admin)]$ software --os-region-name SystemController upload |installer-image-name|.iso |installer-image-name|.sig
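A pre-upload sanity check on the two file names might look like the following sketch. The function name is hypothetical. The .iso and .sig suffix rules come from the text above, while the requirement that both files share a base name is an assumption drawn only from the example command.

```python
from pathlib import Path


def check_upload_pair(iso_path, sig_path):
    """Return a list of problems with an ISO/signature pair (empty if none)."""
    problems = []
    iso, sig = Path(iso_path), Path(sig_path)
    if iso.suffix != ".iso":
        problems.append(f"{iso.name}: boot image must end in .iso")
    if sig.suffix != ".sig":
        problems.append(f"{sig.name}: signature must end in .sig")
    # Assumption: the example uses the same base name for both files.
    if iso.stem != sig.stem:
        problems.append("boot image and signature base names differ")
    return problems


if __name__ == "__main__":
    print(check_upload_pair("bootimage.iso", "bootimage.sig"))
```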
The uploaded subcloud ISO must be at the same patch level as the system controller. This ensures that the subcloud boot image aligns with the patch level of the load to be installed on the subcloud.
Warning
If the patch level of the uploaded subcloud ISO does not match the system controller patch level, the resulting subcloud patch state may not align with the system controller patch state.
Run the following command on controller-0 to upload the subcloud ISO for the new release. You can specify either the full file path or relative paths to the *.iso bootimage file and to the *.sig bootimage signature file.
$ source /etc/platform/openrc
~(keystone_admin)]$ software --os-region-name SystemController upload <bootimage>.iso <bootimage>.sig
This operation will take a while. Please wait.
.....
Uploaded release nn.nn.n is the same as current release on the controller
+----------------------------+--------------+
| Uploaded File              | Release      |
+----------------------------+--------------+
| bootimage.iso              | nn.nn.n      |
+----------------------------+--------------+
This command must be run on controller-0.
Note
This may take a few minutes to complete.
In order to deploy subclouds from either controller, all local files
that are referenced in the subcloud-bootstrap-values.yaml
file must exist on both controllers (for example,
/home/sysadmin/docker-registry-ca-cert.pem
).
At the subcloud location, physically install the servers and network connectivity required for the subcloud.
Note
Do not power off the servers. The host portion of the server can be powered off, but the BMC portion of the server must be powered and accessible from the system controller.
There is no need to wipe the disks.
Note
The servers require connectivity to a gateway router that provides IP routing between the subcloud management or admin subnet and the system controller management subnet, and between the subcloud subnet and the system controller subnet.
Create the subcloud-install-values.yaml file and pass it to the dcmanager subcloud deploy create command using the --install-values command option.
Note
If the subcloud will be manually installed, skip this step and go to deploy bootstrap.
Note
If your controller is on a ZTSystems Triton server that requires a longer timeout value, you can now use the rd.net.timeout.ipv6dad dracut parameter to specify an increased timeout value for dracut to wait for the interface to have carrier, and complete IPv6 duplicate address detection. For the ZTSystems server, this can take more than four minutes. It is recommended that you set this value to 300 seconds, by specifying the following in the subcloud-install-values.yaml file:
rd.net.timeout.ipv6dad: 300
Note
The wait_for_timeout value must be chosen based on your network performance (bandwidth, latency, and quality) and should be increased if the network does not meet the minimum or timeout requirements. The default value of 3600 seconds is based on a network bandwidth of 100 Mbps with a 50 ms delay.
For example, --install-values /home/sysadmin/subcloud-install-values.yaml.
bootstrap_interface: <bootstrap_interface_name> # e.g. eno1
bootstrap_address: <bootstrap_interface_ip_address> # e.g. 128.224.151.183
bootstrap_address_prefix: <bootstrap_netmask> # e.g. 23

# Board Management Controller
bmc_address: <BMCs_IPv4_or_IPv6_address> # e.g. 128.224.64.180
bmc_username: <bmc_username> # e.g. root

# If the subcloud's bootstrap IP interface and the system controller are not on the
# same network, then the customer must configure a default route or static route
# so that the Central Cloud can log in and bootstrap the newly installed subcloud.

# If nexthop_gateway is specified and the network_address is not specified then a
# default route will be configured. Otherwise, if a network_address is specified then
# a static route will be configured.

nexthop_gateway: <default_route_address> # e.g. 128.224.150.1 (required)
network_address: <static_route_address> # e.g. 128.224.144.0
network_mask: <static_route_mask> # e.g. 255.255.254.0

# Installation type codes
#0 - Standard Controller, Serial Console
#1 - Standard Controller, Graphical Console
#2 - AIO, Serial Console
#3 - AIO, Graphical Console
install_type: 3

# Optional parameter defaults can be modified by uncommenting the option with a modified value.

# This option can be set to extend the installing stage timeout value
# wait_for_timeout: 3600

# Set this option for https
no_check_certificate: True

# If the bootstrap interface is a vlan interface then configure the vlan ID.
# bootstrap_vlan: <vlan_id>

# Override default filesystem device.
# rootfs_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"
# boot_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"

# Set the value for persistent file system (/opt/platform-backup).
# The value must be a whole number (in MB) that is greater than or equal
# to 30000.
persistent_size: 30000

# Configure custom arguments applied at boot within the installed subcloud.
# Multiple boot arguments can be provided by separating each argument by a
# single comma. Spaces are not allowed.
# Example: extra_boot_params: out-of-tree-drivers=none
# extra_boot_params:
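Some of the constraints stated in the comments of this file can be checked before the file is handed to dcmanager. The sketch below is a hypothetical helper, not part of dcmanager: it assumes the install values have already been loaded into a dictionary, treats install_type as mandatory (an assumption), and only covers the rules spelled out in the sample file.

```python
def validate_install_values(values):
    """Return a list of violations of the rules stated in the sample file."""
    errors = []
    # Installation type codes 0-3 are the only ones listed in the sample.
    if values.get("install_type") not in (0, 1, 2, 3):
        errors.append("install_type must be one of 0, 1, 2, 3")
    # persistent_size is a whole number of MB, at least 30000 (the default).
    size = values.get("persistent_size", 30000)
    if isinstance(size, bool) or not isinstance(size, int) or size < 30000:
        errors.append("persistent_size must be a whole number of MB >= 30000")
    # Boot arguments are comma separated and must not contain spaces.
    extra = values.get("extra_boot_params")
    if extra is not None and " " in str(extra):
        errors.append("extra_boot_params must not contain spaces")
    return errors


def route_kind(values):
    """Describe which route the sample file's comments say will be configured."""
    if values.get("network_address"):
        return "static"
    if values.get("nexthop_gateway"):
        return "default"
    return None


if __name__ == "__main__":
    sample = {"install_type": 3, "persistent_size": 30000,
              "nexthop_gateway": "128.224.150.1"}
    print(validate_install_values(sample), route_kind(sample))
```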
Note
By default, 30GB is allocated for /opt/platform-backup. If additional persistent disk space is required, the partition can be increased in the next subcloud redeploy using the following commands:
To increase /opt/platform-backup to 40GB, add the persistent_size: 40000 parameter to the subcloud-install-values.yaml file.
Use the dcmanager subcloud update command to save the configuration change for the next subcloud redeployment.
~(keystone_admin)]$ dcmanager subcloud update --install-values <subcloud-install-values.yaml> <subcloud-name>
For a new subcloud deployment, use the dcmanager subcloud deploy create command with the subcloud-install-values.yaml file containing the desired persistent_size value.
At the system controller, create a /home/sysadmin/subcloud-bootstrap-values.yaml overrides file for the subcloud.
For example:
system_mode: simplex
name: "subcloud"

description: "test"
location: "loc"

management_subnet: 192.168.101.0/24
management_start_address: 192.168.101.2
management_end_address: 192.168.101.50
management_gateway_address: 192.168.101.1

external_oam_subnet: 10.10.10.0/24
external_oam_gateway_address: 10.10.10.1
external_oam_floating_address: 10.10.10.12

systemcontroller_gateway_address: 192.168.204.101

docker_registries:
  k8s.gcr.io:
    url: registry.central:9001/k8s.gcr.io
  gcr.io:
    url: registry.central:9001/gcr.io
  ghcr.io:
    url: registry.central:9001/ghcr.io
  quay.io:
    url: registry.central:9001/quay.io
  docker.io:
    url: registry.central:9001/docker.io
  docker.elastic.co:
    url: registry.central:9001/docker.elastic.co
  registry.k8s.io:
    url: registry.central:9001/registry.k8s.io
  icr.io:
    url: registry.central:9001/icr.io
  defaults:
    username: sysinv
    password: <sysinv_password>
    type: docker
Where <sysinv_password> can be found by running the following command as 'sysadmin' on the central cloud:
$ keyring get sysinv services
In the above example, if the admin network is used for communication between the subcloud and system controller, then the management_gateway_address parameter should be replaced with admin subnet information.
For example:
management_subnet: 192.168.101.0/24
management_start_address: 192.168.101.2
management_end_address: 192.168.101.50
admin_subnet: 192.168.102.0/24
admin_start_address: 192.168.102.2
admin_end_address: 192.168.102.50
admin_gateway_address: 192.168.102.1
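As a quick consistency check on address blocks like the ones above, the standard ipaddress module can confirm that each start, end, and gateway address falls inside its subnet. This is a sketch under assumptions: the function name is hypothetical, and the rule that these addresses must lie inside the subnet is inferred from the example values rather than stated in this procedure.

```python
import ipaddress


def check_subnet_block(values, prefix):
    """Check <prefix>_start/end/gateway addresses against <prefix>_subnet."""
    subnet = ipaddress.ip_network(values[f"{prefix}_subnet"])
    problems = []
    for suffix in ("start_address", "end_address", "gateway_address"):
        key = f"{prefix}_{suffix}"
        if key in values and ipaddress.ip_address(values[key]) not in subnet:
            problems.append(f"{key} {values[key]} is outside {subnet}")
    start = values.get(f"{prefix}_start_address")
    end = values.get(f"{prefix}_end_address")
    if start and end and ipaddress.ip_address(start) > ipaddress.ip_address(end):
        problems.append(f"{prefix} start address is after end address")
    return problems


if __name__ == "__main__":
    admin = {
        "admin_subnet": "192.168.102.0/24",
        "admin_start_address": "192.168.102.2",
        "admin_end_address": "192.168.102.50",
        "admin_gateway_address": "192.168.102.1",
    }
    print(check_subnet_block(admin, "admin"))
```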
This configuration will install container images from the local registry on your central cloud. The HTTPS certificate of the central cloud's local registry must include the central cloud's IP, registry.local and registry.central in its SAN list. For example, a valid certificate contains a list:
"DNS.1: registry.local DNS.2: registry.central IP.1: floating_management IP.2: floating_OAM"
If required, run the following command on the central cloud prior to bootstrapping the subcloud to install the new certificate for the central cloud with the updated list:
~(keystone_admin)]$ system certificate-install -m docker_registry path_to_cert
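Whether a certificate needs to be replaced can be decided by comparing its name list with the entries required above. The following is a minimal sketch: the function name is hypothetical, the name list is assumed to have been extracted from the certificate by other means, and the IP addresses in the usage example are placeholders.

```python
# DNS names that the text above requires in the registry certificate.
REQUIRED_DNS = {"registry.local", "registry.central"}


def missing_san_entries(san_entries, central_cloud_ips):
    """Return the required names and IPs that the certificate list lacks."""
    required = REQUIRED_DNS | set(central_cloud_ips)
    return sorted(required - set(san_entries))


if __name__ == "__main__":
    # Placeholder addresses standing in for floating_management / floating_OAM.
    current = ["registry.local", "10.10.10.2"]
    print(missing_san_entries(current, ["10.10.10.2", "192.168.204.2"]))
```

An empty result suggests the existing certificate already covers the required entries.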
If you prefer to install container images from the default external registries, make the following substitutions for the docker_registries sections of the file.
docker_registries:
  defaults:
    username: <your_default_registry_username>
    password: <your_default_registry_password>
Note
To modify the kernel using the Ansible bootstrap, see modify-the-kernel-in-the-cli-39f25220ec1b.
Create the subcloud using dcmanager.
When calling the subcloud deploy create command, specify the install values, bootstrap values, deploy config values, and the subcloud's sysadmin password.
~(keystone_admin)]$ dcmanager subcloud deploy create \
--bootstrap-address <oam_ip_address_of_subclouds_controller-0> \
--bootstrap-values /home/sysadmin/subcloud1-bootstrap-values.yaml \
--deploy-config /home/sysadmin/subcloud1-deploy-config.yaml \
--install-values /home/sysadmin/install-values.yaml \
--bmc-password <bmc_password> --release <software-release>
If --sysadmin-password is not specified, you are prompted to enter it once the full command is invoked. The password is masked when it is entered.
Enter the sysadmin password for the subcloud:
(Optional) The --deploy-config option must reference the deployment configuration file mentioned above. In the deployment configurations, static routes from the management or admin interface of a subcloud to the system controller's management subnet must be explicitly listed. This ensures that the subcloud comes online after deployment. If the admin network is used for communication between the system controller and subcloud, the deployment configuration file must include both an admin network type and a management network type interface.
(Optional) The --bmc-password <password> option is used for subcloud installation and is required only if the --install-values option is specified.
If the --bmc-password <password> option is omitted and the --install-values option is specified, the system administrator will be prompted to enter it, following the dcmanager subcloud deploy create command. This option is ignored if the --install-values option is not specified. The password is masked when it is entered.
Enter the bmc password for the subcloud:
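The way these options depend on each other can be sketched as an argument builder. The function is hypothetical and only assembles the option list; the flags are the ones shown in the command above. Leaving a password out of the list means dcmanager prompts for it, as described.

```python
def deploy_create_args(bootstrap_address, bootstrap_values, deploy_config=None,
                       install_values=None, bmc_password=None, release=None):
    """Assemble a 'dcmanager subcloud deploy create' argument list."""
    args = ["dcmanager", "subcloud", "deploy", "create",
            "--bootstrap-address", bootstrap_address,
            "--bootstrap-values", bootstrap_values]
    if deploy_config:
        args += ["--deploy-config", deploy_config]
    if install_values:
        args += ["--install-values", install_values]
        # --bmc-password is only meaningful together with --install-values;
        # when it is left out, dcmanager prompts for the password instead.
        if bmc_password:
            args += ["--bmc-password", bmc_password]
    if release:
        args += ["--release", release]
    return args


if __name__ == "__main__":
    print(" ".join(deploy_create_args(
        "10.10.10.12", "/home/sysadmin/subcloud1-bootstrap-values.yaml",
        install_values="/home/sysadmin/install-values.yaml")))
```

Omitting the passwords from the argument list and answering the masked prompts keeps them out of the shell history.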
The dcmanager subcloud show or dcmanager subcloud list command can be used to view subcloud deploy create status.
The deploy status field has the following values for this phase:
Install the subcloud using dcmanager.
To install the subcloud using Redfish Virtual Media Service, use the subcloud deploy install command. The --install-values and --release parameters are optional if they were provided previously; if given, they replace the previously provided values.
~(keystone_admin)]$ dcmanager subcloud deploy install <subcloud-name> \
--install-values /home/sysadmin/install-values.yaml \
--sysadmin-password <sysadmin_password> \
--bmc-password <bmc_password> \
--release <software-release>
If --sysadmin-password is not specified, you are prompted to enter it once the full command is invoked. The password is masked when it is entered.
Enter the sysadmin password for the subcloud:
(Optional) The --bmc-password <password> option is used for subcloud installation and is required only if the --install-values option is specified.
If the --bmc-password <password> option is omitted and the --install-values option is specified, the system administrator will be prompted to enter it, following the dcmanager subcloud deploy install command. This option is ignored if the --install-values option is not specified. The password is masked when it is entered.
Enter the bmc password for the subcloud:
The dcmanager subcloud show or dcmanager subcloud list command can be used to view subcloud deploy install progress.
The deploy status field has the following values for this phase:
Bootstrap the subcloud using dcmanager.
To bootstrap the subcloud, use the subcloud deploy bootstrap command. Both --bootstrap-address and --bootstrap-values parameters are optional, and will replace the previous values if provided.
~(keystone_admin)]$ dcmanager subcloud deploy bootstrap <subcloud-name> \
--bootstrap-address <oam_ip_address_of_subclouds_controller-0> \
--bootstrap-values /home/sysadmin/subcloud1-bootstrap-values.yaml \
--sysadmin-password <sysadmin_password>
If --sysadmin-password is not specified, you are prompted to enter it once the full command is invoked. The password is masked when it is entered.
Enter the sysadmin password for the subcloud:
The dcmanager subcloud show or dcmanager subcloud list command can be used to view subcloud deploy bootstrap progress.
The deploy status field has the following values for this phase:
Configure the subcloud using dcmanager.
To configure the subcloud, use the subcloud deploy config command. The --deploy-config parameter is optional if it was provided previously; if given, it replaces the previously provided value.
~(keystone_admin)]$ dcmanager subcloud deploy config <subcloud-name> \
--deploy-config /home/sysadmin/subcloud1-deploy-config.yaml \
--sysadmin-password <sysadmin_password>
If --sysadmin-password is not specified, you are prompted to enter it once the full command is invoked. The password is masked when it is entered.
Enter the sysadmin password for the subcloud:
(Optional) The --deploy-config option must reference the deployment configuration file mentioned above. In the deployment configurations, static routes from the management or admin interface of a subcloud to the system controller's management subnet must be explicitly listed. This ensures that the subcloud comes online after deployment. If the admin network is used for communication between the system controller and subcloud, the deployment configuration file must include both an admin network type and a management network type interface.
The dcmanager subcloud show or dcmanager subcloud list command can be used to view the subcloud deploy configuration status.
The deploy status field has the following values for this phase:
Complete the subcloud deployment using dcmanager.
When manually configuring the subcloud, the deployment must be concluded by running the subcloud deploy complete command.
~(keystone_admin)]$ dcmanager subcloud deploy complete <subcloud-name>
The deploy status field will transition to Complete.
At the Central Cloud / System Controller, monitor the progress of the subcloud install, bootstrapping, and deployment by using the deploy status field of the dcmanager subcloud list command.
Caution
If there is a failure during installation or bootstrapping, you can resume the deployment using the dcmanager subcloud deploy resume command, or you can run individual phases with dcmanager subcloud deploy install, dcmanager subcloud deploy bootstrap or dcmanager subcloud deploy config.
If deploy_status shows an installation, bootstrap, or deployment failure state, you can use the dcmanager subcloud errors command in order to get more detailed information about the failure.
For example:
[sysadmin@controller-0 ~(keystone_admin)]$ dcmanager subcloud errors 1
FAILED bootstrapping playbook of (subcloud).
 detail: fatal: [subcloud]: FAILED! => changed=true
  failed_when_result: true
  msg: non-zero return code
    500 Server Error: Internal Server Error ("manifest unknown: manifest unknown")
    Image download failed: admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06
    500 Server Error: Internal Server Error ("Get https://admin-2.cumulus.mss.com: 30093/v2/: dial tcp: lookup admin-2.cumulus.mss.com on 10.41.0.1:53: read udp 10.41.1.3:40251->10.41.0.1:53: i/o timeout")
    Image download failed: gcd.io/kubebuilder/kube-rdac-proxy:v0.11.0
    500 Server Error: Internal Server Error ("Get https://gcd.io/v2/: dial tcp: lookup gcd.io on 10.41.0.1:53: read udp 10.41.1.3:52485->10.41.0.1:53: i/o timeout")
    raise Exception("Failed to download images %s" % failed_downloads)
    Exception: Failed to download images ["admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06", "gcd.io kubebuilder/kube-rdac-proxy:v0.11.0"]
  FAILED TASK: TASK [common/push-docker-images Download images and push to local registry] Wednesday 12 October 2022 12:27:31 +0000 (0:00:00.042) 0:16:34.495
You can also monitor detailed logging of the subcloud installation, bootstrapping, and deployment by monitoring the log file /var/log/dcmanager/ansible/<subcloud_name>_playbook_output.log on the active controller in the central cloud.
For example:
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud_playbook_output.log
k8s.gcr.io: {password: secret, url: null}
quay.io: {password: secret, url: null}
)

TASK [bootstrap/bringup-essential-services : Mark the bootstrap as completed] ***
changed: [subcloud]

PLAY RECAP *********************************************************************
subcloud : ok=230 changed=137 unreachable=0 failed=0
Note
The subcloud_playbook_output.log file can rotate; the previous log file is then named subcloud_playbook_output.log.1.
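When scanning these logs with a script, the current and rotated file names and the final recap line are the pieces of interest. The sketch below uses hypothetical helper names; the path pattern and the .1 suffix come from the text above, and the regular expression is an assumption based only on the recap line shown in the sample output.

```python
import re

# Matches a recap line such as "subcloud : ok=230 changed=137 ... failed=0".
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def playbook_log_paths(subcloud_name):
    """Return the current log path followed by the rotated one."""
    base = f"/var/log/dcmanager/ansible/{subcloud_name}_playbook_output.log"
    return [base, base + ".1"]


def parse_recap_line(line):
    """Return (host, counters) for a recap line, or None if it does not match."""
    match = RECAP_RE.match(line.strip())
    if match is None:
        return None
    pairs = (item.split("=") for item in match.group("counters").split())
    return match.group("host"), {name: int(count) for name, count in pairs}


if __name__ == "__main__":
    print(playbook_log_paths("subcloud1"))
    print(parse_recap_line("subcloud : ok=230 changed=137 unreachable=0 failed=0"))
```

A non-zero failed or unreachable counter in the parsed result points to a run worth inspecting with dcmanager subcloud errors.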
If the install, bootstrap, or config phase fails, it can be re-executed using the same command that is used to trigger the deploy phase.
If more debugging is required, set rvmc_debug_level in the install-values.yaml file. For more information, see installing-a-subcloud-using-redfish-platform-management-service.
Abort and Resume the Subcloud Deployment
The subcloud deployment can be aborted and resumed for the install, bootstrap and config phases.
To abort the deployment, use the subcloud deploy abort
command.
~(keystone_admin)]$ dcmanager subcloud deploy abort <subcloud-name>
View the subcloud deploy abort status by running the dcmanager subcloud show
or dcmanager subcloud list
command.
The deploy status field has the following values for this phase:
Note
If the dcmanager subcloud deploy abort <subcloud-name>
command is called during the installation phase, the subcloud will be
shut down via Redfish Virtual Media Service.
To resume the deployment, use the subcloud deploy resume
command.
The parameter values will be reused from the previous phases if new ones are not provided in the request.
~(keystone_admin)]$ dcmanager subcloud deploy resume <subcloud-name> \
--bootstrap-address <oam_ip_address_of_subclouds_controller-0> \
--bootstrap-values /home/sysadmin/subcloud1-bootstrap-values.yaml \
--sysadmin-password <sysadmin_password> \
--deploy-config /home/sysadmin/subcloud1-deploy-config.yaml \
--install-values /home/sysadmin/install-values.yaml \
--bmc-password <bmc_password> \
--release <software-release>
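The reuse rule for resume parameters can be pictured as an overlay of newly supplied options on the values kept from earlier phases. This is a conceptual model only, with hypothetical names; it does not describe how dcmanager stores or merges its data internally.

```python
def effective_parameters(stored, overrides):
    """Overlay newly supplied options on stored ones; None means 'not given'."""
    merged = dict(stored)
    merged.update({key: value for key, value in overrides.items()
                   if value is not None})
    return merged


if __name__ == "__main__":
    stored = {"install_values": "/home/sysadmin/install-values.yaml",
              "release": "nn.nn"}
    # Only the install values are supplied again on resume.
    print(effective_parameters(stored, {"install_values": "/tmp/new.yaml",
                                        "release": None}))
```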
For detailed procedures on manually configuring the subcloud for the desired deployment configuration, see the post-bootstrap steps of .
Check and update docker registry credentials on the subcloud:
REGISTRY="docker-registry"
SECRET_UUID=`system service-parameter-list | fgrep $REGISTRY | fgrep auth-secret | awk '{print $10}'`
SECRET_REF=`openstack secret list | fgrep ${SECRET_UUID} | awk '{print $2}'`
openstack secret get ${SECRET_REF} --payload -f value
The secret payload should be username: sysinv password:<password>. If the secret payload is username: admin password:<password>, see updating-docker-registry-credentials-on-a-subcloud for more information.
For more information on bootstrapping and deploying, see the procedures listed under install-a-subcloud.
Add a static route for nodes in subclouds to access the openldap service.
The openldap service runs on the central cloud. For the nodes in the subclouds to access the openldap service, for example to ssh to the nodes as openldap users, a static route to the System Controller must be added on these nodes. This applies to controller nodes, worker nodes, and storage nodes (nodes that have sssd running).
The static route can be added on each of the nodes in the subcloud using the system CLI.
The following examples show how to add the static route on a controller node and a worker node:
system host-route-add controller-0 mgmt0 <Central Cloud mgmt subnet> 64 <Gateway IP address>
system host-route-add compute-0 mgmt0 <Central Cloud mgmt subnet> 64 <Gateway IP address>
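When a subcloud has many nodes, the same command has to be repeated per host. The sketch below generates the command lines in the argument order shown above. The helper name is hypothetical, and the addresses in the usage example are placeholders rather than values from a real deployment.

```python
import ipaddress


def host_route_commands(hosts, subnet, prefix, gateway, interface="mgmt0"):
    """Build one 'system host-route-add' command line per host."""
    # Raises ValueError early if the subnet/prefix or gateway is malformed.
    ipaddress.ip_network(f"{subnet}/{prefix}")
    ipaddress.ip_address(gateway)
    return [
        f"system host-route-add {host} {interface} {subnet} {prefix} {gateway}"
        for host in hosts
    ]


if __name__ == "__main__":
    # Placeholder IPv6 values standing in for the central cloud management
    # subnet and the gateway address.
    for line in host_route_commands(["controller-0", "compute-0"],
                                    "fd01:1::", 64, "fd01:2::1"):
        print(line)
```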
The static route can also be added using deploy config by adding the route in its configuration file.
The following examples show how to add the route configuration in the controller and worker host profiles of the deploy config's configuration file:
Controller node:
---
apiVersion: starlingx.windriver.com/v1
kind: HostProfile
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: controller-0-profile
  namespace: deployment
spec:
  administrativeState: unlocked
  bootDevice: /dev/disk/by-path/pci-0000:c3:00.0-nvme-1
  console: ttyS0,115200n8
  installOutput: text
  ......
  routes:
  - gateway: <Gateway IP address>
    interface: mgmt0
    metric: 1
    prefix: 64
    subnet: <Central Cloud mgmt subnet>

Worker node:
---
apiVersion: starlingx.windriver.com/v1
kind: HostProfile
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: compute-0-profile
  namespace: deployment
spec:
  administrativeState: unlocked
  boardManagement:
    credentials:
      password:
        secret: bmc-secret
    type: dynamic
  bootDevice: /dev/disk/by-path/pci-0000:00:1f.2-ata-1.0
  clockSynchronization: ntp
  console: ttyS0,115200n8
  installOutput: text
  ......
  routes:
  - gateway: <Gateway IP address>
    interface: mgmt0
    metric: 1
    prefix: 64
    subnet: <Central Cloud mgmt subnet>