
This change addresses a long-standing issue in rST documentation imported from XML. That import process added backslash escapes in front of various characters. The three most common being '(', ')', and '_'. These instances are removed. Signed-off-by: Ron Stone <ronald.stone@windriver.com> Change-Id: Id43a9337ffcd505ccbdf072d7b29afdb5d2c997e
1117 lines
39 KiB
ReStructuredText
1117 lines
39 KiB
ReStructuredText
.. _r7-0-release-notes-85446867da2a:
|
|
|
|
.. All please review and comment
|
|
|
|
==================
|
|
R7.0 Release Notes
|
|
==================
|
|
|
|
.. contents::
|
|
:local:
|
|
:depth: 1
|
|
|
|
---------
|
|
ISO image
|
|
---------
|
|
|
|
The pre-built ISO (CentOS and Debian) and Docker images for StarlingX release
|
|
7.0 are located at the ``CENGN StarlingX mirror`` repos:
|
|
|
|
- http://mirror.starlingx.cengn.ca/mirror/starlingx/release/7.0.0/centos/flock/outputs/
|
|
|
|
- http://mirror.starlingx.cengn.ca/mirror/starlingx/release/7.0.0/debian/monolithic/outputs/
|
|
|
|
.. note::
|
|
Debian is a Technology Preview Release and only supports |AIO-SX| in StarlingX
|
|
Release 7.0 and uses the same docker images as CentOS.
|
|
|
|
------
|
|
Branch
|
|
------
|
|
|
|
The source code for StarlingX release 7.0 is available in the r/stx.7.0
|
|
branch in the `StarlingX repositories <https://opendev.org/starlingx>`_.
|
|
|
|
----------
|
|
Deployment
|
|
----------
|
|
|
|
To deploy StarlingX release 7.0. Refer to :ref:`Consuming StarlingX <index-intro-27197f27ad41>`.
|
|
|
|
For detailed installation instructions, see `R7.0 Installation Guides <https://docs.starlingx.io/deploy_install_guides/index-install-e083ca818006.html>`_.
|
|
|
|
-----------------------------
|
|
New features and enhancements
|
|
-----------------------------
|
|
|
|
.. start-new-features-r7
|
|
|
|
The list below provides a detailed list of new features and links to the
|
|
associated user guides (if applicable).
|
|
|
|
*********************
|
|
Debian-based Solution
|
|
*********************
|
|
|
|
|prod| |deb-eval-release| inherits the 5.10 kernel version from the Yocto
|
|
project introduced in |prod| |deb-510-kernel-release|, i.e. the Debian
|
|
5.10 kernel is replaced with the Yocto project 5.10.x kernel (linux-yocto).
|
|
|
|
|prod| |deb-eval-release| is a Technology Preview Release of Debian |prod|
|
|
for evaluation purposes.
|
|
|
|
|prod| |deb-eval-release| release runs Debian Bullseye (11.3). It is limited in
|
|
scope to the |AIO-SX| configuration, |deb-dup-std-na|. It is also limited in
|
|
scope to Kubernetes apps and does not yet support running OpenStack on Debian.
|
|
|
|
.. .. **See**:
|
|
|
|
.. .. - :ref:`index-debian-introduction-8eb59cf0a062`
|
|
|
|
.. .. - :ref:`operational-impacts-9cf2e610b5b3`
|
|
|
|
|
|
******************************
|
|
Istio Service Mesh Application
|
|
******************************
|
|
|
|
The Istio Service Mesh application is integrated into |prod| as a system
|
|
application.
|
|
|
|
Istio provides traffic management, observability as well as security as a
|
|
Kubernetes service mesh. For more information, see `https://istio.io/
|
|
<https://istio.io/>`__.
|
|
|
|
|prod| includes istio-operator container to manage the life cycle management
|
|
of the Istio components.
|
|
|
|
**See**: :ref:`Istio Service Mesh Application <istio-service-mesh-application-eee5ebb3d3c4>`
|
|
|
|
|
|
*********************************
|
|
Pod Security Admission Controller
|
|
*********************************
|
|
|
|
The Beta release of Pod Security Admission (PSA) controller is available in
|
|
StarlingX release 7.0 as a Technology Preview feature. It will replace Pod
|
|
Security Policies in a future release.
|
|
|
|
PSA controller acts on creation and modification of the pod and determines
|
|
if it should be admitted based on the requested security context and the
|
|
policies defined. It provides a more usable k8s-native solution to enforce
|
|
Pod Security Standards.
|
|
|
|
**See**:
|
|
|
|
- https://kubernetes.io/docs/concepts/security/pod-security-admission/
|
|
- :ref:`pod-security-admission-controller-8e9e6994100f`
|
|
|
|
|
|
****************************************
|
|
Platform Application Components Revision
|
|
****************************************
|
|
|
|
The following applications have been updated to a new version in |prod|
|
|
Release 7.0.
|
|
|
|
- cert-manager, 1.7.1
|
|
- metric-server, 1.0.18
|
|
- nginx-ingress-controller, 1.1.1
|
|
- oidc-dex, 2.31.1
|
|
|
|
**cert-manager**
|
|
|
|
The upgrade of cert-manager from 0.15.0 to 1.7.1 deprecated support for
|
|
cert manager API versions cert-manager.io/v1alpha2 and cert-manager.io/v1alpha3.
|
|
When creating cert-manager |CRDs| (certificates, issuers, etc) with |prod-long|,
|
|
Release 7.0, use API version of cert-manager.io/v1.
|
|
|
|
Cert manager resources that are already deployed on the system will be
|
|
automatically converted to API version of cert-manager.io/v1. Anything created
|
|
using automation or previous |prod-long| releases should be converted with the
|
|
cert-manager kubectl plugin using the instructions documented in
|
|
https://cert-manager.io/docs/installation/upgrading/upgrading-0.16-1.0/#converting-resources
|
|
before being deployed to the new release.
|
|
|
|
**metric-server**
|
|
|
|
In |prod| Release 7.0 the Metrics Server will NOT be automatically updated.
|
|
To update the Metrics Server, see :ref:`Install Metrics Server <kubernetes-admin-tutorials-metrics-server>`
|
|
|
|
**oidc-dex**
|
|
|
|
|prod-long| Release 7.0 supports helm-overrides of oidc-auth-apps application.
|
|
The recommended and legacy example Helm overrides of
|
|
``oidc-auth-apps`` are supported for upgrades, as described in |prod|
|
|
documentation :ref:`User Authentication Using Windows Active Directory
|
|
<user-authentication-using-windows-active-directory-security-index>`.
|
|
|
|
**See**: :ref:`configure-oidc-auth-applications`.
|
|
|
|
|
|
***************
|
|
Bond CNI plugin
|
|
***************
|
|
|
|
The Bond CNI plugin v1.0.1 is now supported in |prod-long| Release 7.0.
|
|
|
|
The Bond CNI plugin provides a method for aggregating multiple network
|
|
interfaces into a single logical "bonded" interface.
|
|
|
|
To add a bonded interface to a container, a network attachment definition of
|
|
type ``bond`` must be created and added as a network annotation in the pod
|
|
specification. The bonded interfaces can either be taken from the host or
|
|
container based on the value of the ``linksInContainer`` parameter in the
|
|
network attachment definition. It provides transparent link aggregation for
|
|
containerized applications via K8s configuration for improved redundancy and
|
|
link capacity.
|
|
|
|
**See**:
|
|
|
|
:ref:`integrate-the-bond-cni-plugin-2c2f14733b46`
|
|
|
|
************************************************
|
|
PTP GNSS and Time SyncE Support for 5G Solutions
|
|
************************************************
|
|
|
|
Intel's E810 Westport Channel and **Logan Beach NICs** support a built-in GNSS
|
|
module and the ability to distribute clock via Synchronous Ethernet (SyncE).
|
|
This feature allows a PPS signal to be taken in via the |GNSS| module and
|
|
redistributed to additional NICs on the same host or on different hosts.
|
|
This behavior is configured on |prod| using the ``clock`` instance type in
|
|
the |PTP| configuration.
|
|
|
|
These parameters are used to enable the UFL/SMA ports, recovered clock
|
|
syncE etc. Refer to the user's guide for the Westport Channel or Logan
|
|
Beach NIC for additional details on how to operate these cards.
|
|
|
|
**See**: :ref:`SyncE and Introduction <gnss-and-synce-support-62004dc97f3e>`
|
|
|
|
*********************
|
|
PTP Clock TAI Support
|
|
*********************
|
|
|
|
A special ptp4l instance level parameter is provided to allow a PTP node to
|
|
set the **currentUtcOffsetValid** flag in its announce messages and to
|
|
correctly set the CLOCK_TAI on the system.
|
|
|
|
**PTP Multiple NIC Boundary Clock Configuration**
|
|
StarlingX 7.0 provides support for PTP multiple NIC Boundary Clock
|
|
configuration. Multiple instances of ptp4l, phc2sys and ts2phc can now be
|
|
configured on each host to support a variety of configurations including
|
|
Telecom Boundary clock (T-BC), Telecom Grand Primary clock (T-GM) and Ordinary
|
|
clock (OC).
|
|
|
|
**See**:
|
|
|
|
:ref:`ptp-server-config-index`
|
|
|
|
|
|
**************************************************
|
|
Enhanced Parallel Operations for Distributed Cloud
|
|
**************************************************
|
|
|
|
The following operations can now be performed on a larger number of subclouds
|
|
in parallel. The supported maximum parallel number ranges from 100 to 500
|
|
depending on the type of operation.
|
|
|
|
- Subcloud Install
|
|
- Subcloud Deployment (bootstrap and deploy)
|
|
- Subcloud Manage and Sync
|
|
- Subcloud Application Deployment/Update
|
|
- Patch Orchestration
|
|
- Upgrade Orchestration
|
|
- Firmware Update Orchestration
|
|
- Kubernetes Upgrade Orchestration
|
|
- Kubernetes Root CA Orchestration
|
|
- Upgrade Prestaging
|
|
|
|
**************
|
|
--force option
|
|
**************
|
|
|
|
The ``--force`` option has been added to the :command:`dcmanager upgrade-strategy create`
|
|
command. This option upgrades both online and offline subclouds for a single
|
|
subcloud or a group of subclouds.
|
|
|
|
See :ref:`Distributed Upgrade Orchestration Process Using the CLI <distributed-upgrade-orchestration-process-using-the-cli>`
|
|
|
|
****************************************
|
|
Subcloud Local Installation Enhancements
|
|
****************************************
|
|
|
|
Error preventive mechanisms have been implemented for subcloud local
|
|
installation.
|
|
|
|
- Pre-check to avoid overwriting installed systems
|
|
- Unified ISO image for multiple systems and disk configurations
|
|
- Prestage execution optimization
|
|
- Effective handling of resized docker and docker-distribution filesystems
|
|
over subcloud upgrade
|
|
|
|
See :ref:`Subcloud Deployment with Local Installation <subcloud-deployment-with-local-installation-4982449058d5>`.
|
|
|
|
***********************************************
|
|
Distributed Cloud Horizon Orchestration Updates
|
|
***********************************************
|
|
|
|
You can use the Horizon Web interface to upgrade Kubernetes across the
|
|
Distributed Cloud system by applying the Kubernetes upgrade strategy for
|
|
Distributed Cloud Orchestration.
|
|
|
|
**See**: :ref:`apply-a-kubernetes-upgrade-strategy-using-horizon-2bb24c72e947`
|
|
|
|
You can use Horizon to update the device/firmware image across the Distributed
|
|
Cloud system by applying the firmware update strategy for Distributed Cloud
|
|
Update Orchestration.
|
|
|
|
**See**: :ref:`apply-the-firmware-update-strategy-using-horizon-e78bf11c7189`
|
|
|
|
You can upgrade the platform software across the Distributed Cloud
|
|
system by applying the upgrade strategy for Distributed Cloud
|
|
Upgrade Orchestration.
|
|
|
|
**See**: :ref:`apply-the-upgrade-strategy-using-horizon-d0aab18cc724`
|
|
|
|
You can use the Horizon Web interface as an alternative to the CLI for managing
|
|
device / firmware image update strategies (Firmware update).
|
|
|
|
**See**: :ref:`create-a-firmware-update-orchestration-strategy-using-horizon-cfecdb67cef2`
|
|
|
|
You can use the Horizon Web interface as an alternative to the CLI for managing
|
|
Kubernetes upgrade strategies.
|
|
|
|
**See**: :ref:`create-a-kubernetes-upgrade-orchestration-using-horizon-16742b62ffb2`
|
|
|
|
For more information, **See**: :ref:`Distributed Cloud Guide <index-dist-cloud-kub-95bef233eef0>`
|
|
|
|
********************************************
|
|
Security Audit Logging for Platform Commands
|
|
********************************************
|
|
|
|
|prod| logs all StarlingX REST API operator commands, except commands that use
|
|
only GET requests. |prod| also logs all |SNMP| commands, including ``GET``
|
|
requests.
|
|
|
|
**See**:
|
|
|
|
- :ref:`Operator Command Logging <operator-command-logging>`
|
|
- :ref:`Operator Login/Authentication Logging <operator-login-authentication-logging>`
|
|
|
|
**********************************
|
|
Security Audit Logging for K8s API
|
|
**********************************
|
|
|
|
Kubernetes API Logging can be enabled and configured in |prod|, and can be
|
|
fully configured and enabled at bootstrap time. Post-bootstrap, Kubernetes API
|
|
logging can only be enabled or disabled. Kubernetes auditing provides a
|
|
security-relevant, chronological set of records documenting the sequence of
|
|
actions in a cluster.
|
|
|
|
**See**: :ref:`kubernetes-operator-command-logging-663fce5d74e7`
|
|
|
|
*******************************************
|
|
Playbook for managing local LDAP Admin User
|
|
*******************************************
|
|
|
|
The purpose of this playbook is to simplify and automate the management of
|
|
composite Local |LDAP| accounts across multiple |DC| systems or standalone
|
|
systems. A composite Local |LDAP| account is defined as a Local |LDAP| account
|
|
that also has a unique keystone account with admin role credentials and access
|
|
to a K8S serviceAccount with ``cluster-admin`` role credentials.
|
|
|
|
**See**: :ref:`Manage Composite Local LDAP Accounts at Scale <manage-local-ldap-39fe3a85a528>`
|
|
|
|
*******************************
|
|
Kubernetes Custom Configuration
|
|
*******************************
|
|
|
|
Kubernetes configuration can be customized during deployment by specifying
|
|
bootstrap overrides in the ``localhost.yml`` file during the Ansible bootstrap
|
|
process. Additionally, you can also override the **extraVolumes** section in the
|
|
apiserver to add new configuration files that may be needed by the server.
|
|
|
|
**See**: :ref:`Kubernetes Custom Configuration <kubernetes-custom-configuration-31c1fd41857d>`
|
|
|
|
***********************************
|
|
Configuring Host CPU MHz Parameters
|
|
***********************************
|
|
|
|
Some hosts support setting a maximum frequency for their CPU cores (application
|
|
cores and platform cores). You may need to configure a maximum scaled
|
|
frequency to avoid variability due to power and thermal issues when configured
|
|
for maximum performance. For these hosts, the parameters control the maximum
|
|
frequency of their CPU cores.
|
|
|
|
Enable support for power saving modes available on Intel processors to
|
|
facilitate a balance between latency and power consumption.
|
|
|
|
- |prod-long| permits the CPU "p-states" and "c-states" control via the BIOS
|
|
|
|
- Introduce a new starlingx-realtime tuned profile, specifically configured
|
|
for the low latency profile to align with Intel recommendations for maximum
|
|
performance while enabling support for higher c-states.
|
|
|
|
**See**: :ref:`Host CPU MHz Parameters Configuration <host-cpu-mhz-parameters-configuration-d9ccf907ede0>`
|
|
|
|
**************************
|
|
vRAN Intel Tool Enablement
|
|
**************************
|
|
|
|
The following open-source |vRAN| tools are delivered in the following container
|
|
image, ``docker.io/starlingx/stx-centos-tools-dev:stx.7.0-v1.0.1``:
|
|
|
|
- ``dmidecode``
|
|
|
|
- ``net-tools``
|
|
|
|
- ``iproute``
|
|
|
|
- ``ethtool``
|
|
|
|
- ``tcpdump``
|
|
|
|
- ``turbostat``
|
|
|
|
- OPAE Tools (`Open Programmable Acceleration Engine
|
|
<https://opae.github.io/latest/>`__, ``fpgainfo``, ``fpgabist``, etc.)
|
|
|
|
- ACPICA Tools (``acpidump``, ``acpixtract``, etc.)
|
|
|
|
- PCM Tools (`https://github.com/opcm/pcm <https://github.com/opcm/pcm>`__,
|
|
pcm, pcm-core, etc.)
|
|
|
|
**See**: :ref:`vRAN Tools <vran-tools-2c3ee49f4b0b>`
|
|
|
|
******************************
|
|
Coredump Configuration Support
|
|
******************************
|
|
|
|
You can change the default core dump configuration used to create *core*
|
|
files. These are images of the system's working memory used to debug crashes or
|
|
abnormal exits.
|
|
|
|
**See**: :ref:`Change the Default Coredump Configuration <change-the-default-coredump-configuration-51ff4ce0c9ae>`
|
|
|
|
******************************
|
|
FluxCD replaces Airship Armada
|
|
******************************
|
|
|
|
|prod| application management provides a wrapper around FluxCD and Kubernetes
|
|
Helm (see `https://github.com/helm/helm <https://github.com/helm/helm>`__)
|
|
for managing containerized applications. FluxCD is a tool for managing multiple
|
|
Helm charts with dependencies by centralizing all configurations in a single
|
|
FluxCD YAML definition and providing life-cycle hooks for all Helm releases.
|
|
|
|
**See**: :ref:`StarlingX Application Package Manager <kubernetes-admin-tutorials-starlingx-application-package-manager>`.
|
|
**See**: FluxCD Limitation note applicable to |prod| Release 7.0.
|
|
|
|
******************
|
|
Kubernetes Upgrade
|
|
******************
|
|
|
|
Kubernetes has now been upgraded to k8s 1.23.1 and is the default version for
|
|
|prod-long| Release 7.0.
|
|
|
|
|
|
******************************
|
|
NetApp Trident Version Upgrade
|
|
******************************
|
|
|
|
|prod| |prod-ver| contains the installer for Trident 22.01
|
|
|
|
If you are using NetApp Trident in |prod| |prod-ver| and have upgraded from
|
|
the |prod| previous version, ensure that your NetApp backend version is
|
|
compatible with Trident 22.01.
|
|
|
|
.. note::
|
|
You need to upgrade the NetApp Trident driver to 22.01 before
|
|
upgrading Kubernetes to 1.22.
|
|
|
|
**See**: :ref:`upgrade-the-netapp-trident-software-c5ec64d213d3`
|
|
|
|
.. end-new-features-r7
|
|
|
|
----------
|
|
Bug status
|
|
----------
|
|
|
|
**********
|
|
Fixed bugs
|
|
**********
|
|
|
|
This release provides fixes for a number of defects. Refer to the StarlingX bug
|
|
database to review the R7.0 `Fixed Bugs <https://bugs.launchpad.net/starlingx/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=FIXRELEASED&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.tag=stx.7.0&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on&search=Search>`_.
|
|
|
|
.. All please confirm if any Limitations need to be removed / added
|
|
|
|
---------------------------------
|
|
Known Limitations and Workarounds
|
|
---------------------------------
|
|
|
|
The following are known limitations you may encounter with your |prod| Release
|
|
7.0 and earlier releases. Workarounds are suggested where applicable.
|
|
|
|
.. note::
|
|
|
|
These limitations are considered temporary and will likely be resolved in
|
|
a future release.
|
|
|
|
****************
|
|
Debian Bootstrap
|
|
****************
|
|
|
|
On CentOS bootstrap worked even if **dns_servers** were not present in the
|
|
localhost.yml. This does not work for Debian bootstrap.
|
|
|
|
**Workaround**: You need to configure the **dns_servers** parameter in the
|
|
localhost.yml, as long as no |FQDNs| were used in the bootstrap overrides in
|
|
the localhost.yml file for Debian bootstrap.
|
|
|
|
***********************
|
|
Installing a Debian ISO
|
|
***********************
|
|
|
|
Installing a Debian ISO may fail with a message that the system is in emergency
|
|
mode. This occurs if the disks and disk partitions are not completely wiped
|
|
before the install, especially if the server was previously running a CentOS
|
|
ISO.
|
|
|
|
**Workaround**: When installing a lab for any Debian install, the disks must
|
|
first be completely wiped using the following procedure before starting
|
|
an install.
|
|
|
|
Use the following wipedisk commands to run before any Debian install for
|
|
each disk (eg: sda, sdb, etc):
|
|
|
|
.. code-block:: none
|
|
|
|
sudo wipedisk
|
|
# Show
|
|
sudo sgdisk -p /dev/sda
|
|
# Clear part table
|
|
sudo sgdisk -o /dev/sda
|
|
|
|
.. note::
|
|
|
|
The above commands must be run before any Debian install. The above
|
|
commands must also be run if the same lab is used for CentOS installs after
|
|
the lab was previously running a Debian ISO.
|
|
|
|
**********************************************
|
|
PTP 110.119 Alarm raised incorrectly on Debian
|
|
**********************************************
|
|
|
|
|PTP| Alarm 100.119 (controller not locked on remote PTP Grand Master
|
|
(|PTS| (Primary Time Source)) is raised on |prod| Release 7.0 systems
|
|
running Debian after configuring |PTP| instances. This alarm does not affect
|
|
system operations.
|
|
|
|
**Workaround**: Manually delete the alarm using the :command:`fm alarm-delete`
|
|
command.
|
|
|
|
.. note::
|
|
|
|
Lock/Unlock and reboot events will cause the alarm to reappear. Use the
|
|
workaround after these operations are completed.
|
|
|
|
***********************************************
|
|
N3000 image updates are not supported on Debian
|
|
***********************************************
|
|
|
|
N3000 image ``update`` and ``show`` operations are not supported on Debian.
|
|
Support will be included in a future release.
|
|
|
|
**Workaround**: Do not attempt these operations on a |prod| Release 7.0
|
|
Debian system.
|
|
|
|
**********************************
|
|
Security Audit Logging for K8s API
|
|
**********************************
|
|
|
|
- In |prod| Release 7.0, a custom policy file can only be created at bootstrap
|
|
in ``apiserver_extra_volumes`` section. If a custom policy file was
|
|
configured at bootstrap, then after bootstrap the user has the option to
|
|
configure the parameter ``audit-policy-file`` to either this custom policy
|
|
file (``/etc/kubernetes/my-audit-policy-file.yml``) or the
|
|
default policy file ``/etc/kubernetes/default-audit-policy.yaml``. If no
|
|
custom policy file was configured at bootstrap, then the user can only
|
|
configure the parameter ``audit-policy-file`` to the default policy file.
|
|
|
|
Only the parameter ``audit-policy-file`` is configurable after bootstrap, so
|
|
the other parameters (``audit-log-path``, ``audit-log-maxsize``,
|
|
``audit-log-maxage`` and ``audit-log-maxbackup``) cannot be changed at
|
|
runtime.
|
|
|
|
**Workaround**: NA
|
|
|
|
**See**: :ref:`kubernetes-operator-command-logging-663fce5d74e7`.
|
|
|
|
******************************************
|
|
PTP is not supported on Broadcom 57504 NIC
|
|
******************************************
|
|
|
|
|PTP| is not supported on the Broadcom 57504 NIC.
|
|
|
|
**Workaround**: Do not configure |PTP| instances on the Broadcom 57504
|
|
NIC.
|
|
|
|
*********************************************************************
|
|
Backup and Restore: Remote restore fails to gather the SSH public key
|
|
*********************************************************************
|
|
|
|
IPv4 |AIO-DX| remote restore fails while running restore bootstrap.
|
|
|
|
**Workaround**: If remote restore fails due to failed authentication, perform
|
|
the restore on the box instead of remotely. This issue is caused when
|
|
remote restore fails to gather the SSH public key.
|
|
|
|
************************************************************************************************
|
|
Deploying an App using nginx controller fails with internal error after controller.name override
|
|
************************************************************************************************
|
|
|
|
An Helm override of controller.name to the nginx-ingress-controller app may
|
|
result in errors when creating ingress resources later on.
|
|
|
|
Example of Helm override:
|
|
|
|
.. code-block::none
|
|
|
|
cat <<EOF> values.yml
|
|
controller:
|
|
name: notcontroller
|
|
|
|
EOF
|
|
|
|
~(keystone_admin)$ system helm-override-update nginx-ingress-controller ingress-nginx kube-system --values values.yml
|
|
+----------------+-----------------------+
|
|
| Property | Value |
|
|
+----------------+-----------------------+
|
|
| name | ingress-nginx |
|
|
| namespace | kube-system |
|
|
| user_overrides | controller: |
|
|
| | name: notcontroller |
|
|
| | |
|
|
+----------------+-----------------------+
|
|
|
|
~(keystone_admin)$ system application-apply nginx-ingress-controller
|
|
|
|
**Workaround**: NA
|
|
|
|
**********************************************************************
|
|
Cloud installation causes disk errors in /dev/mapper/mpatha and CentOS
|
|
**********************************************************************
|
|
|
|
During installation of the HPE SAN disk, an error "/dev/mapper/mpatha is invalid"
|
|
occurs (intermittent), and CentOS is not bootable (intermittent).
|
|
|
|
**Workaround**: Reboot the |prod-long| system to solve the issue.
|
|
|
|
****************************************
|
|
Optimization with a Large number of OSDs
|
|
****************************************
|
|
|
|
As Storage nodes are not optimized, you may need to optimize your Ceph
|
|
configuration for balanced operation across deployments with a high number of
|
|
|OSDs|. This results in an alarm being generated even if the installation
|
|
succeeds.
|
|
|
|
800.001 - Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s'
|
|
|
|
**Workaround**: To optimize your storage nodes with a large number of |OSDs|, it
|
|
is recommended to use the following commands:
|
|
|
|
.. code-block:: none
|
|
|
|
$ ceph osd pool set kube-rbd pg_num 256
|
|
$ ceph osd pool set kube-rbd pgp_num 256
|
|
|
|
***************
|
|
PTP Limitations
|
|
***************
|
|
|
|
NICs using the Intel Ice NIC driver may report the following in the `ptp4l``
|
|
logs, which might coincide with a |PTP| port switching to ``FAULTY`` before
|
|
re-initializing.
|
|
|
|
.. code-block:: none
|
|
|
|
ptp4l[80330.489]: timed out while polling for tx timestamp
|
|
ptp4l[80330.CGTS-30543489]: increasing tx_timestamp_timeout may correct
|
|
this issue, but it is likely caused by a driver bug
|
|
|
|
This is due to a limitation of the Intel ICE driver.
|
|
|
|
**Workaround**: The recommended workaround is to set the ``tx_timestamp_timeout``
|
|
parameter to 700 (ms) in the ``ptp4l`` config using the following command.
|
|
|
|
.. code-block:: none
|
|
|
|
~(keystone_admin)]$ system ptp-instance-parameter-add ptp-inst1 tx_timestamp_timeout=700
|
|
|
|
***********************************************************************
|
|
Multiple Lock/Unlock operations on the controllers causes 100.104 alarm
|
|
***********************************************************************
|
|
|
|
Performing multiple Lock/Unlock operations on controllers while |prod-os|
|
|
is applied can fill the partition and can trigger an 100.104 alarm.
|
|
|
|
**Workaround**: Check the amount of space used by core dump using the
|
|
:command:`controller-0:~$ ls -lha /var/lib/systemd/coredump`` command.
|
|
Core dumps related to MariaDB can be safely deleted.
|
|
|
|
***************
|
|
BPF is disabled
|
|
***************
|
|
|
|
|BPF| cannot be used in the PREEMPT_RT/low latency kernel, due to the inherent
|
|
incompatibility between PREEMPT_RT and |BPF|, see, https://lwn.net/Articles/802884/.
|
|
|
|
Some packages might be affected when PREEMPT_RT and BPF are used together. This
|
|
includes the following, but not limited to these packages.
|
|
|
|
- libpcap
|
|
- libnet
|
|
- dnsmasq
|
|
- qemu
|
|
- nmap-ncat
|
|
- libv4l
|
|
- elfutils
|
|
- iptables
|
|
- tcpdump
|
|
- iproute
|
|
- gdb
|
|
- valgrind
|
|
- kubernetes
|
|
- cni
|
|
- strace
|
|
- mariadb
|
|
- libvirt
|
|
- dpdk
|
|
- libteam
|
|
- libseccomp
|
|
- binutils
|
|
- libbpf
|
|
- dhcp
|
|
- lldpd
|
|
- containernetworking-plugins
|
|
- golang
|
|
- i40e
|
|
- ice
|
|
|
|
**Workaround**: StarlingX recommends not to use BPF with real time kernel.
|
|
If required it can still be used, for example, debugging only.
|
|
|
|
*****************
|
|
crashkernel Value
|
|
*****************
|
|
|
|
**crashkernel=auto** is no longer supported by newer kernels, and hence the
|
|
v5.10 kernel will not support the "auto" value.
|
|
|
|
**Workaround**: |prod-long| uses **crashkernel=512m** instead of
|
|
**crashkernel=auto**.
|
|
|
|
********************************************************
|
|
New Kubernetes Taint on Controllers for Standard Systems
|
|
********************************************************
|
|
|
|
In |prod| future Releases, a new Kubernetes taint will be applied to
|
|
controllers for Standard systems in order to prevent application pods from
|
|
being scheduled on controllers; since controllers in Standard systems are
|
|
intended ONLY for platform services. If application pods MUST run on
|
|
controllers, a Kubernetes toleration of the taint can be specified in the
|
|
application's pod specifications.
|
|
|
|
**Workaround**: Customer applications that need to run on controllers on
|
|
Standard systems will need to be enabled/configured for Kubernetes toleration
|
|
in order to ensure the applications continue working after an upgrade to
|
|
|prod-long| Release 7.0 and |prod-long| future Releases.
|
|
|
|
You can specify toleration for a pod through the pod specification (PodSpec).
|
|
For example:
|
|
|
|
.. code-block:: none
|
|
|
|
spec:
|
|
....
|
|
template:
|
|
....
|
|
spec
|
|
tolerations:
|
|
- key: "node-role.kubernetes.io/master"
|
|
operator: "Exists"
|
|
effect: "NoSchedule"
|
|
|
|
**See**: `Taints and Tolerations <https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/>`__.
|
|
|
|
**************************************************************
|
|
Ceph alarm 800.001 interrupts the AIO-DX upgrade orchestration
|
|
**************************************************************
|
|
|
|
Upgrade orchestration fails on |AIO-DX| systems that have Ceph enabled.
|
|
|
|
**Workaround**: Clear the Ceph alarm 800.001 by manually upgrading both
|
|
controllers and using the following command:
|
|
|
|
.. code-block:: none
|
|
|
|
~(keystone_admin)]$ ceph mon enable-msgr2
|
|
|
|
Ceph alarm 800.001 is cleared.
|
|
|
|
***************************************************************
|
|
Storage Nodes are not considered part of the Kubernetes cluster
|
|
***************************************************************
|
|
|
|
When running the :command:`system kube-host-upgrade-list` command the output
|
|
must only display controller and worker hosts that have control-plane and kubelet
|
|
components. Storage nodes do not have any of those components and so are not
|
|
considered a part of the Kubernetes cluster.
|
|
|
|
**Workaround**: Do not include Storage nodes.
|
|
|
|
***************************************************************************************
|
|
Backup and Restore of ACC100 (Mount Bryce) configuration requires double unlock attempt
|
|
***************************************************************************************
|
|
|
|
After restoring from a previous backup with an Intel ACC100 processing
|
|
accelerator device, the first unlock attempt will be refused since this
|
|
specific kind of device will be updated in the same context.
|
|
|
|
**Workaround**: A second attempt after few minutes will accept and unlock the
|
|
host.
|
|
|
|
**************************************
|
|
Application Pods with SRIOV Interfaces
|
|
**************************************
|
|
|
|
Application Pods with |SRIOV| Interfaces require a **restart-on-reboot: "true"**
|
|
label in their pod spec template.
|
|
|
|
Pods with |SRIOV| interfaces may fail to start after a platform restore or
|
|
Simplex upgrade and persist in the **Container Creating** state due to missing
|
|
PCI address information in the CNI configuration.
|
|
|
|
**Workaround**: Application pods that require|SRIOV| should add the label
|
|
**restart-on-reboot: "true"** to their pod spec template metadata. All pods with
|
|
this label will be deleted and recreated after system initialization, therefore
|
|
all pods must be restartable and managed by a Kubernetes controller
|
|
\(i.e. DaemonSet, Deployment or StatefulSet) for auto recovery.
|
|
|
|
Pod Spec template example:
|
|
|
|
.. code-block:: none
|
|
|
|
template:
|
|
metadata:
|
|
labels:
|
|
tier: node
|
|
app: sriovdp
|
|
restart-on-reboot: "true"
|
|
|
|
|
|
***********************
|
|
Management VLAN Failure
|
|
***********************
|
|
|
|
If the Management VLAN fails on the active System Controller, communication
|
|
failure 400.005 is detected, and alarm 280.001 is raised indicating
|
|
subclouds are offline.
|
|
|
|
**Workaround**: System Controller will recover and subclouds are manageable
|
|
when the Management VLAN is restored.
|
|
|
|
********************************
|
|
Host Unlock During Orchestration
|
|
********************************
|
|
|
|
If a host unlock during orchestration takes longer than 30 minutes to complete,
|
|
a second reboot may occur. This is due to the delays, VIM tries to abort. The
|
|
abort operation triggers the second reboot.
|
|
|
|
**Workaround**: NA
|
|
|
|
**************************************
|
|
Storage Nodes Recovery on Power Outage
|
|
**************************************
|
|
|
|
Storage nodes take 10-15 minutes longer to recover in the event of a full
|
|
power outage.
|
|
|
|
**Workaround**: NA
|
|
|
|
*************************************
|
|
Ceph OSD Recovery on an AIO-DX System
|
|
*************************************
|
|
|
|
In certain instances a Ceph OSD may not recover on an |AIO-DX| system
|
|
\(for example, if an OSD comes up after a controller reboot and a swact
|
|
occurs), and remains in the down state when viewed using the :command:`ceph -s`
|
|
command.
|
|
|
|
**Workaround**: Manual recovery of the OSD may be required.
|
|
|
|
********************************************************
|
|
Using Helm with Container-Backed Remote CLIs and Clients
|
|
********************************************************
|
|
|
|
If **Helm** is used within Container-backed Remote CLIs and Clients:
|
|
|
|
- You will NOT see any helm installs from |prod| Platform's system
|
|
Armada applications.
|
|
|
|
**Workaround**: Do not directly use **Helm** to manage |prod| Platform's
|
|
system Armada applications. Manage these applications using
|
|
:command:`system application` commands.
|
|
|
|
- You will NOT see any helm installs from end user applications, installed
|
|
using **Helm** on the controller's local CLI.
|
|
|
|
**Workaround**: It is recommended that you manage your **Helm**
|
|
applications only remotely; the controller's local CLI should only be used
|
|
for management of the |prod| Platform infrastructure.
|
|
|
|
*********************************************************************
|
|
Remote CLI Containers Limitation for StarlingX Platform HTTPS Systems
|
|
*********************************************************************
|
|
|
|
The python2 SSL lib has limitations with reference to how certificates are
|
|
validated. If you are using Remote CLI containers, due to a limitation in
|
|
the python2 SSL certificate validation, the certificate used for the 'ssl'
|
|
certificate should either have:
|
|
|
|
#. CN=IPADDRESS and SAN=empty or,
|
|
|
|
#. CN=FQDN and SAN=FQDN
|
|
|
|
**Workaround**: Use CN=FQDN and SAN=FQDN as CN is a deprecated field in
|
|
the certificate.
|
|
|
|
*******************************************************************
|
|
Cert-manager does not work with uppercase letters in IPv6 addresses
|
|
*******************************************************************
|
|
|
|
Cert-manager does not work with uppercase letters in IPv6 addresses.
|
|
|
|
**Workaround**: Replace the uppercase letters in IPv6 addresses with lowercase
|
|
letters.
|
|
|
|
.. code-block:: none
|
|
|
|
apiVersion: cert-manager.io/v1
|
|
kind: Certificate
|
|
metadata:
|
|
name: oidc-auth-apps-certificate
|
|
namespace: test
|
|
spec:
|
|
secretName: oidc-auth-apps-certificate
|
|
dnsNames:
|
|
- ahost.com
|
|
ipAddresses:
|
|
- fe80::903a:1c1a:e802::11e4
|
|
issuerRef:
|
|
name: cloudplatform-interca-issuer
|
|
kind: Issuer
|
|
|
|
*******************************
|
|
Kubernetes Root CA Certificates
|
|
*******************************
|
|
|
|
Kubernetes does not properly support **k8s_root_ca_cert** and **k8s_root_ca_key**
|
|
being an Intermediate CA.
|
|
|
|
**Workaround**: Accept internally generated **k8s_root_ca_cert/key** or
|
|
customize only with a Root CA certificate and key.
|
|
|
|
************************
|
|
Windows Active Directory
|
|
************************
|
|
|
|
- **Limitation**: The Kubernetes API does not support uppercase IPv6 addresses.
|
|
|
|
**Workaround**: The issuer_url IPv6 address must be specified as lowercase.
|
|
|
|
- **Limitation**: The refresh token does not work.
|
|
|
|
**Workaround**: If the token expires, manually replace the ID token. For
|
|
more information, see, :ref:`Obtain the Authentication Token Using the Browser <obtain-the-authentication-token-using-the-browser>`.
|
|
|
|
- **Limitation**: TLS error logs are reported in the **oidc-dex** container
|
|
on subclouds. These logs should not have any system impact.
|
|
|
|
**Workaround**: NA
|
|
|
|
- **Limitation**: **stx-oidc-client** liveness probe sometimes reports
|
|
failures. These errors may not have system impact.
|
|
|
|
**Workaround**: NA
|
|
|
|
.. Stx LP Bug: https://bugs.launchpad.net/starlingx/+bug/1846418
|
|
|
|
************
|
|
BMC Password
|
|
************
|
|
|
|
The BMC password cannot be updated.
|
|
|
|
**Workaround**: In order to update the BMC password, de-provision the BMC,
|
|
and then re-provision it again with the new password.
|
|
|
|
****************************************
|
|
Application Fails After Host Lock/Unlock
|
|
****************************************
|
|
|
|
In some situations, application may fail to apply after host lock/unlock due to
|
|
previously evicted pods.
|
|
|
|
**Workaround**: Use the :command:`kubectl delete` command to delete the evicted
|
|
pods and reapply the application.
|
|
|
|
***************************************
|
|
Application Apply Failure if Host Reset
|
|
***************************************
|
|
|
|
If an application apply is in progress and a host is reset it will likely fail.
|
|
A re-apply attempt may be required once the host recovers and the system is
|
|
stable.
|
|
|
|
**Workaround**: Once the host recovers and the system is stable, a re-apply
|
|
may be required.
|
|
|
|
********************************
|
|
Pod Recovery after a Host Reboot
|
|
********************************
|
|
|
|
On occasions some pods may remain in an unknown state after a host is rebooted.
|
|
|
|
**Workaround**: To recover these pods kill the pod. Also based on `https://github.com/kubernetes/kubernetes/issues/68211 <https://github.com/kubernetes/kubernetes/issues/68211>`__
|
|
it is recommended that applications avoid using a subPath volume configuration.
|
|
|
|
****************************
|
|
Rare Node Not Ready Scenario
|
|
****************************
|
|
|
|
In rare cases, an instantaneous loss of communication with the active
|
|
**kube-apiserver** may result in kubernetes reporting node\(s) as stuck in the
|
|
"Not Ready" state after communication has recovered and the node is otherwise
|
|
healthy.
|
|
|
|
**Workaround**: A restart of the **kublet** process on the affected node\(s)
|
|
will resolve the issue.
|
|
|
|
*************************
|
|
Platform CPU Usage Alarms
|
|
*************************
|
|
|
|
Alarms may occur indicating platform cpu usage is \>90% if a large number of
|
|
pods are configured using liveness probes that run every second.
|
|
|
|
**Workaround**: To mitigate either reduce the frequency for the liveness
|
|
probes or increase the number of platform cores.
|
|
|
|
*******************
|
|
Pods Using isolcpus
|
|
*******************
|
|
|
|
The isolcpus feature currently does not support allocation of thread siblings
|
|
for cpu requests (i.e. physical thread +HT sibling).
|
|
|
|
**Workaround**: NA
|
|
|
|
*****************************
|
|
system host-disk-wipe command
|
|
*****************************
|
|
|
|
The system host-disk-wipe command is not supported in this release.
|
|
|
|
**Workaround**: NA
|
|
|
|
*************************************************************
|
|
Restrictions on the Size of Persistent Volume Claims (PVCs)
|
|
*************************************************************
|
|
|
|
There is a limitation on the size of Persistent Volume Claims (PVCs) that can
|
|
be used for all StarlingX Platform Releases.
|
|
|
|
**Workaround**: It is recommended that all PVCs should be a minimum size of
|
|
1GB. For more information, see, `https://bugs.launchpad.net/starlingx/+bug/1814595 <https://bugs.launchpad.net/starlingx/+bug/1814595>`__.
|
|
|
|
***************************************************************
|
|
Sub-Numa Cluster Configuration not Supported on Skylake Servers
|
|
***************************************************************
|
|
|
|
Sub-Numa cluster configuration is not supported on Skylake servers.
|
|
|
|
**Workaround**: For servers with Skylake Gold or Platinum CPUs, Sub-NUMA
|
|
clustering must be disabled in the BIOS.
|
|
|
|
*****************************************************************
|
|
The ptp-notification-demo App is Not a System-Managed Application
|
|
*****************************************************************
|
|
|
|
The ptp-notification-demo app is provided for demonstration purposes only.
|
|
Therefore, it is not supported on typical platform operations such as Backup
|
|
and Restore.
|
|
|
|
**Workaround**: NA
|
|
|
|
*************************************************************************
|
|
Deleting image tags in registry.local may delete tags under the same name
|
|
*************************************************************************
|
|
|
|
When deleting image tags in the registry.local docker registry, you should be
|
|
aware that the deletion of an **<image-name:tag-name>** will delete all tags
|
|
under the specified <image-name> that have the same 'digest' as the specified
|
|
<image-name:tag-name>. For more information, see, :ref:`Delete Image Tags in the Docker Registry <delete-image-tags-in-the-docker-registry-8e2e91d42294>`.
|
|
|
|
**Workaround**: NA
|
|
|
|
*****************
|
|
Vault Application
|
|
*****************
|
|
|
|
The Vault application is not supported in |prod| Release 7.0.
|
|
|
|
**Workaround**: NA
|
|
|
|
*********************
|
|
Portieris Application
|
|
*********************
|
|
|
|
The Portieris application is not supported in |prod| Release 7.0.
|
|
|
|
**Workaround**: NA
|
|
|
|
------------------
|
|
Deprecated Notices
|
|
------------------
|
|
|
|
***********************
|
|
Control Group parameter
|
|
***********************
|
|
|
|
The control group (cgroup) parameter **kmem.limit_in_bytes** has been
|
|
deprecated, and results in the following message in the kernel's log buffer
|
|
(dmesg) during boot-up and/or during the Ansible bootstrap procedure:
|
|
"kmem.limit_in_bytes is deprecated and will be removed. Please report your
|
|
usecase to linux-mm@kvack.org if you depend on this functionality." This
|
|
parameter is used by a number of software packages in |prod|, including,
|
|
but not limited to, **systemd, docker, containerd, libvirt** etc.
|
|
|
|
**Workaround**: NA. This is only a warning message about the future deprecation
|
|
of an interface.
|
|
|
|
****************************
|
|
Airship Armada is deprecated
|
|
****************************
|
|
|
|
StarlingX Release 7.0 introduces FluxCD based applications that utilize FluxCD
|
|
Helm/source controller pods deployed in the flux-helm Kubernetes namespace.
|
|
Airship Armada support is now considered to be deprecated. The Armada pod will
|
|
continue to be deployed for use with any existing Armada based applications but
|
|
will be removed in StarlingX Release 8.0, once the stx-openstack Armada
|
|
application is fully migrated to FluxCD.
|
|
|
|
**Workaround**: NA
|