Floating IPv6 addresses are configured with "preferred_lft 0", which
designates them as deprecated. As a result, when the host initiates
traffic, the static network address is selected as the source.
This can lead to situations where the upstream switch does not
observe packets originating from the floating address. Consequently,
the switch may lack the necessary MAC address mapping to deliver
traffic destined for the floating address. While switches are expected
to perform NDP Neighbor Solicitations to discover the floating
address's MAC, this process is not always reliable, especially when
relying solely on external traffic directed towards the host.
To accelerate the network's learning of the floating address's MAC,
an unsolicited Neighbor Advertisement (analogous to gratuitous ARP in
IPv4) is transmitted.
This behavior is primarily observed during the initial unlock of
controller-0, before the Service-Manager (SM) takes over the
management of the floating addresses. Subsequent unlocks do not
exhibit this issue.
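As an illustration only, the message described above can be sketched
in Python. This is not the code added by this change; the function
name and the example address and MAC are hypothetical. Per RFC 4861,
an unsolicited Neighbor Advertisement is sent to the all-nodes address
ff02::1 with the Solicited flag clear, and here the Override flag is
set so that neighbors refresh their cache entry.

```python
import ipaddress
import struct

ICMPV6_NEIGHBOR_ADVERTISEMENT = 136  # RFC 4861, section 4.4
OPT_TARGET_LINK_LAYER_ADDR = 2       # RFC 4861, section 4.6.1
FLAG_OVERRIDE = 0x20000000           # O bit set; R and S bits stay clear


def build_unsolicited_na(floating_addr: str, mac: str) -> bytes:
    """Return the ICMPv6 body of an unsolicited Neighbor Advertisement.

    The checksum field is left at zero; a raw ICMPv6 socket on Linux
    fills it in when the packet is sent.
    """
    target = ipaddress.IPv6Address(floating_addr).packed
    lladdr = bytes(int(octet, 16) for octet in mac.split(":"))
    if len(lladdr) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Type, code, checksum placeholder, then the 32-bit flags word.
    header = struct.pack("!BBHI", ICMPV6_NEIGHBOR_ADVERTISEMENT, 0, 0, FLAG_OVERRIDE)
    # Option length is counted in units of 8 octets: 2 header + 6 MAC bytes.
    option = struct.pack("!BB", OPT_TARGET_LINK_LAYER_ADDR, 1) + lladdr
    return header + target + option


# Hypothetical floating address and interface MAC.
message = build_unsolicited_na("fd00::2", "02:00:00:00:00:01")
```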
Test Plan
=========
[PASS] Install controller-0 on an AIO-DX configuration (IPv6), execute
network configuration and perform the first unlock. Once the system
returns, observe from external devices connected to controller-0
interfaces that Unsolicited Advertisements are sent for the
floating addresses.
Closes-Bug: 2101145
Change-Id: If2a446ec5a8668b4dbb3e583aa8cb06cb49da0ff
Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>
This reverts commit 4d7a0438f626638c7a5b307ca32bfd2616f7661e.
Reason for revert: generated auto file in incorrect order, causing ifup to fail for some labels during startup.
Change-Id: Ic549a293bf28edf29f9d49d2a954791af4711f20
This commit sets the permissions of all .crt files in the
/etc/kubernetes/pki directory to 600.
The change takes effect when peer nodes are deployed using the
PXE server.
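The effect of the change can be sketched in Python as follows. This is
a minimal illustration, not the code in this commit, and the function
name is hypothetical.

```python
import glob
import os


def restrict_cert_permissions(pki_dir: str) -> list:
    """Set mode 600 on every .crt file directly under pki_dir."""
    changed = []
    for path in sorted(glob.glob(os.path.join(pki_dir, "*.crt"))):
        os.chmod(path, 0o600)
        changed.append(path)
    return changed
```

On a deployed node it would be called as
restrict_cert_permissions("/etc/kubernetes/pki") with root privileges.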
TEST CASES:
PASSED: Run full build, system install, bootstrap and unlock (SX)
PASSED: System install, bootstrap, unlock and swact (DX)
PASSED: Checked permission using below command
"ls -al /etc/kubernetes/pki/*.crt"
PASSED: Checked whether certificates are accessible and readable
Example:
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text
PASSED: Checked status of kubernetes cluster and pods.
PASSED: No alarms when running "fm alarm-list"
Story: 2011334
Task: 51677
Depends-On: https://review.opendev.org/c/starlingx/ansible-playbooks/+/940238
Change-Id: I9f05b0e9e35910d5a1a113d2be02635d48bc1063
Signed-off-by: sshaikh1 <sirin.shaikh@windriver.com>
Currently, various file permissions under /var/log/ are more
permissive than 640. To comply with the CIS benchmark
requirements, the permissions should be set to 640 or more
restrictive.
This change updates the permissions and ownership of files
under /var/log/ to 640. Ownership is also set to root:root
wherever possible.
Below are the exceptions where permissions or ownership are not updated:
- /var/log/keystone/keystone.log: ownership set to keystone:keystone
After changing the user and group to root:root, Ansible bootstrap
is failing as keystone is unable to write to keystone.log.
- /var/log/flux/helm-controller.log: ownership set to nobody:nogroup
We can't change this to root:root because container won't be able
to write the logs.
- /var/log/flux/source-controller.log: ownership set to
nobody:nogroup. We can't change this to root:root because
container won't be able to write the logs.
- /var/log/puppet/masterhttp.log: mode set to 660. After the
permission is changed to 640, it reverts to 660 after some
time.
- /var/log/puppet/masterhttp.log: ownership set to puppet:puppet.
After the ownership is changed to "root:root", it reverts to
"puppet:puppet" after some time.
- /var/log/horizon_sm.log: mode set to 644. Unable to modify it
because it is generated after this manifest completes execution.
- /var/log/multus.log: mode set to 644. Unable to modify it because
it is generated after this manifest completes execution.
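A check for "640 or more restrictive" can be sketched in Python as
below. This is an illustrative audit helper, not part of this change
and not the CIS script; the function names are hypothetical. A mode is
treated as compliant when it grants no permission bit outside 640.

```python
import os
import stat

MAX_MODE = 0o640


def exceeds_mode(mode: int, limit: int = MAX_MODE) -> bool:
    """True if mode grants any permission bit that limit does not."""
    return bool(stat.S_IMODE(mode) & ~limit & 0o777)


def find_permissive_logs(log_dir: str = "/var/log") -> list:
    """List regular files under log_dir that are more permissive than MAX_MODE."""
    offenders = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and exceeds_mode(st.st_mode):
                offenders.append((path, oct(stat.S_IMODE(st.st_mode))))
    return offenders
```

With this definition, 644 and 660 are reported while 640 and 600 are
accepted.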
Test Plan:
PASS: Build ISO and deploy AIO-SX, AIO-DX, standard, and storage.
PASS: Verify that all files under /var/log/ on all types of nodes
(controller-0, controller-1, compute), except for
those listed as exceptions, have 640 or more restrictive
permissions and ownership as root:root in the standard
deployment.
PASS: Verify that all files under /var/log/ on controller-0, except
for those listed as exceptions, have 640 or more restrictive
permissions and ownership as root:root in the AIO-SX
deployment.
PASS: Verify that all files under /var/log/ on controller-0 and
controller-1, except for those listed as exceptions, have
640 or more restrictive permissions and ownership as
root:root in the AIO-DX deployment.
PASS: Standard: check ceph health using the 'ceph -s' command and
verify that the cluster health is ok.
PASS: Standard: swact the controller and verify that there is no alarm.
Verify that the log permissions have not been reverted.
PASS: AIO-SX: Run the CIS script as mentioned in the specification
3-4 hours after installation to confirm that the file
permissions and ownership modified by this change have not
been reverted.
PASS: AIO-SX: Run the CIS benchmark test one day after installation
on controller-0 to verify that the file permissions and
ownership modified by this change remain unchanged.
PASS: Verify that all files under /var/log/ on storage node, except
for those listed as exceptions, have 640 or more restrictive
permissions and ownership as root:root in the storage
deployment.
Story: 2011241
Task: 51364
Change-Id: Ie15076ff0d66db98d8171fcfac9411ba0f8f8631
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
The timeout for unlock host during VIM strategies was extended. This
change provides the matching puppet config update.
TEST PLAN
PASS: On NFV runtime, config is updated
Partial-Bug: https://bugs.launchpad.net/starlingx/+bug/2098767
Depends-On: https://review.opendev.org/c/starlingx/nfv/+/942119
Change-Id: I996429eb7e3d2a8aba5762357ae7ad2b6cd60270
Signed-off-by: Joshua Kraitberg <joshua.kraitberg@windriver.com>
This reverts commit 617b6b78327544003adcf05c033c51f04406d4bc.
Reason for revert: puppet error on controller-1
Change-Id: I128168f90fffdd7f90ff5e2fd7fd298f3b7c9bca
Currently, various file permissions under /var/log/ are more
permissive than 640. To comply with the CIS benchmark
requirements, the permissions should be set to 640 or more
restrictive.
This change updates the permissions and ownership of files
under /var/log/ to 640. Ownership is also set to root:root
wherever possible.
Below are the exceptions where permissions or ownership are not updated:
- /var/log/keystone/keystone.log: ownership set to keystone:keystone
- /var/log/flux/helm-controller.log: ownership set to nobody:nogroup
- /var/log/flux/source-controller.log: ownership set to nobody:nogroup
- /var/log/puppet/masterhttp.log: mode set to 660
- /var/log/puppet/masterhttp.log: ownership set to puppet:puppet
Test Plan:
PASS: Build ISO and deploy AIO-SX.
PASS: Verify that all files under /var/log/, except for those
listed as exceptions, have 640 or more restrictive permissions
and ownership as root:root in the AIO-SX deployment.
PASS: AIO-SX: Run the CIS script 3-4 hours after installation to
confirm that the file permissions and ownership modified by
this change have not been reverted.
PASS: AIO-SX: Run the CIS benchmark test one day after installation
to verify that the file permissions and ownership modified by
this change remain unchanged.
Story: 2011241
Task: 51364
Change-Id: I84109690a21363335726bcbeac68f9f7c332ed36
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
This reverts commit 0184254a37b5a1d2def122d33f80edb2912a2813.
Reason for revert: This is dependent on https://review.opendev.org/c/starlingx/upstream/+/935495 which is reverted.
Change-Id: I4a67ca5978d3232b34a00529133a8996de200a8e
In fresh installations, during the first host unlock after running the
Ansible playbook, only controller-0 is initially configured.
The playbook creates the flag:
/etc/platform/simplex
This applies to AIO-DX and standard installations as well.
This behavior is implemented in the following change:
2b7bcd79c9/playbookconfig/playbookconfig/playbooks/bootstrap/roles/validate-config/tasks/main.yml (L421)
Impact on OAM Floating IP Installation
On the first boot after the Ansible playbook runs, the OAM Floating IP
is not installed by SM through the oam-ipv4 service.
Instead, it is installed by Puppet via network.pp.
In rare situations, especially in test environments where systems are
frequently reinstalled, it is recommended to send a Gratuitous ARP (GARP)
to update the ARP table of the switch managing external networks.
For subsequent host lock/unlock cycles or reboots, SM will install the
OAM Floating IP via the oam-ipv4 service, and IPaddr2 will send the
Gratuitous ARP automatically.
This is only required for IPv4 installations. In IPv6 scenarios,
the Neighbor Discovery Protocol (NDP) handles address resolution instead.
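For reference, the payload of a Gratuitous ARP can be sketched in
Python as below. This is an illustration under stated assumptions and
not the mechanism used by network.pp or IPaddr2; the function name and
the example address and MAC are hypothetical. The defining property is
that the sender and target protocol addresses are both the announced
IP.

```python
import socket
import struct


def build_gratuitous_arp(floating_ip: str, mac: str) -> bytes:
    """Return the 28-byte ARP payload announcing floating_ip at mac."""
    hw = bytes(int(octet, 16) for octet in mac.split(":"))
    if len(hw) != 6:
        raise ValueError("expected a 48-bit MAC address")
    ip = socket.inet_aton(floating_ip)
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request (a reply, 2, is also used for GARP)
        hw,           # sender hardware address
        ip,           # sender protocol address: the floating IP
        b"\x00" * 6,  # target hardware address, left as zeros
        ip,           # target protocol address equals the sender address
    )


# Hypothetical OAM floating IP and interface MAC.
payload = build_gratuitous_arp("10.10.10.2", "02:00:00:00:00:01")
```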
Tests Done:
- IPv4 AIO-SX fresh install
Confirmed that a Gratuitous ARP was sent
- IPv4 AIO-DX fresh install
Confirmed that a Gratuitous ARP was sent
- IPv4 Standard fresh install
Confirmed that a Gratuitous ARP was sent
- IPv6 AIO-DX fresh install
Confirmed that the code does not execute Gratuitous ARP
Closes-Bug: 2097624
Signed-off-by: Fabiano Correa Mercer <fabiano.correamercer@windriver.com>
Change-Id: I981d37b0c655206f5d6770908b25d98a3fe76cee
This change introduces 2 memory limits for memcached:
large for system controllers: 782MB
small for subclouds and standalone: 32MB
Currently the memory limit for all system types is 782MB, which
is deemed excessive for most system types, hence the decrease to 32MB.
It also adds a runtime class to restart memcached in the event of
fernet key rotation in order to avoid stale token in cache.
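The selection rule can be summarized with the Python sketch below. It
is only an illustration; the role string "systemcontroller" and the
function name are assumptions, not identifiers taken from this change.

```python
LARGE_LIMIT_MB = 782  # system controllers
SMALL_LIMIT_MB = 32   # subclouds and standalone systems


def memcached_memory_limit_mb(dc_role: str = "") -> int:
    """Pick the memcached limit; the role name used here is an assumption."""
    return LARGE_LIMIT_MB if dc_role == "systemcontroller" else SMALL_LIMIT_MB
```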
Test plan:
PASS: Full build, install and bootstrap
PASS: Install SX system, verify memory limit is 32MB
PASS: Install DC system with SX subcloud, verify memory limit is 32MB
for subcloud and 782MB for system controller
PASS: In the system controller, do manage and unmanage of the subcloud
to trigger fernet key rotation. Verify that memcached is
restarted in the subcloud and that the memory limit remains as
32 MB.
Closes-Bug: 2088084
Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: Icf83962c559ad513f0e5919c2bbca175bb187727
Currently, CIS benchmark checks fail for the permission and ownership
settings of /etc/cron* and the /etc/at.allow file. The CIS
recommendation is to ensure that root is both the owner and group for
these files, and that only the owner has access to them.
This change introduces a Puppet class that ensures file permissions and
ownership for /etc/cron* and /etc/at.allow are modified according to
CIS benchmark recommendations.
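The recommendation above (root as owner and group, access for the
owner only) can be expressed as the Python check below. It is an
illustrative helper, not the Puppet class introduced here, and the
function name is hypothetical.

```python
import stat


def cron_file_violations(mode: int, uid: int, gid: int) -> list:
    """Return the ways a file deviates from root:root, owner-only access."""
    problems = []
    if uid != 0:
        problems.append("owner is not root")
    if gid != 0:
        problems.append("group is not root")
    if stat.S_IMODE(mode) & 0o077:
        problems.append("group or other has access")
    return problems
```

The arguments would typically come from os.stat() on each /etc/cron*
entry and on /etc/at.allow.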
Test Plan:
PASS: Build iso and deploy.
PASS: AIO-SX: Ensure that the ownership and permission of
/etc/cron* and /etc/at.allow are set as per the CIS
recommendation.
PASS: AIO-SX: Run the CIS benchmark and ensure that the
controls/tests related to /etc/cron* and /etc/at.allow
pass successfully.
PASS: AIO-SX: Verify that cron jobs are executed correctly
after the change. Verify that there are no errors in the
cron log at /var/log/cron.log. Specifically, check for the
absence of any error messages or failed cron job logs. Verify
that the memory log file rss-memory.log exists in the
expected directory and that /var/log/cron.log contains a
record of the cron execution.
Story: 2011241
Task: 51170
Change-Id: Ib72caedcd21be07daa310aa6eef21d26cc8db7cb
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
Puppet configuration for VIM Activate retry
is changed from 120 secs to 30 secs to be
in line with VIM repo configuration.
TEST PLAN
PASSED: Upgrade from 24.09 to 25.09 with activation
retries.
Story: 2011045
Task: 51650
Change-Id: Ia4f502ab371217aac06f98df45485b4b717bccf8
Signed-off-by: Vanathi.Selvaraju <vanathi.selvaraju@windriver.com>
Currently, various file permissions under /var/log/ are more
permissive than 640. To comply with the CIS benchmark
requirements, the permissions should be set to 640 or more
restrictive.
This change updates the permissions and ownership of files
under /var/log/ to 640. Ownership is also set to root:root
wherever possible.
Below are the exceptions where permissions or ownership are not updated:
- /var/log/keystone/keystone.log: ownership set to keystone:keystone
- /var/log/flux/helm-controller.log: ownership set to nobody:nogroup
- /var/log/flux/source-controller.log: ownership set to nobody:nogroup
- /var/log/puppet/masterhttp.log: mode set to 660
- /var/log/puppet/masterhttp.log: ownership set to puppet:puppet
Test Plan:
PASS: Build ISO and deploy AIO-SX.
PASS: Verify that all files under /var/log/, except for those
listed as exceptions, have 640 or more restrictive permissions
and ownership as root:root in the AIO-SX deployment.
PASS: AIO-SX: Run the CIS script 3-4 hours after installation to
confirm that the file permissions and ownership modified by
this change have not been reverted.
PASS: AIO-SX: Run the CIS benchmark test one day after installation
to verify that the file permissions and ownership modified by
this change remain unchanged.
Story: 2011241
Task: 51364
Depends-On: https://review.opendev.org/c/starlingx/integ/+/935493
Depends-On: https://review.opendev.org/c/starlingx/ha/+/935499
Depends-On: https://review.opendev.org/c/starlingx/upstream/+/935495
Change-Id: I32f4341f14b5258ece715c5081d675e34a86e624
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
This commit changes the ownership of the /etc/kubernetes/admin.conf
file to root:root on fresh installs.
It also runs the command
"setfacl -m g:sys_protected:r /etc/kubernetes/admin.conf" such that
all the WRCP users/applications that are in the sys_protected group
continue to have read access to this file.
TEST CASES:
PASSED: Checked ownership using below command
"ls -al /etc/kubernetes/admin.conf".
PASSED: Checked the file permission using below command which
will show 640.
"stat -c %a /etc/kubernetes/admin.conf"
PASSED: Checked the ACL entries using below command
"getfacl /etc/kubernetes/admin.conf".
PASSED: No error when running "system host-swact" in AIO-DX.
PASSED: No alarms when running "fm alarm-list".
PASSED: Verified that sysinv can read admin.conf file using below
commands:
"sudo -u sysinv cat "/etc/kubernetes/admin.conf" &>/dev/null"
"sudo -u sysadmin cat "/etc/kubernetes/admin.conf" &>/dev/null"
Added "testuser" to the users group and ran below command and
this gives output "can not read /etc/kubernetes/admin.conf":
sudo -u "testuser" cat "/etc/kubernetes/admin.conf" &>/dev/null
Also verified using system command which can read admin.conf:
"system service-parameter-modify kubernetes kube_apiserver
audit-log-maxage=30"
Story: 2011334
Task: 51610
Change-Id: I6097f9f4863d83f69b5e804fec6cf4a02607c799
Signed-off-by: Md Irshad Sheikh <mdirshad.sheikh@windriver.com>
Fix the following CIS Benchmark network configurations:
- 1.5.1 Ensure address space layout randomization is enabled
- 1.5.2 Ensure ptrace_scope is restricted
- 3.3.7 Ensure Reverse Path Filtering is enabled
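The three controls map to kernel parameters that can be checked with a
Python sketch like the one below. The expected values are assumptions
based on common CIS guidance and are not quoted from this change; the
function names are hypothetical.

```python
# Assumed compliant values; this commit message does not list them.
EXPECTED = {
    "kernel/randomize_va_space": {"2"},
    "kernel/yama/ptrace_scope": {"1", "2", "3"},
    "net/ipv4/conf/all/rp_filter": {"1"},
    "net/ipv4/conf/default/rp_filter": {"1"},
}


def read_sysctl(key: str, root: str = "/proc/sys") -> str:
    """Read one kernel parameter from procfs."""
    with open(f"{root}/{key}") as handle:
        return handle.read().strip()


def failing_sysctls(reader=read_sysctl) -> dict:
    """Return {parameter: current value} for every non-compliant parameter."""
    failures = {}
    for key, allowed in EXPECTED.items():
        value = reader(key)
        if value not in allowed:
            failures[key] = value
    return failures
```

The reader argument exists so the comparison can be exercised without
access to /proc/sys.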
Testing:
- Build successful
- SX and DX deployment successful
- Run CIS Tenable-IO scan with no errors
Story: 2011210
Task: 51627
Depends-On: https://review.opendev.org/c/starlingx/ansible-playbooks/+/940409
Change-Id: I3af0a4f1750ef11049530b1530c09283c9cb72be
Signed-off-by: Mohammad Issa <mohammad.issa@windriver.com>
Certificate super-admin.conf is a new certificate introduced
recently in k8s. This commit adds it to the Kube Root CA
update orchestration.
PASS: Run Kube Root CA update and verify that every certificate
signed by K8s Root CA is updated in show-certs.sh. For DX,
do this check in controller-1 too. Verify k8s is operation.
Run host-lock/unlock and verify k8s and certificates are
still the same.
Closes-Bug: 2097100
Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: Iaf5a94248310e53ebc19bcfbfddea80bde94694c
SCTP module autoload is enabled/disabled by using
service-parameter sctp_autoload.
Full Parameter name:
"platform::params::sctp_autoload"
This change implements a runtime manifest which makes sctp module
changes.
By default: sctp_autoload is "enabled"
If sctp_autoload=disabled:
- SCTP Module doesn't get loaded by default post host
lock/unlock
If sctp_autoload=enabled:
- SCTP Module gets loaded by default post host
lock/unlock.
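Whether the module is loaded after a lock/unlock can be verified with
a small Python helper such as the sketch below. It parses text in the
/proc/modules format and is not part of the runtime manifest; the
function name is hypothetical.

```python
def is_module_loaded(name: str, proc_modules: str) -> bool:
    """True if name is the first field of any line in /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in proc_modules.splitlines()
        if line.strip()
    )
```

On a host it would be used as
is_module_loaded("sctp", open("/proc/modules").read()).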
Test Plan:
PASSED: build-pkgs
PASSED: Deployed AIO-Standard
PASSED: Verify service parameter configuration using
enabled/disabled values
PASSED: SCTP module loads/unloads after host-reboot
Story: 2011335
Task: 51623
Change-Id: Ib27367807f1096a6253d96f113d9107a3ff2f596
Signed-off-by: Aman Pandae <AmanPandae.Mothukuri@windriver.com>
When a ssl_ca certificate is installed as trusted, we need to restart
Docker to pick up the new certificate. A bug was introduced in this
behavior by verifying 'dockerd.service' with 'systemctl status' instead
of 'docker.service'.
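The difference can be illustrated with the Python sketch below, which
is not the code changed here. 'systemctl status' exits with 0 for an
active unit and with a non-zero code for an inactive or unknown unit,
so checking a unit name that does not exist always reports "not
running". The function name and the injectable run argument are
hypothetical.

```python
import subprocess

DOCKER_UNIT = "docker.service"  # "dockerd" is the daemon binary, not the unit


def unit_is_running(unit: str = DOCKER_UNIT, run=subprocess.run) -> bool:
    """True if 'systemctl status <unit>' exits with 0, i.e. the unit is active."""
    result = run(
        ["systemctl", "status", unit],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```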
Test plan:
PASS: Update platform certificates w/ playbook.
Observe that docker restarts.
Login to local registry.
Closes-bug: 2096667
Change-Id: I435a3af64b720d2f343af3709db1221ca798927e
Signed-off-by: Marcelo de Castro Loebens <Marcelo.DeCastroLoebens@windriver.com>
This commit adds list_services and list_endpoints as permitted for
the reader role. This puppet template generates file /etc/keystone/
policy.json
These endpoints only return information and fit the description of the
reader role as a role that has rights to perform actions with list,
query, show and summary [1].
[1] https://docs.starlingx.io/security/kubernetes/keystone-account-roles-64098d1abdc1.html
Test plan:
PASS: Full build, install and bootstrap.
PASS: With the default admin user verify these commands operate normally
from the cli with 'openstack service list' and 'openstack endpoint
list'.
PASS: Verify the reader role works by creating a new user with the
reader role and trying the 2 commands.
PASS: Verify change is persistent after host-lock / unlock.
Closes-Bug: 2095181
Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: Id911277501c6dea74536c95e9e4250958e7bf5b7
As part of the Kubernetes upgrade process, the pause image version may
be updated. We need to ensure that /etc/containerd/config.toml is updated
during the kubelet upgrade step to match the version corresponding to
the new k8s release. This ensures that, during backup and restore,
the correct version of the pause image is downloaded and made available
to pods. Without this change, the upgraded pause image may not be available
during backup and restore, and the containerd config.toml would
reference an outdated version.
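The kind of update involved can be sketched in Python as below. This
is a minimal illustration that assumes the pause image is referenced
through a sandbox_image entry in config.toml; it is not the code in
this change, and the function name and registry host are hypothetical.

```python
import re

# Matches: sandbox_image = "<anything>pause:<tag>"
_SANDBOX_RE = re.compile(
    r'^(\s*sandbox_image\s*=\s*"[^"]*pause:)[^"]+(")', re.MULTILINE
)


def set_pause_tag(config_text: str, new_tag: str) -> str:
    """Return config_text with the pause image tag replaced by new_tag."""
    updated, count = _SANDBOX_RE.subn(
        lambda m: m.group(1) + new_tag + m.group(2), config_text
    )
    if count == 0:
        raise ValueError("no sandbox_image pause entry found")
    return updated


# Hypothetical config fragment.
example = (
    '[plugins."io.containerd.grpc.v1.cri"]\n'
    '  sandbox_image = "registry.example/pause:3.9"\n'
)
updated = set_pause_tag(example, "3.10")
```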
Test Plan:
Pass: Upgrade k8s on AIO-SX and confirm config.toml is updated.
PXXX: Upgrade k8s on standard configuration and confirm config.toml
is updated.
Closes-Bug: 2095090
Change-Id: If5c35f58234db85d013dd55c443b12d7599fee03
Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>
synce4l 1.1.0 has new parameters and a new external source section.
This commit adds the new section to the puppet code.
Test Plan:
PASS: ensure the new section is correctly generated.
PASS: ensure synce4l daemon is started correctly.
Depends-On: https://review.opendev.org/c/starlingx/config/+/938609
Story: 2010540
Task: 51523
Change-Id: I9c88fbc5b006f9234e5d113bceab82f0b35ff7cd
Signed-off-by: Caio Bruchert <caio.bruchert@windriver.com>
CIS Benchmark requires a minimum password age of 1 day and a maximum of
no more than 365 days. It also requires that the inactive password lock
be less than or equal to 45 days.
This adds the missing minimum password age and sets up an inactive
password lock for the sysadmin user, which means that after the password
expires, the user has 45 days to change the password, or else it will be
locked.
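These limits can be checked against an /etc/shadow entry with the
Python sketch below. It is an illustrative helper, not part of this
change; the function name and the sample entries are hypothetical. The
field order follows shadow(5).

```python
def password_aging_violations(shadow_line: str) -> list:
    """Return the aging limits that a single /etc/shadow entry violates."""
    fields = shadow_line.rstrip("\n").split(":")
    # shadow(5): name, password, last change, min, max, warn, inactive, expire, reserved
    min_age, max_age, inactive = fields[3], fields[4], fields[6]
    problems = []
    # An empty field means the limit is not set, which also counts as a violation.
    if not min_age.isdigit() or int(min_age) < 1:
        problems.append("minimum age below 1 day")
    if not max_age.isdigit() or int(max_age) > 365:
        problems.append("maximum age above 365 days")
    if not inactive.isdigit() or int(inactive) > 45:
        problems.append("inactive lock above 45 days")
    return problems
```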
Test Plan:
PASS: Run build-pkgs -c -p puppet-manifests.
PASS: Run build-image.
PASS: Run fresh install of AIO-SX with complete bootstrap and unlock of
the controller-0.
PASS: Run fresh install of AIO-DX with complete bootstrap and unlock of
controller-0 and controller-1.
PASS: Run backup and restore and verify that the changes persist.
PASS: Change system date and verify that the account is locked 45 days
after the password expired.
Story: 2011283
Task: 51441
Change-Id: Ica830fffc59acaa631d5cb717f33fa8daca8f35c
Signed-off-by: Rodrigo Tavares <Rodrigo.DosSantosTavares@windriver.com>
These timeouts were missing on systems upgraded from previous releases.
The absence of these timeouts can cause issues because the default
timeouts are not always sufficient.
TEST PLAN
PASS: Run platform::nfv::runtime manifest on already affected system
* configs are updated
PASS: AIO-SX patch upgrade
* Post patch audit triggers platform::nfv::runtime
PASS: AIO-SX major upgrade
* Unlock triggers config updates
Closes-Bug: 2093793
Change-Id: Ie13534f548987a119499203574cbd403551c92a6
Signed-off-by: Joshua Kraitberg <joshua.kraitberg@windriver.com>
This change removes the error message "Error from server (NotFound):
globalnetworkpolicies.crd.projectcalico.org [gnp name] not found" from
the operation destined to verify if the GNP exists as the result will
be used in the next line. No need to print the error message as that
can lead to wrong interpretation in the system logs.
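The idea, shown as a Python sketch rather than the manifest code: run
the existence check, discard its stderr, and rely on the exit status
alone. The function name is hypothetical.

```python
import subprocess


def command_succeeds(argv: list) -> bool:
    """Run argv, discard its output and stderr, and report whether it exited 0."""
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```

The caller can then decide whether to create or update the GNP without
a "NotFound" message reaching the logs.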
Test Plan
[PASS] Install AIO-DX and observe that the message doesn't appear
anymore but the firewall still is correctly installed
Story: 2011324
Task: 51531
Change-Id: I75180a7d1ff8d706d5bc6ed9b5e3d630f5e811f8
Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>
During DC scale batch subcloud prestage operations, it was observed that
80 cores were fully utilized with zero idle, starving the software API.
The software-controller-daemon process was spreading work over multiple
seconds. Any processes used for DC operations are payload and should
fight for CPU equally.
This updates the list of systemd services that require disabled
CPUShares instead of having the reduced value 128 on all other nodes.
This adds: software-controller-daemon, software, and
sw-patch-controller-daemon.
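The resulting drop-in files can be pictured with the Python sketch
below. It only illustrates the shape of the output; the drop-in file
name, the base directory and the function name are assumptions, not
values taken from this change.

```python
import os

SERVICES = ("software-controller-daemon", "software", "sw-patch-controller-daemon")


def cpushares_dropins(services=SERVICES, shares: int = 1024,
                      base: str = "/etc/systemd/system") -> dict:
    """Map each drop-in path to its content. The file name is hypothetical."""
    content = f"[Service]\nCPUShares={shares}\n"
    return {
        os.path.join(base, f"{name}.service.d", "cpu-shares.conf"): content
        for name in services
    }
```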
Partial-Bug: 2092319
TEST PLAN:
- PASS: Fresh install DC lab
- PASS: Verify systemcontroller DropIn files are created with
CPUShares=1024 for: software-controller-daemon, software,
sw-patch-controller-daemon
Change-Id: I1c7b4e0c19e9209c59a70bb6b1826e18fdf59335
Signed-off-by: Jim Gauld <James.Gauld@windriver.com>
This commit checks the /var/run/.enrollment_in_progress flag to
prevent the openstack::keystone::endpoint::runtime class from running
during subcloud enrollment. For more details, see the related Ansible
changes [1].
[1] https://review.opendev.org/c/starlingx/ansible-playbooks/+/938058
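The guard amounts to a file-existence test, sketched here in Python
for illustration. The actual check lives in the Puppet manifest; the
function name is hypothetical.

```python
import os

ENROLLMENT_FLAG = "/var/run/.enrollment_in_progress"


def should_run_endpoint_config(flag_path: str = ENROLLMENT_FLAG) -> bool:
    """Skip the endpoint runtime class while the enrollment flag exists."""
    return not os.path.exists(flag_path)
```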
Test Plan:
This commit was tested alongside the related changes [1].
Partial-bug: 2092214
Change-Id: I1cdd75001f2221a4e02ebac52494f91c7a40fa52
Signed-off-by: Salman Rana <salman.rana@windriver.com>