732 Commits

Author SHA1 Message Date
Zuul
d57eb27dc5 Merge "Move subcloud audit to separate process" 2020-05-14 19:02:11 +00:00
Zuul
12bb0ca2b8 Merge "DC cert manifest should only apply to controller nodes" 2020-05-13 14:56:35 +00:00
Bin Qian
65daac29e4 DC cert manifest should only apply to controller nodes
The DC cert manifest should only apply to controller nodes on the
system controller.
This fix is for DC systems with worker nodes in the central cloud.

Change-Id: I4233509a6f0afb3013c01e81dea6f655d9e15371
Closes-Bug: 1878260
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2020-05-13 09:57:24 -04:00
Zuul
250f2a6dc6 Merge "Ensure containerd binds to the loopback interface" 2020-05-05 19:51:23 +00:00
Tao Liu
4e9153cf23 Move subcloud audit to separate process
Subcloud audit is removed from the dcmanager-manager
process and now runs in the dcmanager-audit process.

This update adds associated puppet config.

Story: 2007267
Task: 39640
Depends-On: https://review.opendev.org/#/c/725627/

Change-Id: Idd2e675126a01d6113597646ddd9eb4a0bc5be44
Signed-off-by: Tao Liu <tao.liu@windriver.com>
2020-05-05 10:33:16 -05:00
Robert Church
b793518f65 Ensure containerd binds to the loopback interface
Set the stream_server_address to bind to the loopback interface with a
value of "127.0.0.1" for IPv4 and "::1" for IPv6.

Without setting the stream_server_address in config.toml, containerd was
binding to the OAM interface. Under most situations this resulted in
containerd binding to the OAM fixed host address. But in an IPv6
configuration there were occasions where after controller-0 unlock, the
OAM floating IP would be used. When this happened, swacting away from
controller-0 would move the OAM floating IP to controller-1 and break
access to containers residing on controller-0.

This will explicitly update the containerd configuration to use the IP
address of the loopback interface based on the system's network
configuration.

This also removes any security concerns with containerd binding to the
OAM interface.

Change-Id: I0f914d738e94b525cf217712675d3b4575817d1d
Depends-On: https://review.opendev.org/#/c/725394/
Closes-Bug: #1875891
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-05-04 17:10:29 -04:00
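
The loopback selection described in the containerd commit above can be sketched as follows. This is a minimal illustration, not the puppet code from the change: the function name `stream_server_address` and the idea of deriving the IP family from a subnet string are assumptions made for the example.

```python
import ipaddress


def stream_server_address(subnet: str) -> str:
    """Return the loopback address matching the IP family of `subnet`.

    `subnet` is a hypothetical input such as the system's OAM subnet in
    CIDR form; the real manifest derives the family from its own config.
    """
    network = ipaddress.ip_network(subnet, strict=False)
    # IPv6 systems bind the stream server to ::1, IPv4 systems to 127.0.0.1.
    return "::1" if network.version == 6 else "127.0.0.1"
```

Binding to loopback keeps the stream server address independent of which OAM address (fixed or floating) happens to be present on the host.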
Zuul
33d318b1fc Merge "Add a cron job to purge dcorch database" 2020-05-01 14:05:12 +00:00
Zuul
1ce9f18fc5 Merge "Add a new filesystem for image conversion" 2020-04-29 15:49:51 +00:00
Elena Taivan
4107faed7e Add a new filesystem for image conversion
Adding runtime manifest for conversion logical volume.
Adding new 'ensure' parameter for 'platform::filesystem' class.

Change-Id: I622837959a5a7aabc462640b588713396354ce73
Partial-bug: 1819688
Signed-off-by: Elena Taivan <elena.taivan@windriver.com>
2020-04-29 09:17:04 +00:00
Zuul
53a4250a65 Merge "Install DC adminep cert and DC root ca certificate" 2020-04-28 21:44:49 +00:00
Zuul
f0cf1888a3 Merge "Config platform service admin endpoints to https for DC" 2020-04-28 21:13:10 +00:00
Zuul
78e97f7aba Merge "Rename the existing /opt/patch-vault filesystem to /opt/dc-vault" 2020-04-28 18:46:42 +00:00
albailey
db97027fb7 Clamp pylint to be less than 2.5.0
A new version of pylint was released on April 25
and it is breaking zuul jobs so submissions cannot merge.
Clamping pylint to be less than 2.5.0 for now.

Change-Id: Ibd62a5d67bf8f37119b612a274c2d472a3474859
Partial-Bug: 1875705
Signed-off-by: albailey <Al.Bailey@windriver.com>
2020-04-28 12:40:48 -05:00
Jessica Castelino
77b2e1ccfa Rename the existing /opt/patch-vault filesystem to /opt/dc-vault
The filesystem /opt/patch-vault is renamed to /opt/dc-vault so that
it can be re-used to store FPGA images and software loads. The
puppet manifests are updated accordingly.

Story: 2006740
Task: 39550
Depends-On: https://review.opendev.org/#/c/723007/
Change-Id: I26055b12e7bd241adb072c609f72b8d113b4a20e
Signed-off-by: Jessica Castelino <jessica.castelino@windriver.com>
2020-04-24 16:16:31 -04:00
Zuul
a9e549a66d Merge "Enable --reserved-cpus option in k8s v1.18.1" 2020-04-24 19:34:16 +00:00
Robert Church
7a75923955 Enable --reserved-cpus option in k8s v1.18.1
The option was introduced in k8s v1.17 and will now be used to define
the explicit set of CPUs that are reserved for specific cpu functions in
StarlingX.

This retires setting the number of CPUs reserved in the --kube-reserved
and --system-reserved options.

Change-Id: I1a3d4e4cca7b6940682a787c2e7348e56a047a06
Story: 2006999
Task: 39529
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-04-22 23:29:46 -04:00
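
The kubelet --reserved-cpus option takes an explicit CPU list (individual CPUs or ranges, comma separated) rather than a CPU count. A minimal sketch of turning a set of reserved CPU ids into that form is shown below; the helper names `format_cpuset` and `reserved_cpus_flag` are hypothetical and are not taken from the change.

```python
def format_cpuset(cpus):
    """Collapse CPU ids into a list string such as "0-2,5"."""
    ranges = []
    start = prev = None
    for cpu in sorted(set(cpus)):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu  # extend the current run
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def reserved_cpus_flag(cpus):
    """Build the kubelet argument for an explicit reserved CPU set."""
    return f"--reserved-cpus={format_cpuset(cpus)}"
```

For example, reserving CPUs 0 and 1 yields `--reserved-cpus=0-1`.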
Tee Ngo
9e86812ec1 Add a cron job to purge dcorch database
This commit adds a daily cron job to purge deleted orch
requests that are older than 3 days, their orch jobs
and resources from dcorch database.

Story: 2007267
Task: 39044
Depends-On: https://review.opendev.org/720277
Change-Id: Ibc9f78ac89f4cc6706886a49062c3f5a6145cc9f
Signed-off-by: Tee Ngo <tee.ngo@windriver.com>
2020-04-22 09:31:49 -04:00
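
The retention rule in the purge commit above (deleted orch requests older than 3 days) can be illustrated with a small filter. This is only a sketch: the record layout (`id`, `deleted`, `deleted_at`) is assumed for the example and is not the dcorch schema.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=3)  # retention period stated in the commit message


def purgeable_request_ids(requests, now):
    """Return ids of deleted requests whose deletion is older than RETENTION.

    `requests` is a list of dicts with hypothetical keys "id", "deleted"
    and "deleted_at" (a datetime, or None if never deleted).
    """
    cutoff = now - RETENTION
    return [
        r["id"]
        for r in requests
        if r["deleted"] and r["deleted_at"] is not None and r["deleted_at"] < cutoff
    ]
```

A daily job would call this with the current time and then remove the matching requests together with their orch jobs and resources.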
Zuul
d9aa740f9a Merge "Enable duplex platform upgrades: migrate etcd" 2020-04-21 20:59:22 +00:00
Andy Ning
e5f325ccca Config platform service admin endpoints to https for DC
With this update https is enabled for platform services' admin endpoints
for System Controller and subclouds when the first controller is
unlocked.

The services with admin endpoints enabled are:
- fm
- patching
- vim
- smapi
- barbican
- keystone
- sysinv
- dcdbsync
- dcmanager

Change-Id: I45b3c541cdb6191dad6d3e2b3e9cf8a3398b3a1b
Story: 2007347
Task: 38891
Depends-On: https://review.opendev.org/#/c/720224/
Signed-off-by: Andy Ning <andy.ning@windriver.com>
2020-04-20 17:47:43 -04:00
Zuul
7665c92ec9 Merge "Support subcloud deploy upload the common files" 2020-04-17 15:01:38 +00:00
Zuul
cb03cec9ce Merge "Add B&R information comments to DRBD manifest" 2020-04-16 15:57:01 +00:00
Tao Liu
7910646e9b Support subcloud deploy upload the common files
Create /opt/platform/deploy to host the deploy common files.

Partial-Bug: 1864508

Change-Id: Ifd40cb02d4a2ee17a05457b43c6227aaa069e01e
Signed-off-by: Tao Liu <tao.liu@windriver.com>
2020-04-16 10:08:59 -04:00
Stefan Dinescu
4fc8bdcf4a Add B&R information comments to DRBD manifest
This commit adds a series of comments to the DRBD manifest
so that users making changes to this manifest know to also
update the list of DRBD devices in the restore playbook.

Change-Id: Iae1d9d98391759669871b016721418922aa134ce
Partial-bug: 1854169
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
2020-04-16 07:22:12 +00:00
Bin Qian
c82b459703 Install DC adminep cert and DC root ca certificate
This installs the DC admin endpoint certificate (pem).
It also installs the root CA as a trusted CA, so that certificates
issued directly or indirectly by the DC root CA are trusted.

Story: 2007347
Task: 39430

Depends-on: https://review.opendev.org/720273

Change-Id: Ie242c6e833a574ff29562b468fff72352515d22a
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2020-04-15 15:24:05 -04:00
Paul Vaduva
9a18b70860 Introduce a wait until network interfaces are ready
The DAD (Duplicate Address Detection) mechanism keeps an
IPv6 network interface address in the tentative state until it
finishes. During this time no binding to the interface address is
possible, and networking-dependent services fail to start.

Change-Id: I9cfa604a0d75400f6d3c7172b3b973b0d50c3578
Closes-bug: 1871638
Signed-off-by: Paul Vaduva <Paul.Vaduva@windriver.com>
2020-04-15 12:58:39 -04:00
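
One way to wait for DAD to finish is to poll `ip -6 addr show` until no address carries the `tentative` flag. The sketch below shows that idea only; it is not the mechanism used in the change, and the function names, timeout and polling interval are assumptions.

```python
import subprocess
import time


def tentative_addresses(ip_output):
    """Return IPv6 addresses flagged 'tentative' in `ip -6 addr show` output."""
    found = []
    for line in ip_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "inet6" and "tentative" in fields[2:]:
            found.append(fields[1])
    return found


def read_ip_output():
    """Run `ip -6 addr show` and return its text output."""
    result = subprocess.run(
        ["ip", "-6", "addr", "show"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_for_dad(read_output=read_ip_output, timeout=30.0, interval=0.5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until no address is tentative; return False on timeout."""
    deadline = clock() + timeout
    while tentative_addresses(read_output()):
        if clock() >= deadline:
            return False
        sleep(interval)
    return True
```

The `read_output`, `sleep` and `clock` parameters exist only so the loop can be exercised without a real interface.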
Bart Wensley
ccb7249097 Allow k8s upgrades to any release if necessary
The default behaviour of the "kubeadm upgrade apply" command is
to only allow upgrades to stable kubernetes versions. However,
for both testing purposes and for potential critical fixes in
the future, it may be necessary to upgrade to a release
candidate or other release that kubernetes deems as unstable.
Adding in the appropriate options when calling the "kubeadm
upgrade apply" command to make this possible.

Change-Id: I164caf495ee3680f549d651b97e7e502b1172c70
Story: 2006781
Task: 37578
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
2020-04-14 15:43:20 -05:00
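
The commit message above does not name the options it adds. As an illustration only, kubeadm documents two flags on "kubeadm upgrade apply" for this purpose, --allow-experimental-upgrades and --allow-release-candidate-upgrades; the sketch below assumes those are the relevant ones, and the function name and `allow_unstable` switch are hypothetical.

```python
def kubeadm_upgrade_command(version, allow_unstable=False):
    """Build the argv for `kubeadm upgrade apply <version>`.

    When `allow_unstable` is set, append the kubeadm flags that permit
    upgrading to alpha/beta and release-candidate versions.
    """
    cmd = ["kubeadm", "upgrade", "apply", version]
    if allow_unstable:
        cmd += [
            "--allow-experimental-upgrades",
            "--allow-release-candidate-upgrades",
        ]
    return cmd
```

The resulting list can be passed to `subprocess.run` on a host where kubeadm is installed.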
Zuul
6a50be8449 Merge "Free dcdbsync openstack instance port for https admin endpoint" 2020-04-14 20:36:17 +00:00
Zuul
73da4b6918 Merge "Upversion sandbox image to align with k8s v1.18.0" 2020-04-14 20:07:57 +00:00
Andy Ning
3b7ab6010e Free dcdbsync openstack instance port for https admin endpoint
Currently the dcdbsync instance for openstack listens on port 8220.
Because the admin endpoint of the dcdbsync instance for platform now
has https enabled and uses port 8220, the dcdbsync instance for
openstack is updated to use port 8229.

Change-Id: Ie3d60164e4e81de8e53ad452d4dbeab7ce4a5058
Story: 2007347
Task: 39409
Signed-off-by: Andy Ning <andy.ning@windriver.com>
2020-04-14 11:31:00 -04:00
Robert Church
438354a28c Upversion sandbox image to align with k8s v1.18.0
Change-Id: I02f6158d39b4f10764faf4055da4ab4cdc1f9662
Story: 2006999
Task: 39342
Depends-On: https://review.opendev.org/#/c/718568
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-04-08 19:45:02 -04:00
Jessica Castelino
7134a06250 Database connection exhaustion in dcmanager during sync
When a data sync is triggered for a large number of subclouds (~100),
the sync fails for some subclouds due to database connection exhaustion.
To fix this issue, the limit on the number of database
connections has been increased.

Story: 2007267
Task: 38956
Change-Id: I88ed37ba3a143e3abee78a9f5584b16f17becc76
Signed-off-by: Jessica Castelino <jessica.castelino@windriver.com>
2020-04-08 11:24:37 -04:00
Zuul
7d943888c9 Merge "Remove dcorch-snmp" 2020-04-08 14:21:43 +00:00
John Kung
21690922e2 Enable duplex platform upgrades: migrate etcd
Enable the mechanism to upgrade the platform components on
a running StarlingX system with duplex controllers.

This includes upgrade updates for:
  o migrate etcd on host-swact

Depends-On: https://review.opendev.org/#/c/717038/
Change-Id: Ife45253b46a9d58216d6cc943d7f4d40dd48b970
Story: 2007403
Task: 39246
Signed-off-by: John Kung <john.kung@windriver.com>
2020-04-07 18:14:23 -04:00
Zuul
9f506d359a Merge "Ensure network config has been applied before containerd" 2020-04-06 20:18:00 +00:00
Zuul
f7c0c41636 Merge "Configure docker and containerd once per AIO deploy" 2020-04-06 20:05:44 +00:00
Zuul
d5e4e7d5c2 Merge "Support adding admission plugin post bootstrap" 2020-04-06 19:24:19 +00:00
Jim Somerville
6b11dcc799 lowlat: enable ktimer_lockless_check if it exists
Enable check for raising timer interrupt only if one is pending.
This allows nohz full mode to operate properly on isolated cores.
Without it, ktimersoftd interferes with only one job being
on the run queue on that core, causing it to drop out of nohz.

If ktimer_lockless_check doesn't exist in the kernel, no
error is reported, i.e. it just fails silently.

Closes-Bug: 1870456
Change-Id: I93d0fab3e9f4f56f9afb9bbfaa04882cf9068db5
Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
2020-04-06 13:35:22 -04:00
Jerry Sun
45ecd74e05 Support adding admission plugin post bootstrap
This commit adds mandatory plugins automatically, without having the
user specify them through system service-parameters.

Story: 2007351
Task: 38897

Change-Id: Ia423bc3b7be241297d9d1c7a917ac308855c6114
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
2020-04-03 15:25:51 -04:00
Paul Vaduva
93d22c438e Configure docker and containerd once per AIO deploy
Prevent a double configuration of docker and containerd
for AIO scenarios.

Change-Id: I0cb9fdde5acf8d5d44d526e70ae4af726932709f
Closes-bug: 1869193
Signed-off-by: Paul Vaduva <Paul.Vaduva@windriver.com>
2020-04-03 22:01:05 +03:00
Robert Church
296bd3d1f7 Ensure network config has been applied before containerd
If containerd is started prior to networking providing a default route,
the containerd cri plugin will fail to load with the following message:

msg="failed to load plugin io.containerd.grpc.v1.cri" error="failed to
create CRI service: failed to create stream server: failed to get stream
server address: no default routes found in \"/proc/net/route\" or
\"/proc/net/ipv6_route\""

and the status of the plugin will be 'error':

TYPE                  ID  PLATFORMS   STATUS
io.containerd.grpc.v1 cri linux/amd64 error

This will prevent any crictl image pulls from working.

This change will ensure the network config is applied prior to
configuring and restarting containerd.

Docker and containerd also have a dependency, so also ensure the
network config is applied prior to configuring and restarting
docker.

Change-Id: I94a3349b438816d21b147cbd62054862d07d8bee
Partial-Bug: #1868728
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-04-02 10:02:13 -05:00
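
The error quoted in the commit above comes from containerd finding no default route in /proc/net/route or /proc/net/ipv6_route. A minimal sketch of such a check for the IPv4 table is shown below; it is an illustration, not code from the change, and it does not cover /proc/net/ipv6_route, which uses a different layout.

```python
RTF_UP = 0x0001  # kernel route flag: route is usable


def has_default_ipv4_route(route_table):
    """Return True if /proc/net/route content contains an active default route.

    A default route has destination 00000000 and mask 00000000
    (columns 2 and 8 of the table, both hexadecimal).
    """
    for line in route_table.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        destination, flags, mask = fields[1], int(fields[3], 16), fields[7]
        if destination == "00000000" and mask == "00000000" and flags & RTF_UP:
            return True
    return False
```

On a Linux host it could be called as `has_default_ipv4_route(open("/proc/net/route").read())` before starting containerd.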
Zuul
1ec5117202 Merge "Set preferred_lft to 0 for mgmt and nfs floating ips" 2020-03-31 14:03:09 +00:00
Paul Vaduva
07edad67cc Set preferred_lft to 0 for mgmt and nfs floating ips
For IPv6, the only way to prefer the fixed IP for
outgoing connections is to set preferred_lft to 0 for
the floating IPs.

Change-Id: I13573ac4628db1fc49146f353d7eb2c96eb1aff0
Closes-bug: 1856064
Signed-off-by: Paul Vaduva <Paul.Vaduva@windriver.com>
2020-03-31 16:14:38 +03:00
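
An address with a preferred lifetime of 0 is treated as deprecated, so source address selection avoids it for new outgoing connections while it still accepts incoming traffic. With iproute2 this can be expressed as "ip -6 addr change ADDR dev IFACE preferred_lft 0". The sketch below only builds that command; the function name and the example address and device are hypothetical, and the manifests may set the value differently.

```python
def deprecate_address_command(address, device):
    """Build the iproute2 argv that sets preferred_lft to 0 for an address."""
    return [
        "ip", "-6", "addr", "change", address,
        "dev", device,
        "preferred_lft", "0",
    ]
```

For example, `deprecate_address_command("fd00::2/64", "eth0")` produces the argv for deprecating a floating address on eth0.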
Zuul
e80352ce0d Merge "Support adding admission plugin post bootstrap" 2020-03-31 13:00:03 +00:00
Jerry Sun
cc786eda4d Support adding admission plugin post bootstrap
This commit adds the ability to change the admission plugins of
kube-apiserver post bootstrap. We need this for pod security plugin.
Starting pod security plugin without any policies will result in all
pods being denied.

Story: 2007351
Task: 38897

Change-Id: I3ad3ba91f3084bd2f0054d5d063d2242594997b2
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
2020-03-30 13:49:35 -04:00
Gerry Kopec
f24b2f5054 Remove dcorch-snmp
dcorch-snmp process/service is being removed from distributed cloud.
Remove associated puppet config.

Change-Id: I5691648887e2302eeda0b5e853a72df52ae0ba72
Story: 2007267
Task: 39190
Depends-On: https://review.opendev.org/#/c/715765
Signed-off-by: Gerry Kopec <gerry.kopec@windriver.com>
2020-03-30 01:41:25 -04:00
Steven Webster
ed763e6a5d Fix SR-IOV runtime manifest apply
When an SR-IOV interface is configured, the platform's
network runtime manifest is applied in order to apply the virtual
function (VF) config and restart the interface.  This results in
sysinv being able to determine and populate the puppet hieradata
with the virtual function PCI addresses.

A side effect of the network manifest apply is that potentially
all platform interfaces may be brought down/up if it is determined
that their configuration has changed.  This will likely be the case
for a system which configures SR-IOV interfaces before initial
unlock.

A few issues have been encountered because of this, with some
services not behaving well when the interface they are communicating
over suddenly goes down.

This commit makes the SR-IOV VF configuration much more targeted
so that only the operation of setting the desired number of VFs
is performed.

Closes-Bug: #1868584

Change-Id: Ic867fccae89fe8bc9173598c3c84c94ba2d7511f
Signed-off-by: Steven Webster <steven.webster@windriver.com>
2020-03-29 13:14:32 -04:00
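
On Linux, the number of VFs for an SR-IOV capable interface is set by writing to the device's sriov_numvfs sysfs attribute, which is the kind of targeted operation the commit above describes. The sketch below illustrates that single step under stated assumptions: the function name and the `sysfs_root` parameter are hypothetical, and this is not the code from the change.

```python
from pathlib import Path


def set_num_vfs(iface, num_vfs, sysfs_root="/sys/class/net"):
    """Set the VF count for `iface`; return True if a write was needed."""
    path = Path(sysfs_root) / iface / "device" / "sriov_numvfs"
    current = int(path.read_text().strip())
    if current == num_vfs:
        return False  # already at the desired count, leave the interface alone
    if current != 0 and num_vfs != 0:
        # The kernel rejects changing one non-zero VF count to another,
        # so the count is reset to 0 first.
        path.write_text("0")
    path.write_text(str(num_vfs))
    return True
```

Because nothing is written when the count already matches, unrelated platform interfaces are never brought down by this step.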
Zuul
6060fb15cd Merge "Add kubelet support for volume plugins" 2020-03-25 17:12:36 +00:00
Zuul
85f973be10 Merge "Remove creation of /etc/kubernetes/kubeadm.yaml" 2020-03-23 14:57:31 +00:00
Robert Church
1ca6d59142 Add kubelet support for volume plugins
When upversioning Calico from 3.6 to 3.12 the --volume-plugin-dir
argument needs to be provided to kubelet.

Specifically, the configuration for Calico 3.8 "Adds a Flex Volume
Driver that creates a per-pod Unix Domain Socket to allow Dikastes to
communicate with Felix over the Policy Sync API."

Change-Id: Ic76baa00de4402cbb65c37fe89835b114d424634
Story: 2006999
Task: 39111
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-03-19 23:22:18 -04:00
Jerry Sun
17ce7aa97e Remove creation of /etc/kubernetes/kubeadm.yaml
Now that we are not using /etc/kubernetes/kubeadm.yaml anymore,
we can remove the creation of the file from puppet. Bootstrap will
still create it for bootstrap use.

Change-Id: Id08af049fac3fc68b70a7dae5aec8548865a4784
Closes-bug: 1866695
Depends-On: https://review.opendev.org/#/c/713020/
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
2020-03-13 12:45:27 -04:00