We had been testing the Gerrit upgrade from 3.9 to 3.11. Now that we are
running 3.10 we really should test the 3.10 to 3.11 upgrade. Fix that.
While we are at it, catch the default test Gerrit version up to 3.10 as
well (it was 3.8, but I don't think we use the default anywhere, so this
is mostly a noop).
Change-Id: Idafbddf3b9af54b45e6b7e06fda1ede6aa0a995e
Gerrit is now running on a Noble node, which uses docker compose rather
than docker-compose. The newer tool ignores the version key in our
docker-compose.yaml file and emits a warning about it. Drop the key to
clean up the warning.
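For reference, the key in question is the top-level compose file version,
roughly as below (the service details are illustrative, not our real file):

    # docker-compose.yaml (illustrative fragment)
    version: '3'    # docker compose v2 ignores this and warns that it is obsolete
    services:
      gerrit:
        image: example/gerrit:latest    # hypothetical image reference

Deleting the version line silences the warning without changing behavior.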
Change-Id: Idebf6bb40309e4e8a50a0ed39e23e67e37510af8
This change removes review02 from our inventory and configuration
management. This should be landed after we're confident we're unlikely
to need to roll back to that server. That said, if we do roll back before
the server is cleaned up, reverting this change isn't too bad.
Change-Id: Ica14ae92c4c1ef6db76acef93d6d65977aab4def
The old gerrit init script uses sighup to request a graceful shutdown
of the service, which is why, when we ported to docker-compose, we
configured it to also use sighup. Unfortunately, on Noble with podman
the podman container apparmor profiles don't allow podman to issue a
sighup to the container. This means that when we try to stop the service
we wait until the 5 minute timeout expires, then docker compose + podman
issue a sigkill.
This is less graceful than we want. To address this we switch to sigint
instead. The reason for this is that the podman container apparmor
profiles do allow sigint, and the jvm appears to treat sigint, sigterm,
and sighup as equivalent triggers for the shutdown hook.
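A minimal compose sketch of the change, assuming the service is defined
along these lines (names and the grace period value are illustrative):

    # docker-compose.yaml (illustrative fragment)
    services:
      gerrit:
        image: example/gerrit:latest    # hypothetical image reference
        stop_signal: SIGINT             # was SIGHUP, which podman's apparmor profile blocks
        stop_grace_period: 5m           # once this expires, compose/podman fall back to SIGKILL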
Change-Id: Iacfc70713d63443d58bb563b895fdc5dfb0642e2
This shouldn't be considered an absolute notification, as recordings can
occur outside of jitsi meet as well (OBS, etc). But during the PTG people
had to notify others manually when using meetpad's built in local
recording mechanism; let's make that more automatic.
Change-Id: I5374773ef2262971a049143aed2353cc8366345d
This came up as something that was missing while we bootstrapped a new
gerrit server. The rsa hostkey is managed, but none of the three ecdsa
keys or the ed25519 key is. Fix that by managing these keys in the same
manner we manage the RSA key.
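A rough sketch of the approach as an Ansible task, assuming the keys are
delivered the same way as the RSA key (task, variable, and path names
here are hypothetical):

    # Hypothetical task; real variable names, ownership, and destination differ.
    - name: Write Gerrit SSH host keys
      copy:
        content: "{{ item.content }}"
        dest: "/home/gerrit2/review_site/etc/{{ item.name }}"
        mode: "0600"
      no_log: true
      loop:
        - { name: ssh_host_rsa_key, content: "{{ gerrit_ssh_rsa_key }}" }
        - { name: ssh_host_ecdsa_key, content: "{{ gerrit_ssh_ecdsa_key }}" }
        - { name: ssh_host_ecdsa_384_key, content: "{{ gerrit_ssh_ecdsa_384_key }}" }
        - { name: ssh_host_ecdsa_521_key, content: "{{ gerrit_ssh_ecdsa_521_key }}" }
        - { name: ssh_host_ed25519_key, content: "{{ gerrit_ssh_ed25519_key }}" }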
Change-Id: Iaf58543b6833273ca45fa5c359dc88eaf64d7a03
Now that codesearch is deployed on Noble with podman as the container
runtime we can push our hound container to quay and still have
speculative container images. Do this to reduce our reliance on Docker
Hub, as their rate limits are very aggressive now.
Change-Id: I364da9ebe10e681de024b50cbdccdb5b3fce3617
This is a new Gerrit server that will replace the old review02 server.
We add it to the review-staging group so that manage-projects ignores it
for now. We also give it an empty replication config so that it will not
try to force push repo content to the gitea farm.
This change does not enable gerrit init, gerrit reindexing, or even
`docker compose up` for the service on the server. That means after it
lands, and we're convinced it isn't creating any problems for review02,
we will need to manually sync content from review02 and then manually
bring this server up (which may require an init depending on how much
content we copy).
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/946711
Change-Id: Iadf0ed75539c7673544bd8d856e0a3832a5541c2
There is a new Etherpad 2.3.0 release. We update our Dockerfile to build
that release and in the process attempt to resynchronize with the
upstream Dockerfile. The config files don't seem to change in any
meaningful way.
The changelog can be found here:
https://github.com/ether/etherpad-lite/blob/v2.3.0/CHANGELOG.md
While we are at it we add screenshots of the main landing page and an
etherpad. This should make it easier to quickly check things when making
changes.
Change-Id: Ibfdab811b51626729f8107146b34794db0e9e2ae
The new server is on its way into service. We'll want to clean up the
old one so that the backing instance and volume can be deleted. This is
the first step in that process.
Change-Id: I7ac37c6d6ea9c637c7782fa277693265445b51b9
This serves two purposes. The first is to attempt to address the
internal network slowness of the existing mirror by booting a new server
that will hopefully not have this problem. The other is that it gives us
a shiny new Noble node.
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/945658
Change-Id: Iae5cf08018a5b2f935b6edfcdfd6b120baf31e87
At this point all four of these servers have been replaced by new Noble
nodepool launchers. When we are happy with the new servers we should
land this change and remove the other servers from our inventory so that
they can be deleted.
Change-Id: Ia0b39aae8f6cfa139a81877554c34bb5b8e5cb1a
This old mirror01 host has been replaced by a new Noble mirror02 host.
Pull this server out of configuration management so that it can be
deleted.
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/945254
Change-Id: I9cc6b5b36641cced02be82a5d8405f02a06ea05b
This is a new Noble mirror that will replace the old mirror. We update
the inventory test cases to stop matching the old mirror because that
old mirror will eventually be removed from the inventory. Otherwise this
is a pretty standard mirror replacement.
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/945230
Change-Id: Ib18d834e16ebeec75fb7f16e1dc83b357efb646c
On Ubuntu Noble we run `docker compose` instead of `docker-compose`.
This newer tool ignores the version set in docker-compose.yaml files and
emits a warning when it is set. Clean up this version key in services
that only run with `docker compose` and not `docker-compose`.
Change-Id: I08ce1f2ddc6a07fd47b4524af21255c1c4903785
These servers have been replaced by new Noble servers (nb05, nb06,
nb07). These new servers have managed to build every one of our current
images except for gentoo, openeuler, and openeuler arm64. These three
images weren't building on the old system either.
There is a small amount of concern that removing the old servers without
letting them clean up the database after themselves may orphan some
zookeeper database records. However, the current rockylinux-9 images were
both built by nb05 or nb06, and we don't have any old records from nb01
or nb02 remaining, so it seems nodepool cleans up after itself properly.
Worst case, we can probably do manual database edits.
We also remove the version specifier in the docker-compose.yaml file as
`docker compose` ignores it and emits a warning when it is present. Once
this change lands all of our nodepool builders will use `docker compose`
instead of `docker-compose` making this a safe cleanup.
Change-Id: Iab8d2b6493b78cc3711d64119da2da5d3456a25a
This is a followup to the prior fix that addressed the path issue. Now
we have the problem of docker-compose attempting to allocate a tty (the
default), which isn't possible under cron. We don't need a tty, so we
pass -T to disable tty allocation in the first place.
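A sketch of the resulting cron entry expressed as an Ansible cron task
(the job name, schedule, paths, and exported command are placeholders,
not the real configuration):

    # Hypothetical cron task; only the -T flag and the absolute compose
    # path reflect the actual fixes described here.
    - name: Export nodepool builder images
      cron:
        name: nodepool-image-export
        user: root
        minute: '0'
        hour: '2'
        job: >-
          /usr/local/bin/docker-compose -f /etc/nodepool-builder-compose/docker-compose.yaml
          exec -T nodepool-builder /usr/local/bin/export-images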
I should've caught this the last time around, but my testing missed it
because I was running from a shell.
Change-Id: I57797c8d140335d9edcdcd324239fdefb09882d4
As docker-compose resides in /usr/local/bin, which is not in the
default PATH for crontabs, use the full path to the executable.
Change-Id: I26e4147c4d2e964ff1c91831cf326222b92147bf
This adds two new Noble nodepool builders to our inventory. When we
deploy these two servers we will shutdown services on nb01 and nb02 and
put those older servers in the emergency file to force the new Noble
nodes to build images. This should give us a safe way to roll forward
onto the new platform and catch any problems.
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/944794
Change-Id: Icbb48404ff11a1c887a0184fc60ae2ff6f7a3409
As we roll out Noble nodes we have to maintain compatibility between
Focal with docker-compose and Noble with docker compose. One difference
is that the default container names change between them. We can work
around that by using docker compose commands to refer to the logical
container rather than the specific container.
Update the nodepool builder image export cron job to use docker-compose
exec instead of docker exec for this reason.
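To illustrate the naming difference with a simplified, hypothetical
compose project:

    # Illustrative fragment; project and service names are made up.
    services:
      builder:
        image: example/builder:latest
    # For a project named "nodepool-builder", docker-compose (v1) creates
    # a container named nodepool-builder_builder_1, while docker compose
    # (v2) creates nodepool-builder-builder-1. Using
    #   docker-compose -f docker-compose.yaml exec builder <command>
    # addresses the logical service and works with either naming scheme,
    # whereas docker exec requires the exact container name.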
Change-Id: Iba2e395cf1792096c629ab74f849d55e96d74329
This removes the old mirror01 vexxhost mirrors from config management.
The old mirror02 mirrors were removed when we added mirror03 nodes. With
both pairs out of configuration management we can cleanup DNS then
delete the servers and their volumes.
Change-Id: I6f2d914ee8fbf9358b182b05c91fe97bc7edcc5b
The mirror02 mirrors were booted on flavors that were much larger than
necessary and didn't have external volumes attached for the cache
content. I've gone ahead and booted replacement Noble nodes using a
smaller flavor, naming them mirror03, and attached a volume to each one
for caching.
We pull mirror02 out of the inventory as we don't need it anymore
(mirror01 is in use in production and will be cleaned up in followups).
Depends-On: https://review.opendev.org/c/opendev/zone-opendev.org/+/944150
Change-Id: Ice9b4e79bfde5a8364d084c7434b848805d8ecfd
Currently this logs to /var/log/ansible.log via the log_path setting
in the Ansible config, and we also redirect output to a file. The
stdout dump is the primary debugging method, and contains the same
info as what is put into /var/log/ansible.log by Ansible logging.
Instead, set ANSIBLE_LOG_PATH to /dev/null to drop these logs, and just
save the stdout output. While we're here, save stderr too.
This way if you manually run Ansible on bridge you've got logging by
default, but this should stop multiple runs of production Ansible via
Zuul all mushing their output together into a fairly useless global
log file.
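A rough sketch of the intended invocation as an Ansible task (the
playbook and log file paths are placeholders):

    # Hypothetical task run on bridge; real playbook and log paths differ.
    - name: Run a production playbook with per-run stdout/stderr logging
      shell: >-
        ansible-playbook -v playbooks/service-example.yaml
        > /var/log/ansible/service-example.yaml.log 2>&1
      environment:
        ANSIBLE_LOG_PATH: /dev/null    # keep this run out of the shared /var/log/ansible.log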
Change-Id: Iae32f501dc718f9bbfd403c6857ca7c8dc8767de
We've moved all our resources to the new project now, so no longer
need old cloud and hostvar references.
Also include some comments about manual adjustments we made to the
MTU in the new projects.
Change-Id: I0bca50f2193d89fffd3ca20c8f8fc79e376eebb1
This reverts commit 03816fa43363d9162749bf3cf418f788acfee7cc.
This is a partial reapplication of the previously broken change. We make
a small edit to the ansible playbook to run zuul_return in a valid
context, specifically as a task against localhost.
We also move the infra-prod-bootstrap-bridge dependency into the PPC
because the PPC dependencies override job dependencies.
Change-Id: Icc2e0871abfed28937eb96bc14bb2be6b0d882d8
This reverts commit d616ec9d9ae2e2fb7f5d53f0f3f14917f0028b0d.
We are hitting "ERROR! 'zuul_return' is not a valid attribute for a
Play" in the bootstrap-bridge job.
Change-Id: Iebb49ae9c01ea62e8877860fdb0bf1e3d4080607
We have mirrored the selenium/standalone-firefox image to
quay.io/opendevmirror so that we don't have to pull this image from
docker hub and eat into quotas there. Start fetching the image from the
mirror in our CI jobs.
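Roughly, wherever the job references the image, the change looks like
this (the mirrored image name and tag here are assumptions):

    # Illustrative fragment; the real job configuration and tag may differ.
    services:
      firefox:
        # previously pulled from Docker Hub as selenium/standalone-firefox
        image: quay.io/opendevmirror/selenium-standalone-firefox:latest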
Change-Id: I790f7b29f7e30c2cc2a8b37c0146d1f8e594264e