183 Commits

Author SHA1 Message Date
Matt Crees
50bbcb09d0 Don't allow quorum queues to be disabled
We will be upgrading RabbitMQ to version 4.0 in Epoxy. This will not
work without quorum queues being enabled.

Change-Id: Ic6ad64bf8c62bbff175e15029eb121814032c40e
2025-04-09 16:20:59 +02:00
Matt Crees
8f0a4f6726 Remove om_enable_rabbitmq_high_availability
We're going to upgrade RabbitMQ to 4.0, so this option will no longer
be supported.

Change-Id: Ide75a8c9086798bf4bdf5bc02d4a1be17017884f
2025-04-03 09:02:34 +01:00
Michal Nasiadka
4614aad4cc rabbitmq: Add support for using stream queues for fanout
The global configuration of rabbit_qos_prefetch_count = 1
in oslo.messaging is a deliberate choice that balances
compatibility, fairness, and reliability across all
OpenStack services.

While some services (particularly those using quorum queues)
could theoretically benefit from a higher or even unlimited
prefetch count for performance reasons, this is not universally
safe. Specifically, RabbitMQ stream queues - which are used in
certain OpenStack components - do not support a prefetch_count of
0. Setting this value to 0 would result in runtime errors
or stalls, as the stream protocol requires a positive credit
window to function properly. Oslo.messaging enforces this by
raising an error if prefetch_count = 0 is set while using stream
queues.

A value of 1 is the lowest legal and universally compatible
setting. It works safely with both stream and quorum queues.
It also improves fairness in message distribution across worker
processes. In distributed services like Nova, Neutron, or
Cinder, this helps avoid uneven load where one worker might
prefetch many messages while others remain idle.
A low prefetch count ensures that messages are pulled only
when the worker is ready, promoting better load balancing and
more predictable performance.

Although a higher prefetch value could increase throughput in
certain scenarios, it comes at the cost of memory usage, risk
of overloading specific nodes, and potential starvation of
others. Until oslo.messaging supports queue-type-specific or
per-service tuning, the value of 1 remains the safest and
most predictable option that works well in mixed OpenStack
environments.

In summary, rabbit_qos_prefetch_count = 1 is not about
optimizing raw throughput for any one service, but about
ensuring stable, fair, and reliable behavior across all
services that rely on oslo.messaging, regardless of the queue
type or broker backend used.
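
For illustration, this is the option in question as it appears in a
service's oslo.messaging configuration (a sketch; the section name is
standard, the surrounding config is service-specific):

    [oslo_messaging_rabbit]
    # Lowest legal value; safe for both stream and quorum queues.
    # 0 would break stream queues, which need a positive credit window.
    rabbit_qos_prefetch_count = 1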

Change-Id: I541f704bfa2e98068096331afbdb591659cbc40b
2025-04-02 12:59:07 +02:00
Sven Kieske
3c3a18aa5b Rabbitmq: enable quorum for transient queues
This helps to improve the reliability of OpenStack services when
a RabbitMQ node has issues.

See also: https://bugs.launchpad.net/oslo.messaging/+bug/2031497
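
As a sketch, the client-side option this toggles (option name per
oslo.messaging; kolla-ansible templates render it per service):

    [oslo_messaging_rabbit]
    # Use quorum queues for transient (reply_ and fanout_) queues
    rabbit_transient_quorum_queue = true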

Adjust the upgrade tests similar to what was done
during the introduction of quorum queues in I6c033d460a5c9b93c346e9e47e93b159d3c27830

Closes-Bug: #2078339
Partial-Bug: #2077448
Depends-On: https://review.opendev.org/c/openstack/oslo.messaging/+/888479
Depends-On: https://review.opendev.org/c/openstack/kolla-ansible/+/924623
Signed-off-by: Sven Kieske <kieske@osism.tech>
Change-Id: Idb8a8d2e560206f7697c0771c9ae3913268fa6dd
2025-03-27 08:35:35 +00:00
Zuul
d7c4546fce Merge "Remove Swift role" 2025-03-24 14:13:46 +00:00
Michal Arbet
4e0c0aa767 Add oslo.messaging Queue Manager
Adds the variable ``om_enable_queue_manager`` to configure enabling
oslo.messaging Queue Manager for all services which use RabbitMQ. This
is enabled by default.

This is required before we can move away from transient queues
(``rabbit_transient_quorum_queue``) as Queue Manager is needed to avoid
consuming all erlang atoms after some time. It is also a useful feature
for debugging, as queues are now named with hostnames and services
included.

Also, setting ``lock_path`` and mounting ``/dev/shm`` are dependent on
``om_enable_queue_manager``, so these are now enabled too. This will allow
us to backport these features to Caracal and Dalmatian without
enforcing the changes right away.
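
A minimal sketch of the new toggle in globals.yml (it defaults to
enabled, so this line is only needed to opt out):

    # /etc/kolla/globals.yml
    om_enable_queue_manager: false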

Why is -t added for exec commands?
When using Podman, behavior differs from Docker regarding exec
commands. If -t (which allocates a pseudo-terminal) is not used, the
process runs as a child of podman exec without a PTY. This causes
Python inside the container to assign PGID=0 because the pseudo-terminal
is missing, leading to issues with non-existent paths like /proc/0/....
Adding -t ensures proper PGID assignment, preventing these issues and
ensuring consistent behavior across Docker and Podman.

Side-notes:
The quorum queue precheck needed to be updated, as the queue manager
puts "_fanout" on the end of the queue name instead.
We need to override the ``[oslo_messaging_rabbit] processname`` for
services running under WSGI, as they will otherwise all use the same
processname, ``mod_wsgi``. This will cause Permission Errors when
trying to access the same file in shared memory, as the services run
as different users.

Change-Id: Iae5f268e778fbbd2b744dc71a84253ec9e758a99
2025-03-24 09:17:25 +00:00
Michal Nasiadka
adb1a9b918 Remove Swift role
Since Swift is broken, and since its deprecation nobody has
picked up the work to make it work again, let's remove the
Swift role and associated integrations/CI scripts.

Change-Id: I08e92aaeea644053fd25f80ce1f276a495cebbfc
2025-03-21 16:30:59 +00:00
Michal Arbet
94765ab468 Fix boolean representation in all configurations
In the past, we agreed to use lowercase letters
for boolean representation. This patch implements
exactly that, just changing True -> true, False -> false.

Change-Id: Ia601d84283dd8bc45e56e1029106de8836139173
2025-03-13 18:19:27 +01:00
Zuul
6a7c29e4ff Merge "Move actions to kolla_container_facts" 2025-02-07 17:36:55 +00:00
Ivan Halomi
19af5826fc Move actions to kolla_container_facts
Move actions responsible for info about containers
from kolla_container module to kolla_container_facts.
Also fixes a bug with inconsistencies between docker
and podman in kolla_container_facts.

Closes-bug: #2084878
Change-Id: I1db88e28a828ebf073f018b2bae1d9556ec22807
Signed-off-by: Ivan Halomi <ivan.halomi@tietoevry.com>
Signed-off-by: Martin Hiner <martin.hiner@tietoevry.com>
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
2025-02-07 09:40:43 +00:00
Michal Arbet
2523ab4376 Ensure consistent lock_path across all services
This patch removes the conditional check
`om_enable_queue_manager` for `oslo_concurrency`, as
it was inconsistently applied across services and
is actually unrelated to the queue manager. Simply
put, the conditional was present for some services
and absent for others.

While `oslo.concurrency` itself does not
require a specific path, the implementation of the queue
manager expects locks to be placed under
`/var/lib/<service>/tmp`, making it necessary to define
this path explicitly. Therefore, the lock path is set
accordingly across all services, regardless of whether
the queue manager is used.

Additionally, this patch adds missing `lock_path`
configurations where they were absent to ensure uniformity.
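
As an illustrative sketch, with nova standing in for any service (the
path follows the `/var/lib/<service>/tmp` convention described above):

    [oslo_concurrency]
    lock_path = /var/lib/nova/tmp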

Change-Id: I93bbaa61b2d8b5cb0d1a11783086b37a860173b6
2025-02-04 11:33:29 +00:00
Michal Arbet
a944fad527 Set lock_path for openstack services
The Oslo.messaging project implemented a nice feature
called Queue Manager. This means that instead of using
random queue names in RabbitMQ, it uses queue names composed
of hostname, process-name and some integer. For proper
functioning, the code uses lockutils from oslo_concurrency
and fails if lock_path is not set.

This patch simply sets the oslo_concurrency lock_path for
services which can potentially use Queue Manager.

This is enabled when ``om_enable_queue_manager`` is set to ``True``.
Queue Manager will be supported in a follow-up patch.

Change-Id: Ided3b2bce03ea11fb34820fc40b4a3d694b8b44c
2025-01-29 09:03:51 +01:00
Zuul
b4c8edf10f Merge "Support mounting host's /dev/shm into container" 2025-01-29 01:26:57 +00:00
Mark Goddard
fa6535890c Reintroduce kolla-ansible check
This allows operators to quickly diagnose all containers across
all hosts by running ``kolla-ansible check``. It returns a list
of containers that are missing, not running, or in an unhealthy
state for each OpenStack service.

Change-Id: I36119ccdeb264aa3de928ec2254d6ff4cc955bfb
Implements: blueprint check-containers
Co-Authored-By: Roman Krček <roman.krcek@tietoevry.com>
2025-01-27 20:22:46 +00:00
Michal Arbet
3a8bfc0ace Support mounting host's /dev/shm into container
The Oslo.messaging project implemented a nice feature called Queue
Manager. This means that instead of using random queue names in
RabbitMQ, it uses queue names composed of hostname, process-name and a
counter. For proper functioning, the code stores some information in
/dev/shm. This is needed to avoid creating queues with the same name.
We'd otherwise hit this where multiple services run under mod_wsgi, or
with services such as Magnum even within a single container, as it
needs to create multiple "reply" queues.

This patch mounts /dev/shm in containers where oslo_messaging is used.

This is enabled when ``om_enable_queue_manager`` is set to ``True``.
Queue Manager will be supported in a follow-up patch.
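
A sketch of what the mount looks like in a service's container volume
list (illustrative; the real templates guard this behind
``om_enable_queue_manager``):

    volumes:
      - "/dev/shm:/dev/shm"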

Change-Id: Ib85ce252374fae917d329e1824800a288c6bc9f1
2025-01-21 10:25:46 +00:00
Dr. Jens Harbott
9ecdf2f0a3 Use public keystone URL for www_authenticate_uri
The `www_authenticate_uri` parameter is used to indicate to clients
where they should get a token from in order to authenticate against a
service. Most clients are not expected to be able to talk to the
internal identity endpoint, so this parameter should refer to the public
endpoint instead, see also [0].

[0] https://opendev.org/openstack/keystonemiddleware/src/branch/master/keystonemiddleware/auth_token/_opts.py#L31-L50
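
An illustrative keystonemiddleware snippet with hypothetical URLs -
the public endpoint for clients, while token validation by the service
itself can stay on the internal endpoint:

    [keystone_authtoken]
    # Advertised to clients that need to obtain a token
    www_authenticate_uri = https://public.example.com:5000
    # Used by the service itself to validate tokens
    auth_url = https://internal.example.com:5000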

Change-Id: Ic99804967b5a62b5a9e39486749474520734ba48
2025-01-09 19:56:36 +00:00
Aravindh Murugesan
70279972b6 HAProxy: Switch to L7 Healthchecks
Address occasional issues where TCP connections appear healthy,
yet the web servers within containers fail to respond,
resulting in requests being sent to unhealthy servers.

Implemented for the services I currently use (a backend config sketch
follows the list).
- Aodh (OPTIONS to /, expects 2XX or 3XX)
- Barbican (OPTIONS to /, expects 2XX or 3XX)
- Blazar (OPTIONS to /, expects 401)
- Cinder API (OPTIONS to /, expects 2XX or 3XX)
- CloudKitty (OPTIONS to /, expects 2XX or 3XX)
- Designate (OPTIONS to /, expects 2XX or 3XX)
- Glance (OPTIONS to /, expects 2XX or 3XX)
- Gnocchi (OPTIONS to /, expects 2XX or 3XX)
- Grafana (OPTIONS to /, expects 2XX or 3XX)
- Heat (OPTIONS to /, expects 2XX or 3XX)
- Horizon (OPTIONS to /, expects 2XX or 3XX)
- Ironic (OPTIONS to /, expects 2XX or 3XX)
- Keystone (OPTIONS to /, expects 2XX or 3XX)
- Magnum (OPTIONS to /, expects 2XX or 3XX)
- Manila (OPTIONS to /, expects 2XX or 3XX)
- Masakari (OPTIONS to /, expects 2XX or 3XX)
- Mistral (OPTIONS to /, expects 2XX or 3XX)
- Nova API (OPTIONS to /, expects 2XX or 3XX)
- Nova Metadata (OPTIONS to /, expects 2XX or 3XX)
- Neutron (OPTIONS to /, expects 2XX or 3XX)
- Opensearch (OPTIONS to /, expects 2XX or 3XX)
- Opensearch Dashboards (OPTIONS to /, expects 401)
- Placement (GET to /, expects 2XX or 3XX)
- Prometheus (OPTIONS to /, expects 2XX or 3XX)
- Prometheus AlertManager (OPTIONS to /, expects 2XX or 3XX)
- Prometheus Openstack Exporter (OPTIONS to /, expects 2XX or 3XX)
- Prometheus Server (OPTIONS to /, expects 2XX or 3XX)
- Skyline API (OPTIONS to /docs, expects 2XX or 3XX)
- Skyline Console (GET to /, expects 2XX or 3XX)
- Swift (OPTIONS to /info, expects 2XX or 3XX)
- Trove (OPTIONS to /, expects 2XX or 3XX)
- Venus (OPTIONS to /, expects 2XX or 3XX)
- Watcher (GET to /, expects 2XX or 3XX)
- Zun (OPTIONS to /, expects 2XX or 3XX)
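
As a sketch, the kind of backend check this produces (directive names
are standard HAProxy; the backend and server lines are illustrative):

    backend keystone_back
        # L7 check: send OPTIONS / and require a 2XX or 3XX response
        option httpchk OPTIONS /
        http-check expect rstatus ^[23]
        server controller01 192.0.2.10:5000 check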

Change-Id: I839f7f1051182fe797394e5436571d64d5c5b5a4
2024-12-20 09:16:28 +01:00
Radosław Piliszek
345ecbf55e Refactor services' check-containers and optimise
This might fix some hidden bugs where the check tasks forgot to
include params important for the service.

We also get a nice optimisation by using a filtered loop instead
of task skipping per service with 'when'. As proven in
https://review.opendev.org/c/openstack/kolla-ansible/+/914997

This refactoring allows further optimisation and fixing work to
proceed with much less hassle, including getting rid of many notify
statements, as restarts are now safely handled by check-containers.
Some notifies had to stay because of special edge cases, e.g. in
rolling upgrades and loadbalancer config.

One downside is that we remove the little optimisation for Zun that
ignored config changes when copying loopback, but this is an
acceptable tradeoff considering the benefits above.

Co-Authored-By: Roman Krček <roman.krcek@tietoevry.com>
Change-Id: I855dfef33aa0f3fd1301295bb8ede3e587e7162a
Partially-Implements: blueprint performance-improvements
2024-12-01 22:16:51 +01:00
Radosław Piliszek
53376aed8f Performance: Don't notify handlers during config
This patch builds upon the genconfig optimisation and takes it
further by not having genconfig ever touch the handlers!
Calling the handlers and skipping them created an unnecessary slowdown
if only config was run. It also depends on the config checking fix.

This gets us closer to the single responsibility principle -
config only generates the config, container checks only validate
whether container restart is needed.

This also means that we will have a single place where containers are
restarted, where we can fix, in the following patches, the Ansible
quirk of restarting the whole group even when only one container changed.

The only exception is the loadbalancer role, as the loadbalancer services
have their config altered by other roles registering their services
using loadbalancer-config. This is in contrast to typical roles,
which do config in one step and can then run check-containers in
the next step.

Also fixes some handlers that were missing the necessary guard,
which had made genconfig able to actually restart some containers.
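
For illustration, the guard pattern in question looks roughly like
this in a role's handlers (a sketch following kolla-ansible
conventions; the task details are illustrative):

    - name: Restart nova-api container
      vars:
        service: "{{ nova_services['nova-api'] }}"
      become: true
      kolla_container:
        action: "recreate_or_restart_container"
        name: "{{ service.container_name }}"
      when: kolla_action != "config"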

Future work:
- optimise config by doing local generation and mass rsync
- support for reloads
- unconditional restart/reload (separate action)
- make 'reconfigure' act like 'genconfig' + 'deploy-containers'
  - this would avoid calling bootstrapping each time but might
    be tricky as it would break current compatibility
  - could call this 'reconfigure-containers' and deprecate
    'reconfigure'
- fix the ansible quirk that notifies more handlers than intended

Change-Id: I0ce24043ae5486b2b55489ba40abe2b96b0991a6
Partially-Implements: blueprint performance-improvements
Co-Authored-By: Roman Krček <roman.krcek@tietoevry.com>
2024-12-01 22:16:38 +01:00
Roman Krček
006ff07185 Don't notify handlers during copy-cert
This is a prerequisite for patchset #745164

This fixes unwanted restarts during copying of certificates.
With the conditional statements removed from role handlers in #745164,
copying certificates caused containers to restart, which is unwanted
during the genconfig process. However, if we simply removed the handler
notifiers from certificate copying, the containers would never
restart, since after #745164 containers restart only if any
of the files specified in config.json change. Certificates are now
copied to an intermediary location inside the container, from which
the kolla_copy_cacerts script installs them into the trust store.

Depends-on: https://review.opendev.org/c/openstack/kolla/+/926882
Change-Id: Ib89048c7e0f250182c4bf57d8c8a1b5478e9b4ab
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
2024-12-01 22:16:25 +01:00
Zuul
f35cf5572c Merge "Automate prometheus blackbox configuration" 2024-09-27 17:03:49 +00:00
Roman Krček
b327527259 Refactor dev mode
Builds upon changes in kolla which change the strategy of installing
projects in containers when in dev mode. This fixes problems where,
when a package's file manifest changed, the changes were not reflected
in the devmode-enabled container.

It changes the strategy of installing projects in dev mode in containers.
Instead of bind mounting the project's git repository into the venv
of the container, the repository is bind mounted to
/dev-mode/<project_name>, from which it is installed using pip
on every startup of the container by the kolla_install_projects script.

Also updates docs to reflect the changes.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/925712
Closes-Bug: #1814515
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
Change-Id: If191cd0e3fcf362ee058549a1b6c244d109b6d9a
2024-09-03 09:49:37 +02:00
Zuul
99ffff3551 Merge "Add support for docker_image_name_prefix" 2024-08-20 13:37:50 +00:00
Ivan Halomi
4ce47e2250 Refactor of kolla_container_facts
Refactor that prepares the kolla_container_facts module for the
introduction of more actions, which will be moved there from the
kolla_container module and kolla_container_volume_facts.

This change is based on a discussion about adding a new action
to kolla_container module that retrieves all names of the running
containers. It was agreed that kolla-ansible should follow Ansible's
direction of splitting modules between action modules and facts
modules. Because of this, kolla_container_facts needs to be able
to handle different requests for data about containers or volumes.

Change-Id: Ieaec8f64922e4e5a2199db2d6983518b124cb4aa
Signed-off-by: Ivan Halomi <ivan.halomi@tietoevry.com>
2024-08-12 09:54:05 +02:00
Michal Arbet
ae86e3a0db Add support for docker_image_name_prefix
The Kolla project supports building images with
user-defined prefixes. However, Kolla-ansible is unable
to use those images for installation.

This patch fixes that issue.
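
A sketch of the matching setting (variable name per this change; the
prefix value is illustrative):

    # /etc/kolla/globals.yml
    docker_image_name_prefix: "myprefix-"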

Closes-Bug: #2073541
Change-Id: Ia8140b289aa76fcd584e0e72686e3786215c5a99
2024-07-19 08:10:45 +02:00
Roman Krček
fb3a8f5fa9 Performance: use filters for service dicts
Most roles are not leveraging the jinja filters available.
According to [1] filtering the list of services makes the execution
faster than skipping the tasks.

This patchset also includes some cosmetic changes to genconfig.
Individual services are now also using a jinja filter. This has
no impact on performance, just makes the tasks look cleaner.

Naming of some vars in genconfig was changed to "service" to make
the tasks more uniform, as some were previously using
the service name and some were using "service".

Three metrics from the deployment were taken and those were
- overall deployment time [s]
- time spent on the specific role [s]
- CPU usage (measured with perf) [-]
Overall genconfig time went down on avg. from 209s to 195s
Time spent on the loadbalancer role went down on avg. from 27s to 23s
Time spent on the neutron role went down on avg. from 102s to 95s
Time spent on the nova-cell role went down on avg. from 54s to 52s
Also the average CPUs utilized reported by perf went down
from 3.31 to 3.15.
For details of how this was measured see the comments in gerrit.

[1] - https://github.com/stackhpc/ansible-scaling/blob/master/doc/skip.md
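
A sketch of the pattern (the filter name follows kolla-ansible's
filter plugins; the task itself is illustrative):

    - name: Copying over config.json files
      template:
        src: "{{ item.key }}.json.j2"
        dest: "{{ node_config_directory }}/{{ item.key }}/config.json"
      with_dict: "{{ nova_services | select_services_enabled_and_mapped_to_host }}"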

Change-Id: Ib0f00aadb6c7022de6e8b455ac4b9b8cd6be5b1b
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
2024-06-28 09:04:43 +02:00
Alex-Welsh
91470d4c21 Automate prometheus blackbox configuration
This change automates the prometheus blackbox monitoring configuration
for common endpoints. Custom endpoints can be added to
prometheus_blackbox_exporter_endpoints_custom.

Change-Id: Id6f51a2bebee3ab63b84ca7032aad17c2933838c
2024-05-16 11:11:50 +01:00
Roman Krček
bc9f3f1931 Fix trove module imports
The path to the modules needed by trove-api changed in the source
trove package, so the configuration was updated.

Closes-bug: #1937120
Signed-off-by: Roman Krček <roman.krcek@tietoevry.com>
Change-Id: I5df02af004fabb9bb1d6ca7c3fd83954cbf63a51
2024-03-11 12:54:36 +01:00
wu.chunyang
9eff43809f Fix trove failed to discover swift endpoint
This change fixes Trove failing to discover the Swift endpoint
by adding a service_credentials section in guest-agent.conf.
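
An illustrative sketch of such a section (option names typical for the
Trove guest agent; all values are placeholders):

    [service_credentials]
    auth_url = https://internal.example.com:5000
    region_name = RegionOne
    project_name = service
    username = trove
    password = secret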

Closes-Bug: #2048829

Change-Id: I185484d2a0d0a2d4016df6acf8a6b0a7f934c237
2024-01-11 10:15:12 +00:00
wu.chunyang
57b24f01f3 Fix trove failed to connect rabbitmq - quorum queues support
This change fixes the Trove guest instance failing to connect to
RabbitMQ by adding quorum queue support to the oslo_messaging_rabbit
section in guest-agent.conf.
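
Sketch of the added setting (option name per oslo.messaging):

    [oslo_messaging_rabbit]
    rabbit_quorum_queue = true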

Closes-Bug: #2048822
Change-Id: I94908f8e20981f20fbe4dc18e2091d3798f8b801
2024-01-11 10:14:18 +00:00
wu.chunyang
6b96d098bf Fix trove failed to connect rabbitmq - durable queues support
This change fixes the Trove guest instance failing to connect to
RabbitMQ by adding durable queue support to the oslo_messaging_rabbit
section in guest-agent.conf.
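
Sketch of the added setting (option name per oslo.messaging):

    [oslo_messaging_rabbit]
    amqp_durable_queues = true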

Partial-Bug: #2048822

Change-Id: I8efc3c92e861816385e6cda3b231a950a06bf57d
2024-01-11 10:11:29 +00:00
Sven Kieske
64575519aa enable quorum queues
This implements a global toggle `om_enable_rabbitmq_quorum_queues`
to enable quorum queues for each service in RabbitMQ, similar to
what was done for HA[0].

Quorum Queues are enabled by default.

Quorum queues are more reliable, safer, simpler and faster than
replicated mirrored classic queues[1].

Mirrored classic queues are deprecated and scheduled for removal
in RabbitMQ 4.0[2].

Notice that we do not need a new policy in the RabbitMQ definitions
template, because quorum queue usage is enabled on the client side and
can't be set using a policy[3].

Notice also that quorum queues are not yet enabled in oslo.messaging
for reply_ and fanout_ queues (transient queues).
This will change once [4] is merged.

[0]: https://review.opendev.org/c/openstack/kolla-ansible/+/867771
[1]: https://www.rabbitmq.com/quorum-queues.html
[2]: https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[3]: https://www.rabbitmq.com/quorum-queues.html#declaring
[4]: https://review.opendev.org/c/openstack/oslo.messaging/+/888479
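
For illustration, the toggle as it would appear in globals.yml
(variable name per this change; quorum queues default to enabled):

    # /etc/kolla/globals.yml
    om_enable_rabbitmq_quorum_queues: true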

Signed-off-by: Sven Kieske <kieske@osism.tech>
Change-Id: I6c033d460a5c9b93c346e9e47e93b159d3c27830
2023-11-30 13:53:00 +00:00
Martin Hiner
a13d83400f Rename kolla_docker to kolla_container
Changes name of ansible module kolla_docker to
kolla_container.

Change-Id: I13c676ed0378aa721a21a1300f6054658ad12bc7
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
2023-11-15 13:54:57 +01:00
Michal Nasiadka
cea076f379 Introduce oneshot docker_restart_policy
``docker_restart_policy: no`` causes systemd units to not get created,
and we use it in CI to disable restarts on services.

Introduce a oneshot policy that does not create systemd units for
oneshot containers (those that run bootstrap tasks, like db
bootstrap, and don't need a systemd unit), but still creates systemd
units for long-lived containers, with Restart=No.
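
A sketch of the resulting semantics (setting per this change):

    # /etc/kolla/globals.yml - with this change, long-lived containers
    # still get systemd units (Restart=No); bootstrap containers use
    # the internal 'oneshot' policy and get no unit at all.
    docker_restart_policy: "no"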

Change-Id: I9e0d656f19143ec2fcad7d6d345b2c9387551604
2023-11-14 15:17:50 +00:00
Michal Nasiadka
4bc410c6ca haproxy: support single external frontend
Use case: exposing single external https frontend and
load balancing services using FQDNs.

Support different ports for internal and external endpoints.

Introduced kolla_url filter to normalize urls like:
- https://magnum.external:443/v1
- http://magnum.external:80/v1
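
A sketch of the filter in use (pattern as in kolla-ansible group_vars;
the exact arguments may differ):

    magnum_public_endpoint: "{{ magnum_external_fqdn | kolla_url(public_protocol, magnum_api_port, '/v1') }}"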

Change-Id: I9fb03fe1cebce5c7198d523e015280c69f139cd0
Co-Authored-By: Jakub Darmach <jakub@stackhpc.com>
2023-06-29 01:44:00 +02:00
wu.chunyang
303998e294 Switch trove-api to wsgi running under apache.
This change also adds support for Trove backend TLS.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/854744
Change-Id: I2acf7820b24b112b57b0c00a01f5c4b8cb85ce25
2023-02-02 01:22:59 +00:00
Zuul
383dfc21d6 Merge "Fix prechecks in check mode" 2023-01-16 11:14:45 +00:00
Matt Crees
09df6fc1aa Add a flag to handle RabbitMQ high availability
A combination of durable queues and classic queue mirroring can be used
to provide high availability of RabbitMQ. However, these options should
only be used together, otherwise the system will become unstable. Using
the flag ``om_enable_rabbitmq_high_availability`` will either enable
both options at once, or neither of them.

There are some queues that should not be mirrored:
* ``reply`` queues (these have a single consumer and TTL policy)
* ``fanout`` queues (these have a TTL policy)
* ``amq`` queues (these are auto-delete queues, with a single consumer)
An exclusionary pattern is used in the classic mirroring policy. This
pattern is ``^(?!(amq\\.)|(.*_fanout_)|(reply_)).*``
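
An illustrative policy entry in the definitions template using that
pattern (field layout per RabbitMQ's definitions format; the name and
vhost are illustrative):

    {
      "vhost": "/",
      "name": "ha-all",
      "pattern": "^(?!(amq\\.)|(.*_fanout_)|(reply_)).*",
      "definition": {
        "ha-mode": "all"
      }
    }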

Change-Id: I51c8023b260eb40b2eaa91bd276b46890c215c25
2023-01-13 15:40:08 +00:00
Mark Goddard
46aeb9843f Fix prechecks in check mode
When running in check mode, some prechecks previously failed because
they use the command module which is silently not run in check mode.
Other prechecks were not running correctly in check mode due to e.g.
looking for a string in empty command output or not querying which
containers are running.

This change fixes these issues.

Closes-Bug: #2002657
Change-Id: I5219cb42c48d5444943a2d48106dc338aa08fa7c
2023-01-12 14:27:36 +00:00
Zuul
2b88144c05 Merge "Explicitly set the value of heartbeat_in_pthread" 2023-01-05 13:02:20 +00:00
Matt Crees
8b8b4a8217 Explicitly set the value of heartbeat_in_pthread
The ``[oslo_messaging_rabbit] heartbeat_in_pthread`` config option
is set to ``true`` for wsgi applications to allow the RabbitMQ
heartbeats to function. For non-wsgi applications it is set to ``false``
as it may otherwise break the service [1].

[1] https://docs.openstack.org/releasenotes/oslo.messaging/zed.html#upgrade-notes
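
Sketch of the rendered option for a WSGI service (set to false for
non-WSGI services):

    [oslo_messaging_rabbit]
    heartbeat_in_pthread = true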

Change-Id: Id89bd6158aff42d59040674308a8672c358ccb3c
2023-01-05 09:18:13 +00:00
Matt Crees
6c2aace8d6 Integrate oslo-config-validator
Regularly, we experience issues in Kolla Ansible deployments because we
use wrong options in OpenStack configuration files. This is because
OpenStack services ignore unknown options. We also need to keep on top
of deprecated options that may be removed in the future. Integrating
oslo-config-validator into Kolla Ansible will greatly help.

Adds a shared role to run oslo-config-validator on each service. Takes
into account that services have multiple containers, and these may also
use multiple config files. Service roles are extended to use this shared
role. Executed with the new command ``kolla-ansible validate-config``.
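
As a usage sketch, the validator can also be run by hand against a
generated config (CLI flags per oslo.config; the namespace and path
are illustrative):

    oslo-config-validator --namespace nova.conf \
        --input-file /etc/kolla/nova-api/nova.conf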

Change-Id: Ic10b410fc115646d96d2ce39d9618e7c46cb3fbc
2022-12-21 17:19:09 +00:00
Ivan Halomi
4ca2d41762 Adding container_engine to kolla_toolbox module
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.

This change adds container_engine to the module parameters,
so when we introduce podman, kolla_toolbox can be used
for both engines.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: Ic2093aa9341a0cb36df8f340cf290d62437504ad
2022-11-04 15:32:30 +01:00
Ivan Halomi
7a9f04573a Adding container engine to kolla_container_facts
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.

This change adds the container_engine variable to the
kolla_container_facts module; this prepares the module to be used with
both docker and podman without further changes in roles.

Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I9e8fa30646844ab4a288555f3aafdda345b3a118
2022-11-02 13:44:45 +01:00
Michal Nasiadka
1aac65de0c Fix issues introduced by ansible-lint 6.6.0
mainly jinja spacing and jinja[invalid] related

Change-Id: I6f52f2b0c1ef76de626657d79486d31e0f47f384
2022-09-21 14:34:54 +00:00
Zuul
89c3a92066 Merge "Add api_workers for each service to defaults" 2022-08-22 15:30:33 +00:00
Michal Arbet
4838591c6c Add loadbalancer-config role and wrap haproxy-config role inside
This patch adds the loadbalancer-config role,
which is a "wrapper" around the haproxy-config role
and the proxysql-config role, which will be added
in follow-up patches.

Change-Id: I64d41507317081e1860a94b9481a85c8d400797d
2022-08-09 12:15:49 +02:00
Michal Arbet
baad47ac61 Edit services roles to support database sharding
Depends-On: https://review.opendev.org/c/openstack/kolla/+/769385
Depends-On: https://review.opendev.org/c/openstack/kolla/+/765781

Change-Id: I3c4182a6556dafd2c936eaab109a068674058fca
2022-08-09 12:15:26 +02:00
Michal Nasiadka
dcf5a8b65f Fix var-spacing
ansible-lint introduced var-spacing - let's fix our code.

Change-Id: I0d8aaf3c522a5a6a5495032f6dbed8a2be0251f0
2022-07-25 22:15:15 +02:00
Michal Arbet
3e8db91a1e Add api_workers for each service to defaults
Rendering {{ openstack_service_workers }} for the workers
of each OpenStack service is not enough. There are
several services which have to have more workers because
more requests are sent to them.

This patch just adds a default value for the workers of
each service, with {{ openstack_service_workers }} as the
default, so the value can be overridden in hostvars per server.
Nothing changes for the normal user.
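
A sketch of a per-host override (hypothetical variable name following
the per-service pattern this change adds; the value is illustrative):

    # inventory host_vars for a busier controller
    nova_api_workers: 10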

Change-Id: Ifa5863f8ec865bbf8e39c9b2add42c92abe40616
2022-07-12 20:09:16 +02:00