From 28bbeb21e1d8a1edf35480dbf208a896b695ef37 Mon Sep 17 00:00:00 2001
From: Iago Regiani
Date: Thu, 2 Sep 2021 16:54:45 -0300
Subject: [PATCH] Controller lock failure in RR patch removal

During an update strategy on an AIO-DX system, the swact might complete
and, a few moments later, a swact back be triggered by SM because
rabbitmq was unable to reach the enabled state. This happened because
port 5672 was already in use. No other service should use that port,
which suggests that a handler was not properly disposed of in the
previous execution of rabbit.

This adds a PID file parameter to the stop command of the rabbit
instance, so that the command finishes only when the related process
has terminated. It also adds a port check that logs an error when the
port is still in use after the stop command has concluded.

Test plan:
PASS: SX-AIO Bootstrap & deployment
PASS: DX-AIO Bootstrap & deployment

Failure path:
PASS: Launched an unexpected process to occupy port 5672 and it is
      logged successfully

Closes-bug: #1942464
Signed-off-by: Iago Regiani
Change-Id: Id6c0770ba8e1a5460e4af07e0e6d0d5447581771
---
 rabbitmq-server-config/files/rabbitmq-server.ocf | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/rabbitmq-server-config/files/rabbitmq-server.ocf b/rabbitmq-server-config/files/rabbitmq-server.ocf
index fc09205..2165f83 100644
--- a/rabbitmq-server-config/files/rabbitmq-server.ocf
+++ b/rabbitmq-server-config/files/rabbitmq-server.ocf
@@ -358,12 +358,18 @@ rabbit_stop() {
         # return $OCF_SUCCESS
     fi
 
-    $RABBITMQ_CTL stop
+    $RABBITMQ_CTL stop $RABBITMQ_PID_FILE
     rc=$?
 
     if [ "$rc" != 0 ]; then
         ocf_log err "rabbitmq-server stop command failed: $RABBITMQ_CTL stop, $rc"
-        return $rc
+    fi
+
+    process_name=$(lsof -i TCP:5672 | grep LISTEN | awk '{print $1}' | sed 1q)
+
+    if [ ! -z $process_name ]; then
+        ocf_log err "rabbitmq-server stop command executed: '$RABBITMQ_CTL stop $RABBITMQ_PID_FILE', but port is still in use by $process_name."
+        exit $OCF_ERR_GENERIC
     fi
 
     # Spin waiting for the server to shut down.