Controller lock failure in RR patch removal

During an update strategy on an AIO-DX system, the swact might complete
and, a few moments later, SM might trigger a swact back because rabbitmq
is unable to reach the enabled state.

This happened because port 5672 was still in use. No other service
should use it, which suggests that a handler was not properly disposed
of in the previous execution of rabbit. This change adds a PID file
parameter to the stop command of the rabbit instance, so that the
command finishes only once the related process has terminated. It also
adds a port check that logs an error if the port is still in use after
the stop command has concluded.
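
As a rough illustration of what waiting on a PID file amounts to, the
sketch below polls until the process recorded in the file is gone. It is
not how rabbitmqctl implements the wait; the function name
wait_for_pid_exit and the 30 second default timeout are assumptions made
for this example only.

```shell
#!/bin/sh
# Hypothetical helper: wait until the process recorded in a PID file exits.
# Returns 0 once the process is gone, 1 if it is still alive at the timeout.
wait_for_pid_exit() {
    pid_file=$1
    timeout=${2:-30}   # seconds; default chosen only for illustration

    # A missing or empty PID file means there is nothing to wait for.
    [ -s "$pid_file" ] || return 0
    pid=$(cat "$pid_file")

    elapsed=0
    # kill -0 delivers no signal; it only tests whether the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A caller could run `wait_for_pid_exit "$RABBITMQ_PID_FILE" 60` after
issuing the stop and treat a non-zero return as a failed shutdown.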

Test plan:

PASS: SX-AIO Bootstrap & deployment
PASS: DX-AIO Bootstrap & deployment

Failure path:

PASS: Launched an unexpected process to occupy port 5672; the conflict
is logged as expected
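
The kind of check exercised by this failure path can be sketched as
follows. This is a minimal variant written for illustration, not the
code from the patch: the helper names first_listener and port_is_free
are hypothetical, and the sketch assumes lsof is installed (if it is
not, the port is reported as free).

```shell
#!/bin/sh
# Hypothetical helper: print the command name of the first process that
# lsof reports in LISTEN state. Reads `lsof -i` output on standard input.
first_listener() {
    awk '/\(LISTEN\)/ { print $1; exit }'
}

# Hypothetical wrapper: return 0 when nothing listens on the TCP port,
# otherwise print the name of the listening process and return 1.
# Assumption: lsof is available; if it is missing, the port looks free.
port_is_free() {
    port=$1
    holder=$(lsof -i "TCP:$port" 2>/dev/null | first_listener)
    if [ -n "$holder" ]; then
        echo "$holder"
        return 1
    fi
    return 0
}
```

For example, `holder=$(port_is_free 5672) || echo "port held by $holder"`
reports the offending process. Quoting "$holder" in the test keeps the
comparison well-formed even when the variable is empty.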

Closes-bug: #1942464
Signed-off-by: Iago Regiani <Iago.RodriguezRegiani@windriver.com>
Change-Id: Id6c0770ba8e1a5460e4af07e0e6d0d5447581771
Iago Regiani 2021-09-02 16:54:45 -03:00
parent 7c503a3033
commit 28bbeb21e1

@@ -358,12 +358,18 @@ rabbit_stop() {
         # return $OCF_SUCCESS
     fi
-    $RABBITMQ_CTL stop
+    $RABBITMQ_CTL stop $RABBITMQ_PID_FILE
     rc=$?
     if [ "$rc" != 0 ]; then
         ocf_log err "rabbitmq-server stop command failed: $RABBITMQ_CTL stop, $rc"
         return $rc
     fi
+    process_name=$(lsof -i TCP:5672 | grep LISTEN | awk '{print $1}' | sed 1q)
+    if [ ! -z $process_name ]; then
+        ocf_log err "rabbitmq-server stop command executed: '$RABBITMQ_CTL stop $RABBITMQ_PID_FILE', but port is still in use by $process_name."
+        exit $OCF_ERR_GENERIC
+    fi
     # Spin waiting for the server to shut down.