Replace lsof with ss in RabbitMQ OCF script

Under heavy load test conditions, lsof has been observed to hang
for a considerable time, causing timeouts on the RabbitMQ stop
path that Service Manager triggers during a swact.

To avoid that, either the netstat or the ss command can be used to
check for a process listening on the AMQP port (5672).

The ss command was chosen because the netstat man page marks
netstat as obsolete and points to ss as the replacement for most
of its functionality.
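
As an illustration of the check described above, here is a minimal sketch that extracts the process column for a listener on a given port from `ss -ntlp` output. The function name `port_listener_from_ss` is hypothetical and this is not the code used in the commit, which filters with `grep -w 5672`. The sketch matches the port only at the end of the Local Address:Port column, on the assumption that a stricter match is wanted, since `grep -w` would also match a line containing `pid=5672`. It reads the `ss` output from stdin so it can be tried on captured output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OCF script): read `ss -ntlp` output on
# stdin and print the Process column of the first listener on the given port.
port_listener_from_ss() {
    port="$1"
    awk -v port="$port" '
        # Column 1 is the socket state, column 4 is Local Address:Port,
        # column 6 is the Process field shown by the -p option.
        $1 == "LISTEN" && $4 ~ (":" port "$") { print $6; exit }
    '
}

# Live usage (requires iproute2; -p needs enough privileges to show owners):
#   ss -ntlp | port_listener_from_ss 5672
```

An empty result means no listener was found on that port, which is the condition the stop path expects after a successful shutdown.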

Also, note that ss obtains socket information through Netlink, which
is itself accessed via the socket API.

Closes-Bug: 2018346

Test Plan:

PASS: Verify, using ss, the listening amqp socket
PASS: Verify AIO-DX is properly deployed
PASS: Restart RabbitMQ service successfully using sm-restart
PASS: Swact successfully on DX system
PASS: Lock/unlock successfully

Change-Id: I929b2a1b7a61eb70154c00177aa0b7f2fc46890a
Signed-off-by: Adriano Oliveira <adriano.oliveira@windriver.com>
Adriano Oliveira 2023-05-03 16:30:53 -04:00
parent 25bdea4131
commit 0dee292bef

@@ -369,10 +369,10 @@ rabbit_stop() {
         ocf_log err "rabbitmq-server stop command failed: $RABBITMQ_CTL stop, $rc"
     fi
-    process_name=$(lsof -i TCP:5672 | grep LISTEN | awk '{print $1}' | sed 1q)
+    process_info=$(ss -ntlp | grep -w 5672 | awk '{print $6}' | sed 1q)
-    if [ ! -z $process_name ]; then
-        ocf_log err "rabbitmq-server stop command executed: '$RABBITMQ_CTL stop $RABBITMQ_PID_FILE', but port is still in use by $process_name."
+    if [ ! -z "${process_info}" ]; then
+        ocf_log err "rabbitmq-server stop command executed: '$RABBITMQ_CTL stop $RABBITMQ_PID_FILE', but port is still in use by ${process_info}."
         exit $OCF_ERR_GENERIC
     fi