Clustered Gateway node offline due to failing health check

Article ID: 197682

Updated On:

Products

CA API Gateway
API SECURITY
CA API Gateway Precision API Monitoring Module for API Gateway (Layer 7)
CA API Gateway Enterprise Service Manager (Layer 7)
STARTER PACK-7
CA Microgateway

Issue/Introduction

Four clustered Gateway nodes are running in AWS. Three of the nodes are receiving traffic, but one is not.
The health check reports OK for only 3 of the nodes.
Two nodes are registered as Gateway1, and both entries point to node001, so node002 is skipped. One of these Gateway1 entries is meant to be node002, but it reports FAIL for node001, which leaves node002 effectively offline.

All 4 gateway nodes are running.

Cause

The data in the cluster_info table is wrong.

Running select * from cluster_info\G shows that 2 nodes have the same name, 'Gateway1':

mysql> select * from cluster_info\G
*************************** 1. row ***************************
           ...
             name: Gateway2
          ...
*************************** 2. row ***************************
           ...
             name: Gateway3
          ...
*************************** 3. row ***************************
           ...
             name: Gateway1
         ...
*************************** 4. row ***************************
           ...
             name: Gateway1
          ...
4 rows in set (0.00 sec)
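
In a larger cluster, scanning the full output for repeated names is easy to get wrong. A minimal sketch of a query that lists only the duplicated names is shown here; it relies solely on the name column visible in the output above and makes no assumption about the other columns.

```sql
-- Sketch: list every node name that appears more than once in cluster_info.
-- Uses only the `name` column shown in the output above.
SELECT name, COUNT(*) AS node_count
FROM cluster_info
GROUP BY name
HAVING COUNT(*) > 1;
```

For the four rows shown above, this should return a single row: Gateway1 with a node_count of 2. An empty result means every node name is unique.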

Environment

Release : 9.4

Component : API GATEWAY

Resolution

1. Delete the record for the problematic node, which has the duplicated name 'Gateway1', from the cluster_info table

2. Restart the problematic Gateway node so that it re-registers its node info
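
Step 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not a verified procedure: the article elides the other cluster_info columns, so the column name nodeid is an assumption standing in for whichever column uniquely identifies a row in your schema, and '<nodeid-of-problematic-node>' is a placeholder to be replaced with the value taken from your own select output.

```sql
-- Assumption: `nodeid` uniquely identifies a row in cluster_info.
-- Check the full output of "select * from cluster_info\G" and substitute
-- whichever column is actually unique in your schema.
-- '<nodeid-of-problematic-node>' is a placeholder, not a real value.

-- Confirm which row will be removed before deleting anything.
SELECT * FROM cluster_info
WHERE nodeid = '<nodeid-of-problematic-node>';

-- Delete only that row. Filtering on name = 'Gateway1' alone would match
-- both rows, including the healthy node that legitimately has that name.
DELETE FROM cluster_info
WHERE nodeid = '<nodeid-of-problematic-node>';
```

Identifying the row by a unique column rather than by name matters here because the name is the one value the two rows share.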

 

After the restart, all 4 rows in cluster_info have unique names and traffic is flowing to all 4 Gateway nodes.