Behaviors of CA API Gateway process nodes during DB replication failures and updates

Article ID: 415241


Products

CA API Gateway

Issue/Introduction

To tolerate temporary database failures in a production CA API Gateway implementation, the CA API Gateway appliance form factor suggests a particular master-master replication configuration using the embedded MySQL database. Customers want to know how a stopped database engine, combined with ongoing use of the production system, affects the behavior of the CA API Gateway processing nodes.

Environment

Component: CA API Gateway 11.x Appliance Form Factor

Resolution

This article describes a test performed by a Broadcom Support engineer to help answer the question raised by customers.

The test was performed on a two-node cluster built from the CA API Gateway 11.0 Software Form Factor:

node 1 hosts the primary DB
node 2 hosts the secondary DB

node 1 uses the DB on node 1 as its primary and the DB on node 2 as its secondary
node 2 uses the DB on node 1 as its primary and the DB on node 2 as its secondary

This configuration follows the suggestion documented in the CA API Gateway product documentation.
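
The node database settings referenced above live in each node's node.properties file. The bash sketch below shows one way to confirm which primary and failover databases a node points at; the file path and the node.db.config.* property names are assumptions based on a typical appliance layout and should be verified against your own installation and the product documentation.

    # Hedged sketch: inspect which databases this node is configured to use.
    # The file path and property names are assumptions; verify them against
    # your own installation.
    NODE_PROPS=/opt/SecureSpan/Gateway/node/default/etc/conf/node.properties

    # Show the primary ("main") and secondary ("failover") database settings,
    # if present. In the test above, both nodes point at node 1 as primary
    # and node 2 as failover.
    grep -E 'node\.db\.config\.(main|failover)\.' "$NODE_PROPS"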

The test flow was as follows (a sketch of the corresponding commands appears after the list):

S1. Everything is working.
S2. Shut down the DB on node 1 (primary).
S3. Wait 5 minutes to allow all connections to be re-established (both nodes are now connected to the secondary DB on node 2).
S4. Use the Policy Manager to connect to node 2.
S5. Modify a policy through node 2; at this moment the policy change is written into the secondary DB on node 2.
S6. Test with a browser against both node 1 and node 2; both respond correctly (based on the modification written to the secondary DB on node 2).
S7. Shut down the DB on node 2 (now both DBs are down).
S8. Wait 5 minutes and use a browser to connect to node 1 (node 2 was not tried, but presumably behaves the same); the browser is unable to get a response from node 1 and times out. <--- A node times out when no DB connection is available.
S9. Start the DB on node 1 (primary).
S10. Wait 5 minutes to allow node 1 to reconnect to the DB on node 1. Since the updates were written only to the DB on node 2, node 1 now behaves the same way as it did before the updates were made. <-- Node 1 behaves as if no policy was ever updated, even though it was serving the updated policies from the DB on node 2 right after they were made.
S11. Start the DB on node 2 (secondary).
S12. Wait 5 minutes; replication resumes from the secondary DB on node 2 to the primary DB on node 1 (as it should).
S13. Wait another 5 minutes; node 1 starts using the updated policies that were originally written to the secondary DB on node 2, presumably because the primary DB on node 1 has caught up with those updates through the master-master replication configuration.
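
As a rough illustration of the manual steps above, the bash sketch below shows how one might stop and start the embedded MySQL instance on each node and confirm that replication has caught up. The service name (mysql vs. mysqld) and the login credentials are assumptions and may differ on your appliance; on newer MySQL releases SHOW REPLICA STATUS replaces SHOW SLAVE STATUS. The 5-minute waits in the test simply give the Gateway's connection pool and failover logic time to settle.

    # Illustrative only; service name and credentials are assumptions.

    # S2: stop the primary DB (run on node 1)
    service mysql stop            # or: systemctl stop mysqld

    # S7: stop the secondary DB (run on node 2)
    service mysql stop

    # S9 and S11: bring the databases back, primary first, then secondary
    service mysql start           # run on node 1
    service mysql start           # run on node 2

    # S12/S13: confirm replication has resumed and the primary has caught up.
    # Run on each node and look for Slave_IO_Running = Yes,
    # Slave_SQL_Running = Yes, and Seconds_Behind_Master = 0.
    mysql -u root -p -e 'SHOW SLAVE STATUS\G' | \
        grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'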