To tolerate temporary database failure during a production CA API Gateway implementation, the CA API Gateway appliance form factor suggests a particular master-master configuration using the embedded MySQL database. Customers want to know how a stopped database engine, combined with ongoing use of the production system, will affect the behavior of the CA API Gateway processing nodes.
Component: CA API Gateway 11.x Appliance Form Factor
This article documents a test performed by a Broadcom Support engineer to help answer the question raised by customers:
The test was performed using a two-node cluster built from the CA API Gateway 11.0 Software Form Factor:
node 1 hosts the primary DB
node 2 hosts the secondary DB
node 1 uses node 1 as its primary DB and node 2 as its secondary DB
node 2 uses node 1 as its primary DB and node 2 as its secondary DB
This configuration is based on a suggestion documented in the CA API Gateway product documentation.
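For orientation, a MySQL master-master pairing of this shape typically relies on settings like the following in each node's my.cnf. This is an illustrative sketch only; the server IDs, database name (ssg is assumed), and the exact option set the appliance ships with may differ.

```ini
# node 1 (illustrative my.cnf fragment, not the appliance's exact file)
[mysqld]
server-id                = 1
log-bin                  = mysql-bin
replicate-do-db          = ssg       # Gateway database name (assumed)
auto_increment_increment = 2         # avoid key collisions between the two masters
auto_increment_offset    = 1         # node 2 would use offset = 2

# node 2 differs only in:
#   server-id             = 2
#   auto_increment_offset = 2
```

The auto_increment pair is the standard way to keep the two masters from generating colliding primary keys while both accept writes.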
Now the test flow:
S1. everything is working
S2. shut down the DB on node 1 (primary)
S3. wait 5 minutes to allow all connections to be re-established (both nodes are now connected to the secondary DB on node 2)
S4. use Policy Manager to connect to node 2
S5. modify a policy through node 2; at this moment the policy is written into the secondary DB on node 2
S6. test with a browser against both node 1 and node 2; both respond correctly (based on the modification made to the secondary DB on node 2)
S7. shut down the DB on node 2 (now both DBs are down)
S8. wait 5 minutes and use a browser to connect to node 1 (node 2 was not tried, but presumably it would behave the same); the browser is unable to get a response from node 1 and times out <--- a node times out when no DB connection is available.
S9. start DB on node 1 (primary)
S10. wait 5 minutes to allow node 1 to reconnect to the DB on node 1; since the updates were written only to the DB on node 2, node 1 behaves the same way as it did before the updates were made <-- node 1 behaves as if no policy was ever updated, even though it was serving the updated policies from the DB on node 2 right after they were made
S11. start the DB on node 2 (secondary)
S12. wait 5 minutes and replication resumes from the secondary DB on node 2 to the primary DB on node 1 (as it should)
S13. wait 5 minutes and observe node 1 start using the updated policies originally made to the secondary DB on node 2, presumably because the primary DB on node 1 has caught up with the updates made to the secondary DB on node 2 through the master-master replication configuration.
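The node-side behavior observed in S1 through S10 can be summarized as a simple preference-ordered failover loop. The sketch below is a hypothetical model of that behavior, not the Gateway's actual implementation; the Database class and connect function are illustrative names.

```python
# Hypothetical model of the failover behavior observed in the test flow:
# a node tries the primary DB first, falls back to the secondary, and
# gets no connection (a timeout, as in S8) when both are down.

class Database:
    def __init__(self, name):
        self.name = name
        self.running = True

def connect(preference):
    """Return the first reachable DB in preference order, else None."""
    for db in preference:
        if db.running:
            return db
    return None  # corresponds to the browser timeout in S8

primary = Database("node1-db")     # primary DB on node 1
secondary = Database("node2-db")   # secondary DB on node 2
preference = [primary, secondary]  # both nodes prefer node 1's DB

print(connect(preference).name)    # S1: everything up -> node1-db
primary.running = False            # S2: shut down DB on node 1
print(connect(preference).name)    # S3: both nodes fail over -> node2-db
secondary.running = False          # S7: shut down DB on node 2
print(connect(preference))         # S8: no DB available -> None (timeout)
primary.running = True             # S9: start DB on node 1
print(connect(preference).name)    # S10: back to node1-db, which is stale
                                   # until replication catches up (S12-S13)
```

On a real cluster, the replication catch-up in S12 and S13 can be verified on the primary's replica thread with MySQL's SHOW SLAVE STATUS\G, checking that Slave_IO_Running and Slave_SQL_Running are both Yes and that Seconds_Behind_Master has reached 0.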