Reset/Reinitialize replication in Gateway with MySQL 8.4

Products

CA API Gateway

Issue/Introduction

This procedure will explain how to reinitialize failed replication in a multi-node cluster during a maintenance window.

The API Gateway uses MySQL replication to provide database failover and availability should one Gateway appliance or database server become unavailable or degraded. MySQL replication will ensure that a duplicated copy of a database object is maintained in one or more locations. The Gateway uses a master-master implementation in a multi-node environment to ensure that database changes to one host are replicated to the other database host.

MySQL replication is capable of repairing itself if the other database node is unavailable. Replication can repair itself after a node in the cluster experiences a graceful shutdown, MySQL server shutdown, or a network outage. Replication will fail if a data consistency error is experienced.

The following data may be visible when running the SHOW REPLICA STATUS query against the local MySQL database:

Slave_IO_Running: No
Slave_SQL_Running: No

The following log entries may be present in the Gateway log files:

2380 WARNING "Error accessing host/database
2381 WARNING "Replication failing for host/database

For Gateway with MySQL version 8.0 please see this KB document.

Environment

Gateway 11.2

MySQL 8.4

Resolution

The following procedure should be executed on Gateway 11.x with MySQL 8.4 in order to reinitialize replication in a multi-node cluster:

Stop the API Gateway 'ssg' service on Primary/Secondary AND all PROCESSING nodes: service ssg stop
Backup the database on the PRIMARY node first as a precaution: mysqldump --all-databases --set-gtid-purged=OFF | gzip > ~/all_databases_`date '+%Y%m%d_%T'`.sql.gz
- Make sure that the last line of the newly created SQL file contains "-- Dump completed <date-time>"
- Store the file for future use in case the database needs to be restored
- Consider also taking a VM snapshot if possible to most easily restore a virtual appliance
Stop slave replication on both nodes: mysqladmin stop-replica
Reset the master configuration on both nodes: mysql -e "RESET BINARY LOGS AND GTIDS"
Reset the slave configuration on both nodes: mysql -e "reset replica; reset replica all"
Execute the create_replica.sh script on the SECONDARY node: /opt/SecureSpan/Appliance/bin/create_replica.sh
- Follow the prompts of the script, keeping the following in mind:
- Provide the FQDN of the primary node
- Enter yes to the prompt "Do you want to clone a database from $Master (yes or no)?"
On the secondary node after step 6 is completed, reset the master configuration one more time: mysql -e "RESET BINARY LOGS AND GTIDS"
Execute the create_replica.sh script on the PRIMARY node: /opt/SecureSpan/Appliance/bin/create_replica.sh
- Follow the prompts of the script, keeping the following in mind:
- Provide the FQDN of the secondary node
- Enter no to the prompt "Do you want to clone a database from $Master (yes or no)?"
Start the Gateway 'ssg' service on Primary/Secondary AND all PROCESSING nodes: service ssg start
Query the status of the replication on both nodes: mysql -e "show replica status\G"
Verify both nodes return the following lines:
- Slave_IO_Running: Yes
- Slave_SQL_Running: Yes
- Seconds_Behind_Master: 0

Replication should be re-initialized at this point. The above output indicates that the master/slave relationship is functioning.

Additional Information

For gateway 11.x with MySQL 8.0, please use this KB document