Replication breaks after applying a Patch to the Gateway
Article ID: 73759
Products
CA API Gateway
Issue/Introduction
If replication is enabled, it can sometimes break after a reboot of the SSG service. The error message is similar to the following:
[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'ssgbin-log.001155' at 367782067, the last event read from '/var/lib/mysql/ssgbin-log.001155' at 4, the last byte read from '/var/lib/mysql/ssgbin-log.001155' at 4.', Error_code: 1236
The message above indicates that the master server crashed or rebooted before its binary log events were synchronized to disk, so the slave requests a log position that the master does not have on disk. This usually happens when sync_binlog is not set to 1 on the master. sync_binlog=1 synchronizes the binary log to disk after every commit. The Gateway sets sync_binlog=16 in /etc/my.cnf by default for performance reasons, and a reboot usually follows the application of a patch. If the slave is not stopped before the reboot, replication can fail with the error above and then has to be reinitialized.
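To confirm which setting a node is actually running with, the value can be read from the MySQL prompt. This is a minimal sketch that assumes you are logged in to the mysql client on the database node with an account that may read server variables; the expected value of 16 comes from the default described above and may differ if /etc/my.cnf was changed.

```sql
-- Show the sync_binlog value currently in effect on this node.
-- A value of 1 syncs the binary log to disk after every commit;
-- the default described in this article is 16.
SHOW GLOBAL VARIABLES LIKE 'sync_binlog';

-- Show the binary log file and position the master is currently writing,
-- which can be compared with the file and position named in the error.
SHOW MASTER STATUS;
```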
Environment
Gateway 11.x
Resolution
Before any reboot, run the following commands on both database nodes:
stop slave;
flush logs;
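Before rebooting, it may be worth checking that the slave threads really have stopped. The statement below is a sketch of such a check, assuming the same mysql client session used for the commands above; it is not part of the documented procedure.

```sql
-- After "stop slave;", both Slave_IO_Running and Slave_SQL_Running
-- should be reported as "No" in the output.
SHOW SLAVE STATUS\G
```

If either thread is still reported as running, the stop slave command has not taken effect on that node and should be repeated before the reboot.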
After reboot, run the following command on both database nodes to resume the replication: