Replication breaks after applying a Patch to the Gateway

Article ID: 73759

Products

CA API Gateway

Issue/Introduction

  • If replication is enabled, it can in some cases break after the SSG service is rebooted. The error message is similar to the following:
    • [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'ssgbin-log.001155' at 367782067, the last event read from '/var/lib/mysql/ssgbin-log.001155' at 4, the last byte read from '/var/lib/mysql/ssgbin-log.001155' at 4.', Error_code: 1236
  • The message above indicates that the master server crashed or was rebooted before the binary log events were synchronized to disk. This usually happens when sync_binlog is not set to 1 on the master. With sync_binlog=1, the binary log is synchronized to disk after every commit. With a higher value, the log is synchronized only after that many commit groups, so the most recent events can be lost when the server goes down. The Gateway sets sync_binlog=16 in /etc/my.cnf by default for performance reasons, and a reboot usually follows the application of a patch. If the slave is not stopped before that reboot, the replication failure above can occur, and replication then has to be reinitialized.
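
To confirm this diagnosis, the following statements can be run in the MySQL client on the master node. This is a minimal sketch, assuming the binary log file named in the error (ssgbin-log.001155 in the example above) still exists on the master.

```sql
-- Show the current binary log sync setting; the article states the
-- Gateway default in /etc/my.cnf is 16.
SHOW VARIABLES LIKE 'sync_binlog';

-- List the binary log files with their sizes. If File_size for the file
-- named in the error is smaller than the position the slave requested
-- (367782067 in the example), the events past that size never reached disk.
SHOW BINARY LOGS;
```

A requested position larger than the file size on the master is consistent with the "impossible position" wording of error 1236.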


Environment

Gateway 11.x

Resolution

  1. Before any reboot, run the following commands in the MySQL client on both database nodes:
    1. stop slave;
    2. flush logs;

  2. After the reboot, run the following command on both database nodes to resume replication:
    1. start slave;
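
The sequence above, with a status check added, can be sketched as follows. The SHOW SLAVE STATUS check is an addition for illustration and is not part of the original procedure; the statements are assumed to be run in the MySQL client on each database node.

```sql
-- Before the reboot: stop the replication threads, then close the
-- current binary log file and open a new one.
STOP SLAVE;
FLUSH LOGS;

-- (reboot the node here)

-- After the reboot: resume replication.
START SLAVE;

-- Optional check (not in the original steps): Slave_IO_Running and
-- Slave_SQL_Running should both report Yes, and Last_IO_Errno should be 0.
SHOW SLAVE STATUS\G
```

If Last_IO_Errno reports 1236 after the restart, replication has hit the failure described in this article and needs to be reinitialized.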