The follower in my leader-follower service instance no longer syncs. The "inspect" errand output looks healthy, except that the follower's GTID falls farther and farther behind the leader's.
The follower's mysql.err.log contains an error like the following:
2020-04-23T20:18:04.236979Z 72810 [ERROR] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Error_code: 1236
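A quick way to check a log for this specific condition is to search for the error code together with the "purged binary logs" text. The following is a minimal sketch; the function name `needs_jumpstart` is hypothetical, and the pattern is based only on the sample message above.

```shell
# Hypothetical helper: exits 0 if the log text on stdin contains the
# "purged binary logs" form of replication error 1236, and non-zero otherwise.
needs_jumpstart() {
    grep -Eq 'Got fatal error 1236 .*purged binary logs containing GTIDs'
}
```

It could be used as `needs_jumpstart < mysql.err.log && echo "follower needs a jumpstart"`, run from the directory that holds the follower's log.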
This error indicates that the leader has already purged binary logs that the follower still needs, so the follower cannot catch up on its own and must be jumpstarted to re-sync with the leader. To jumpstart a follower:
1. Run the "inspect" errand for the leader-follower service instance deployment to identify the follower.
$ bosh -d service-instance_f15ab82b-311d-4b32-b2a1-e14ad18d08fb run-errand inspect
Using environment '10.193.93.11' as user 'director'
Using deployment 'service-instance_f15ab82b-311d-4b32-b2a1-e14ad18d08fb'

Task 966

Task 966 | 15:30:24 | Preparing deployment: Preparing deployment
Task 966 | 15:30:25 | Warning: Executing errand on multiple instances in parallel. Use the `--instance` flag to run the errand on a single instance.
Task 966 | 15:30:25 | Preparing deployment: Preparing deployment (00:00:01)
Task 966 | 15:30:25 | Running errand: mysql/ed91fa04-8c19-46d1-8bad-6b5bb7f363f5 (1)
Task 966 | 15:30:26 | Running errand: mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0)
Task 966 | 15:30:27 | Running errand: mysql/ed91fa04-8c19-46d1-8bad-6b5bb7f363f5 (1) (00:00:02)
Task 966 | 15:30:27 | Running errand: mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0) (00:00:01)
Task 966 | 15:30:27 | Fetching logs for mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0): Finding and packing log files
Task 966 | 15:30:27 | Fetching logs for mysql/ed91fa04-8c19-46d1-8bad-6b5bb7f363f5 (1): Finding and packing log files
Task 966 | 15:30:29 | Fetching logs for mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0): Finding and packing log files (00:00:02)
Task 966 | 15:30:29 | Fetching logs for mysql/ed91fa04-8c19-46d1-8bad-6b5bb7f363f5 (1): Finding and packing log files (00:00:02)

Task 966 Started Fri May 1 15:30:24 UTC 2020
Task 966 Finished Fri May 1 15:30:29 UTC 2020
Task 966 Duration 00:00:05
Task 966 done

Instance mysql/a66703f6-9958-45ef-9373-ae87850ce6d4
Exit Code 0
Stdout 2020/05/01 15:30:26 Started executing command: inspect
2020/05/01 15:30:26
IP Address: 10.193.93.151
Role: follower
Super Read Only: true
Read Only: true
Replication Configured: true
Replication Mode: async
Has Data: true
GTID Executed: 70c9d29f-8646-11ea-b57a-005056acf44c:1-120451
2020/05/01 15:30:26 Successfully executed command: inspect
Stderr -

Instance mysql/ed91fa04-8c19-46d1-8bad-6b5bb7f363f5
Exit Code 0
Stdout 2020/05/01 15:30:26 Started executing command: inspect
2020/05/01 15:30:26
IP Address: 10.193.93.152
Role: leader
Super Read Only: false
Read Only: false
Replication Configured: false
Replication Mode: async
Has Data: true
GTID Executed: 70c9d29f-8646-11ea-b57a-005056acf44c:1-120451
2020/05/01 15:30:26 Successfully executed command: inspect
Stderr -

2 errand(s)
Succeeded
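Picking the follower out of the errand output by eye is error-prone when the instance GUIDs look alike. The sketch below pairs each "Instance" line with the "Role:" line that follows it. The function name `instance_with_role` is hypothetical, and it assumes the errand prints each field on its own line, as the BOSH CLI does in a terminal.

```shell
# Hypothetical helper: reads `bosh run-errand inspect` output on stdin and
# prints the instance (mysql/<guid>) whose reported Role matches the argument.
# Assumes every "Instance" line precedes that instance's "Role:" line.
instance_with_role() {
    wanted="$1"
    awk -v wanted="$wanted" '
        $1 == "Instance" { instance = $2 }                 # remember the current instance
        $1 == "Role:" && $2 == wanted { print instance }   # emit it when the role matches
    '
}
```

For example, piping the errand output into `instance_with_role follower` prints the instance name to use in the steps that follow.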
2. SSH into the follower VM, switch to root, and stop the mysql process. Use "monit summary" to confirm that mysql is no longer running.
$ bosh -d service-instance_f15ab82b-311d-4b32-b2a1-e14ad18d08fb ssh mysql/a66703f6-9958-45ef-9373-ae87850ce6d4
[...]
$ sudo su -
# monit stop mysql
# monit summary
The Monit daemon 5.2.5 uptime: 6d 23h 23m
Process 'loggregator_agent' running
Process 'mysql' not monitored
Process 'mysql-agent' running
Process 'mysql-metrics' running
Process 'service-backup' running
Process 'streaming-mysql-backup-tool' running
Process 'bosh-dns' running
Process 'bosh-dns-resolvconf' running
Process 'bosh-dns-healthcheck' running
Process 'system-metrics-agent' running
System 'system_localhost' running
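Because the data directory is deleted next, it may be worth confirming programmatically that mysql really has stopped. The sketch below checks the "monit summary" output for the exact status shown above. The function name `mysql_is_stopped` is hypothetical, and any other status, including a transitional one, is treated as "not stopped yet".

```shell
# Hypothetical helper: reads `monit summary` output on stdin and exits 0 only
# if the line for the 'mysql' process reports exactly "not monitored".
mysql_is_stopped() {
    awk -v q="'" '
        $1 == "Process" && $2 == q "mysql" q {
            # Rebuild the status text from the remaining fields.
            status = $3
            for (i = 4; i <= NF; i++) status = status " " $i
            found = (status == "not monitored")
        }
        END { exit (found ? 0 : 1) }
    '
}
```

On the follower VM this could be run as `monit summary | mysql_is_stopped && echo "safe to continue"`.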
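3. Delete the follower's MySQL data directory. Make sure you are on the follower VM, not the leader.
# rm -rf /var/vcap/store/mysql/data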
4. Run the mysql job's pre-start script.
# /var/vcap/jobs/mysql/bin/pre-start
Fri May 1 15:42:11 UTC 2020 waiting for bosh-dns to initialize
Fri May 1 15:42:11 UTC 2020 bosh-dns is ready
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/tmp/mysql.sock' exists!
bosh-dns-healthcheck is ready
------------ STARTING mysql_ctl at Fri May 1 15:42:11 UTC 2020 --------------
------------ STARTING mysql_ctl at Fri May 1 15:42:11 UTC 2020 --------------
Total memory in bytes: 8364748800
5. Exit the follower and run the "inspect" errand again. At this point the former follower's "Role" appears as "unknown".
6. Run the "configure-leader-follower" errand against the former follower to re-establish replication from the leader.
$ bosh -d service-instance_f15ab82b-311d-4b32-b2a1-e14ad18d08fb run-errand configure-leader-follower --instance=mysql/a66703f6-9958-45ef-9373-ae87850
Using environment '10.193.93.11' as user 'director'
Using deployment 'service-instance_f15ab82b-311d-4b32-b2a1-e14ad18d08fb'

Task 980

Task 980 | 15:52:46 | Preparing deployment: Preparing deployment
Task 980 | 15:52:47 | Warning: Executing errand on multiple instances in parallel. Use the `--instance` flag to run the errand on a single instance.
Task 980 | 15:52:47 | Preparing deployment: Preparing deployment (00:00:01)
Task 980 | 15:52:47 | Running errand: mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0) (00:00:27)
Task 980 | 15:53:14 | Fetching logs for mysql/a66703f6-9958-45ef-9373-ae87850ce6d4 (0): Finding and packing log files (00:00:02)

Task 980 Started Fri May 1 15:52:46 UTC 2020
Task 980 Finished Fri May 1 15:53:16 UTC 2020
Task 980 Duration 00:00:30
Task 980 done

Instance mysql/a66703f6-9958-45ef-9373-ae87850ce6d4
Exit Code 0
Stdout 2020/05/01 15:52:47 Started executing command: configure-leader-follower
2020/05/01 15:53:14 Leader: q-m53n2s0.q-g181.bosh, Follower: q-m52n2s0.q-g181.bosh
2020/05/01 15:53:14 Successfully executed command: configure-leader-follower
Stderr -

1 errand(s)
Succeeded
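To confirm that the follower is catching up, the "GTID Executed" values from a fresh "inspect" run can be compared. The following is a minimal sketch with hypothetical function names. It assumes the simple GTID set shape seen in the output above, a single source UUID with one range starting at 1, and rejects anything else rather than guessing.

```shell
# Hypothetical helper: prints <n> for a GTID set of the form <uuid>:1-<n>.
# Prints nothing for sets with several UUIDs or several ranges.
gtid_high_water_mark() {
    printf '%s\n' "$1" | sed -n 's/^[0-9a-fA-F-]*:1-\([0-9][0-9]*\)$/\1/p'
}

# Hypothetical helper: prints how many transactions the follower is behind.
# Usage: gtid_lag <leader GTID Executed> <follower GTID Executed>
gtid_lag() {
    leader_set="$1"
    follower_set="$2"
    # Both sets must come from the same source UUID to be comparable here.
    if [ "${leader_set%%:*}" != "${follower_set%%:*}" ]; then
        echo "GTID sets have different source UUIDs" >&2
        return 1
    fi
    leader_n=$(gtid_high_water_mark "$leader_set")
    follower_n=$(gtid_high_water_mark "$follower_set")
    if [ -z "$leader_n" ] || [ -z "$follower_n" ]; then
        echo "unsupported GTID set format" >&2
        return 1
    fi
    echo $((leader_n - follower_n))
}
```

A result of 0, or a value that shrinks across successive "inspect" runs, suggests replication is working again. A value that keeps growing suggests the follower is still not syncing.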