MySQL instance sync issue "not allowing follower to become leader"

Article ID: 297970

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

You may see the following error message:

Stdout:
2020/02/25 17:03:10 Started executing command: make-leader
2020/02/25 17:03:10 Requesting instance to be configured as leader
2020/02/25 17:03:10 Failed to promote leader: make-leader request failed: [500 Internal Server Error] Replication settings exist on this instance and Slave SQL Thread is turned off. Fix replication settings and try again.

Summary of the Issue

In some situations, for example when scaling to a larger database, the follower VM stays far behind the leader VM.

Run the following command and look for output similar to the line below:

cf nozzle -no-filter | grep 'origin:"p.mysql"' | grep seconds

origin:"p.mysql" eventType:ValueMetric timestamp:1583179694657263273 deployment:"service-instance_#############" job:"mysql" index:"1f8491bb-b4e7-4654-aa21-c52c0a247d87" ip:"10.233.8.47" tags:<key:"source_id" value:"##################" > valueMetric:<name:"/p.mysql/follower/seconds_since_leader_heartbeat" value:173162 unit:"integer" >



Environment

Product Version: 2.6

Resolution

The value of "seconds_since_leader_heartbeat" (173162 seconds, roughly two days, in the output above) shows that the follower is far behind the leader. This appears to be a lagging follower, caused by a network issue.
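
To make the size of the lag easier to read, the metric value can be pulled out of a firehose line and converted to hours. The following is a minimal sketch, not part of the product tooling: the sample line is abbreviated from the output shown in this article, and the variable names are illustrative only.

```shell
#!/bin/sh
# Sample firehose line, abbreviated from the output shown in this article.
line='origin:"p.mysql" eventType:ValueMetric job:"mysql" valueMetric:<name:"/p.mysql/follower/seconds_since_leader_heartbeat" value:173162 unit:"integer" >'

# Extract the integer that follows the heartbeat metric name.
lag_seconds=$(printf '%s\n' "$line" | sed -n 's/.*seconds_since_leader_heartbeat" value:\([0-9][0-9]*\).*/\1/p')

# Integer division; the remainder is dropped.
lag_hours=$((lag_seconds / 3600))

echo "follower lag: ${lag_seconds}s (about ${lag_hours}h)"
```

In practice the line would come from the cf nozzle pipeline instead of a hard-coded string.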

One workaround in the short term would be to relax durability on the follower.

This can be done by running the following two queries on a follower:

mysql> SET GLOBAL sync_binlog = 0;
mysql> SET GLOBAL innodb_flush_log_at_trx_commit = 2;


This should improve throughput on the follower, and the "seconds_behind_leader" metric should then either level off or start decreasing.
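
Because this workaround is meant to be short term, it can help to record the current values before changing them and to restore them once the follower has caught up. The sketch below is one hedged way to keep the three sets of statements together. The function name durability_sql is hypothetical, and the restore values (1 and 1) are the stock MySQL defaults, which is an assumption about this instance; prefer the values reported by the check step. Settings changed with SET GLOBAL do not persist across a MySQL restart.

```shell
#!/bin/sh
# Print the SQL for one phase of the workaround: check, relax, or restore.
# The restore values are the stock MySQL defaults (an assumption); use the
# values that the "check" phase reported before the change was made.
durability_sql() {
  case "$1" in
    check)
      echo "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('sync_binlog', 'innodb_flush_log_at_trx_commit');"
      ;;
    relax)
      # sync_binlog takes an integer; 0 leaves binlog flushing to the OS.
      echo "SET GLOBAL sync_binlog = 0;"
      echo "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"
      ;;
    restore)
      echo "SET GLOBAL sync_binlog = 1;"
      echo "SET GLOBAL innodb_flush_log_at_trx_commit = 1;"
      ;;
    *)
      echo "usage: durability_sql check|relax|restore" >&2
      return 1
      ;;
  esac
}

durability_sql check
```

On the follower, the output could be piped into the client, for example durability_sql relax | mysql, assuming the mysql client there is already configured with credentials for the instance.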