MySQL container fails to start because Group Replication does not skip events that have its own server ID

Article ID: 297312

Updated On:

Products

VMware Tanzu MySQL

Issue/Introduction

The customer is running MySQL for K8S v1.5.0-mysql-8.0.28 on a TKGI cluster. One pod of the MySQL instance, mysql-1, failed to start:

pod/mysql-0      3/3      Running     0                      42d
pod/mysql-1      2/3      Running     5 (2m17s ago)          7m31s
pod/mysql-2      3/3      Running     3 (26d ago)            42d


The MySQL log shows the following error messages:

2023-01-16T09:02:30.174559Z 20 [ERROR] [MY-010411] [Repl] Transaction's sequence number is inconsistent with that of a preceding one: sequence_number (2) <= previous sequence_number (197015)
2023-01-16T09:02:30.174639Z 20 [Warning] [MY-010584] [Repl] Slave SQL for channel 'group_replication_applier': Coordinator thread of multi-threaded slave is being stopped in the middle of assigning a group of events; deferring to exit until the group completion ... , Error_code: MY-000000

This appears to match the MySQL bug "Parallel replication always fails with specific workload from sysbench" (https://bugs.mysql.com/bug.php?id=89375). Internally, Group Replication always uses replicate_same_server_id, so it does not skip events that carry its own server ID, and a cluster can trip over this behavior in older MySQL versions.
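
To confirm that a pod is hitting this error, its log can be searched for the MY-010411 code shown above. The following is a minimal sketch, not an official procedure: the helper name has_sequence_error is hypothetical, and the container name mysql in the usage comment is an assumption about the pod layout.

```shell
# Return 0 if the log text on stdin contains the MY-010411
# "sequence number is inconsistent" error, non-zero otherwise.
has_sequence_error() {
  grep -q 'MY-010411'
}

# Example usage (pod name taken from this case; the container name
# "mysql" is an assumption):
# kubectl -n <namespace> logs mysql-1 -c mysql | has_sequence_error && echo "affected"
```

The check reads from standard input, so it can also be run against a log file that was saved earlier.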

Environment

Product Version: 1.5

Resolution

To fix the issue, delete the PVC and the pod of the affected MySQL node.

1. Check the current relationship between the PVCs, PVs and pods:
kubectl get pv -n <namespace>
kubectl get pvc -n <namespace>
2. Fix the issue by deleting the old PVC, the pod and the old PV:
kubectl -n <namespace> delete pvc mysql-data-<pod_name> --wait=false
kubectl -n <namespace> delete pod <pod_name>
kubectl -n <namespace> delete pv pvc-<xxx> # confirm the old PV name from step 1 first
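
The steps above can be sketched as a small helper that prints the three delete commands for review before anything is run. This is only an illustration under stated assumptions: the function name cleanup_commands and the namespace my-ns are hypothetical, and the PV lookup in the usage comment relies on the PVC's spec.volumeName field holding the name of the bound PV.

```shell
# Print the cleanup commands for one affected pod so they can be
# reviewed first. Nothing is deleted by this function.
# Arguments: namespace, pod name, PV name (all supplied by the caller).
cleanup_commands() {
  ns="$1"; pod="$2"; pv="$3"
  echo "kubectl -n $ns delete pvc mysql-data-$pod --wait=false"
  echo "kubectl -n $ns delete pod $pod"
  echo "kubectl -n $ns delete pv $pv"
}

# Example usage (hypothetical namespace "my-ns"; the PV name is read
# from the PVC before the PVC is deleted):
# pv=$(kubectl -n my-ns get pvc mysql-data-mysql-1 -o jsonpath='{.spec.volumeName}')
# cleanup_commands my-ns mysql-1 "$pv"
```

Reading the PV name from the PVC before deleting anything avoids having to match the pvc-<xxx> name by eye in the step 1 output.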