All mariadb-server pods not starting

Products

VMware Integrated OpenStack

Issue/Introduction

MariaDB is running on only 2 of 3 containers.
The deployment status is STARTING:
OpenStack Deployment State: STARTING
Not all mariadb-server pods are ready:
SERVICE CONTROLLER READY FAILURES
...
mariadb mariadb-server 2/3 -
...
Describing the mariadb-server pod that isn't running shows events similar to the following:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 2m34s (x30510 over 10d) kubelet, controller-############ Readiness probe failed:
Logs for that mariadb-server pod show messages similar to the following:
2025-06-09 06:37:35,301 - OpenStack-Helm Mariadb - INFO - Checking to see if cluster data is fresh
2025-06-09 06:37:35,304 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is too old to make a decision for node mariadb-server-0
2025-06-09 06:37:35,304 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is ok for node mariadb-server-1
2025-06-09 06:37:35,305 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is too old to make a decision for node mariadb-server-2
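When the log is long, it can help to reduce these messages to one line per node. The following is a minimal sketch rather than a product tool: node_freshness is a hypothetical helper name, and it assumes the log messages keep the exact wording shown above.

```shell
# Hypothetical helper: reads mariadb-server pod log text on stdin and prints
# "<node> stale" or "<node> ok" for each cluster-data freshness message.
# Assumes the message wording shown in the sample log above.
node_freshness() {
  sed -n \
    -e 's/.*data we have from the cluster is too old to make a decision for node \([^ ]*\).*/\1 stale/p' \
    -e 's/.*data we have from the cluster is ok for node \([^ ]*\).*/\1 ok/p'
}
```

The pod log can be piped into the function, for example osctl logs mariadb-server-y | node_freshness | sort -u, assuming osctl accepts the logs subcommand in the same way as the other kubectl-style subcommands used in this article.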
The leader node recorded in the mariadb state ConfigMap is the mariadb-server pod that isn't running.
For example:
#osctl get cm mariadb1-mariadb-state -oyaml
...
openstackhelm.openstack.org/cluster.state: live
openstackhelm.openstack.org/leader.expiry: "2025-01-01T12:49:54.353900Z"
openstackhelm.openstack.org/leader.node: mariadb-server-1
openstackhelm.openstack.org/reboot.node: mariadb-server-0
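To read a single annotation without scanning the full YAML, a small filter can be used. This is a sketch under assumptions: state_annotation is a hypothetical helper name, and it expects each annotation on its own line as in the output above.

```shell
# Hypothetical helper: prints the value of one openstackhelm.openstack.org/*
# annotation from ConfigMap YAML read on stdin.
# $1 is the annotation suffix, for example leader.node or reboot.node.
state_annotation() {
  awk -v key="openstackhelm.openstack.org/$1:" '
    # Match only lines whose first field is exactly the annotation key,
    # then strip any surrounding double quotes from the value.
    $1 == key { gsub(/"/, "", $2); print $2 }
  '
}
```

For example, osctl get cm mariadb1-mariadb-state -oyaml | state_annotation leader.node prints the recorded leader, which can then be compared with the pod that is not ready.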

Environment

7.3

Cause

The mariadb state ConfigMap records as cluster leader a node that does not have the highest sequence number. That node can neither become leader of the cluster nor join the existing cluster.

Resolution

Note: Make sure valid, current backups exist before proceeding. Without them, this procedure can result in data loss.

In this scenario, the listed leader node is not as up to date as the running cluster. If that node were to become the cluster leader, the other 2 nodes would update from it, resulting in data loss that may not be recoverable.

If there is no current valid backup, back up the database first. Use one of the currently running nodes (mariadb-server-0 in this example). The dump is stored in /tmp on the pod and then copied to the manager for safekeeping.

osctl exec -it mariadb-server-0 bash
mysqldump --defaults-file=/etc/mysql/admin_user.cnf --host=localhost --all-databases > /tmp/<backup name>.sql
exit
osctl cp openstack/mariadb-server-0:/tmp/<backup name>.sql /tmp/<backup name>.sql
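Before continuing, the copied dump can be given a quick sanity check. The sketch below is not a substitute for a restore test. It relies on the fact that mysqldump, with its default comment settings, ends a successful dump with a line beginning "-- Dump completed"; dump_is_complete is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the dump file is non-empty and its
# last line is the "-- Dump completed" trailer.
# Assumes mysqldump ran with comments enabled, which is its default.
dump_is_complete() {
  [ -s "$1" ] && tail -n 1 "$1" | grep -q '^-- Dump completed'
}
```

Run it on the manager against the copied file, for example dump_is_complete "/tmp/<backup name>.sql" && echo "dump looks complete". A missing trailer usually means the dump was interrupted and should be taken again.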

If you run into any problems with the backup step, contact Broadcom Support and do not proceed.

Correction of the issue.

The leader node needs to be updated:
#osctl annotate --overwrite cm mariadb1-mariadb-state openstackhelm.openstack.org/leader.node='mariadb-server-x'

Note: Replace x with the number of any of the running nodes.
Restart the non-running node:
#osdel pod mariadb-server-y

Note: Replace y with the number of the non-running node.
Validate that the node is now running and ready:
#viocli get deployment
...
mariadb mariadb-server 3/3
...
OpenStack Deployment State: RUNNING
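The readiness check can also be scripted. This is a minimal sketch, assuming the column layout shown above (SERVICE, CONTROLLER, READY); mariadb_all_ready is a hypothetical helper name.

```shell
# Hypothetical helper: reads `viocli get deployment` output on stdin and
# succeeds only if the mariadb-server row reports READY as N/N.
# Assumes the READY value is the third whitespace-separated column.
mariadb_all_ready() {
  awk '$2 == "mariadb-server" {
         split($3, r, "/")      # "2/3" -> r[1]="2", r[2]="3"
         found = 1
         ok = (r[1] == r[2])
       }
       END { exit !(found && ok) }'
}
```

For example, viocli get deployment | mariadb_all_ready && echo "mariadb-server is fully ready". The function returns a non-zero status if the row is missing or the counts differ.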

Additional Information

The reboot.node annotation can also be changed in the same way:
#osctl annotate --overwrite cm mariadb1-mariadb-state openstackhelm.openstack.org/reboot.node='mariadb-server-x'