All mariadb-server pods not starting

Article ID: 400737

Products

VMware Integrated OpenStack

Issue/Introduction

  • Deployment status is Starting:
    OpenStack Deployment State: STARTING
  • Not all mariadb-server pods are ready:
    SERVICE        CONTROLLER                       READY   FAILURES
    ...
    mariadb        mariadb-server                    2/3       -
    ...
  • Describing the mariadb-server pod that is not running shows events similar to the following:
    Events:
      Type     Reason     Age                      From                            Message
      ----     ------     ----                     ----                            -------
      Warning  Unhealthy  2m34s (x30510 over 10d)  kubelet, controller-############  Readiness probe failed:
  • Logs for that mariadb-server pod show entries similar to the following:
    2025-06-09 06:37:35,301 - OpenStack-Helm Mariadb - INFO - Checking to see if cluster data is fresh
    2025-06-09 06:37:35,304 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is too old to make a decision for node mariadb-server-0
    2025-06-09 06:37:35,304 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is ok for node mariadb-server-1
    2025-06-09 06:37:35,305 - OpenStack-Helm Mariadb - INFO - The data we have from the cluster is too old to make a decision for node mariadb-server-2

  • The leader node recorded in the ConfigMap is the mariadb-server pod that is not running.
    For example:
    #osctl get cm mariadb1-mariadb-state -oyaml
    ...
    openstackhelm.openstack.org/cluster.state: live
    openstackhelm.openstack.org/leader.expiry: "2025-01-01T12:49:54.353900Z"
    openstackhelm.openstack.org/leader.node: mariadb-server-1
    openstackhelm.openstack.org/reboot.node: mariadb-server-0
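
To confirm this symptom quickly, the leader annotation can be pulled out of the ConfigMap output and compared with the pod that is failing its readiness probe. The following is a minimal sketch, not a supported tool: the function name leader_node is hypothetical, and it assumes the annotation appears on its own line in the "key: value" form shown in the example above.

```shell
# Hypothetical helper: print the leader.node annotation value from
# ConfigMap YAML supplied on stdin.
# Assumes the "key: value" layout shown in the example above.
leader_node() {
  awk '$1 == "openstackhelm.openstack.org/leader.node:" { print $2; exit }'
}

# Usage on the manager (same command as in the example above):
#   osctl get cm mariadb1-mariadb-state -oyaml | leader_node
```

If the name printed matches the pod that is not ready, the environment matches the condition described in this article.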

Environment

7.3

Cause

The configuration lists as cluster leader a node that does not have the highest sequence number, so that node can neither become leader of the cluster nor join the existing cluster.

Resolution

Note: Make sure you have valid, current backups before proceeding. Without them, the steps below can result in data loss.

In this scenario, the listed leader node is less up to date than the running cluster. If that node became the cluster leader, the other two nodes would update from it, resulting in data loss that may not be recoverable.

If you do not have a current valid backup, back up the database from one of the currently running nodes. The example uses mariadb-server-0; substitute the name of a running pod. The dump file is stored in /tmp on the pod.

  1. osctl exec -it mariadb-server-0 bash
  2. mysqldump --defaults-file=/etc/mysql/admin_user.cnf --host=localhost  --all-databases > /tmp/<backup name>.sql
  3. exit
  4. osctl cp openstack/mariadb-server-0:/tmp/<backup name>.sql /tmp/<backup name>.sql    (copies the backup to the manager for safekeeping)

If you run into any problems with the backup, contact Broadcom Support and do not proceed.
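
Before relying on the dump, it may be worth a basic sanity check that the file is neither empty nor truncated. The sketch below assumes mysqldump was run with its default comment output, in which case the end of the file contains a line starting with "-- Dump completed". The function name dump_looks_complete is hypothetical, and passing this check does not prove that the backup can be restored.

```shell
# Hypothetical helper: succeed only if the dump file is non-empty and its
# last few lines contain the "-- Dump completed" trailer that mysqldump
# writes by default (assumption: comments were not disabled).
dump_looks_complete() {
  [ -s "$1" ] && tail -n 5 "$1" | grep -q '^-- Dump completed'
}

# Usage on the manager after the copy step:
#   dump_looks_complete "/tmp/<backup name>.sql" && echo "dump trailer found"
```

If the trailer is missing, treat the backup as suspect and do not proceed.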


Correcting the issue:

  1. Update the leader node annotation so that it points to a running node:
    #osctl annotate --overwrite cm mariadb1-mariadb-state openstackhelm.openstack.org/leader.node='mariadb-server-x'

    Note: Replace x with the number of one of the running nodes

  2. Restart the non-running node:
    #osdel pod mariadb-server-y

    Note: Replace y with the non-running node

  3. Validate that the node is now running and ready:
    #viocli get deployment
    ...
    mariadb        mariadb-server                    3/3
    ...
    OpenStack Deployment State: RUNNING
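
The validation step can also be scripted. The sketch below lists any controller whose READY count is below its desired count. It assumes the column layout (SERVICE, CONTROLLER, READY, FAILURES) shown in the sample output earlier in this article, and the function name not_ready is hypothetical.

```shell
# Hypothetical helper: read deployment status text on stdin and print each
# controller whose READY column (ready/desired) shows fewer ready pods
# than desired. Lines without a ready/desired third column are ignored.
not_ready() {
  awk 'split($3, r, "/") == 2 && r[1] + 0 < r[2] + 0 { print $2 }'
}

# Usage on the manager; no output means every listed controller is fully ready:
#   viocli get deployment | not_ready
```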