VIO status stuck at RECONFIGURING after changing admin password

Article ID: 384851

Products

VMware Integrated OpenStack

Issue/Introduction

Symptoms:

  • The OpenStack deployment state is stuck at RECONFIGURING after the forgotten VIO admin password is reset.
  • viocli check health reports "Pod keystone-api unreachable".
  • The deployment was initially stuck in the STARTING state and then moved to the RECONFIGURING state.

Environment

VMware Integrated OpenStack 7.3

Cause

  • No keystone-api pod is running, so viocli check health reports "Pod keystone-api unreachable":


2024/11/05 05:17:46 Output logs to /var/log/viocli_health_check.log
2024/11/05 05:17:46 Start health check
2024/11/05 05:17:46 Check 'basic'

Name: vio health check
+-------+----------+------------------------------+
| NAME | RESULT | ALARM |
+-------+----------+------------------------------+
| basic | Alarms:1 | Pod keystone-api unreachable |
| | Passed:3 | |
+-------+----------+------------------------------+

root@photon-machine [ ~ ]# cd /var/log/viocli_health_check

time="2024/11/05 00:00:01" level=info msg="======================================= Health check start ======================================="
2024/11/05 00:00:01 basic.sh : Check start ...
2024/11/05 00:00:01 basic.sh check_depend_cmd: Passed
2024/11/05 00:00:07 basic.sh check_ssh_node: Passed
2024/11/05 00:00:07 basic.sh check_not_ready_node: Passed
2024/11/05 00:00:07 basic.sh check_ssh_pod: Run command in pod: no running pod for keystone-api
2024/11/05 00:00:07 basic.sh : Check complete.
time="2024/11/05 00:00:07" level=info msg="======================================= Health check complete ======================================="
time="2024/11/05 05:17:46" level=info msg="======================================= Health check start ======================================="
2024/11/05 05:17:46 basic.sh : Check start ...
2024/11/05 05:17:46 basic.sh check_depend_cmd: Passed
2024/11/05 05:17:51 basic.sh check_ssh_node: Passed
2024/11/05 05:17:51 basic.sh check_not_ready_node: Passed
2024/11/05 05:17:52 basic.sh check_ssh_pod: Run command in pod: no running pod for keystone-api
2024/11/05 05:17:52 basic.sh : Check complete.
time="2024/11/05 05:17:52" level=info msg="======================================= Health check complete ======================================="
viocli_health_check.log.20241105 (END)

 

Note: The preceding log excerpts are only examples. Date, time, and jobs may vary depending on your environment.
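
To list only the checks that did not pass in a health check log like the excerpt above, a small filter along these lines may help. This is a minimal sketch, not part of viocli: the function name failed_checks is hypothetical, and it assumes the log lines follow the "basic.sh check_<name>: <result>" format shown in the excerpt.

```shell
# failed_checks: hypothetical helper, not a viocli command.
# Prints every "basic.sh check_*" line whose result is not "Passed".
# Assumes the line format shown in the log excerpt above.
failed_checks() {
  # $1 = path to a health check log file
  grep -E 'basic\.sh check_[a-z_]+:' "$1" | grep -v ': Passed$' || true
}
```

For example, failed_checks /var/log/viocli_health_check/viocli_health_check.log.20241105 (the path and file name are taken from the example above and will differ in your environment).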

Resolution

The keystone helm chart could not generate job-domain-manager.yaml correctly.

  • The keystone helm release is deployed.
  • However, after the admin password change, not all components in the keystone helm chart could be updated.
  • The recommendation is to delete the helm-keystone job and allow keystone to redeploy.

 1. Check for the keystone helm release:

root@photon-machine [ ~ ]# helm list | grep keystone

keystone1           2              Fri May 31 09:35:15 2024             DEPLOYED           keystone-7.3.0+21849206 openstack

 2. Check the keystone helm job:

root@photon-machine [ ~ ]# osctl get job | grep keystone


 3. Check the controller status:

root@photon-machine [ ~ ]# osctl get po | grep openstack-controller

           openstack-controller-7db9dd5bbf-w2rcv               1/1              Running                  190d  

 4. In the output from step 2, check for an error pod whose name starts with "keystone-domain-manage".

 5. Delete the helm-keystone job. The job name below is an example; use the name shown in your own output from step 2:

root@photon-machine [ ~ ]# osctl delete job helm-keystone-keystone1-b5mgwhmvf8

 6. Wait 15 to 20 minutes, until the new keystone helm chart shows "DEPLOYED":

root@photon-machine [ ~ ]# helm list | grep keystone

 7. Check the VIO deployment status and confirm that it is in the "Running" state:

root@photon-machine [ ~ ]# viocli get deployment
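
The waiting step above can be sketched as a polling loop instead of rerunning helm list by hand. This is only an illustration under stated assumptions: release_status and wait_for_deployed are hypothetical helper names, and the status is assumed to be the eighth whitespace-separated field, as in the example helm list output above. The default timeout of 1800 seconds and the 60-second interval are arbitrary choices.

```shell
# Hypothetical helpers, not part of helm or viocli.

# release_status: reads "helm list" output on stdin and prints the status
# of the named release. Assumes the column layout of the example output
# above (name, revision, five date fields, status, chart, namespace).
release_status() {
  awk -v name="$1" '$1 == name { print $8 }'
}

# wait_for_deployed: polls until the release reports DEPLOYED.
# $1 = release name, $2 = timeout in seconds, $3 = poll interval in seconds.
# Returns 0 once the release is DEPLOYED, 1 if the timeout is reached first.
wait_for_deployed() {
  release="$1"
  timeout="${2:-1800}"
  interval="${3:-60}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    status=$(helm list -a | release_status "$release")
    if [ "$status" = "DEPLOYED" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}
```

For example, wait_for_deployed keystone1 1200 60 && viocli get deployment runs the deployment check only after the release is reported as DEPLOYED.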

 

 

Additional Information

  • If, 20 to 30 minutes after step 5, openstack-keystone01 in 'helm list -a' is still not "DEPLOYED" and there is an error pod whose name starts with "keystone-domain-manage", delete the openstack-keystone01 helm chart.
  • In this case the redeployment still failed, so the keystone1 helm chart was purged and the job was deleted again.
  • A fresh keystone1 helm chart is then deployed.

root@photon-machine [ ~ ]# helm delete --purge --no-hooks keystone1

  • Wait 15 to 20 minutes, until the new keystone helm chart shows "DEPLOYED".
  • Check the VIO deployment status and confirm that it is in the "Running" state:

root@photon-machine [ ~ ]# viocli get deployment
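
The check for an error pod whose name starts with "keystone-domain-manage" can be sketched as a filter over the pod listing. This is a rough illustration, not a supported tool: error_pods is a hypothetical helper name, and it assumes that osctl get po prints kubectl-style columns with the pod name in the first field and the status in the third.

```shell
# error_pods: hypothetical helper, not part of osctl.
# Reads a pod listing on stdin and prints the name and status of every pod
# whose name starts with the given prefix and whose status is neither
# Running nor Completed. Assumes NAME is field 1 and STATUS is field 3.
error_pods() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 && $3 != "Running" && $3 != "Completed" {
      print $1, $3
    }'
}
```

For example, osctl get po | error_pods keystone-domain-manage prints nothing when no matching pod is in an error state.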