Unable to deploy prometheus add-on on cluster

Article ID: 391705

Updated On:

Products

VMware Telco Cloud Automation

Issue/Introduction

While deploying the Prometheus add-on, the add-on remains stuck in the processing state and the prometheus-server PVC remains in the Pending state.

Describing the prometheus-server PVC shows the following error:

tanzu-system-monitoring   prometheus-server                                                                                        Pending                                                                        vsphere-csi    24m

Warning   ProvisioningFailed       persistentvolumeclaim/prometheus-server                    failed to provision volume with StorageClass "vsphere-csi": rpc error: code = Internal desc = failed to get shared datastores in kubernetes cluster. Error: no shared datastores found for nodeVm: VirtualMachine:vm-15111 [VirtualCenterHost: esxi01.example.com, UUID: ######-####-####-####-###########, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-XYZ, VirtualCenterHost: esxi01.example.com]]
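
To see which node VM the CSI driver is reporting, the VM managed object ID can be extracted from the event text. The following is a minimal sketch that assumes the message format shown above; the function name node_vm_ids is hypothetical and not part of any VMware tooling.

```shell
# Hypothetical helper: read event text on stdin and print the unique VM
# managed object IDs (for example vm-15111) that appear in
# "no shared datastores found" errors.
node_vm_ids() {
  grep 'no shared datastores found' \
    | grep -o 'VirtualMachine:vm-[0-9]*' \
    | cut -d: -f2 \
    | sort -u
}

# Usage (on the workload cluster):
#   kubectl get events -n tanzu-system-monitoring | node_vm_ids
```

Each ID printed can then be looked up in vCenter to find the ESXi host on which that node VM is running.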

Environment

TCA 2.x, 3.x

Cause

The node VM could not access the datastore because the datastore was not mounted on the ESXi host on which the VM resides.

Resolution

Verify that every ESXi host on which the cluster node VMs reside has access to the datastore. Once datastore access has been provided to all of those hosts, retry adding the Prometheus add-on; it should then deploy successfully.
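
After the datastore access has been corrected, the PVC phase can be polled to confirm that the volume is provisioned. This is a sketch only: the function name wait_for_pvc_bound and its default attempt count and delay are assumptions, while the namespace and PVC name are taken from the output above.

```shell
# Hypothetical helper: poll the PVC phase until it is Bound or the
# attempts run out. Arguments: namespace, PVC name, attempts, delay (s).
wait_for_pvc_bound() {
  ns="${1:-tanzu-system-monitoring}"
  pvc="${2:-prometheus-server}"
  attempts="${3:-30}"   # assumed default, adjust as needed
  delay="${4:-10}"      # assumed default, adjust as needed
  i=1
  while [ "$i" -le "$attempts" ]; do
    # An empty phase means the PVC could not be read on this attempt.
    phase="$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.status.phase}' 2>/dev/null || true)"
    if [ "$phase" = "Bound" ]; then
      echo "PVC $ns/$pvc is Bound"
      return 0
    fi
    echo "attempt $i/$attempts: PVC $ns/$pvc phase is '${phase:-unknown}'"
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage (on the workload cluster):
#   wait_for_pvc_bound tanzu-system-monitoring prometheus-server
```

A non-zero return status means the PVC was still not Bound after the last attempt, in which case the events in the namespace are the next place to look.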

Additional Information

The following additional logs can be checked for errors:

Log in to the management cluster:

  • kubectl get tka -A | grep <namespace>
  • kubectl describe tka prometheus -n <namespace>

Log in to the workload cluster:

  • kubectl get apps -A | grep prometheus
  • kubectl get pkgi -A (the package install can also be described to look for errors)
  • kubectl get events -n tanzu-system-monitoring
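
The workload cluster checks above can be run together as one report. The sketch below is an illustration, not a supported tool: the function name prometheus_addon_report is hypothetical, and the event query is narrowed to Warning events sorted by time using the standard kubectl --field-selector and --sort-by options.

```shell
# Hypothetical helper: run the workload-cluster checks listed above and
# print each result under a header line.
prometheus_addon_report() {
  ns="tanzu-system-monitoring"
  echo "== apps matching prometheus =="
  kubectl get apps -A | grep prometheus || echo "(none found)"
  echo "== package installs =="
  kubectl get pkgi -A
  echo "== Warning events in $ns, oldest first =="
  kubectl get events -n "$ns" --field-selector type=Warning --sort-by=.lastTimestamp
}

# Usage (on the workload cluster):
#   prometheus_addon_report > prometheus-addon-report.txt
```

Restricting the events to type=Warning keeps ProvisioningFailed messages such as the one shown earlier easy to spot.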