DSM Provisioned DB displays "Critical Status" Alert and does not come Ready when created Using VCF Automation 9.0.1

Article ID: 412306


Updated On:

Products

VMware Data Services Manager

Issue/Introduction

Symptoms:

  • A VCF Automation provisioned DSM database instance has been deployed

  • From the VCF Automation Tenant Portal:
    • Database instance may show Alert Status as Critical 

    • Database instance Status may show InProgress 

    • Last Status Condition may show NotOperational 
      (click on ' i ' icon for status condition details)

      message: >-
        database node xxxxxx-0 (unknown) is not ready (unknown) | database
        node xxxxxx-monitor-0 (monitor) is ready
      observedGeneration: 1
      reason: NotOperational

  • After connecting to a database cluster node using the procedure SSH to VKS Cluster Nodes and executing sudo su on the machine, the following symptoms may be observed on the database pod(s):

     
    • The database pod(s) are in Pending state

      kubectl get pods -A --kubeconfig=/etc/kubernetes/admin.conf

      NAMESPACE     NAME                    READY   STATUS    RESTARTS
      <namespace>   <db-pod-instance>-0     0/4     Pending   0



    • Describing the database pods may show the following status

kubectl describe pod -n <namespace> <db-pod-instance>-0 --kubeconfig=/etc/kubernetes/admin.conf

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  67s (x558 over 1h)  default-scheduler  0/1 nodes are available: 1 Insufficient memory.
           preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

 

    •  vmware-system-vksm namespace may be present on the database node(s)

      kubectl get namespace vmware-system-vksm --kubeconfig=/etc/kubernetes/admin.conf

      NAME                 STATUS   AGE

      vmware-system-vksm   Active   9h

 

    

Environment

VCF 9.0.1

VCF Automation 9.0.1.0

Data Services Manager 9.0.0.0

Data Services Manager 9.0.1.0

 

Cause

With the release of VCF Automation 9.0.1, VCF Automation deployments may include new VKS cluster management features that integrate with deployed DSM database clusters when they are provisioned via the VCFA organization portal. For more information, see the VCF Automation 9.0.1 release notes.

The VKSM management pods deployed to the database cluster node(s) consume node resources, which can leave insufficient memory for the database pod(s) to be scheduled, preventing them from starting correctly.

Broadcom Engineering is currently investigating this issue.
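
As a quick check from the database cluster node, the VKSM management pods and the node's memory allocation can be inspected. This is a sketch only: the kubeconfig path and namespace are taken from the Symptoms section above, and pod names will vary by environment.

# List the VKSM management pods present on the database cluster node
kubectl get pods -n vmware-system-vksm --kubeconfig=/etc/kubernetes/admin.conf

# Review the node's memory requests/limits; the "Allocated resources" section
# shows how much memory is already claimed on the node
kubectl describe node --kubeconfig=/etc/kubernetes/admin.conf | grep -A 10 "Allocated resources"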

Resolution

To work around this issue, use the kubectl-vsphere plugin tool.

 

  1. Connect to the supervisor management cluster
    kubectl-vsphere login --server <Supervisor IP or hostname>

  2. List the contexts and take note of the namespace where the affected DSM database is deployed
    kubectl config get-contexts

    CURRENT   NAME                       CLUSTER          AUTHINFO                            NAMESPACE
              XXX.XXX.XX.X               XXX.XXX.XX.X     wcp:XXX.XXX.XX.X:<username@domain>  vcfa-provisioned-namespace
              svc-auto-attach-domain-XXX XXX.XXX.XX.X     wcp:XXX.XXX.XX.X:<username@domain>  svc-auto-attach-domain-XXX
  3. Switch to the svc-auto-attach-domain-XXX context

    kubectl config use-context svc-auto-attach-domain-XXX

  4. Locate the config map auto-attach-config

    kubectl get configmap auto-attach-config
    NAME                 DATA   AGE
    auto-attach-config   1      7h56m

  5. Back up the current config map by exporting it to a YAML file

    kubectl get configmap auto-attach-config -o yaml > auto-attach-config.yaml

  6. Edit the config map directly using kubectl edit, adding an exclusions entry for the namespace identified in step 2

kubectl edit configmap auto-attach-config -n svc-auto-attach-domain-XXX


# Edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
apiVersion: v1
data:
  exclusions: |
    - namespace: vcfa-provisioned-namespace
  hostname: <vcf-automation-fqdn>
kind: ConfigMap
metadata:
  annotations:
    updatedAt: "XXXX-XX-XXTXX:XX:XXZ"
  creationTimestamp: "XXXX-XX-XXTXX:XX:XXZ"
  name: auto-attach-config
  namespace: svc-auto-attach-domain-XXX
  resourceVersion: "1570042"
  uid: 5db240f2-3c94-4abd-94f4-d484d5ebd0db

After saving the change, the VKSM management pods and namespace are removed from the database cluster node, allowing the database pod to start.

The database instance should then progress to the Ready state in the VCF Automation Tenant Portal.
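
To confirm the workaround took effect, the checks from the Symptoms section can be re-run on the database cluster node (a sketch; pod and namespace names will match those in your environment):

# The vmware-system-vksm namespace should no longer be present
kubectl get namespace vmware-system-vksm --kubeconfig=/etc/kubernetes/admin.conf

# The database pod(s) should move from Pending to Running
kubectl get pods -A --kubeconfig=/etc/kubernetes/admin.conf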

Note: Excluding the VKSM integration for a namespace or for specific vSphere Kubernetes Service clusters will prevent management of those VKS clusters from VCF Automation via the vSphere Kubernetes Service Cluster Management view.

 


Alternatively, using a text editor, create a YAML file (e.g. exclusion.yaml) to modify the config map and exclude the namespace identified in step 2.
       

 
apiVersion: v1
kind: ConfigMap
metadata:
  name: auto-attach-config
  namespace: svc-auto-attach-domain-XXXX
data:
  exclusions: |
    - namespace: vcfa-provisioned-namespace
    #  clusterName: cluster-1
  hostname: <vcf-automation-fqdn>

In the example above the entire DSM namespace, vcfa-provisioned-namespace, is excluded from VKSM, which means all clusters created in that namespace will also be excluded.

However, it is possible to exclude only specific clusters in a namespace by filling in the clusterName field, as shown in the sketch below.
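
For example, a minimal sketch of the data block that excludes a single cluster only (cluster-1 is a placeholder for the name of the affected DSM database cluster):

data:
  exclusions: |
    - namespace: vcfa-provisioned-namespace
      clusterName: cluster-1
  hostname: <vcf-automation-fqdn>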

Apply the modified config map using kubectl apply:

kubectl apply -f exclusion.yaml
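
After applying, the config map can be checked to confirm the exclusions entry is present, re-using the export command from step 5:

kubectl get configmap auto-attach-config -o yaml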