Wavefront observability-system namespace gets deleted when updating/upgrading TKGI clusters

Article ID: 405433

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

If you deployed the Wavefront Observability for Kubernetes Operator manually on a TKGI cluster, its observability-system namespace, and the operator with it, may be deleted when the cluster is updated or upgraded.

This issue is known to happen with the following configuration:

  1. You have deployed the Observability for Kubernetes Operator via some process external to TKGI
  2. Wavefront Integration is disabled in the TKGI tile settings

Environment

TKGI v1.20.0 or later

Cause

When you update or upgrade a TKGI cluster, TKGI runs the wavefront-proxy-errand. If Wavefront Integration is disabled in the TKGI tile settings, this errand deletes the observability-system namespace.
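
To see whether a cluster has been affected, it can help to check for the namespace before and after an update. The following is a minimal sketch, assuming kubectl is already configured against the TKGI cluster in question. The function names and the KUBECTL override variable are illustrative and not part of TKGI.

```shell
#!/usr/bin/env bash
# Sketch: report whether a namespace exists in the current kubectl context.
# KUBECTL is a hypothetical override for the kubectl binary (defaults to kubectl).
KUBECTL="${KUBECTL:-kubectl}"

# Succeeds (exit 0) when the namespace can be fetched, fails otherwise.
namespace_exists() {
  "$KUBECTL" get namespace "$1" >/dev/null 2>&1
}

# Prints "present: <name>" or "missing: <name>".
report_namespace() {
  if namespace_exists "$1"; then
    echo "present: $1"
  else
    echo "missing: $1"
  fi
}

report_namespace observability-system
```

A "missing" result also appears when kubectl cannot reach the cluster at all, so confirm connectivity before drawing conclusions from it.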

Resolution

One solution is to enable the TKGI Wavefront Integration in the tile settings instead of deploying the Wavefront operator yourself. Note that the TKGI Wavefront Integration setting is deprecated and will be removed in a future release.

Another solution is to use the following runtime config to patch the wavefront-proxy-errand so it doesn't delete the observability-system namespace:

  1. Upload the os-conf release to BOSH (this patch was tested with os-conf v23.0.0)

    bosh upload-release "https://bosh.io/d/github.com/cloudfoundry/os-conf-release?v=23.0.0"


  2. Save the following runtime config to a file (for example, wavefront-errand-config.yml)

    addons:
    - name: wavefront-errand-config
      jobs:
      - name: pre-start-script
        release: os-conf
        properties:
          script: |-
            #!/bin/bash
            # Patch the errand's run script only on instances where it exists.
            if [ -f /var/vcap/jobs/wavefront-proxy-errand/bin/run ]; then
              # Remove the delete_spec calls for the Wavefront manifests.
              sed -i '/delete_spec "wavefront.yml"/d' /var/vcap/jobs/wavefront-proxy-errand/bin/run
              sed -i '/delete_spec "wavefront-proxy.yml"/d' /var/vcap/jobs/wavefront-proxy-errand/bin/run
              sed -i '/delete_spec "wavefront-operator.yml"/d' /var/vcap/jobs/wavefront-proxy-errand/bin/run
              # Comment out the deletion of wavefront-secret.
              sed -i '/${kubectl} delete secret wavefront-secret --namespace=${new_namespace} --ignore-not-found/s/^/# /' /var/vcap/jobs/wavefront-proxy-errand/bin/run
            fi
      include:
        instance_groups: [apply-addons]
    releases:
    - name: "os-conf"
      version: "((OS_CONF_RELEASE))"


    NOTE: Replace ((OS_CONF_RELEASE)) in wavefront-errand-config.yml with the os-conf release version uploaded in step 1. For example:

    releases:
    - name: "os-conf"
      version: "23.0.0"


  3. Upload the runtime config to BOSH

    bosh update-runtime-config --name wavefront-errand-config ./wavefront-errand-config.yml

  4. Retry the cluster update or upgrade and verify that the Wavefront deployment in the observability-system namespace is not deleted
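
Before relying on the patch, it may be useful to preview what the sed expressions from step 2 do. The sketch below applies them to a stand-in file rather than to the real errand script. The sample file content is hypothetical and only mirrors the patterns the expressions look for; the actual run script will differ. The expressions are written to a copy instead of using `sed -i`, so nothing is edited in place.

```shell
#!/usr/bin/env bash
# Sketch: preview the effect of the runtime config's sed expressions on a
# stand-in file. The sample content below is hypothetical; the real
# wavefront-proxy-errand run script will differ.
workdir="$(mktemp -d)"
original="$workdir/run.orig"
patched="$workdir/run.patched"

# Quoted heredoc delimiter keeps ${kubectl} and ${new_namespace} literal.
cat > "$original" <<'EOF'
#!/bin/bash
delete_spec "wavefront.yml"
delete_spec "wavefront-proxy.yml"
delete_spec "wavefront-operator.yml"
${kubectl} delete secret wavefront-secret --namespace=${new_namespace} --ignore-not-found
echo "errand finished"
EOF

# Same expressions as the addon, written to a copy instead of edited in place.
sed \
  -e '/delete_spec "wavefront.yml"/d' \
  -e '/delete_spec "wavefront-proxy.yml"/d' \
  -e '/delete_spec "wavefront-operator.yml"/d' \
  -e '/${kubectl} delete secret wavefront-secret --namespace=${new_namespace} --ignore-not-found/s/^/# /' \
  "$original" > "$patched"

# diff exits non-zero when the files differ, which is expected here.
diff "$original" "$patched" || true
```

The diff should show the three delete_spec lines removed and the secret deletion line prefixed with `# `, while unrelated lines stay untouched.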

Additional Information

If BOSH reports an error such as "Error: - Failed to find variable '/p-bosh/service-instance_<ID>/OS_CONF_RELEASE' from config server: HTTPCode '404'", the runtime config still contains the ((OS_CONF_RELEASE)) placeholder. See the note in step 2 of the Resolution section and replace the placeholder with the actual os-conf version.
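
A quick way to catch this before uploading is to search the file for the placeholder. This is a minimal sketch, assuming the file name from step 2; has_placeholder is a hypothetical helper, not a BOSH command.

```shell
#!/usr/bin/env bash
# Sketch: warn if a runtime config file still contains the
# ((OS_CONF_RELEASE)) placeholder. has_placeholder is a hypothetical helper.
has_placeholder() {
  # -F treats the pattern as a fixed string, so the parentheses are literal.
  grep -qF '((OS_CONF_RELEASE))' "$1"
}

# Defaults to the example file name from step 2 when no argument is given.
config_file="${1:-wavefront-errand-config.yml}"
if [ -f "$config_file" ] && has_placeholder "$config_file"; then
  echo "placeholder still present in $config_file" >&2
fi
```

If the warning appears, edit the version field as described in the step 2 note and run the check again before uploading the runtime config.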