PubSub pod initialization delays cause Intelligence to be non-functional

Article ID: 380781

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

The PubSub pod may fail to become ready, which degrades the entire system. Symptoms include the PubSub deployment reporting zero ready replicas and frequent liveness probe failures. As a result, the Intelligence UI does not load.
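
The symptoms above can be checked from the NSX Manager root shell. The following is a minimal sketch, assuming that `napp-k` accepts the standard kubectl subcommands and flags (the article only shows `napp-k edit`). The function names and the `NAPP_K` override variable are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Assumption: napp-k passes standard kubectl subcommands through unchanged.
# NAPP_K can be overridden, for example to point at plain kubectl.
NAPP_K="${NAPP_K:-napp-k}"
NAMESPACE="nsxi-platform"

# Print the number of ready PubSub replicas.
# Kubernetes omits the field when no replica is ready, so empty output
# means the same as 0.
pubsub_ready_replicas() {
    $NAPP_K get deployment pubsub -n "$NAMESPACE" \
        -o jsonpath='{.status.readyReplicas}'
}

# List namespace events for PubSub that mention a failed probe,
# oldest first.
pubsub_probe_events() {
    $NAPP_K get events -n "$NAMESPACE" --sort-by=.lastTimestamp \
        | grep -i pubsub | grep -i 'probe failed'
}
```

Calling `pubsub_ready_replicas` and then `pubsub_probe_events` after sourcing the snippet should show whether the deployment has no ready replicas and whether liveness or readiness probe failures are being recorded.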

Environment

All NAPP environments

Cause

The root cause is that PubSub takes longer to start than the probe settings allow. By the time it begins reporting its health status, the liveness and readiness probes have already failed, so the pod is treated as unhealthy and never becomes ready.
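
To make the timing concrete, the sketch below estimates how long a container has to start before its probe gives up. It uses the probe values from the deployment spec in the Resolution section (periodSeconds 10, failureThreshold 3). The formula is a simplification that ignores probe timeouts and kubelet scheduling jitter, and `probe_deadline` is a hypothetical helper name.

```shell
#!/bin/sh
# Rough estimate, in seconds after container start, of when a probe that
# never succeeds reaches its failure threshold:
#   the first check runs after initialDelaySeconds, and each further
#   failure arrives one periodSeconds later.
# Assumption: probe timeouts and kubelet jitter are ignored.
probe_deadline() {
    initial_delay=$1
    period=$2
    failure_threshold=$3
    echo $(( initial_delay + (failure_threshold - 1) * period ))
}

echo "original settings (180s delay): $(probe_deadline 180 10 3)s"
echo "workaround (300s delay):        $(probe_deadline 300 10 3)s"
```

Under these assumptions the original settings give PubSub roughly 200 seconds to start, and the workaround extends that to roughly 320 seconds.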

Resolution

VMware by Broadcom is aware of this issue, and a fix will be included in a future release.

To work around this issue, follow these steps:

(1) Log in to the NSX Manager via SSH as the root user and edit the PubSub deployment:


napp-k edit deployment pubsub -n nsxi-platform

(2) Increase initialDelaySeconds for both the liveness and readiness probes from 180 to 300 seconds, as shown below:


        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /actuator/health/liveness
            port: http
            scheme: HTTP
          initialDelaySeconds: 300   # updated from 180 to 300
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: pubsub
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 8443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /actuator/health/readiness
            port: http
            scheme: HTTP
          initialDelaySeconds: 300   # updated from 180 to 300
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5


This adjustment allows the PubSub pod additional time to become ready before the health checks are executed.
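
After saving the edit, the new values can be confirmed without reopening the editor. This is a minimal sketch, assuming that `napp-k` accepts the standard kubectl `get` subcommand with `-o jsonpath`. The function names and the `NAPP_K` override variable are hypothetical helpers. The container name `pubsub` is taken from the deployment spec above.

```shell
#!/bin/sh
# Assumption: napp-k passes standard kubectl subcommands through unchanged.
NAPP_K="${NAPP_K:-napp-k}"
NAMESPACE="nsxi-platform"

# Print initialDelaySeconds for one probe of the pubsub container.
# $1 is the probe field name: livenessProbe or readinessProbe.
pubsub_initial_delay() {
    probe=$1
    $NAPP_K get deployment pubsub -n "$NAMESPACE" \
        -o jsonpath="{.spec.template.spec.containers[?(@.name==\"pubsub\")].$probe.initialDelaySeconds}"
}

# Succeed only if both probes report the expected delay (default 300).
pubsub_delays_updated() {
    expected=${1:-300}
    [ "$(pubsub_initial_delay livenessProbe)" = "$expected" ] &&
    [ "$(pubsub_initial_delay readinessProbe)" = "$expected" ]
}
```

Running `pubsub_delays_updated && echo "probe delays updated"` after sourcing the snippet prints the message only when both probes show 300. If `napp-k` supports it, `napp-k rollout status deployment/pubsub -n nsxi-platform` can then be used to wait for the new pod to become ready.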