NSX Application Platform shows Degraded status and ANALYTICS_SERVICE is down after upgrade
Article ID: 319052
Products
VMware NSX
Issue/Introduction
Symptoms:
1. After an NSX Application Platform upgrade to 4.1.1, the NSX UI shows the NAPP status as Degraded and ANALYTICS_SERVICE as down.
2. The nsx-config pod is stuck with status Init:4/5:
root@nsxmgr:~# napp-k get pods | grep nsx-config | grep -v metrics
NAME READY STATUS RESTARTS AGE
nsx-config-XXXXXXXX-XXXXX 0/1 Init:4/5 33 (5m15s ago) 4h53m
3. The command napp-k logs <nsx-config pod name> -c wait-for-druid-supervisor-ready shows an error similar to:
INFO:root:Supervisor: pace2druid_policy_intent_config status: UNHEALTHY_SUPERVISOR
INFO:root:Supervisor pace2druid_policy_intent_config is not ready
or
INFO:root:Supervisor: pace2druid_manager_realization_config status: UNHEALTHY_SUPERVISOR
INFO:root:Supervisor pace2druid_manager_realization_config is not ready
4. Run napp-k get pods | grep druid-overlord to get the name of the druid-overlord pod. The output of napp-k logs <druid-overlord pod name> shows a warning similar to:
2023-07-25T08:05:08,492 WARN [IndexTaskClient-pace2druid_manager_realization_config-0] org.apache.druid.indexing.common.IndexTaskClient - submitRequest failed for [https://X.X.X.X:8104/druid/worker/v1/chat/index_kafka_pace2druid_manager_realization_config_XXXXXXXXXXXX_cainaphn/status] java.net.ConnectException: Connection timed out (Connection timed out)
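The checks in symptoms 2 and 3 can be sketched as two small filters. This is a minimal sketch rather than an official tool: the function names pods_in_init and unhealthy_supervisors are hypothetical, and the filters assume the output formats shown above (pod status in the third column, and log lines of the form "Supervisor: <name> status: UNHEALTHY_SUPERVISOR").

```shell
# Hypothetical helper: list pods whose STATUS column starts with "Init:".
# Reads `napp-k get pods` output on stdin and prints "<pod name> <status>".
pods_in_init() {
  awk '$3 ~ /^Init:/ { print $1, $3 }'
}

# Hypothetical helper: list the supervisors reported as UNHEALTHY_SUPERVISOR.
# Reads the wait-for-druid-supervisor-ready container log on stdin and
# prints each supervisor name once.
unhealthy_supervisors() {
  sed -n 's/.*Supervisor: \([^ ]*\) status: UNHEALTHY_SUPERVISOR.*/\1/p' | sort -u
}

# Example usage from the root shell of an NSX Manager:
#   napp-k get pods | pods_in_init
#   napp-k logs <nsx-config pod name> -c wait-for-druid-supervisor-ready | unhealthy_supervisors
```

If the first filter prints the nsx-config pod and the second prints one or more supervisor names, the environment likely matches the symptoms described here.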
Cause
This issue can occur after upgrading NSX Application Platform to 4.1.1, or when all druid pods are restarted at the same time.
Resolution
There is currently no resolution for this issue. Use the workaround below.
Workaround: Run the following steps from the root shell of an NSX Manager:
Step 1. Run napp-k get pods | grep druid-overlord to get the name of the druid-overlord pod
Step 2. Restart the druid-overlord pod by running napp-k delete pod <druid-overlord pod name>
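The two steps above can be combined into one helper, sketched below under some assumptions. The function names first_pod_named and restart_druid_overlord are hypothetical. The sketch assumes napp-k wraps kubectl, so the delete call names the pod resource type explicitly. If napp-k is defined as a shell alias on the manager, paste the functions into the interactive root shell rather than saving them as a separate script, because aliases are usually not expanded in non-interactive scripts.

```shell
# Hypothetical helper: print the first pod name that starts with the given prefix.
# Reads `napp-k get pods` output on stdin.
first_pod_named() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { print $1; exit }'
}

# Hypothetical wrapper around Step 1 and Step 2.
# Assumes napp-k is available in the current shell and wraps kubectl.
restart_druid_overlord() {
  local pod
  pod="$(napp-k get pods | first_pod_named druid-overlord)"
  if [ -z "$pod" ]; then
    echo "no druid-overlord pod found" >&2
    return 1
  fi
  # Deleting the pod restarts it only if a controller recreates it,
  # which the workaround above implies.
  napp-k delete pod "$pod"
}

# Example usage from the root shell of an NSX Manager:
#   restart_druid_overlord
#   napp-k get pods | grep nsx-config | grep -v metrics   # re-check the nsx-config status
```

Looking the pod name up at run time avoids copying the generated suffix by hand, and the empty-name check prevents a delete call with no target.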
Additional Information
Impact/Risks: Users will not be able to use NAPP and NSX Intelligence while this issue is present.