Contour Supervisor Service Pods Failing Readiness/Liveness Probes due to NSX Distributed Load Balancer Degraded Alarm



Article ID: 423771



Products

VMware vSphere Kubernetes Service
VMware NSX

Issue/Introduction

Contour Supervisor Service pods fail to start and remain in a CrashLoopBackOff (CLBO) or Not Ready state on a vSphere Kubernetes Service (VKS) Supervisor cluster that uses NSX-T.
Describing the Contour pods (for example, with kubectl describe pod) shows readiness and liveness probe failures with connection refused errors on ports 8001 (readiness) and 8000 (liveness).

Example errors:

readiness probe failed for container contour:
dial tcp <pod-ip>:8001: connection refused

liveness probe failed for container contour:
GET http://<pod-ip>:8000/healthz: connection refused
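When many pods are affected, it can help to pull the probe type, container, and port out of the event text automatically. The following is a minimal sketch, assuming the events look like the examples above; the function name parse_probe_failures and the ProbeFailure type are illustrative and not part of any VMware or Kubernetes tooling.

```python
import re
from typing import List, NamedTuple


class ProbeFailure(NamedTuple):
    probe: str      # "readiness" or "liveness"
    container: str  # container name from the event header
    port: int       # port the probe tried to reach


# Header line, e.g. "readiness probe failed for container contour:"
_HEADER = re.compile(
    r"\b(readiness|liveness) probe failed for container (\S+?):?\s*$",
    re.IGNORECASE,
)
# Port in "dial tcp <ip>:8001:" or "GET http://<ip>:8000/healthz:"
_PORT = re.compile(r":(\d{2,5})(?=[:/])")


def parse_probe_failures(text: str) -> List[ProbeFailure]:
    """Return one entry per 'connection refused' probe failure in the text."""
    failures = []
    current = None  # (probe, container) from the most recent header line
    for raw in text.splitlines():
        line = raw.strip()
        header = _HEADER.search(line)
        if header:
            current = (header.group(1).lower(), header.group(2))
            continue
        if current and "connection refused" in line:
            port = _PORT.search(line)
            if port:
                failures.append(
                    ProbeFailure(current[0], current[1], int(port.group(1)))
                )
            current = None
    return failures
```

If this reports readiness failures on port 8001 together with liveness failures on port 8000 for the contour container, the symptoms match this article. The sketch handles IPv4 addresses and the <pod-ip> placeholder only.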

Environment

vSphere Kubernetes Service
NSX-T

Cause

The issue is caused by the NSX Distributed Load Balancer (DLB) being in a Degraded state.

The Contour Supervisor Service relies on NSX-T load balancer services to route traffic to the Contour pods on ports 8000 (liveness) and 8001 (readiness). When the Distributed Load Balancer is degraded:

Virtual servers backing the Contour service are not fully functional
Traffic is not correctly forwarded to the Contour pod IPs
Kubernetes readiness and liveness probes fail with connection refused
As a result, Contour pods never transition to a healthy Running state

This is not an issue with the Contour pod itself but with the underlying NSX-T load balancer infrastructure.
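To see how the two probe ports respond from a host that has a network path to the pod IP, a plain TCP connect test is often enough. The sketch below is an assumption-laden illustration: the helper name check_tcp_port is hypothetical, and where it can usefully be run depends on the environment's routing.

```python
import errno
import socket


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <code>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # Same condition the probes report as "connection refused".
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, for example "no route to host".
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        sock.close()
```

For example, calling check_tcp_port with a Contour pod IP and ports 8000 and 8001 (the liveness and readiness ports named above) shows whether each connection is accepted, refused, or left unanswered.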

Resolution

Resolve the degraded state of the NSX Distributed Load Balancer backing the Supervisor Services.

Log in to the NSX-T Manager UI.
Navigate to Networking > Load Balancers > Distributed Load Balancer.
Identify the DLB showing a Degraded alarm.
Investigate and remediate the underlying cause (for example, pool member issues, configuration errors, or service failures).
Once the DLB returns to a healthy state, the Contour pods will automatically pass their readiness and liveness probes and transition to the Running state.
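After the DLB is healthy again, recovery can be confirmed by checking the Ready condition of the Contour pods. The following is a minimal sketch, assuming the pod list has been exported with kubectl get pods -n <contour-namespace> -o json, where <contour-namespace> is a placeholder for the namespace in which the Contour Supervisor Service runs; the function name not_ready_pods is illustrative.

```python
import json
from typing import List


def not_ready_pods(pod_list: dict) -> List[str]:
    """Return names of pods whose Ready condition is not "True".

    pod_list is the parsed JSON from `kubectl get pods -o json`.
    """
    names = []
    for pod in pod_list.get("items", []):
        conditions = pod.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(pod["metadata"]["name"])
    return names


def not_ready_pods_from_file(path: str) -> List[str]:
    """Load a saved `kubectl get pods -o json` file and list not-ready pods."""
    with open(path, encoding="utf-8") as handle:
        return not_ready_pods(json.load(handle))
```

An empty list means every pod in the export reports Ready. Any names that remain after the DLB alarm has cleared point to pods that need a closer look.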

For detailed steps to troubleshoot and resolve the Distributed Load Balancer degraded alarm, refer to the following KB article:

NSX Distributed Load Balancer shows Degraded alarm
https://knowledge.broadcom.com/external/article/420132/nsx-distributed-load-balancer-shows-degr.html

Additional Information

Notes:

Japanese version: NSX 分散ロードバランサーの劣化（Degraded）アラームに起因して、Contour Supervisor サービスの Pod が Readiness／Liveness プローブに失敗する事象
