NSX ALB Virtual Service creation error - VS instance, pools and VIPs are not creating during the Contour Service deployment in Supervisor Cluster.
search cancel

NSX ALB Virtual Service creation error - VS instance, pools and VIPs are not creating during the Contour Service deployment in Supervisor Cluster.

book

Article ID: 405392

calendar_today

Updated On:

Products

VMware vSphere Kubernetes Service VMware Avi Load Balancer

Issue/Introduction

In Workload Management - yourSupervisor - Configure - Supervisor Services - Contour will have a Status with 'Error'.



Config Status shows below error after configuring Contour as a Service on the Supervisor Cluster

The error: Reason: ReconcileFailed. Message: kapp: Error: Timed out waiting after 5m0s for resources: - deployment/envoy (apps/v1) namespace: svc-contour-domain-c9 - deployment/contour (apps/v1) namespace: svc-contour-domain-c9.

Environment

vCenter Server 8.0 Update 3d
Supevisor: v1.29.7
NSX-T: 4.2.1.0.0.24302016
ALB: 22.1.7-9093

Cause

This can happen due to the following sequence of events : 

svc-nsx-mgmt-proxy namespace is created and proxy-loadbalancer service is deployed.
NCP is yet to realise the T1LR for the namespace, but given the service is already created, AKO triggers VS creation in Avi without T1LR with VRFContext set to global.
Avi incorrectly resolves the global VRFContext to /infra/tier-1s/global during VSVip creation. 
NCP realises the T1 and updates the namespacenetworkinfos CR with the correct T1LR, However, when AKO tries to reconcile the VS with the correct T1LR it fails as it's an immutable field in Avi NSXT cloud configuration. 

Resolution

Workaround:

 - Restarted all NSX controller nodes one by one.
(After above all the visible NSX issues were fixed. Contour deployment marked as Configured, VS will create, however there will be errors in AVI about placement segment not found - similar to below.)

'Placement segement not found in Cloud config'


 - Remove the VS instance created by Contour.

 - Manually add a T1 instance and Avi-* segment to the cloud config in AVI UI.

 - 'kubectl delete..' of AKO pods in 'vmware-system-ako' namespace of the Supervisor cluster.


Fix: 

Upgrading Avi Load balancer version to 30.2.2 will fix the issue.

Additional Information

Similar solution found in KB - https://knowledge.broadcom.com/external/article/396852