mp-adapter container in Antrea interworking pod enters CrashLoopBackOff (CLBO) after VKS cluster upgrade from 1.34 to 1.35

Article ID: 432638

Updated On:

Products

VMware NSX

Issue/Introduction

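The environment runs VKS 3.5 with VKR 1.34 and has the interworking feature enabled. After VKS is upgraded to 3.6 and the VKR to 1.35, the mp-adapter container in the interworking pod enters CrashLoopBackOff (CLBO).

In affected clusters, the NSXServiceAccount has a later creationTimestamp than the AddonConfig.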

Examples:

- NSXServiceAccount creationTimestamp: "2026-02-24T13:14:44Z"
- AntreaConfig creationTimestamp: "2026-02-17T11:39:41Z"
- AddonConfig creationTimestamp: "2026-02-24T13:14:43Z"
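
To check whether a cluster shows this symptom, the two timestamps can be compared directly. The following is a minimal sketch rather than an official diagnostic: `created_at` and `is_later` are hypothetical helper names, and the kind, name, and namespace passed to `created_at` are placeholders that must be taken from the affected environment. It assumes the timestamps use the fixed-width UTC form shown in the examples above, which sorts chronologically as plain text.

```shell
#!/bin/sh
# Hypothetical helpers; not part of VKS or NSX tooling.

# Print the creationTimestamp of one object.
# Usage: created_at <kind> <name> <namespace>
created_at() {
  kubectl get "$1" "$2" -n "$3" -o jsonpath='{.metadata.creationTimestamp}'
}

# Succeed (exit 0) when timestamp $1 is strictly later than timestamp $2.
# Both values must use the form "YYYY-MM-DDTHH:MM:SSZ", which sorts
# chronologically under a byte-wise (LC_ALL=C) sort.
is_later() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | tail -n 1)" = "$1" ]
}
```

With the example values above, `is_later "2026-02-24T13:14:44Z" "2026-02-24T13:14:43Z"` succeeds, because the NSXServiceAccount was created one second after the AddonConfig.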

Environment

VKS 3.6

VKR 1.35

Cause

This issue is caused by an empty AddonConfig being generated during migration. The empty AddonConfig causes interworking to start its cleanup process. When the AddonConfig is later filled with the migrated fields, interworking recreates the NSXServiceAccount.

Resolution

The issue will be fixed in a future release of VKS.

Before upgrading a VKS cluster from v1.34.x to v1.35.x, revoke the addon-manager's permission to delete NSXServiceAccount resources, as described under Additional Information. For clusters where the upgrade has already been initiated and the interworking pod is in CLBO, engage Broadcom Support to manually restore NSX.

Additional Information

Until a long-term fix is available, apply the following steps before upgrading from VKR 1.34 to 1.35. They prevent the cluster from being deregistered during the upgrade, so the interworking pod does not crash.

1. Remove the delete permission on NSXServiceAccount from addon-manager, so that the VKS AddonConfig controller cannot delete the NSXServiceAccount during the upgrade. Run the following commands in the Supervisor cluster kubectl context. Once these commands have been executed, the pending cluster can be upgraded. Do not delete any clusters during the upgrade.

kubectl patch pkgi svc-tkg.vsphere.vmware.com -n vmware-system-supervisor-services --type merge -p '{"spec":{"paused":true}}'
kubectl patch pkgi tanzu-addons-manager -n <domain-namespace> --type merge -p '{"spec":{"paused":true}}'

# edit role
kubectl edit clusterrole tanzu-addons-manager-clusterrole
# search 'nsxserviceaccounts' and remove the delete permission
- apiGroups:
  - nsx.vmware.com
  resources:
  - nsxserviceaccounts
  verbs:
  - get
  - create
  - update
  - list
  - patch
  - delete # <------ remove this line
  - watch
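
Before starting the upgrade, it can help to confirm that the edit took effect. The sketch below is one way to do that without reading the YAML by eye. `nsxsa_delete_allowed` is a hypothetical helper name, and the parser is deliberately simple: it assumes the block-style YAML that `kubectl get -o yaml` prints, as in the snippet above, and does not handle flow-style lists or wildcard verbs.

```shell
#!/bin/sh
# Hypothetical helper; not part of VKS tooling.
# Reads ClusterRole YAML (block style) on stdin. Exits 0 if a rule that
# lists the nsxserviceaccounts resource also lists the delete verb,
# and 1 otherwise.
nsxsa_delete_allowed() {
  awk '
    # Close the current rule and remember whether it matched.
    function end_rule() {
      if (has_res && has_del) found = 1
      has_res = 0; has_del = 0; section = ""
    }
    {
      line = $0
      sub(/[ \t]*#.*$/, "", line)              # drop trailing comments
      sub(/[ \t]+$/, "", line)                 # drop trailing whitespace
      if (line ~ /^[ \t]*- [A-Za-z]+:/) {      # "- key:" starts a new rule
        end_rule()
        sub(/^[ \t]*- /, "", line)
      } else {
        sub(/^[ \t]*/, "", line)
      }
      if (line ~ /^[A-Za-z]+:/) {              # "key:" opens a section
        section = line
        sub(/:.*$/, "", section)
        next
      }
      if (line ~ /^- /) {                      # "- value" list item
        item = substr(line, 3)
        if (section == "resources" && item == "nsxserviceaccounts") has_res = 1
        if (section == "verbs" && item == "delete") has_del = 1
      }
    }
    END { end_rule(); exit (found ? 0 : 1) }
  '
}
```

A possible invocation is `kubectl get clusterrole tanzu-addons-manager-clusterrole -o yaml | nsxsa_delete_allowed`. After step 1 the function is expected to return a non-zero status. After the role has been reset in step 2 it is expected to return zero.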

2. After all clusters are upgraded, unpause the PackageInstall (pkgi) and confirm that the ClusterRole has been reset to its original permissions.

kubectl patch pkgi svc-tkg.vsphere.vmware.com -n vmware-system-supervisor-services --type merge -p '{"spec":{"paused":false}}'

# wait for the role to be reset to its original permissions (including the delete permission)
kubectl get clusterrole tanzu-addons-manager-clusterrole -o yaml | grep -C 8 nsxservice
- apiGroups:
  - nsx.vmware.com
  resources:
  - nsxserviceaccounts
  verbs:
  - get
  - create
  - update
  - list
  - patch
  - delete
  - watch
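
The wait in step 2 can also be scripted as a bounded polling loop instead of rerunning the command by hand. This is a sketch under stated assumptions: `wait_until` and `role_restored` are hypothetical names, the retry count and interval are arbitrary, and the 8-line grep window assumes the verbs list is no longer than the one shown above.

```shell
#!/bin/sh
# Hypothetical helpers; not part of VKS tooling.

# Run a command until it succeeds or the attempts are used up.
# Usage: wait_until <attempts> <sleep_seconds> <command> [args...]
wait_until() {
  wu_attempts=$1
  wu_delay=$2
  shift 2
  wu_i=1
  while :; do
    "$@" && return 0                           # condition met
    [ "$wu_i" -ge "$wu_attempts" ] && return 1 # attempts exhausted
    wu_i=$((wu_i + 1))
    sleep "$wu_delay"
  done
}

# Succeed once "- delete" appears within 8 lines after the
# nsxserviceaccounts resource entry of the ClusterRole.
role_restored() {
  kubectl get clusterrole tanzu-addons-manager-clusterrole -o yaml |
    grep -A 8 nsxserviceaccounts | grep -q -- '- delete'
}
```

For example, `wait_until 30 10 role_restored` would poll every 10 seconds for up to 30 attempts and return a non-zero status if the delete permission has not reappeared by then.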