Unable to upgrade or create new clusters on TKGi/TKGs with NSX-T v3.2.x

Article ID: 297311

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Symptoms on the Tanzu side:

  • New clusters cannot be created on TKGi/TKGs.
  • Clusters cannot be upgraded on TKGi, because the upgrade runs an errand that deploys a new cluster, and that deployment fails.
  • Affected versions:
    • TKGi: 1.16.x, 1.15.x, and 1.14.x
    • TKGs: vSphere 7.0u3x releases
Symptoms on the NSX side:
  • Existing network objects keep working, and no alarms or other warnings are raised.
  • New objects are deployed but not realized.
  • New logical switches have no connectivity to their T1 router.
  • The issue is seen after any upgrade to NSX-T 3.2.x.
  • /var/log/proton/nsxapi.log contains the following entry:
"Failed to re-subscribe [tag:worker_framework].*Last retry failed 20/20 ENDING!"
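
The log check above can be scripted. The following is a minimal sketch, not an official VMware tool. It assumes that the quoted entry is a pattern in which ".*" stands for variable text, and that the log is readable at the path given in this article. The function names are illustrative only. The square brackets are escaped because an unescaped "[tag:worker_framework]" would be read as a regular-expression character class and would not match the literal text.

```python
import re
import sys
from typing import Iterable, List

# Signature quoted in the article. The brackets are escaped so that they
# match literally instead of being parsed as a regex character class.
SIGNATURE = re.compile(
    r"Failed to re-subscribe \[tag:worker_framework\].*Last retry failed 20/20 ENDING!"
)


def find_resubscribe_failures(lines: Iterable[str]) -> List[str]:
    """Return every line that contains the re-subscribe failure signature."""
    return [line.rstrip("\n") for line in lines if SIGNATURE.search(line)]


def main(path: str = "/var/log/proton/nsxapi.log") -> None:
    """Scan one log file and print the matching entries."""
    try:
        # errors="replace" keeps the scan going if the log has odd bytes.
        with open(path, errors="replace") as log:
            hits = find_resubscribe_failures(log)
    except OSError as exc:
        print(f"Cannot read {path}: {exc}", file=sys.stderr)
        return
    print(f"{len(hits)} matching entries in {path}")
    for hit in hits:
        print(hit)


if __name__ == "__main__":
    # Optional first argument: path to a copy of nsxapi.log.
    main(*sys.argv[1:2])
```

Running the script with no arguments scans the default path; passing a file name scans a copy of the log instead, for example one extracted from a support bundle. A count of zero does not rule the issue out if the log has rotated since the failure was written.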

Trigger for the problem:
In earlier releases, the nsx$SegmentPortInternal table carried the worker_framework stream tag. In 3.2.1 the tag was removed from this table, but the table's old schema definition is still present in Corfu after the upgrade. When NSX subscribes to tables, it passes the worker_framework tag, and Corfu fetches and subscribes to every table that has this stream tag. To do so, Corfu checks the schema definition that is currently stored (the old one, which is still present) rather than the actual table's schema.

Environment

Product Version: Other

Resolution

As a workaround, perform a rolling reboot of the NSX-T Manager cluster. This fixes the problem for some time, but the issue can come back.

The issue is resolved in NSX-T 3.2.3.