New Federation sites cannot sync with an existing local site that was upgraded through a rolling upgrade.
Article ID: 322652

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • NSX-T 3.2.x / 4.x
  • Local site was recently upgraded using the rolling upgrade process.
  • New Federation sites cannot sync with the upgraded local site.
  • Possible scenarios:
    • Scenario 1:
      Site A is not currently onboarded to Federation.
      Site A is upgraded (rolling upgrade).
      Site A is onboarded after rolling upgrade.
      Site A cannot be synced with any other sites that are already onboarded.
    • Scenario 2:
      Site B is not onboarded to Federation, while Site A is.
      Site A is upgraded (rolling upgrade).
      Site B is onboarded after the rolling upgrade of Site A.
      Site B cannot connect to Site A.
  • On the local site that was upgraded via rolling upgrade, /var/log/cloudnet/nsx-ccp.log shows that the new site is added but its state is CLOSED:
2023-01-24T23:45:21.848Z INFO nsx-rpc:CCP-AphProvider-a2ffa5b0-7a2c-41e5-894d-031496f5e12f:user-executor-3 SiteSyncManager 3532 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="sitesync"] Remote site added a32ec9ab-1300-4cb4-8207-e83f44d3616f with APH [4420e2e4-d276-4373-a175-3f4856e848b5, 976b13e2-abdd-4af7-a48a-72978c5a9c2e, d2fcc4bb-a407-4d25-bbbd-c4036978c22e]
...
2023-01-24T23:45:21.848Z INFO nsx-rpc:CCP-AphProvider-a2ffa5b0-7a2c-41e5-894d-031496f5e12f:user-executor-3 SiteSyncManager 3532 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="sitesync"] State for site a32ec9ab-1300-4cb4-8207-e83f44d3616f is CLOSED
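
To check for this symptom quickly, the log can be searched for the CLOSED state message. The following is a minimal sketch, assuming a POSIX shell on the manager node. The function name closed_sites is illustrative and not part of NSX, and the pattern is based only on the log line quoted above.

```shell
#!/bin/sh
# List the remote site IDs that the controller log reports as CLOSED.
# The default log path is the one named in this article; closed_sites is
# an illustrative helper, not an NSX command.
closed_sites() {
    log_file="${1:-/var/log/cloudnet/nsx-ccp.log}"
    # Match the SiteSyncManager "State for site <uuid> is CLOSED" line and
    # print only the site UUID, de-duplicated.
    sed -n 's/.*State for site \([0-9a-fA-F-]*\) is CLOSED.*/\1/p' "$log_file" | sort -u
}
```

Calling closed_sites with no argument reads the default log path. It reports every site that was logged as CLOSED at some point, so a site listed here is not necessarily still in that state.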


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center
VMware NSX-T Data Center 4.x

Cause

  • This is caused by an incorrect flag being set during the rolling upgrade, which makes the manager node of the site drop handshake requests coming from other sites.

Resolution

  • This is resolved in NSX-T version 3.2.3 available at VMware Downloads.
  • This is a known issue impacting NSX-T 4.x.


Workaround:
Restart the controller service on all manager nodes of the site that underwent the rolling upgrade.

On each NSX Manager node, as the root user:
  • root@nsx-mngr-01:~# service nsx-ccp restart

Note: Restart the service on one manager node at a time so that the controller cluster stays up. Check the cluster status with get cluster status before proceeding to the next NSX Manager node.
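
The one-node-at-a-time restart can be sketched as a small loop run from an admin workstation. This is only an illustration under assumptions: root SSH access to each manager node is available, the node names passed in are placeholders, and the function name rolling_restart and the RUN variable are hypothetical helpers, not NSX tooling. The cluster check is deliberately left as a manual step between nodes.

```shell
#!/bin/sh
# Restart nsx-ccp on each manager node in turn, pausing between nodes.
# rolling_restart and RUN are hypothetical helpers for this sketch.
rolling_restart() {
    for node in "$@"; do
        echo "Restarting nsx-ccp on $node"
        # Set RUN=echo to print the command instead of executing it.
        # Assumes root SSH login to the node is permitted.
        ${RUN:-} ssh "root@$node" service nsx-ccp restart || return 1
        # In a real run, wait for the operator to verify the cluster by hand.
        if [ -z "${RUN:-}" ]; then
            printf 'Run "get cluster status" for %s, then press Enter: ' "$node"
            read -r reply
        fi
    done
}
```

For example, after setting RUN=echo, calling rolling_restart nsx-mngr-01 nsx-mngr-02 nsx-mngr-03 (the second and third names are made up) prints the commands without running them. Unset RUN to execute the restarts.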