Federation Sites Only Sync across all sites when specific GM Cluster is Set to Active

Article ID: 369815

Products

VMware NSX

Issue/Introduction

When one Global Manager (Global Manager A) is set to active, only one Local Manager (LMA1) synchronizes successfully, and the other Local Manager (LMA2) shows a status of failed.

When Global Manager B is set to active, both Local Managers (LMA1 and LMA2) synchronize successfully.

Environment

NSX 4.1.1 Federation

In the example above, there are two LMs on each site.

All LMs are connected to their respective T1s.

Cause

    1. The LM node's client truststore was missing an entry for the GM, as the following log entry shows:
      2024-04-29T13:18:39.450Z XXXXXXXXXXXXXXXXXXXXX NSX 5389 - [nsx@6876 comp="global-manager" level="WARNING" subcomp="global-manager"] Error org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 Internal Server Error: "{"module_name":"common-services","error_message":"Internal server error has occurred.","details":"Client certificate not found in trust store","error_code":99}"#012#011at


    2. There is a known issue in which NsxTRestClient continues to use the old thumbprint after certificates are replaced. In this example, the user had previously updated the LM PI certificates.
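
To check whether a given log contains the truststore error from cause 1, a search along the following lines may help. This is a minimal sketch, not a vendor-supplied tool: the function name `count_truststore_errors` is hypothetical, and the log file path is deliberately left as an argument because this article does not state where the message is written.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that contain the
# truststore error text quoted in cause 1.
# Assumption: the caller supplies the path of the log to inspect.
count_truststore_errors() {
    log_file="$1"
    # -F matches the text literally; -c prints the number of matching lines.
    grep -c -F 'Client certificate not found in trust store' "$log_file"
}
```

A non-zero count suggests that the same signature is present. It does not by itself confirm that the stale thumbprint issue from cause 2 is the trigger.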

Resolution

Restart the LM and GM nodes in the order below. On the LM nodes, use either a rolling reboot or a restart of the proton service, one node at a time, to minimize downtime.

  1. Perform a rolling restart of the LM nodes, or restart the proton service on each LM node as root: sudo systemctl restart proton
  2. Restart the GM nodes.
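
The proton restart in step 1 can be sketched as a loop that handles one LM node at a time. This is an illustration under several assumptions: the function name `rolling_restart_proton` and the node names are hypothetical, root SSH access to each node is assumed to be available, and `systemctl is-active` is used only as a basic check that the unit came back, not as proof that the cluster is healthy. The sketch prints the commands by default instead of running them.

```shell
#!/bin/sh
# Dry run by default: each command is printed with echo.
# Run with RUN set to an empty string (RUN= ./script.sh) to execute for real.
RUN="${RUN-echo}"

# Hypothetical helper: restart proton on each given node in turn.
rolling_restart_proton() {
    for node in "$@"; do
        # Restart proton on this node only, then confirm the unit is active
        # before moving to the next node. Stop at the first failure.
        $RUN ssh "root@${node}" "systemctl restart proton" || return 1
        $RUN ssh "root@${node}" "systemctl is-active --quiet proton" || return 1
    done
}
```

For example, `rolling_restart_proton lm-a lm-b` (placeholder names) prints the four commands it would run. Restart the GM nodes only after every LM node has been handled.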