GM To LM Synchronization Warning

Article ID: 383762

Updated On:

Products

VMware NSX

Issue/Introduction

  • The NSX alarm "GM To LM Synchronization Warning" is present.
  • If the alarm is manually resolved, it is triggered again.
  • No actual sync issue is present: objects created on the Global Manager are propagated to the Local Managers as expected.

  • On the Local Manager, /var/log/async-replicator/ar.log contains an entry similar to the following:

<DATE>T03:23:11.823Z  INFO nsx-rpc:unix:///var/run/vmware/nsx-opsagent/alarms-provider-service.sock:user-executor-0 EventSource 87147 MONITORING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="async-replicator"] EventSource: Sync triggered. featureName: federation, eventType: gm_to_lm_synchronization_warning, entityId: UUID, status: true, context: {"site_id":"SITE ID","remote_site_id":"UUID","site_name":"SITE_NAME","remote_site_name":"SITE_NAME","flow_identifier":"FlowIdentifier{role='Policy', nameSpace='LM_2_GM_ONBOARD_CONFIG'}","sync_issue_reason":"Remote site disconnected"}

  • A leadership change has occurred for the ArAlarmService service. On the Local Manager, /var/log/cbm/gmle-leadership-lease.log shows the leaderId changing between lease renewals:

    <DATE>T11:59:08.285Z  INFO ClusteringRpcServer-Leadership-Thread1 LeadershipRpcHandler 84807 - [nsx@6876 comp="global-manager" level="INFO" s2comp="leadership-rpc-handler" subcomp="cbm"] Renewing the leadership lease of group <GROUP_ID>, new lease LeadershipLease{serviceName=ArAlarmService, leaderId=<MANAGER_UUID_1>, leaseVersion=16####8, revocationCount=0, serviceWeight=1, serviceWeightCategory=SMALL, leaseId=<LEASE_ID>, relinquishInProgress=false}

    <DATE>T12:00:41.312Z  INFO ClusteringRpcServer-Leadership-Thread1 LeadershipRpcHandler 84807 - [nsx@6876 comp="global-manager" level="INFO" s2comp="leadership-rpc-handler" subcomp="cbm"] Renewing the leadership lease of group <GROUP_ID>, new lease LeadershipLease{serviceName=ArAlarmService, leaderId=<MANAGER_UUID_2>, leaseVersion=16####6, revocationCount=0, serviceWeight=1, serviceWeightCategory=SMALL, leaseId=<LEASE_ID>, relinquishInProgress=false}

    Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
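
To confirm the leadership change without reading the lease log by eye, the check can be scripted. The following is a minimal sketch, not a supported tool: it assumes the log lines have the same LeadershipLease{...} layout as the excerpts above and that a copy of gmle-leadership-lease.log is available as plain text. The function name find_leader_changes is hypothetical.

```python
import re

# Matches the serviceName and leaderId fields of a LeadershipLease{...} entry.
# Assumption: the field order is the same as in the log excerpts above.
LEASE_RE = re.compile(
    r"LeadershipLease\{serviceName=(?P<service>[^,]+), leaderId=(?P<leader>[^,]+),"
)


def find_leader_changes(lines, service="ArAlarmService"):
    """Return (timestamp, old_leader, new_leader) for each leaderId change."""
    changes = []
    previous = None
    for line in lines:
        match = LEASE_RE.search(line)
        if not match or match.group("service") != service:
            continue
        # The timestamp is the first whitespace-separated token of the line.
        timestamp = line.split()[0]
        leader = match.group("leader")
        if previous is not None and leader != previous:
            changes.append((timestamp, previous, leader))
        previous = leader
    return changes
```

Passing the lines of the lease log to find_leader_changes returns an empty list when the leader never changed, and one tuple per change otherwise. A change that falls between the time the alarm was raised and the time it should have cleared matches the condition described in this article.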

Environment

VMware NSX 4.x
VMware NSX-T 3.x

Cause

After a synchronization alarm is legitimately triggered, if the ArAlarmService leader changes from one NSX Manager to another before the alarm resolves, the alarm cannot be cleared.

Resolution

This issue is resolved in VMware NSX 4.2.0, available at Broadcom Downloads.

As a workaround, restart the Async Replicator service on the Local NSX Manager node that is the current ArAlarmService leader:

  1. SSH as the admin user to any NSX Manager.
  2. Identify the leader of the Async Replicator alarm service (ArAlarmService):

    #get cluster status verbose | find ArAlarmService

    In the ASYNC_REPLICATOR section, note the node UUID listed for ArAlarmService:
     ArAlarmService      1       SMALL     <node UUID>       1641194

  3. SSH to that node, if not already on it, and restart the service:

    #systemctl stop async-replicator-service
    #systemctl start async-replicator-service

  4. Resolve the synchronization alarm in the NSX UI.
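
If the output of step 2 is saved to a text file, the leader UUID can be extracted with a short script. The following is a sketch under stated assumptions: the column order (service name, weight, weight category, leader UUID, lease version) is inferred from the sample row above and the lease fields in the log excerpts, not from a documented output format, and the function name find_service_leader is hypothetical.

```python
def find_service_leader(cli_output, service="ArAlarmService"):
    """Return the leader UUID column for `service`, or None if no row matches."""
    for line in cli_output.splitlines():
        fields = line.split()
        # Assumed column order, taken from the sample row:
        # name, weight, weight category, leader UUID, lease version.
        if len(fields) >= 5 and fields[0] == service:
            return fields[3]
    return None
```

The returned UUID identifies the node to connect to in step 3. A result of None means no ArAlarmService row was found in the saved output, in which case the output should be checked manually.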

Note: Rebooting one of the Local Manager nodes might also clear this alarm.

Note: If you see this issue on a version later than 4.2.0, let the alarm remain active for 5 minutes or more, then apply the workaround above, collect logs from both the Local Manager and the Global Manager, and upload them to the support case.