The NSX Manager syslog contains stale realizations of entities that have already been removed or deleted from the CCP

Article ID: 398630

Updated On:

Products

VMware NSX

Issue/Introduction

  • NSX Manager log file /var/log/syslog shows the error "Logical Switch not successfully realized"

NSX 5251 POLICY [nsx@6876 comp="nsx-manager" errorCode="PM610532" level="ERROR" subcomp="manager"] Entity LogicalSwitch/########-#####-#####-###### not successfully realized after max retry attempt. Updated retry count to second highest value 55 to retry again at max interval.

  • NSX Manager log file /var/log/syslog may also show the error "NatSection not successfully realized"

NSX 5251 POLICY [nsx@6876 comp="nsx-manager" errorCode="PM610532" level="ERROR" subcomp="manager"] Entity NatSection/########-#####-#####-###### not successfully realized after max retry attempt. Updated retry count to second highest value 55 to retry again at max interval.

  • NSX Manager log file /var/log/cloudnet/nsx-ccp-rtf-api.log may show the following warning

WARN nsx-rpc:CCP-AphProvider-########-#####-#####-###### :user-executor-1 rtfapilogger 22651 - [nsx@6876 comp="nsx-controller" level="WARNING" subcomp="realization"] Falcon doesn't contain the following queried entities: LogicalSwitch(########-#####-#####-###### )

WARN nsx-rpc:CCP-AphProvider-########-#####-#####-###### :user-executor-2 rtfapilogger 22651 - [nsx@6876 comp="nsx-controller" level="WARNING" subcomp="realization"] Falcon doesn't contain the following queried entities: LogicalSwitch(########-#####-#####-###### )
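To confirm the symptom, you can count matching syslog entries by their error code. A minimal sketch, assuming the log path and PM610532 error code shown above; the sample line is a stand-in so the snippet is self-contained, while on a live Manager you would grep /var/log/syslog directly:

```shell
# On a live manager:  grep -c 'errorCode="PM610532"' /var/log/syslog
# Stand-in sample line modelled on the errors above:
line='NSX 5251 POLICY [nsx@6876 comp="nsx-manager" errorCode="PM610532" level="ERROR" subcomp="manager"] Entity LogicalSwitch/... not successfully realized after max retry attempt.'
count=$(printf '%s\n' "$line" | grep -c 'errorCode="PM610532"')
echo "$count"
```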

Environment

VMware NSX

Cause

  • The warnings and errors in the logs indicate repeated retries by the CCP (Central Control Plane) and the Policy RTF (Realization Tracker Framework) on the Management Plane to realize entities that have already been removed.
  • NSX Manager /var/log/syslog shows the LogicalSwitch being deleted

NSX 5186 SWITCHING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Request to delete LogicalSwitch : LogicalSwitch/#########-####-####-########
NSX 5186 SWITCHING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Deleted LogicalSwitch with Id [LogicalSwitch/#########-####-####-########]
NSX 5186 ROUTING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Deleting internal switch Id #########-####-####-######## for LR #########-####-####-########
NSX 22651 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="ccp"] Receives update for LogicalSwitch, key: uuid {#########-####-####-######## }, operation: DELETE
NSX 22651 - [nsx@6876 comp="nsx-controller" level="WARNING" subcomp="adapter-ufo"] The dependency node does not exist for id uuid {##########-####-####-########
NSX 22651 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="adapter-ufo"] Delete a UFO entity. key : uuid {#########-####-####-########
NSX 22651 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="redistribution"] Deleted object for LogicalSwitch(#########-####-####-########)

  • On the NSX Manager in /var/log/syslog, on the Proton side, the Realization Tracker Framework starts querying for the realization status.
  • Initially, the status returned from the CCP is "CCP is in the process of converging." This is returned 4 times while changes are still pending.

INFO com.vmware.nsx.management.policy.policyframework.realization.StatusTracker RealizationStateServiceImpl 5186 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Returning current realization status 'Status = 'IN_PROGRESS', Message = 'CCP is in the process of converging.', TNs = '[]', Pending Changes = '[########-####-####-########, ########-####-####-########, ########-####-####-########, ########-####-####-########]'.' for entity 'LogicalSwitch/########-####-####-########'.

INFO LogicalSwitchStateOperationScheduler2 NsxPortgroupExecuteVcUtils 5186 SWITCHING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] nsxpg deleted on vds ########-####-####-######## for ls LogicalSwitchState/########-####-####-########

  • Later, the response changes to "There are no transport nodes in the span". At this point there are no pending changes.

INFO com.vmware.nsx.management.policy.policyframework.realization.StatusTracker RealizationStateServiceImpl 5186 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Returning current realization status 'Status = 'IN_PROGRESS', Message = 'There are no transport nodes in the span.', TNs = '[]', Pending Changes = '[]'.' for entity 'LogicalSwitch/########-####-####-########'.

  • The above messages repeat multiple times. Then the following message is logged, indicating the logical switch was not successfully realized. This pattern then continues in a loop.

ERROR com.vmware.nsx.management.policy.policyframework.realization.StatusTracker DefaultExponentialRetryableImpl 5186 POLICY [nsx@6876 comp="nsx-manager" errorCode="PM610532" level="ERROR" subcomp="manager"] Entity LogicalSwitch/########-####-####-######## not successfully realized after max retry attempt. Updated retry count to second highest value 55 to retry again at max interval.

  • These stale entries projected by the Policy RTF (Realization Tracker Framework) can cause unnecessary processing and logging.
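The deletion sequence above can be cross-checked by searching the syslog for the UUID reported in the PM610532 errors; if the entity was deleted before the retries began, both the delete and the retry messages will reference the same ID. A minimal sketch, where the UUID is hypothetical and the log line is a stand-in modelled on the entries above:

```shell
# Hypothetical UUID -- substitute the one from your PM610532 errors.
uuid="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
# On a live manager:  grep "$uuid" /var/log/syslog
# Stand-in sample line so the snippet is self-contained:
line="NSX 5186 SWITCHING [nsx@6876 comp=\"nsx-manager\" level=\"INFO\" subcomp=\"manager\"] Deleted LogicalSwitch with Id [LogicalSwitch/$uuid]"
matches=$(printf '%s\n' "$line" | grep -c "Deleted LogicalSwitch.*$uuid")
echo "$matches"
```

A match on the "Deleted LogicalSwitch" line for the same UUID that appears in the retry errors confirms the RTF is tracking an already-removed entity.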

Resolution

This issue is resolved in VMware NSX 4.2.3, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

 

Workaround:

As a temporary workaround, reboot the NSX Managers one by one.

  • Reboot the first NSX Manager.
  • Once the NSX Manager is back online, SSH into the Manager as the admin user and check cluster health by running the command get cluster status.
  • When all services report UP on all 3 NSX Manager nodes, reboot the next Manager.
  • Repeat the same process until the third Manager has been rebooted.
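The rolling reboot above can be sketched as a loop. The hostnames are hypothetical, the ssh and reboot steps are commented out so the sketch is safe to run as-is, and get cluster status is the NSX CLI command referenced in the steps above:

```shell
# Rolling-reboot sketch -- adapt hostnames to your environment.
managers="nsxmgr-01 nsxmgr-02 nsxmgr-03"   # hypothetical node names
for mgr in $managers; do
  echo "Rebooting $mgr ..."
  # ssh "admin@$mgr" reboot
  # Wait until no service reports DOWN before moving to the next node:
  # while ssh "admin@$mgr" get cluster status | grep -q DOWN; do sleep 60; done
  echo "$mgr rebooted, cluster healthy"
done
```

Rebooting one node at a time and waiting for all services to report UP preserves cluster quorum throughout the procedure.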