NSX-T objects appear to be missing after proton or policy service restarts


Article ID: 316649

Products

VMware NSX

Issue/Introduction

Symptoms:
  • NSX-T version is 2.4.x
  • NSX Manager logs (nsxapi.log, policy.log, syslog) contain messages showing that the NSX Manager or Policy service restarted, either manually or automatically, similar to:
NSX Manager service automatic restart (nsxapi.log):
2019-11-06T09:34:44.749Z INFO CorfuDB-DCN-Publisher-0 ContainerConfigServiceImpl - - [nsx@6876 comp="nsx-manager" subcomp="manager"] Restart application now.

NSX Manager service manual restart (syslog):
<182>1 2020-01-06T10:09:11.599Z nsx-ctl-2-bb145 NSX 8973 - [nsx@6876 comp="nsx-cli" subcomp="node-mgmt" username="admin" level="INFO"] CMD: restart service manager

Policy service automatic restart (policy.log):
2019-12-02T09:42:05.777Z  INFO localhost-startStop-2 ContainerConfigServiceImpl - - [nsx@6876 comp="nsx-manager" subcomp="policy"] Out-of-bound lifecycle event received: Restart = true

Policy service manual restart (syslog):
<182>1 2020-01-07T10:38:03.664Z nsx-mngr-01 NSX 9481 - [nsx@6876 comp="nsx-cli" subcomp="node-mgmt" username="root" level="INFO"] CMD: restart service policy
  • NSX Manager logs (nsxapi.log or policy.log) contain messages showing that a trimmed exception occurred while the service was restarting, similar to:
2019-11-06T09:36:58.379Z  WARN pool-2-thread-1 FastObjectLoader - applyForEachAddress[171952709, start=171952684] address is trimmed
2019-11-06T09:36:58.387Z  WARN pool-2-thread-1 FastObjectLoader - applyForEachAddress[171952709, start=171957863] address is trimmed
  • NSX-T objects (NSGroups, firewall sections and rules, logical switches, logical ports, etc.) appear to be missing in the NSX Manager UI and in the results of API calls such as:
GET /api/v1/ns-groups
GET /api/v1/firewall/sections/summary
GET /api/v1/logical-switches/status
GET /api/v1/logical-ports




NOTE: The impact of this issue depends on which objects are missing. Network traffic can be affected if the objects it depends on are missing.
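
The log symptoms above can be cross-checked by looking for an "address is trimmed" warning that follows a restart message. The following is a minimal Python sketch, not a VMware tool. The function names and the 10-minute window are illustrative assumptions, and the input is assumed to be log lines in chronological order. The message fragments are copied from the excerpts in this article.

```python
import re
from datetime import datetime, timedelta

# Message fragments copied from the log excerpts in this article.
RESTART_MARKERS = (
    "Restart application now.",
    "Out-of-bound lifecycle event received: Restart = true",
    "CMD: restart service manager",
    "CMD: restart service policy",
)
TRIMMED_MARKER = "address is trimmed"

# Matches timestamps such as 2019-11-06T09:34:44.749Z anywhere in a line,
# so syslog lines with a "<182>1 " prefix are handled as well.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")


def parse_timestamp(line):
    """Return the first timestamp found in the line, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%S.%fZ")


def trimmed_after_restart(lines, window=timedelta(minutes=10)):
    """Pair each trimmed warning with the latest restart that precedes it.

    The window is a hypothetical value chosen for illustration only.
    Lines are assumed to be in chronological order.
    """
    restarts, pairs = [], []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is None:
            continue
        if any(marker in line for marker in RESTART_MARKERS):
            restarts.append(stamp)
        elif TRIMMED_MARKER in line:
            earlier = [r for r in restarts if timedelta(0) <= stamp - r <= window]
            if earlier:
                pairs.append((max(earlier), stamp))
    return pairs
```

In the nsxapi.log excerpts above, the first warning (09:36:58.379) follows the restart message (09:34:44.749) by a little over two minutes, so the two lines would be returned as one pair.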

Environment

VMware NSX-T

Cause

The issue is caused by concurrent operations: while a restart of the NSX Manager or Policy service is in progress, Corfu performs routine trim and checkpoint operations.
When this happens, a Corfu trimmed exception occurs, the affected NSX Manager is presented with empty Corfu tables, and the NSX-T objects appear to be missing.

Resolution

This issue is resolved in NSX-T 2.5.0.

Workaround:
To work around the issue, restart the NSX Manager or Policy service on the impacted NSX Manager(s).

1. Identify the impacted NSX Manager(s):
# grep "address is trimmed" /var/log/proton/nsxapi.log
# grep "address is trimmed" /var/log/policy/policy.log


Example of output:
2019-11-06T09:36:58.379Z  WARN pool-2-thread-1 FastObjectLoader - applyForEachAddress[171952709, start=171952684] address is trimmed
2019-11-06T09:36:58.387Z  WARN pool-2-thread-1 FastObjectLoader - applyForEachAddress[171952709, start=171957863] address is trimmed


2. Based on the result of step 1, restart the relevant service:
- If the "address is trimmed" message is found in nsxapi.log, restart the NSX Manager service:
#> restart service manager
- If the "address is trimmed" message is found in policy.log, restart the Policy service:
#> restart service policy
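
The decision in steps 1 and 2 can be sketched as a small Python helper that maps each log file to the restart command it calls for. This is an illustrative sketch only: the names `services_to_restart` and `read_lines` are hypothetical. It only reports which NSX CLI command to run and does not execute anything. It also does not check whether a warning predates an earlier, successful restart.

```python
TRIMMED_MARKER = "address is trimmed"

# Log file -> NSX CLI command, as listed in steps 1 and 2.
RESTART_COMMANDS = {
    "/var/log/proton/nsxapi.log": "restart service manager",
    "/var/log/policy/policy.log": "restart service policy",
}


def read_lines(path):
    """Read a log file, returning an empty list if it does not exist."""
    try:
        with open(path, errors="replace") as handle:
            return handle.readlines()
    except FileNotFoundError:
        return []


def services_to_restart(log_lines_by_path):
    """Return the CLI command for every log that contains the warning.

    log_lines_by_path maps a log path to an iterable of its lines.
    """
    commands = []
    for path, command in RESTART_COMMANDS.items():
        lines = log_lines_by_path.get(path, [])
        if any(TRIMMED_MARKER in line for line in lines):
            commands.append(command)
    return commands


if __name__ == "__main__":
    logs = {path: read_lines(path) for path in RESTART_COMMANDS}
    for command in services_to_restart(logs):
        print(command)
```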

3. Verify the object counts are as expected using REST API:
GET /api/v1/ns-groups
GET /api/v1/firewall/sections/summary
GET /api/v1/logical-switches/status
GET /api/v1/logical-ports
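
The comparison in step 3 can be sketched in Python as follows, assuming the JSON responses have already been fetched and parsed, and that expected counts were recorded earlier (for example, from a backup or an unaffected manager). The helper names are hypothetical. The `result_count` and `results` fields are assumed to be present in list-style responses; the summary and status endpoints may return a different shape, in which case the count is reported as `None`.

```python
def count_objects(payload):
    """Return the object count reported by one parsed API response.

    Assumes a list-style response with "result_count" and/or "results".
    Returns None if neither field is present.
    """
    if isinstance(payload.get("result_count"), int):
        return payload["result_count"]
    if isinstance(payload.get("results"), list):
        return len(payload["results"])
    return None


def find_mismatches(expected, responses):
    """Compare expected counts per endpoint with parsed API responses.

    Returns {endpoint: (expected_count, actual_count)} for every endpoint
    whose count differs from the expected value.
    """
    mismatches = {}
    for endpoint, expected_count in expected.items():
        actual = count_objects(responses.get(endpoint, {}))
        if actual != expected_count:
            mismatches[endpoint] = (expected_count, actual)
    return mismatches
```

An empty result from `find_mismatches` suggests that the counts match the recorded baseline. Any entry points to an endpoint that still returns fewer or more objects than expected.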


Note: The Corfu trimmed exception may occur again after a manual restart of the NSX Manager or Policy service. If this happens, restart the service again.