NSX-T Managers crash every 10 minutes and display an "application not ready" error message.
This issue is resolved in VMware NSX-T 3.1.3.6, 3.2.x, and later releases.
Workaround:
- Delete all segment ports that are in a realization failed state.
Detailed steps for the workaround
----------------------------
Method 1:
---------
1) Search for "is already attached to logical port" in /var/log/policy/policy.log:
root@nsxmgr-01a:~# grep "is already attached to logical port" /var/log/policy/policy.log
2021-12-15T14:40:09.094Z INFO providerTaskExecutor-7 AlarmServiceImpl32764 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="policy"] Message returned Attachment Id ########-####-####-####-########f368:4000 is already attached to logical port LogicalPort/########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14
<snip>
path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] ] ] path=[/infra/segments/########-####-####-####-########a451/ports/default:9d8f63da-20
2) From the error message in policy.log, identify the attachment ID.
(In this example, the attachment ID is ########-####-####-####-########f368:4000.)
3) Search for the attachment ID in the NSX-T Manager UI.
(Note: You can search with ########-####-####-####-########f368, without the ":4000" suffix. If the segment port name contains special characters, refer to https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.1/administration/GUID-AA9D1719-2860-4B84-8E79-898317F45EE4.html for search instructions; otherwise, search with part of the attachment ID.)
4) Click the Segment Ports that show "failed" in the Status column.
5) On the pop-up, click the three dots beside the segment port name with the failed status and delete it.
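Steps 1 and 2 can also be scripted. The following is a minimal sketch, assuming the log line format shown in the example above. The function name "extract_attachment_ids" is a hypothetical helper, not an NSX-T command. It prints each unique attachment ID without the ":4000"-style suffix, which is the form used for the search in step 3.

```shell
# Hypothetical helper (not part of NSX-T): list unique attachment IDs that
# policy.log reports as already attached to a logical port.
# $1: path to the policy log, for example /var/log/policy/policy.log
extract_attachment_ids() {
    # Keep only the matching fragment of each line, so repeated path=[...] text is ignored.
    grep -o 'Attachment Id [^ ]* is already attached to logical port' "$1" |
        awk '{ print $3 }' |     # third field is the attachment ID, e.g. <uuid>:4000
        sed 's/:[0-9]*$//' |     # drop the trailing ":4000"-style suffix
        sort -u                  # each ID is logged many times; report it once
}
```

For example, running "extract_attachment_ids /var/log/policy/policy.log" on the NSX-T Manager prints one ID per line. The segment ports are still deleted from the UI as described in steps 3 to 5.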
=============
Or
=============
Method 2:
----------
1) Search for "Unable to create logical port" in /var/log/syslog:
grep "Unable to create logical port" /var/log/syslog
2022-01-17T14:20:00.704Z nsxmgr-01a.corp.tanzu NSX 6155 POLICY [nsx@6876 comp="nsx-manager" errorCode="MP500016" level="ERROR" subcomp="policy"] Unable to create logical port LogicalPortDto{logicalSwitchId='########-####-####-####-########9887', attachment='LogicalPortAttachmentDto{attachmentType='VIF', id='########-####-####-####-########f368:4000', context='null'}', adminState='UP', addressBindings='null', switchingProfileIds='null', extraConfigs='null', initState='null', ignoreAddressBindings='null', internalId='null', super{ManagedResource{resourceType='null', aCreateUser='null', aCreateTime='null', aLastModifiedUser='null', aLastModifiedTime='null', aSystemOwned='null', aProtection='null', id='null', displayName='test-policy3', description='Logical Port on LogicalSwitch: ########-####-####-####-########9887', tags='[Tag{scope='policyPath', tag='/infra/segments/VM-T1A/ports/test-policy3'}]', super{RevisionedResource{aRevision='null', super{Resource{aSelf='null', aLinks='null', aSchema='null'}}}}}}} on enforcement point default
2) From the error message in the syslog, identify the "logicalSwitchId" (in this example, ########-####-####-####-########9887).
(You can also identify the attachment ID and follow steps 3, 4, and 5 in Method 1.)
3) Search for the "logicalSwitchId" in the NSX-T Manager UI, then click the segment name.
4) Click the ports, identify all ports with a "failed" status, and delete them.
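Steps 1 and 2 of this method can be scripted in the same way. This is a minimal sketch, assuming the syslog line format shown in the example above. The function name "extract_failed_switch_ids" is a hypothetical helper, not an NSX-T command. It prints each unique "logicalSwitchId" found in "Unable to create logical port" errors.

```shell
# Hypothetical helper (not part of NSX-T): list unique logicalSwitchId values
# from "Unable to create logical port" errors.
# $1: path to the syslog file, for example /var/log/syslog
extract_failed_switch_ids() {
    grep 'Unable to create logical port' "$1" |
        grep -o "logicalSwitchId='[^']*'" |   # keep only the logicalSwitchId='...' fragment
        cut -d "'" -f 2 |                     # the value sits between the single quotes
        sort -u                               # report each switch ID once
}
```

For example, "extract_failed_switch_ids /var/log/syslog" prints one ID per line. Each ID can then be used for the search in step 3.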
*Relevant log locations*:
In /var/log/policy/policy.log, the path is repeated several times in the following lines:
2021-12-15T14:40:09.094Z INFO providerTaskExecutor-7 AlarmServiceImpl32764 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="policy"] Message returned Attachment Id ########-####-####-####-########f368:4000 is already attached to logical port LogicalPort/########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14
<snip>
path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] ] ] path=[/infra/segments/########-####-####-####-########a451/ports/default:9d8f63da-20
2021-12-15T14:40:09.134Z INFO providerTaskExecutor-7 PolicyProviderUtil32764 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="policy"] Created alarm Alarm [policyPath=/infra/realized-state/enforcement-points/default/logical-ports/infra-########-####-####-####-########a451-423_dvportgroup-169525-lp/alarms/########-####-####-####-########243b, message=Attachment Id ########-####-####-####-########f368:4000 is already attached to logical port LogicalPort/########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] .,errorId=REST_API_FAILED, path=null, apiError=error_code=8411, module_name=NsxSwitching service, error_message='Attachment Id ########-####-####-####-########f368:4000 is already attached to logical port LogicalPort/########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14] ]
<snip>
path=[/infra/segments/########-####-####-####-########a451/ports/default:########-####-####-####-########1c14 path=[/infra/segments/########-####-####-####-########a451/ports/default:9d8f63da
2021-12-15T14:40:09.470Z
In /var/log/syslog, you see the following error:
2022-01-17T14:20:00.704Z nsxmgr-01a.corp.tanzu NSX 6155 POLICY [nsx@6876 comp="nsx-manager" errorCode="MP500016" level="ERROR" subcomp="policy"] Unable to create logical port LogicalPortDto{logicalSwitchId='########-####-####-####-########9887', attachment='LogicalPortAttachmentDto{attachmentType='VIF', id='########-####-####-####-########f368:4000', context='null'}', adminState='UP', addressBindings='null', switchingProfileIds='null', extraConfigs='null', initState='null', ignoreAddressBindings='null', internalId='null', super{ManagedResource{resourceType='null', aCreateUser='null', aCreateTime='null', aLastModifiedUser='null', aLastModifiedTime='null', aSystemOwned='null', aProtection='null', id='null', displayName='test-policy3', description='Logical Port on LogicalSwitch: ########-####-####-####-########9887', tags='[Tag{scope='policyPath', tag='/infra/segments/VM-T1A/ports/test-policy3'}]', super{RevisionedResource{aRevision='null', super{Resource{aSelf='null', aLinks='null', aSchema='null'}}}}}}} on enforcement point default
Impact/Risks:
The NSX-T Manager UI becomes unavailable every few minutes, making the system unmanageable.