CPU exhaustion on Active Directory server used in IDFW

Article ID: 323540

Products

VMware NSX Networking

Issue/Introduction

Symptoms:

NSX-T Data Center version 3.x

NSX version 4.0.x

Active Directory server used for NSX Identity Firewall (IDFW)

High CPU consumption on the Active Directory (AD) server, which may cause AD to hang or the service to become unresponsive.

NSX Manager logs indicate "Thumbprint mismatch" and "Hand shake with server" errors:

/var/log/proton/nsxapi.log:

2023-08-23T11:20:14.552Z WARN http-nio-127.0.0.1-7440-exec-2 NsxTrustManager 4416 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] Thumbprint mismatch for cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d
2023-08-23T11:20:14.552Z WARN http-nio-127.0.0.1-7440-exec-2 LdapsSocketFactory 4416 INVENTORY [nsx@6876 comp="nsx-manager" level="WARNING" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] Hand shake with server give exception: cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d
javax.net.ssl.SSLHandshakeException: cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d
2023-08-23T11:20:14.864Z ERROR http-nio-127.0.0.1-7440-exec-2 AbstractLdapConnector 4416 INVENTORY [nsx@6876 comp="nsx-manager" errorCode="MP38522" level="ERROR" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] Thumbprint mismatch encountered for LDAP server'ad-01.acme.com' and baseDn 'null'.
2023-08-23T11:20:14.864Z ERROR http-nio-127.0.0.1-7440-exec-2 CompositeConnector 4416 INVENTORY [nsx@6876 comp="nsx-manager" errorCode="MP38502" level="ERROR" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] Simple Connector: connect() failed after 621 ms
2023-08-23T11:20:14.864Z ERROR http-nio-127.0.0.1-7440-exec-2 CompositeConnector 4416 INVENTORY [nsx@6876 comp="nsx-manager" errorCode="MP38502" level="ERROR" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] Simple Connector: cannot connect to LDAP server: ad-01.acme.com, Error: Thumbprint provided 'null' does not match with the thumbprint from the LDAP server 'ad-01.acme.com', 'cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d'.
2023-08-23T11:20:14.864Z ERROR http-nio-127.0.0.1-7440-exec-2 DirectoryServiceFacadeImpl 4416 INVENTORY [nsx@6876 comp="nsx-manager" errorCode="MP38519" level="ERROR" reqId="0f6b46e6-d917-445f-8220-6564a53cab03" subcomp="manager" username="[email protected]"] LDAP server ad-01.acme.com connection failed during verification, error Thumbprint provided 'null' does not match with the thumbprint from the LDAP server 'ad-01.acme.com', 'cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d'.
2023-08-23T11:20:14.865Z INFO http-nio-127.0.0.1-7440-exec-2 NsxBaseRestController 4416 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Error in API /nsxapi/api/v1/directory/ldap-server?action=CONNECTIVITY caused by exception com.vmware.nsx.management.directory.exceptions.DirectoryLdapsThumbprintException: {"moduleName":"DirectoryService","errorCode":38522,"errorMessage":"Thumbprint provided 'null' does not match with the thumbprint from the LDAP server 'ad-01.acme.com', '{Correct thumbprint=cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d}'."}
2023-08-23T11:20:14.867Z WARN http-nio-127.0.0.1-7440-exec-13 PolicyIdentityFacadeImpl 4416 FIREWALL [nsx@6876 comp="nsx-manager" level="WARNING" reqId="a8fbf337-fc73-4b8e-aa73-9c13d2c3201f" subcomp="manager" username="[email protected]"] Error in IDFW api /api/v1/directory/ldap-server?action=CONNECTIVITY for EP /infra/sites/default/enforcement-points/default. Error Message - Thumbprint provided 'null' does not match with the thumbprint from the LDAP server 'ad-01.acme.com', '{Correct thumbprint=cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d}'.
com.vmware.nsx.management.policy.policyframework.restutils.NsxTRestException: Thumbprint provided 'null' does not match with the thumbprint from the LDAP server 'ad-01.acme.com', '{Correct thumbprint=cc25000c6636bf47c8df0824ab3a94e0f54211f68df4af8daf22125ffabefa2d}'.
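To check quickly which LDAP servers are affected, the mismatch messages can be pulled out of a copy of nsxapi.log. The following is a minimal sketch, not a VMware-provided tool: the function name find_thumbprint_mismatches is hypothetical, and the pattern assumes the message wording is exactly as in the excerpt above.

```python
import re

# Assumption: the message text matches the excerpt above, where the hash
# appears either bare or wrapped as "{Correct thumbprint=...}".
MISMATCH = re.compile(
    r"thumbprint from the LDAP server '(?P<server>[^']+)', "
    r"'(?:\{Correct thumbprint=)?(?P<thumbprint>[0-9a-fA-F]+)\}?'"
)


def find_thumbprint_mismatches(lines):
    """Map each LDAP server name to the set of thumbprints it presented."""
    found = {}
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            found.setdefault(match.group("server"), set()).add(
                match.group("thumbprint")
            )
    return found


# Example usage against a local copy of the log (path is an assumption):
# with open("nsxapi.log", errors="replace") as log:
#     for server, thumbprints in find_thumbprint_mismatches(log).items():
#         print(server, sorted(thumbprints))
```

Lines that do not contain the mismatch message are ignored, so the whole log can be passed in unfiltered.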


In addition, the NSX Manager log shows a high number of errors from "LoginLogoutEventMessageHandler", similar to the following:
grep -v RpcManager /var/log/proton/nsxapi.log
2023-08-24T01:11:29.641Z ERROR scheduling-worker-1 NapiBackupGenerationServiceImpl 4412 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP29256" level="ERROR" subcomp="manager"]
2023-08-24T01:11:29.641Z ERROR scheduling-worker-1 InventorySummaryService 4412 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP29111" level="ERROR" subcomp="manager"] Error while generating inventory summary
2023-08-24T01:11:29.661Z ERROR scheduling-worker-1 InventoryBackupJob 4412 - [nsx@6876 comp="nsx-manager" errorCode="MP29111" level="ERROR" s2comp="backup-restore" subcomp="manager"] Inventory backup failed with ; BackupAsyncStatus [BackupStatus [status=URI_NOT_FOUND, statusDetail=Specified uri '/nsxt/nsxtbackup_prod/inventory-summary/3.2.3.0.0.21703641-70691442-8287-1ac4-6476-7fe9374f4fb6-10.236.153.111/inventory-2023-08-24T01_11_24UTC.json' not found., remoteUri=sftp://10.126.173.25:10022/nsxt/nsxtbackup_prod/inventory-summary/3.2.3.0.0.21703641-70691442-8287-1ac4-6476-7fe9374f4fb6-10.236.153.111/inventory-2023-08-24T01_11_24UTC.json, errorCode=null, startTime=1692839487773, endTime=1692839489641]; responseBody=null].
2023-08-24T07:31:40.143Z ERROR EamPollingThread VCSoapServiceImpl 4412 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP40428" level="ERROR" subcomp="manager"] Error connecting EAM com.vmware.vim.vmomi.client.common.UnexpectedStatusCodeException: Unexpected status code: 503 on compute manager f3f06b64-0f1a-4a95-8cd9-66bedcf5aa9e
2023-08-24T07:31:40.143Z ERROR EamPollingThread AgencyMonitoringService 4412 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP26153" level="ERROR" subcomp="manager"] [HostPinned]: Failed to get EAM status for compute manager ComputeManagerModelMsg/f3f06b64-0f1a-4a95-8cd9-66bedcf5aa9e, com.vmware.nsx.management.lcm.vc.soap.exceptions.HostBasedVMException: Error connecting EAM on compute manager f3f06b64-0f1a-4a95-8cd9-66bedcf5aa9e. Error Unexpected status code: 503
2023-08-24T10:06:36.223Z ERROR pool-340-thread-1 EventReportSyslogSender 4412 MONITORING [nsx@6876 comp="nsx-manager" entId="0f294c7e-7429-4609-b943-447f7b38dfb2" eventFeatureName="certificates" eventSev="error" eventState="Off" eventType="certificate_is_about_to_expire" level="ERROR" subcomp="manager"] The expiring certificate 0f294c7e-7429-4609-b943-447f7b38dfb2 has been removed or is no longer about to expire.
2023-08-24T10:32:45.648Z ERROR IDFW-Vertical1 LoginLogoutEventListener 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error processing user session events: Error querying LDAP to find parent groups
2023-08-24T10:32:45.648Z ERROR IDFW-Vertical1 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39003" level="ERROR" subcomp="manager"] Error in handling the message in the LoginLogoutEventMessageHandler
2023-08-24T11:24:02.099Z ERROR IDFW-Vertical3 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
2023-08-24T11:25:19.221Z ERROR IDFW-Vertical1 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
2023-08-24T11:26:05.037Z ERROR IDFW-Vertical5 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
2023-08-24T11:27:37.644Z ERROR IDFW-Vertical4 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
2023-08-24T11:31:25.536Z ERROR IDFW-Vertical2 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
2023-08-24T11:32:26.821Z ERROR IDFW-Vertical5 LoginLogoutEventMessageHandler 4412 FIREWALL [nsx@6876 comp="nsx-manager" errorCode="MP39006" level="ERROR" subcomp="manager"] Error handling login/logout event: null
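To judge whether the number of login/logout errors is actually high, the errors can be counted per hour from a copy of nsxapi.log. This is a rough sketch under stated assumptions: the function name idfw_errors_per_hour is hypothetical, and the parsing assumes each line begins with timestamp, level, thread name, and component separated by single spaces, as in the excerpt above.

```python
from collections import Counter

# Components that log the IDFW login/logout errors shown in the excerpt above.
IDFW_COMPONENTS = ("LoginLogoutEventMessageHandler", "LoginLogoutEventListener")


def idfw_errors_per_hour(lines):
    """Count ERROR lines from the IDFW login/logout components per hour."""
    counts = Counter()
    for line in lines:
        # Assumed layout: timestamp, level, thread, component, remainder.
        fields = line.split(" ", 4)
        if len(fields) < 5:
            continue  # continuation lines such as stack traces
        timestamp, level, _thread, component = fields[:4]
        if level == "ERROR" and component in IDFW_COMPONENTS:
            counts[timestamp[:13]] += 1  # bucket key such as "2023-08-24T11"
    return counts


# Example usage against a local copy of the log (path is an assumption):
# with open("nsxapi.log", errors="replace") as log:
#     for hour, count in sorted(idfw_errors_per_hour(log).items()):
#         print(hour, count)
```

Thread names that contain spaces would break the field split, so treat the counts as an estimate rather than an exact figure.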


Environment

VMware NSX-T Data Center

Cause

This is caused by scalability issues in the on-demand LDAP querying of the AD server, which can exhaust resources on the AD server.

Resolution

This is a known issue affecting NSX-T Data Center 3.2.x and NSX 4.0.x.

This issue is resolved in NSX 4.1 and higher.


Workaround:

None.