Running net-dvs -l in an SSH session on the ESXi host shows the port as blocked:
port ########-####-####-####-############:
com.vmware.common.port.volatile.status = inUse linkUp blocked portID=###### Port blocked by admin propType = RUNTIME
In /var/log/hostd.log, we can see that the VM was migrated to, or powered on, the host just before the connectivity issue started:
<time stamp> In(166) Hostd[2102730]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/6###-d##-3##-3###/<vm-path>/<vm-name>.vmx opID=CdrsLoadBalancer- sid=52c9df38 user=vpxuser:<no user>] State Transition (VM_STATE_OFF -> VM_STATE_IMMIGRATING)
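The blocked-port check on the net-dvs -l output can be scripted. The sketch below is illustrative only: it assumes the output has been saved as text, that each port section starts with a "port <id>:" header, and that the word "blocked" appears among the values of com.vmware.common.port.volatile.status, as in the excerpt above. The function name find_blocked_ports is hypothetical and not part of any VMware tool.

```python
import re

# Assumed layout: a "port <id>:" header followed by property lines.
PORT_HEADER = re.compile(r"^\s*port\s+(\S+):")
STATUS_KEY = "com.vmware.common.port.volatile.status"


def find_blocked_ports(net_dvs_output):
    """Return the IDs of ports whose volatile status contains 'blocked'."""
    blocked = []
    current_port = None
    for line in net_dvs_output.splitlines():
        header = PORT_HEADER.match(line)
        if header:
            current_port = header.group(1)
        # The status may sit on the header line or on a later line.
        if STATUS_KEY in line and "=" in line and current_port is not None:
            values = line.split(STATUS_KEY, 1)[1].split("=", 1)[1].split()
            if "blocked" in values and current_port not in blocked:
                blocked.append(current_port)
    return blocked
```

Any port ID returned here corresponds to a VM vNIC worth checking against the hostd and NSX logs.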
In /var/log/nsx-syslog, we can see an ATTACH_PORT call followed by repeated SYNC_ATTACH_PORT calls for the impacted VM until the issue is remediated or the VM is moved off the host:
<Time stamp> In(182) nsx-opsagent[3287033]: NSX 3287033 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="24216973" level="INFO"] [DoVifPortOperation] request=[opId:[CdrsLoadBalancer-] op:[HOSTD_ATTACH_PORT(1)] vif:[f###-d##-4##-8###-30ffafe6ad23] ls:[2####-3##-4###-9#####a] vmx:[/vmfs/volumes/6####-d#####-3###-3######//<vm-path>/<vm-name>.vmx] lp:
<Time stamp> In(182) nsx-opsagent[3287033]: NSX 3287033 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="3287483" level="INFO"] [DoVifPortOperation] request=[opId:[sync-attach-5] op:[SYNC_ATTACH_PORT(1001)] vif:[f###-d##-4##-8###-30ffafe6ad23] ls:[2####-3##-4###-9#####a] vmx:[/vmfs/volumes/6####-d#####-3###-3######//<vm-path>/<vm-name>.vmx] lp:
<Time stamp> In(182) nsx-opsagent[3287033]: NSX 3287033 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="3287483" level="INFO"] [SwitchingVertical] SendRequest: To Master APH, type (com.vmware.nsx.switching.VifMsg) correlationId () trackingIdStr (2###-b##-c##-8#####) Success. <<<<<<<<<< Copy trackingIdStr
<Time stamp> In(182) nsx-proxy[3286484]: NSX 3286484 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="mpa-proxy-lib" tid="3286484" level="INFO"] MessagingClientService: Heartbeat message received in FrameworkUnifiedMsg from endpoint: ssl://#.#.#15:1234 client_id: 1####-f###-4##-9##-2##### <<<<<<< #.#.#.15 is the manager IP to which this host is connected
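To spot the repeated SYNC_ATTACH_PORT pattern without reading the log by eye, the entries can be counted per VIF. This is a minimal sketch, assuming the nsx-opsagent lines keep the "op:[NAME(code)] vif:[id]" layout shown in the excerpts above; the function name count_sync_attach is hypothetical.

```python
import re
from collections import Counter

# Matches the "op:[NAME(code)] vif:[id]" fragment of a DoVifPortOperation line.
VIF_OP = re.compile(r"op:\[(\w+)\(\d+\)\]\s+vif:\[([^\]]+)\]")


def count_sync_attach(log_lines):
    """Count SYNC_ATTACH_PORT requests per VIF ID."""
    counts = Counter()
    for line in log_lines:
        match = VIF_OP.search(line)
        if match and match.group(1) == "SYNC_ATTACH_PORT":
            counts[match.group(2)] += 1
    return counts
```

A VIF whose count keeps growing between two readings of the log is a likely candidate for this issue.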
<time stamp> INFO L2TaskExecutor10 RpcManager 77237 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Sending error response to call handle of incoming-request with id aad97####5-fc30-d##-d## application SwitchingVertical
<time stamp> WARN GmleClientBlockingOpsThread-1 Lease 77237 - [nsx@6876 comp="nsx-manager" level="WARNING" s2comp="lease" subcomp="manager"] Leadership lease size is 0 for group 1####-7###-3###-b##-f#### and service POLICY_SVC_ROUTING
<time stamp> ERROR GmleClientBlockingOpsThread-1 Lease 77237 - [nsx@6876 comp="nsx-manager" errorCode="GML206" level="ERROR" s2comp="lease" subcomp="manager"] Unable to get LeadershipLease for service POLICY_SVC_ROUTING on member 4####-1###-f#####-a6###### of group 1####-7###-3###-b##-f####.
org.bouncycastle.crypto.fips.FipsOperationError: proportionate test failed >>>>> this is the cause
VMware NSX 4.2.x and later
The error org.bouncycastle.crypto.fips.FipsOperationError: proportionate test failed indicates that BouncyCastle's FIPS-certified cryptographic module failed one of its continuous self-tests. This is a built-in safety mechanism in FIPS 140-2/140-3 certified cryptographic modules. When this self-test fails, the modules running on NSX Manager initialize but enter an error state.
Workaround:
Reboot the NSX Manager node on which the error org.bouncycastle.crypto.fips.FipsOperationError is observed.
A permanent fix will be included in a future release.
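When the cluster has several NSX Manager nodes, the node to reboot is the one whose log contains the error. The sketch below assumes the log text of each node has already been collected by some other means and is passed in as a dictionary keyed by a node label; the labels and the function name nodes_needing_reboot are hypothetical.

```python
FIPS_ERROR = "org.bouncycastle.crypto.fips.FipsOperationError"


def nodes_needing_reboot(logs_by_node):
    """Return the labels of nodes whose log text contains the FIPS error."""
    return sorted(
        node for node, text in logs_by_node.items() if FIPS_ERROR in text
    )
```

An empty result means the error string was not found in the supplied text, which does not by itself rule out the issue if the relevant log was not included.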