SRM Server cannot connect to VR Management Server at 'https://vSphere_replication_FQDN:8043'. A runtime error occurred in the vSphere Replication Management Server

Article ID: 416820

Products

VMware Live Recovery

Issue/Introduction

After migrating the Site Recovery Manager (SRM) and vSphere Replication Management Server (VRMS) appliances to a new cluster within the same vCenter, the following error is reported:

SRM_Server_name SRM Server cannot connect to VR Management Server at 'https://vSphere_replication_FQDN:8043'. A runtime error occurred in the vSphere Replication Management Server. Exception details: 'Task java.util.concurrent.FutureTask@7cbc7e04[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@58cf60d6[Wrapped task = com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2@74b3dc6f]] rejected from java.util.concurrent.ThreadPoolExecutor@2bfbffb2[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 43]'

On the SRM appliance, /opt/vmware/support/logs/srm/vmware-dr.log contains entries similar to the following:

YYYY-MM-DD HH:MM:SS verbose vmware-dr[02547] [SRM@6876 sub=IO.Connection] Attempting connection; <resolver p:0x00007f53d43ef140, '<vSphere_replication_FQDN>:8043', next:<TCP '###.###.###.### : 8043'>>, last e: 0(Success)
YYYY-MM-DD HH:MM:SS info vmware-dr[02568] [SRM@6876 sub=LocalSite.LocalVersionReg connID=hms-99c4] Compatible VMODL version found: 'hms.version.version23/internalhmssrv/9.0.2'
YYYY-MM-DD HH:MM:SS verbose vmware-dr[02568] [SRM@6876 sub=LocalHms connID=hms-99c4] Using VMODL version 'hms.version.version23/internalhmssrv/9.0.2'

YYYY-MM-DD HH:MM:SS info vmware-dr[02566] [SRM@6876 sub=vmomi.soapStub[1290]] SOAP request returned HTTP failure; <SSL(<io_obj p:0x00007f5418098c50, h:20, <TCP '###.###.###.### : 33020'>, <TCP '###.###.###.### : 8043'>>), />, method: GetContent; code: 500(Internal Server Error); fault: (hms.fault.HmsRuntimeFault) {
-->    faultCause = (vim.fault.InvalidState) {
-->       faultCause = (vmodl.MethodFault) null,
-->       faultMessage = <unset>
-->       msg = "The operation is not allowed in the current state."
-->    },
-->    faultMessage = <unset>,
-->    originalMessage = "EntityManagerFactory is closed"
-->    msg = "Received SOAP response fault from [<SSL(<io_obj p:0x00007f5418098c50, h:20, <TCP '###.###.###.### : 33020'>, <TCP '###.###.###.### : 8043'>>), />]: GetContent
--> A runtime error occurred in the vSphere Replication Management Server. Exception details: 'EntityManagerFactory is closed'."
--> }

YYYY-MM-DD HH:MM:SS warning vmware-dr[02583] [SRM@6876 sub=LocalHms connID=hms-99c4] Failed to connect
--> (hms.fault.HmsRuntimeFault) {
-->    faultCause = (vim.fault.InvalidState) {
-->       faultCause = (vmodl.MethodFault) null,
-->       faultMessage = <unset>
-->       msg = "The operation is not allowed in the current state."
-->    },
-->    faultMessage = <unset>,
-->    originalMessage = "EntityManagerFactory is closed"
-->    msg = "Received SOAP response fault from [<SSL(<io_obj p:0x00007f5418098c50, h:20, <TCP '###.###.###.### : 33020'>, <TCP '###.###.###.### : 8043'>>), />]: GetContent
--> A runtime error occurred in the vSphere Replication Management Server. Exception details: 'EntityManagerFactory is closed'."
--> }
--> [context]zKq7AVECAAQAANjOcAEKdm13YXJlLWRyAAAsGRxsaWJ2bWFjb3JlLnNvAAFaBCZsaWJobXMtdHlwZXMuc28AAVirGgIqISRsaWJ2bW9taS5zbwAC0ywkAM4pNADSQjQA4H1JA7COAGxpYnB0aHJlYWQuc28uMAAE3/oPbGliYy5zby42AA==[/context]

The HMS service failed to start correctly after migration to the new cluster.

The vSphere Replication Management Server (VRMS) failed to establish a connection with the HMS service.

On the vSphere Replication appliance, /opt/vmware/hms/logs/hms.log contains entries similar to the following:


YYYY-MM-DD HH:MM:SS WARN  com.vmware.jvsl.sessions [hms-main-thread-5] (..net.impl.VmomiConnectionBase) [operationID=4fdc8a81-HMS-1712,sessionID=15C98FFE] | Failed to init vmomi client for HMS@769490998
java.util.concurrent.CompletionException: (hms.fault.HmsRuntimeFault) {
   faultCause = (hms.fault.HmsRuntimeFault) {
      faultCause = null,
      faultMessage = null,
      originalMessage = Task java.util.concurrent.FutureTask@58b80454[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@544c3b47[Wrapped task = com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2@46350fc]] rejected from java.util.concurrent.ThreadPoolExecutor@2bfbffb2[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 119]
   },
   faultMessage = null,
   originalMessage = Task java.util.concurrent.FutureTask@58b80454[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@544c3b47[Wrapped task = com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2@46350fc]] rejected from java.util.concurrent.ThreadPoolExecutor@2bfbffb2[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 119]
}
 at java.util.concurrent.CompletableFuture.encodeRelay(Unknown Source) ~[?:?]
 at java.util.concurrent.CompletableFuture.completeRelay(Unknown Source) ~[?:?]
 at java.util.concurrent.CompletableFuture$UniRelay.tryFire(Unknown Source) ~[?:?]
 at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
 at java.util.concurrent.CompletableFuture.completeExceptionally(Unknown Source) ~[?:?]
 at com.vmware.hms.net.impl.hms.HmsConnection.lambda$3(HmsConnection.java:154) ~[hms.jar:?]
 at com.vmware.hms.util.executor.LoggerOpIdConfigurator$RunnableWithDiagnosticContext.run(LoggerOpIdConfigurator.java:132) ~[hms.jar:?]
 at com.vmware.hms.util.executor.LoggerOpIdConfigurator$2.run(LoggerOpIdConfigurator.java:99) ~[hms.jar:?]
 at com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2.run(TlsPreservingWrapper.java:47) ~[jvsl-sessions-


YYYY-MM-DD HH:MM:SS WARN  com.vmware.jvsl.sessions [hms-main-thread-5] (..net.impl.VmomiConnection) [operationID=4fdc8a81-HMS-1712,sessionID=15C98FFE] | Failed to connect to HMS@769490998.
java.util.concurrent.CompletionException: (hms.fault.HmsRuntimeFault) {
   faultCause = (hms.fault.HmsRuntimeFault) {
      faultCause = null,
      faultMessage = null,
      originalMessage = Task java.util.concurrent.FutureTask@58b80454[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@544c3b47[Wrapped task = com.vmware.jvsl.sessions.net.impl.TlsPreservingWrapper$2@46350fc]] rejected from java.util.concurrent.ThreadPoolExecutor@2bfbffb2[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 119]
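To confirm that an appliance is showing the same signatures as the excerpts above, the two logs can be searched for the fault strings. The following is a minimal sketch; `scan_log` is a hypothetical helper name, and the paths and strings in the commented examples are taken from the excerpts in this article.

```shell
#!/bin/sh
# scan_log PATTERN LOGFILE
# Counts lines in LOGFILE that contain PATTERN as a fixed string.
# Returns 0 if at least one line matches, 1 if none match,
# and 2 if the log file cannot be read.
scan_log() {
    pattern=$1
    logfile=$2
    if [ ! -r "$logfile" ]; then
        echo "skip: cannot read $logfile"
        return 2
    fi
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the count without aborting under "set -e".
    hits=$(grep -c -F -e "$pattern" "$logfile" || true)
    echo "$hits match(es) for '$pattern' in $logfile"
    [ "$hits" -gt 0 ]
}

# Example invocations (run each on the appliance that owns the log):
# scan_log "EntityManagerFactory is closed" /opt/vmware/support/logs/srm/vmware-dr.log
# scan_log "rejected from java.util.concurrent.ThreadPoolExecutor" /opt/vmware/hms/logs/hms.log
```

A non-zero match count only shows that the signature is present somewhere in the log; check the timestamps of the matching lines against the time of the migration.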

Environment

VMware Live Site Recovery 9.X
vSphere Replication 9.X

Cause

When relocating the Site Recovery Manager (SRM) and vSphere Replication Management Server (VRMS) appliances between clusters within the same vCenter, networking parameters, such as the ESXi host's management port group or vSwitch uplink, may change.

When the vSphere Replication appliance is moved, a delay may occur in refreshing the DNS or vCenter inventory cache. Furthermore, the appliance relocation can delay the complete startup of the HMS service.

Successful pairing requires the HMS service to be healthy and operational.
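Because a stale DNS record is one of the possible causes listed above, it can help to confirm that the vSphere Replication FQDN resolves to the address the appliance actually has after the move. The following is a minimal sketch, assuming a Linux system where `getent` is available; `lookup` and `check_dns` are hypothetical helper names, and the FQDN and IP address in the commented example are placeholders.

```shell
#!/bin/sh
# lookup NAME: print the first address the system resolver returns for NAME.
lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# check_dns NAME EXPECTED_IP
# Compares the resolved address of NAME with EXPECTED_IP.
# Returns 0 when they match, 1 otherwise.
check_dns() {
    name=$1
    expected=$2
    actual=$(lookup "$name")
    if [ -z "$actual" ]; then
        echo "FAIL $name does not resolve"
        return 1
    fi
    if [ "$actual" != "$expected" ]; then
        echo "FAIL $name resolves to $actual, expected $expected"
        return 1
    fi
    echo "OK   $name resolves to $expected"
}

# Example invocation (placeholders, replace with real values):
# check_dns "vSphere_replication_FQDN" "192.0.2.10"
```

Running the same check from the SRM appliance and from the vSphere Replication appliance shows whether both sides agree on the address.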

Resolution

vSphere Replication Connectivity Checklist

1) Verify network connectivity first.

2) Ensure that firewalls between the vSphere Replication appliance and the ESXi hosts are not blocking connections.

3) Ensure that all ESXi hosts in the new cluster can communicate with the vSphere Replication appliance and the SRM appliance over port 443.

To test connectivity from the vSphere Replication appliance, use ping, curl, or openssl (or nc, if available) against vCenter and the target ESXi hosts, for example:

ping <vcenter_fqdn>

curl -vk https://<vcenter_fqdn>

openssl s_client -connect <esx-ip>:443
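When several hosts need to be checked, the individual commands above can be wrapped in a small loop. The following is a minimal sketch, assuming an `nc` build that supports the `-z` (scan only) and `-w` (timeout) options; `probe` and `check_targets` are hypothetical helper names, and the host names in the commented example are placeholders. Ports 443 and 8043 are the ones named in this article.

```shell
#!/bin/sh
# probe HOST PORT: succeeds if a TCP connection to HOST:PORT can be opened
# within 3 seconds. It does not validate TLS or the service behind the port.
probe() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# check_targets HOST:PORT [HOST:PORT ...]
# Prints one OK/FAIL line per target and returns 1 if any target failed.
# Targets are split at the last colon, so bare IPv6 literals are not supported.
check_targets() {
    failed=0
    for target in "$@"; do
        host=${target%:*}
        port=${target##*:}
        if probe "$host" "$port"; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
            failed=1
        fi
    done
    return "$failed"
}

# Example invocation (placeholders, replace with real names):
# check_targets "esx01.example.com:443" "esx02.example.com:443" "vSphere_replication_FQDN:8043"
```

An OK result only shows that the port accepts TCP connections. In the log excerpts above, port 8043 was reachable and the service still returned HTTP 500, so a successful probe does not prove that the HMS service is healthy.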
