Socket read timeout.
/opt/vmware/h4/cloud.log
file on the destination Cloud Director Replication Management Appliance, you see an entry similar to:2024-09-27 10:55:30.489 ERROR - [########-####-####-####-########0831] [c4-scheduler-4] c.v.h.c.v.i. LocalVcloud InventoryService : Failed to update cluster member cache for Datastore{id='########-####-####-####-########12dd', name='EPG-Cloud-DR', type='application/vnd.vmware.admin.datastore+xml', moRef='group-p3140388', vimServerId='########-####-####-####-########203d', vimServerInstanceUuid='########-####-####-####-########6b70',totalCapacity=1.154473728E9, usedCapacity=6.74044736E8, usedCapacityPercent=58.38545227050781, provisionedSpace=1.28298624E9, requestedStorage=6.05552082E8, isCluster=true}
com.vmware.exception.ReadTimeoutException: Read timed out
at com.vmware.exception.converter.ClientExceptionConverter.convertException (ClientExceptionConverter.java:40)
at com.vmware.rest.client.AbstractRestClient.genericExchange (Abstract RestClient.java:160)
at com.vmware.rest.client. Abstract RestClient.genericExchange (Abstract RestClient.java:104)
at com.vmware.rest.client.json. RestClient.exchange (RestClient.java:97)
at com.vmware.rest.client.json. RestClient.exchange (RestClient.java:128)
at com.vmware.rest.client.json. RestClient.exchange (RestClient.java:171)
at com.vmware.h4.manager.client.ManagerClient.getDsClusterMembers (ManagerClient.java:2897)
at com.vmware.h4.cloud.vcloud.inventory. LocalVcloud InventoryService.retrieveDsClusterMembers (LocalVcloud InventoryService.java:636)
at com.vmware.h4.cloud.vcloud.inventory. LocalVcloud InventoryService.refreshDatastores Impl (LocalVcloudInventoryService.java:1684)
at com.vmware.h4.cloud.vcloud.inventory. LocalVcloud InventoryService.refreshDatastores (LocalVcloud InventoryService.java:1664)
at com.vmware.h4.cloud.vcloud.inventory.LocalVcloudInventoryService.lambda$scheduleVcloudDataRefresh$90 (LocalVcloudInventoryService.java:2617)
at com.vmware.h4.common.mdc.MDCRunnableWrapper.run (MDCRunnableWrapper.java:30)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:539)
at java.base/java.util.concurrent.FutureTask.runAndReset (FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.base/sun.nio.ch. NioSocketImpl.timedRead (NioSocketImpl.java:288)
at java.base/sun.nio.ch. NioSocketImpl.implRead(NioSocketImpl.java:314)
at java.base/sun.nio.ch. NioSocketImpl.read(NioSocketImpl.java:355)
at java.base/sun.nio.ch. NioSocketImpl$1.read(NioSocketImpl.java:808)
at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
at java.base/sun.security.ssl.SSLSocketInput Record. read (SSLSocketInputRecord.java:484)
at java.base/sun.security.ssl.SSLSocketInput Record.readHeader (SSLSocketInputRecord.java:478)
/opt/vmware/h4/manager/log/manager.log
file on the destination Cloud Director Replication Management Appliance, you see an entry similar to:2024-09-27 10:55:29.655 WARN - [UI-########-####-####-####-############-#####-##-jn] [job-91] com.vmware.h4.jobengine.ChildJobManager : Child job failures [Task{id=########-####-####-####-########8e85, user='<username>@<organization name>', workflowInfo=WorkflowInfo{type='start', resourceType='vmReplication', resourceId='C4-########-####-####-####-############', isPrivate=false, resourceName='<vm name>'}, progress=100, state=FAILED, lastUpdated=1727434383225, startTime=1727434312058, endTime=1727434383226, isAlreadyStarted=false, warnings=[], result=null, error=com.vmware.h4.api.error.exceptions.client.ReadTimeoutException: Socket read timeout.}]
2024-09-27 10:55:29.655 ERROR -[UI-########-####-####-####-############-#####-##-jn] [job-91] com.vmware.h4.jobengine.JobExecution: Task ########-####-####-####-########44a4 (WorkflowInfo{type='start', resourceType='vappReplication', resourceId='C4VAPP-########-####-####-####-############', isPrivate=false, resourceName='null'}) has failed
com.vmware.h4.api.error.exceptions.client.ReadTimeoutException: Socket read timeout
at jdk.internal.reflect.GeneratedConstructorAccessor1429.newInstance (Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller (Constructor.java:499)
at java.base/java.lang.reflect.Constructor.newInstance (Constructor.java:480)
at com.vmware.h4.api.error.GenericServerExceptionProvider.get (GenericServerException Provider.java:123)
at com.vmware.h4.api.error.GenericServerExceptionProvider.get(GenericServerExceptionProvider.java:95)
at com.vmware.h4.common.job.TaskRepo.deserializeError (TaskRepo.java:526)
at com.vmware.h4.common.job.TaskRepo.toTask (TaskRepo.java:463)
at com.vmware.h4.common.job.TaskRepo.toTask (TaskRepo.java:445)
at com.vmware.h4.cloud.jobengine.C4TaskRepo.toTask (C4TaskRepo.java:331)
at com.vmware.h4.cloud.jobengine.C4TaskRepo.toTask (C4TaskRepo.java:63)
at com.vmware.h4.common.job.TaskRepo.find(TaskRepo.java:311)
at com.vmware.h4.common.job.TaskRepo.find(TaskRepo.java:305)
/opt/vmware/h4/replicator/log/replicator.log
file on the destination Replicator, you see an entry similar to:2024-09-27 10:55:25.244 WARN - [########-####-####-####-########3d6f] [pc-host-monitor-1] c.v.h.c.e.ExceptionConversionService : Unable to convert exception. Using fallback exception instead.
com.vmware.vim.binding.hbr.replica.fault.HbrRuntimeFault: Exception Vmacore::Exception: Can't login to the host; Connection failed for agent ########-####-####-####-########6b70: host-####02/hostd:
at jdk.internal.reflect.GeneratedConstructorAccessor388.newInstance (Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance (DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller (Constructor.java:499)
at java.base/java.lang.reflect.ReflectAccess.newInstance (ReflectAccess.java:128)
at java.base/jdk.internal.reflect. ReflectionFactory.newInstance (ReflectionFactory.java:347)
at java.base/java.lang.Class.newInstance (Class.java:645)
at com.vmware.vim.vmomi.core.types.impl.ComplexTypeImpl.newInstance (ComplexTypeImpl.java:174)
at com.vmware.vim.vmomi.core.types.impl.DefaultDataObjectFactory.newDataObject (DefaultDataObjectFactory.java:25)
at com.vmware.vim.vmomi.core.soap.impl.unmarshaller. ComplexStackContext.<init>(ComplexStackContext.java:30)
at com.vmware.vim.vmomi.core.soap.impl.unmarshaller. StackContextFactory.newContext(StackContextFactory.java:113)
/var/log/vmware/hbrsrv.log
file on the destination Replicator, you see entries similar to:2024-09-27T11:06:04.185Z verbose hbrsrv [01974] [Originator@6876 sub-PropertyProvider] Recordop ASSIGN: groupStats, Hbr.Replica.Group.H4-########-####-####-####-########74e1. Applied change to temp map.
2024-09-27T11:06:04.193Z warning hbrsrv [505469] [Originator@6876 sub-10.Connection opID=########-####-####-####-########4c36] Failed to connect; io obj p:0x00007fca044b31c0, h:40, <TCP 192.0.2.51: 56718'>, <TCP 192.0.2.50: 443'>>, e: 110 (Connection timed out), duration: 130458msec
2024-09-27T11:06:04.194Z verbose hbrsrv [505469] [Originator@6876 sub-10.Connection opID=########-####-####-####-########4c36] Attempting connection; <resolver p:0x00007fc9e00cfcf0, '192.0.2.50:443', next: (null)>, last e: 110 (Connection timed out)
2024-09-27T11:06:04.194Z warning hbrsrv [505469] [Originator@6876 sub-HttpConnectionPool-000000 opID=########-####-####-####-########4c36] Failed to get pooled connection; <cs p:00007fc9e0023b80, TCP:192.0.2.50:443>, (null), duration: 130459msec, N7Vmacore15SystemExceptionE (Connection timed out)
2024-09-27T11:06:04.194Z info hbrsrv [505469] [Originator@6876 sub-10.Http opID=########-####-####-####-########4c36] Set user agent error; state: 1, (null), N7Vmacore15SystemExceptionE (Connection timed out)
2024-09-27T11:06:04.1942 error hbrsrv [505469] [Originator@6876 sub-10.Http opID=########-####-####-####-########4c36] User agent failed to send request; (null), N7Vmacore15SystemExceptionE (Connection timed out)
2024-09-27T11:06:04.194Z info hbrsrv [01839] [Originator@6876 sub-AgentConnection opID=########-####-####-####-########4c36] Agent bd99958f-d65a-42 e8-97d6-064526086b70: host-####02/hostd: failed to log in. Connection type: /hbrTunnel
2024-09-27T11:06:04.194Z verbose hbrsrv [01839] [Originator@6876 sub-vmomi.soapStub [13064] opID=########-####-####-####-########4c36] Using VMODL version; internalvim25/u6A440899 (vim.version.unstable), <<cs p:00007fc9e00253c0, TCP:192.0.2.50:80>, /sdk>
2024-09-27T11:06:04.195Z verbose hbrsrv [523454] [Originator@6876 sub-10.Connection opID=########-####-####-####-########4c36] Attempting connection; <resolver p:0x00007fc9e0009010, '192.0.2.50:80', next:<TCP 192.0.2.50: 80'>>, last e: 0 (Success)
2024-09-27T11:06:04.220Z verbose hbrsrv [523430] [Originator@6876 sub-RemoteTask groupID=H4-########-####-####-####-########75f0 opID=hs1-43257a88] Completing task haTask--vim.VirtualDiskManager.createChildDisk-4119186218 after 3 sec (success)
VMware Cloud Director Availability 4.x
This issue is caused by a change in how the HBR service communicates with the ESXi hosts by default in the version that is used in VMware Cloud Director Availability 4.7.0, connecting now over port 443 instead of port 80.
This issue is resolved in Cloud Director Availability 4.7.1, available at Broadcom Downloads.
To work around this issue if you are unable to upgrade, follow the workaround steps outline in the VMware Cloud Director Availability 4.7 Release Notes.