Error: "Socket read timeout" after upgrading to VMware Cloud Director Availability 4.7.0



Article ID: 378733


Updated On:

Products

VMware Cloud Director

Issue/Introduction

  • After upgrading VMware Cloud Director Availability to 4.7.0, creating new replications fails with an error similar to:

    Socket read timeout.

  • In the /opt/vmware/h4/cloud.log file on the destination Cloud Director Replication Management Appliance, you see an entry similar to:

    2024-09-27 10:55:30.489 ERROR - [########-####-####-####-########0831] [c4-scheduler-4] c.v.h.c.v.i.LocalVcloudInventoryService : Failed to update cluster member cache for Datastore{id='########-####-####-####-########12dd', name='EPG-Cloud-DR', type='application/vnd.vmware.admin.datastore+xml', moRef='group-pID', vimServerId='########-####-####-####-########203d', vimServerInstanceUuid='########-####-####-####-########6b70', totalCapacity=1.154473728E9, usedCapacity=6.74044736E8, usedCapacityPercent=58.38545227050781, provisionedSpace=1.28298624E9, requestedStorage=6.05552082E8, isCluster=true}

    com.vmware.exception.ReadTimeoutException: Read timed out
        at com.vmware.exception.converter.ClientExceptionConverter.convertException(ClientExceptionConverter.java:40)
        at com.vmware.rest.client.AbstractRestClient.genericExchange(AbstractRestClient.java:160)
        at com.vmware.rest.client.AbstractRestClient.genericExchange(AbstractRestClient.java:104)
        at com.vmware.rest.client.json.RestClient.exchange(RestClient.java:97)
        at com.vmware.rest.client.json.RestClient.exchange(RestClient.java:128)
        at com.vmware.rest.client.json.RestClient.exchange(RestClient.java:171)
        at com.vmware.h4.manager.client.ManagerClient.getDsClusterMembers(ManagerClient.java:2897)
        at com.vmware.h4.cloud.vcloud.inventory.LocalVcloudInventoryService.retrieveDsClusterMembers(LocalVcloudInventoryService.java:636)
        at com.vmware.h4.cloud.vcloud.inventory.LocalVcloudInventoryService.refreshDatastoresImpl(LocalVcloudInventoryService.java:1684)
        at com.vmware.h4.cloud.vcloud.inventory.LocalVcloudInventoryService.refreshDatastores(LocalVcloudInventoryService.java:1664)
        at com.vmware.h4.cloud.vcloud.inventory.LocalVcloudInventoryService.lambda$scheduleVcloudDataRefresh$90(LocalVcloudInventoryService.java:2617)
        at com.vmware.h4.common.mdc.MDCRunnableWrapper.run(MDCRunnableWrapper.java:30)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
    Caused by: java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:288)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:314)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:355)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:808)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:484)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478)
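
    A quick way to confirm this signature is to search the log for the timeout entries. The sketch below uses the cloud.log path quoted in this article; the LOG override is a hypothetical convenience, not part of the product:

    ```shell
    # LOG defaults to the cloud.log path from this article, but can be
    # overridden, e.g. when inspecting a copied support log bundle.
    LOG="${LOG:-/opt/vmware/h4/cloud.log}"

    # Print every matching line with its line number; fall back to a
    # message when nothing matches or the file is unreadable.
    grep -nE 'ReadTimeoutException|Socket read timeout' "$LOG" 2>/dev/null || echo "no matches in $LOG"
    ```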

  • In the /opt/vmware/h4/manager/log/manager.log file on the destination Cloud Director Replication Management Appliance, you see an entry similar to:

    2024-09-27 10:55:29.655 WARN - [UI-########-####-####-####-############-#####-##-jn] [job-91] com.vmware.h4.jobengine.ChildJobManager : Child job failures [Task{id=########-####-####-####-########8e85, user='<username>@<organization name>', workflowInfo=WorkflowInfo{type='start', resourceType='vmReplication', resourceId='C4-########-####-####-####-############', isPrivate=false, resourceName='<vm name>'}, progress=100, state=FAILED, lastUpdated=1727434383225, startTime=1727434312058, endTime=1727434383226, isAlreadyStarted=false, warnings=[], result=null, error=com.vmware.h4.api.error.exceptions.client.ReadTimeoutException: Socket read timeout.}]
    2024-09-27 10:55:29.655 ERROR - [UI-########-####-####-####-############-#####-##-jn] [job-91] com.vmware.h4.jobengine.JobExecution : Task ########-####-####-####-########44a4 (WorkflowInfo{type='start', resourceType='vappReplication', resourceId='C4VAPP-########-####-####-####-############', isPrivate=false, resourceName='null'}) has failed

    com.vmware.h4.api.error.exceptions.client.ReadTimeoutException: Socket read timeout
        at jdk.internal.reflect.GeneratedConstructorAccessor1429.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
        at com.vmware.h4.api.error.GenericServerExceptionProvider.get(GenericServerExceptionProvider.java:123)
        at com.vmware.h4.api.error.GenericServerExceptionProvider.get(GenericServerExceptionProvider.java:95)
        at com.vmware.h4.common.job.TaskRepo.deserializeError(TaskRepo.java:526)
        at com.vmware.h4.common.job.TaskRepo.toTask(TaskRepo.java:463)
        at com.vmware.h4.common.job.TaskRepo.toTask(TaskRepo.java:445)
        at com.vmware.h4.cloud.jobengine.C4TaskRepo.toTask(C4TaskRepo.java:331)
        at com.vmware.h4.cloud.jobengine.C4TaskRepo.toTask(C4TaskRepo.java:63)
        at com.vmware.h4.common.job.TaskRepo.find(TaskRepo.java:311)
        at com.vmware.h4.common.job.TaskRepo.find(TaskRepo.java:305)

  • In the /opt/vmware/h4/replicator/log/replicator.log file on the destination Replicator, you see an entry similar to:

    2024-09-27 10:55:25.244 WARN - [########-####-####-####-########3d6f] [pc-host-monitor-1] c.v.h.c.e.ExceptionConversionService : Unable to convert exception. Using fallback exception instead.

    com.vmware.vim.binding.hbr.replica.fault.HbrRuntimeFault: Exception Vmacore::Exception: Can't login to the host; Connection failed for agent ########-####-####-####-########6b70: host-####02/hostd:
        at jdk.internal.reflect.GeneratedConstructorAccessor388.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        at java.base/java.lang.reflect.ReflectAccess.newInstance(ReflectAccess.java:128)
        at java.base/jdk.internal.reflect.ReflectionFactory.newInstance(ReflectionFactory.java:347)
        at java.base/java.lang.Class.newInstance(Class.java:645)
        at com.vmware.vim.vmomi.core.types.impl.ComplexTypeImpl.newInstance(ComplexTypeImpl.java:174)
        at com.vmware.vim.vmomi.core.types.impl.DefaultDataObjectFactory.newDataObject(DefaultDataObjectFactory.java:25)
        at com.vmware.vim.vmomi.core.soap.impl.unmarshaller.ComplexStackContext.<init>(ComplexStackContext.java:30)
        at com.vmware.vim.vmomi.core.soap.impl.unmarshaller.StackContextFactory.newContext(StackContextFactory.java:113)

  • In the /var/log/vmware/hbrsrv.log file on the destination Replicator, you see entries similar to:

    2024-09-27T11:06:04.185Z verbose hbrsrv [01974] [Originator@6876 sub-PropertyProvider] Recordop ASSIGN: groupStats, Hbr.Replica.Group.H4-########-####-####-####-########74e1. Applied change to temp map.
    2024-09-27T11:06:04.193Z warning hbrsrv [505469] [Originator@6876 sub-IO.Connection opID=########-####-####-####-########4c36] Failed to connect; io obj p:0x00007fca044b31c0, h:40, <TCP 'X.X.X.X:56718'>, <TCP 'X.X.X.X:443'>>, e: 110 (Connection timed out), duration: 130458msec
    2024-09-27T11:06:04.194Z verbose hbrsrv [505469] [Originator@6876 sub-IO.Connection opID=########-####-####-####-########4c36] Attempting connection; <resolver p:0x00007fc9e00cfcf0, 'X.X.X.X:443', next: (null)>, last e: 110 (Connection timed out)
    2024-09-27T11:06:04.194Z warning hbrsrv [505469] [Originator@6876 sub-HttpConnectionPool-000000 opID=########-####-####-####-########4c36] Failed to get pooled connection; <cs p:00007fc9e0023b80, TCP:X.X.X.X:443>, (null), duration: 130459msec, N7Vmacore15SystemExceptionE (Connection timed out)
    2024-09-27T11:06:04.194Z info hbrsrv [505469] [Originator@6876 sub-IO.Http opID=########-####-####-####-########4c36] Set user agent error; state: 1, (null), N7Vmacore15SystemExceptionE (Connection timed out)
    2024-09-27T11:06:04.194Z error hbrsrv [505469] [Originator@6876 sub-IO.Http opID=########-####-####-####-########4c36] User agent failed to send request; (null), N7Vmacore15SystemExceptionE (Connection timed out)
    2024-09-27T11:06:04.194Z info hbrsrv [01839] [Originator@6876 sub-AgentConnection opID=########-####-####-####-########4c36] Agent ########-d65a-42e8-97d6-########6b70: host-####02/hostd: failed to log in. Connection type: /hbrTunnel
    2024-09-27T11:06:04.194Z verbose hbrsrv [01839] [Originator@6876 sub-vmomi.soapStub[13064] opID=########-####-####-####-########4c36] Using VMODL version; internalvim25/u6A440899 (vim.version.unstable), <<cs p:00007fc9e00253c0, TCP:X.X.X.X:80>, /sdk>
    2024-09-27T11:06:04.195Z verbose hbrsrv [523454] [Originator@6876 sub-IO.Connection opID=########-####-####-####-########4c36] Attempting connection; <resolver p:0x00007fc9e0009010, 'X.X.X.X:80', next:<TCP 'X.X.X.X:80'>>, last e: 0 (Success)
    2024-09-27T11:06:04.220Z verbose hbrsrv [523430] [Originator@6876 sub-RemoteTask groupID=H4-########-####-####-####-########75f0 opID=hs1-43257a88] Completing task haTask--vim.VirtualDiskManager.createChildDisk-4119186218 after 3 sec (success)
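
    The hbrsrv.log entries above show the Replicator timing out while opening a TCP connection to the ESXi host on port 443, then proceeding over port 80. As a minimal reachability check, run something like the following from the destination Replicator appliance. This is a sketch: ESXI_HOST is a placeholder you must substitute with the address of the affected host.

    ```shell
    # Minimal TCP reachability check using bash's /dev/tcp (no extra tools).
    check_port() {
        # Succeeds (exit 0) if a TCP connection to $1:$2 opens within 5 seconds.
        timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
    }

    # ESXI_HOST is a placeholder - substitute the affected ESXi host address.
    ESXI_HOST="${ESXI_HOST:-esxi01.example.com}"
    for port in 80 443; do
        if check_port "$ESXI_HOST" "$port"; then
            echo "$ESXI_HOST:$port reachable"
        else
            echo "$ESXI_HOST:$port NOT reachable"
        fi
    done
    ```

    If port 443 is unreachable while port 80 is, the symptom matches the port change described in the Cause section below.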

Environment

VMware Cloud Director Availability 4.x

Cause

This issue is caused by a default behavior change in the version of the HBR service used in VMware Cloud Director Availability 4.7.0: the service now communicates with the ESXi hosts over port 443 instead of port 80.

Resolution

This issue is resolved in Cloud Director Availability 4.7.1, available at Broadcom Downloads.

Workaround

To work around this issue if you are unable to upgrade, follow the workaround steps outlined in the VMware Cloud Director Availability 4.7 Release Notes.