During VMware Cloud Foundation bring-up, the Deploy vCenter and SDDC Manager phase takes an unexpectedly long time or fails.
This issue typically occurs under the following conditions:
The "Configure Base Install Image Repository in SDDC Manager" task may fail with the following error in the SDDC Manager UI:
Failed to copy /nfs/vmware/vcf/nfs-mount/base-install-images/ to base install mount point
The /var/log/vmware/vcf/domainmanager/domainmanager.log file contains entries similar to:
{Timestamp} ERROR [vcf_dm,0000000000000000,0000] [c.v.v.v.f.a.ConfigureBaseImageRepoAction,dm-exec-2291] Failed to copy folder from /nfs/vmware/vcf/nfs-mount/bundle/aa88f811-700a-5384-b86e-c40191985348/aa88f811-700a-5384-b86e-c40191985348 to path /nfs/vmware/vcf/nfs-mount/base-install-images/vsp_folder with exception
com.vmware.vcf.secure.ssh.errors.VcfSshException: Failed to upload file /nfs/vmware/vcf/nfs-mount/base-install-images/vsp_folder/vmsp-platform-9.1.0.0.25370367.tar
at com.vmware.vcf.secure.ssh.SshExecuter.upload(SshExecuter.java:293)
at com.vmware.vcf.secure.ssh.SshExecuter.uploadFolder(SshExecuter.java:353)
at com.vmware.vcf.vimanager.fsm.actions.ConfigureBaseImageRepoAction.lambda$copyFolder$6(ConfigureBaseImageRepoAction.java:793)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: com.vmware.vcf.secure.ssh.common.SshClientException: sddcm.vcf.internal: Failed to upload file to /nfs/vmware/vcf/nfs-mount/base-install-images/vsp_folder/vmsp-platform-9.1.0.0.25370367.tar via ssh
at com.vmware.vcf.secure.ssh.common.SshClientImpl.upload(SshClientImpl.java:300)
at com.vmware.vcf.secure.ssh.SshExecuter.upload(SshExecuter.java:291)
... 6 common frames omitted
Caused by: org.apache.sshd.common.SshException: IoWriteFutureImpl[SftpChannelSubsystem[id=10, recipient=4]-ClientSessionImpl[vcf@{SDDC Manager FQDN}/{SDDC Manager IP}:22][sftp][SSH_MSG_CHANNEL_DATA]]: Failed to get operation result within specified timeout: 30000 msec
at org.apache.sshd.common.future.AbstractSshFuture.lambda$verifyResult$1(AbstractSshFuture.java:114)
at org.apache.sshd.common.future.AbstractSshFuture.formatExceptionMessage(AbstractSshFuture.java:206)
at org.apache.sshd.common.future.AbstractSshFuture.verifyResult(AbstractSshFuture.java:114)
at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:41)
at org.apache.sshd.common.io.AbstractIoWriteFuture.verify(AbstractIoWriteFuture.java:32)
at org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:110)
at org.apache.sshd.common.future.VerifiableFuture.verify(VerifiableFuture.java:96)
at org.apache.sshd.sftp.client.SftpMessage.waitUntilSent(SftpMessage.java:85)
at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.internalFlush(SftpOutputStreamAsync.java:358)
at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.internalTransfer(SftpOutputStreamAsync.java:285)
at org.apache.sshd.sftp.client.impl.SftpOutputStreamAsync.transferFrom(SftpOutputStreamAsync.java:184)
at org.apache.sshd.sftp.client.impl.AbstractSftpClient.put(AbstractSftpClient.java:1257)
at org.apache.sshd.sftp.client.SftpClient.put(SftpClient.java:972)
at com.vmware.vcf.secure.ssh.common.SshClientImpl.upload(SshClientImpl.java:297)
... 7 common frames omitted
Caused by: java.util.concurrent.TimeoutException: Timed out after 30000 msec
at org.apache.sshd.common.future.AbstractSshFuture.verifyResult(AbstractSshFuture.java:113)
... 18 common frames omitted
Running esxtop shows a high kernel average latency (KAVG), for example greater than 1000 ms, for the NFS datastore.
In the ESXi Host Client during the deployment, the temporary vSwitch created for NFS traffic shows an active traffic shaping policy with a strict bandwidth limit.
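The log symptom described above can be confirmed without reading the whole file by scanning for the two signatures from the excerpt: an ERROR entry from ConfigureBaseImageRepoAction, followed by the SFTP "Failed to get operation result within specified timeout" message. The following is a minimal sketch, not a supported tool. The function name find_sftp_timeouts is hypothetical, and the script assumes it runs where domainmanager.log is readable at the path given in this article.

```python
import re
from pathlib import Path

# Path taken from this article; adjust if the log was rotated or copied elsewhere.
LOG_PATH = Path("/var/log/vmware/vcf/domainmanager/domainmanager.log")

# Signatures copied from the log excerpt in this article.
ACTION_MARKER = "ConfigureBaseImageRepoAction"
TIMEOUT_PATTERN = re.compile(
    r"Failed to get operation result within specified timeout: (\d+) msec"
)


def find_sftp_timeouts(lines):
    """Return (line_number, timeout_msec) for every SFTP write timeout that
    appears after a ConfigureBaseImageRepoAction ERROR entry."""
    hits = []
    seen_action_error = False
    for number, line in enumerate(lines, start=1):
        if ACTION_MARKER in line and " ERROR " in line:
            seen_action_error = True
        match = TIMEOUT_PATTERN.search(line)
        if match and seen_action_error:
            hits.append((number, int(match.group(1))))
    return hits


if __name__ == "__main__":
    if LOG_PATH.exists():
        with LOG_PATH.open(errors="replace") as handle:
            for number, timeout in find_sftp_timeouts(handle):
                print(f"line {number}: SFTP write timed out after {timeout} msec")
    else:
        print(f"{LOG_PATH} not found")
```

An empty result does not rule out the issue. The task may still be running slowly rather than having failed, so the esxtop and vSwitch checks remain relevant.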
During the vCenter and SDDC Manager deployment, a traffic shaping policy is mistakenly enabled on the temporary vSwitch created for NFS traffic.
This policy restricts bandwidth to 100 Mbit/s, causing large file transfers (such as .tar bundles) to take an excessively long time or to time out completely.
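To get a feel for the scale of the slowdown, the ideal transfer time at the shaped rate can be estimated with simple arithmetic. The sketch below is illustrative only. The 10 GiB file size and the 10 Gbit/s comparison rate are assumptions, not values from this article, and the calculation ignores protocol overhead. The 30000 msec value in the log appears to apply to a single SFTP write operation rather than to the whole file, so the estimate shows how long the copy is stretched, not the exact point of failure.

```python
SHAPED_MBIT_PER_S = 100  # bandwidth limit stated in this article


def transfer_seconds(size_bytes, mbit_per_s):
    """Ideal time to move size_bytes at mbit_per_s (1 Mbit = 1,000,000 bits)."""
    return size_bytes * 8 / (mbit_per_s * 1_000_000)


if __name__ == "__main__":
    size = 10 * 1024**3  # hypothetical 10 GiB bundle, not a value from the article
    rates = (
        ("shaped", SHAPED_MBIT_PER_S),
        ("unshaped (assumed 10 Gbit/s uplink)", 10_000),
    )
    for label, rate in rates:
        seconds = transfer_seconds(size, rate)
        print(f"{label}: {seconds:.0f} s ({seconds / 60:.1f} min)")
```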
To work around this issue, disable the traffic shaping policy on the affected vSwitch and retry the task.
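The policy can be disabled in the ESXi Host Client. As an alternative, the sketch below assembles the equivalent esxcli commands, assuming the temporary vSwitch is a standard vSwitch and that shell access to the ESXi host is available. The vSwitch name vSwitch1 is a placeholder, and the helper shaping_commands is hypothetical. By default the script only prints the commands. Set APPLY to True only when running it directly on the ESXi host.

```python
import subprocess

APPLY = False  # leave False to print the commands without running them


def shaping_commands(vswitch_name):
    """Build esxcli argument lists that show, disable, and re-check the
    traffic shaping policy on a standard vSwitch."""
    base = ["esxcli", "network", "vswitch", "standard", "policy", "shaping"]
    target = f"--vswitch-name={vswitch_name}"
    return [
        base + ["get", target],                     # current policy
        base + ["set", "--enabled=false", target],  # disable shaping
        base + ["get", target],                     # confirm the change
    ]


if __name__ == "__main__":
    # "vSwitch1" is a placeholder; use the name shown in the ESXi Host Client.
    for argv in shaping_commands("vSwitch1"):
        print(" ".join(argv))
        if APPLY:
            subprocess.run(argv, check=True)
```

After the policy is disabled, retry the failed task as described above.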