"Connection to sfcbd lost Attempting to reconnect: 1" message shown and upgrade is stuck when running "vamicli update --install latest" during Cloud Director Appliance upgrade
Article ID: 325522
Products
VMware Cloud Director
Issue/Introduction
Symptoms:
Running the vamicli update --install latest command during Cloud Director Appliance upgrade.
The upgrade process is stuck and shows output similar to the following:
vamicli update --install latest
Installing version - 10.4.0.8214 Build 20079248
.............................................................................Connection to sfcbd lost
Attempting to reconnect: 1 ...................
The /opt/vmware/var/log/vami/updatecli.log stops logging any additional entries.
Access to the NFS Transfer Share is lost on the Cloud Director Cell Appliance at this phase of the upgrade.
Custom static routes for the eth0 and eth1 network interfaces on the Appliance are no longer present after beginning the upgrade.
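The symptoms above can be checked from a second session on the affected cell. The following is a minimal sketch, not an official diagnostic: the is_mounted helper is hypothetical, and the mount point shown is assumed to be the appliance default, so adjust it if your transfer share is mounted elsewhere.

```shell
#!/bin/sh
# Assumption: default transfer share mount point on the appliance.
TRANSFER_DIR="${TRANSFER_DIR:-/opt/vmware/vcloud-director/data/transfer}"

# is_mounted PATH [MOUNTS_FILE]
# Hypothetical helper: succeeds if PATH is listed as a mount point in
# MOUNTS_FILE (defaults to /proc/mounts), fails otherwise.
is_mounted() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit (found ? 0 : 1) }' "${2:-/proc/mounts}"
}

# Example use on the cell:
#   is_mounted "$TRANSFER_DIR" || echo "transfer share is not mounted"
#   ip route    # compare against the static routes expected for eth0 and eth1
```

If the helper reports that the share is not mounted and the expected static routes are absent from the ip route output, the cell is likely in the state described in this article.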
Environment
VMware Cloud Director 10.x
Cause
In certain circumstances, the custom static routes for the eth0 and eth1 network interfaces on the Cloud Director Appliance Cell can be lost during the upgrade. If the Cloud Director Appliance Cell needs these static routes to reach the NFS Transfer Share, the upgrade can become stuck.
Resolution
This is a known issue affecting Cloud Director 10.4.x prior to the 10.4.2 release. The issue is resolved in VMware Cloud Director 10.4.2 and later, available at VMware By Broadcom Downloads. To work around the issue, follow the steps below if an upgrade attempt is failing.
Workaround:
To work around the issue during an upgrade, take these steps:
Before beginning the upgrade, review the static routes that have been applied on the Cloud Director Appliance Cells so that they can be reapplied later.
For eth0, run the following command: ovfenv --key vcloudnet.routes0.VMware_vCloud_Director
For eth1, run the following command: ovfenv --key vcloudnet.routes1.VMware_vCloud_Director
At step 9 of the documentation, when the vamicli update --install latest command is run, the issue occurs and the upgrade becomes stuck. Do not cancel the command; let it keep running.
Open a second SSH session to the same Cloud Director Appliance Cell where the upgrade is currently stalled.
Confirm that the static routes which should be present are no longer listed, for example by running: ip route
Manually add the missing static routes, for example by running the following command with the appropriate values: ip route add {destination} via {gateway} dev {interface}
After adding the routes back, ensure that the NFS Transfer Share is mounted again. Once it is mounted, the stuck upgrade command from step 3 above should resume.
Continue the upgrade, but do not perform step 12 to reboot the Cloud Director Appliance Cell yet.
Before the reboot, edit /etc/systemd/system/vcd-ova-netconfig.service so that the static routes are retained on all subsequent reboots. Move vaos.service from the Before= line to the After= line:
Original Configuration
[Unit]
Description=Cloud Director Appliance Network Configuration Service
Before=vpostgres.service vmware-vcd.service vaos.service
After=network.target systemd-networkd.service
...
Updated Configuration
[Unit]
Description=Cloud Director Appliance Network Configuration Service
Before=vpostgres.service vmware-vcd.service
After=network.target systemd-networkd.service vaos.service
...
Complete step 12 to reboot the Cloud Director Appliance Cell and all subsequent upgrade steps.
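As an optional aid to the steps above, the routing table can be saved to a file before the upgrade and compared against the live table while the upgrade is stalled. This is a minimal sketch under stated assumptions: the missing_routes helper and the file locations are hypothetical and not part of the appliance tooling.

```shell
#!/bin/sh
# missing_routes SAVED CURRENT
# Hypothetical helper: prints every line of SAVED that does not appear
# as an identical whole line in CURRENT.
missing_routes() {
    grep -F -x -v -f "$2" "$1"
}

# Example use (file paths are illustrative):
#   Before the upgrade:      ip route show > /root/routes.before
#   While it is stalled:     ip route show > /tmp/routes.now
#                            missing_routes /root/routes.before /tmp/routes.now
```

Each line printed is a route that existed before the upgrade and is now absent. Review the list and add back only the routes that are needed, using the ip route add form shown in the steps above.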